New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] tsbootstrap
adapter
#5887
Conversation
|
||
Parameters | ||
---------- | ||
tsbootstrapper : Bootstrapper from tsbootstrap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
too long variable name imo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is bootstrapper short enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe bootstrap
?
(that's the shorthand and scitype name I've used in the refactor,, and less strange in English than "bootstrapper")
"enforce_index_type": None, # index type that needs to be enforced in X/y | ||
"fit_is_empty": True, # is fit empty and can be skipped? Yes = True | ||
"transform-returns-same-time-index": False, | ||
"python_dependencies": "tsbootstrap", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
>=0.1.0
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean something like: "tsbootstrap>=0.1.0"? I tested it with "tsbootstrap>=0.0.32 and this is failing...
Not sure, but I think the dependency check is not correct here: https://github.com/sktime/sktime/blob/63a95f2c29251575b7963ad269044654cbe9d96c/sktime/utils/validation/_dependencies.py#L177C1-L178C1
bootstrapped_samples, index=multi_index, columns=X.columns | ||
) | ||
|
||
X.index = pd.MultiIndex.from_product([["original"], X.index]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we always want to include the original series?
I think it should be at most optional, parameter controlled, default=no
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mhm.. I used the behavior of the STLBootstrap
as template. This bootstrapper is including the original values (called actual
, I would rename to actual
in this PR)
from sktime.transformations.bootstrap import STLBootstrapTransformer
from sktime.datasets import load_airline
from sktime.utils.plotting import plot_series # doctest: +SKIP
y = load_airline() # doctest: +SKIP
transformer = STLBootstrapTransformer(10) # doctest: +SKI
y_hat = transformer.fit_transform(y)
Number of airline passengers
series_id time_index
actual 1949-01 112.000000
1949-02 118.000000
1949-03 132.000000
1949-04 129.000000
1949-05 121.000000
... ...
synthetic_9 1960-08 630.055938
1960-09 505.544319
1960-10 459.539478
1960-11 407.284782
1960-12 469.909948
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mhm.. I used the behavior of the
STLBootstrap
as template. This bootstrapper is including the original values (calledactual
, I would rename toactual
in this PR)
Yes, but I think that is a special choice rather than user expected, and it complicates the interface. It is also not sth that tsbootstrap
does (and imo it's the better design choice) - FYI @astrogilda
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will take a look on the other bootstrapped in sktime and check how they are behaving. But in that case, I think it would be good to streamline the behavior using deprecation cycles.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are probably right. Safest perhaps is add a parameter include_actual
or similar and switch its default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we open an issue? I am guessing that all bootstrapping transformers are in the bootstrap
folder.
f"synthetic_{i}" | ||
for i in range(self.tsbootstrapper.config.n_bootstraps) | ||
], | ||
X.index, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will break, since in genera the bootstrap samples are shorter.
I would leave range index if we cannot come up with sth better.
How should we deal with this failure?
|
c3dcede
to
7cdcbf5
Compare
updated source after |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made few comments, only serious one is the __init__
one.
""" | ||
from sktime.utils.validation._dependencies import _check_soft_dependencies | ||
|
||
deps = cls.get_class_tag("python_dependencies") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just observing it's probably being done directly against the package name in other adapters. But this may be better, as future changes in dependencies won't affect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, we should do _check_estimator_deps(self)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self is not available since get_test_params is a classmethod. However, we can instanitate a tsbootstrap for testing.. But I do not really like to do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, then cls
works too, as arg. No need to instantiate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comments were addressed, thanks.
I don't understand the failed job. The status is failure, but it appears it's still running. I'm very confused, do you understand what's happening?
Yes, that's weird. It displayed as still running, but GHA behaved as if it were failed. I restarted, and now it's gone. |
Reference Issues/PRs
Do not merge before release related issues are fixed in astrogilda/tsbootstrap
closes #5880
closes astrogilda/tsbootstrap#45
What does this implement/fix? Explain your changes.
Add an adapter for tsbootstrap
Does your contribution introduce a new dependency? If yes, which one?
tsbootstrap
What should a reviewer concentrate their feedback on?
Did you add any tests for the change?
Any other comments?
PR checklist
For all contributions
How to: add yourself to the all-contributors file in the
sktime
root directory (not theCONTRIBUTORS.md
). Common badges:code
- fixing a bug, or adding code logic.doc
- writing or improving documentation or docstrings.bug
- reporting or diagnosing a bug (get this pluscode
if you also fixed the bug in the PR).maintenance
- CI, test framework, release.See here for full badge reference
For new estimators
docs/source/api_reference/taskname.rst
, follow the pattern.Examples
section.python_dependencies
tag and ensureddependency isolation, see the estimator dependencies guide.