Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] FourierFeatures with pd.DatetimeIndex #3605

Closed
eenticott-shell opened this issue Oct 18, 2022 · 5 comments · Fixed by #3606
Closed

[BUG] FourierFeatures with pd.DatetimeIndex #3605

eenticott-shell opened this issue Oct 18, 2022 · 5 comments · Fixed by #3606
Labels
bug Something isn't working module:transformations transformations module: time series transformation, feature extraction, pre-/post-processing

Comments

@eenticott-shell
Copy link
Collaborator

eenticott-shell commented Oct 18, 2022

Describe the bug
When using data with a pd.DatetimeIndex in the FourierFeatures transformer, it gets converted into nanoseconds. This means you have to specify all of your sp in terms of nanoseconds which is not clear.

To Reproduce

from sktime.datasets import load_airline
from sktime.transformations.series.fourier import FourierFeatures
from sktime.utils.plotting import plot_series

df = load_airline()
df.index = df.index.to_timestamp(freq="M")
from sktime.transformations.series.fourier import FourierFeatures
F_transform = FourierFeatures(sp_list = [12], fourier_terms_list=[2]) # should be yearly
sp1 = F_transform.fit_transform(df)

# Convert seasonality to nanoseconds to get correct result
F_transform = FourierFeatures(sp_list = [12*30*24*3600000000000], fourier_terms_list=[2])
sp2 = F_transform.fit_transform(df)

plot_series(sp1["sin_12_1"], sp2["sin_31104000000000000_1"])

Expected behavior
Would expect it to work from the frequency of the index in the DatetimeIndex case.

Additional context
Have added functionality for the expected behaviour in #3606.

@eenticott-shell eenticott-shell added the bug Something isn't working label Oct 18, 2022
@TonyBagnall TonyBagnall added the module:transformations transformations module: time series transformation, feature extraction, pre-/post-processing label Oct 19, 2022
@ltsaprounis
Copy link
Contributor

Thanks for spotting this @eenticott-shell.
This is indeed an annoying bug!

On a side note @fkiraly, isn't the "enforce_index_type" tag, supposed to enforce the index type?
Cause it looks like although in the tags we have "enforce_index_type": [pd.PeriodIndex] an evil pd.Datatime index manages to wiggle its way into _fit and _predict 😮.

FYI - @topher-lo and @danbartl for our discussion on enforcing index types earlier.

@fkiraly
Copy link
Collaborator

fkiraly commented Oct 21, 2022

@ltsaprounis, the enforce_index_type is coming from ancient code that was refactored into the tag system to have 1:1 logic parity.

It does not coerce (which it probably should, from a design perspective) but instead acts as a "pass list": the boilerplate is supposed to raise an exception for index types not on the pass list - at least this is my understanding and what the contract description in the extension template says.

If it lets other things pass (that are not subtypes), the actual logic does not agree with the contract, and it would be a bug.

@fkiraly
Copy link
Collaborator

fkiraly commented Oct 21, 2022

This means you have to specify all of your sp in terms of nanoseconds which is not clear.

All is quantum physics

@ltsaprounis
Copy link
Contributor

OK - we have ourselves a bug then!

@fkiraly
Copy link
Collaborator

fkiraly commented Oct 24, 2022

🐛 😱

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working module:transformations transformations module: time series transformation, feature extraction, pre-/post-processing
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants