[BUG] `Detrender` returns incorrect result when using `inverse_transform` #4144

KishManani · 2023-01-24T22:14:47Z

Describe the bug
The inverse_transform is not returned correctly when passing a Series type which has a different name from the DataFrame or Series used during fit. Instead two columns which are null are returned. Another condition to trigger this bug appears to be when the user supplies data with a different time index during fit and inverse_transform.

To Reproduce

from sktime.datasets import load_airline
from sktime.transformations.series.detrend import Detrender
df = load_airline().rename("y").to_frame()

detrender = Detrender()
# Fit on some time period
detrender.fit(df.loc[:"1949-03", "y"])
# De-trend on a different time period
y_detrended = detrender.transform(df.loc["1949-04":, "y"])
# Inverse transform (i.e., add the trend back)
detrender.inverse_transform(y_detrended)

This behaviour only incurs when the input is a Series. If I were to wrap "y" above in a list, ["y"], so that a DataFrame is instead passed to fit and transform, then we do not get this bug.

Expected behavior
A single column with the trend added back resulting in the same data as the original dataframe.

Versions

System:
python: 3.8.7 (default, May 6 2021, 21:53:45) [Clang 12.0.0 (clang-1200.0.32.29)]
executable: /Users/kishan_manani/.pyenv/versions/3.8.7/envs/udemy-ts/bin/python3.8
machine: macOS-11.6.6-x86_64-i386-64bit

Python dependencies:
pip: 22.3.1
setuptools: 49.2.1
sklearn: 1.2.0
sktime: 0.15.1
statsmodels: 0.13.2
numpy: 1.21.6
scipy: 1.8.1
pandas: 1.5.2
matplotlib: 3.6.2
joblib: 1.2.0
numba: 0.56.4
pmdarima: 1.8.5
tsfresh: 0.19.0

The text was updated successfully, but these errors were encountered:

fkiraly · 2023-01-26T18:42:09Z

Confirmed on python 3.10, windows.

I think it is unclear whether there is expected behaviour - i.e., whether a series of different name should be, by default treated as a different column.

The "logical root case" imo is that the name gets lost after transform - is it?

fkiraly · 2023-01-26T18:43:37Z

Also, slightly confused - I thought we test for all forecasters that they preserve the name attribute for pd.Series - this has been a source of problems previously:
#3323

fkiraly · 2023-01-26T18:59:25Z

hm, reason is that convert_MvS_to_UvS_as_Series, in line 89, deletes the name of any pd.Series output after conversion brom pd.DataFrame.
That's interesting, and I vaguely remember this had to be added due to some conversion breakage.

Let's see what happens if I remove this...
#4150
(fixes the problem in this issue, but may have other side effects, e.g., when converting pd.DataFrame to pd.Series and back)

fkiraly · 2023-01-26T19:03:53Z

Another condition for trigger this bug appears to be when the user supplies data with a different time index during fit and inverse_transform.

Can you give an example for this? This should not happen, as the Detrender is tested indirectly via pipelines.

fkiraly · 2023-01-27T17:30:42Z

Another attempted fix is to remove the name at the start and potentially add it back at the end, so internal usage never sees the pd.Series name attr:
#4155

fkiraly · 2023-01-27T22:30:50Z

You've opened a can of worms @KishManani.
Looks like the name attribute is inconsistently handled, the problem is that tests did not cover this properly (but I thought they were).

See here for an explanation why it looked like it was covered but wasn't, and a test that adds this coverage:
#4157

See here for various related fixes:
#4150
#4155
#4161

if you have time, would be interested to hear which approach looks more promising.

And, whether there will be unintended side effects of the fix...

KishManani · 2023-01-30T22:39:48Z

Can you give an example for this? This should not happen, as the Detrender is tested indirectly via pipelines.

I'm unable to reproduce the error so this is likely a mistake on my end.

fkiraly · 2023-01-30T23:29:33Z

I think the combination of PR merged together in #4150 now fixes this.

Adds a test, `test_predict_series_name_preserved`, to the forecasting test suite, to test that the `name` attribute of `pd.Series` is preserved in `fit`/`predict`. A test for detecting the general issue at the root of #4144, which currently was not covered by the forecasting test suite, as the scenarios containing `pd.Series` with non-None `name` attribute did not interact with the tests in which the time index and name were checked for consistency via `_assert_correct_pred_time_index`. If this works, this should surface a number of cases with the problem, among them `NaiveForecaster` (where this surfaced through manual testing in the context of #4150 failures)

…4161) This PR collects fixes for a number of forecasters to preserve the `name` attribute in `predict`. See #4144. The issues were found by #4157. Does *not* contain changes to the conversion logic, or tests (e.g., #4157). Testing is via merging this PR into the options dealing with the `name` attribute. Contains fixes for: * `ARDL` * `ARIMA` and `AutoARIMA` * `AutoEnsembleForecaster` * `AutoETS` * `Croston` * `NaiveForecaster` * `OnlineEnsembleForecaster` * reducers * `SquaringResiduals` * `StackingForecaster` * `StatsForecastAutoARIMA` * `STLForecaster` * `ThetaModularForecaster` * `TrendForecaster` and `PolynomialTrendForecaster` As tempting as it might be to add the fix to `BaseForecaster` somewhere in `predict`: whilt it might be DRY-er, I think it would muddle concerns. If at all, this should go in `convert`-like boilerplate rather than in the base class itself, but unclear how that would look like.

…to/from `pd.DataFrame` and `np.ndarray`, as `Series` scitype (#4150) This PR ensures that the `pd.Series` `name` attribute is preserved when converting to and back from `pd.DataFrame` or `np.ndarray` under the `Series` scitype. Fixes #4144. Currently, the back-conversion intentionally always reset the `name` attribute to `None`, which could lead to unexpected behaviour of some estimators, and user frustration, see #4144 Simply removing the "set to None" breaks other things, in the case of conversion `pd.DataFrame` to `pd.Series` and back. (experimental PR until tests are added and it is ensured that nothing else breaks) Depends on: * #4157 for testing * #4161 for fixing uncovered bugs via #4157

KishManani added the bug Something isn't working label Jan 24, 2023

fkiraly added this to Needs triage & validation in Bugfixing via automation Jan 26, 2023

fkiraly mentioned this issue Jan 26, 2023

[BUG] ensure pd.Series name attribute is preserved in conversion to/from pd.DataFrame and np.ndarray, as Series scitype #4150

Merged

fkiraly added module:transformations transformations module: time series transformation, feature extraction, pre-/post-processing module:datatypes datatypes module: data containers, checkers & converters labels Jan 26, 2023

fkiraly moved this from Needs triage & validation to Investigating in Bugfixing Jan 26, 2023

fkiraly mentioned this issue Jan 27, 2023

[ENH] consistent internal removal of pd.Series name #4155

Closed

This was referenced Jan 27, 2023

[ENH] test that forecasters preserve name attr of pd.Series #4157

Merged

[ENH] fixes for forecasters to retain name attribute in predict #4161

Merged

fkiraly closed this as completed in #4150 Feb 1, 2023

Bugfixing automation moved this from Investigating to Fixed/resolved Feb 1, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] `Detrender` returns incorrect result when using `inverse_transform` #4144

[BUG] `Detrender` returns incorrect result when using `inverse_transform` #4144

KishManani commented Jan 24, 2023 •

edited

fkiraly commented Jan 26, 2023

fkiraly commented Jan 26, 2023

fkiraly commented Jan 26, 2023 •

edited

fkiraly commented Jan 26, 2023

fkiraly commented Jan 27, 2023

fkiraly commented Jan 27, 2023 •

edited

KishManani commented Jan 30, 2023

fkiraly commented Jan 30, 2023

[BUG] Detrender returns incorrect result when using inverse_transform #4144

[BUG] Detrender returns incorrect result when using inverse_transform #4144

Comments

KishManani commented Jan 24, 2023 • edited

fkiraly commented Jan 26, 2023

fkiraly commented Jan 26, 2023

fkiraly commented Jan 26, 2023 • edited

fkiraly commented Jan 26, 2023

fkiraly commented Jan 27, 2023

fkiraly commented Jan 27, 2023 • edited

KishManani commented Jan 30, 2023

fkiraly commented Jan 30, 2023

[BUG] `Detrender` returns incorrect result when using `inverse_transform` #4144

[BUG] `Detrender` returns incorrect result when using `inverse_transform` #4144

KishManani commented Jan 24, 2023 •

edited

fkiraly commented Jan 26, 2023 •

edited

fkiraly commented Jan 27, 2023 •

edited