[BUG] `Detrender` returns incorrect result when using `inverse_transform` #4144
Comments
Confirmed on python 3.10, windows. I think it is unclear whether there is expected behaviour - i.e., whether a series of different name should be, by default treated as a different column. The "logical root case" imo is that the name gets lost after |
Also, slightly confused - I thought we test for all forecasters that they preserve the name attribute for |
hm, reason is that Let's see what happens if I remove this... |
Can you give an example for this? This should not happen, as the |
Another attempted fix is to remove the |
You've opened a can of worms @KishManani. See here for an explanation why it looked like it was covered but wasn't, and a test that adds this coverage: See here for various related fixes: if you have time, would be interested to hear which approach looks more promising. And, whether there will be unintended side effects of the fix... |
I'm unable to reproduce the error so this is likely a mistake on my end. |
I think the combination of PR merged together in #4150 now fixes this. |
Adds a test, `test_predict_series_name_preserved`, to the forecasting test suite, to check that the `name` attribute of `pd.Series` is preserved in `fit`/`predict`. The test detects the general issue at the root of #4144, which was not previously covered by the forecasting test suite: the scenarios containing `pd.Series` with a non-None `name` attribute did not interact with the tests in which the time index and name were checked for consistency via `_assert_correct_pred_time_index`. If this works, it should surface a number of cases with the problem, among them `NaiveForecaster` (where this surfaced through manual testing in the context of #4150 failures).
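The kind of check described above can be sketched as follows. This is only an illustration under stated assumptions: `ToyLastValueForecaster` and `check_predict_series_name_preserved` are hypothetical names made up for this sketch, not sktime classes or the actual test from the PR, and the toy forecaster only handles an integer `RangeIndex`.

```python
import pandas as pd


class ToyLastValueForecaster:
    """Hypothetical stand-in for a forecaster; not an sktime class."""

    def fit(self, y):
        self._last = y.iloc[-1]
        self._name = y.name  # remember the name seen in fit
        self._cutoff = y.index[-1]  # assumes an integer index
        return self

    def predict(self, steps):
        index = pd.RangeIndex(self._cutoff + 1, self._cutoff + 1 + steps)
        # Propagate the stored name to the prediction.
        return pd.Series(self._last, index=index, name=self._name)


def check_predict_series_name_preserved(forecaster, y, steps=3):
    """Fit on a named Series and assert that predict returns the same name."""
    y_pred = forecaster.fit(y).predict(steps)
    assert isinstance(y_pred, pd.Series)
    assert y_pred.name == y.name, f"expected {y.name!r}, got {y_pred.name!r}"
    return y_pred


y = pd.Series([1.0, 2.0, 3.0], name="y")
y_pred = check_predict_series_name_preserved(ToyLastValueForecaster(), y)
```

A forecaster that dropped `self._name` in `predict` would fail the second assertion, which is the class of bug the new test is meant to surface.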
…4161) This PR collects fixes for a number of forecasters to preserve the `name` attribute in `predict`. See #4144. The issues were found by #4157. Does *not* contain changes to the conversion logic, or tests (e.g., #4157). Testing is via merging this PR into the options dealing with the `name` attribute.

Contains fixes for:
* `ARDL`
* `ARIMA` and `AutoARIMA`
* `AutoEnsembleForecaster`
* `AutoETS`
* `Croston`
* `NaiveForecaster`
* `OnlineEnsembleForecaster`
* reducers
* `SquaringResiduals`
* `StackingForecaster`
* `StatsForecastAutoARIMA`
* `STLForecaster`
* `ThetaModularForecaster`
* `TrendForecaster` and `PolynomialTrendForecaster`

As tempting as it might be to add the fix to `BaseForecaster` somewhere in `predict`: whilst it might be DRY-er, I think it would muddle concerns. If at all, this should go in `convert`-like boilerplate rather than in the base class itself, but it is unclear what that would look like.
…to/from `pd.DataFrame` and `np.ndarray`, as `Series` scitype (#4150) This PR ensures that the `pd.Series` `name` attribute is preserved when converting to and back from `pd.DataFrame` or `np.ndarray` under the `Series` scitype. Fixes #4144. Currently, the back-conversion intentionally always resets the `name` attribute to `None`, which could lead to unexpected behaviour of some estimators, and user frustration, see #4144. Simply removing the "set to None" breaks other things, in the case of conversion `pd.DataFrame` to `pd.Series` and back. (experimental PR until tests are added and it is ensured that nothing else breaks)

Depends on:
* #4157 for testing
* #4161 for fixing uncovered bugs via #4157
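To illustrate what preserving `name` across a round trip involves, here is a minimal pandas-only sketch. The helpers `series_to_frame` and `frame_to_series` are hypothetical and are not sktime's conversion functions; they only show that the name has to be carried alongside the converted object, because an unnamed `Series` becomes a column labelled `0`.

```python
import pandas as pd


def series_to_frame(s):
    """Hypothetical helper: one-column DataFrame plus the original name."""
    return s.to_frame(), s.name


def frame_to_series(df, name):
    """Hypothetical helper: first column as a Series, restoring the stored name."""
    out = df.iloc[:, 0].copy()
    out.name = name
    return out


named = pd.Series([1.0, 2.0], name="y")
frame_named, stored_named = series_to_frame(named)
roundtrip_named = frame_to_series(frame_named, stored_named)

unnamed = pd.Series([1.0, 2.0])
frame_unnamed, stored_unnamed = series_to_frame(unnamed)
roundtrip_unnamed = frame_to_series(frame_unnamed, stored_unnamed)

# Without the stored name, the column label leaks into the Series name.
leaked_name = frame_unnamed.iloc[:, 0].name
```

Here `roundtrip_named` keeps the name `"y"` and `roundtrip_unnamed` keeps `None`, whereas taking the column directly from `frame_unnamed` yields a `Series` named `0`.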
Describe the bug
`inverse_transform` does not return the correct result when passed a `Series` whose `name` differs from that of the `DataFrame` or `Series` used during `fit`. Instead, two columns of nulls are returned. Another condition that appears to trigger this bug is supplying data with a different time index during `fit` and `inverse_transform`.

To Reproduce
This behaviour only occurs when the input is a `Series`. If I were to wrap `"y"` above in a list, `["y"]`, so that a `DataFrame` is instead passed to `fit` and `transform`, then we do not get this bug.

Expected behavior
A single column with the trend added back, resulting in the same data as the original dataframe.
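The symptom of two all-null columns is what pandas label alignment produces when a trend and a residual carry different column names. The sketch below uses only pandas and made-up data. It assumes, without confirmation from the sktime source, that the trend is added back through label-aligned arithmetic; the names `trend` and `residual` are illustrative.

```python
import pandas as pd

idx = pd.date_range("2020-01-01", periods=4, freq="D")

# Hypothetical stand-ins: a trend stored under the name seen in fit ("y"),
# and detrended values passed in under a different name ("z").
trend = pd.DataFrame({"y": [1.0, 2.0, 3.0, 4.0]}, index=idx)
residual = pd.Series([0.5, -0.5, 0.5, -0.5], index=idx, name="z")

# DataFrame addition aligns on column labels; "y" and "z" never match,
# so the result has both columns and every value is NaN.
mismatched = trend + residual.to_frame()

# With matching names, the same addition gives one fully populated column.
matched = trend + residual.rename("y").to_frame()

# A different time index gives all-NaN values too, via row alignment.
shifted = pd.DataFrame(
    {"y": residual.to_numpy()}, index=idx + pd.Timedelta(days=10)
)
misaligned_rows = trend + shifted
```

`mismatched` has columns `y` and `z` that are entirely null, while `matched` has a single column `y`, which mirrors the actual and expected behaviour described in this report.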
Versions
System:
python: 3.8.7 (default, May 6 2021, 21:53:45) [Clang 12.0.0 (clang-1200.0.32.29)]
executable: /Users/kishan_manani/.pyenv/versions/3.8.7/envs/udemy-ts/bin/python3.8
machine: macOS-11.6.6-x86_64-i386-64bit
Python dependencies:
pip: 22.3.1
setuptools: 49.2.1
sklearn: 1.2.0
sktime: 0.15.1
statsmodels: 0.13.2
numpy: 1.21.6
scipy: 1.8.1
pandas: 1.5.2
matplotlib: 3.6.2
joblib: 1.2.0
numba: 0.56.4
pmdarima: 1.8.5
tsfresh: 0.19.0