
[BUG] FallbackForecaster fails with ForecastByLevel when nan_predict_policy='raise' #6231

Merged
merged 13 commits into from Apr 11, 2024

Conversation

@ninedigits (Contributor)

Fixes error that occurs when FallbackForecaster is combined with ForecastByLevel and nan_predict_policy='raise'
#6230

@ninedigits ninedigits changed the title [BUG] FallbackForecaster fails with ForecastByLevel when nan_predict_… [BUG] FallbackForecaster fails with ForecastByLevel when nan_predict_policy='raise' Mar 29, 2024
@fkiraly fkiraly added module:forecasting forecasting module: forecasting, incl probabilistic and hierarchical forecasting bugfix Fixes a known bug or removes unintended behavior labels Apr 2, 2024
@fkiraly (Collaborator) left a comment

No major concerns, but a quick question:

why is the original line not working?

I'm concerned that the tests did not catch this.

@ninedigits (Contributor, Author)

None of the original tests were designed to handle MultiIndex DataFrames. In scenarios where y_pred is a plain Series, the check works as expected; however, y_pred becomes a DataFrame when the input series has a MultiIndex structure.

In the first case, where y_train is a DataFrame with a single index, calling y_pred.isna().any() returns a boolean, which works fine for conditional checks. In the second case, where y_train has a MultiIndex, as with hierarchical DataFrames, the same call returns a Series instead of a single boolean value, and the conditional then raises an exception because truth-value checks can't be applied directly to a Series.
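
(A minimal standalone sketch of the difference, outside the sktime code path; the shape-agnostic .to_numpy().any() variant shown at the end is one possible workaround, not what this PR ends up using:)

import numpy as np
import pandas as pd

# Series: .isna().any() collapses to a single boolean, safe inside an `if`
y_pred_series = pd.Series([1.0, np.nan, 3.0])
print(y_pred_series.isna().any())  # True

# DataFrame (e.g. hierarchical output): .isna().any() reduces per column and
# returns a Series, so `if y_pred.isna().any():` raises the ambiguity error
y_pred_frame = pd.DataFrame({"c0": [1.0, np.nan, 3.0]})
print(y_pred_frame.isna().any())  # c0    True

# shape-agnostic check: reduce over the underlying array instead
print(y_pred_frame.isna().to_numpy().any())  # True, a plain boolean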

@ninedigits (Contributor, Author)

On a side note, I see a notification for changes requested but I don't see any actual change requests to the code. Am I missing something?

@fkiraly (Collaborator) left a comment

Oh, I see! I understand now.

The logic attached to nan_predict_policy required pd.DataFrame format.

Then this indicates a bigger, unreported bug, which we should try to fix, let me explain.

For y_inner_mtype, X_inner_mtype, we allowed ALL_TIME_SERIES_MTYPES, because the exception handling logic did not require any specific type. E.g., 3D numpy arrays, weird nested pd.DataFrame, and dask array were all fine.

When you added the nan handling policy, you assumed pd.DataFrame, and this was fine since the regular testing scenarios only resulted in a few types being passed.

There are a few ways to fix the issue:

  1. if nan_predict_policy is not ignore, in __init__, we set the X_inner_mtype and y_inner_mtype via self.set_tags to the types supported, presumably ["pd.DataFrame", "pd-multiindex", "pd_multiindex_hier"]. This ensures that in _predict and _fit, you only ever see these types.
  2. or, leave the types as they are, and ensure the logic is correct for all data types.

I would say that 2. is infeasible, so we ought to do 1.?

Regarding test coverage, this shows a gap in our testing framework - estimators aren't tested with all data types they could get.

Are you certain that the hierarchical type causes the issues? Then we may have to add a forecaster test with a hierarchical example. I thought this was covered.

On a side note, I see a notification for changes requested but I don't see any actual change requests to the code. Am I missing something?

No, it is possible to make change requests without reference to code lines, e.g., in general questions like these.

@ninedigits (Contributor, Author)

ninedigits commented Apr 2, 2024

Are you certain that the hierarchical type causes the issues? Then we may have to add a forecaster test with a hierarchical example. I thought this was covered.

Pretty sure, I'm attaching a screenshot of the debug window to highlight:

[screenshot of debug window omitted]

When forecasting hierarchical data (a DataFrame with a MultiIndex) through ForecastByLevel, forecaster.predict() produces a DataFrame as the output. For Series inputs (e.g. when forecasting non-hierarchical data), y_pred is returned as a Series.

The original error was that the boolean logic I was using to detect NaNs behaves differently on DataFrames than on Series. When you say:

Then this indicates a bigger, unreported bug, which we should try to fix, let me explain.

Do you mean with FallbackForecaster or with other estimators in general?

@fkiraly (Collaborator)

fkiraly commented Apr 2, 2024

Do you mean with FallbackForecaster or with other estimators in general?

Only with FallbackForecaster, and a possible solution is discussed above.

Pretty sure, I'm attaching a screenshot of the debug window to highlight:

Could you either post the code that reproduces the issue (assuming you have it handy), or add it directly as a test case to TestAllForecasters? Use estimator_instance only as a variable, and preferably do this in a separate PR, as it will trigger tests on all forecasters. That way, we can also see whether it picks up FallbackForecaster.

(and, perhaps with a smaller example, perhaps only 2 x 2 x 2 levels?)

@ninedigits (Contributor, Author)

Only with FallbackForecaster, and a possible solution is discussed above.

Unless I'm misunderstanding something here, I think the error is a pretty simple fix.

This test reproduces the error:

def test_forecastbylevel_nan_predict():
    import pandas as pd

    from sktime.forecasting.compose import FallbackForecaster, ForecastByLevel
    from sktime.forecasting.naive import NaiveForecaster
    from sktime.utils._testing.hierarchical import _make_hierarchical

    # DummyForecaster is the NaN-predicting helper defined in the PR's test module

    df = _make_hierarchical(
        hierarchy_levels=(2, 2, 2),
        max_timepoints=10,
        min_timepoints=10,
        same_cutoff=True,
        n_columns=1,
        all_positive=True,
        index_type="period",
        random_state=0,
        add_nan=False,
    )
    forecaster1 = DummyForecaster(raise_at=None, predict_nans=True)
    forecaster2 = NaiveForecaster()
    forecaster = ForecastByLevel(
        FallbackForecaster(
            [
                ("forecaster1_pred_nans", forecaster1),
                ("forecaster2_expected_", forecaster2),
            ],
            nan_predict_policy="raise",
        )
    )
    fh = [1, 2, 3]
    forecaster.fit(y=df, fh=fh)
    y_pred_actual = forecaster.predict()

    forecaster2.fit(y=df, fh=[1, 2, 3])
    y_pred_expected = forecaster2.predict()

    pd.testing.assert_frame_equal(y_pred_expected, y_pred_actual)

Error:

>           self._validate_y_pred(y_pred)

X          = None
fh         = ForecastingHorizon([1, 2, 3], dtype='int64', is_relative=True)
self       = FallbackForecaster(forecasters=[('forecaster1_pred_nans',
                                 DummyForecaster(predict_nan...                          ('forecaster2_expected_', NaiveForecaster())],
                   nan_predict_policy='raise')
y_pred     =                c0
2000-11       NaN
2000-12  3.963588
2001-01  3.963588

../_fallback.py:269: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../_fallback.py:160: in _validate_y_pred
    if has_nans:
        has_nans   = c0    True
dtype: bool
        self       = FallbackForecaster(forecasters=[('forecaster1_pred_nans',
                                 DummyForecaster(predict_nan...                          ('forecaster2_expected_', NaiveForecaster())],
                   nan_predict_policy='raise')
        y_pred     =                c0
2000-11       NaN
2000-12  3.963588
2001-01  3.963588
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = c0    True
dtype: bool

    @final
    def __nonzero__(self) -> NoReturn:
>       raise ValueError(
            f"The truth value of a {type(self).__name__} is ambiguous. "
            "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
        )
E       ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

@fkiraly (Collaborator)

fkiraly commented Apr 3, 2024

Unless I'm misunderstanding something here, I think the error is a pretty simple fix.

I think you are: the issue is that users can pass any of 10 data formats to forecasters, and they are converted if not internally supported. This PR fixes the issue for one additional format; the programmatic way would be to set the tag.
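
(For reference, the registered input formats can be enumerated from sktime's datatype registry - a small sketch, assuming MTYPE_REGISTER entries are (mtype, scitype, description) tuples as in recent sktime versions:)

from sktime.datatypes import MTYPE_REGISTER

# print every registered machine type (mtype), grouped by scitype, e.g.
# "pd.Series", "pd.DataFrame", "pd-multiindex", "pd_multiindex_hier", ...
for mtype, scitype, description in MTYPE_REGISTER:
    print(f"{scitype:15s} {mtype:25s} {description}")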

@fkiraly (Collaborator)

fkiraly commented Apr 3, 2024

Here's another option: instead of the custom nan checking logic, we could use check_is_mtype against the expected mtype, and request only the has_nans property.

The expected mtype is in self._y_mtype_last_seen.

This might be the simplest way to fix this?

@fkiraly (Collaborator)

fkiraly commented Apr 3, 2024

Please let me know if this sounds confusing and I can try to add the fix. It's a few lines, given the discussion.

@ninedigits (Contributor, Author)

Please let me know if this sounds confusing and I can try to add the fix. It's a few lines, given the discussion.

There are a few suggestions here; let me start with the earlier one:

When you added the nan handling policy, you assumed pd.DataFrame, and this was fine since the regular testing scenarios only resulted in a few types being passed.

Just to clarify, we expected a series or a dataframe here; the hierarchical dataframe is where issues arose. If we were to set the mtypes in __init__, as in your initial suggestion, would it look something like this?

    def __init__(self, forecasters, verbose=False, nan_predict_policy="ignore"):
        super().__init__()

        self.forecasters = forecasters
        self.current_forecaster_ = None
        self.current_name_ = None
        self.verbose = verbose
        self.nan_predict_policy = _check_nan_policy_option(nan_predict_policy)
        if self.nan_predict_policy != "ignore":
            allowed_mtypes = ["pd.DataFrame", "pd-multiindex", "pd_multiindex_hier"]
            # note: the tag names are y_inner_mtype and X_inner_mtype (capital X)
            self.set_tags(
                y_inner_mtype=allowed_mtypes,
                X_inner_mtype=allowed_mtypes,
            )

I think I'm missing something, because it doesn't seem to change the behavior from my point of view, e.g. if I remove pd_multiindex_hier, the code still executes when using hierarchical data. I might be getting ahead of myself here.

@fkiraly (Collaborator)

fkiraly commented Apr 3, 2024

If we were to set the mtypes in __init__, as in your initial suggestion, would it look something like this?

Yes, that's how it would look.

I think my last solution is better though, using check_is_mtype for the "has nan" check, that covers all data types and avoids conversions.

@fkiraly (Collaborator)

fkiraly commented Apr 3, 2024

the code still executes when using hierarchical data.

which code did you expect not to execute?

@ninedigits (Contributor, Author)

the code still executes when using hierarchical data.

which code did you expect not to execute?

I think I understand now. I was assuming that somewhere in the code an exception would be raised if the wrong data type was passed in, e.g. within BaseForecaster. So I was playing around with set_tags with the different datatypes, making it more restrictive to see if exceptions were raised. But looking through the code now, I think what you're getting at is that we'd have an additional check within _validate_y_pred, and the exception would need to be programmed there. That way, as the system evolves and new data types are passed in, it points to that specific area to review and add those test cases. Is that right?

@fkiraly (Collaborator)

fkiraly commented Apr 4, 2024

yes, exactly.
Would you like me to add the logic with check_is_mtype, or would you like to give it a try?

@ninedigits (Contributor, Author)

yes, exactly. Would you like me to add the logic with check_is_mtype, or would you like to give it a try?

I think I get it. Would it look something like this?

    def _validate_y_pred(self, y_pred):
        # check_is_mtype is imported from sktime.datatypes
        if self.nan_predict_policy in ("warn", "raise"):
            last_mtype = self._y_mtype_last_seen
            expected_mtypes = ["pd.DataFrame", "pd-multiindex", "pd_multiindex_hier"]
            passed_mcheck = check_is_mtype(y_pred, expected_mtypes)
            if not passed_mcheck:
                msg = (
                    "`nan_predict_policy` expects mtype data to be one of "
                    f"{expected_mtypes} but instead got {last_mtype}. If you think "
                    "FallbackForecaster's nan_predict_policy should be able to "
                    "handle this datatype, raise an issue on sktime"
                )
                raise NotImplementedError(msg)

I'm not sure what the best way to add a test case is, though. What other types of data are available to forecast with? I didn't see an obvious resource online to read more about it.

@fkiraly (Collaborator)

fkiraly commented Apr 4, 2024

Not exactly - I was thinking of using the check to get the metadata field has_nans, i.e.,

    def _validate_y_pred(self, y_pred):
        if self.nan_predict_policy in ("warn", "raise"):
            last_mtype = self._y_mtype_last_seen
            _, _, metadata = check_is_mtype(
                y_pred,
                mtype=last_mtype,
                return_metadata=["has_nans"],
            )
            has_nans = metadata["has_nans"]

Now you have a variable has_nans that tells you, for any data container that you might see - pd.DataFrame, pd.Series, or anything exotic - whether the container has NAs.

You can now use this in further logic, in place of the isna checks and the many case distinctions.

Noting: we know that the mtype should be _y_mtype_last_seen, if the base class interface is not buggy (because "same mtype" is guaranteed).
We are "abusing" the data checker to get the metadata we are interested in.
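
(A minimal sketch of how that flag could feed the existing policy branches; check_is_mtype lives in sktime.datatypes, but the exception type and warning text below are illustrative, not the exact wording the PR ends up with:)

    def _validate_y_pred(self, y_pred):
        import warnings

        from sktime.datatypes import check_is_mtype

        if self.nan_predict_policy in ("warn", "raise"):
            # the base class guarantees y_pred has the same mtype as the last y seen
            _, _, metadata = check_is_mtype(
                y_pred,
                mtype=self._y_mtype_last_seen,
                return_metadata=["has_nans"],
            )
            if metadata["has_nans"]:
                msg = f"Forecaster {self.current_name_} produced NaN predictions."
                if self.nan_predict_policy == "raise":
                    raise RuntimeError(msg)
                warnings.warn(msg)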

@ninedigits (Contributor, Author)

ninedigits commented Apr 4, 2024

So I tried that, and it failed the unit test below:

def test_fallbackforecaster_predict_nan():
    """Test FallbackForecaster raise if nans in predict."""
    import pandas as pd

    from sktime.forecasting.compose import FallbackForecaster
    from sktime.forecasting.naive import NaiveForecaster
    from sktime.forecasting.trend import PolynomialTrendForecaster
    from sktime.utils._testing.forecasting import make_forecasting_problem

    # DummyForecaster is the configurable failing/NaN-predicting helper
    # defined in the PR's test module
    y = make_forecasting_problem(random_state=42)
    forecaster1 = DummyForecaster(raise_at="fit")
    forecaster2 = DummyForecaster(raise_at=None, predict_nans=True)
    forecaster3 = PolynomialTrendForecaster()
    forecaster4 = NaiveForecaster()

    forecaster = FallbackForecaster(
        [
            ("forecaster1_fails_fit", forecaster1),
            ("forecaster2_pred_nans", forecaster2),
            ("forecaster3_succeeded", forecaster3),
            ("forecaster4_notcalled", forecaster4),
        ],
        nan_predict_policy="raise",
    )
    forecaster.fit(y=y, fh=[1, 2, 3])
    y_pred_actual = forecaster.predict()

    forecaster3.fit(y=y, fh=[1, 2, 3])
    y_pred_expected = forecaster3.predict()

    # Assert correct forecaster name
    name = forecaster.current_name_
    assert name == "forecaster3_succeeded"

    # Assert correct y_pred
    pd.testing.assert_series_equal(y_pred_expected, y_pred_actual)

    # Assert correct number of expected exceptions
    exceptions_raised = forecaster.exceptions_raised_
    assert len(exceptions_raised) == 2

    # Assert the correct forecasters failed
    names_raised_actual = [
        vals["forecaster_name"] for vals in exceptions_raised.values()
    ]
    names_raised_expected = ["forecaster1_fails_fit", "forecaster2_pred_nans"]
    assert names_raised_actual == names_raised_expected

Looks like the training data is a pandas series and the prediction is a pandas dataframe:

[screenshot of debug window omitted]

Is this a larger issue?

@fkiraly (Collaborator)

fkiraly commented Apr 5, 2024

Perhaps - could you kindly check which forecaster returns the unexpected type?

@ninedigits (Contributor, Author)

ninedigits commented Apr 5, 2024

Perhaps - could you kindly check which forecaster returns the unexpected type?

I think there's a bug either in DummyForecaster or FallbackForecaster. I have some time to investigate today, will update this comment when I have more information.

UPDATE:

Found the bug - it was because I had left the following programmed into __init__:

    def __init__(self, forecasters, verbose=False, nan_predict_policy="ignore"):
        super().__init__()

        self.forecasters = forecasters
        self.current_forecaster_ = None
        self.current_name_ = None
        self.verbose = verbose
        self.nan_predict_policy = _check_nan_policy_option(nan_predict_policy)
        if self.nan_predict_policy in ("ignore", "warn"):
            allowed_mtypes = [
                "pd.DataFrame", "pd-multiindex", "pd_multiindex_hier",
                # "pd.Series"  # This line was missing from __init__
            ]
            self.set_tags(
                y_inner_mtype=allowed_mtypes,
                X_inner_mtype=allowed_mtypes,
            )

I'm not very familiar with the handling mechanisms for mtype data, so I'm uncertain whether the behavior observed with estimator.predict() when the input data is a pd.Series and y_inner_mtype is set to "pd.DataFrame", "pd-multiindex", or "pd_multiindex_hier" is expected. It seems that setting these options coerces the output of estimator.predict() to pd.DataFrame; adding "pd.Series" back fixed the bug. Additionally, removing those lines altogether also resolved the bug.
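
(A small standalone illustration of the coercion being described, using sktime's converter directly rather than the estimator code path; the output column name depends on the converter's defaults:)

import pandas as pd

from sktime.datatypes import convert_to

y = pd.Series([1.0, 2.0, 3.0], name="c0")

# if "pd.Series" is not among y_inner_mtype, the base class converts the input
# to a supported mtype; converting to the "pd.DataFrame" mtype yields a
# one-column DataFrame, so predictions also come back as a DataFrame
y_inner = convert_to(y, to_type="pd.DataFrame")
print(type(y_inner))  # <class 'pandas.core.frame.DataFrame'>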

@fkiraly, would it be appropriate to remove this line and maintain the logic as described in your suggestion, i.e., by adding the check_mtype in the validation check method?

@fkiraly (Collaborator)

fkiraly commented Apr 5, 2024

Ah, of course, that was it - the two fixes should not be applied at the same time.

@fkiraly, would it be appropriate to remove this line and maintain the logic as described in your suggestion, i.e., by adding the check_mtype in the validation check method?

Yes, that is imo indeed the fix to the secondary bug - simply remove any additional logic from __init__.

@ninedigits (Contributor, Author)

@fkiraly Any more updates needed on my end, or are we good to go? I saw that the check failed earlier, but I wasn't sure if it was an actual issue with the code.

@fkiraly (Collaborator)

fkiraly commented Apr 9, 2024

let's see - I'll restart the tests

@fkiraly (Collaborator) left a comment

👍

@fkiraly fkiraly merged commit 105bd55 into sktime:main Apr 11, 2024
54 checks passed