Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changed defaults in make_forecasting_problem #1477

Merged
merged 15 commits into from
Oct 3, 2021
Merged

Changed defaults in make_forecasting_problem #1477

merged 15 commits into from
Oct 3, 2021

Conversation

aiwalter
Copy link
Contributor

@aiwalter aiwalter commented Oct 2, 2021

No description provided.

@aiwalter aiwalter requested a review from mloning as a code owner October 2, 2021 13:00
@aiwalter aiwalter requested a review from fkiraly October 2, 2021 13:01
@aiwalter aiwalter closed this Oct 2, 2021
@aiwalter aiwalter reopened this Oct 2, 2021
Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the right idea, but problematic in the specific execution.
A blocker is that the docstring is no longer correct.

I would also strongly recommend not making a change that breaks the interface, in general, but especially in this case specifically where we are already dealing with broken tests due to confusion about arguments of this specific function.

Here's my suggestion on how to deal with this:

  • introduce two new arguments, n_y_columns and n_X_columns with the obvious meanings.
  • the inner _make_series take these two arguments in the n_columns place
  • n_y_columns defaults to n_columns if that is passed, otherwise 1. This ensures downwards compatibility. n_columns gets a deprecation note/warning.
  • n_X_columns defaults to 2.
  • the docstrings are updated.

@fkiraly
Copy link
Collaborator

fkiraly commented Oct 3, 2021

Also, always include a descriptive PR message please.

@aiwalter
Copy link
Contributor Author

aiwalter commented Oct 3, 2021

Here quick answers for your points, in short my changes run well and I already merged them into #1083 which also passes now.

introduce two new arguments, n_y_columns and n_X_columns with the obvious meanings.

then we need to add this new params also to the other functions such as make_classification_problem, make_regression_problem, make_clustering_problem as they will raise otherwise an error, I faced this already with make_clustering_problem which didnt have n_columns param and couldnt handle the **kwargs. When we use **kwargs and iterate over different functions, the given args need to be used in the end anywhere in the downstream calls, otherwise there will be an error of unexpected param given.

the inner _make_series take these two arguments in the n_columns place

There was not any test which didnt default to n_columns=2 when returning X.

n_columns gets a deprecation note/warning.

In my opinion make_forecasting_problem is a private function, as it lives also in a private file from sktime.utils._testing.forecasting. Probably best to actually rename it also to _make_forecasting_problem. Is it therefore necessary to deprecate it with warning?

n_X_columns defaults to 2.

Yes, now it always defaults to 2, as there was not any test that actually used another value. So when anyone wants to create such a test, then it would be the time to change sth, but I would prefer to go with my slim solution for now, as it also does not affect other code/logic. Then we could do the bigger refactor in your PR?

the docstrings are updated.

Done, sry just forgot it.

Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, makes sense to me.

Is it correct that all tests that currently change n_columns have - incorrectly - made the assumption that this would in fact affect the number of columns of y, not of X?

@aiwalter
Copy link
Contributor Author

aiwalter commented Oct 3, 2021

I think n_columns was not even used before we started with multivariate y. At least I could not find such a test. So it always defaulted.

@fkiraly fkiraly merged commit 2a3c4b6 into sktime:main Oct 3, 2021
@fkiraly
Copy link
Collaborator

fkiraly commented Oct 3, 2021

looks good (skipped appveyor since #1083 contains this and tests passed)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants