Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Introducing test file for polars support for estimators #370

Merged
merged 27 commits into from
Jun 22, 2024

Conversation

julian-fong
Copy link
Contributor

Reference Issues/PRs

#342 is the linked issue that overlays the entire workflow for polars support. Opening this thread mainly for the test file I am planning to implement for polars support in skpro

What does this implement/fix? Explain your changes.

Implements a test file for testing various fit/predict functions using polars dataframes. Ideas for any tests regarding testing polars support in skpro is much appreciated!

Does your contribution introduce a new dependency? If yes, which one?

No, it does not introduce a new dependency

What should a reviewer concentrate their feedback on?

I was going through a couple regression estimators glm , and quickly realized that the underlying dependency statsmodels currently do not support polars dataframes as inputs for their fit functions. Since that is the case, it seems like conversion between polars to pandas must be necessary in order to allow fitting (and I would assume predicting) to happen.

@fkiraly Would this be the correct approach to deal with estimators that do not potentially have underlying polars support in them?

@julian-fong
Copy link
Contributor Author

a follow up question: what is the expectation if currently some of the estimators are coded to return pandas dataframes. As an example, in regression.mapie, the _predict function turns y_pred into a pandas Dataframe and returns that as an output.

Should we approach writing this test file under the assumption that all estimators have the capability/functionality to return polars Dataframes?

skpro/tests/test_polars.py Outdated Show resolved Hide resolved
@fkiraly
Copy link
Collaborator

fkiraly commented May 31, 2024

what is the expectation if currently some of the estimators are coded to return pandas dataframes. As an example, in regression.mapie, the _predict function turns y_pred into a pandas Dataframe and returns that as an output.

afaik, the boilerplate layer, i.e., BaseProbaRegerssor.predict, will turn it back into a polars.DataFrame if that was passed initially to fit. In the tests, I would check that this is indeed the case.

@fkiraly
Copy link
Collaborator

fkiraly commented Jun 4, 2024

let me know if/when you would like a review (even if not finished)

skpro/tests/test_polars.py Outdated Show resolved Hide resolved
skpro/tests/test_polars.py Outdated Show resolved Hide resolved
skpro/tests/test_polars.py Outdated Show resolved Hide resolved
@julian-fong
Copy link
Contributor Author

julian-fong commented Jun 8, 2024

Some of the CLI failures look weird - they are just long empty pages.

@julian-fong
Copy link
Contributor Author

If i am trying to test an estimator say in test_polars_eager_regressor_in_fit_predict, what is the best approach to do so?

skpro/tests/test_polars.py Outdated Show resolved Hide resolved
@fkiraly fkiraly added enhancement module:datatypes datatypes module: data containers, checkers & converters labels Jun 22, 2024
@fkiraly fkiraly merged commit 7808d0c into sktime:main Jun 22, 2024
36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement module:datatypes datatypes module: data containers, checkers & converters
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

2 participants