Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Use of private sklearn methods causing issues #4669

Closed
ngupta23 opened this issue Jun 2, 2023 · 22 comments
Closed

[BUG] Use of private sklearn methods causing issues #4669

ngupta23 opened this issue Jun 2, 2023 · 22 comments
Labels
bug Something isn't working maintenance Continuous integration, unit testing & package distribution
Projects

Comments

@ngupta23
Copy link
Contributor

ngupta23 commented Jun 2, 2023

Describe the bug

The following code in sktime calls a provate module in sklearn. This is causing sporadic issues in pycaret.

https://github.com/sktime/sktime/blob/b1a6c0d68a66f779aa0dffcea8ac2fc685613715/sktime/forecasting/compose/_bagging.py#L196-L197C76

To Reproduce

I have not been able to reproduce this myself, but several users have reported this issue recently.

image

Expected behavior
Would expect the imports to go through. Can we somehow avoid using private sklearn modules in sktime code in this instance.

Additional context

Versions

@fkiraly
Copy link
Collaborator

fkiraly commented Jun 3, 2023

Thanks for reporting!

I thought we had eliminated all private imports of sklearn methods already, e.g., #2631

I suppose this remained under the radar since it's not a private function but a private module, so it won't be caught by a search-all for import _sthsth.

I've opened a general issue for detection and testing here: #4670

And let's deal with the specific bugs you reported in this issue, @ngupta23.

@fkiraly fkiraly added the maintenance Continuous integration, unit testing & package distribution label Jun 3, 2023
@fkiraly fkiraly added this to Needs triage & validation in Bugfixing via automation Jun 3, 2023
@fkiraly
Copy link
Collaborator

fkiraly commented Jun 3, 2023

@ngupta23, question: I'm not sure how to reproduce or test this. Do you have any insights from your users who experience problems, e.g., is there a piece of code that sporadically fails, or similar? What dependency conditions does this occur under?

@fkiraly fkiraly moved this from Needs triage & validation to Reproducing/confirming in Bugfixing Jun 3, 2023
@fkiraly
Copy link
Collaborator

fkiraly commented Jun 3, 2023

started here, @ngupta23: #4672

know any other instances?

@ngupta23
Copy link
Contributor Author

ngupta23 commented Jun 4, 2023

Thanks. That is the only one I am aware of right now. If I encounter anything else, I will let you know.

@ngupta23
Copy link
Contributor Author

@fkiraly Is this issue closed since the PR is merged?

@fkiraly
Copy link
Collaborator

fkiraly commented Jun 16, 2023

there are other, unresolved instances of private imports:

  • _partition_estimators imports
  • from sklearn.ensemble._forest import-s

That's probably why it hasn't been so clear to me whether to close this or not.

Maybe we should close this as it originally related to the random state and open a tracker issue for the others?

@ngupta23
Copy link
Contributor Author

Yes, it would be better to get this in the next release (and address others as needed later). We have now seen 4 users report this error on pycaret (were 2 when I opened it)

@fkiraly
Copy link
Collaborator

fkiraly commented Jun 16, 2023

sorry for that - have you checked whether the bugfix would solve it, e.g., by prerelease testing? Because if not, we need to drill deeper.

Btw, I remember now: the actual reason why I kept this open is that we need confirmation that this indeed resolves your (users') issue, @ngupta23 - we have/had no means of testing as there is no clear failure case.

@fkiraly
Copy link
Collaborator

fkiraly commented Jun 16, 2023

Btw, @ngupta23, in the new bug reports, is there anything we could use as a reproducible failure case and a regression test?

@fkiraly
Copy link
Collaborator

fkiraly commented Jun 16, 2023

& FYI, 0.19.2 is scheduled for the next days if all goes well.

@ngupta23
Copy link
Contributor Author

in the new bug reports, is there anything we could use as a reproducible failure case and a regression test?

No, I have been unable to do it. See pycaret/pycaret#3569 (comment)

@fkiraly
Copy link
Collaborator

fkiraly commented Jun 16, 2023

have you been able to do a prerelease test whether current sktime main fixes it?

If you're not sure how, here's how I (ab)use the CI for testing with prereleases of sklearn and skbase:

Or, is it with users and not in the CI?

@fkiraly
Copy link
Collaborator

fkiraly commented Jun 16, 2023

Why I'm asking: there's a few days until 0.19.2, so if it doesn't fix it, we lose 2-3 weeks or so until it gets fixed (assuming the best, i.e., that we can quickly find the actual error once we have the information that it's not this).

Is there a way for one of your users to install sktime main to see whether that resolves it?

@ngupta23
Copy link
Contributor Author

I have posted the request to the original posters to verify the failure by installing sktime from git

pycaret/pycaret#3605 (comment)
pycaret/pycaret#3569 (comment)
pycaret/pycaret#3582 (comment)

@ngupta23
Copy link
Contributor Author

See pycaret/pycaret#3569 (comment) for a potential solution.

@fkiraly
Copy link
Collaborator

fkiraly commented Jun 23, 2023

Is this resolved with 0.19.2, @ngupta23?

@ngupta23
Copy link
Contributor Author

Don't know. None of the OPs have responded.

@fkiraly
Copy link
Collaborator

fkiraly commented Jun 26, 2023

well, that's usually a cautiously good sign if people stop complaining...

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 3, 2023

@ngupta23, any updates? Should/can we close this? Are the issues addressed?

@ngupta23
Copy link
Contributor Author

ngupta23 commented Jul 3, 2023

Out of the three who reported this issue, a couple responded and said that the upgrade did not resolve the issue (although did not provide the new error message).

pycaret/pycaret#3582 (comment) - No response
pycaret/pycaret#3569 (comment) - Incomplete response
pycaret/pycaret#3605 (comment) - Incomplete response

In the same threads, a couple of them reported the issue as resolved once they upgraded the threadpoolctl package

pip install --upgrade threadpoolctl

I think that the upgrade will at least resolve the previously reported error in this issue, so maybe you can close this for now. If it persists, we can evaluate it and reopen another issue.

@ngupta23
Copy link
Contributor Author

ngupta23 commented Jul 3, 2023

Summary of Possible Solutions:

  • Upgrade threadpool: pip install --upgrade threadpoolctl
  • Upgrade sktime: pip install sktime >= 0.19.2

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 3, 2023

ok, thanks for the update!

@fkiraly fkiraly closed this as completed Jul 3, 2023
Bugfixing automation moved this from Reproducing/confirming to Fixed/resolved Jul 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working maintenance Continuous integration, unit testing & package distribution
Projects
Bugfixing
Fixed/resolved
Development

No branches or pull requests

2 participants