Skip to content

FIX update create_memmap_backed_data to work on any estimator object via joblib 1.2.0 and later #27614

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Oct 22, 2023

Conversation

ogrisel
Copy link
Member

@ogrisel ogrisel commented Oct 18, 2023

Fixes #27613.

Note that on recent versions of joblib (after version 1.2.0) joblib.load(filepath, mmap_model="r") automatically allocates aligned memmap arrays (thanks to @lesteve's work), so there is no need for special casing anymore.

@jeromedockes can you please check that this fixes the problem for you? If so I will complete the PR to clean-up old code, set JOBLIB_MIN_VERSION to 1.2.0 instead of 1.1.1 and document it all in whats new.

EDIT: I simplified the code to always rely on joblib to allocate aligned memory mapped arrays. This made it possible to delete the broken fallback helper functions that did not work as intended in the specific runtime environment of Jerome.

@github-actions
Copy link

github-actions bot commented Oct 18, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: e10f003. Link to the linter CI: here

@jeromedockes
Copy link
Contributor

Thanks! yes I can confirm that this fixes the problem for me

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@glemaitre
Copy link
Member

A bit to quick at approving. Apparently it does not work in Windows:

NotADirectoryError: [WinError 267] The directory name is invalid: 'C:\\Users\\VSSADM~1\\AppData\\Local\\Temp\\sklearn_test_pickle__f2x5od6\\estimator.pkl'

but it might be more linked to the following:

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\VSSADM~1\\AppData\\Local\\Temp\\sklearn_test_pickle__f2x5od6\\estimator.pkl'

@ogrisel ogrisel marked this pull request as draft October 18, 2023 13:34
@ogrisel ogrisel self-assigned this Oct 18, 2023
@ogrisel ogrisel marked this pull request as ready for review October 18, 2023 15:09
@ogrisel ogrisel changed the title FIX do not call create_memmap_backed_data on an estimator object FIX update create_memmap_backed_data to work on any estimator object via joblib 1.2.0 and later Oct 18, 2023
@ogrisel
Copy link
Member Author

ogrisel commented Oct 18, 2023

@glemaitre this should be all-green now. Feel free to re-review.

@ogrisel ogrisel added Waiting for Reviewer Quick Review For PRs that are quick to review labels Oct 18, 2023
@glemaitre glemaitre merged commit c6d0d29 into scikit-learn:main Oct 22, 2023
@glemaitre
Copy link
Member

LGTM. Thanks @ogrisel

glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Oct 31, 2023
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ValueError when calling check_estimator
3 participants