Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX avoid unecessary/duplicated if copy branch for sparse arrays/matrix #27336

Merged
merged 4 commits into from
Sep 11, 2023

Conversation

glemaitre
Copy link
Member

This should solve the issue observed in the scipy-dev build.

#27171 (comment)

In a later PR, we should make a better test for check_array.

@github-actions
Copy link

github-actions bot commented Sep 11, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 7458957. Link to the linter CI: here

@@ -962,6 +962,19 @@ def is_sparse(dtype):
allow_nan=force_all_finite == "allow-nan",
)

if copy:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above if take care about the SciPy sparse copy already. This should only be done in the else.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the call to _ensure_sparse_format already takes care of handling the copy argument so there is no need to do it twice for sparse inputs.

@glemaitre
Copy link
Member Author

glemaitre commented Sep 11, 2023

The remaining failures are in the docstring and a related to the way information is displayed.

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please fix the title of the PR? It's seems grammatically invalid and I am not sure what you mean.

doc/modules/model_evaluation.rst Outdated Show resolved Hide resolved
@ogrisel
Copy link
Member

ogrisel commented Sep 11, 2023

I think this PR would be easier to review if it were to focus on a single problem (e.g. the check_array fix).

doc/whats_new/v1.4.rst Outdated Show resolved Hide resolved
@glemaitre glemaitre changed the title FIX avoid calling on sparse matrix and array in check_array FIX avoid calling np.may_share_memory on sparse arrays Sep 11, 2023
@glemaitre glemaitre changed the title FIX avoid calling np.may_share_memory on sparse arrays FIX avoid unecessary/duplicated if copy branch for sparse arrays/matrix Sep 11, 2023
@glemaitre
Copy link
Member Author

Let me revert the problem of the representation

@lesteve
Copy link
Member

lesteve commented Sep 11, 2023

I kind of think this dok array issue is a scipy regression and I was hoping it would get fixed in scipy.

I was planning to follow up on scipy/scipy#18929 (comment) for directions.

Having said that, if there is an easy work-around that avoids the issue in scikit-learn, why not ...

@glemaitre
Copy link
Member Author

I was planning to follow up on scipy/scipy#18929 (comment) for directions.

I open the following in scipy to kind having something working: scipy/scipy#19220

Having said that, if there is an easy work-around that avoids the issue in scikit-learn, why not ...

Actually this is a silent bug where all stars were aligned.

np.may_share_memory does not work as expected on sparse matrices. It will always returns False even if .data is shared. But we don't care of the copy here because it was taken of in the specific sparse function

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides the previous comments, the fix itself LGTM!

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Copy link
Member

@lesteve lesteve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, putting this on auto-merge

@lesteve lesteve enabled auto-merge (squash) September 11, 2023 13:46
@lesteve lesteve merged commit e7ae63f into scikit-learn:main Sep 11, 2023
25 checks passed
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
…-learn#27336)

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants