Skip to content

FIX Run common tests on SparseCoder#32077

Merged
adrinjalali merged 12 commits into
scikit-learn:mainfrom
FrancoisPgm:SparseCoder_common_tests
Sep 9, 2025
Merged

FIX Run common tests on SparseCoder#32077
adrinjalali merged 12 commits into
scikit-learn:mainfrom
FrancoisPgm:SparseCoder_common_tests

Conversation

@FrancoisPgm
Copy link
Copy Markdown
Contributor

@FrancoisPgm FrancoisPgm commented Sep 2, 2025

Reference Issues/PRs

Towards #26482
See also #27724 and #26691 which appear to be stalled.
Closes #27724
Closes #26691

What does this implement/fix? Explain your changes.

This runs the common estimator tests on SparseCoder.

To match the behavior expected by the common tests, n_components_ and n_features_in_ are changed from properties to attributes initialized in the fitmethod. validate_data is run in fit.

Specific dictionary arguments for checks are added in PER_ESTIMATOR_CHECK_PARAMS.

To be able to use a specific dictionary in check_set_output_transform, _yield_instances_for_check is used in test_set_output_transform.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Sep 2, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: fbb9dbf. Link to the linter CI: here

Comment thread sklearn/decomposition/_dict_learning.py Outdated
Comment thread sklearn/decomposition/_dict_learning.py
Comment thread sklearn/decomposition/_dict_learning.py Outdated
Comment thread sklearn/decomposition/_dict_learning.py Outdated
Comment thread sklearn/decomposition/_dict_learning.py
Comment thread sklearn/decomposition/_dict_learning.py Outdated
FrancoisPgm and others added 2 commits September 2, 2025 15:42
Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
Comment on lines -1397 to -1401
@property
def n_features_in_(self):
"""Number of features seen during `fit`."""
return self.dictionary.shape[1]

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for reviewers: With this modification, n_features_in_ is now only set if fit is called. It makes this estimator follow our API and in line with this discussion #27724 (comment)

Copy link
Copy Markdown
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of suggestions for the docstrings. Looks good otherwise. Please also add a changelog entry. You can say that SparseCoder now follows the transformer API of scikit-learn and that now parameter and input validation is executed in fit (you can follow these instructions to write a changelog fragment https://github.com/scikit-learn/scikit-learn/blob/main/doc/whats_new/upcoming_changes/README.md)

Comment thread sklearn/decomposition/_dict_learning.py Outdated
Comment thread sklearn/decomposition/_dict_learning.py Outdated
FrancoisPgm and others added 3 commits September 3, 2025 09:59
Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
Comment thread doc/whats_new/upcoming_changes/sklearn.decomposition/32077.enhancement.rst Outdated
Copy link
Copy Markdown
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Thanks @FrancoisPgm

@jeremiedbb jeremiedbb added the Waiting for Second Reviewer First reviewer is done, need a second one! label Sep 3, 2025
@jeremiedbb
Copy link
Copy Markdown
Member

maybe @adrinjalali for a second review since you reviewed the previous attempt #27724.

…ancement.rst

Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
},
SkewedChi2Sampler: {"check_dict_unchanged": dict(n_components=1)},
SparseCoder: {
"check_estimators_dtypes": dict(dictionary=rng.normal(size=(5, 5))),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a single case where we can set to have it pass all the tests?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dictionary is not a very friendly parameter because it needs to have a shape compatible with X, but all the checks have different Xs

Comment thread sklearn/tests/test_common.py Outdated
f"Skipping check_set_output_transform for {name}: Does not support"
" set_output API"
)
estimator = next(_yield_instances_for_check(check_set_output_transform, estimator))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems to be a nice catch, but I think we should then go through all instances for this test, instead of only the first one.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes indeed, for the sparsecoder right now it's always one instance but it can be more. I'll add a for loop to go through all instances.

@@ -0,0 +1,2 @@
- :class:`decomposition.SparseCoder` now follows the transformer API of scikit-learn.
In addition, the :meth:`fit` method now validates the input and parameters.
Copy link
Copy Markdown
Member

@jeremiedbb jeremiedbb Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please reference yourself as author of the PR: "By :user:`your name <your handle>`".

Copy link
Copy Markdown
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@adrinjalali adrinjalali merged commit 68218f7 into scikit-learn:main Sep 9, 2025
36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants