New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG] Deprecated 'copy' in TfidfVectorizer.transform #14520
Conversation
sklearn/feature_extraction/text.py
Outdated
# FIXME Remove copy parameter support in 0.23 ------------------------- | ||
msg = ("'copy' param has been deprecated since version 0.21. Backward compatibility" | ||
"for 'copy' will be removed in 0.23") | ||
warnings.warn(msg, eprecationWarning) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warnings.warn(msg, eprecationWarning) | |
warnings.warn(msg, DeprecationWarning) |
Pleas see https://scikit-learn.org/dev/developers/contributing.html#deprecation |
I can't see that on the documentation @amueller But it makes sense, we avoid a useless warning. |
@GuillemGSubies this is the example from the docs: def example_function(n_clusters=8, k='not_used'):
if k != 'not_used':
warnings.warn("'k' was renamed to n_clusters in version 0.13 and "
"will be removed in 0.15.", DeprecationWarning)
n_clusters = k instead of "not_used" I'd suggest using "deprecated". So you set |
Coverage seems to have decreased, but I don't really know if it is worth if making a test that checks if a deprecation warning has been triggered. |
Thanks @GuillemGSubies ! Yes, we need to add a test that this deprecation warning is raised. That could be done by adding a test with,
with pytest.warns(DeprecationWarning, match="some message"):
# run deprecation code here
@@ -1681,13 +1681,20 @@ def transform(self, raw_documents, copy=True): | |||
Whether to copy X and operate on the copy or perform in-place | |||
operations. | |||
.. deprecated:: 0.21 | |||
The `copy` parameter was deprecated in version 0.21 and will | |||
be removed in 0.23. This parameter will be ignored | |||
Returns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a whitespace
sklearn/feature_extraction/text.py
Outdated
" Backward compatibility for 'copy' will be removed in" | ||
" 0.23.") | ||
warnings.warn(msg, DeprecationWarning) | ||
# --------------------------------------------------------------------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove this unnecessary formatting.
sklearn/feature_extraction/text.py
Outdated
Returns | ||
------- | ||
X : sparse matrix, [n_samples, n_features] | ||
Tf-idf-weighted document-term matrix. | ||
""" | ||
check_is_fitted(self, '_tfidf', 'The tfidf vector is not fitted') | ||
|
||
# FIXME Remove copy parameter support in 0.23 ------------------------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idem for ---
here
Co-Authored-By: Andreas Mueller <t3kcit@gmail.com>
It should be deprecated in 0.22 and removed in 0.24 (unless we want to deprecate it in a 0.21 bug fix release but it seems weird).
sklearn/feature_extraction/text.py
Outdated
@@ -1681,13 +1681,23 @@ def transform(self, raw_documents, copy=True): | |||
Whether to copy X and operate on the copy or perform in-place | |||
operations. | |||
.. deprecated:: 0.21 | |||
The `copy` parameter was deprecated in version 0.21 and | |||
will be removed in 0.23. This parameter will be ignored |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will be removed in 0.23. This parameter will be ignored | |
will be removed in 0.23. This parameter will be ignored. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, done
lgtm. Please add a what's new entry. |
sklearn/feature_extraction/text.py
Outdated
Returns | ||
------- | ||
X : sparse matrix, [n_samples, n_features] | ||
Tf-idf-weighted document-term matrix. | ||
""" | ||
check_is_fitted(self, '_tfidf', 'The tfidf vector is not fitted') | ||
|
||
# FIXME Remove copy parameter support in 0.24 | ||
if copy != "deprecated": | ||
msg = ("'copy' param has been deprecated since version 0.22." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe again it would be good to say it's unused
Co-Authored-By: Joel Nothman <joel.nothman@gmail.com>
This should be ready to merge |
Co-Authored-By: Joel Nothman <joel.nothman@gmail.com>
Thanks! |
Reference Issues/PRs
Fixes #14501
What does this implement/fix? Explain your changes.
I just deprecated the param as discussed in the issue
Any other comments?
It was my first deprecation in scikit-learn, I don't know if I might be missing anything.