Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] Deprecated 'copy' in TfidfVectorizer.transform #14520

Merged
merged 21 commits into from Aug 7, 2019

Conversation

GuillemGSubies
Copy link
Contributor

@GuillemGSubies GuillemGSubies commented Jul 30, 2019

Reference Issues/PRs

Fixes #14501

What does this implement/fix? Explain your changes.

I just deprecated the param as discussed in the issue

Any other comments?

It was my first deprecation in scikit-learn, I don't know if I might be missing anything.

# FIXME Remove copy parameter support in 0.23 -------------------------
msg = ("'copy' param has been deprecated since version 0.21. Backward compatibility"
"for 'copy' will be removed in 0.23")
warnings.warn(msg, eprecationWarning)
Copy link
Member

@amueller amueller Jul 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
warnings.warn(msg, eprecationWarning)
warnings.warn(msg, DeprecationWarning)

@amueller
Copy link
Member

@amueller amueller commented Jul 30, 2019

Pleas see https://scikit-learn.org/dev/developers/contributing.html#deprecation
You should only warn if the parameter was changed from its default.

@GuillemGSubies
Copy link
Contributor Author

@GuillemGSubies GuillemGSubies commented Jul 30, 2019

I can't see that on the documentation @amueller But it makes sense, we avoid a useless warning.
So I sould only put the fixme comment and the docstring for the method?

@amueller
Copy link
Member

@amueller amueller commented Jul 30, 2019

@GuillemGSubies this is the example from the docs:

def example_function(n_clusters=8, k='not_used'):
    if k != 'not_used':
        warnings.warn("'k' was renamed to n_clusters in version 0.13 and "
                      "will be removed in 0.15.", DeprecationWarning)
        n_clusters = k

instead of "not_used" I'd suggest using "deprecated".

So you set copy='deprecated' and if the user changed it from deprecated to anything else you raise the warning.

@GuillemGSubies GuillemGSubies changed the title [WIP] Deprecated 'copy' in TfidfVectorizer.transform [MRG] Deprecated 'copy' in TfidfVectorizer.transform Jul 31, 2019
@GuillemGSubies
Copy link
Contributor Author

@GuillemGSubies GuillemGSubies commented Jul 31, 2019

Coverage seems to have decreased, but I don't really know if it is worth if making a test that checks if a deprecation warning has been triggered.

Copy link
Member

@rth rth left a comment

Thanks @GuillemGSubies ! Yes, we need to add a test that this deprecation warning is raised. That could be done by adding a test with,

with pytest.warns(DeprecationWarning, match="some message"):
    # run deprecation code here

@@ -1681,13 +1681,20 @@ def transform(self, raw_documents, copy=True):
Whether to copy X and operate on the copy or perform in-place
operations.
.. deprecated:: 0.21
The `copy` parameter was deprecated in version 0.21 and will
be removed in 0.23. This parameter will be ignored
Returns
Copy link
Member

@rth rth Jul 31, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a whitespace

" Backward compatibility for 'copy' will be removed in"
" 0.23.")
warnings.warn(msg, DeprecationWarning)
# ---------------------------------------------------------------------
Copy link
Member

@rth rth Jul 31, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this unnecessary formatting.

Returns
-------
X : sparse matrix, [n_samples, n_features]
Tf-idf-weighted document-term matrix.
"""
check_is_fitted(self, '_tfidf', 'The tfidf vector is not fitted')

# FIXME Remove copy parameter support in 0.23 -------------------------
Copy link
Member

@rth rth Jul 31, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idem for --- here

GuillemGSubies and others added 2 commits Aug 1, 2019
Copy link
Member

@jeremiedbb jeremiedbb left a comment

It should be deprecated in 0.22 and removed in 0.24 (unless we want to deprecate it in a 0.21 bug fix release but it seems weird).

@@ -1681,13 +1681,23 @@ def transform(self, raw_documents, copy=True):
Whether to copy X and operate on the copy or perform in-place
operations.
.. deprecated:: 0.21
The `copy` parameter was deprecated in version 0.21 and
will be removed in 0.23. This parameter will be ignored
Copy link
Member

@jeremiedbb jeremiedbb Aug 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
will be removed in 0.23. This parameter will be ignored
will be removed in 0.23. This parameter will be ignored.

Copy link
Contributor Author

@GuillemGSubies GuillemGSubies Aug 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, done

@jeremiedbb
Copy link
Member

@jeremiedbb jeremiedbb commented Aug 5, 2019

lgtm. Please add a what's new entry.

doc/whats_new/v0.22.rst Outdated Show resolved Hide resolved
Returns
-------
X : sparse matrix, [n_samples, n_features]
Tf-idf-weighted document-term matrix.
"""
check_is_fitted(self, '_tfidf', 'The tfidf vector is not fitted')

# FIXME Remove copy parameter support in 0.24
if copy != "deprecated":
msg = ("'copy' param has been deprecated since version 0.22."
Copy link
Member

@jnothman jnothman Aug 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe again it would be good to say it's unused

GuillemGSubies and others added 3 commits Aug 6, 2019
Co-Authored-By: Joel Nothman <joel.nothman@gmail.com>
@GuillemGSubies
Copy link
Contributor Author

@GuillemGSubies GuillemGSubies commented Aug 7, 2019

This should be ready to merge

Copy link
Member

@jnothman jnothman left a comment

thanks

doc/whats_new/v0.22.rst Outdated Show resolved Hide resolved
GuillemGSubies and others added 3 commits Aug 7, 2019
@amueller amueller merged commit deec2ba into scikit-learn:master Aug 7, 2019
15 checks passed
@amueller
Copy link
Member

@amueller amueller commented Aug 7, 2019

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants