Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] Deprecated 'copy' in TfidfVectorizer.transform #14520

Merged
merged 21 commits into from Aug 7, 2019

Conversation

@GuillemGSubies
Copy link
Contributor

GuillemGSubies commented Jul 30, 2019

Reference Issues/PRs

Fixes #14501

What does this implement/fix? Explain your changes.

I just deprecated the param as discussed in the issue

Any other comments?

It was my first deprecation in scikit-learn, I don't know if I might be missing anything.

# FIXME Remove copy parameter support in 0.23 -------------------------
msg = ("'copy' param has been deprecated since version 0.21. Backward compatibility"
"for 'copy' will be removed in 0.23")
warnings.warn(msg, eprecationWarning)

This comment has been minimized.

Copy link
@amueller

amueller Jul 30, 2019

Member
Suggested change
warnings.warn(msg, eprecationWarning)
warnings.warn(msg, DeprecationWarning)
@amueller

This comment has been minimized.

Copy link
Member

amueller commented Jul 30, 2019

Pleas see https://scikit-learn.org/dev/developers/contributing.html#deprecation
You should only warn if the parameter was changed from its default.

@GuillemGSubies

This comment has been minimized.

Copy link
Contributor Author

GuillemGSubies commented Jul 30, 2019

I can't see that on the documentation @amueller But it makes sense, we avoid a useless warning.
So I sould only put the fixme comment and the docstring for the method?

@amueller

This comment has been minimized.

Copy link
Member

amueller commented Jul 30, 2019

@GuillemGSubies this is the example from the docs:

def example_function(n_clusters=8, k='not_used'):
    if k != 'not_used':
        warnings.warn("'k' was renamed to n_clusters in version 0.13 and "
                      "will be removed in 0.15.", DeprecationWarning)
        n_clusters = k

instead of "not_used" I'd suggest using "deprecated".

So you set copy='deprecated' and if the user changed it from deprecated to anything else you raise the warning.

@GuillemGSubies GuillemGSubies force-pushed the GuillemGSubies:master branch from cf12dd9 to 83d2e81 Jul 31, 2019
@GuillemGSubies GuillemGSubies changed the title [WIP] Deprecated 'copy' in TfidfVectorizer.transform [MRG] Deprecated 'copy' in TfidfVectorizer.transform Jul 31, 2019
@GuillemGSubies

This comment has been minimized.

Copy link
Contributor Author

GuillemGSubies commented Jul 31, 2019

Coverage seems to have decreased, but I don't really know if it is worth if making a test that checks if a deprecation warning has been triggered.

Copy link
Member

rth left a comment

Thanks @GuillemGSubies ! Yes, we need to add a test that this deprecation warning is raised. That could be done by adding a test with,

with pytest.warns(DeprecationWarning, match="some message"):
    # run deprecation code here
@@ -1681,13 +1681,20 @@ def transform(self, raw_documents, copy=True):
Whether to copy X and operate on the copy or perform in-place
operations.
.. deprecated:: 0.21
The `copy` parameter was deprecated in version 0.21 and will
be removed in 0.23. This parameter will be ignored
Returns

This comment has been minimized.

Copy link
@rth

rth Jul 31, 2019

Member

Please add a whitespace

" Backward compatibility for 'copy' will be removed in"
" 0.23.")
warnings.warn(msg, DeprecationWarning)
# ---------------------------------------------------------------------

This comment has been minimized.

Copy link
@rth

rth Jul 31, 2019

Member

Please remove this unnecessary formatting.

Returns
-------
X : sparse matrix, [n_samples, n_features]
Tf-idf-weighted document-term matrix.
"""
check_is_fitted(self, '_tfidf', 'The tfidf vector is not fitted')

# FIXME Remove copy parameter support in 0.23 -------------------------

This comment has been minimized.

Copy link
@rth

rth Jul 31, 2019

Member

Idem for --- here

GuillemGSubies and others added 2 commits Aug 1, 2019
Co-Authored-By: Andreas Mueller <t3kcit@gmail.com>
Copy link
Member

jeremiedbb left a comment

It should be deprecated in 0.22 and removed in 0.24 (unless we want to deprecate it in a 0.21 bug fix release but it seems weird).

@@ -1681,13 +1681,23 @@ def transform(self, raw_documents, copy=True):
Whether to copy X and operate on the copy or perform in-place
operations.
.. deprecated:: 0.21
The `copy` parameter was deprecated in version 0.21 and
will be removed in 0.23. This parameter will be ignored

This comment has been minimized.

Copy link
@jeremiedbb

jeremiedbb Aug 3, 2019

Member
Suggested change
will be removed in 0.23. This parameter will be ignored
will be removed in 0.23. This parameter will be ignored.

This comment has been minimized.

Copy link
@GuillemGSubies

GuillemGSubies Aug 4, 2019

Author Contributor

Ok, done

@jeremiedbb

This comment has been minimized.

Copy link
Member

jeremiedbb commented Aug 5, 2019

lgtm. Please add a what's new entry.

doc/whats_new/v0.22.rst Outdated Show resolved Hide resolved
Returns
-------
X : sparse matrix, [n_samples, n_features]
Tf-idf-weighted document-term matrix.
"""
check_is_fitted(self, '_tfidf', 'The tfidf vector is not fitted')

# FIXME Remove copy parameter support in 0.24
if copy != "deprecated":
msg = ("'copy' param has been deprecated since version 0.22."

This comment has been minimized.

Copy link
@jnothman

jnothman Aug 5, 2019

Member

Maybe again it would be good to say it's unused

GuillemGSubies and others added 3 commits Aug 6, 2019
Co-Authored-By: Joel Nothman <joel.nothman@gmail.com>
@GuillemGSubies

This comment has been minimized.

Copy link
Contributor Author

GuillemGSubies commented Aug 7, 2019

This should be ready to merge

Copy link
Member

jnothman left a comment

thanks

doc/whats_new/v0.22.rst Outdated Show resolved Hide resolved
GuillemGSubies and others added 3 commits Aug 7, 2019
Co-Authored-By: Joel Nothman <joel.nothman@gmail.com>
@amueller amueller merged commit deec2ba into scikit-learn:master Aug 7, 2019
15 checks passed
15 checks passed
LGTM analysis: C/C++ No code changes detected
Details
LGTM analysis: JavaScript No code changes detected
Details
LGTM analysis: Python 1 new alert
Details
ci/circleci: deploy Your tests passed on CircleCI!
Details
ci/circleci: doc Your tests passed on CircleCI!
Details
ci/circleci: doc-min-dependencies Your tests passed on CircleCI!
Details
ci/circleci: lint Your tests passed on CircleCI!
Details
scikit-learn.scikit-learn Build #20190807.30 succeeded
Details
scikit-learn.scikit-learn (Linux py35_conda_openblas) Linux py35_conda_openblas succeeded
Details
scikit-learn.scikit-learn (Linux py35_ubuntu_atlas) Linux py35_ubuntu_atlas succeeded
Details
scikit-learn.scikit-learn (Linux pylatest_conda_mkl_pandas) Linux pylatest_conda_mkl_pandas succeeded
Details
scikit-learn.scikit-learn (Linux32 py35_ubuntu_atlas_32bit) Linux32 py35_ubuntu_atlas_32bit succeeded
Details
scikit-learn.scikit-learn (Windows py35_pip_openblas_32bit) Windows py35_pip_openblas_32bit succeeded
Details
scikit-learn.scikit-learn (Windows py37_conda_mkl) Windows py37_conda_mkl succeeded
Details
scikit-learn.scikit-learn (macOS pylatest_conda_mkl) macOS pylatest_conda_mkl succeeded
Details
@amueller

This comment has been minimized.

Copy link
Member

amueller commented Aug 7, 2019

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.