[MRG+2] Added n_components parameter to LatentDirichletAllocation to replace … #8922

Merged
merged 7 commits into from May 26, 2017

3 participants
@Attractadore
Contributor

Attractadore commented May 23, 2017

…n_topics,

which is still kept for backward compatibility.

Signed-off-by: Michael Bargatin <attractadore02@gmail.com>

Reference Issue

What does this implement/fix? Explain your changes.

#8799

Any other comments?

Added n_components parameter to LatentDirichletAllocation to replace …
…n_topics,

which is still kept for backward compatibility.

Signed-off-by: Michael Bargatin <attractadore02@gmail.com>
sklearn/decomposition/online_lda.py
@@ -277,13 +280,18 @@ def __init__(self, n_topics=10, doc_topic_prior=None,
self.n_jobs = n_jobs
self.verbose = verbose
self.random_state = random_state
+ if n_components == 10 and n_topics != 10:

@jnothman

jnothman May 23, 2017

Member

Our convention is to do this with fit so it also applies when set_params is used

@jnothman

jnothman May 23, 2017

Member

Although you should use an additional variable rather than overwrite the parameter. A bit bureaucratic, sorry...
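
One way to read the two comments above, as a minimal sketch: `__init__` only stores the arguments, and `fit` resolves the deprecated name into a separate private attribute. `ToyLDA` and its simplified `set_params` are hypothetical stand-ins, not scikit-learn code, and the warning text is illustrative.

```python
import warnings


class ToyLDA:
    """Hypothetical stand-in for an estimator with a renamed parameter."""

    def __init__(self, n_components=10, n_topics=None):
        # __init__ only stores the arguments; no validation happens here.
        self.n_components = n_components
        self.n_topics = n_topics

    def set_params(self, **params):
        # Simplified version of the estimator API: plain attribute
        # assignment, which never goes through __init__.
        for name, value in params.items():
            setattr(self, name, value)
        return self

    def fit(self, X=None):
        # Resolve the effective value into a separate private attribute so
        # the public parameters stay exactly as the user set them.
        if self.n_topics is not None:
            warnings.warn(
                "n_topics is deprecated; use n_components instead",
                DeprecationWarning,
            )
            self._n_components = self.n_topics
        else:
            self._n_components = self.n_components
        return self


# The deprecated name is set after construction, so a check in __init__
# would have missed it; the check in fit still sees it.
model = ToyLDA().set_params(n_topics=5)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    model.fit()
```

After `fit`, the public `n_components` still holds the value the user passed, while the private attribute carries the value actually used.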

sklearn/decomposition/online_lda.py
@@ -277,13 +280,18 @@ def __init__(self, n_topics=10, doc_topic_prior=None,
self.n_jobs = n_jobs
self.verbose = verbose
self.random_state = random_state
+ if n_components == 10 and n_topics != 10:
+ self.n_components = n_topics
+ warnings.warn("n_topics has been deprecated in favor of n_components", DeprecationWarning)

@jnothman

jnothman May 23, 2017

Member

Indicate when it will be removed. See developers' docs
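
A message that follows this advice names both the release that deprecated the parameter and the release that will remove it. A minimal sketch, assuming a hypothetical helper `warn_renamed`; the version strings "A.B" and "C.D" are placeholders, not the values used in this PR.

```python
import warnings


def warn_renamed(old, new, since, removal):
    """Warn that parameter `old` was renamed to `new` (hypothetical helper)."""
    message = (
        f"{old} has been renamed to {new} in version {since} "
        f"and will be removed in {removal}"
    )
    # stacklevel=2 points the warning at the caller rather than this helper.
    warnings.warn(message, DeprecationWarning, stacklevel=2)
    return message


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    text = warn_renamed("n_topics", "n_components", "A.B", "C.D")
```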

sklearn/decomposition/online_lda.py
@@ -241,6 +241,9 @@ class LatentDirichletAllocation(BaseEstimator, TransformerMixin):
n_iter_ : int
Number of passes over the dataset.
+
+ n_topics : int, optional (default=10)
+ Same as n_components, kept for backward compatibility

@jnothman

jnothman May 23, 2017

Member

Use the Sphinx deprecated markup
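
For illustration, the Sphinx `.. deprecated::` directive can sit inside the parameter entry of a numpydoc-style docstring. This is a sketch on a hypothetical class; the version strings "A.B" and "C.D" are placeholders and the wording is not taken from the merged docstring.

```python
class ToyLDA:
    """Hypothetical estimator used only to show the docstring markup.

    Parameters
    ----------
    n_components : int, optional (default=10)
        Number of topics.

    n_topics : int, optional (default=None)
        Renamed to ``n_components``.

        .. deprecated:: A.B
           Use ``n_components`` instead; ``n_topics`` will be removed
           in version C.D.
    """

    def __init__(self, n_components=10, n_topics=None):
        self.n_components = n_components
        self.n_topics = n_topics
```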

Applied a few changes
Signed-off-by: Michael Bargatin <attractadore02@gmail.com>
sklearn/decomposition/online_lda.py
def _check_params(self):
"""Check model parameters."""
+ if self.n_topics is not None:
+ self.n_components = self.n_topics

@jnothman

jnothman May 24, 2017

Member

I don't think we should be overwriting a parameter

@Attractadore

Attractadore May 24, 2017

Contributor

So I should add a private variable that gets assigned either the value of n_topics or n_components, instead of overwriting n_components, yes?

Changed info in docstring.
Signed-off-by: Michael Bargatin <attractadore02@gmail.com>
Member

jnothman commented May 24, 2017

Added _n_components to LatentDirichletAllocation
Signed-off-by: Michael Bargatin <attractadore02@gmail.com>
@jnothman

The code LGTM, but we need to modify the tests and examples that use n_topics. We should also add a small test that checks that the deprecation warning fires upon fit when n_topics is used.

Thanks!

Member

jnothman commented May 24, 2017

I didn't quite mean that the code looks good, just that it's the right shape. You have a few tests failing, including errors like:


  File "/home/travis/build/scikit-learn/scikit-learn/sklearn/decomposition/online_lda.py", line 296, in _check_params
    if self._n_components <= 0:
TypeError: '<=' not supported between instances of 'NoneType' and 'int'
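
This kind of error appears when the value is still `None` at the point where it is compared with an integer. A hedged sketch of one way to guard against it is to resolve the value before validating it; `resolve_n_components` is a hypothetical helper, and the fix actually applied in this PR is not shown in the thread.

```python
def resolve_n_components(n_components, n_topics=None):
    """Return the effective number of components (hypothetical helper)."""
    # Prefer the deprecated name only when the caller actually set it.
    value = n_topics if n_topics is not None else n_components
    if value is None:
        raise ValueError("n_components must be set")
    # Only now is it safe to compare against an integer.
    if value <= 0:
        raise ValueError(f"Invalid n_components: {value!r}")
    return value


# Comparing None with an int is what raises the TypeError in Python 3.
try:
    None <= 0
except TypeError as exc:
    error_text = str(exc)
```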
Added test for n_topics deprecation
Signed-off-by: Michael Bargatin <attractadore02@gmail.com>
Contributor

Attractadore commented May 25, 2017

On which test does this happen?

Member

jnothman commented May 25, 2017

I'm not sure what you're asking, as it looks like you've already modified a whole lot. Did you use git grep n_topics?

Contributor

Attractadore commented May 25, 2017

No, I manually went through the file and changed all occurrences of n_topics to n_components, then added
a test to check if a deprecation warning is thrown.

Member

jnothman commented May 25, 2017

Changed usage examples to reflect the replacement of n_topics with
n_components in LatentDirichletAllocation

Signed-off-by: Michael Bargatin <attractadore02@gmail.com>
@jnothman

Otherwise LGTM

+def test_lda_n_topics_deprecation():
+ n_components, X = _build_sparse_mtx()
+ lda = LatentDirichletAllocation(n_topics=10)
+ with warnings.catch_warnings(record=True) as warning:

@jnothman

jnothman May 25, 2017

Member

Please use our assert_warns helper
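
As a rough illustration of what such a helper does, here is a stdlib-only stand-in. It is written for this example and is not the project's `assert_warns` implementation; `fit_with_deprecated_name` is a hypothetical function.

```python
import warnings


def assert_warns(category, func, *args, **kwargs):
    """Call func and fail unless it emits a warning of `category`.

    Stand-in written for this example; not scikit-learn's helper.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = func(*args, **kwargs)
    if not any(issubclass(w.category, category) for w in caught):
        raise AssertionError(f"{category.__name__} was not raised")
    return result


def fit_with_deprecated_name(n_topics):
    # Hypothetical function that warns about the old parameter name.
    warnings.warn("n_topics is deprecated", DeprecationWarning)
    return n_topics


result = assert_warns(DeprecationWarning, fit_with_deprecated_name, 10)
```

Compared with a hand-written `warnings.catch_warnings` block in every test, a helper of this shape keeps the filter setup and the category check in one place.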

+
+def test_lda_n_topics_deprecation():
+ n_components, X = _build_sparse_mtx()
+ lda = LatentDirichletAllocation(n_topics=10)

@jnothman

jnothman May 25, 2017

Member

Use learning_method='batch' to avoid a second warning being produced.

@jnothman jnothman changed the title from [MRG] Added n_components parameter to LatentDirichletAllocation to replace … to [MRG+1] Added n_components parameter to LatentDirichletAllocation to replace … May 25, 2017

Used assert_warns to check for DeprecationWarning
Signed-off-by: Michael Bargatin <attractadore02@gmail.com>
Member

jmschrei commented May 26, 2017

LGTM as well. The only thing causing an issue is flake8, because some lines are slightly longer than 79 characters. Thanks for the contribution!
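
Long calls such as the `warnings.warn(...)` line in the diff above can be brought under the limit with implicit string concatenation inside the parentheses. A minimal sketch; the message text is copied from that diff, and the function name `warn_n_topics` is hypothetical.

```python
import warnings


def warn_n_topics():
    # Adjacent string literals are joined at compile time, so the message
    # is unchanged while every source line stays within 79 characters.
    warnings.warn(
        "n_topics has been deprecated in favor of "
        "n_components",
        DeprecationWarning,
    )


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_n_topics()
```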

@jmschrei jmschrei changed the title from [MRG+1] Added n_components parameter to LatentDirichletAllocation to replace … to [MRG+2] Added n_components parameter to LatentDirichletAllocation to replace … May 26, 2017

@jmschrei jmschrei merged commit 890e652 into scikit-learn:master May 26, 2017

2 of 3 checks passed

continuous-integration/travis-ci/pr The Travis CI build failed
ci/circleci Your tests passed on CircleCI!
continuous-integration/appveyor/pr AppVeyor build succeeded

jmschrei added a commit that referenced this pull request May 26, 2017

Contributor

Attractadore commented May 26, 2017

Member

jmschrei commented May 26, 2017

Looks like you'll have to resubmit the PR and fix the Flake8 issues, because they were causing the build to fail. I'm waiting on CI to revert the changes.

jmschrei added a commit that referenced this pull request May 26, 2017

Member

jnothman commented May 28, 2017

Sorry, @Attractadore, we forgot to ask you to add an entry in whats_new for this. We need it mentioned under API changes. Would you like us to credit you by some name other than your username? Indeed, you could submit your own PR with the change to whats_new.

Contributor

Attractadore commented May 28, 2017

Member

jnothman commented May 28, 2017

Member

jnothman commented May 28, 2017

Member

jnothman commented May 28, 2017

Contributor

Attractadore commented May 28, 2017

Sundrique added a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017

[MRG+1] Added n_components parameter to LatentDirichletAllocation to …
…replace … (#8922)

[MRG+2] Added n_components parameter to LatentDirichletAllocation to replace …


dmohns added a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017

[MRG+1] Added n_components parameter to LatentDirichletAllocation to …
…replace … (#8922)

[MRG+2] Added n_components parameter to LatentDirichletAllocation to replace …


NelleV added a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017

[MRG+1] Added n_components parameter to LatentDirichletAllocation to …
…replace … (#8922)

[MRG+2] Added n_components parameter to LatentDirichletAllocation to replace …


paulha added a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017

[MRG+1] Added n_components parameter to LatentDirichletAllocation to …
…replace … (#8922)

[MRG+2] Added n_components parameter to LatentDirichletAllocation to replace …


AishwaryaRK added a commit to AishwaryaRK/scikit-learn that referenced this pull request Aug 29, 2017

[MRG+1] Added n_components parameter to LatentDirichletAllocation to …
…replace … (#8922)

[MRG+2] Added n_components parameter to LatentDirichletAllocation to replace …


maskani-moh added a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017

[MRG+1] Added n_components parameter to LatentDirichletAllocation to …
…replace … (#8922)

[MRG+2] Added n_components parameter to LatentDirichletAllocation to replace …


jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017

[MRG+1] Added n_components parameter to LatentDirichletAllocation to …
…replace … (#8922)

[MRG+2] Added n_components parameter to LatentDirichletAllocation to replace …

