[MRG+2] Added n_components parameter to LatentDirichletAllocation to replace … #8922

Merged
merged 7 commits into from May 26, 2017

Conversation

Attractadore
Contributor

…n_topics,

which is still kept for backward compatibility.

Signed-off-by: Michael Bargatin attractadore02@gmail.com

Reference Issue

What does this implement/fix? Explain your changes.

#8799

Any other comments?

…n_topics,

which is still kept for backward compatibility.

Signed-off-by: Michael Bargatin <attractadore02@gmail.com>
Member

@jnothman jnothman left a comment

Thanks

@@ -277,13 +280,18 @@ def __init__(self, n_topics=10, doc_topic_prior=None,
        self.n_jobs = n_jobs
        self.verbose = verbose
        self.random_state = random_state
        if n_components == 10 and n_topics != 10:
Member

Our convention is to do this in fit, so that it also applies when set_params is used.

Member

Although you should use an additional variable rather than overwrite the parameter. A bit bureaucratic, sorry...
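
For readers following along, the two suggestions above (handle the deprecated alias in fit, and store the result in an extra variable) could look roughly like the sketch below. ToyLDA, its set_params, and the private _n_components attribute are hypothetical stand-ins written for illustration; this is not the code that was merged.

```python
import warnings


class ToyLDA:
    """Hypothetical stand-in for an estimator; not the scikit-learn class."""

    def __init__(self, n_components=10, n_topics=None):
        # __init__ only stores its arguments, so values supplied later
        # through set_params() are treated exactly the same way.
        self.n_components = n_components
        self.n_topics = n_topics

    def set_params(self, **params):
        # Simplified version of the usual estimator API.
        for name, value in params.items():
            setattr(self, name, value)
        return self

    def fit(self, X=None):
        # Resolve the deprecated alias at fit time into a private
        # attribute, leaving the public parameters untouched.
        if self.n_topics is not None:
            warnings.warn(
                "n_topics has been renamed to n_components",
                DeprecationWarning,
            )
            self._n_components = self.n_topics
        else:
            self._n_components = self.n_components
        return self


est = ToyLDA().set_params(n_topics=5)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    est.fit()
```

Because nothing happens in __init__, the warning fires whether n_topics came from the constructor or from set_params, and est.n_components still holds the value the user passed.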

@@ -277,13 +280,18 @@ def __init__(self, n_topics=10, doc_topic_prior=None,
        self.n_jobs = n_jobs
        self.verbose = verbose
        self.random_state = random_state
        if n_components == 10 and n_topics != 10:
            self.n_components = n_topics
            warnings.warn("n_topics has been deprecated in favor of n_components", DeprecationWarning)
Member

Indicate when it will be removed. See the developers' docs.
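
A minimal sketch of a warning message that names the removal version, as requested above. The helper warn_renamed and the version strings "X.Y" and "X.Z" are placeholders invented for this illustration; the thread itself does not state the actual versions.

```python
import warnings


def warn_renamed(old, new, since, removed_in):
    """Emit a DeprecationWarning that names the removal version."""
    message = (
        f"{old} has been renamed to {new} in version {since} "
        f"and will be removed in {removed_in}."
    )
    # stacklevel=2 points the warning at the caller, not this helper.
    warnings.warn(message, DeprecationWarning, stacklevel=2)
    return message


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Placeholder version strings, chosen only for illustration.
    text = warn_renamed("n_topics", "n_components", "X.Y", "X.Z")
```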

@@ -241,6 +241,9 @@ class LatentDirichletAllocation(BaseEstimator, TransformerMixin):

    n_iter_ : int
        Number of passes over the dataset.

    n_topics : int, optional (default=10)
        Same as n_components, kept for backward compatibility
Member

Use the Sphinx deprecated markup
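
As a rough illustration of the markup being asked for, a numpydoc-style docstring can carry a Sphinx `.. deprecated::` directive under the old parameter. ToyLDA and the versions "X.Y" and "X.Z" are placeholders assumed for this sketch, not the text that ended up in the merged docstring.

```python
class ToyLDA:
    """Hypothetical estimator used only to illustrate the docstring markup.

    Parameters
    ----------
    n_components : int, optional (default=10)
        Number of topics.

    n_topics : int, optional (default=None)
        Renamed to ``n_components``.

        .. deprecated:: X.Y
            Use ``n_components`` instead; removal is planned for X.Z.
    """

    def __init__(self, n_components=10, n_topics=None):
        self.n_components = n_components
        self.n_topics = n_topics
```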

Signed-off-by: Michael Bargatin <attractadore02@gmail.com>

    def _check_params(self):
        """Check model parameters."""
        if self.n_topics is not None:
            self.n_components = self.n_topics
Member

I don't think we should be overwriting a parameter

Contributor Author

So I should add a private variable that gets assigned either the value of n_topics or n_components, instead of overwriting n_components, yes?

Signed-off-by: Michael Bargatin <attractadore02@gmail.com>
@jnothman
Member

jnothman commented May 24, 2017 via email

Signed-off-by: Michael Bargatin <attractadore02@gmail.com>
Member

@jnothman jnothman left a comment

The code LGTM, but we need to modify the tests and examples that use n_topics. We should also add a small test that checks that the deprecation warning fires on fit when n_topics is used.

Thanks!

@jnothman
Member

jnothman commented May 24, 2017

I didn't quite mean that the code looks good, just that it's the right shape. You have a few tests failing, including errors like:


  File "/home/travis/build/scikit-learn/scikit-learn/sklearn/decomposition/online_lda.py", line 296, in _check_params
    if self._n_components <= 0:
TypeError: '<=' not supported between instances of 'NoneType' and 'int'
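
The traceback above is what Python 3 produces when None is compared with an integer. The thread does not show the failing code path, so the following is only a guess at the shape of the problem: if the private value is still None when the check runs, the comparison fails, whereas resolving it first avoids the error. Both helper functions are hypothetical.

```python
def resolve_n_components(n_components, n_topics=None):
    """Return the effective topic count, preferring the deprecated alias."""
    # Assumption: None means "n_topics was not passed".
    return n_components if n_topics is None else n_topics


def check_positive(value, name="n_components"):
    """Raise ValueError for non-positive values, as a params check might."""
    if value <= 0:
        raise ValueError(f"Invalid {name!r} parameter: {value!r}")


# Comparing an unresolved None reproduces the TypeError from the traceback.
try:
    check_positive(None)
    error = None
except TypeError as exc:
    error = exc

# Resolving first means the check sees an integer in this call.
resolved = resolve_n_components(10, n_topics=None)
check_positive(resolved)
```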

Signed-off-by: Michael Bargatin <attractadore02@gmail.com>
@Attractadore
Contributor Author

On which test does this happen?

@jnothman
Member

I'm not sure what you're asking, as it looks like you've modified a whole lot already. Did you use git grep n_topics?

@Attractadore
Contributor Author

No, I manually went through the file and changed all occurrences of n_topics to n_components, then added
a test to check that a deprecation warning is raised.

@jnothman
Member

jnothman commented May 25, 2017 via email

n_components in LatentDirichletAllocation

Signed-off-by: Michael Bargatin <attractadore02@gmail.com>
Member

@jnothman jnothman left a comment

Otherwise LGTM

def test_lda_n_topics_deprecation():
    n_components, X = _build_sparse_mtx()
    lda = LatentDirichletAllocation(n_topics=10)
    with warnings.catch_warnings(record=True) as warning:
Member

Please use our assert_warns helper


def test_lda_n_topics_deprecation():
    n_components, X = _build_sparse_mtx()
    lda = LatentDirichletAllocation(n_topics=10)
Member

Use learning_method='batch' to avoid a second warning being produced.
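
To show why the explicit learning_method matters for the test, here is a standard-library sketch. It assumes the second warning comes from leaving learning_method unset, which the thread implies but does not spell out. ToyLDA and collect_warnings are hypothetical; the project's assert_warns helper is not reproduced here.

```python
import warnings


class ToyLDA:
    """Hypothetical toy estimator, not LatentDirichletAllocation."""

    def __init__(self, n_components=10, n_topics=None,
                 learning_method=None):
        self.n_components = n_components
        self.n_topics = n_topics
        self.learning_method = learning_method

    def fit(self, X=None):
        if self.n_topics is not None:
            warnings.warn("n_topics is deprecated", DeprecationWarning)
        if self.learning_method is None:
            # Assumed source of the "second warning" in the review.
            warnings.warn(
                "default learning_method will change",
                DeprecationWarning,
            )
        return self


def collect_warnings(func, *args, **kwargs):
    """Call func and return the list of warnings it raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func(*args, **kwargs)
    return caught


noisy = collect_warnings(ToyLDA(n_topics=10).fit)
quiet = collect_warnings(
    ToyLDA(n_topics=10, learning_method="batch").fit
)
```

With the method set explicitly, only the n_topics warning is recorded, so a test that expects exactly one warning is not confused by an unrelated one.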

@jnothman jnothman changed the title [MRG] Added n_components parameter to LatentDirichletAllocation to replace … [MRG+1] Added n_components parameter to LatentDirichletAllocation to replace … May 25, 2017
Signed-off-by: Michael Bargatin <attractadore02@gmail.com>
@jmschrei
Member

LGTM as well. The only remaining issue is flake8: some lines are slightly longer than 79 characters. Thanks for the contribution!

@jmschrei jmschrei changed the title [MRG+1] Added n_components parameter to LatentDirichletAllocation to replace … [MRG+2] Added n_components parameter to LatentDirichletAllocation to replace … May 26, 2017
@jmschrei jmschrei merged commit 890e652 into scikit-learn:master May 26, 2017
jmschrei added a commit that referenced this pull request May 26, 2017
@Attractadore
Contributor Author

Attractadore commented May 26, 2017 via email

@jmschrei
Member

Looks like you'll have to resubmit the PR and fix the Flake8 issues because it was causing the build to fail. I'm waiting on CI to revert the changes.

jmschrei added a commit that referenced this pull request May 26, 2017
@jnothman
Member

Sorry, @Attractadore, we forgot to ask you to add an entry in whats_new for this. We need it mentioned under API changes. Would you like us to credit you by some name other than your username? Alternatively, you could submit your own PR with the change to whats_new.

@Attractadore
Contributor Author

Attractadore commented May 28, 2017 via email

@jnothman
Member

jnothman commented May 28, 2017 via email

@jnothman
Member

jnothman commented May 28, 2017 via email

@jnothman
Member

jnothman commented May 28, 2017 via email

@Attractadore
Contributor Author

Attractadore commented May 28, 2017 via email

Sundrique pushed a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017
dmohns pushed a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017
NelleV pushed a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017
paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
AishwaryaRK pushed a commit to AishwaryaRK/scikit-learn that referenced this pull request Aug 29, 2017
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017
3 participants