New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG+2] Added n_components parameter to LatentDirichletAllocation to replace … #8922
Conversation
…n_topics, which is still kept for backward compatibility. Signed-off-by: Michael Bargatin <attractadore02@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
sklearn/decomposition/online_lda.py
Outdated
@@ -277,13 +280,18 @@ def __init__(self, n_topics=10, doc_topic_prior=None, | |||
self.n_jobs = n_jobs | |||
self.verbose = verbose | |||
self.random_state = random_state | |||
if n_components == 10 and n_topics != 10: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our convention is to do this with fit so it also applies when set_params is used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although you should use an additional variable rather than overwrite the parameter. A bit bureaucratic, sorry...
sklearn/decomposition/online_lda.py
Outdated
@@ -277,13 +280,18 @@ def __init__(self, n_topics=10, doc_topic_prior=None, | |||
self.n_jobs = n_jobs | |||
self.verbose = verbose | |||
self.random_state = random_state | |||
if n_components == 10 and n_topics != 10: | |||
self.n_components = n_topics | |||
warnings.warn("n_topics has been deprecated in favor of n_components", DeprecationWarning) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indicate when it will be removed. See developers' docs
sklearn/decomposition/online_lda.py
Outdated
@@ -241,6 +241,9 @@ class LatentDirichletAllocation(BaseEstimator, TransformerMixin): | |||
|
|||
n_iter_ : int | |||
Number of passes over the dataset. | |||
|
|||
n_topics : int, optional (default=10) | |||
Same as n_components, kept for backward compatibility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the Sphinx deprecated markup
Signed-off-by: Michael Bargatin <attractadore02@gmail.com>
sklearn/decomposition/online_lda.py
Outdated
|
||
def _check_params(self): | ||
"""Check model parameters.""" | ||
if self.n_topics is not None: | ||
self.n_components = self.n_topics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should be overwriting a parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I should add a private variable that gets assigned either the value of n_topics or n_components, instead of overwriting n_components, yes?
Signed-off-by: Michael Bargatin <attractadore02@gmail.com>
yes, a local variable is one option. a private property is another
…On 24 May 2017 5:51 pm, "Attractadore" ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In sklearn/decomposition/online_lda.py
<#8922 (comment)>
:
>
def _check_params(self):
"""Check model parameters."""
+ if self.n_topics is not None:
+ self.n_components = self.n_topics
So I should add a private variable that gets assigned either the value of
n_topics or n_components, instead of overwriting n_components, yes?
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#8922 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEz62q0tcDnM1RdIw1OqLk5U5zxiOFIks5r89eBgaJpZM4Nj3oW>
.
|
Signed-off-by: Michael Bargatin <attractadore02@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the code LGTM, but we need to modify tests and examples that use n_topics
. We should also add a small test that checks the deprecation warning fires upon fit for n_topics
being used.
Thanks!
I didn't quite mean that the code looks good, just that it's the right shape. You have a few tests failing, including errors like:
|
Signed-off-by: Michael Bargatin <attractadore02@gmail.com>
On which test does this happen? |
I'm not sure what you're asking as you look like you've modified a whole lot already. Did you use git grep n_topics? |
No I manually went though the file and changed all occurances of n_topics to n_components, then added |
It looks like the remaining file to edit
is examples/applications/plot_topics_extraction_with_nmf_lda.py
…On 25 May 2017 at 17:19, Attractadore ***@***.***> wrote:
No I manually went though the file and changed all occurances of n_topics
to n_components, then added
a test to check if a deprecation warning is thrown.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#8922 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEz60bowVHP5CudopouuN5QwptC9-Pxks5r9SuagaJpZM4Nj3oW>
.
|
n_components in LatentDirichletAllocation Signed-off-by: Michael Bargatin <attractadore02@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise LGTM
def test_lda_n_topics_deprecation(): | ||
n_components, X = _build_sparse_mtx() | ||
lda = LatentDirichletAllocation(n_topics=10) | ||
with warnings.catch_warnings(record=True) as warning: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use our assert_warns helper
|
||
def test_lda_n_topics_deprecation(): | ||
n_components, X = _build_sparse_mtx() | ||
lda = LatentDirichletAllocation(n_topics=10) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use learning_method='batch'
to avoid a second warning being produced.
Signed-off-by: Michael Bargatin <attractadore02@gmail.com>
LGTM as well. The only thing causing an issue is flake because some lines are using slightly more than 79 characters. Thanks for the contribution! |
Thanks for the patience!
…On Fri, May 26, 2017 at 9:07 AM, Jacob Schreiber ***@***.***> wrote:
Merged #8922 <#8922>.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#8922 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/Abjnipm_AD5N0xMSKSl7q05sIf1SWqUKks5r9mwugaJpZM4Nj3oW>
.
|
Looks like you'll have to resubmit the PR and fix the Flake8 issues because it was causing the build to fail. I'm waiting on CI to revert the changes. |
Sorry, @Attractadore, we forgot to ask you to add an entry in whats_new for this. We need it mentioned under API changes. Would you like us to credit you by some name other than your username? Indeed, you could submit your own PR with the change to whats_new. |
No, I don't really care about the credit (I still haven't done anything
major), but thanks for asking. By the way, do you have any other issues I
could look at?
…On Sun, May 28, 2017 at 2:18 PM, Joel Nothman ***@***.***> wrote:
Sorry, @Attractadore <https://github.com/attractadore>, we forgot to ask
you to add an entry in whats_new for this. We need it mentioned under API
changes. Would you like us to credit you by some name other than your
username? Indeed, you could submit your own PR with the change to whats_new.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#8922 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/Abjnivs2o54gGtaWPjYtns9HwT6LeOp4ks5r-VgTgaJpZM4Nj3oW>
.
|
Well, we need to add an entry in the API Changes section anyway, with or
without credit.
…On 28 May 2017 at 21:32, Attractadore ***@***.***> wrote:
No, I don't really care about the credit (I still haven't done anything
major), but thanks for asking. By the way, do you have any other issues I
could look at?
On Sun, May 28, 2017 at 2:18 PM, Joel Nothman ***@***.***>
wrote:
> Sorry, @Attractadore <https://github.com/attractadore>, we forgot to ask
> you to add an entry in whats_new for this. We need it mentioned under API
> changes. Would you like us to credit you by some name other than your
> username? Indeed, you could submit your own PR with the change to
whats_new.
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub
> <#8922#
issuecomment-304507584>,
> or mute the thread
> <https://github.com/notifications/unsubscribe-auth/
Abjnivs2o54gGtaWPjYtns9HwT6LeOp4ks5r-VgTgaJpZM4Nj3oW>
> .
>
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#8922 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEz64Tu-tubEviFYS_GwMWpUhTwhDZRks5r-VtUgaJpZM4Nj3oW>
.
|
What's new has been changed.
While you managed this PR very competently, I can't say what issues needing
contributors suit your abilities and interests. There are a few Pull
Requests with the Needs Contributor label, and I'd really like a few of
them merged soon; some need only a little more work.
…On 28 May 2017 at 21:58, Joel Nothman ***@***.***> wrote:
Well, we need to add an entry in the API Changes section anyway, with or
without credit.
On 28 May 2017 at 21:32, Attractadore ***@***.***> wrote:
> No, I don't really care about the credit (I still haven't done anything
> major), but thanks for asking. By the way, do you have any other issues I
> could look at?
>
> On Sun, May 28, 2017 at 2:18 PM, Joel Nothman ***@***.***>
> wrote:
>
> > Sorry, @Attractadore <https://github.com/attractadore>, we forgot to
> ask
> > you to add an entry in whats_new for this. We need it mentioned under
> API
> > changes. Would you like us to credit you by some name other than your
> > username? Indeed, you could submit your own PR with the change to
> whats_new.
> >
> > —
> > You are receiving this because you were mentioned.
> > Reply to this email directly, view it on GitHub
> > <#8922 (comment)
> ecomment-304507584>,
> > or mute the thread
> > <https://github.com/notifications/unsubscribe-auth/Abjnivs2o
> 54gGtaWPjYtns9HwT6LeOp4ks5r-VgTgaJpZM4Nj3oW>
> > .
> >
>
> —
> You are receiving this because you commented.
> Reply to this email directly, view it on GitHub
> <#8922 (comment)>,
> or mute the thread
> <https://github.com/notifications/unsubscribe-auth/AAEz64Tu-tubEviFYS_GwMWpUhTwhDZRks5r-VtUgaJpZM4Nj3oW>
> .
>
|
But where we're lacking atm is in core contributors (not in number, but in
availability) to review and merge (/ reject).
…On 28 May 2017 at 22:01, Joel Nothman ***@***.***> wrote:
What's new has been changed.
While you managed this PR very competently, I can't say what issues
needing contributors suit your abilities and interests. There are a few
Pull Requests with the Needs Contributor label, and I'd really like a few
of them merged soon; some need only a little more work.
On 28 May 2017 at 21:58, Joel Nothman ***@***.***> wrote:
> Well, we need to add an entry in the API Changes section anyway, with or
> without credit.
>
> On 28 May 2017 at 21:32, Attractadore ***@***.***> wrote:
>
>> No, I don't really care about the credit (I still haven't done anything
>> major), but thanks for asking. By the way, do you have any other issues I
>> could look at?
>>
>> On Sun, May 28, 2017 at 2:18 PM, Joel Nothman ***@***.***>
>> wrote:
>>
>> > Sorry, @Attractadore <https://github.com/attractadore>, we forgot to
>> ask
>> > you to add an entry in whats_new for this. We need it mentioned under
>> API
>> > changes. Would you like us to credit you by some name other than your
>> > username? Indeed, you could submit your own PR with the change to
>> whats_new.
>> >
>> > —
>> > You are receiving this because you were mentioned.
>> > Reply to this email directly, view it on GitHub
>> > <#8922 (comment)
>> ecomment-304507584>,
>> > or mute the thread
>> > <https://github.com/notifications/unsubscribe-auth/Abjnivs2o
>> 54gGtaWPjYtns9HwT6LeOp4ks5r-VgTgaJpZM4Nj3oW>
>> > .
>> >
>>
>> —
>> You are receiving this because you commented.
>> Reply to this email directly, view it on GitHub
>> <#8922 (comment)>,
>> or mute the thread
>> <https://github.com/notifications/unsubscribe-auth/AAEz64Tu-tubEviFYS_GwMWpUhTwhDZRks5r-VtUgaJpZM4Nj3oW>
>> .
>>
>
>
|
Well I've got the whole summer ahead. Might end up helping you merge ;)
On Sun, May 28, 2017 at 3:02 PM, Joel Nothman <notifications@github.com>
wrote:
… But where we're lacking atm is in core contributors (not in number, but in
availability) to review and merge (/ reject).
On 28 May 2017 at 22:01, Joel Nothman ***@***.***> wrote:
> What's new has been changed.
>
> While you managed this PR very competently, I can't say what issues
> needing contributors suit your abilities and interests. There are a few
> Pull Requests with the Needs Contributor label, and I'd really like a few
> of them merged soon; some need only a little more work.
>
> On 28 May 2017 at 21:58, Joel Nothman ***@***.***> wrote:
>
>> Well, we need to add an entry in the API Changes section anyway, with or
>> without credit.
>>
>> On 28 May 2017 at 21:32, Attractadore ***@***.***> wrote:
>>
>>> No, I don't really care about the credit (I still haven't done anything
>>> major), but thanks for asking. By the way, do you have any other
issues I
>>> could look at?
>>>
>>> On Sun, May 28, 2017 at 2:18 PM, Joel Nothman <
***@***.***>
>>> wrote:
>>>
>>> > Sorry, @Attractadore <https://github.com/attractadore>, we forgot to
>>> ask
>>> > you to add an entry in whats_new for this. We need it mentioned under
>>> API
>>> > changes. Would you like us to credit you by some name other than your
>>> > username? Indeed, you could submit your own PR with the change to
>>> whats_new.
>>> >
>>> > —
>>> > You are receiving this because you were mentioned.
>>> > Reply to this email directly, view it on GitHub
>>> > <#8922 (comment)
>>> ecomment-304507584>,
>>> > or mute the thread
>>> > <https://github.com/notifications/unsubscribe-auth/Abjnivs2o
>>> 54gGtaWPjYtns9HwT6LeOp4ks5r-VgTgaJpZM4Nj3oW>
>>> > .
>>> >
>>>
>>> —
>>> You are receiving this because you commented.
>>> Reply to this email directly, view it on GitHub
>>> <#8922#
issuecomment-304508196>,
>>> or mute the thread
>>> <https://github.com/notifications/unsubscribe-auth/AAEz64Tu-tubEviFYS_
GwMWpUhTwhDZRks5r-VtUgaJpZM4Nj3oW>
>>> .
>>>
>>
>>
>
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#8922 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AbjnimCd9Rwys5GBTTR7fkT7JCHzMsWQks5r-WJtgaJpZM4Nj3oW>
.
|
…replace … (scikit-learn#8922) [MRG+2] Added n_components parameter to LatentDirichletAllocation to replace …
…tion to replace … (scikit-learn#8922)" (scikit-learn#8937) This reverts commit 890e652.
…replace … (scikit-learn#8922) [MRG+2] Added n_components parameter to LatentDirichletAllocation to replace …
…tion to replace … (scikit-learn#8922)" (scikit-learn#8937) This reverts commit 890e652.
…replace … (scikit-learn#8922) [MRG+2] Added n_components parameter to LatentDirichletAllocation to replace …
…tion to replace … (scikit-learn#8922)" (scikit-learn#8937) This reverts commit 890e652.
…replace … (scikit-learn#8922) [MRG+2] Added n_components parameter to LatentDirichletAllocation to replace …
…tion to replace … (scikit-learn#8922)" (scikit-learn#8937) This reverts commit 890e652.
…replace … (scikit-learn#8922) [MRG+2] Added n_components parameter to LatentDirichletAllocation to replace …
…tion to replace … (scikit-learn#8922)" (scikit-learn#8937) This reverts commit 890e652.
…replace … (scikit-learn#8922) [MRG+2] Added n_components parameter to LatentDirichletAllocation to replace …
…tion to replace … (scikit-learn#8922)" (scikit-learn#8937) This reverts commit 890e652.
…replace … (scikit-learn#8922) [MRG+2] Added n_components parameter to LatentDirichletAllocation to replace …
…tion to replace … (scikit-learn#8922)" (scikit-learn#8937) This reverts commit 890e652.
…replace … (scikit-learn#8922) [MRG+2] Added n_components parameter to LatentDirichletAllocation to replace …
…tion to replace … (scikit-learn#8922)" (scikit-learn#8937) This reverts commit 890e652.
…n_topics,
which is still kept for backward compatibility.
Signed-off-by: Michael Bargatin attractadore02@gmail.com
Reference Issue
What does this implement/fix? Explain your changes.
#8799
Any other comments?