
ENH Improve initialization and learning rate in t-SNE #19491

Merged
merged 71 commits on Apr 26, 2021

Conversation

dkobak
Contributor

@dkobak dkobak commented Feb 18, 2021

This implements suggestions from #18018 (see there for some discussion):

  1. Clarifies the documentation of learning_rate (scikit-learn's definition differs from all other implementations by a factor of 4).
  2. Scales PCA initialization to have the same std as the random initialization. (Update: only issues future warning for now. Would change in v1.2.)
  3. Issues future warning that PCA initialization will become default in v1.2.
  4. Implements learning_rate='auto' that scales the learning rate with the sample size.
  5. Issues future warning that learning_rate='auto' will become default in v1.2.
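The learning_rate='auto' heuristic from points 4 and 5 above can be sketched as follows (a minimal illustration, not the exact scikit-learn code; the factor-of-4 division matches scikit-learn's gradient convention, and early_exaggeration defaults to 12.0 in TSNE):

```python
def auto_learning_rate(n_samples, early_exaggeration=12.0):
    """Heuristic from Belkina et al. 2019 / Kobak et al. 2019:
    lr = N / early_exaggeration, divided by 4 for scikit-learn's
    gradient convention, and floored at 50.0 for small datasets."""
    return max(n_samples / early_exaggeration / 4.0, 50.0)

print(auto_learning_rate(70000))  # MNIST-sized N: ~1458.3
print(auto_learning_rate(1000))   # small N hits the floor: 50.0
```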

I would still have to implement unit tests for future warnings (haven't done it before) and add the changes to whats_new (not quite sure which of the above changes need to be mentioned there). But I'd like to get some feedback from the core developers about whether these suggested changes are all fine. @TomDLT @ogrisel

Update: tests added, changes added.

Member

@TomDLT TomDLT left a comment


Thanks for the pull request!

(6 inline review comments on sklearn/manifold/_t_sne.py — resolved)
@dkobak
Contributor Author

dkobak commented Feb 23, 2021

@TomDLT I fixed and added t-SNE tests and they seem to be working fine, but something is failing in doctest:

================================== FAILURES ===================================
____________________ [doctest] sklearn.manifold._t_sne.TSNE ____________________
...
UNEXPECTED EXCEPTION: FutureWarning("The default initialization in TSNE will change from 'random' to 'pca' in 1.2.")

and I don't know where to fix it :-/
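For context, scikit-learn's test configuration turns warnings into errors, so a docstring example that triggers the new FutureWarning fails as an "unexpected exception". A self-contained sketch of the mechanism, using a hypothetical stand-in for TSNE.fit (not the actual scikit-learn code):

```python
import warnings

def fit_with_default(init="warn"):
    # Hypothetical stand-in for TSNE.fit: warns when the default is used.
    if init == "warn":
        warnings.warn("The default initialization in TSNE will change "
                      "from 'random' to 'pca' in 1.2.", FutureWarning)
        init = "random"
    return init

# Doctest runs treat warnings as errors, so the default raises:
warnings.simplefilter("error", FutureWarning)
try:
    fit_with_default()
except FutureWarning as e:
    print("doctest would fail with:", e)

# Passing the parameter explicitly keeps the example warning-free:
warnings.resetwarnings()
assert fit_with_default(init="random") == "random"
```

One common fix is therefore to pass the future value explicitly inside the docstring example, so the doctest never hits the default path.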

@dkobak
Contributor Author

dkobak commented Apr 7, 2021

@thomasjpfan Thanks a lot for reviewing! I pushed your suggestions and added TODO comments and docstrings as you suggested everywhere else too.

This PR is deprecating three things: two default values and the PCA SD change. I suggest removing the PCA SD change for now and keeping the deprecation for learning_rate and init. In the user guide (doc/modules/manifold.rst), we need to describe learning_rate='auto' with references. In a future PR we can update the user guide for the PCA SD change and add the warning + tests.

Sorry, I am not sure I understand the rationale here. I already have everything regarding PCA SD implemented here; what's the point of taking it out? Also, I would be uncomfortable setting PCA init as default if it does not get scaled to the correct SD. To be honest, I think these changes should go together.

Update: I mean, the SD change is currently implemented but commented out, because it should only go live in version 1.2. Not sure what's the better way to do it? I think the future warning should be happening already in this PR.

             # X_embedded = X_embedded / np.std(X_embedded[:, 0]) * 1e-4
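The commented-out line above rescales a PCA initialization so that the standard deviation of its first column matches the 1e-4 scale used by the random initialization. A self-contained sketch of the idea, computing the PCA projection via a plain SVD (an illustration, not the scikit-learn implementation):

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(200, 10)

# PCA initialization: project the centered data onto the top 2 components.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X_embedded = Xc @ Vt[:2].T

# Rescale so the std of the first column matches the random-init scale.
X_embedded = X_embedded / np.std(X_embedded[:, 0]) * 1e-4

print(np.std(X_embedded[:, 0]))  # 1e-4 (up to floating-point error)
```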

In the user guide (doc/modules/manifold.rst), we need to describe learning_rate='auto' with references.

This I can do. Update: done!

Also, what do you think about this suggested change (not yet implemented in this PR):

Oh, there is something I forgot to mention in the original issue: after implementing the learning_rate = n/12 heuristic in openTSNE and FIt-SNE we realized that 750 iterations is enough for all practical purposes and made n_iter=750 the default over there in both implementations (see a LONG discussion here KlugerLab/FIt-SNE#88).

So we could also adopt the same convention here, cutting down the number of default iterations from 1000 to 750. This of course would need to go through a deprecation cycle, together with the learning_rate='auto'. What do you think?

In case you think it's a good idea, I am wondering if the deprecation cycle needs to be implemented via n_iter='warn'. Given that this is tied to the learning_rate change, can the learning rate FutureWarning mention that the n_iter will change to 750 together with the future learning rate change? Without an additional n_iter future warning?

PS. Not sure why the milestone check is now suddenly failing...

@dkobak
Contributor Author

dkobak commented Apr 15, 2021

@thomasjpfan Could you clarify what you meant by "removing the PCA SD change for now"? See also my comment above for more considerations. Everything is fixed btw. Cheers!

Member

@thomasjpfan thomasjpfan left a comment


Could you clarify what you meant by "removing the PCA SD change for now"? See also my comment above for more considerations. Everything is fixed btw. Cheers!

What I meant was that the commented out PCA change can be its own pull request. But looking at this again, I am okay with leaving it in.

I can see that the deprecations in this PR are all related, but they could be done in three separate PRs. This makes review easier and helps with merging faster. In general, a PR with a bigger scope has a higher chance of something blocking it from merging.

So we could also adopt the same convention here, cutting down the number of default iterations from 1000 to 750. This of course would need to go through a deprecation cycle, together with the learning_rate='auto'. What do you think?

We can work on this in a follow up PR. This PR is already a net improvement as is.

As for the review, I left comments about using pytest.mark.filterwarnings instead of ignore_warnings that applies to all the tests. Otherwise this looks good to go.
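The pytest pattern suggested here scopes the warning filter to a single test rather than suppressing warnings for the whole module. A generic sketch (the marker syntax is standard pytest; the function names and warning text here are illustrative stand-ins, not scikit-learn code):

```python
import warnings
import pytest

def make_embedding():
    # Stand-in for code that emits the deprecation warning under test.
    warnings.warn("The default learning rate in TSNE will change "
                  "from 200.0 to 'auto' in 1.2.", FutureWarning)
    return 0.0

# The filter applies only to this one test; other tests still see
# (and can assert on) the FutureWarning.
@pytest.mark.filterwarnings("ignore:The default learning rate in TSNE")
def test_embedding_value():
    assert make_embedding() == 0.0
```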

Comment on lines 532 to 534
where N is the sample size, following Belkina et al. 2019 and
Kobak et al. 2019, Nature Communications (or to 50.0, if
N / early_exaggeration / 4 < 50). This will become default in 1.2.
Member


Can we move these references into the References section below and link them here?

Contributor Author


Makes sense, fixed.

(Inline review comments on sklearn/manifold/_t_sne.py and sklearn/manifold/tests/test_t_sne.py — resolved)
@dkobak
Contributor Author

dkobak commented Apr 16, 2021

@thomasjpfan Thanks a lot! I changed the handling of the future warnings. All checks pass.

We can work on this in a follow up PR. This PR is already a net improvement as is.

My suggestion is simply to replace

        if self.learning_rate == 'warn':
            # See issue #18018
            warnings.warn("The default learning rate in TSNE will change "
                          "from 200.0 to 'auto' in 1.2.", FutureWarning)

with

        if self.learning_rate == 'warn':
            # TODO: Change n_iter to 750 in 1.2.
            # See issue #18018
            warnings.warn("The default learning rate in TSNE will change "
                          "from 200.0 to 'auto' in 1.2. At the same time, "
                          "the default number of iterations will decrease "
                          "from 1000 to 750.", FutureWarning)

I think this does not deserve its own PR... There would be nothing else to do at this point, really. Or what do you think?

@TomDLT
Member

TomDLT commented Apr 16, 2021

I am not fully convinced that changing the default number of iterations from 1000 to 750 is necessary. It would probably benefit from a dedicated discussion in a small separate PR.

@dkobak
Contributor Author

dkobak commented Apr 16, 2021

I am not fully convinced that changing the default number of iterations from 1000 to 750 is necessary. It would probably benefit from a dedicated discussion in a small separate PR.

Fair enough. This change would not have any other consequences apart from decreasing the runtime by 25%. I'm fine merging this PR without this change if you guys prefer that.

@dkobak dkobak requested a review from thomasjpfan April 18, 2021 20:45
@thomasjpfan thomasjpfan changed the title Improve initialization and learning rate in t-SNE ENH Improve initialization and learning rate in t-SNE Apr 26, 2021
Member

@thomasjpfan thomasjpfan left a comment


Thank you for working on this issue @dkobak !

LGTM
