
[MRG] Add early exaggeration iterations as argument to t-SNE #12476

Open · wants to merge 3 commits into base: main

Conversation

cciccole

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Converting early exaggeration iterations from a private constant to a public parameter of the TSNE class. The default value for this new argument is set at 250 so that code using previous versions of sklearn will still get identical results if no action is taken to account for the change.

Being able to set this variable is an important component of running t-SNE, and it shouldn't be hidden. Other implementations, for example LvdM's (https://github.com/lvdmaaten/bhtsne/blob/master/tsne.h#L44-L45), expose it. In the LvdM case it's actually via two arguments, but sklearn doesn't distinguish between the momentum switch and stop lying, and it doesn't seem necessary to treat the two differently.

This change is motivated by recent work (https://doi.org/10.1101/451690) showing that a high-quality embedding can be achieved with far fewer than 250 early exaggeration iterations.
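For context, early exaggeration multiplies the input affinities during the first phase of the optimization. A minimal sketch of how a phase-length parameter like the proposed n_iter_early_exag controls the boundary between the two phases (hypothetical helper, not sklearn's actual internals):

```python
# Hedged sketch, not sklearn's internals: n_iter_early_exag sets how many
# of the n_iter gradient descent iterations run with the affinity matrix P
# multiplied by the early_exaggeration factor before it is removed.
def exaggeration_schedule(n_iter=1000, n_iter_early_exag=250,
                          early_exaggeration=12.0):
    """Per-iteration multiplier applied to the P matrix."""
    return [early_exaggeration if i < n_iter_early_exag else 1.0
            for i in range(n_iter)]

# With the proposed default of 250, behavior matches the old private constant:
schedule = exaggeration_schedule()
# A smaller value shortens the first phase, as in the cited work:
short = exaggeration_schedule(n_iter_early_exag=100)
```

The only change from the user's perspective is that the boundary between the two phases becomes tunable; the default reproduces the old hard-coded behavior.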

Any other comments?

Validated that results are identical before and after change for the same inputs.

@cciccole cciccole changed the title [MRG] Add early exaggeration iterations as argument [MRG] Add early exaggeration iterations as argument to t-SNE Oct 28, 2018
tighten up width in a couple places. Remove trailing whitespace.
Missed a spot.
@jnothman (Member) left a comment


Are you able to add a test that this at least has some effect, and that the validation is now correct?

@@ -518,15 +518,27 @@ class TSNE(BaseEstimator):
learning rate is too low, most points may look compressed in a dense
cloud with few outliers. If the cost function gets stuck in a bad local
minimum increasing the learning rate may help.
Some discussion on how to set learning rate optimally can be found
at https://doi.org/10.1101/451690. Effective use of this parameter has
Use the References section and ReST citation format instead.


Number of iterations out of the total n_iter that t-SNE should spend
in the early exaggeration phase. If embedding quality suffers as a
consequence of an increasing number of samples being embedded, increasing
this value and n_iter proportionately can help.

would it often be useful to specify this as a fraction of n_iter?

@cciccole (Author)

That is one way to do it, but I think it is better to specify the actual number, for two main reasons:

  1. It aligns with other existing implementations, which specify the equivalent of this argument as a number (see my note about the LvdM implementation above).
  2. It ensures that existing client code will get the same results as before, regardless of the number of iterations set with the n_iter argument. If we started using a fraction as the default, code with n_iter set to anything other than 1000 would silently use a different early exaggeration phase length.
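A quick numeric illustration of point 2 (the fractional variant is hypothetical, shown only for comparison; 0.25 is chosen so that n_iter=1000 still yields 250):

```python
# Hypothetical comparison, not actual sklearn behavior: an absolute default
# of 250 preserves old results for any n_iter, while a fractional default
# would silently change the phase length for every n_iter other than 1000.
def phase_len_absolute(n_iter, n_iter_early_exag=250):
    return n_iter_early_exag

def phase_len_fractional(n_iter, frac=0.25):
    return int(frac * n_iter)

for n_iter in (1000, 400, 5000):
    print(n_iter, phase_len_absolute(n_iter), phase_len_fractional(n_iter))
# prints:
# 1000 250 250   <- the two defaults agree only at the old n_iter default
# 400 250 100
# 5000 250 1250
```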

@eamanu (Contributor) commented Oct 30, 2018

Hi! IMHO this represents a change to the functionality and interface of t_sne, and that can cause some compatibility problems, right? Maybe this change could be introduced in a future version, with a deprecation message to let users know that t_sne will change.

@cciccole (Author)

> Hi! IMHO this represents a change to the functionality and interface of t_sne, and that can cause some compatibility problems, right? Maybe this change could be introduced in a future version, with a deprecation message to let users know that t_sne will change.

@eamanu Hello! Technically it modifies the interface, but not in a way that requires a deprecation warning. That's because it's just an optional argument being added, whose default value equals what the hidden constant was before. After this update, code written against the previous version will get identical results with no action necessary.

That being said, let me know if there are conventions or principles for this project that I'm not aware of that would require more careful treatment as you suggest.

@cciccole
Copy link
Author

@jnothman I'll look at addressing your other concerns, thanks.

@eamanu (Contributor) commented Oct 31, 2018

@cciccole yeah! You are right. IMO this change should be documented under a versionchanged directive.

@jnothman (Member) commented Nov 1, 2018 via email

@thomasjpfan (Member) left a comment


This needs a test backing the claim "Validated that results are identical before and after change for the same inputs."

-                 n_iter_without_progress=300, min_grad_norm=1e-7,
                  metric="euclidean", init="random", verbose=0,
                  random_state=None, method='barnes_hut', angle=0.5):
+                 n_iter_early_exag=250, n_iter_without_progress=300,
For backward compatibility, please place n_iter_early_exag at the end of the function signature.
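The reason for this convention can be sketched with simplified constructors (not sklearn's full signature): callers that pass arguments positionally would otherwise have their values silently bound to the wrong parameter.

```python
# Hedged sketch with shortened signatures, not sklearn's actual API.
class TSNEInserted:
    # n_iter_early_exag inserted mid-signature: breaks positional callers.
    def __init__(self, perplexity=30.0, n_iter_early_exag=250,
                 n_iter_without_progress=300):
        self.perplexity = perplexity
        self.n_iter_early_exag = n_iter_early_exag
        self.n_iter_without_progress = n_iter_without_progress

class TSNEAppended:
    # n_iter_early_exag appended at the end: existing calls are unaffected.
    def __init__(self, perplexity=30.0, n_iter_without_progress=300,
                 n_iter_early_exag=250):
        self.perplexity = perplexity
        self.n_iter_without_progress = n_iter_without_progress
        self.n_iter_early_exag = n_iter_early_exag

# An old positional call intended as (perplexity, n_iter_without_progress):
broken = TSNEInserted(50.0, 500)  # 500 silently lands on n_iter_early_exag
ok = TSNEAppended(50.0, 500)      # 500 still lands on n_iter_without_progress
print(broken.n_iter_without_progress, ok.n_iter_without_progress)  # prints: 300 500
```

Keyword-only callers are unaffected either way; appending protects the positional ones.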


Base automatically changed from master to main January 22, 2021 10:50
4 participants