New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG] Add early exaggeration iterations as argument to t-SNE #12476
base: main
Are you sure you want to change the base?
Conversation
Converting early exaggeration iterations from a private variable to a public member variable of the TSNE class. The default value for this new argument is set at 250 such that code using previous versions of sklearn will still get identical results if no action is taken to account for the change. This behavior was validated. Being able to set this variable is an important component of running t-SNE and it shouldn't be hidden. Other implementations, for example, LvdM's (https://github.com/lvdmaaten/bhtsne/blob/master/tsne.h#L44-L45) expose it. In LvdM case it's actually via two arguments but sklearn doesn't distinguish between momentum switch and stop lying. It doesn't seem necessary to treat the two arguments differently. This change is motivated by recent work (https://doi.org/10.1101/451690) that shows high quality embeddings can be achieved with much fewer than 250 early exaggeration iterations.
tighten up width in a couple places. Remove trailing whitespace.
Missed a spot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you able to add a test that this at least has some effect, and that the validation is now correct?
@@ -518,15 +518,27 @@ class TSNE(BaseEstimator): | |||
learning rate is too low, most points may look compressed in a dense | |||
cloud with few outliers. If the cost function gets stuck in a bad local | |||
minimum increasing the learning rate may help. | |||
Some discussion on how to set learning rate optimally can be found | |||
at https://doi.org/10.1101/451690. Effective use of this parameter has |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the References section and ReST citation format instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the References section and ReST citation format instead.
Number of iterations out of total n_iter that t-SNE should spend | ||
in the early exaggeration phase. If embedding quality is suffering as a | ||
consequence of increasing number of samples being embedded, increasing | ||
this value and n_iter proportionately can help. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it often be useful to specify this as a fraction of n_iter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is one way to do it but I think it is better to specify the number as the actual number. This is for two main reasons:
- Aligns with other existing implementations which have the equivalent of this argument specified as the number (see my note about LvdM implementation above).
- Ensures that existing client code will get the same results as before, regardless of the number of iterations they had set with the n_iter argument. If we suddenly start using a percentage as the default argument, code with n_iter set to something different than 1000 will suddenly use a different early exaggeration phase length.
Hi! IMHO this represent changes on the functionality and interface on t_sne and this can have some problems of compatibility right? Maybe this could be introduce this change in future version and add some Deprecation message to let know that t_sne will change. |
@eamanu Hello! Technically it modifies the interface but not in such a way as to require a deprecation warning. That's because it's just an optional argument being added, the default value of which is equal to what it was before when this variable was hidden. After this update, code that was using the previous version will get identical results with no action necessary. That being said, let me know if there are conventions or principles for this project that I'm not aware of that would require more careful treatment as you suggest. |
@jnothman I'll look at addressing your other concerns, thanks. |
@cciccole yeah! you are right. IMO this change must be written on doc on a versionchanged title. |
versionadded, not changed, is appropriate here. And of course there will be
a changelog entry for each new feature
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs test to "Validated that results are identical before and after change for the same inputs."
n_iter_without_progress=300, min_grad_norm=1e-7, | ||
metric="euclidean", init="random", verbose=0, | ||
random_state=None, method='barnes_hut', angle=0.5): | ||
n_iter_early_exag=250, n_iter_without_progress=300, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For backward compatibility, please place, n_iter_early_exag
at the end of the function signature.
@@ -518,15 +518,27 @@ class TSNE(BaseEstimator): | |||
learning rate is too low, most points may look compressed in a dense | |||
cloud with few outliers. If the cost function gets stuck in a bad local | |||
minimum increasing the learning rate may help. | |||
Some discussion on how to set learning rate optimally can be found | |||
at https://doi.org/10.1101/451690. Effective use of this parameter has |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the References section and ReST citation format instead.
Reference Issues/PRs
What does this implement/fix? Explain your changes.
Converting early exaggeration iterations from a private constant to a public member variable of the TSNE class. The default value for this new argument is set at 250 such that code using previous versions of sklearn will still get identical results if no action is taken to account for the change.
Being able to set this variable is an important component of running t-SNE and it shouldn't be hidden. Other implementations (e.g., LvdM's) expose it. In the LvdM case it's actually via two arguments but sklearn doesn't distinguish between momentum switch and stop lying. It doesn't seem necessary to treat the two arguments differently.
This change is motivated by recent work that shows a high quality embedding can be achieved with much fewer than 250 early exaggeration iterations.
Any other comments?
Validated that results are identical before and after change for the same inputs.