
Variable degree of freedom is not supported for 1D #78

Closed
dkobak opened this issue Oct 1, 2019 · 11 comments · Fixed by #84

Comments

@dkobak
Collaborator

dkobak commented Oct 1, 2019

Just realized that our variable degree of freedom is not supported for 1D visualizations. Let me tell you why it could be cool to implement when we talk later today :-)

By the way, here is the 1D MNIST t-SNE that I am getting with my default settings
fast_tsne(X50, learning_rate=X50.shape[0]/12, map_dims=1, initialization=PCAinit[:,0])
where X50 is the 70000x50 matrix after PCA:

[image: mnist-1d — 1D MNIST t-SNE embedding]

It's beautiful, but 4s and 9s (salmon/violet) get tangled up. I played a bit with the parameters but couldn't make them separate fully. It seems 1D optimisation is harder than 2D (which is no surprise).
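The setup in the call above can be sketched as follows. This is a minimal illustration assuming `fast_tsne` from FIt-SNE is available; the data here is random stand-in data rather than the real 70000x50 MNIST matrix, and the actual embedding call is left commented out since it requires the compiled FIt-SNE binary.

```python
import numpy as np

# Stand-in for X50 (the original is 70000x50 MNIST after PCA;
# smaller random data here purely for illustration).
rng = np.random.default_rng(0)
X50 = rng.standard_normal((1000, 50))

# PCA initialization: project the centered data onto its principal
# components; the first PC score is used to initialize the 1D embedding.
Xc = X50 - X50.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
PCAinit = Xc @ Vt.T            # principal-component scores

init_1d = PCAinit[:, 0]        # first PC as the 1D initialization
learning_rate = X50.shape[0] / 12  # n/12 heuristic from the comment

# Actual embedding call (requires FIt-SNE's fast_tsne on the path):
# Z = fast_tsne(X50, learning_rate=learning_rate, map_dims=1,
#               initialization=init_1d)
```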

@linqiaozhi
Member

I finally went ahead and implemented this in my fork. I tried it on a simple example, but not yet on MNIST. Could you please try it out? If it works as expected, I will merge it into the main repo.

@linqiaozhi
Member

linqiaozhi commented Nov 16, 2019

Here's a simple test:

from fast_tsne import fast_tsne
from sklearn import datasets
import numpy as np
import matplotlib.pyplot as plt

SEED = 37

data = datasets.load_digits()
embed_df_1 = fast_tsne(data.data, df=1, map_dims=1, seed=SEED)
embed_df_05 = fast_tsne(data.data, df=0.05, map_dims=1, seed=SEED)

# random vertical jitter so the 1D embeddings are easier to see
x = np.random.rand(data.data.shape[0])
fig, axs = plt.subplots(2)
axs[0].scatter(embed_df_1, x, c=data.target)
axs[1].scatter(embed_df_05, x, c=data.target)
fig.show()

[image: digits embedded in 1D with df=1 (top) and df=0.05 (bottom)]

@dkobak
Collaborator Author

dkobak commented Nov 16, 2019

Great! I will give it a try next week.

@dkobak
Collaborator Author

dkobak commented Nov 25, 2019

Sorry, was a bit overwhelmed with stuff. Hoping to try it out this week...

@dkobak
Collaborator Author

dkobak commented Dec 13, 2019

I have finally given it a try. I tried embedding full MNIST in 1D with various values of df. It seems to work as expected, so I guess you can go ahead and merge into master! Great that you found time to implement it.

One thing that surprised me is that I did not observe any effect until I decreased df well below 0.5. For 2D, df=0.5 was already producing a very strong effect.

[image: mnist1d — 1D MNIST embeddings for various values of df]

And the same but rescaled horizontally:

[image: mnist1d_rescaled — the same embeddings, rescaled horizontally]
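The df effect being discussed can be sketched numerically. This is a minimal illustration assuming the heavy-tailed similarity kernel q(d) ∝ (1 + d²/df)^(−df) that FIt-SNE's df option is based on; at df=1 it reduces to the standard t-SNE Cauchy kernel, and smaller df gives heavier tails.

```python
import numpy as np

def kernel(d, df):
    """Heavy-tailed (unnormalized) similarity kernel: (1 + d^2/df)^(-df).

    df=1 recovers the standard t-SNE Cauchy kernel 1/(1 + d^2);
    smaller df decays more slowly at large distances (heavier tails).
    """
    return (1.0 + np.asarray(d) ** 2 / df) ** (-df)

d = 5.0  # a large embedding-space distance
q_cauchy = kernel(d, df=1.0)   # standard t-SNE kernel
q_heavy = kernel(d, df=0.1)    # much heavier tail

# Heavier tails assign more similarity mass to distant points,
# which changes how the optimization pulls clusters apart.
```

This also hints at why the effect kicks in only at small df: the tail weight (1 + d²/df)^(−df) changes slowly near df=1 and much faster as df approaches 0.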

@dkobak
Collaborator Author

dkobak commented Dec 13, 2019

Another thing is that digits don't split like they do in 2D; instead, the gaps between digits increase...

@dkobak
Collaborator Author

dkobak commented Dec 13, 2019

Also, didn't the embedding in 2D typically grow in size with decreasing df? Here the size decreases as df drops below 1...

@dkobak
Collaborator Author

dkobak commented Dec 27, 2019

Hey, what do you think? Are you planning to merge this branch? Or do you want to investigate anything first?

@linqiaozhi
Member

linqiaozhi commented Dec 28, 2019

Thanks for all the extensive testing, @dkobak. These are interesting differences between the 2D and 1D behavior. I don't have an explanation for them, but I don't think it's due to a bug.

I went ahead and merged the fork.

@dkobak
Collaborator Author

dkobak commented Dec 28, 2019

Cool. I will add my above tests to the example Python notebook.

@dkobak
Collaborator Author

dkobak commented Dec 29, 2019

Added to the example notebook.
