Using bhtsne.py with a numpy array #20

Closed
gsever opened this issue May 3, 2016 · 2 comments

Comments

@gsever

gsever commented May 3, 2016

Hello,

I found this implementation because sklearn's TSNE doesn't scale well with my 50k x 50k similarity matrix. Is there a simple way to pass this matrix in, the same way it is passed in scikit-learn? Thanks.

@lvdmaaten
Owner

Why can't you use the original data as input? If that doesn't work, you could perform an eigenanalysis of the centered distance matrix and use its top eigenvectors as input.

In general, however, your approach is not going to scale at all because the size of your input grows quadratically with the number of points. This is why it is not supported by the code (and why I am not planning to support it).
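
One possible reading of the eigenanalysis suggestion is classical multidimensional scaling (MDS): double-center the squared distance matrix, take its leading eigenvectors, and scale them by the square roots of the eigenvalues to get low-dimensional coordinates that can be fed to t-SNE as ordinary data. The following is a minimal NumPy sketch of that idea, not code from this repository. The function name `classical_mds` and the parameter `n_components` are hypothetical, and the sketch assumes the matrix holds distances (a similarity matrix would first have to be converted to distances).

```python
import numpy as np

def classical_mds(dist, n_components=50):
    """Hypothetical helper: coordinates from a pairwise distance matrix."""
    dist = np.asarray(dist, dtype=np.float64)
    sq = dist ** 2
    # Double-center the squared distances: B = -1/2 * J * D^2 * J,
    # where J = I - (1/n) * ones. Done here with row/column means.
    row_mean = sq.mean(axis=1, keepdims=True)
    col_mean = sq.mean(axis=0, keepdims=True)
    gram = -0.5 * (sq - row_mean - col_mean + sq.mean())
    # eigh handles symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Negative eigenvalues (non-Euclidean input) are clipped to zero.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)

# Small demonstration on synthetic data (assumed, not from the issue).
rng = np.random.default_rng(0)
points = rng.normal(size=(20, 3))
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
coords = classical_mds(dist, n_components=3)
print(coords.shape)
```

For a 50k x 50k matrix a full `np.linalg.eigh` is itself expensive, so computing only the leading eigenpairs, for example with `scipy.sparse.linalg.eigsh`, may be a more practical variant of the same idea.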

@gsever
Author

gsever commented May 4, 2016

I am not sure it is a good idea to pass a 50k x 50k matrix as text input. Could you show an example of the eigenanalysis? Also, is there a way to estimate the memory TSNE needs in general? And if I don't require high precision, could I use float16 instead?
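
As a rough illustration of the memory question, the size of the dense input matrix alone can be estimated from the element count and the bytes per element. This sketch only covers storing the 50k x 50k matrix in memory. It says nothing about the working memory t-SNE itself needs, nor about which precision bhtsne uses internally. The helper name `dense_matrix_bytes` is hypothetical.

```python
import numpy as np

def dense_matrix_bytes(n_rows, n_cols, dtype):
    """Bytes needed to hold a dense n_rows x n_cols array of the given dtype."""
    return n_rows * n_cols * np.dtype(dtype).itemsize

n = 50_000
for dtype in (np.float64, np.float32, np.float16):
    gib = dense_matrix_bytes(n, n, dtype) / 2**30
    print(f"{np.dtype(dtype).name}: {gib:.1f} GiB")
# float64: 18.6 GiB
# float32: 9.3 GiB
# float16: 4.7 GiB
```

Even at float16 the matrix takes several gigabytes, which is consistent with the point above that a pairwise matrix grows quadratically with the number of points.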
