Embedding Projector: UMAP and TSNE projections broken for embeddings that are not normalized #6271
Comments
I'm currently facing this issue. Is there any workaround? EDIT: This issue still persists in v2.13.0. I'm using Windows 11 and Google Chrome.
@dmfolgado you can either use the built-in "Sphereize data" option or normalize the embeddings yourself. I think all you need is to make sure each one is a unit vector in Euclidean space. Sphereizing does this and, in addition, moves the centroid to the origin, so it might work even better.
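As a rough illustration of the two options above, here is a minimal NumPy sketch. The helper names `unit_normalize` and `sphereize` are hypothetical, and `sphereize` is only an approximation of what the Projector's checkbox is described as doing (center on the centroid, then scale each vector to unit length); it is not taken from the TensorBoard source.

```python
import numpy as np


def unit_normalize(embeddings, eps=1e-12):
    """Scale every row to unit Euclidean (L2) length."""
    embeddings = np.asarray(embeddings, dtype=np.float64)
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    return embeddings / np.maximum(norms, eps)


def sphereize(embeddings, eps=1e-12):
    """Assumed equivalent of "Sphereize data": center, then unit-normalize."""
    embeddings = np.asarray(embeddings, dtype=np.float64)
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    return unit_normalize(centered, eps)


# Made-up data standing in for real embeddings (100 points, 16 dimensions).
rng = np.random.default_rng(0)
raw = rng.normal(loc=5.0, scale=3.0, size=(100, 16))

normalized = unit_normalize(raw)
sphereized = sphereize(raw)
```

Either array can then be written out with `np.savetxt(path, normalized, delimiter="\t")` and loaded into the Projector as a TSV file.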
Thank you for the suggestion. I had already tried that, but I continue to get different projections between the online Projector and the offline one.

Local version: clustering the CBF time series dataset (X_test).

Online version: clustering the CBF time series dataset (X_test).

I attach the data and metadata for reproducibility.
I just found what was causing the discrepancy. The data I was using for the online projection was standard scaled. It seems that standard-scaled data combined with sphereizing yields the same results.
Is there any update on the issue of getting stuck on the "Initialize UMAP..." message? I am currently facing the exact same problem, which I cannot reproduce with the online Embedding Projector using the same data. Unless "Spherize data" is checked before clicking on UMAP, the TensorBoard Embedding Projector remains stuck on "Initialising UMAP...".
Chrome 111.0.5563.64
Issue description
When using embeddings that are not normalized or sphereized, the UMAP and T-SNE projections are incorrect or simply do not load.
See #5547 for a previous bug report.
The reason is that the KNN step expects normalized vectors for cosine distance (cosDistNorm) rather than arbitrary vectors.
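To illustrate why this matters, here is a small Python sketch. It assumes that a normalized-input cosine distance such as `cosDistNorm` takes the shortcut `1 - dot(a, b)`; that is an assumption about the implementation, not something confirmed in this issue. The functions `cos_dist` and `cos_dist_norm` are hypothetical stand-ins written for this example.

```python
import numpy as np


def cos_dist(a, b):
    """Full cosine distance: valid for vectors of any length."""
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))


def cos_dist_norm(a, b):
    """Assumed shortcut: only correct when a and b are unit vectors."""
    return 1.0 - np.dot(a, b)


a = np.array([3.0, 4.0])
b = np.array([6.0, 8.0])  # same direction as a, twice the length

full = cos_dist(a, b)           # 0.0: the vectors point the same way
shortcut = cos_dist_norm(a, b)  # -49.0: not a meaningful distance

# After unit-normalizing, the shortcut agrees with the full formula.
a_unit = a / np.linalg.norm(a)
b_unit = b / np.linalg.norm(b)
shortcut_unit = cos_dist_norm(a_unit, b_unit)
```

With unnormalized inputs the shortcut can return negative or arbitrarily large values, so a nearest-neighbor search built on it would rank neighbors incorrectly, which is consistent with the broken projections described above.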
Alternative repo:
related to: #2421