[SYSTEMDS-831] Builtin t-SNE algorithm #1360

Closed
wants to merge 8 commits into from

lambdasonly
Contributor

Testing was done by visual inspection and against a reference R implementation (see the comments in the Java test file). A direct R test is not possible because the random number generation is done differently. The CP test runs fine (3s at 1000 iterations); the Spark test is extremely slow (30s at 10 iterations) and was therefore never run to completion.

@lambdasonly lambdasonly changed the title T-SNE Builtin: T-SNE Aug 7, 2021
@lambdasonly lambdasonly changed the title Builtin: T-SNE [SYSTEMDS-831] Builtin t-SNE algorithm Aug 7, 2021
@mboehm7
Contributor

mboehm7 commented Aug 9, 2021

LGTM - thanks for creating the builtin function and test @tim-sagaster. Also thanks for the original algorithm @iyounus.

During the merge I made a few tweaks, which I mention here just for completeness. First, I fixed the test to use a relatively large epsilon so that it passes. So far it compared doubles for exact equality, which fails even for tiny discrepancies. The results seem OK, but for a few values the absolute error goes up to 10% of the value range; we will follow up to understand the algorithm-level differences. Also, for iterative algorithms it is not useful to force all operations into distributed Spark ops, so nothing to worry about there. Second, I left a few TODOs to consolidate the distance computation with the respective builtin and to compare against R instead of inlining expected outputs. Finally, I fixed some formatting and stdout issues and removed unnecessary imports in the test.
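
For readers unfamiliar with the epsilon tweak, here is a minimal sketch of a tolerance-based matrix comparison. It is not the code from the merged test: the class `ToleranceCompare` and the methods `maxAbsError` and `withinEpsilon` are hypothetical names made up for illustration, and the matrices are toy values rather than t-SNE outputs.

```java
public class ToleranceCompare {
    /**
     * Largest absolute element-wise difference between two matrices.
     * Assumes both matrices have the same shape (not checked here).
     */
    static double maxAbsError(double[][] expected, double[][] actual) {
        double max = 0.0;
        for (int i = 0; i < expected.length; i++) {
            for (int j = 0; j < expected[i].length; j++) {
                max = Math.max(max, Math.abs(expected[i][j] - actual[i][j]));
            }
        }
        return max;
    }

    /** True if no element differs by more than eps. */
    static boolean withinEpsilon(double[][] expected, double[][] actual, double eps) {
        return maxAbsError(expected, actual) <= eps;
    }

    public static void main(String[] args) {
        // Toy values for illustration only, not real t-SNE embeddings.
        double[][] expected = {{1.0, -2.0}, {0.5, 3.0}};
        double[][] actual   = {{1.0000001, -2.0}, {0.5, 2.9}};

        // A near-exact comparison rejects even small discrepancies.
        System.out.println(withinEpsilon(expected, actual, 1e-9));
        // A larger epsilon accepts them.
        System.out.println(withinEpsilon(expected, actual, 0.2));
    }
}
```

With these toy inputs the largest deviation is about 0.1, so the first check fails and the second passes. Choosing the epsilon relative to the value range of the expected output would be one way to make the "10% of the value range" observation explicit, though that is an assumption about how one might follow up, not something the merged test is stated to do.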

@asfgit asfgit closed this in 962361f Aug 9, 2021
4 participants