Integration into linfa #3

Closed
bytesnake opened this issue Mar 16, 2021 · 9 comments · Fixed by #5
@bytesnake

I just saw your post on Reddit, awesome work! I'm the maintainer of linfa and have thought about implementing t-SNE as a transformative dimensionality reduction technique in the past, but never got around to it. This crate can take a lot of work off our hands. We would implement a wrapper which adapts your algorithm by:

  • implementing a builder-style pattern for configuration (sketched below)
  • using datasets for input/output
  • implementing the transform trait for the algorithm
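
A rough sketch of what such a wrapper could look like (the names `TSneParams`, `embedding_size`, the builder defaults, and the stubbed `transform` body are illustrative assumptions, not a final API):

    // Hypothetical sketch of a builder-style wrapper around bhtsne; names and
    // defaults are assumptions for illustration, not linfa's final API.
    pub struct TSneParams {
        embedding_size: usize,
        perplexity: f64,
        theta: f64,
    }

    impl TSneParams {
        /// Start from defaults and override via the builder methods.
        pub fn embedding_size(embedding_size: usize) -> Self {
            Self {
                embedding_size,
                perplexity: 30.0,
                theta: 0.5,
            }
        }

        pub fn perplexity(mut self, perplexity: f64) -> Self {
            self.perplexity = perplexity;
            self
        }

        pub fn theta(mut self, theta: f64) -> Self {
            self.theta = theta;
            self
        }

        /// A linfa `Transformer` impl would take a dataset, call `bhtsne::run`
        /// with these parameters, and hand back the embedded records.
        pub fn transform(&self, _records: &mut [f64], n_samples: usize, _n_features: usize) -> Vec<f64> {
            // Placeholder: the real implementation delegates to bhtsne::run here.
            vec![0.0; n_samples * self.embedding_size]
        }
    }

Usage would then read something like `TSneParams::embedding_size(2).perplexity(15.0).theta(0.5).transform(&mut data, n_samples, n_features)`.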

Sounds good? I just quickly glanced at the source code and three things stood out which could be improved:

  • make the csv dependency optional; sometimes it's not necessary to pull it in
  • make the algorithm generic over num_traits::Float
  • what about error handling? Can any part of the algorithm fail? In particular, what happens if there are NaNs in the data or if parameters are misconfigured (e.g. a negative perplexity)?
@frjnn
Owner

frjnn commented Mar 16, 2021

awesome work!

Thanks!

We would implement a wrapper which adapts your algorithm by (...)

Sounds good to me.

make the csv dependency optional; sometimes it's not necessary to pull it in

Ok, easily done.
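
For instance, a minimal sketch of gating the CSV helpers behind an optional Cargo feature (the function name and signature here are just illustrative):

    // Sketch only: the CSV helpers compile only when the optional `csv` feature
    // is enabled, so dependents that don't need file I/O never pull in the crate.
    #[cfg(feature = "csv")]
    pub fn load_csv(path: &str) -> Result<Vec<f64>, Box<dyn std::error::Error>> {
        let mut reader = csv::Reader::from_path(path)?;
        let mut data = Vec::new();
        for record in reader.records() {
            for field in record?.iter() {
                data.push(field.parse::<f64>()?);
            }
        }
        Ok(data)
    }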

make the algorithm generic over num_traits::Float.

Sure.

what about error handling? (...)

Currently not much, to be honest. There's just a check on the perplexity value, and NaNs are free to spread.
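
One possible direction (not the crate's current behaviour; names like `TsneError` are hypothetical) would be to validate the parameters and the input up front and return a `Result`:

    // Hypothetical up-front validation, for illustration only.
    #[derive(Debug)]
    pub enum TsneError {
        NonPositivePerplexity(f64),
        NanInInput { index: usize },
    }

    fn validate(data: &[f64], perplexity: f64) -> Result<(), TsneError> {
        if perplexity <= 0.0 {
            return Err(TsneError::NonPositivePerplexity(perplexity));
        }
        // Reject inputs containing NaNs instead of letting them propagate.
        if let Some(index) = data.iter().position(|x| x.is_nan()) {
            return Err(TsneError::NanInInput { index });
        }
        Ok(())
    }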

frjnn added the enhancement label Mar 16, 2021
frjnn self-assigned this Mar 16, 2021
@bytesnake
Author

Ok, I will ping back here once I have a minimal PR

@bytesnake
Author

bytesnake commented Mar 18, 2021

I opened a PR here rust-ml/linfa#101. I'm running into two problems with the Iris flower dataset:

  • everything is NaN for theta=0
  • stack overflow when perplexity=1 (is such a low perplexity reasonable?)

@frjnn
Owner

frjnn commented Mar 18, 2021

Could you please specify the full configuration of parameters that you used?

@bytesnake
Author

for the stack overflow:

        bhtsne::run(
            &mut data,
            nsamples,
            nfeatures,
            &mut y,
            2,
            1.0, // perplexity
            0.5, // theta
            false,
            2000,
            250,
            250,
        );

for the NaN output:

        bhtsne::run(
            &mut data,
            nsamples,
            nfeatures,
            &mut y,
            2,
            15.0, // perplexity
            0.0,  // theta
            false,
            2000,
            250,
            250,
        );

@frjnn
Owner

frjnn commented Mar 19, 2021

The former error (the stack overflow) was caused by an overflow happening during the computation of the optimal entropy for the P distribution: this is done, roughly, by a binary search over the real numbers, and for very small perplexity values it can take many iterations. Although unusual, since the paper recommends values between 5 and 50, a perplexity of 1.0 is valid and the algorithm should be able to handle it. The latter (the NaN output) was caused by the same issue combined with a bug in the squared Euclidean distance matrix function.
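
For context, this entropy fit is the standard t-SNE perplexity search: a binary search over the Gaussian precision (beta) until the entropy of the conditional distribution matches ln(perplexity). A simplified sketch of the idea, not the crate's actual code:

    // Simplified sketch of the perplexity binary search, for illustration only.
    // `distances` holds squared distances from one point to all the others.
    fn fit_beta(distances: &[f64], perplexity: f64, max_iter: usize) -> f64 {
        let target_entropy = perplexity.ln();
        let (mut beta, mut beta_min, mut beta_max) = (1.0, f64::NEG_INFINITY, f64::INFINITY);

        for _ in 0..max_iter {
            // Conditional probabilities p_{j|i} and their Shannon entropy (in nats).
            let weights: Vec<f64> = distances.iter().map(|&d| (-beta * d).exp()).collect();
            let sum: f64 = weights.iter().sum();
            let entropy =
                beta * distances.iter().zip(&weights).map(|(d, w)| d * w).sum::<f64>() / sum + sum.ln();

            if (entropy - target_entropy).abs() < 1e-5 {
                break;
            }
            if entropy > target_entropy {
                // Distribution too spread out: increase the precision.
                beta_min = beta;
                beta = if beta_max.is_finite() { (beta + beta_max) / 2.0 } else { beta * 2.0 };
            } else {
                // Distribution too peaked: decrease the precision.
                beta_max = beta;
                beta = if beta_min.is_finite() { (beta + beta_min) / 2.0 } else { beta / 2.0 };
            }
        }
        beta
    }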

They should both be fixed now. I'm currently working on switching to num_traits::Float.

@bytesnake
Author

Sounds good. I'm still a bit confused that you got a stack overflow even though you are not using recursion anywhere in your algorithm (at least, that's where I ran into that last time). You still have to push your changes 😄
I would also recommend moving the rng init here out of the loop, because it can be really slow:

.sample(&mut rand::thread_rng())
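
In other words, roughly (a generic sketch, not the crate's code; the Normal distribution and its parameters are placeholders):

    use rand::distributions::Distribution;
    use rand_distr::Normal;

    // Initialise the RNG and the distribution once, outside the loop, instead of
    // calling rand::thread_rng() for every single sample.
    fn init_embedding(samples: &mut [f64]) {
        let mut rng = rand::thread_rng();
        let normal = Normal::new(0.0, 1e-4).unwrap();
        for value in samples.iter_mut() {
            *value = normal.sample(&mut rng);
        }
    }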

@frjnn
Owner

frjnn commented Mar 19, 2021

I would also recommend moving the rng init here out of the loop, because it can be really slow.

Done, thanks for the tip.

you still have to push your changes

Also done.

(...) you are not using recursion anywhere in your algorithm.

        bhtsne::run(
            &mut data,
            nsamples,
            nfeatures,
            &mut y,
            2,
            1.0, // perplexity
            0.5, // theta
            false,
            2000,
            250,
            250,
        );

theta not being set to 0.0 means that the algorithm uses the Barnes-Hut acceleration, which indeed makes some recursive calls during the gradient computation.
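
For anyone following along, a simplified sketch of that recursion (illustrative only, not the crate's actual tree code):

    // A tree cell that is far enough away (size / distance < theta) is summarised
    // by its centre of mass; otherwise we recurse into its children. With
    // theta = 0.0 no cell is ever summarised, which amounts to the exact O(n^2)
    // computation.
    struct Cell {
        center_of_mass: [f64; 2],
        n_points: usize,
        size: f64,
        children: Vec<Cell>,
    }

    fn accumulate_repulsion(cell: &Cell, point: [f64; 2], theta: f64, force: &mut [f64; 2]) {
        if cell.n_points == 0 {
            return;
        }
        let dx = point[0] - cell.center_of_mass[0];
        let dy = point[1] - cell.center_of_mass[1];
        let dist = (dx * dx + dy * dy).sqrt().max(1e-12);

        if cell.children.is_empty() || cell.size / dist < theta {
            // Far enough away (or a leaf): treat the whole cell as one point.
            let w = cell.n_points as f64 / (1.0 + dist * dist);
            force[0] += w * dx;
            force[1] += w * dy;
        } else {
            // Too close: descend recursively into the children.
            for child in &cell.children {
                accumulate_repulsion(child, point, theta, force);
            }
        }
    }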

If you find any other quirks please let me know.

Also I'd like to ask the following question: why is the num_traits::Float trait necessary? Why would you need f64s?

@bytesnake
Author

theta not being set to 0.0 means that the algorithm uses the Barnes-Hut acceleration, which indeed makes some recursive calls during the gradient computation.

I see!

Also I'd like to ask the following question: why is the num_traits::Float trait necessary? Why would you need f64s?

Mainly for ergonomic reasons, so that people can choose the precision of their floating-point numbers without needing to cast.
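
Concretely, the idea would be a signature along these lines (a sketch only; the crate's real entry point has many more parameters):

    use num_traits::Float;

    // Illustration only: with a generic bound the caller picks f32 or f64 and the
    // whole computation stays in that precision, no casting required.
    fn run_generic<T: Float>(data: &[T], perplexity: T, theta: T) -> Vec<T> {
        assert!(perplexity >= T::one(), "perplexity must be >= 1");
        assert!(theta >= T::zero(), "theta must be non-negative");
        // ... run t-SNE entirely in terms of T ...
        data.to_vec()
    }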

frjnn linked a pull request Mar 21, 2021 that will close this issue
frjnn closed this as completed in #5 Mar 21, 2021