
Custom losses, coherent embeddings #35

Open

znah opened this issue Jan 4, 2018 · 6 comments

znah commented Jan 4, 2018

A nice property of t-SNE that is not exploited in most implementations is that it can be treated as a combination of two orthogonal components: a loss function and an optimization algorithm. For example, one may visualize a set of temporally varying vectors with a sequence of coherent embeddings by adding a loss term that penalizes unnecessary movement of each vector between those embeddings. Would it be possible to provide this kind of flexibility, i.e. the ability to add extra constraints, in UMAP?
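To make the idea concrete, here is a minimal sketch of such a penalty (Y_t and Y_prev are placeholder names for consecutive embeddings, not anything from UMAP's code):

import numpy as np

def coherence_penalty(Y_t, Y_prev):
    # Sum of squared displacements of each vector between two consecutive
    # embeddings; adding this term to the embedding loss discourages points
    # from moving more than necessary from one frame to the next.
    return np.sum((Y_t - Y_prev) ** 2)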


lmcinnes commented Jan 4, 2018

It is a little hard to do that and still maintain efficiency; potentially it could be added as an extra code path on the side that is slower but more flexible. That would be a more significant project, however.


znah commented Jan 4, 2018

Thank you for the prompt reply!
I'm actually looking forward to seeing your write-up of the UMAP algorithm, hoping to reimplement it in a flexible way. For example, the most naive implementation of the t-SNE loss in TensorFlow boils down to something like this:

import tensorflow as tf  # TF1-style API, to match tf.log below

def tsne_kl_loss(points, P):
    # pdist2 is assumed to return the (n, n) matrix of pairwise squared distances
    n = tf.shape(points)[0]
    Q = 1.0 / (1.0 + pdist2(points))                 # unnormalized Student-t affinities
    sQ = tf.reduce_sum(Q) - tf.cast(n, tf.float32)   # normalizer, excluding the diagonal
    return tf.reduce_sum(P * tf.log(P / Q)) + tf.log(sQ)  # KL(P || Q/sQ), since sum(P) == 1

Then one can combine this loss with others and use one of the standard optimizers.
Do you think this kind of approach could be adapted to UMAP?
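For instance, a rough TF1-style sketch of combining the loss above with the coherence penalty (Y_prev, n_points and the 0.1 weight are made-up names for this example, not part of any existing API):

Y = tf.Variable(tf.random_normal([n_points, 2]))      # embedding being optimized
Y_prev = tf.placeholder(tf.float32, [n_points, 2])    # embedding of the previous frame
P = tf.placeholder(tf.float32, [n_points, n_points])  # high-dimensional affinities

coherence = tf.reduce_sum(tf.square(Y - Y_prev))      # penalize per-point movement
total_loss = tsne_kl_loss(Y, P) + 0.1 * coherence
train_op = tf.train.AdamOptimizer(1e-2).minimize(total_loss, var_list=[Y])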


lmcinnes commented Jan 4, 2018

At that level, yes, almost certainly; if you are willing to do N^2 work then you can certainly have a custom loss -- I was generally seeking to avoid that. On that front you might be interested in smallvis, which implements t-SNE, LargeVis and UMAP in a common framework that I suspect would be easily adaptable to custom loss functions. The catch is that it only supports small datasets, for exactly the reason cited above. As a way to experiment, however, it is quite powerful.


znah commented Jan 4, 2018

Thank you for pointing me to smallvis! I think that's exactly what I needed.
N^2 can actually go surprisingly far with an efficient GPU implementation. For example, here is a random YouTube video showing an N^2 n-body simulation with 60k particles at 30 fps.

Still, I like to think of algorithmic optimizations, like employing Barnes-Hut or something else, as yet another, partially orthogonal component.

@vanhoan310

How can one obtain the matrix P from UMAP? Is it self.graph_, i.e. the output of the fuzzy_simplicial_set function?

Thanks!


lmcinnes commented Aug 7, 2020

That is the equivalent of it, yes.
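For reference, a minimal sketch of getting at it through a fitted model, assuming the standard umap-learn attributes (X here is just a placeholder for your data matrix):

import umap

reducer = umap.UMAP().fit(X)   # X: your data matrix
P = reducer.graph_             # the fuzzy simplicial set, as a sparse (n, n) matrix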
