Occasional dramatic differences between tSNE and UMAP #319

jorvis · 2018-10-22T17:01:07Z

On a test dataset I compute neighbors and then immediately compute/plot both tSNE and UMAP and show them next to each other. Sometimes, we get pretty dramatic differences such as the one attached. Is this an algorithmic difference or something wrong with my approach?

sc.pp.neighbors(adata, n_pcs=n_pcs, n_neighbors=n_neighbors)
sc.tl.tsne(adata, n_pcs=n_pcs, random_state=random_state)
sc.tl.umap(adata)

sc.pl.tsne(adata, color=genes_to_color, color_map='RdBu_r', use_raw=False, save=".png")
sc.pl.umap(adata, color=genes_to_color, color_map='RdBu_r', use_raw=False, save=".png")

chlee-tabin · 2018-10-22T17:53:17Z

This is normal; it means that the far-away clusters are "globally" more different from the cells that are closer together. UMAP is one way of preserving global distance, whereas tSNE is pretty much ignorant of global distance (so one should not make inferences about global distance from a tSNE plot). I frequently see this in UMAP when some very different contaminating cell types are in the sample.

falexwolf · 2018-10-23T18:08:29Z

UMAP also has no meaning attached when clusters are completely disconnected (Supplemental Figure 10 of this, soon to be updated on bioRxiv and finally in a journal...); and I'd tend to think that this is such a case. Then, UMAP's parameters have to be adjusted (mostly min_dist and spread).

It's true that UMAP has less tendency to tear apart connected things than tSNE. Overall, it's more faithful to the global topology.
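
To make the "completely disconnected" case concrete, here is a minimal sketch in plain Python (not Scanpy's or UMAP's implementation) that counts the connected components of a k-nearest-neighbor graph. The helper names `knn_components` and `grid` and the toy data are hypothetical and chosen only for illustration; the point is that a very distant population shares no neighbor edges with the rest, so the graph gives an embedding nothing to say about how far apart the two groups should be drawn.

```python
import math


def knn_components(points, k):
    """Count connected components of the symmetrized k-nearest-neighbor graph."""
    n = len(points)
    parent = list(range(n))  # union-find forest, one tree per component

    def find(i):
        # Walk up to the root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, p in enumerate(points):
        # Brute-force neighbor search; fine for a toy example.
        by_dist = sorted((j for j in range(n) if j != i),
                         key=lambda j: math.dist(p, points[j]))
        for j in by_dist[:k]:
            parent[find(i)] = find(j)  # an edge i-j merges their components

    return len({find(i) for i in range(n)})


def grid(x0, size=5):
    """A size x size lattice of unit-spaced points starting at x = x0."""
    return [(x0 + x, y) for x in range(size) for y in range(size)]


near = grid(0) + grid(5)      # two adjacent populations
far = grid(0) + grid(1000)    # a distant "contaminating" population

print(knn_components(near, k=4))  # 1: the neighbor graph is connected
print(knn_components(far, k=4))   # 2: no edge links the two groups
```

On real data, a comparable check could presumably be run on the neighbor graph that `sc.pp.neighbors` computes, for example with `scipy.sparse.csgraph.connected_components`; where that graph is stored on `adata` depends on the Scanpy version. If more than one component shows up, the relative placement of those groups in the UMAP plot should not be over-interpreted, and `min_dist` and `spread` (both keyword arguments of `sc.tl.umap`) are the parameters to experiment with.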

chlee-tabin · 2018-10-23T19:10:46Z

@falexwolf Just out of curiosity, have you compared your method with PHATE? (https://www.biorxiv.org/content/early/2017/03/24/120378 ). I have yet to try out PAGA, but I have found that PHATE works fairly well for showing trajectories. (I am just a biologist, so I don't know the specifics of comparing methodologies.)

jorvis · 2018-10-24T01:49:29Z

Thank you all for your feedback here - that was helpful. I'll close this so it doesn't look like an issue needs to be handled, but please, do continue any discussion.

falexwolf · 2018-10-26T02:42:40Z

@chlee-tabin Which method? PAGA? PAGA is for coarse-graining the data whereas PHATE is for embeddings, right?

jorvis closed this as completed Oct 24, 2018
