<a href="https://colab.research.google.com/github/piquelab/sclabor/blob/master/UMAP_sphere_embedding.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Placenta scRNAseq data embedded on a sphere using UMAP dev version 0.4
This is a simple tutorial demonstrating how to embed single-cell data on the surface of a sphere using [UMAP development version 0.4](https://umap-learn.readthedocs.io/en/0.4dev/embedding_space.html), developed by McInnes et al. To read more about why this type of embedding is an interesting and useful representation of the latent cell-type/state space, see this [preprint by Jiarui Ding & Aviv Regev](https://www.biorxiv.org/content/10.1101/853457v1?rss=1), where they also develop a new method, [scPhere](https://github.com/klarman-cell-observatory/scPhere), that can correct batch effects.

The starting point of this notebook is the PCA output, computed using [Seurat](https://satijalab.org/seurat/), from our placenta single-cell analysis recently [published in eLife](https://elifesciences.org/articles/52004) (raw data in dbGaP [phs001886.v1.p1](https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001886.v1.p1)). You can also find excellent tutorials on complete scRNAseq data analysis using kallisto and bustools, developed by the Pachter lab, [here](https://www.kallistobus.tools/tutorials).

![](https://media.giphy.com/media/fVQDcPGiYKIj31g6Qv/giphy-downsized-large.gif)


In [0]:
## https://umap-learn.readthedocs.io/en/0.4dev/embedding_space.html
##!pip install umap-learn/
!pip install -e git+https://github.com/lmcinnes/umap.git@0.4dev#egg=umap4 

Obtaining umap4 from git+https://github.com/lmcinnes/umap.git@0.4dev#egg=umap4
  Cloning https://github.com/lmcinnes/umap.git (to revision 0.4dev) to ./src/umap4
  Running command git clone -q https://github.com/lmcinnes/umap.git /content/src/umap4
  Running command git checkout -b 0.4dev --track origin/0.4dev
  Switched to a new branch '0.4dev'
  Branch '0.4dev' set up to track remote branch '0.4dev' from 'origin'.
Collecting numba>=0.42
  Downloading https://files.pythonhosted.org/packages/53/34/22b6c2ded558976b5367be01b851ae679a0d1ba994de002d54afe84187b5/numba-0.46.0-cp36-cp36m-manylinux1_x86_64.whl (3.6MB)
     |████████████████████████████████| 3.6MB 3.6MB/s 
Collecting tbb>=2019.0
  Downloading https://files.pythonhosted.org/packages/49/74/46406b1f1439401e0cf02cfe6dff65e7fe72ac11e05fb4f63461368784fe/tbb-2020.0.133-py2.py3-none-manylinux1_x86_64.whl (933kB)
     |████████████████████████████████| 942kB 43.4MB/s 
Installing collected packages: numba, tbb, um

In [0]:
!pip install /content/src/umap4

Processing ./src/umap4
Building wheels for collected packages: umap-learn
  Building wheel for umap-learn (setup.py) ... done
  Created wheel for umap-learn: filename=umap_learn-0.4.0rc1-cp36-none-any.whl size=63712 sha256=bf022e0b55d719bad941079662d745c12df7d134c69f69af69877a6a9442af02
  Stored in directory: /tmp/pip-ephem-wheel-cache-mlmza0hw/wheels/3b/09/21/805c6358969a8d781cc1d5f14b05a6df3494539cd5a296ad69
Successfully built umap-learn
Installing collected packages: umap-learn
  Found existing installation: umap-learn 0.4.0rc1
    Can't uninstall 'umap-learn'. No files were found to uninstall.
Successfully installed umap-learn-0.4.0rc1


In [0]:
import numpy as np
import pandas as pd
#import numba
#import sklearn.datasets
#import matplotlib.pyplot as plt
#import seaborn as sns
#from mpl_toolkits.mplot3d import Axes3D
import umap
%matplotlib inline


In [0]:
pca = pd.read_table("http://genome.grid.wayne.edu/sclabor/data/pca.tsv")

In [0]:
mdata = pd.read_table("http://genome.grid.wayne.edu/sclabor/data/umap3D.df.tsv")
mdata

Unnamed: 0,UMAP_1,UMAP_2,UMAP_3,Group,Location,LibraryID,NewCls,FinalName,nGene,nUMI
0,7.944668,-1.494323,2.663341,TL,BP,s1DB,C16,EVT,4393,25088
1,-5.786736,2.145621,-0.119451,TL,BP,s1DB,C1,T-cell-activated,368,815
2,-4.354593,1.495096,-1.778693,TL,BP,s1DB,C0,T-cell-resting,663,1838
3,-4.893460,1.061019,-0.581044,TL,BP,s1DB,C0,T-cell-resting,299,696
4,8.232032,-1.251961,2.459991,TL,BP,s1DB,C16,EVT,4758,27672
...,...,...,...,...,...,...,...,...,...,...
77901,5.430239,0.243172,-5.924269,TL,CAM,s9W,C12,Stromal-1,698,1342
77902,6.054707,3.460046,-0.832990,TL,CAM,s9W,C11,Decidual,5747,40408
77903,-3.819507,1.881343,3.518450,TL,CAM,s9W,C2,NK-cell,492,952
77904,0.214451,1.218535,-9.529546,TL,CAM,s9W,C5,Macrophage-1,1992,6982
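
The PCA table and the metadata table are later combined row by row, so it may be worth confirming that they have the same number of rows before going further. The sketch below uses toy stand-ins; `check_alignment`, `toy_pca`, `toy_mdata`, and the `PC_1`/`PC_2` column names are hypothetical and not part of the original analysis.

```python
import pandas as pd


def check_alignment(pca_table: pd.DataFrame, metadata: pd.DataFrame) -> None:
    """Raise a ValueError if the two tables cannot be paired row by row."""
    if len(pca_table) != len(metadata):
        raise ValueError(
            f"row count mismatch: {len(pca_table)} PCA rows "
            f"vs {len(metadata)} metadata rows"
        )


# Toy stand-ins for the real `pca` and `mdata` tables (assumed column names).
toy_pca = pd.DataFrame({"PC_1": [0.1, -0.3, 0.7], "PC_2": [1.2, 0.4, -0.9]})
toy_mdata = pd.DataFrame({"FinalName": ["EVT", "T-cell-resting", "NK-cell"]})

check_alignment(toy_pca, toy_mdata)  # passes silently when the counts match
```

In the notebook, the equivalent call would be `check_alignment(pca, mdata)`. This only compares row counts; it does not verify that the rows are in the same cell order.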


In [0]:
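## Fit UMAP on the PCA coordinates; the haversine output metric makes the
## 2D embedding hold two angles that locate each cell on a sphere
sphere_mapper = umap.UMAP(output_metric='haversine', random_state=42, n_neighbors=50, min_dist=0.01).fit(pca)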
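
For intuition about the `haversine` output metric: haversine distance is the great-circle distance between two points on a sphere. The sketch below is the textbook formula for (latitude, longitude) pairs in radians on a unit sphere. It is an illustration only and is not claimed to match UMAP's internal implementation or angle convention.

```python
import numpy as np


def haversine(p, q):
    """Great-circle distance on the unit sphere.

    p and q are (latitude, longitude) pairs in radians (textbook convention).
    """
    lat1, lon1 = p
    lat2, lon2 = q
    a = (
        np.sin(0.5 * (lat2 - lat1)) ** 2
        + np.cos(lat1) * np.cos(lat2) * np.sin(0.5 * (lon2 - lon1)) ** 2
    )
    # Clip to guard against tiny floating-point overshoot before arcsin.
    return 2.0 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


# Two points on the equator a quarter turn apart.
quarter_turn = haversine((0.0, 0.0), (0.0, np.pi / 2))
# North pole to south pole: half of a great circle.
pole_to_pole = haversine((np.pi / 2, 0.0), (-np.pi / 2, 0.0))
```

On a unit sphere the largest possible distance is half a great circle, so distances saturate at pi instead of growing without bound as they do in a flat embedding.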

In [0]:
## Convert the two embedding angles to Cartesian coordinates on the unit sphere
x = np.sin(sphere_mapper.embedding_[:, 0]) * np.cos(sphere_mapper.embedding_[:, 1])
y = np.sin(sphere_mapper.embedding_[:, 0]) * np.sin(sphere_mapper.embedding_[:, 1])
z = np.cos(sphere_mapper.embedding_[:, 0])
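
The conversion above always lands on the unit sphere, because sin²θ·cos²φ + sin²θ·sin²φ + cos²θ = 1 for any pair of angles. A minimal sketch that checks this numerically on random angles; `sphere_to_cartesian` and `toy_embedding` are hypothetical names standing in for the real embedding.

```python
import numpy as np


def sphere_to_cartesian(theta, phi):
    """Map two angle arrays to (x, y, z) points, using the same formulas as above."""
    x = np.sin(theta) * np.cos(phi)
    y = np.sin(theta) * np.sin(phi)
    z = np.cos(theta)
    return np.column_stack([x, y, z])


# Random angles as a stand-in for sphere_mapper.embedding_ (assumed shape: n x 2).
rng = np.random.default_rng(0)
toy_embedding = rng.uniform(-np.pi, np.pi, size=(5, 2))

xyz = sphere_to_cartesian(toy_embedding[:, 0], toy_embedding[:, 1])
norms = np.linalg.norm(xyz, axis=1)  # distance of every point from the origin
```

Every entry of `norms` should equal 1 up to floating-point error, whatever range the angles fall in.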

In [0]:
## Color map as in the article
cdm = {"B-cell": "rgb(255,0,0)",
       "npiCTB": "rgb(50,205,50)",
       "CTB": "rgb(85,26,139)",
       "Decidual": "rgb(139,71,38)",
       "Macrophage-1": "rgb(255,105,180)",
       "Macrophage-2": "rgb(255,99,71)",
       "Endometrial": "rgb(176,48,96)",
       "LED": "rgb(139,139,0)",
       "EVT": "rgb(238,174,238)",
       "Fibroblast": "rgb(0,0,128)",
       "HSC": "rgb(255,0,255)",
       "Monocyte": "rgb(132,112,255)",
       "Stromal-3": "rgb(205,133,63)",
       "NK-cell": "rgb(65,105,225)",
       "Stromal-1": "rgb(0,134,139)",
       "Stromal-2": "rgb(205,140,149)",
       "STB": "rgb(154,205,50)",
       "T-cell-activated": "rgb(46,139,87)",
       "T-cell-resting": "rgb(176,224,230)"}

In [0]:
import plotly.express as px
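
Before plotting, it can help to confirm that every cell-type label has an entry in the color map, since a label missing from the map would not be drawn in the article's color. A small sketch with toy data; `missing_colors`, `toy_cdm`, and `toy_labels` are hypothetical names.

```python
def missing_colors(labels, color_map):
    """Return the sorted labels that have no entry in color_map."""
    return sorted(set(labels) - set(color_map))


# Toy subset of the color map and a toy label column (assumed values).
toy_cdm = {"EVT": "rgb(238,174,238)", "NK-cell": "rgb(65,105,225)"}
toy_labels = ["EVT", "NK-cell", "EVT", "Decidual"]

missing = missing_colors(toy_labels, toy_cdm)  # "Decidual" has no color here
```

In the notebook, the equivalent check would be `missing_colors(mdata.FinalName, cdm)`, which should return an empty list if the map covers every cell type.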

In [0]:
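## Combine Cartesian coordinates, cell-type labels, and the raw embedding angles in one table
df = pd.DataFrame({"x": x, "y": y, "z": z, "id": mdata.FinalName, "lat": sphere_mapper.embedding_[:, 0], "lon": sphere_mapper.embedding_[:, 1]})
## Interactive 3D scatter plot colored by cell type, using the article's color map
fig = px.scatter_3d(df, x='x', y='y', z='z', color="id", color_discrete_map=cdm)
fig.update_traces(marker=dict(size=2))  # small markers, since there are tens of thousands of cells
camera = dict(eye=dict(x=2, y=2, z=0.25))  # initial viewpoint
fig.update_layout(template="plotly_dark", scene_camera=camera)
fig.show()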

In [0]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [0]:
## Save the sphere coordinates and labels as a tab-separated file on Google Drive
df.to_csv("/content/drive/Shared drives/Piquelab/sclabor.colab/placenta_sphere.tsv", sep='\t')
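
To reuse the saved table later, it can be read back with `pd.read_csv`, passing `sep="\t"` and `index_col=0` so that the row index written by `to_csv` is not loaded as an extra column. A minimal round-trip sketch on a toy table in a temporary directory; `toy_df` and the file name are hypothetical.

```python
import os
import tempfile

import pandas as pd

# Toy stand-in for the real `df` (assumed columns; the real table also has lat/lon).
toy_df = pd.DataFrame(
    {"x": [0.0, 1.0], "y": [1.0, 0.0], "z": [0.0, 0.0], "id": ["EVT", "NK-cell"]}
)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "toy_sphere.tsv")
    toy_df.to_csv(path, sep="\t")  # same call pattern as in the notebook
    restored = pd.read_csv(path, sep="\t", index_col=0)  # first column is the saved index
```

Without `index_col=0`, the saved index would come back as an additional `Unnamed: 0` column, as seen in the metadata table loaded earlier.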