## CellAssign - Annotation of cell types

Taken from: https://docs.scvi-tools.org/en/stable/tutorials/notebooks/cellassign_tutorial.html

Assigning single-cell RNA-seq data to known cell types

CellAssign is a probabilistic model that uses prior knowledge of cell-type marker genes to assign cells from scRNA-seq data to predefined cell types. Unlike many other annotation methods, CellAssign does not require labeled single-cell data; it only needs to know, for each cell type, whether or not a given gene is a marker. The original paper and R code are linked below.

Paper: Probabilistic cell-type assignment of single-cell RNA-seq for tumor microenvironment profiling, Nature Methods 2019

Code: https://github.com/Irrationone/cellassign
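
To make the input format concrete, here is a minimal sketch of a binary marker matrix of the kind described above: rows are genes, columns are cell types, and a 1 means the gene is treated as a marker for that type. The gene and cell-type names are illustrative placeholders chosen for this sketch, not the contents of the tutorial's CSV files.

```python
import pandas as pd

# Hypothetical marker lists; names are picked only for illustration.
markers = {
    "B cells": ["CD79A", "MS4A1"],
    "T cells": ["CD3D", "CD3E"],
}

# Rows are all genes mentioned anywhere, columns are the cell types.
genes = sorted({gene for gene_list in markers.values() for gene in gene_list})
marker_gene_mat = pd.DataFrame(0, index=genes, columns=list(markers))

# Set 1 where a gene is a marker of a cell type, leave 0 elsewhere.
for cell_type, gene_list in markers.items():
    marker_gene_mat.loc[gene_list, cell_type] = 1

print(marker_gene_mat)
```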



In [1]:
# scvi_colab is only needed on Google Colab; skip this step when it is not installed.
try:
    from scvi_colab import install

    install()
except ModuleNotFoundError:
    pass

In [5]:
import gdown
import matplotlib.pyplot as plt
import pandas as pd
import scanpy as sc
import scvi
import seaborn as sns
from scvi.external import CellAssign

In [6]:
url = "https://drive.google.com/uc?id=10l6m2KKKioCZnQlRHomheappHh-jTFmx"
output = "./annotation/sce_follicular_annotated_final.h5ad"
gdown.download(url, output, quiet=False)

url = "https://drive.google.com/uc?id=1Pae7VEcoZbKRvtllGAEWG4SOLWSjjtCO"
output = "./annotation/sce_hgsc_annotated_final.h5ad"
gdown.download(url, output, quiet=False)

url = "https://drive.google.com/uc?id=1Mk5uPdnPC4IMRnuG5N4uFvypT8hPdJ74"
output = "./annotation/HGSC_celltype.csv"
gdown.download(url, output, quiet=False)

url = "https://drive.google.com/uc?id=1tJSOI9ve0i78WmszMLx2ul8F8tGycBTd"
output = "./annotation/FL_celltype.csv"
gdown.download(url, output, quiet=False)

Downloading...
From: https://drive.google.com/uc?id=10l6m2KKKioCZnQlRHomheappHh-jTFmx
To: /home/pranav/work/research-reference/nextgen-analysis/scvi-tools/annotation/sce_follicular_annotated_final.h5ad
100%|██████████| 83.0M/83.0M [00:00<00:00, 805MB/s]
Downloading...
From (uriginal): https://drive.google.com/uc?id=1Pae7VEcoZbKRvtllGAEWG4SOLWSjjtCO
From (redirected): https://drive.google.com/uc?id=1Pae7VEcoZbKRvtllGAEWG4SOLWSjjtCO&confirm=t&uuid=b349605f-94fc-47b3-ab7d-ae539f138718
To: /home/pranav/work/research-reference/nextgen-analysis/scvi-tools/annotation/sce_hgsc_annotated_final.h5ad
100%|██████████| 110M/110M [00:00<00:00, 859MB/s] 
Downloading...
From: https://drive.google.com/uc?id=1Mk5uPdnPC4IMRnuG5N4uFvypT8hPdJ74
To: /home/pranav/work/research-reference/nextgen-analysis/scvi-tools/annotation/HGSC_celltype.csv
100%|██████████| 1.16k/1.16k [00:00<00:00, 3.70MB/s]
Downloading...
From: https://drive.google.com/uc?id=1tJSOI9ve0i78WmszMLx2ul8F8tGycBTd
To: /home/pranav/work/researc

'./annotation/FL_celltype.csv'
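
The download cell repeats the same three lines four times and relies on `./annotation` already existing. A possible variation, sketched below, keeps the file IDs from that cell in one mapping and creates the directory first. The helper name `plan_downloads` is hypothetical and introduced only for this sketch; it builds the URL and output pairs and does not download anything itself.

```python
from pathlib import Path

# File names and Google Drive IDs copied from the download cell above.
FILES = {
    "sce_follicular_annotated_final.h5ad": "10l6m2KKKioCZnQlRHomheappHh-jTFmx",
    "sce_hgsc_annotated_final.h5ad": "1Pae7VEcoZbKRvtllGAEWG4SOLWSjjtCO",
    "HGSC_celltype.csv": "1Mk5uPdnPC4IMRnuG5N4uFvypT8hPdJ74",
    "FL_celltype.csv": "1tJSOI9ve0i78WmszMLx2ul8F8tGycBTd",
}


def plan_downloads(out_dir="./annotation"):
    """Create out_dir if needed and return a list of (url, output_path) pairs."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    return [
        (f"https://drive.google.com/uc?id={file_id}", str(out / name))
        for name, file_id in FILES.items()
    ]
```

Each pair can then be passed to `gdown.download(url, output, quiet=False)` exactly as in the cell above.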

In [9]:
sc.set_figure_params(figsize=(4, 4))

%config InlineBackend.print_figure_kwargs={'facecolor' : "w"}
%config InlineBackend.figure_format='retina'

### Follicular lymphoma data

In [11]:
adata = sc.read("./annotation/sce_follicular_annotated_final.h5ad")
adata.var_names_make_unique()
adata.obs_names_make_unique()

  utils.warn_names_duplicates("obs")
  utils.warn_names_duplicates("var")


TypeError: Cannot setitem on a Categorical with a new category (FAM231C-1), set the categories first
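
The `TypeError` above is consistent with the gene names being stored as a categorical index: pandas refuses to write a label such as `FAM231C-1` into a `Categorical` unless it is already one of the categories. The sketch below reproduces that behaviour on a toy index (the gene list is made up) and shows that casting to plain strings first avoids it. Applying the same cast to the AnnData object, for example `adata.var_names = adata.var_names.astype(str)` before calling `adata.var_names_make_unique()`, is an untested assumption about this dataset, not a confirmed fix.

```python
import pandas as pd

# Toy stand-in for gene names stored as a categorical index with one duplicate.
names = pd.CategoricalIndex(["FAM231C", "CD19", "FAM231C"])

values = names.values.copy()  # a pandas Categorical, not a plain array
try:
    values[2] = "FAM231C-1"  # this label is not among the existing categories
    error_message = None
except (TypeError, ValueError) as err:
    error_message = str(err)

# Casting to plain strings first removes the category restriction.
plain = list(names.astype(str))
plain[2] = "FAM231C-1"

print(error_message)
print(plain)
```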

In [None]:
# Read the marker matrix from the directory it was downloaded to above.
marker_gene_mat = pd.read_csv("./annotation/FL_celltype.csv", index_col=0)

#### Create and fit CellAssign model

In [None]:
bdata = adata[:, marker_gene_mat.index].copy()
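
Subsetting by `marker_gene_mat.index` assumes every marker gene is present in `adata.var_names`; indexing would otherwise fail. A quick pre-check can be sketched with plain pandas indexes standing in for the two name lists. The gene names here are illustrative, not taken from the follicular lymphoma data.

```python
import pandas as pd

# Stand-ins for adata.var_names and marker_gene_mat.index.
var_names = pd.Index(["CD79A", "MS4A1", "CD3D", "GAPDH"])
marker_genes = pd.Index(["CD79A", "CD3D", "CD3E"])

# Marker genes that are absent from the expression data.
missing = marker_genes.difference(var_names)

# Marker genes that can be used for subsetting, in their original order.
present = marker_genes[marker_genes.isin(var_names)]

print("missing:", list(missing))
print("present:", list(present))
```

If `missing` is non-empty, one option is to subset both the data and the marker matrix to `present` so the two stay aligned.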