In [1]:
import pandas as pd
import numpy as np
import scimap as sm
import anndata as ad
import warnings
warnings.filterwarnings("ignore")

Running SCIMAP  2.3.5




# Prior knowledge driven annotation

If you ran the clustering step before, load the AnnData object you saved from it.

In [2]:
adata = ad.read_h5ad('../cells_annotated.h5ad')

## Read in the decision matrix and inspect it

In [3]:
# load the phenotyping workflow
phenotype = pd.read_csv('../scimapExampleData/phenotype_workflow.csv')
# view the table:
phenotype.style.format(na_rep='')

Unnamed: 0.1,Unnamed: 0,Unnamed: 1,ELANE,CD57,CD45,CD11B,SMA,CD16,ECAD,FOXP3,NCAM
0,all,ECAD+,,,,,,,pos,,
1,all,Immune,,,pos,,,,,,
2,all,SMA+,,,,,pos,,,,
3,Immune,NK cells,,allpos,,neg,,allpos,,,
4,Immune,Other myeloid cells,,,,pos,,,,,
5,Immune,Treg,,,,,,,,pos,
6,Other myeloid cells,Dendritic cells,,allneg,,,,allneg,,,


Only very broad phenotypes are defined here. Try to understand how this matrix works and how it recapitulates the hierarchical system we talked about.
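To make the hierarchy easier to see, here is a minimal sketch that prints the matrix as a tree. It uses a hand-typed copy of the first two columns of the table above (the population a row is gated from, and the phenotype it assigns); the column names `parent` and `phenotype` and the helper `print_tree` are made up for this illustration and are not part of scimap.

```python
import pandas as pd

# Toy copy of the first two columns of the workflow table shown above
# (hypothetical column names; the marker columns are left out).
workflow = pd.DataFrame(
    {
        "parent": ["all", "all", "all", "Immune", "Immune", "Immune",
                   "Other myeloid cells"],
        "phenotype": ["ECAD+", "Immune", "SMA+", "NK cells",
                      "Other myeloid cells", "Treg", "Dendritic cells"],
    }
)

# Group the phenotypes by the population they are gated from
children = workflow.groupby("parent", sort=False)["phenotype"].apply(list).to_dict()


def print_tree(node, depth=0):
    """Print each phenotype indented below the population it refines."""
    for child in children.get(node, []):
        print("  " * depth + child)
        print_tree(child, depth + 1)


print_tree("all")
```

Rows whose first column is `all` are tested on every cell; rows that name a phenotype in the first column (for example `Immune`) are only tested on cells that already received that label.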

In [4]:
#%gui qt

In [5]:
image_path = '../scimapExampleData/registration/exemplar-001.ome.tif'

## Threshold the image
Gating is very subjective, and this data is not really clean: it suffers a lot from autofluorescence. Try to find good gates by looking at each marker on the image and at its histogram, and add biological knowledge (for example, long bright lines are more likely autofluorescent elastic fibers than real staining).
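If you want a numeric starting point before adjusting a gate by eye, one option is to compute a histogram-based threshold. The sketch below applies Otsu's method to synthetic log-intensities; this is a stand-in chosen for illustration, not the GMM initialization that scimap runs, and the data, the function name `otsu_gate`, and the bin count are all assumptions. Treat the result as a first guess to check against the image, never as the final gate.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic log-intensities for one marker (NOT the exemplar data):
# a large negative population and a smaller positive one.
values = np.concatenate([rng.normal(5.5, 0.4, 7000), rng.normal(8.0, 0.4, 3000)])


def otsu_gate(x, bins=256):
    """Return the cut that maximizes the between-class variance (Otsu)."""
    counts, edges = np.histogram(x, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(counts)                   # cells at or below each candidate cut
    w1 = w0[-1] - w0                         # cells above it
    s0 = np.cumsum(counts * centers)         # running intensity sum
    mu0 = s0 / np.maximum(w0, 1)             # mean of the lower class
    mu1 = (s0[-1] - s0) / np.maximum(w1, 1)  # mean of the upper class
    between = w0 * w1 * (mu0 - mu1) ** 2
    # The cut sits at the upper edge of the best bin
    return float(edges[1:][np.argmax(between)])


gate = otsu_gate(values)
print(f"suggested starting gate: {gate:.2f}")
print(f"fraction of cells above the gate: {(values > gate).mean():.1%}")
```

On real data, autofluorescence can add a bright tail that drags any automatic threshold, which is why the visual check on the image still has the last word.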

In [6]:
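# Opens an interactive viewer; this needs the napari package to be importable
# in the current environment (otherwise the call stops with a NameError, as below)
sm.pl.napariGater(image_path, adata)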

Initializing...
Initializing gates with GMM (per image)...
Loading image data...
Calculating contrast settings...
Saved contrast settings for exemplar-001--unmicst_cell with 12 channels
Initialization completed in 1.24 seconds
Opening napari viewer...


NameError: name 'napari' is not defined

In [7]:
adata

AnnData object with n_obs × n_vars = 11201 × 9
    obs: 'X_centroid', 'Y_centroid', 'Area', 'MajorAxisLength', 'MinorAxisLength', 'Eccentricity', 'Solidity', 'Extent', 'Orientation', 'CellID', 'imageid', 'leiden', 'leiden_phenotype'
    uns: 'all_markers', 'gates', 'napariGaterProvenance', 'image_contrast_settings'
    layers: 'log'

In [8]:
adata = sm.pp.rescale(adata, gate=adata.uns['gates'])


Scaling Image: exemplar-001--unmicst_cell
Scaling ELANE (gate: 7.464)
Scaling CD57 (gate: 7.207)
Scaling CD45 (gate: 7.027)
Scaling CD11B (gate: 7.093)
Scaling SMA (gate: 6.598)
Scaling CD16 (gate: 6.294)
Scaling ECAD (gate: 7.380)
Scaling FOXP3 (gate: 7.319)
Scaling NCAM (gate: 6.868)


In [9]:
adata = sm.tl.phenotype_cells(adata, phenotype=phenotype, label="phenotype")

Phenotyping ECAD+
Phenotyping Immune
Phenotyping SMA+
-- Subsetting Immune
Phenotyping NK cells
Phenotyping Other myeloid cells
Phenotyping Treg
-- Subsetting Other myeloid cells
Phenotyping Dendritic cells
Consolidating the phenotypes across all groups


In [10]:
adata.obs['phenotype'].value_counts()

phenotype
SMA+                   3326
ECAD+                  3071
Unknown                2087
Immune                 1853
Other myeloid cells     417
Dendritic cells         248
Treg                    124
NK cells                 75
Name: count, dtype: int64

## Rename the phenotypes to match the cluster annotations

Now rename the cells so that they fit the cluster annotations. Hint: dendritic cells need to 'vanish' into another category; for anything else, check the markers from clustering. This is needed for step 3.

In [11]:
adata.obs.dtypes

X_centroid           float64
Y_centroid           float64
Area                   int64
MajorAxisLength      float64
MinorAxisLength      float64
Eccentricity         float64
Solidity             float64
Extent               float64
Orientation          float64
CellID                 int64
imageid             category
leiden              category
leiden_phenotype    category
phenotype             object
dtype: object

In [12]:
adata.obs['phenotype'].value_counts()

phenotype
SMA+                   3326
ECAD+                  3071
Unknown                2087
Immune                 1853
Other myeloid cells     417
Dendritic cells         248
Treg                    124
NK cells                 75
Name: count, dtype: int64

In [13]:
adata.obs['leiden_phenotype'].value_counts()

leiden_phenotype
Immune       4824
Tumor        3239
Myeloid      1798
Vessels       844
Treg          415
Artifacts      81
Name: count, dtype: int64

In [14]:
# Map the gated phenotypes onto the labels used in 'leiden_phenotype'
rename_map = {'ECAD+': 'Tumor', 'SMA+': 'Vessels', 'Other myeloid cells': 'Myeloid',
              'Dendritic cells': 'Myeloid', 'NK cells': 'Immune'}
adata.obs['phenotype'] = adata.obs['phenotype'].astype(str)
# Labels missing from the map ('Immune', 'Treg', 'Unknown') stay unchanged
adata.obs['phenotype_to_compare'] = adata.obs['phenotype'].replace(rename_map)
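A quick sanity check after renaming is to compare the two label sets. The sketch below uses hand-typed label lists taken from the `value_counts()` outputs above, after applying the renaming, rather than the real `adata`; the variable names are made up for the example.

```python
import pandas as pd

# Labels expected in 'phenotype_to_compare' after the renaming (typed by hand)
gated = pd.Series(["Tumor", "Vessels", "Unknown", "Immune", "Myeloid", "Treg"])
# Labels present in 'leiden_phenotype' (typed by hand)
clustered = pd.Series(["Immune", "Tumor", "Myeloid", "Vessels", "Treg", "Artifacts"])

# Labels that exist in only one of the two annotations
only_gated = sorted(set(gated) - set(clustered))
only_clustered = sorted(set(clustered) - set(gated))

print("only in gated phenotypes:", only_gated)
print("only in cluster annotation:", only_clustered)
```

On the real object, the same comparison would read the labels from `adata.obs['phenotype_to_compare'].unique()` and `adata.obs['leiden_phenotype'].unique()`, and `pd.crosstab(adata.obs['leiden_phenotype'], adata.obs['phenotype_to_compare'])` would show how the cells distribute across both annotations.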

In [15]:
adata.write_h5ad('../cells_annotated_both.h5ad')