## Running SAM

Below is a quickstart tutorial for analyzing scRNA-seq data with SAM and visualizing the results.

In [1]:
from SAM import SAM
from SAMGUI import SAMGUI

Initialize the SAM object with default values.

In [2]:
sam = SAM()

Load the data (by default, `load_data` expects genes as rows and cells as columns):

In [3]:
sam.load_data('../example_data/GSE74596_data.csv.gz')
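
To make the expected input layout concrete, here is a minimal sketch using only the Python standard library. The toy CSV, its gene and cell names, and the variable names are all hypothetical; it only illustrates a genes-as-rows, cells-as-columns table and how it maps onto the cells × genes orientation that AnnData reports (`n_obs × n_vars`). It does not reproduce what `load_data` does internally.

```python
import csv
import io

# Hypothetical toy matrix in the layout described above:
# header row = cell IDs, first column = gene IDs, one row per gene.
toy_csv = """gene,cell_A,cell_B,cell_C
gene_1,0,3,1
gene_2,5,0,2
"""

rows = list(csv.reader(io.StringIO(toy_csv)))
cell_ids = rows[0][1:]                # header minus the corner label
gene_ids = [r[0] for r in rows[1:]]   # first column of each data row
genes_by_cells = [[float(v) for v in r[1:]] for r in rows[1:]]

# Transpose to cells x genes (observations = cells, variables = genes).
cells_by_genes = [list(col) for col in zip(*genes_by_cells)]

print(len(gene_ids), "genes x", len(cell_ids), "cells")
print(cells_by_genes[1])  # expression profile of cell_B across all genes
```

If your own CSV is stored with cells as rows instead, it would need to be transposed (or the loader configured accordingly) before following the rest of this tutorial.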

By default, `load_data` also writes the AnnData object containing the expression data to an `.h5ad` file. This file loads much faster than the CSV file, especially for large datasets. To load it, use `load_data` again. Cell annotations can optionally be loaded as well:

In [4]:
sam.load_annotations('../example_data/GSE74596_ann.csv')

Log-normalize and filter the data:

In [5]:
sam.preprocess_data()
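
For readers unfamiliar with the terms, the following is a generic sketch of what "log-normalize and filter" can mean. It is not SAM's implementation: the `log2(count + 1)` transform, the "detected in at least one cell" filter, and the toy data are all assumptions chosen for illustration. The actual transform and thresholds used by `preprocess_data` are not described in this notebook.

```python
import math

# Hypothetical raw counts: one row per cell, one column per gene.
counts = [
    [0, 10, 3],
    [0, 0, 7],
    [0, 4, 1],
]
gene_ids = ["gene_1", "gene_2", "gene_3"]

# Assumed transform: log2(count + 1), applied elementwise.
logged = [[math.log2(c + 1) for c in row] for row in counts]

# Assumed filter: keep genes detected (count > 0) in at least one cell.
keep = [any(row[j] > 0 for row in counts) for j in range(len(gene_ids))]

kept_ids = [g for g, k in zip(gene_ids, keep) if k]
filtered = [[v for v, k in zip(row, keep) if k] for row in logged]

print(kept_ids)      # gene_1 is never detected, so it is dropped
print(filtered[1])   # log-transformed values for the second cell
```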

To run SAM using default parameters,

In [6]:
sam.run()

8000
RUNNING SAM
Iteration: 0, Convergence: 0.3890052387357582
Iteration: 1, Convergence: 0.22737380247588126
Iteration: 2, Convergence: 0.010527955229685465
Computing the UMAP embedding...
Elapsed time: 5.46448540687561 seconds


SAM outputs a ranked list of genes (stored in `sam.adata.uns['ranked_genes']`). The genes are ordered by their respective weights (stored in `sam.adata.var['weights']`).

In [7]:
print(sam.adata.uns['ranked_genes'][:100])  # top 100 genes

['ENSMUSG00000037313.16' 'ENSMUSG00000017716.15' 'ENSMUSG00000069272.5'
 'ENSMUSG00000001403.13' 'ENSMUSG00000027306.15' 'ENSMUSG00000026573.7'
 'ENSMUSG00000044734.15' 'ENSMUSG00000020649.11' 'ENSMUSG00000006398.15'
 'ENSMUSG00000028873.16' 'ENSMUSG00000005233.16' 'ENSMUSG00000041431.16'
 'ENSMUSG00000032218.6' 'ENSMUSG00000014453.3' 'ENSMUSG00000046057.4'
 'ENSMUSG00000040204.6' 'ENSMUSG00000058773.2' 'ENSMUSG00000029810.15'
 'ENSMUSG00000023505.13' 'ENSMUSG00000094777.2' 'ENSMUSG00000040899.13'
 'ENSMUSG00000023367.14' 'ENSMUSG00000074403.2' 'ENSMUSG00000030149.15'
 'ENSMUSG00000019942.12' 'ENSMUSG00000021411.15' 'ENSMUSG00000027715.9'
 'ENSMUSG00000030165.16' 'ENSMUSG00000067613.5' 'ENSMUSG00000038943.16'
 'ENSMUSG00000005470.8' 'ENSMUSG00000020914.17' 'ENSMUSG00000019773.7'
 'ENSMUSG00000089656.1' 'ENSMUSG00000035459.15' 'ENSMUSG00000027469.16'
 'ENSMUSG00000025747.12' 'ENSMUSG00000015437.4' 'ENSMUSG00000027326.13'
 'ENSMUSG00000106281.1' 'ENSMUSG00000030867.7' 'ENSMUSG00000069301
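
The relationship between per-gene weights and a ranked gene list can be sketched in plain Python. The gene names and weights below are made up, and the sketch assumes the ranking is simply descending weight order; the notebook states that genes are ordered by weight but does not spell out the direction or any tie-breaking.

```python
# Hypothetical gene IDs and weights, for illustration only.
gene_ids = ["gene_a", "gene_b", "gene_c", "gene_d"]
weights = [0.12, 0.87, 0.05, 0.40]

# Sort gene indices by descending weight, then map back to gene names.
order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
ranked_genes = [gene_ids[i] for i in order]

top_n = 2
print(ranked_genes[:top_n])  # the top_n highest-weighted genes
```

With real results, the same slicing pattern (`[:top_n]`) applies directly to `sam.adata.uns['ranked_genes']`, as in the cell above.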

All the important SAM objects and results are stored in `sam.adata`:

In [8]:
sam.adata  # the AnnData object

AnnData object with n_obs × n_vars = 203 × 45686 
    obs: 'annotation'
    var: 'mask_genes', 'spatial_dispersions', 'weights'
    uns: 'preprocess_args', 'ranked_genes', 'pca_obj', 'X_processed', 'neighbors', 'run_args'
    obsm: 'X_pca', 'X_umap'
    layers: 'X_disp', 'X_knn_avg'

To launch the GUI,

In [9]:
sam_gui = SAMGUI(sam)
sam_gui.SamPlot

HBox(children=(Tab(children=(FigureWidget({
    'data': [{'hoverinfo': 'text',
              'marker': {'size'…

To save the SAM object and all its contents to a pickle file:

In [10]:
sam.save('example_save', dirname='output_directory')

To load a saved SAM object and all its contents:

In [11]:
sam = SAM()  # create an empty SAM object
sam.load('output_directory/example_save.p')
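
Since the save file is a pickle, the round trip can be illustrated with Python's standard `pickle` module. This is only a sketch of the general mechanism: the `state` dictionary is a hypothetical stand-in, and nothing here describes how `sam.save` or `sam.load` organize the file internally. A temporary directory is used so the example leaves no files behind.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the kind of state an analysis object might hold.
state = {"ranked_genes": ["gene_b", "gene_d"], "example_param": 1}

with tempfile.TemporaryDirectory() as tmp:
    out_dir = os.path.join(tmp, "output_directory")
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, "example_save.p")

    # Serialize the object to disk.
    with open(path, "wb") as fh:
        pickle.dump(state, fh)

    # Read it back into a new Python object.
    with open(path, "rb") as fh:
        restored = pickle.load(fh)

print(restored == state)  # the restored object compares equal to the original
```

Because unpickling can execute arbitrary code, only load pickle files that come from a source you trust.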

For more detailed tutorials, please see the other Jupyter notebooks.