meK-Means: Biophysically Motivated and Interpretable Inference of Cell Types from Multimodal Sequencing Data
Notebooks for reproducing all figures and analysis of simulated and single-cell datasets for the meK-Means paper. All saved/processed data used for analysis can be found on CaltechData. Figure created with BioRender.com.
------ For a tutorial on how to use meK-Means ------
See example_meKMeans_notebook.ipynb
------ For a tutorial on generating U and S counts ------
See get_data_example_notebook.ipynb
Notebooks that run on Google Colab include a Colab link at the top of the notebook. Click the Colab symbol to open the notebook there.
An introduction to using Google Colab can be found here. Briefly, run each code cell by selecting the cell and pressing Command/Ctrl+Enter.
# To install meK-Means
pip install monod

# To import it in Python
import monod
from monod import mminference  # Implements the meK-Means algorithm
meK-Means utilizes the Monod package for single-cell, CME-based inference.
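For rough intuition only, the sketch below shows likelihood-based hard assignment of cells to clusters from paired (U, S) counts. It is not the meK-Means implementation and does not use the Monod API. Independent Poisson distributions stand in for the CME-based model, and every name and number in it (`poisson_logpmf`, `assign`, `update`, `toy_cluster`, the toy counts) is hypothetical. See example_meKMeans_notebook.ipynb for the actual method.

```python
import math


def poisson_logpmf(k, rate):
    """Log-probability of count k under a Poisson with the given rate."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def assign(cells, rates):
    """Assign each (u, s) cell to the cluster with the highest log-likelihood."""
    labels = []
    for u, s in cells:
        scores = [poisson_logpmf(u, ru) + poisson_logpmf(s, rs) for ru, rs in rates]
        labels.append(scores.index(max(scores)))
    return labels


def update(cells, labels, rates, floor=1e-3):
    """Re-estimate each cluster's (U, S) rates as the mean counts of its cells."""
    new_rates = []
    for k, old in enumerate(rates):
        members = [c for c, lab in zip(cells, labels) if lab == k]
        if not members:
            new_rates.append(old)  # keep the previous rates for an empty cluster
            continue
        mean_u = sum(u for u, _ in members) / len(members)
        mean_s = sum(s for _, s in members) / len(members)
        # Floor the rates so that log(rate) stays finite.
        new_rates.append((max(mean_u, floor), max(mean_s, floor)))
    return new_rates


def toy_cluster(cells, init_rates, n_iter=20):
    """Alternate assignment and rate updates until the labels stop changing."""
    rates = list(init_rates)
    labels = assign(cells, rates)
    for _ in range(n_iter):
        rates = update(cells, labels, rates)
        new_labels = assign(cells, rates)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, rates


# Hypothetical toy data: (unspliced, spliced) counts for one gene in six cells.
cells = [(1, 2), (0, 3), (2, 1), (9, 20), (11, 18), (10, 22)]
labels, rates = toy_cluster(cells, init_rates=[(1.0, 1.0), (8.0, 15.0)])
print(labels)  # [0, 0, 0, 1, 1, 1]
print(rates)   # [(1.0, 2.0), (10.0, 20.0)]
```

The alternation between assigning cells and refitting per-cluster parameters is the only part meant to carry over. In this toy the refit is a simple mean of counts, whereas the package performs CME-based inference through Monod.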
- example_meKMeans_notebook.ipynb: Tutorial notebook for using meK-Means to cluster single-cell RNA-seq data.
- get_data_example_notebook.ipynb: Tutorial notebook of how to obtain U and S counts from single-cell RNA-seq FASTQs.
- analysis_notebooks: All analysis notebooks from which the paper figures were generated.
  - sim_benchmark_data_gen: Folder for generation of simulation data and all preprocessing of biological datasets into loom files.
  - Fig1_standard.ipynb: Notebook for generating Fig. 1 standard clustering plots.
  - Figs_2_3_plots.ipynb: Notebook for generating Fig. 2 and 3 benchmark results plots.
  - Fig4_explorData.ipynb: Notebook for generating Fig. 4 exploratory analysis plots.
  - sim_benchmark_meK_Leiden_KMeans.ipynb: Notebook for running clustering of meK-Means, Leiden, and K-Means on all data.
  - sim_benchmark_WNN.ipynb: Notebook for running Seurat WNN on all data.
  - sim_benchmark_scMDC_scVI.ipynb: Notebook for running scVI and scMDC on all data (uses Colab GPU).
  - Supp_dropout.ipynb: Notebook for analysis of simulated dropout data.
  - Supp_timing.ipynb: Notebook for runtime benchmarking of meK-Means.
- analysis_output: Saved result files (clustering method results) from analysis notebooks.
- scripts: Python script to extract germ cell dataset metadata and scMixology data metadata.
- env: Conda environment (yml) for Linux.