Reproducibility repository for the manuscript:
Clonal embeddings allow exploratory analysis of lineage-resolved single-cell data
Sergey Isaev, Alek G Erickson, Igor Adameyko*, Peter V Kharchenko*
This repository contains only the notebooks used to generate the figures and results in the paper. The clone2vec algorithm itself is implemented in a separate repository: https://github.com/kharchenkolab/clone2vec
The analyses use the following published datasets:
| Dataset | Tissue / system | Reference (DOI) | Accession / source |
|---|---|---|---|
| Haan et al. | Murine CNS, E9.5/E10.5 | 10.1126/science.adq9248 | E-MTAB-14817 |
| Ireland et al. | SCLC murine organoids | 10.1038/s41586-025-09503-z | 10.5281/zenodo.15857303 |
| Weinreb et al. | In vitro murine hematopoiesis | 10.1126/science.aaw3381 | cospar.datasets.hematopoiesis_130K |
| Sureshchandra et al. | Human PBMCs and tonsils | 10.1016/j.immuni.2025.10.025 | 10.5281/zenodo.18868813 and 10.5281/zenodo.13119615 |
| Chen et al. | Human CRC | 10.1016/j.ccell.2024.06.009 | GSE236581 |
| Luoma et al. | Human colon | 10.1016/j.cell.2020.06.001 | GSE144469 |
| Caushi et al. | Human NSCLC | 10.1038/s41586-021-03752-4 | GSE176022 |
| Liu et al. | Human NSCLC | 10.1038/s43018-021-00292-8 | GSE179994 |
| McCord et al. | Human HNSCC | 10.1126/sciimmunol.aec3133 | GSE287301 and GSE300147 |
| Wang et al. | Human brain tumors | 10.1158/2159-8290.CD-23-0913 | 10.5281/zenodo.10672442 |
All analysis notebooks are linked below via nbviewer, which renders large notebooks more reliably than GitHub's built-in preview.

    .
    ├── 00_Benchmarks/
    │   ├── 00_Properties
    │   └── 01_Subsampling
    ├── 01_Exploratory_analysis/
    │   ├── 00_Haan/
    │   │   ├── 00_Preparation
    │   │   ├── 01_Neurons_analysis
    │   │   ├── 02_Spatial_validation
    │   │   ├── 03_Spatial_autocorrelation
    │   │   └── 04_Additional_example
    │   └── 01_Ireland/
    │       └── 00_Clonal_embedding
    ├── 02_Gene_expression_analysis/
    │   └── 00_Weinreb/
    │       ├── 00_d6_prediction
    │       ├── 01_Neu
    │       ├── 02_Monocytes
    │       ├── 03_Illustration
    │       └── 04_ClonoClusters
    ├── 03_Immune_datasets/
    │   ├── 00_Sureshchandra/
    │   │   ├── 00_Data_preparation
    │   │   ├── 01_PBMC_embedding
    │   │   └── 02_Tonsils_embedding
    │   └── 01_Luoma/
    │       ├── 00_Embedding_preparation
    │       ├── 01_Clonal_embedding
    │       └── 02_Integration
    ├── 04_Cancer_datasets/
    │   ├── 00_NSCLC_Liu/
    │   │   ├── 00_Data_preparation
    │   │   └── 01_Embedding_construction
    │   ├── 01_NSCLC_Caushi/
    │   │   ├── 00_Preparation
    │   │   ├── 01_Gex_embeddings
    │   │   ├── 02_clone2vec
    │   │   └── 03_CD8_plots
    │   ├── 02_HNSCC_McCord/
    │   │   ├── 00_Data_preprocessing
    │   │   ├── 01_Embedding_construction
    │   │   ├── 02_Clonal_embedding
    │   │   ├── 03_Spatial_analysis
    │   │   └── 04_PBMC_TCR_analysis
    │   ├── 03_CRC_Chen/
    │   │   ├── 00_Embedding_preparation
    │   │   └── 01_Clonal_embedding
    │   ├── 04_GBM_Wang/
    │   │   ├── 00_Data_loading
    │   │   ├── 01_Cohort_1
    │   │   └── 02_Cohort_2
    │   └── 05_Integration/
    │       ├── 00_Clonal_embeddings_integration
    │       └── 01_Seurat
    └── 05_Additional_analysis/
        ├── 00_Clone_size_distribution
        └── 01_Illustrations
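If you have cloned or forked the repository and want an nbviewer link for a particular notebook, the URL can be assembled from the path in the tree above. The snippet below is a minimal sketch, not part of the repository: `nbviewer_url` is a hypothetical helper, `OWNER` and `REPO` are placeholders for the actual GitHub coordinates, and the `main` branch name and `.ipynb` extension are assumptions. It relies only on the public nbviewer URL scheme for GitHub-hosted notebooks.

```python
from urllib.parse import quote

NBVIEWER_BASE = "https://nbviewer.org/github"


def nbviewer_url(owner, repo, notebook_path, branch="main"):
    """Build an nbviewer URL for a notebook stored in a GitHub repository.

    Hypothetical helper: owner, repo and branch must be supplied by the caller.
    """
    # Percent-encode unusual characters but keep the slashes between segments.
    encoded_path = quote(notebook_path.strip("/"), safe="/")
    return f"{NBVIEWER_BASE}/{owner}/{repo}/blob/{branch}/{encoded_path}"


# OWNER and REPO are placeholders, not the real repository coordinates;
# the .ipynb extension is assumed.
print(nbviewer_url("OWNER", "REPO", "00_Benchmarks/00_Properties.ipynb"))
```

Replace the placeholders with the GitHub account, repository name, and branch you are viewing.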