scEval😈: An evaluation platform for single-cell Foundation Models (FMs)

This repo contains the code for our benchmarking and analysis project. The methods covered were collected up to March 1st, 2024.

Install

To install our scGPT-based benchmarking environment, use conda to create an environment from the yml file scgpt_bench.yml on your own machine:

conda env create -n scgpt --file scgpt_bench.yml

For the other methods we used, please refer to their original project websites for installation instructions. We recommend creating a separate environment for each method. Because installing the different scFMs can be difficult, we provide the yml files we used to install these models in the folder installation_baselines.
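As a rough illustration of the one-environment-per-method recommendation, the sketch below builds one `conda env create` command per yml file found in installation_baselines. The helper name `conda_commands` and the choice to name each environment after its file stem are assumptions made for this example and are not part of scEval. The commands are only printed for review, not executed.

```python
from pathlib import Path


def conda_commands(folder="installation_baselines"):
    """Build one `conda env create` command per yml file in `folder`.

    Assumption: each environment is named after the file stem, so
    `foo.yml` gives an environment called `foo`. Illustrative only.
    """
    commands = []
    for yml in sorted(Path(folder).glob("*.yml")):
        commands.append(
            ["conda", "env", "create", "-n", yml.stem, "--file", str(yml)]
        )
    return commands


if __name__ == "__main__":
    # Print the commands so they can be checked before being run by hand.
    for cmd in conda_commands():
        print(" ".join(cmd))
```

Printing rather than executing keeps the sketch safe to run; each printed line can be pasted into a shell once the environment name looks right.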

These methods include the following single-cell FMs:

tGPT, Geneformer, scBERT, CellLM, SCimilarity, scFoundation, CellPLM, UCE.

They also include the following task-specific models:

TOSICA, scJoint, GLUE, ResPAN, Harmony, scDesign3, Splatter, scVI, Tangram, GEARS.

scIB is required for evaluation. Please use pip to install it:

pip install scib

This repo also provides a version of scib that includes our new function. Please make sure you have scib >= 1.0.4 so that kBET runs correctly.
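The version requirement can be checked before running an evaluation. The following is a minimal sketch, assuming the package is installed under the distribution name `scib` (as in the pip command above). The helpers `version_tuple` and `scib_is_recent_enough` are hypothetical names introduced here, and the comparison only looks at the leading numeric parts of the version string.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '1.0.4' into (1, 0, 4); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def scib_is_recent_enough(installed, minimum="1.0.4"):
    """Return True if `installed` is at least `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    try:
        installed = metadata.version("scib")
    except metadata.PackageNotFoundError:
        print("scib is not installed")
    else:
        status = "OK" if scib_is_recent_enough(installed) else "too old for kBET"
        print(installed, status)
```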

We will release a version of scEval with more functions in the future!

Pre-training weights

Most of our experiments used the weights under scGPT_bc. scGPT_full from scGPT v2 was also used in the batch effect correction evaluation.

Pre-training weights of scBERT can be found in scBERT. Pre-training weights of CellLM can be found in cellLM. Pre-training weights of Geneformer can be found in Geneformer. Pre-training weights of SCimilarity can be found in SCimilarity.

scFoundation is accessed through its API; please refer to scFoundation for details.

Benchmarking information

Please refer to the task folders for the scEval code and the metrics we used to evaluate single-cell LLMs on each task. The tasks and their corresponding metrics are:

Tasks	Metrics
Batch Effect Correction, Multi-omics Data Integration and Simulation	scIB
Cell-type Annotation and Gene Function Prediction	Accuracy, Precision, Recall and F1 score
Imputation	scIB, Correlation
Perturbation Prediction	Correlation
Gene Network Analysis	Jaccard similarity
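
To make two of the metrics in the table concrete, here is a minimal sketch of Jaccard similarity and correlation in plain Python. It is not the code in sceval_lib.py. It assumes that gene networks are compared as sets of undirected edges and that "Correlation" means Pearson correlation; both are assumptions, as are the function names.

```python
import math


def jaccard_similarity(edges_a, edges_b):
    """Jaccard similarity |A & B| / |A | B| between two undirected edge sets."""
    # frozenset makes ("g1", "g2") and ("g2", "g1") the same edge.
    a = {frozenset(edge) for edge in edges_a}
    b = {frozenset(edge) for edge in edges_b}
    union = a | b
    if not union:
        return 1.0  # convention chosen here: two empty networks are identical
    return len(a & b) / len(union)


def pearson_correlation(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical gene names, used only to exercise the functions.
    inferred = [("geneA", "geneB"), ("geneB", "geneC"), ("geneC", "geneD")]
    reference = [("geneB", "geneA"), ("geneC", "geneD"), ("geneD", "geneE")]
    print(jaccard_similarity(inferred, reference))
    print(pearson_correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```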

The file 'sceval_lib.py' includes all of the metrics we used in this project.
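
For readers unfamiliar with the classification metrics used for cell-type annotation and gene function prediction, the sketch below computes accuracy together with macro-averaged precision, recall and F1 in plain Python. It is an illustration, not the implementation in sceval_lib.py. Macro averaging and the function name `classification_metrics` are assumptions, since the averaging scheme is not stated here.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1 score."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # A class with no predictions (or no true members) scores 0 here.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    k = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / k,
        "recall": sum(recalls) / k,
        "f1": sum(f1s) / k,
    }


if __name__ == "__main__":
    # Toy labels, used only to exercise the function.
    truth = ["T cell", "T cell", "B cell", "B cell"]
    predicted = ["T cell", "B cell", "B cell", "B cell"]
    print(classification_metrics(truth, predicted))
```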

To run the code for a task, run its script. For example, for batch effect correction:

python sceval_batcheffect.py

We offer demo datasets for batch effect correction and cell-type annotation. These datasets can be found here.

To avoid syncing runs to wandb, please set the following at the top of the script, before wandb is initialized:

import os
os.environ["WANDB_MODE"] = "offline"
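
The same setting can also be applied from outside the script. The sketch below launches a task script in a child process whose environment has WANDB_MODE set to offline, so the script itself does not need to be edited. The helper names `offline_env` and `run_offline` are hypothetical, and the sketch assumes it is run from the folder that contains the script.

```python
import os
import subprocess
import sys


def offline_env(base=None):
    """Copy an environment mapping and set WANDB_MODE=offline in the copy."""
    env = dict(os.environ if base is None else base)
    env["WANDB_MODE"] = "offline"
    return env


def run_offline(script="sceval_batcheffect.py"):
    """Run a task script with the current interpreter and wandb offline."""
    # check=True raises CalledProcessError if the script exits with an error.
    return subprocess.run([sys.executable, script], env=offline_env(), check=True)
```

Calling `run_offline()` runs the batch effect correction example. Because the variable is set on a copy, the parent process environment is left unchanged.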

Results

Our official website summarizes the results of this work. Please use this link for access.

Contact

Please contact tianyu.liu@yale.edu if you have any questions about this project.

Citation

@article{liu2023evaluating,
  title={Evaluating the Utilities of Foundation Models in Single-cell Data Analysis},
  author={Liu, Tianyu and Li, Kexing and Wang, Yuge and Li, Hongyu and Zhao, Hongyu},
  journal={bioRxiv},
  pages={2023--09},
  year={2023},
  publisher={Cold Spring Harbor Laboratory}
}
