uniPort: a unified single-cell data integration framework by optimal transport

The original paper: A unified single-cell data integration framework with optimal transport

Website and documentation: https://uniport.readthedocs.io

Source Code (MIT): https://github.com/caokai1073/uniport

Installation

The uniport package can be installed via pip3:

pip3 install uniport
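
To confirm that the installation succeeded, the installed version can be queried from Python. The sketch below uses only the standard library; the helper name `installed_version` is hypothetical and not part of uniport.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent.

    Hypothetical helper for illustration; it is not part of uniport.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("uniport")
    print("uniport version:", version if version else "not installed")
```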

Tutorials

Please check out the documentation and tutorials at uniport.readthedocs.io for more information.

Main function: uniport.Run()

Key parameters include:

adatas: List of AnnData matrices for each dataset.
adata_cm: AnnData matrix containing common genes from different datasets.
mode: Choose from ['h', 'v', 'd']. If mode='h', integrate data with common genes (horizontal integration). If mode='v', integrate data profiled from the same cells (vertical integration). If mode='d', integrate data without common genes (diagonal integration). Default: 'h'
lambda_s: balancing parameter between common and dataset-specific genes. Default: 0.5
lambda_recon: balancing parameter for the reconstruction term. Default: 1.0
lambda_kl: balancing parameter for the KL divergence term. Default: 0.5
lambda_ot: balancing parameter for the OT term. Default: 1.0
iteration: maximum number of training iterations. Training on one batch of batch_size samples counts as one iteration. Default: 30000
ref_id: id of the reference dataset. Default: the domain_id of the last dataset
save_OT: if True, also output a global OT plan. This requires more memory. Default: False
out: output of uniPort. Choose from ['latent', 'project', 'predict']. If out='latent', train the network and output cell embeddings. If out='project', project data into the latent space and output cell embeddings. If out='predict', project data into the latent space and output cell embeddings through a specified decoder. Default: 'latent'
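
The options above can be checked before a long training run is started. The following is a minimal sketch, not part of uniport: `check_run_options` and `approx_epochs` are hypothetical helpers, the choices and defaults are copied from the list above, and the epoch estimate assumes that one iteration processes exactly one mini-batch (the batch size and cell count used below are made-up values).

```python
import math

# Documented choices and fixed defaults, copied from the parameter list.
MODES = ("h", "v", "d")
OUTPUTS = ("latent", "project", "predict")
DEFAULTS = {
    "mode": "h",
    "lambda_s": 0.5,
    "lambda_recon": 1.0,
    "lambda_kl": 0.5,
    "lambda_ot": 1.0,
    "iteration": 30000,
    "save_OT": False,
    "out": "latent",
}


def check_run_options(**overrides):
    """Merge overrides into the documented defaults and validate the choices.

    Hypothetical helper: it only covers options with a fixed default, so
    adatas, adata_cm and ref_id must still be passed to the run separately.
    """
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown option(s): {sorted(unknown)}")
    options = {**DEFAULTS, **overrides}
    if options["mode"] not in MODES:
        raise ValueError(f"mode must be one of {MODES}, got {options['mode']!r}")
    if options["out"] not in OUTPUTS:
        raise ValueError(f"out must be one of {OUTPUTS}, got {options['out']!r}")
    return options


def approx_epochs(iteration, n_cells, batch_size):
    """Estimate passes over the data implied by an iteration budget.

    Assumes one iteration is one mini-batch and that a final partial
    batch still counts as a batch.
    """
    batches_per_epoch = math.ceil(n_cells / batch_size)
    return math.ceil(iteration / batches_per_epoch)


if __name__ == "__main__":
    options = check_run_options(mode="d", lambda_s=1.0)
    print(options)
    # Hypothetical dataset of 10000 cells and a batch size of 256.
    print(approx_epochs(options["iteration"], n_cells=10000, batch_size=256))
```

Under these assumptions the default budget of 30000 iterations corresponds to 750 passes over 10000 cells, since 10000 cells split into 40 batches of at most 256.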

Data

Google Drive
Baidu Drive Code: 1122

Example

import uniport as up
import scanpy as sc

# HVG: highly variable genes
adata1 = sc.read_h5ad('adata1.h5ad') # preprocessed data with data1-specific HVG
adata2 = sc.read_h5ad('adata2.h5ad') # preprocessed data with data2-specific HVG, used as reference data
adata_cm = sc.read_h5ad('adata_cm.h5ad') # preprocessed data with common HVG

# integration with both common and dataset-specific genes
# latent representations are stored in adata.obsm['latent']
adata = up.Run(adatas=[adata1, adata2], adata_cm=adata_cm)
# to also save the global optimal transport matrix:
# adata, OT = up.Run(adatas=[adata1, adata2], adata_cm=adata_cm, save_OT=True)
# integration with only common genes:
# adata = up.Run(adata_cm=adata_cm)
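
When save_OT=True, the returned transport matrix can be read as a soft matching between cells. The sketch below shows one way to inspect such a matrix, using a made-up toy plan. It assumes that rows index cells of one dataset and columns index cells of the reference; the layout of the matrix actually returned by uniPort is not stated here and should be checked in the documentation. `best_matches` is a hypothetical helper, not a uniport function.

```python
def best_matches(plan):
    """For each row of a dense transport plan, return (column, share of row mass).

    `plan` is a 2-D sequence of non-negative numbers, such as a list of lists
    or a dense NumPy array. Rows without any mass map to (None, 0.0).
    """
    matches = []
    for row in plan:
        weights = [float(w) for w in row]
        total = sum(weights)
        if total <= 0.0:
            matches.append((None, 0.0))
            continue
        # Column that receives the largest part of this row's mass.
        j = max(range(len(weights)), key=weights.__getitem__)
        matches.append((j, weights[j] / total))
    return matches


# Toy plan with made-up values: 3 query cells (rows), 3 reference cells (columns).
toy_plan = [
    [0.30, 0.05, 0.00],
    [0.02, 0.10, 0.28],
    [0.00, 0.00, 0.00],
]

for i, (j, share) in enumerate(best_matches(toy_plan)):
    print(f"row {i}: best column {j}, share of mass {share:.3f}")
```

The share of row mass gives a rough confidence for each match; a row whose mass is spread evenly over many columns has no clear counterpart.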

Citation

@Article{Cao2022,
author={Cao, Kai and Gong, Qiyu and Hong, Yiguang and Wan, Lin},
title={A unified computational framework for single-cell data integration with optimal transport},
journal={Nature Communications},
year={2022},
month={Dec},
day={01},
volume={13},
number={1},
pages={7419},
issn={2041-1723},
doi={10.1038/s41467-022-35094-8}}

Contact via caokai1073@gmail.com
