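Note: gene/protein IDs must be HUMAN; data from other species must first be converted to the corresponding human orthologs. This example uses PBMC data and assumes that **cell type annotation is already complete**.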
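As a rough illustration of the ortholog conversion mentioned above, the sketch below renames mouse gene symbols to human ones in a toy counts table. The `orthologs` dictionary and the gene `Gm12345` are made up for the example; a real mapping would have to come from an ortholog resource, and this is not a CellphoneDB feature.

```python
import pandas as pd

# Toy counts matrix: genes (rows) x cells (columns), with mouse gene symbols.
counts = pd.DataFrame(
    {"cell_1": [5, 0, 2], "cell_2": [1, 3, 0]},
    index=["Cd99", "Pilra", "Gm12345"],
)

# Hypothetical mouse -> human ortholog table (illustrative only).
orthologs = {"Cd99": "CD99", "Pilra": "PILRA"}

# Keep only genes with a known ortholog, then rename them to human symbols.
has_ortholog = counts.index.isin(list(orthologs))
human_counts = counts.loc[has_ortholog].rename(index=orthologs)

# If several mouse genes map to the same human gene, sum their counts.
human_counts = human_counts.groupby(level=0).sum()
print(human_counts)
```

Genes without an ortholog in the table (here `Gm12345`) are dropped, since they cannot be matched against the human database.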

[CellphoneDB tutorial](https://cellphonedb.readthedocs.io/en/latest/)


1. Download an archive from [releases](https://github.com/ventolab/cellphonedb-data/releases), unpack it, and take the cellphonedb.zip inside
2. Follow the [notebooks](https://github.com/ventolab/CellphoneDB/tree/master/notebooks) to practice

## Input example

In [1]:
import cellphonedb
import scanpy as sc
import numpy as np
import pandas as pd
import warnings
warnings.filterwarnings('ignore')

adata = sc.read_h5ad("tmp/pbmc3k_anno1.h5ad")  
adata.obs['seurat_clusters'] =  adata.obs["seurat_clusters"].astype('category')


adata

AnnData object with n_obs × n_vars = 2697 × 3000
    obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'pct.mt', 'pct.hb', 'pct.rp', 'nCount_SCT', 'nFeature_SCT', 'S.Score', 'G2M.Score', 'Phase', 'old.ident', 'CC.Difference', 'SCT_snn_res.0.1', 'seurat_clusters', 'Anno_1'
    var: 'features'
    uns: 'neighbors'
    obsm: 'X_pca', 'X_tsne', 'X_umap'
    varm: 'PCs'
    obsp: 'distances'

In [2]:
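## 1. metadata: the cell type annotation for each barcode
## rename() returns a new DataFrame, so the slice of adata.obs is not modified in place
metadata = adata.obs[['Anno_1']].rename(columns={'Anno_1': 'cell_type'})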
metadata.index.names = ['barcode_sample']
metadata.head(3)

barcode_sample,cell_type
AAACATACAACCAC-1,T_cells
AAACATTGAGCTAC-1,B_cell
AAACATTGATCAGC-1,T_cells


In [3]:
## 2. Make sure the barcodes of the counts match those of the metadata
## sorted() returns a new list, whereas list.sort() returns None
sorted(adata.obs.index) == sorted(metadata.index)

True
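When the barcodes do not match, a plain `True`/`False` gives little to go on. The sketch below reports which barcodes are missing on each side. The helper `check_barcodes` is hypothetical, and the two toy indexes stand in for `adata.obs.index` and `metadata.index`.

```python
import pandas as pd

def check_barcodes(counts_barcodes, meta_barcodes):
    """Return the barcodes found on only one side (hypothetical helper)."""
    counts_set, meta_set = set(counts_barcodes), set(meta_barcodes)
    return {
        "only_in_counts": sorted(counts_set - meta_set),
        "only_in_metadata": sorted(meta_set - counts_set),
    }

# Toy data standing in for adata.obs.index and metadata.index.
counts_barcodes = pd.Index(
    ["AAACATACAACCAC-1", "AAACATTGAGCTAC-1", "AAACATTGATCAGC-1"]
)
meta_barcodes = pd.Index(["AAACATTGAGCTAC-1", "AAACATACAACCAC-1"])

diff = check_barcodes(counts_barcodes, meta_barcodes)
print(diff)
```

Both lists being empty corresponds to the `True` result of the check above.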

In [4]:
## 3. Microenvironments --- CellphoneDB will only calculate interactions between cells that belong to a given microenvironment
## Here all cell types are assumed to share a single environment, i.e. no restriction
microenv = pd.DataFrame({'cell_type':metadata['cell_type'].unique()})
microenv['microenvironment'] = 'Env1'
microenv.head(3)

,cell_type,microenvironment
0,T_cells,Env1
1,B_cell,Env1
2,Monocyte,Env1
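For comparison, here is a sketch of the same two-column table when the cell types are split across more than one microenvironment. The assignment of cell types to `Env1` and `Env2` is invented for illustration, and `allowed_pairs` is only a way to visualize which cell-type pairs share an environment; it is not part of the CellphoneDB input.

```python
import itertools
import pandas as pd

# Hypothetical assignment of cell types to two microenvironments.
env_to_cell_types = {
    "Env1": ["T_cells", "B_cell"],
    "Env2": ["Monocyte", "NK_cell"],
}

# Long-format table with the same two columns as the microenv table above.
microenv_multi = pd.DataFrame(
    [(ct, env) for env, cts in env_to_cell_types.items() for ct in cts],
    columns=["cell_type", "microenvironment"],
)

# Cell-type pairs that share a microenvironment (self-pairs included).
allowed_pairs = set()
for cts in env_to_cell_types.values():
    allowed_pairs.update(itertools.product(cts, repeat=2))

print(microenv_multi)
print(("T_cells", "B_cell") in allowed_pairs)   # True: both are in Env1
print(("B_cell", "Monocyte") in allowed_pairs)  # False: different environments
```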


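## Run
Inputs such as the metadata and the microenvironments must be passed as file paths, which is a bit cumbersome. In a pipeline it may be more convenient to call the cellphonedb command line directly.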

Interaction = ligand-receptor pair

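1. cpdb_analysis_method (used here): simply reports the expression of the ligand and the receptor in their respective cell types and takes the mean of the two
2. cpdb_statistical_analysis_method: tests whether a given ligand-receptor pair is significantly higher in the given cell types than in the background
3. cpdb_degs_analysis_method: based on differential gene expression (DGE), which seems to allow a more flexible choice of statistics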

In short, methods 1, 2 and 3 successively extend the filtering options. See the [notebooks](https://github.com/ventolab/CellphoneDB/tree/master/notebooks) for usage and the [documentation](https://cellphonedb.readthedocs.io/en/latest/RESULTS-DOCUMENTATION.html#tutorials) for details.
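To make the "mean of the two" in method 1 concrete, the sketch below computes such a score by hand on toy data. It is a simplified illustration, not CellphoneDB's implementation: the expression values are invented, `interaction_mean` is a hypothetical helper, and setting the score to 0 when either partner is not expressed is an assumption.

```python
import pandas as pd

# Toy normalized expression: cells (rows) x genes (columns).
expr = pd.DataFrame(
    {"CD99": [1.0, 3.0, 0.0, 0.0], "PILRA": [0.0, 0.0, 2.0, 4.0]},
    index=["c1", "c2", "c3", "c4"],
)
cell_type = pd.Series(
    ["Platelets", "Platelets", "Monocyte", "Monocyte"], index=expr.index
)

# Mean expression of each gene within each cell type.
cluster_means = expr.groupby(cell_type).mean()

def interaction_mean(ligand, receptor, sender, receiver):
    """Average the ligand mean in `sender` and the receptor mean in `receiver`.

    Assumption: the score is 0 when either partner has zero mean expression.
    """
    a = cluster_means.loc[sender, ligand]
    b = cluster_means.loc[receiver, receptor]
    return 0.0 if a == 0 or b == 0 else (a + b) / 2

score = interaction_mean("CD99", "PILRA", "Platelets", "Monocyte")
print(score)  # (2.0 + 3.0) / 2 = 2.5
```

The score is directional: swapping sender and receiver here gives 0, because Monocyte does not express CD99 in the toy data.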

In [5]:
from cellphonedb.src.core.methods import cpdb_analysis_method 
cpdb_file_path = 'tmp/cellphonedb.zip'
meta_file_path = 'tmp/cellphoneTmp/metadata.tsv'
microenvs_file_path = 'tmp/cellphoneTmp/microenv.tsv'
output_path = 'tmp/cellphoneTmp/out'


## Save the input tables to disk first
import os
os.makedirs('tmp/cellphoneTmp', exist_ok=True)
metadata.to_csv(meta_file_path, index=True, header=True, sep = '\t')
microenv.to_csv(microenvs_file_path, index=False, header=True, sep = '\t')


## Run
cpdb_results = cpdb_analysis_method.call(
    cpdb_file_path = cpdb_file_path,         
    meta_file_path = meta_file_path,         
    counts_file_path = adata,     
    counts_data = 'gene_name',          ## gene identifier type used in adata.var: "ensembl", "gene_name" or "hgnc_symbol"
    microenvs_file_path = microenvs_file_path,
    output_path = output_path
)

[ ][CORE][26/01/25-16:48:27][INFO] [Non Statistical Method] Threshold:0.1 Precision:3
Reading user files...
The following user files were loaded successfully:
counts from AnnData object
tmp/cellphoneTmp/metadata.tsv
tmp/cellphoneTmp/microenv.tsv
[ ][CORE][26/01/25-16:48:30][INFO] Running Basic Analysis
[ ][CORE][26/01/25-16:48:30][INFO] Limiting cluster combinations using microenvironments
[ ][CORE][26/01/25-16:48:30][INFO] Building results
Saved means_result to tmp/cellphoneTmp/out\simple_analysis_means_result_01_26_2025_164830.txt
Saved deconvoluted to tmp/cellphoneTmp/out\simple_analysis_deconvoluted_01_26_2025_164830.txt
Saved deconvoluted_percents to tmp/cellphoneTmp/out\simple_analysis_deconvoluted_percents_01_26_2025_164830.txt


## Result

In [6]:
cpdb_results.keys()

dict_keys(['means_result', 'deconvoluted', 'deconvoluted_percents'])

In [7]:
cpdb_results['means_result'].head(2)

Unnamed: 0,id_cp_interaction,interacting_pair,partner_a,partner_b,gene_a,gene_b,secreted,receptor_a,receptor_b,annotation_strategy,...,NK_cell|T_cells,NK_cell|B_cell,NK_cell|Monocyte,NK_cell|NK_cell,NK_cell|Platelets,Platelets|T_cells,Platelets|B_cell,Platelets|Monocyte,Platelets|NK_cell,Platelets|Platelets
2868,CPI-SS01560CA22,CD99_PILRA,simple:P14209,simple:Q9UKJ1,CD99,PILRA,True,False,True,curated,...,0.0,0.0,0.89,0.0,0.0,0.0,0.0,1.111,0.0,0.0
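
The wide `means_result` table is easier to filter in long format. The sketch below reshapes a toy frame that copies a few columns and values from the output above. With real results, `means` would be `cpdb_results['means_result']`. Treating every column whose name contains `|` as a cell-type pair is an assumption based on the column names shown.

```python
import pandas as pd

# Toy frame mimicking a few columns of cpdb_results['means_result'].
means = pd.DataFrame(
    {
        "interacting_pair": ["CD99_PILRA"],
        "gene_a": ["CD99"],
        "gene_b": ["PILRA"],
        "NK_cell|Monocyte": [0.89],
        "Platelets|Monocyte": [1.111],
        "Platelets|NK_cell": [0.0],
    }
)

# Columns named "<cell_a>|<cell_b>" hold the mean for that cell-type pair.
pair_cols = [c for c in means.columns if "|" in c]

# One row per (interaction, cell-type pair).
long_df = means.melt(
    id_vars=["interacting_pair"],
    value_vars=pair_cols,
    var_name="cell_pair",
    value_name="mean",
)
long_df[["cell_a", "cell_b"]] = long_df["cell_pair"].str.split("|", expand=True)

# Keep interactions with a non-zero mean, strongest first.
top = (
    long_df[long_df["mean"] > 0]
    .sort_values("mean", ascending=False)
    .reset_index(drop=True)
)
print(top)
```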
