Materials and scripts for building cell type encyclopedia table
Make sure that the latest models are uploaded to s3://celltypist/models/*/. Put all shareable models (a subset of the S3 models) in a local folder, and run the following:
python src/generate_json_from_latest_models.py /path/to/local_model_folder
The new JSON file is written to json/models.json. Upload it to s3://celltypist/models/.
Cell types with <10 cells from a tissue-dataset combination are removed. Make sure the latest models are in ~/.celltypist/data/models/, then run the following:
python src/generate_encyclopedia_table.py
The resulting table is written to encyclopedia/encyclopedia_table.xlsx.
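The <10 cells rule can be illustrated with a minimal sketch. It is not taken from src/generate_encyclopedia_table.py: the function name, the tuple layout, and the reading of the rule as a count per (cell type, tissue, dataset) group are all assumptions, and the labels are made up.

```python
from collections import Counter

MIN_CELLS = 10  # threshold from the text: groups with <10 cells are dropped


def filter_rare_combinations(rows, min_cells=MIN_CELLS):
    """Keep cells whose (cell type, tissue, dataset) group has >= min_cells cells.

    rows: iterable of (cell_type, tissue, dataset) tuples, one per cell.
    This is a hypothetical helper, not the repository's implementation.
    """
    rows = list(rows)
    counts = Counter(rows)  # number of cells per combination
    return [row for row in rows if counts[row] >= min_cells]


# Toy input with made-up labels: 12 cells in one combination, 3 in another.
cells = [("B cells", "Blood", "DatasetA")] * 12 + [("B cells", "Lung", "DatasetA")] * 3
kept = filter_rare_combinations(cells)
print(len(kept), sorted(set(kept)))
```

In this toy input only the 12-cell combination survives; the 3-cell combination is removed.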
The AnnData should be log-normalised first (normalised to 1e4 counts per cell). Tissue and cell type orders are defined in src/generate_Heatmap_data.py and Heatmap_data/celltype_order.txt, respectively. Currently, tissue and cell type information is stored in the Tissue and re_harmonise_annotation columns of adata.obs.
Cell types with <=10 cells in a tissue-celltype combination are treated as absent (black grid cells in the heat map).
python src/generate_Heatmap_data.py /path/to/adata
The heat map data is written to Heatmap_data/exp_pct_celltypist_immune.pkl. Upload it to s3://celltypist/Heatmap_data/.
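For reference, the log-normalisation expected of the input AnnData can be sketched in plain Python. This assumes that "1e4" means scaling each cell to a total of 1e4 counts before taking log(1 + x); with scanpy this is commonly done with sc.pp.normalize_total(adata, target_sum=1e4) followed by sc.pp.log1p(adata). The function name and the toy matrix below are illustrative only.

```python
import math


def log_normalise(counts, target_sum=1e4):
    """Scale each cell (row) to target_sum total counts, then apply log(1 + x).

    counts: list of rows, one row of raw gene counts per cell.
    Hypothetical helper for illustration; not part of the repository.
    """
    normalised = []
    for cell in counts:
        total = sum(cell)
        # Cells with no counts are left as all zeros.
        scale = target_sum / total if total > 0 else 0.0
        normalised.append([math.log1p(value * scale) for value in cell])
    return normalised


# Toy raw counts: 2 cells x 3 genes (made-up numbers).
raw = [[1, 3, 0], [10, 0, 10]]
lognorm = log_normalise(raw)
print(lognorm)
```

Undoing the log with math.expm1 and summing each row returns the target sum of 1e4, which is a quick way to check that a matrix has been normalised this way.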
Images are in images/*.png, with a white background and a size of 842 x 736 pixels. The correspondence between cell type names and images is in images/celltype_to_image.csv.
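A new image can be checked against the stated size with a small sketch that reads the PNG header. It assumes that 842 x 736 means width x height; the function name is hypothetical, and the background colour is not checked.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
EXPECTED_SIZE = (842, 736)  # assumed to be (width, height) in pixels


def png_size(header):
    """Return (width, height) read from the first 24 bytes of a PNG file."""
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    # Width and height are big-endian 32-bit integers at the start of IHDR.
    return struct.unpack(">II", header[16:24])


# Synthetic header for demonstration only; for a real file, pass
# open(path, "rb").read(24) instead.
demo_header = PNG_SIGNATURE + struct.pack(">I4sII", 13, b"IHDR", 842, 736)
print(png_size(demo_header) == EXPECTED_SIZE)
```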
tables/Basic_celltype_information.xlsx: free text of basic cell type information.
tables/celltypist_immune_meta.csv: cell meta-information for deriving the tissue and dataset information, exported with adata.obs[['re_harmonise_annotation', 'Tissue', 'Dataset']].to_csv('celltypist_immune_meta.csv', header=True, index=False).
tables/dataset_to_PMID.csv: link/paper for each dataset.