White adipose tissue (WAT) single-cell atlases in human and mouse 🧈

Goal of the project

This project aims to evaluate the robustness and reliability of cell type annotations in human and mouse white adipose tissue (WAT) single-cell RNA sequencing atlases. Using the comprehensive dataset generated by Emont et al., we will assess how stable and reproducible annotated cell types are within and across species. To achieve this, we will apply scTypeEval, a framework designed to evaluate the quality of cell type annotations by quantifying inter-sample consistency and the robustness of assigned labels. This approach provides an objective assessment of how consistently annotated cell types are represented across individuals within each species and how well these annotations generalize between human and mouse.

Questions

How consistent are annotated cell types across individuals within each species?
To what extent do annotated cell types generalize across species?
How does annotation resolution (broad vs fine-grained) affect consistency?

Context

Alterations in adiposity are associated with dyslipidemia, insulin resistance, and type 2 diabetes. Understanding how white adipose tissue (WAT) changes, and determining whether mouse models accurately reflect human biology, can help identify specific cell populations or pathways that drive disease progression. scRNA-seq atlases, such as the dataset generated by Emont et al., provide unprecedented resolution to characterize adipose tissue heterogeneity across species and physiological states. However, the reliability of biological conclusions drawn from these atlases critically depends on the robustness and consistency of cell type annotations. Variability in annotation strategies, marker selection, and resolution can limit reproducibility and hinder meaningful cross-species comparisons. Therefore, a systematic evaluation of annotation quality is necessary to ensure that observed similarities or differences between human and mouse WAT reflect biological reality rather than methodological inconsistencies.

scTypeEval

scTypeEval is a framework designed to evaluate the quality of cell type annotations in single-cell RNA sequencing (scRNA-seq) data. Since true reference labels are often unavailable, it uses internal validation metrics to measure how consistent cell type labels are across samples. The tool processes scRNA-seq data, identifies relevant genes (like highly variable genes or marker genes), computes dissimilarities between cell types, and calculates consistency metrics to detect misclassified or ambiguous cell populations. Overall, scTypeEval helps benchmark and compare manual annotations, automated classifiers, and clustering results without requiring ground-truth labels.
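
As a rough illustration of what "consistency across samples" can mean, the sketch below averages cells into one profile per sample and cell type, then scores each cell type with a silhouette width computed on those profiles. It is a toy in plain Python, not scTypeEval itself: the function names, the Euclidean distance, and the choice of silhouette as the metric are all assumptions made for illustration, and the toy expression values are invented.

```python
from math import dist
from statistics import mean


def pseudobulk(cells):
    """Average expression per (sample, cell_type) group.

    cells: iterable of (sample, cell_type, expression_vector) tuples.
    """
    groups = {}
    for sample, cell_type, expression in cells:
        groups.setdefault((sample, cell_type), []).append(expression)
    return {key: [mean(gene) for gene in zip(*vectors)]
            for key, vectors in groups.items()}


def silhouette_by_type(profiles):
    """Mean silhouette width per cell type over pseudobulk profiles.

    Assumption: Euclidean distance; any dissimilarity could be swapped in.
    """
    scores = {}
    for key, profile in profiles.items():
        own_type = key[1]
        distances = {}  # cell type -> distances from this profile
        for other_key, other_profile in profiles.items():
            if other_key != key:
                distances.setdefault(other_key[1], []).append(
                    dist(profile, other_profile))
        others = [mean(d) for t, d in distances.items() if t != own_type]
        if own_type not in distances or not others:
            continue  # silhouette is undefined for this profile
        a = mean(distances[own_type])  # same label, other samples
        b = min(others)                # nearest different label
        width = 0.0 if max(a, b) == 0 else (b - a) / max(a, b)
        scores.setdefault(own_type, []).append(width)
    return {cell_type: mean(widths) for cell_type, widths in scores.items()}


# Invented toy data: two samples, two cell types, two genes.
toy_cells = [
    ("s1", "adipocyte", [10, 0]), ("s1", "adipocyte", [8, 2]),
    ("s2", "adipocyte", [9, 1]),
    ("s1", "macrophage", [0, 10]),
    ("s2", "macrophage", [1, 9]),
]
consistency = silhouette_by_type(pseudobulk(toy_cells))
print(consistency)
```

A value close to 1 means that profiles carrying the same label in different samples resemble each other more than they resemble any other cell type; values near 0 or below flag labels that may be ambiguous or misassigned.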

Methods

Dataset description

The human lite objects dataset is available at: https://uchicago.box.com/s/bmhkw0j2qkkgnpmpz33bw583pppoib0y and the mouse lite objects dataset at: https://uchicago.box.com/s/p7r6cdbcbwqxh8lxm7frqlcx88zb0mjp. The human data were generated using single-nucleus sequencing (sNuc-seq), which enables the capture of adipocytes, on subcutaneous adipose tissue (SAT) and visceral adipose tissue (VAT). In addition, whole-cell Drop-seq (scRNA-seq) was performed on subcutaneous WAT. For this approach, single cells were isolated from the non-adipocyte stromal-vascular fraction (SVF) using collagenase digestion. However, this method cannot capture mature adipocytes because they are too fragile to withstand the procedure. For the mouse data, mice were fed either a chow diet or a high-fat diet (HFD) for 13 weeks. sNuc-seq was then performed on inguinal adipose tissue (ING, corresponding to human SAT) and perigonadal adipose tissue (PG; epididymal (EPI) in males and periovarian (POV) in females, corresponding to human VAT).

The human metadata contains 166 149 observations and the mouse metadata contains 197 721 observations. These numbers are computed by the script data/WAT_BroadSingleCellPortal_SCP1376/Curate_metadata.Rmd, which also creates a summary .csv file (dataset_metadata_summary.csv).

In the human metadata:

Technology (the two techniques used to produce the dataset):
- Chromium-v3 (sNuc-seq) = 137 684 nuclei
- Drop-Seq = 28 465 whole cells
Number of replicates (independent repetitions of the experiment, used to ensure reliability and measure variability):
- 32 samples
- 22 individuals
Number of genes detected per cell (nFeature_RNA variable):
- Min.: 249 genes/cell
- Median: 1524 genes/cell
- Mean: 1753 genes/cell
- Max.: 14 442 genes/cell
Number of cells per replicate:
- Mean: 5192.156 cells/replicate
- Median: 5090 cells/replicate
Mitochondrial transcript percentage (mt.percent variable; low-quality or dying cells often show a high mitochondrial fraction):
- Min.: 0.00 %
- Median: 1.34 %
- Mean: 2.00 %
- Max.: 10.00 %
Granularity (number of cell types described):
- 45 cell types

In the mouse metadata:

Technology (the single technique used to produce the dataset):
- Chromium-v3 (sNuc-seq) = 197 721 nuclei
Number of replicates:
- 24 samples
- 14 individuals (animal variable)
Number of genes detected per cell (nFeature_RNA variable):
- Min.: 21 genes/cell
- Median: 1369 genes/cell
- Mean: 1614 genes/cell
- Max.: 11 061 genes/cell
Number of cells per replicate:
- Mean: 8238.375 cells/replicate
- Median: 7766 cells/replicate
Mitochondrial transcript percentage (mt.percent variable; low-quality or dying cells often show a high mitochondrial fraction):
- Min.: 0.00 %
- Median: 0.00 %
- Mean: 0.18 %
- Max.: 9.95 %
Granularity (number of cell types described):
- 48 cell types
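
The figures listed for both species can be cross-checked with simple arithmetic: the per-technology counts should add up to the total number of observations, and the mean number of cells per replicate should equal the total divided by the number of samples. The short Python check below uses only numbers quoted in this section; the dictionary layout is an illustrative assumption and is not the structure of dataset_metadata_summary.csv.

```python
# Values copied from the dataset description; the layout is hypothetical.
summary = {
    "human": {
        "cells": 166_149,
        "samples": 32,
        "by_technology": {"Chromium-v3": 137_684, "Drop-Seq": 28_465},
    },
    "mouse": {
        "cells": 197_721,
        "samples": 24,
        "by_technology": {"Chromium-v3": 197_721},
    },
}

mean_cells_per_replicate = {}
technology_totals_match = {}
for species, stats in summary.items():
    # Per-technology counts should sum to the total number of observations.
    technology_totals_match[species] = (
        sum(stats["by_technology"].values()) == stats["cells"])
    # Mean cells per replicate = total observations / number of samples.
    mean_cells_per_replicate[species] = stats["cells"] / stats["samples"]

for species, value in mean_cells_per_replicate.items():
    print(f"{species}: {value:.3f} cells/replicate, "
          f"technology totals match: {technology_totals_match[species]}")
```

Both means agree with the reported values (5192.156 and 8238.375), which confirms that "replicate" in these lists refers to samples rather than individuals.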

Preprocessing strategy

The preprocessing strategy consisted of standardizing and quality-controlling the single-cell RNA-seq dataset to ensure comparability with other datasets in the atlas. First, the raw Seurat object was loaded and cell metadata were cleaned and standardized to include consistent variables such as sample, individual, tissue, and technology. Cell identifiers were reformatted to follow a unified structure combining study information and the original barcode, ensuring unique cell IDs across datasets. Gene symbols were then harmonized using STACAS to map them to a reference gene annotation (Ensembl GRCh38), allowing consistent gene naming across studies. The data were normalized using the NormalizeData function from Seurat, which applies log-normalization to correct for differences in sequencing depth between cells. Quality control metrics were evaluated, including the number of detected genes, total UMI counts, mitochondrial and ribosomal transcript percentages, and sequencing complexity. Cells not meeting predefined thresholds were removed to exclude low-quality or stressed cells. Finally, samples with very few cells were discarded and large samples were downsampled to balance representation across individuals.
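
The steps above can be sketched in a few lines. The project's own pipeline runs in R with Seurat, so the plain-Python version below is only a language-neutral illustration. The log-normalization follows Seurat's documented default for NormalizeData (counts divided by the cell total, multiplied by a scale factor of 10 000, then log1p). Everything else is an assumption made for the example: the function names, the QC thresholds, the `study_barcode` ID format, and the sample size limits are hypothetical and are not the values used in this repository.

```python
import random
from math import log1p


def log_normalize(counts, scale_factor=10_000):
    """Log-normalize one cell: log1p(count / total * scale_factor)."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [log1p(c / total * scale_factor) for c in counts]


def passes_qc(cell, min_genes=200, max_genes=6000, max_mt_percent=10.0):
    """Keep a cell only if it is inside the QC thresholds (hypothetical values)."""
    return (min_genes <= cell["n_genes"] <= max_genes
            and cell["mt_percent"] <= max_mt_percent)


def unique_cell_id(study, barcode):
    """Combine study and barcode into one ID (the format is an assumption)."""
    return f"{study}_{barcode}"


def balance_samples(cells_by_sample, min_cells, max_cells, seed=0):
    """Drop samples with too few cells and downsample the large ones."""
    rng = random.Random(seed)  # fixed seed for reproducible downsampling
    balanced = {}
    for sample, cells in cells_by_sample.items():
        if len(cells) < min_cells:
            continue  # too few cells: discard the whole sample
        if len(cells) > max_cells:
            balanced[sample] = rng.sample(cells, max_cells)
        else:
            balanced[sample] = list(cells)
    return balanced


# Invented toy inputs.
normalized = log_normalize([1, 0, 3])
qc_flags = [
    passes_qc({"n_genes": 1500, "mt_percent": 1.3}),
    passes_qc({"n_genes": 21, "mt_percent": 0.0}),
    passes_qc({"n_genes": 1500, "mt_percent": 12.0}),
]
cell_id = unique_cell_id("StudyA", "AAACCTGAGAAACCAT-1")
balanced = balance_samples(
    {"s1": list(range(5)), "s2": list(range(50)), "s3": list(range(20))},
    min_cells=10, max_cells=30)
print(normalized, qc_flags, cell_id, {k: len(v) for k, v in balanced.items()})
```

Prefixing each barcode with study information matters because the same barcode sequence can occur in several datasets, and a fixed random seed keeps the downsampled subset identical between runs.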
