A meta-analysis of gene expression data from 7 vitiligo studies, combining bulk RNA-seq, pseudobulk (from scRNA-seq), and microarray platforms. This repository contains the analysis code, documentation of results, and the AI-assisted workflow used throughout the project.
R scripts for each individual study and three meta-analysis approaches:
| Directory | Study | Platform | Design |
|---|---|---|---|
| `analyses/Shiu_bulk/` | Shiu (GSE203262) | Pseudobulk | 6 paired L/NL |
| `analyses/Brunner_bulk/` | Brunner (GSE298871) | Pseudobulk | 5 paired L/NL |
| `analyses/Brunner_bulk_RNAseq/` | Brunner (GSE298871) | Bulk RNA-seq | 15 paired L/NL |
| `analyses/Xu_2021_bulk/` | Xu (OMIX691) | Pseudobulk | 10 vitiligo vs 3 healthy |
| `analyses/Natarajan_GSE75819/` | Natarajan (GSE75819) | Illumina microarray | 15 paired L/NL |
| `analyses/Rashighi_GSE53146/` | Rashighi (GSE53146) | Illumina microarray | 5 vitiligo vs 5 control |
| `analyses/Regazzetti_GSE65127/` | Regazzetti (GSE65127) | Affymetrix microarray | 10 paired L/NL |
| `analyses/Shiu_GSE203262/` | Shiu (GSE203262) | scRNA-seq (Seurat) | 6 paired L/NL |
| `analyses/integrated_scrnaseq/` | Shiu + Brunner + Xu | scRNA-seq integration | Reference-based |
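Several rows above are pseudobulk profiles derived from scRNA-seq. As a minimal sketch of that idea, assuming pseudobulk here means summing raw counts over all cells of a sample, the aggregation looks like the following. It is written in Python for illustration only; the repository's own scripts are in R and may aggregate differently (for example per cell type). The function name, cell IDs, sample IDs, and gene names are hypothetical.

```python
from collections import defaultdict


def pseudobulk(cell_counts, cell_to_sample):
    """Sum per-cell gene counts into one count vector per sample.

    cell_counts:    dict of cell_id -> {gene: count}
    cell_to_sample: dict of cell_id -> sample_id
    """
    totals = defaultdict(lambda: defaultdict(int))
    for cell_id, counts in cell_counts.items():
        sample = cell_to_sample[cell_id]
        for gene, n in counts.items():
            totals[sample][gene] += n
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {sample: dict(genes) for sample, genes in totals.items()}


# Toy input (made-up values, not project data): three cells from one patient,
# two from lesional skin (P1_L) and one from non-lesional skin (P1_NL).
cells = {
    "c1": {"GENE_A": 3, "GENE_B": 0},
    "c2": {"GENE_A": 1, "GENE_B": 2},
    "c3": {"GENE_A": 0, "GENE_B": 5},
}
samples = {"c1": "P1_L", "c2": "P1_L", "c3": "P1_NL"}
bulk = pseudobulk(cells, samples)
```

The resulting per-sample count table can then go through the same differential expression steps as bulk RNA-seq counts.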
Meta-analysis scripts in `meta_analysis/`:
- `bulk/` -- Random-effects meta-analysis (metafor, 5 paired studies)
- `bulk2/` -- Robust Rank Aggregation (7 studies, pi-value ranking)
- `bulk3/` -- Extended RRA with leave-one-out robustness validation (7 studies)
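To make the random-effects approach concrete, here is a minimal per-gene sketch using the closed-form DerSimonian-Laird estimator. This is an assumption made for brevity: the `bulk/` scripts use metafor in R, whose `rma()` defaults to REML, so numbers from this sketch will not match the repository's output exactly. The function name and input values are hypothetical.

```python
import math


def random_effects(effects, variances):
    """Pool per-study effects (e.g. log2 fold changes) for one gene.

    effects:   list of per-study effect estimates
    variances: list of their sampling variances (squared standard errors)
    """
    k = len(effects)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    # I^2: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0
    return {"estimate": pooled, "se": se, "tau2": tau2, "i2": i2}


# Made-up log2 fold changes and variances for one gene in five studies.
result = random_effects([-1.8, -2.2, -1.5, -2.0, -1.1],
                        [0.10, 0.20, 0.15, 0.12, 0.30])
```

A gene with a large `tau2` or `i2` has effects that differ between studies by more than sampling error alone would explain.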
Shared utilities in `scripts/utils/`: gene collapsing, enrichment plots, tree plots, count filtering, pathway heatmaps, cell type heatmaps, rank aggregation helpers.
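The rank aggregation steps can be sketched as follows. This assumes the pi-value is the usual product of log2 fold change and -log10 p-value, and that the rho score follows the published Robust Rank Aggregation definition (minimum over order statistics of a binomial tail probability, Bonferroni-corrected by the number of studies). It is a Python illustration, not the repository's R helpers; all function names are hypothetical.

```python
import math


def pi_value(log2fc, pvalue):
    """Signed significance score: log2 fold change times -log10(p)."""
    return log2fc * -math.log10(pvalue)


def normalized_ranks(scores):
    """Rank genes by descending score and scale ranks to (0, 1].

    scores: dict of gene -> score. The top gene gets rank 1/n.
    To rank downregulated genes first, pass negated scores.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    n = len(ordered)
    return {gene: (i + 1) / n for i, gene in enumerate(ordered)}


def rho_score(ranks):
    """RRA rho score for one gene from its normalized ranks across studies."""
    r = sorted(ranks)
    n = len(r)
    best = 1.0
    for k in range(1, n + 1):
        x = r[k - 1]
        # Probability that at least k of n uniform ranks are <= x.
        tail = sum(math.comb(n, j) * x ** j * (1 - x) ** (n - j)
                   for j in range(k, n + 1))
        best = min(best, tail)
    # Bonferroni correction over the n order statistics, capped at 1.
    return min(1.0, best * n)


# Made-up example: one study's pi-values, then one gene's ranks in 3 studies.
study = {"GENE_A": pi_value(2.0, 0.001), "GENE_B": pi_value(0.5, 0.2)}
ranks_in_study = normalized_ranks(study)
gene_rho = rho_score([0.01, 0.02, 0.40])
```

A small rho score means the gene ranks near the top in more studies than expected by chance, even if one or two studies disagree.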
Analysis results (figures, tables, enrichment outputs) are documented in markdown throughout the repository. Key locations:
- `manuscript/` -- Manuscript workspace (JID submission)
  - `03-figures/` -- Figure legends (main + supplementary)
  - `04-tables/` -- Table descriptions
  - `05-manuscript/` -- Full manuscript sections (Abstract, Introduction, Methods, Results, Discussion)
  - `06-references/` -- Reference list and paper summaries
- `docs/` -- Analysis protocols, data dictionary, study comparison
- `analyses/*/README.md` or `CLAUDE.md` -- Per-study documentation
Note: Data files (CSV, RDS, PDF, PNG, XLSX, HDF5) are gitignored and not included in this repository. See the Data Access section below.
The entire repository is structured as an Obsidian vault. Open the root folder in Obsidian to browse all documentation, manuscript sections, figure legends, and analysis notes as interlinked markdown files. The `.obsidian/` configuration is included.
This project was developed using Claude Code as an AI-assisted analysis environment. The `.claude/` directory contains:
- Skills (`.claude/skills/`): Reusable analysis templates
  - `microarray-analysis` -- Affymetrix/Illumina microarray DE pipeline
  - `rnaseq-analysis` -- DESeq2 bulk RNA-seq pipeline
  - `literature-search` -- PubMed search
  - `word-document` -- Manuscript export to .docx
  - `codex-cli` -- OpenAI Codex integration
- Slash commands (`.claude/commands/`): `log-results`, `study-summary`, `codex`, `code-review`, `codex-review`
- Project instructions (`CLAUDE.md`): Context file that Claude Code reads automatically
Daily Claude Code session prompts are archived in `prompts/`, organized by date (2026-01-14 onward). These document the iterative AI-assisted analysis and writing process.
A reproducible R environment is provided via Dev Container:
- Docker image: gexijin/vitiligo on Docker Hub
- Base: R 4.5.2 / Bioconductor 3.22 (rocker/verse)
- Configuration: `.devcontainer/Dockerfile` and `.devcontainer/devcontainer.json`
- Includes: Seurat, DESeq2, limma, clusterProfiler, fgsea, ComplexHeatmap, and all project dependencies
To use with VS Code Dev Containers or GitHub Codespaces, open the repository and select "Reopen in Container."
Raw data are not included in this repository. To reproduce analyses:
- GEO datasets: Download using accession numbers listed in `geo_datasets.csv`
- ARCHS4 data: Available at https://maayanlab.cloud/archs4/
- Xu 2021 data: Available from OMIX (accession OMIX691)
- See `data/README.md` for detailed download instructions
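As a starting point for the GEO step, the sketch below collects GSE accessions from CSV text and builds the corresponding GEO accession page URLs. The column layout of `geo_datasets.csv` is not documented here, so the code scans every cell rather than assuming a column name; the inline example rows are a hypothetical layout that reuses two accessions from the study table. It is a Python illustration, and `data/README.md` remains the authoritative download guide.

```python
import csv
import io
import re

# GEO series accessions look like "GSE" followed by digits.
GSE_PATTERN = re.compile(r"\bGSE\d+\b")


def geo_accessions(csv_text):
    """Return unique GSE accessions found anywhere in the CSV, in file order."""
    found = []
    for row in csv.reader(io.StringIO(csv_text)):
        for cell in row:
            for accession in GSE_PATTERN.findall(cell):
                if accession not in found:
                    found.append(accession)
    return found


def geo_url(accession):
    """Build the GEO accession page URL for one series."""
    return f"https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc={accession}"


# Hypothetical file contents; the real geo_datasets.csv may be laid out differently.
example = "study,accession\nShiu,GSE203262\nNatarajan,GSE75819\n"
urls = [geo_url(acc) for acc in geo_accessions(example)]
```

To run it against the real file, read it first, for example `geo_accessions(open("geo_datasets.csv").read())`.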
Three complementary meta-analyses across 7 studies (115 samples) identified:
- Melanocyte marker loss: MLANA, PMEL, DCT, KIT, TYRP1 consistently downregulated across all studies
- IFN-stimulated gene activation: EPSTI1, OAS2, IFI27, XAF1, IFIT1 upregulated
- T cell infiltration signatures: CD3D, CD2 upregulated in lesional skin
- Heterogeneity pattern: Downregulated melanocyte genes show high cross-study concordance; upregulated immune genes show more study-specific patterns
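One simple way to picture the concordance contrast described above is the fraction of studies whose fold change points in the majority direction, recomputed with each study left out in turn. This is an illustrative measure written in Python, not necessarily the statistic the meta-analysis scripts report, and the input values are made up rather than taken from the results.

```python
def direction_concordance(log2fcs):
    """Fraction of studies whose log2FC sign matches the majority sign."""
    if not log2fcs:
        return 0.0
    up = sum(1 for x in log2fcs if x > 0)
    down = sum(1 for x in log2fcs if x < 0)
    # Studies with exactly zero change count toward neither direction.
    return max(up, down) / len(log2fcs)


def leave_one_out(log2fcs):
    """Concordance recomputed after dropping each study in turn."""
    return [direction_concordance(log2fcs[:i] + log2fcs[i + 1:])
            for i in range(len(log2fcs))]


# Made-up per-study log2 fold changes for two hypothetical genes.
consistent_gene = [-2.1, -1.8, -2.5, -1.2]
mixed_gene = [1.5, -0.2, 0.9, -0.4]

consistent_score = direction_concordance(consistent_gene)
mixed_score = direction_concordance(mixed_gene)
mixed_loo = leave_one_out(mixed_gene)
```

If the leave-one-out values swing widely, a single study is driving the apparent direction of change.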
This project analyzes publicly available data. Please cite original studies when using their data.