gexijin/vitiligo

Vitiligo Meta-Analysis Project

A meta-analysis of gene expression data from 7 vitiligo studies, combining bulk RNA-seq, pseudobulk (from scRNA-seq), and microarray platforms. This repository contains the analysis code, documentation of results, and the AI-assisted workflow used throughout the project.

What's in This Repository

Analysis Code

R scripts for each individual study and three meta-analysis approaches:

| Directory | Study | Platform | Design |
| --- | --- | --- | --- |
| analyses/Shiu_bulk/ | Shiu (GSE203262) | Pseudobulk | 6 paired L/NL |
| analyses/Brunner_bulk/ | Brunner (GSE298871) | Pseudobulk | 5 paired L/NL |
| analyses/Brunner_bulk_RNAseq/ | Brunner (GSE298871) | Bulk RNA-seq | 15 paired L/NL |
| analyses/Xu_2021_bulk/ | Xu (OMIX691) | Pseudobulk | 10 vitiligo vs 3 healthy |
| analyses/Natarajan_GSE75819/ | Natarajan (GSE75819) | Illumina microarray | 15 paired L/NL |
| analyses/Rashighi_GSE53146/ | Rashighi (GSE53146) | Illumina microarray | 5 vitiligo vs 5 control |
| analyses/Regazzetti_GSE65127/ | Regazzetti (GSE65127) | Affymetrix microarray | 10 paired L/NL |
| analyses/Shiu_GSE203262/ | Shiu (GSE203262) | scRNA-seq (Seurat) | 6 paired L/NL |
| analyses/integrated_scrnaseq/ | Shiu + Brunner + Xu | scRNA-seq integration | Reference-based |

L/NL = lesional vs non-lesional skin from the same patient.

Meta-analysis scripts in meta_analysis/:

  • bulk/ -- Random-effects meta-analysis (metafor, 5 paired studies)
  • bulk2/ -- Robust Rank Aggregation (7 studies, pi-value ranking)
  • bulk3/ -- Extended RRA with leave-one-out robustness validation (7 studies)
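
As a rough illustration of what a random-effects pool computes for a single gene, the sketch below implements the DerSimonian-Laird estimator in plain Python. It is not the repository's code: the scripts in meta_analysis/bulk/ are written in R and use metafor, whose default estimator is REML rather than DerSimonian-Laird. The function names and the input values are hypothetical. The leave_one_out helper re-pools after dropping each study in turn; the repository applies leave-one-out validation to RRA in bulk3/, so this only mirrors the idea.

```python
import math


def random_effects_dl(effects, variances):
    """Pool per-study effects (e.g. log2 fold changes) for one gene
    using the DerSimonian-Laird random-effects estimator."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) pool, used to compute Cochran's Q.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))

    # Between-study variance (tau^2), truncated at zero.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))

    # I^2: percentage of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    return {"pooled": pooled, "se": se, "tau2": tau2, "i2": i2, "q": q}


def leave_one_out(effects, variances):
    """Re-pool after dropping each study in turn (needs >= 3 studies)."""
    return [
        random_effects_dl(effects[:i] + effects[i + 1:],
                          variances[:i] + variances[i + 1:])["pooled"]
        for i in range(len(effects))
    ]


# Made-up log2 fold changes and standard errors for one gene in five
# paired studies; these are NOT results from this project.
log2fc = [-1.8, -2.1, -1.5, -2.4, -1.2]
se = [0.40, 0.50, 0.30, 0.60, 0.35]
result = random_effects_dl(log2fc, [s ** 2 for s in se])
print(result)
print(leave_one_out(log2fc, [s ** 2 for s in se]))
```

If the leave-one-out estimates stay close to the full pooled estimate, no single study is driving the result for that gene.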

Shared utilities in scripts/utils/: gene collapsing, enrichment plots, tree plots, count filtering, pathway heatmaps, cell type heatmaps, rank aggregation helpers.

Documentation and Results

Analysis results (figures, tables, enrichment outputs) are documented in markdown throughout the repository. Key locations:

  • manuscript/ -- Manuscript workspace (JID submission)
    • 03-figures/ -- Figure legends (main + supplementary)
    • 04-tables/ -- Table descriptions
    • 05-manuscript/ -- Full manuscript sections (Abstract, Introduction, Methods, Results, Discussion)
    • 06-references/ -- Reference list and paper summaries
  • docs/ -- Analysis protocols, data dictionary, study comparison
  • analyses/*/README.md or CLAUDE.md -- Per-study documentation

Note: Data files (CSV, RDS, PDF, PNG, XLSX, HDF5) are gitignored and not included in this repository. See the Data Access section below.

Obsidian Vault

The entire repository is structured as an Obsidian vault. Open the root folder in Obsidian to browse all documentation, manuscript sections, figure legends, and analysis notes as interlinked markdown files. The .obsidian/ configuration is included.

Claude Code Setup

This project was developed using Claude Code as an AI-assisted analysis environment. The .claude/ directory contains:

  • Skills (.claude/skills/): Reusable analysis templates
    • microarray-analysis -- Affymetrix/Illumina microarray DE pipeline
    • rnaseq-analysis -- DESeq2 bulk RNA-seq pipeline
    • literature-search -- PubMed search
    • word-document -- Manuscript export to .docx
    • codex-cli -- OpenAI Codex integration
  • Slash commands (.claude/commands/): log-results, study-summary, codex, code-review, codex-review
  • Project instructions (CLAUDE.md): Context file that Claude Code reads automatically

AI Prompts Archive

Daily Claude Code session prompts are archived in prompts/, organized by date (2026-01-14 onward). These document the iterative AI-assisted analysis and writing process.

Computing Environment

A reproducible R environment is provided via Dev Container:

  • Docker image: gexijin/vitiligo on Docker Hub
  • Base: R 4.5.2 / Bioconductor 3.22 (rocker/verse)
  • Configuration: .devcontainer/Dockerfile and .devcontainer/devcontainer.json
  • Includes: Seurat, DESeq2, limma, clusterProfiler, fgsea, ComplexHeatmap, and all project dependencies

To use with VS Code Dev Containers or GitHub Codespaces, open the repository and select "Reopen in Container."

Data Access

Raw data are not included in this repository. To reproduce analyses:

  1. GEO datasets: Download using accession numbers listed in geo_datasets.csv
  2. ARCHS4 data: Available at https://maayanlab.cloud/archs4/
  3. Xu 2021 data: Available from OMIX (accession OMIX691)
  4. See data/README.md for detailed download instructions
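
As a starting point for step 1, the sketch below turns a list of accessions into GEO record URLs. It assumes, hypothetically, that geo_datasets.csv has a column named accession; the real column names may differ, and data/README.md remains the authoritative guide. The helper name geo_urls is also hypothetical. Non-GEO accessions such as OMIX691 are skipped because they are hosted elsewhere.

```python
import csv
import io

# Standard GEO accession viewer URL pattern.
GEO_QUERY_URL = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc={}"


def geo_urls(csv_text, column="accession"):
    """Map each GSE accession found in `column` to its GEO record URL.

    `column` is an assumed header name, not confirmed by the repository.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    urls = {}
    for row in reader:
        acc = (row.get(column) or "").strip()
        if acc.upper().startswith("GSE"):
            urls[acc] = GEO_QUERY_URL.format(acc)
    return urls


# Accessions taken from the study table above; the CSV layout is assumed.
sample_csv = "\n".join([
    "accession",
    "GSE203262",
    "GSE298871",
    "GSE75819",
    "GSE53146",
    "GSE65127",
    "OMIX691",
])

for accession, url in geo_urls(sample_csv).items():
    print(accession, url)
```

To run this against the real file, read it with open("geo_datasets.csv").read() and pass the actual column name.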

Key Findings

Three complementary meta-analyses across 7 studies (115 samples) identified:

  • Melanocyte marker loss: MLANA, PMEL, DCT, KIT, TYRP1 consistently downregulated across all studies
  • IFN-stimulated gene activation: EPSTI1, OAS2, IFI27, XAF1, IFIT1 upregulated
  • T cell infiltration signatures: CD3D, CD2 upregulated in lesional skin
  • Heterogeneity pattern: Downregulated melanocyte genes show high cross-study concordance; upregulated immune genes show more study-specific patterns

Contact

License

This project analyzes publicly available data. Please cite original studies when using their data.
