This repository stores code for comprehensive single-cell multiomic processing of mouse and human Dorsal Root Ganglion (DRG) data, DRG-based Gene Regulatory Network (GRN) inference via SCENIC+, and training of Predicting Accessibility In Nociceptors-net (PAIN-net).
PAIN-net is a 1D Residual Neural Network (ResNet) designed to model the relationship between genomic sequence and cell-type-specific activity.
- Multi-Task Learning: The model predicts both Differential Expression (DE) $\log_2$ Fold Changes (primary task) and auxiliary ATAC-seq signal intensity (auxiliary task).
- Architecture: Features a stem Conv1D layer followed by 5 dilated residual blocks to capture long-range DNA interactions across 800bp input sequences.
- Weighted Loss Strategy: Implements inverse-frequency weighting plus explicit boosting (e.g., for C-PEP, Mrgprd, and Calca+Sstr2) to handle class imbalances.
- Custom Loss: Combines Mean Squared Error (MSE) and negative Pearson correlation to optimize for both magnitude and trend.
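The weighting and loss ideas above can be illustrated with a small, framework-free sketch. This is not the PAIN-net training code: the function names, the `alpha`/`beta` mixing coefficients, the normalization of the weights, and the boost factor are all assumptions chosen for illustration, and how the class weights enter the actual loss is not specified here.

```python
import math


def inverse_frequency_weights(labels, boost=None):
    """Weight each class by n / (k * count), then apply optional boosts.

    `boost` maps a class name to a multiplicative factor (hypothetical values).
    """
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    n, k = len(labels), len(counts)
    weights = {lab: n / (k * c) for lab, c in counts.items()}
    for lab, factor in (boost or {}).items():
        if lab in weights:
            weights[lab] *= factor
    return weights


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else 0.0


def combined_loss(pred, target, alpha=1.0, beta=1.0):
    """alpha * MSE - beta * Pearson: lower is better on both terms."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    return alpha * mse - beta * pearson(pred, target)


if __name__ == "__main__":
    labels = ["Other"] * 8 + ["Mrgprd"] * 2
    print(inverse_frequency_weights(labels, boost={"Mrgprd": 2.0}))
    print(combined_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```

The MSE term penalizes errors in magnitude, while the correlation term rewards predictions that follow the same trend as the targets even when their scale is off.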
This directory contains Level 1 processing pipelines for reconstructing and integrating single-cell data.
Processes 18 multiome libraries focusing on the RNA modality.
- Workflow: Performs Seurat v5 integration using CCA (Canonical Correlation Analysis) to correct batch effects.
Reprocesses raw fragment files for 32 human DRG libraries to generate a new Signac object.
- Workflow: Performs Seurat v5 integration using CCA (Canonical Correlation Analysis) to correct batch effects.
A stepwise workflow to generate region sets and topic models from Signac/Seurat objects for GRN inference.
| Step | Script | Purpose |
|---|---|---|
| 1 | Step1_writepycisTopicInputs.R | Exports peak matrix, cells, and metadata from Signac for pycisTopic compatibility. |
| 2 | Step2_writeSignacDAPsBEDs.R | Exports cluster-specific Differential Accessibility Peaks (DARs) as BED files. |
| 3 | Step3_create_custom_cisTarget_database.sh | Builds a custom cisTarget motif database restricted to consensus peaks. |
| 4 | Step4_Run_pycisTopic.py | Runs MALLET topic modeling, binarizes topics (Otsu/Top-3k), and exports BEDs. |
| 5 | Step5_RunSCENICPlus.sh | Initializes and runs the SCENIC+ Snakemake pipeline. |
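Step 4 mentions two ways of binarizing topics (Otsu and Top-3k). The sketch below illustrates both ideas in plain Python on toy region scores. It is not the code used in `Step4_Run_pycisTopic.py` and does not call pycisTopic; the function names, the histogram bin count, and the example region names are assumptions.

```python
def otsu_threshold(values, bins=64):
    """Return the cut that maximizes between-class variance of a histogram."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    total_sum = sum(c * m for c, m in zip(counts, centers))

    best_thr, best_var = lo, -1.0
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += counts[i]
        sum0 += counts[i] * centers[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0, mu1 = sum0 / w0, (total_sum - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_var, best_thr = between, lo + (i + 1) * width
    return best_thr


def top_n_regions(scores, n):
    """Keep the n highest-scoring regions (n = 3000 for a Top-3k cut)."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


if __name__ == "__main__":
    # Hypothetical region-topic scores for a single topic.
    scores = {
        "chr1:100-600": 0.01, "chr1:900-1400": 0.02, "chr2:50-550": 0.03,
        "chr2:700-1200": 0.90, "chr3:10-510": 0.95, "chr3:800-1300": 0.85,
    }
    thr = otsu_threshold(list(scores.values()))
    print([r for r, s in scores.items() if s > thr])
    print(top_n_regions(scores, 2))
```

Otsu adapts the cutoff to each topic's score distribution, whereas a fixed top-N cut yields region sets of equal size across topics.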
- Parallel Group 1 (Export): Run `Step1_writepycisTopicInputs.R` and `Step2_writeSignacDAPsBEDs.R` to generate initial matrices and BED files.
- Parallel Group 2 (Modeling): Build the custom motif database (Step 3) and run `pycisTopic` modeling (Step 4).
- Final (Inference): Edit `config.yaml` to include species-specific resources (e.g., mm10 for mouse, hg38 for human) and run the SCENIC+ Snakemake pipeline.
- Genome Consistency: Ensure peaks, blacklist files, reference FASTAs, and chromsizes match your species (mm10 vs hg38).
- Barcode Reconciliation: In Step 4, ensure the `___cisTopic` suffix is removed from cell names to match RNA AnnData barcodes for multiomic integration.
- Compute Intensity: Steps 3 and 4 are CPU-heavy and should be run as batch jobs on an HPC system.
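The barcode reconciliation note can be checked with a few lines of plain Python before running SCENIC+. This is a minimal sketch, assuming cell names are available as plain strings; the helper names and the example barcodes are hypothetical, and the real objects (cisTopic object, AnnData) are not touched here.

```python
SUFFIX = "___cisTopic"


def strip_cistopic_suffix(cell_name, suffix=SUFFIX):
    """Remove a trailing suffix from a cell name, leaving other names as-is."""
    if cell_name.endswith(suffix):
        return cell_name[: -len(suffix)]
    return cell_name


def barcode_overlap(atac_cells, rna_barcodes):
    """Return (shared, atac_only) barcode sets after stripping the suffix."""
    cleaned = {strip_cistopic_suffix(c) for c in atac_cells}
    rna = set(rna_barcodes)
    return cleaned & rna, cleaned - rna


if __name__ == "__main__":
    # Hypothetical barcodes for illustration only.
    atac = ["AAACGAAT-1___cisTopic", "TTTGGCCA-1___cisTopic", "GGGTTTAA-1"]
    rna = ["AAACGAAT-1", "GGGTTTAA-1", "CCCAAATT-1"]
    shared, atac_only = barcode_overlap(atac, rna)
    print(f"{len(shared)} shared, {len(atac_only)} ATAC-only")
```

A large ATAC-only set after stripping usually points to a naming mismatch (for example, a sample prefix present in one modality only) rather than to cells that are truly missing.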