Automated RNA-seq processing pipeline developed by Deng YuSen and Zhou Jing.
This pipeline streamlines the analysis of bulk RNA-seq data, including:
- Quality Control (QC)
- Read Mapping (STAR)
- Expression Quantification (RSEM)
- Result Merging and Annotation
The workflow is modular, easy to run, and suitable for standard bulk RNA-seq datasets.
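The commands that `autoRNA.sh` actually runs are not shown in this README. As a rough sketch of how the trimming, mapping, and quantification steps are commonly chained, the helper below builds (but does not execute) typical command lines for one sample. The function name `build_commands`, the directory layout, the paired-end gzipped input, and every path are assumptions made for illustration.

```python
from pathlib import Path


def build_commands(sample, r1, r2, star_index, rsem_ref, outdir, threads=8):
    """Return argv lists for trimming, mapping and quantifying one paired-end sample.

    Hypothetical helper: assumes gzipped paired-end FASTQ input and a
    trimmed/, star/, rsem/ layout under outdir (not taken from autoRNA.sh).
    """
    out = Path(outdir)
    trim_dir = out / "trimmed"
    star_prefix = str(out / "star" / (sample + "_"))

    def trimmed(fastq, mate):
        # Trim Galore names validated paired-end output <input>_val_<mate>.fq.gz
        name = Path(fastq).name
        for suffix in (".fastq.gz", ".fq.gz"):
            if name.endswith(suffix):
                name = name[: -len(suffix)]
                break
        return str(trim_dir / "{}_val_{}.fq.gz".format(name, mate))

    # Step 1: adapter and quality trimming
    trim = ["trim_galore", "--paired", "--output_dir", str(trim_dir), r1, r2]

    # Step 2: mapping; TranscriptomeSAM also writes transcriptome-space alignments
    star = [
        "STAR", "--runThreadN", str(threads),
        "--genomeDir", star_index,
        "--readFilesIn", trimmed(r1, 1), trimmed(r2, 2),
        "--readFilesCommand", "zcat",
        "--quantMode", "TranscriptomeSAM",
        "--outSAMtype", "BAM", "Unsorted",
        "--outFileNamePrefix", star_prefix,
    ]

    # Step 3: quantification from the transcriptome BAM written by STAR
    rsem = [
        "rsem-calculate-expression", "--paired-end", "--alignments",
        "-p", str(threads),
        star_prefix + "Aligned.toTranscriptome.out.bam",
        rsem_ref,
        str(out / "rsem" / sample),
    ]
    return [trim, star, rsem]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` once the output directories exist. Keeping the commands as data makes it straightforward to log them or to run a single step on its own.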
- Fully automated from raw FASTQ to merged expression matrix
- Modular design: each step can be run independently
- Generates a log for each stage (QC, mapping, counting, merging), so runs are easy to track and debug
- Designed for reproducibility in Tangyilab projects
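To make the "merged expression matrix" concrete, here is a minimal sketch of one way per-sample RSEM outputs could be combined and annotated. It is not the pipeline's own merging code. The function name `merge_rsem_results`, the choice of the `TPM` column, and the annotation CSV column names `gene_id` and `gene_name` are assumptions. Only the standard library is used so the sketch runs without extra packages; pandas would make it shorter.

```python
import csv
from pathlib import Path


def merge_rsem_results(result_files, annotation_csv, out_csv, value_column="TPM"):
    """Merge one column from several RSEM *.genes.results files into one CSV.

    Hypothetical helper: the annotation CSV is assumed to have
    'gene_id' and 'gene_name' columns.
    """
    # Build a gene_id -> gene_name lookup from the annotation file
    with open(annotation_csv, newline="") as handle:
        gene_names = {row["gene_id"]: row["gene_name"] for row in csv.DictReader(handle)}

    samples = []
    matrix = {}  # gene_id -> {sample name: value}
    for path in map(Path, result_files):
        # Sample name is the file name without the RSEM suffix
        sample = path.name.split(".genes.results")[0]
        samples.append(sample)
        with open(path, newline="") as handle:
            # RSEM gene-level results are tab-separated with a header row
            for row in csv.DictReader(handle, delimiter="\t"):
                matrix.setdefault(row["gene_id"], {})[sample] = row[value_column]

    # One row per gene, one column per sample; genes absent from a sample get NA
    with open(out_csv, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["gene_id", "gene_name"] + samples)
        for gene_id in sorted(matrix):
            values = [matrix[gene_id].get(sample, "NA") for sample in samples]
            writer.writerow([gene_id, gene_names.get(gene_id, "NA")] + values)
    return samples
```

Passing `value_column="expected_count"` would produce a count matrix instead of TPM values.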
- Linux system
- Anaconda (recommended) with Python 3.7+
- STAR, RSEM, TrimGalore installed and in PATH
- Required Python packages: pandas, numpy (logging is part of the Python standard library and needs no installation)
- Reference annotation CSV file for gene info
| Version | Description |
|---|---|
| V1.0 (Current) | Internal release for Tangyilab servers. Fully automated execution; dependencies pre-installed manually. |
| V2.0 | Generalized version that supports installation and execution on any Linux server (manual dependency setup required). |
| V3.0 | Adds automatic dependency installation and environment configuration (conda-based setup). |
| V4.0 | Introduces multi-species support and automatic reference genome building. |
- Operating system: Linux
- Python: ≥3.7 (Anaconda recommended)
- Installed software: STAR, RSEM, TrimGalore (available in PATH)
- Python packages: pandas, numpy (logging is part of the Python standard library)
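Before a first run, it can help to confirm that the requirements are actually visible to the shell and to Python. The following is a small sketch of such a check, not part of the pipeline. The executable names `STAR`, `rsem-calculate-expression`, and `trim_galore` are the usual command names for these tools and are assumed here; adjust them if your installation differs.

```python
import importlib.util
import shutil

# Assumed executable and package names; not read from autoRNA.sh
REQUIRED_TOOLS = ("STAR", "rsem-calculate-expression", "trim_galore")
REQUIRED_PACKAGES = ("pandas", "numpy")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def missing_packages(packages=REQUIRED_PACKAGES):
    """Return the Python packages that cannot be located for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    problems = missing_tools() + missing_packages()
    if problems:
        print("Missing requirements: " + ", ".join(problems))
    else:
        print("All required tools and packages were found.")
```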
Run the main pipeline script inside your project directory:
bash autoRNA.sh