a RNAseq quantification pipeline from .fastq files to a gene counts matrix using kallisto for coding and non-coding RNA.
before running, you have to set up the attached Docker image (will take ~30 min):
docker build -t rnaseq-pipeline https://raw.githubusercontent.com/loipf/RNAseq-pipeline/master/docker/Dockerfilenow either replace the Docker container hash (last output line from previous build command) in nextflow.config or run nextflow with the -with-docker rnaseq-pipeline argument.
no pre-processing or quality improvement is performed and must be done by the user! (check file kallisto_aligned_reads_qc.csv for p_pseudoaligned >70% and DESeq2_size_factor around 0.6-1.4). sums up transcripts to gene_symbols (without haplotype and scaffold genes).
it can be run locally with downloaded github-repo and edited nextflow.config file with:
nextflow run main.nfor
nextflow run loipf/RNAseq-pipeline -r main --project_dir /path/to/folder --reads_dir /path/to/samples --ensembl_release 101 --num_threads 10 -with-docker rnsaseq-pipelinefor this execution to work properly, you have to be in the current project directory.
nextflow optional extendable with:
-resume
-with-report report_RNAseq-pipeline
-with-timeline timeline_RNAseq-pipeline
-w work_dirpipeline optional extendable with:
--num_threads 5
--ensembl_release 101
--include_ncrna true # false
--nextflow_stageInMode symlink # copyby default, all output will be saved into the data folder of the current directory.
best to run with a new clear folder structure as not all new results do overwrite old ones.
check quality reports in data/quality_reports to exclude problematic samples.
additional, an 3' and 5' adapter sequence (file) needs to be specified with the nextflow arguments --adapter_3_seq_file [sequence|file.fasta] and --adapter_5_seq_file [sequence|file.fasta] or in the main.nf file. otherwise two empty files named NO_FILE and NO_FILE2 must be created to make this work (needs to be fixed someday). if a file is provided, it must be structured like the following example:
> adapter_3_batch_01
AANTGG
> adapter_3_batch_02
GATCGG