Transcriptome analysis pipeline using Fastp, Hisat2 and Stringtie

Venky2804/FHSpipe

FHSpipe

Transcriptome analysis pipeline using Fastp, Hisat2 and Stringtie. It generates read count files for differential expression analysis with popular R packages such as edgeR or DESeq2.

Figure representing the FHSpipe flowchart


R E Q U I R E M E N T S

Requires Python newer than version 3.5.0

Python modules:

os, shutil, subprocess, argparse, sys, json, yaml (yaml is provided by the PyYAML package; the others ship with Python)
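
The Python requirements above can be checked before starting a run. The following is a minimal sketch and not part of FHSpipe: the helper names `missing_modules` and `python_is_new_enough` are hypothetical, and it assumes "newer than 3.5.0" means a strictly greater version.

```python
import importlib.util
import sys

# Modules listed in the requirements; yaml is assumed to come from PyYAML.
REQUIRED_MODULES = ["os", "shutil", "subprocess", "argparse", "sys", "json", "yaml"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_new_enough(version_info=sys.version_info, minimum=(3, 5, 0)):
    """True when the interpreter version is strictly newer than the minimum."""
    return tuple(version_info[:3]) > minimum


# Report the state of the current environment without aborting.
missing = missing_modules()
print("Python version OK:", python_is_new_enough())
print("Missing modules:", ", ".join(missing) if missing else "none")
```

If `yaml` is reported as missing, installing PyYAML is the usual fix.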

TOOLS NEEDED:

FILES REQUIRED:

  • Reference genome fasta file (.fasta/.fa)
  • Reference genome annotation file (.gff/.gtf)
  • All sample fastq files in a single directory (no extra files). To avoid errors, single-end files should be named Sample.fastq and paired-end files should be named Sample_R1.fastq and Sample_R2.fastq. Decompress the sample files first if they are gzip-compressed (.gz)
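
The naming rules for the sample directory can be verified before launching the pipeline. This is a sketch under stated assumptions, not code from FHSpipe: the function name `classify_samples` is hypothetical, and it treats any file that does not end in `.fastq` (including `.gz` files) as a problem.

```python
from pathlib import Path


def classify_samples(fastq_dir):
    """Group .fastq files into single-end and paired-end samples by name."""
    single, paired, problems = [], {}, []
    for path in sorted(Path(fastq_dir).iterdir()):
        name = path.name
        if not name.endswith(".fastq"):
            problems.append(name)  # extra or still-compressed file
            continue
        stem = name[: -len(".fastq")]
        if stem.endswith("_R1") or stem.endswith("_R2"):
            # Record which mate (R1 or R2) was seen for this sample
            paired.setdefault(stem[:-3], set()).add(stem[-2:])
        else:
            single.append(stem)
    # A paired-end sample needs both mates to be usable
    complete = sorted(s for s, mates in paired.items() if mates == {"R1", "R2"})
    for sample in sorted(paired):
        if sample not in complete:
            problems.append(sample + " (missing mate)")
    return single, complete, problems
```

Calling `classify_samples("samples/")` (the directory name is only an example) returns the single-end sample names, the complete paired-end sample names, and a list of entries that break the naming rules.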

U S A G E

fhs_glm_pipeline.py [-h] [-p PATH] [-s SPCS] [-r REF] [-g GFF] [--len LEN] [--orfl ORFL] [-o OUT] [-t THREADS] [--lnc]  

A program to run FHS pipeline i.e. FASTp, HISAT2, STRINGTIE, MERGE, GFFCOMPARE, STRINGTIE 2, PREPDE, extract potential lncRNAs (IUX)

Required arguments:

-p PATH,    --path PATH        --> Path of sample fastq files
-s SPCS,    --spcs SPCS        --> Reference species name

Optional arguments:

-h,         --help             --> Show this help message and exit
-r REF,     --ref REF          --> Path of reference genome fasta file if species is not available
-g GFF,     --gff GFF          --> Path of reference genome gtf/gff file if species is not available
            --len LEN          --> Length filter cut off (In number of nucleotides) [Default: 200]
            --orfl ORFL        --> ORF Length filter cut off (In number of amino acids) [Default: 100]
-o OUT,     --out OUT          --> Prefix of output files [Default: out]
-t THREADS, --threads THREADS  --> Number of threads to use in Hisat2, Samtools, Stringtie. [Default: 1]
            --lnc              --> Use to only extract lncRNAs and skip file processing for differential expression
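
To show how the options and defaults above fit together, here is a hypothetical argparse reconstruction. It is a sketch for illustration only and not the parser used in fhs_glm_pipeline.py: the integer types, the help strings, and the example values `samples/` and `Homo_sapiens` are assumptions.

```python
import argparse


def build_parser():
    """Illustrative parser mirroring the documented options and defaults."""
    parser = argparse.ArgumentParser(
        prog="fhs_glm_pipeline.py",
        description="Sketch of the FHS pipeline command line",
    )
    # Documented as required, but shown in brackets in the usage line,
    # so this sketch does not set required=True.
    parser.add_argument("-p", "--path", help="Path of sample fastq files")
    parser.add_argument("-s", "--spcs", help="Reference species name")
    parser.add_argument("-r", "--ref", help="Reference genome fasta file")
    parser.add_argument("-g", "--gff", help="Reference genome gtf/gff file")
    parser.add_argument("--len", type=int, default=200, help="Length cut off (nt)")
    parser.add_argument("--orfl", type=int, default=100, help="ORF length cut off (aa)")
    parser.add_argument("-o", "--out", default="out", help="Prefix of output files")
    parser.add_argument("-t", "--threads", type=int, default=1, help="Threads to use")
    parser.add_argument("--lnc", action="store_true", help="Only extract lncRNAs")
    return parser


# Example: 8 threads, everything else left at the documented defaults.
args = build_parser().parse_args(["-p", "samples/", "-s", "Homo_sapiens", "-t", "8"])
print(args.path, args.spcs, args.threads, args.len, args.orfl, args.out, args.lnc)
```

Options that are not given on the command line fall back to the defaults listed in the table, so only the sample path, the species name, and any overrides need to be supplied.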

OUTPUT FILES

See the file "Output_guide.txt" in the pipeline directory for details of the pipeline's output files.
