ANNSeq - ANalysis of Nanopore SEQuencing

Platform: Linux | Lifecycle: maturing | License: GPLv3

ANNSeq is a snakemake pipeline that takes Oxford Nanopore Sequencing (ONS) data (fastq) as input, generates fastq stats using nanostat, processes and filters the reads using pychopper, maps the reads to the genome using minimap2, and uses talon to assemble and quantify transcripts. Below is the DAG of the pipeline:

[Figure: DAG of the pipeline]

Getting Started

Input

  • ONS fastq reads
  • Reference genome assembly in fasta format
  • GTF: Gencode GTF; tested on v38 comprehensive CHR gene annotation
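Before launching the pipeline it can help to confirm that the input fastq is readable and looks sensible. The sketch below is not part of ANNSeq, and the helper name fastq_summary is hypothetical. It assumes standard four-line fastq records (unwrapped sequences) and treats a .gz suffix as gzip compression. nanostat produces far more complete statistics; this is only a quick sanity check.

```python
import gzip
from pathlib import Path


def fastq_summary(path):
    """Return (read_count, mean_read_length) for a four-line-per-record fastq file."""
    path = Path(path)
    # Assumption: a ".gz" suffix means the file is gzip-compressed.
    opener = gzip.open if path.suffix == ".gz" else open
    n_reads = 0
    total_bases = 0
    with opener(path, "rt") as handle:
        for line_no, line in enumerate(handle):
            # In a four-line record, the sequence is the second line.
            if line_no % 4 == 1:
                n_reads += 1
                total_bases += len(line.strip())
    mean_length = total_bases / n_reads if n_reads else 0.0
    return n_reads, mean_length
```

Calling fastq_summary("reads.fastq") on a placeholder path returns the number of reads and their mean length. A read count of zero suggests the path or the file format should be checked before running snakemake.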

Dependencies

  • miniconda
  • The rest of the dependencies (including snakemake) are installed via conda through the environment.yml file

Installation

Clone the repository:

git clone --recursive https://github.com/sid-sethi/ANNSeq.git

Create the conda environment for the pipeline, which installs all the dependencies:

cd ANNSeq
conda env create -f environment.yml

Usage

Edit config.yml to set the working directory and the input files/directories. The snakemake command should be issued from within the pipeline directory. Before running any snakemake command, first activate the conda environment with conda activate annseq.

cd ANNSeq
conda activate annseq
snakemake --use-conda -j <num_cores> all

It is a good idea to do a dry run (using the -n parameter) to see what the pipeline would do before actually executing it.

snakemake --use-conda -n all

You can visualise the processes to be executed in a DAG:

snakemake --dag | dot -Tpng > dag.png
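To make the idea of the DAG concrete, the sketch below models the stages named in the pipeline description as a dependency graph and derives one valid execution order. The node names and the exact edges (for example, which stage consumes the genome and the GTF) are assumptions inferred from the description, not read from the Snakefile. It uses the standard-library graphlib module, which requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Each node maps to the set of nodes it depends on.
# Hypothetical names and edges, inferred from the pipeline description.
STAGES = {
    "nanostat": {"fastq"},              # fastq stats
    "pychopper": {"fastq"},             # fastq processing and filtering
    "minimap2": {"pychopper", "genome_fasta"},  # map reads to the genome
    "talon": {"minimap2", "gtf"},       # assemble and quantify transcripts
}


def execution_order(stages):
    """Return one valid order in which every node comes after its dependencies."""
    return list(TopologicalSorter(stages).static_order())


if __name__ == "__main__":
    print(" -> ".join(execution_order(STAGES)))
```

Several orders are valid, because nanostat and pychopper are independent of each other. That independence is what allows snakemake to run jobs in parallel when more than one core is given with -j.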

To stop a running snakemake pipeline, press ctrl+c in the terminal. If the pipeline is running in the background, you can send it a TERM signal, which stops the scheduling of new jobs and waits for all running jobs to finish.

killall -TERM snakemake
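Note that killall sends the signal to every process named snakemake. If the pipeline was launched from a Python script, a single run can be targeted instead. The following is a minimal sketch under that assumption; the helper name terminate_gracefully is hypothetical, and it relies only on the standard subprocess and signal modules.

```python
import signal
import subprocess


def terminate_gracefully(proc, timeout=None):
    """Send SIGTERM to a child process and wait for it to exit.

    Returns the exit code. With timeout=None the call waits indefinitely,
    which matches waiting for running jobs to finish; with a finite timeout,
    subprocess.TimeoutExpired is raised if the process is still alive.
    """
    proc.send_signal(signal.SIGTERM)
    return proc.wait(timeout=timeout)


# Example usage (placeholder command; adjust the core count as needed):
# proc = subprocess.Popen(["snakemake", "--use-conda", "-j", "4", "all"])
# ...
# exit_code = terminate_gracefully(proc)
```

On Linux, a negative exit code from a process that does not handle the signal itself means it was terminated by that signal, for example -15 for SIGTERM.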

To deactivate the conda environment:

conda deactivate

Output

working directory  
|--- config.yml           # a copy of the parameters used in the pipeline  
|--- Nanostat/  
     |-- # output of nanostat - fastq stats  
|--- Pychopper/  
     |-- # output of pychopper - filtered fastq  
|--- Mapping/  
     |-- # output of minimap2 - aligned reads  
|--- Talon/  
     |-- # output of Talon  
     |-- _talon.gtf                       # assembled transcripts  
     |-- _talon_abundance_filtered.tsv    # transcript abundance  
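After a run, the layout above can be checked programmatically. The sketch below is not part of ANNSeq and the helper name missing_outputs is hypothetical. It assumes that the two Talon files carry some prefix before _talon.gtf and _talon_abundance_filtered.tsv, which the tree above does not specify, so it matches them with a wildcard.

```python
from pathlib import Path

# Directory names and file suffixes taken from the output tree.
EXPECTED_DIRS = ("Nanostat", "Pychopper", "Mapping", "Talon")
TALON_SUFFIXES = ("_talon.gtf", "_talon_abundance_filtered.tsv")


def missing_outputs(workdir):
    """Return a list of expected output directories/files not found in workdir."""
    workdir = Path(workdir)
    missing = [name for name in EXPECTED_DIRS if not (workdir / name).is_dir()]
    talon_dir = workdir / "Talon"
    for suffix in TALON_SUFFIXES:
        # Assumption: any prefix may precede the suffix shown in the tree.
        found = talon_dir.is_dir() and any(talon_dir.glob(f"*{suffix}"))
        if not found:
            missing.append(f"Talon/*{suffix}")
    return missing
```

An empty list means every expected directory exists and both Talon result files were found. The check does not validate the contents of any file.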
     
