MattBashton/grolar

Grolar

Grolar is a downstream parser script, implemented in R, for the Pizzly RNA-Seq fusion detector. It flattens Pizzly's JSON output using the jsonlite R package. It also adds coordinates for gene A and gene B of each fusion and, where both genes lie on the same chromosome, calculates the distance between them using the ensembldb Bioconductor package.

Utility shell scripts for GNU parallel

The two provided bash scripts are also useful: they allow both Kallisto and Pizzly to be run on multiple cores for an arbitrary number of samples using GNU parallel. The input for both scripts is a tab-delimited flat file of the form:

sample_1	sample_1_R1.fastq.gz	sample_1_R2.fastq.gz
sample_2	sample_2_R1.fastq.gz	sample_2_R2.fastq.gz
sample_3	sample_3_R1.fastq.gz	sample_3_R2.fastq.gz

If a sample has more than one pair of fastq files, simply append them as extra tab-delimited columns.
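
Because a malformed row only shows up later as a failed job, it can help to check the sample list before building job lists. The following is a minimal sketch and not part of Grolar: the function name `check_sample_list` is hypothetical, and it assumes every row should hold a sample name followed by complete R1/R2 pairs, i.e. an odd number of tab-delimited fields, at least three.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Grolar): verify that every row of a sample
# list has a sample name followed by one or more R1/R2 fastq pairs.
check_sample_list() {
    awk -F'\t' '
        # Skip blank lines
        NF == 0 { next }
        # A valid row has an odd field count of at least 3: name + N pairs
        NF < 3 || NF % 2 == 0 {
            printf "line %d: expected name plus fastq pairs, got %d fields\n", NR, NF
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

# Example: the second sample has two pairs of fastq files (file names are made up)
printf 'sample_1\tsample_1_R1.fastq.gz\tsample_1_R2.fastq.gz\n' > sample_list.txt
printf 'sample_2\ts2_L1_R1.fastq.gz\ts2_L1_R2.fastq.gz\ts2_L2_R1.fastq.gz\ts2_L2_R2.fastq.gz\n' >> sample_list.txt

check_sample_list sample_list.txt && echo "sample list OK"
```

The function exits non-zero and names the offending line when a row has an unpaired file or no fastq files at all.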

Having made such a file, you can create job lists for GNU parallel like so:

Make_job_list_kallisto.sh sample_list.txt kallisto_jobs
Make_job_list_pizzly.sh sample_list.txt pizzly_jobs
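
To illustrate what a job list contains, here is a rough sketch of how one kallisto command per sample could be generated from the sample list. This is not the code of Make_job_list_kallisto.sh, which may well differ: the function name `make_kallisto_jobs`, the default index path `index.idx`, and the use of the sample name as output directory are all assumptions made for illustration. The `--fusion` option of `kallisto quant` is included because it is what writes the fusion.txt file that Pizzly takes as input.

```shell
#!/usr/bin/env bash
# Hypothetical sketch (not the actual Make_job_list_kallisto.sh):
# emit one "kallisto quant" command line per row of the sample list.
make_kallisto_jobs() {
    local sample_list=$1
    local idx=${2:-index.idx}   # assumed index path; adjust to your setup
    awk -F'\t' -v idx="$idx" '
        # Column 1 is the sample name; every remaining column is a fastq file
        NF >= 3 {
            cmd = "kallisto quant -i " idx " --fusion -o " $1
            for (i = 2; i <= NF; i++) cmd = cmd " " $i
            print cmd
        }
    ' "$sample_list"
}

# Example with a one-sample list
printf 'sample_1\tsample_1_R1.fastq.gz\tsample_1_R2.fastq.gz\n' > demo_list.txt
make_kallisto_jobs demo_list.txt
# prints: kallisto quant -i index.idx --fusion -o sample_1 sample_1_R1.fastq.gz sample_1_R2.fastq.gz
```

Each printed line is an independent command, which is the form GNU parallel expects when a job list is fed to it on standard input.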

Each of these lists can then be run with GNU parallel like so:

parallel --progress --jobs 12 --joblog kallisto_joblog.txt < kallisto_jobs

The above example would set off a maximum of 12 simultaneous jobs sourced from kallisto_jobs, writing log information to the kallisto_joblog.txt file. Note that defaults such as command-line arguments and the .gtf and index files need to be set by editing the two shell scripts. The --progress argument prints the following information to the terminal: Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete. Full documentation can be found in the GNU parallel manual.
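
Once a run has finished, the job log can be used to spot samples that failed. The sketch below assumes the tab-delimited layout GNU parallel writes with --joblog (a header line, then one row per job with Exitval in column 7, Signal in column 8 and the command in column 9). The function name `failed_jobs` is hypothetical, and the demo log is fabricated purely to show the format.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the commands from a GNU parallel job log
# that exited non-zero or were terminated by a signal.
failed_jobs() {
    awk -F'\t' 'NR > 1 && ($7 != 0 || $8 != 0) { print $9 }' "$1"
}

# Demo: a made-up two-job log in the --joblog layout
{
    printf 'Seq\tHost\tStarttime\tJobRuntime\tSend\tReceive\tExitval\tSignal\tCommand\n'
    printf '1\t:\t1700000000.000\t12.3\t0\t0\t0\t0\techo sample_1\n'
    printf '2\t:\t1700000001.000\t0.4\t0\t0\t1\t0\techo sample_2\n'
} > demo_joblog.txt

failed_jobs demo_joblog.txt
# prints: echo sample_2
```

On a real run the same function would be pointed at kallisto_joblog.txt. GNU parallel also provides a --resume-failed option that, combined with --joblog, is intended for rerunning failed jobs; check the manual for how it behaves with your version.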
