Snakerail

Snakemake wrapper for monorail.

How to use:

  1. git clone https://github.com/davemcg/Snakerail.git
  2. git clone https://github.com/langmead-lab/monorail-external
  3. Remove all module load singularity lines from Snakerail unless your HPC also uses a module system
  • If you remove them, install singularity with conda instead
  4. Install reference info and images using the monorail scripts:
  • cd /path/to/ref/folder
  • bash ~/path/monorail-external/get_unify_refs.sh
  • bash ~/path/monorail-external/get_human_ref_indexes.sh
  • singularity pull docker://quay.io/broadsword/recount-unify:1.1.0
  • singularity pull docker://quay.io/benlangmead/recount-rs5:1.0.6
  5. ONLY IF YOU HAVE SINGLE END FILES: copy run_recount_pump_single.sh from the src folder of this repo to the singularity folder of your monorail-external clone. For example: cp ~/path/to/Snakerail/src/run_recount_pump_single.sh ~/path/monorail-external/singularity/
  6. Copy the yaml to your working dir and edit it
  7. Create a metadata file in tsv format (used in the yaml as study_fq). Example here
  8. Run the wrapper script (SPECIFIC TO NIH HPC): bash /path/to/repo/Snakerail/Snakerail/Snakerail.wrapper.sh snakerail_config.yaml
  • A bit more generically, you could run something like snakemake -s /path/to/this/repo/Snakerail --configfile snakerail_config.yaml
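Before launching the run, it can help to sanity-check the metadata tsv that the yaml points to as study_fq. The sketch below is not part of Snakerail. It assumes the three-column layout described further down (study, fastq prefix, single or paired) and also assumes the file has no header row, which this README does not confirm. The function name check_metadata is made up for illustration.

```python
import csv
import io

VALID_LAYOUTS = {"single", "paired"}


def check_metadata(handle):
    """Return a list of (line_number, problem) tuples for a metadata tsv.

    Assumption: three tab-separated columns and no header row.
    """
    problems = []
    reader = csv.reader(handle, delimiter="\t")
    for line_number, row in enumerate(reader, start=1):
        # Ignore completely blank lines.
        if not row or all(not cell.strip() for cell in row):
            continue
        if len(row) != 3:
            problems.append((line_number, f"expected 3 columns, found {len(row)}"))
            continue
        study, prefix, layout = (cell.strip() for cell in row)
        if not study or not prefix:
            problems.append((line_number, "empty study or fastq prefix"))
        if layout not in VALID_LAYOUTS:
            problems.append(
                (line_number, f"third column must be single or paired, found {layout!r}")
            )
    return problems


if __name__ == "__main__":
    # Hypothetical study and sample names, for illustration only.
    example = "my_study\tsample_A\tpaired\nmy_study\tsample_B\tsingle\n"
    print(check_metadata(io.StringIO(example)))
```

To check a real file, open it and pass the handle: check_metadata(open("metadata.tsv", newline="")). An empty list means no problems were found under the assumptions above.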

Uh, doesn't monorail use Snakemake?

Yes, but I find the individual pump and unify steps a bit fiddly to keep track of, so this wraps the whole thing in one Snakefile. You start with a metadata tsv (first column is the study, second column is the fastq prefix, and third column is single or paired, denoting how the sequencing was done) and your fastq files in a folder. Snakerail runs pump, then moves all of the pump output into one folder for unify. After unify finishes, it munges the unify output into an RSE (RangedSummarizedExperiment) for direct use in recount3.
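As a rough illustration of the metadata layout described above, here is a sketch that derives the three columns from a list of fastq file names. It is not part of Snakerail, and the naming convention is an assumption: paired files are taken to be prefix_1.fastq.gz and prefix_2.fastq.gz, and single end files prefix.fastq.gz. Check what your own files and Snakerail actually expect before relying on it. The function name metadata_rows is hypothetical.

```python
import re

# Assumed naming (not confirmed by Snakerail):
#   paired: <prefix>_1.fastq.gz and <prefix>_2.fastq.gz
#   single: <prefix>.fastq.gz
MATE = re.compile(r"^(?P<prefix>.+)_(?P<mate>[12])\.fastq\.gz$")
SINGLE = re.compile(r"^(?P<prefix>.+)\.fastq\.gz$")


def metadata_rows(study, filenames):
    """Return sorted (study, fastq_prefix, layout) tuples for the given file names."""
    mates = {}
    singles = set()
    for name in filenames:
        mate_match = MATE.match(name)
        if mate_match:
            mates.setdefault(mate_match["prefix"], set()).add(mate_match["mate"])
            continue
        single_match = SINGLE.match(name)
        if single_match:
            singles.add(single_match["prefix"])
    rows = []
    for prefix in sorted(set(mates) | singles):
        if mates.get(prefix) == {"1", "2"}:
            rows.append((study, prefix, "paired"))
        elif prefix in singles:
            rows.append((study, prefix, "single"))
        # A lone mate file (only _1 or only _2) is skipped, not guessed at.
    return rows


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    files = ["b.fastq.gz", "a_1.fastq.gz", "a_2.fastq.gz", "notes.txt"]
    for row in metadata_rows("my_study", files):
        print("\t".join(row))
```

In practice the file list could come from os.listdir on your fastq folder, with the printed lines redirected into the tsv that the yaml references as study_fq.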

Why doesn't David just remove all the module lines?

Because this is for my working use on the NIH HPC, which uses a module system that I abuse instead of rolling my own containers or conda envs or something. If you want to run this and are having trouble, let me know. I don't think it would take much more effort to make it more general. Again, I'm lazy and don't want to optimize further if I am the only one using it.

What is with that file in src?????

Monorail has a bug (?): the script they provide for running it assumes that a local run is paired end. The script in src tweaks it lightly to take out the second fq file and move the study name up by one position.
