MultilocusBlast is a bioinformatics pipeline written in Nextflow. It performs BLAST-based multi-locus species characterization.
Prepare the input samplesheets in the format shown in the dummy reference files. DO NOT change the column headers, because the pipeline is configured to expect them exactly as given.
- assemblies samplesheet (see assets/fasta_samplesheet.csv)
- reference genomes (see assets/reference_input1.csv)
- consensus genome (fasta format)
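Because the column headers must stay unchanged, it can help to compare the first line of your own samplesheet against the template before launching. The sketch below is a hypothetical helper, not part of the pipeline; the function name `same_header` and the location of your own samplesheet are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the pipeline): succeeds only when two
# CSV files have an identical first line, i.e. the same column headers.
same_header() {
    local template="$1" candidate="$2"
    # Drop carriage returns so Windows line endings do not cause a false mismatch.
    [ "$(head -n 1 "$template" | tr -d '\r')" = "$(head -n 1 "$candidate" | tr -d '\r')" ]
}

# Example usage, assuming your samplesheet is ./fasta_samplesheet.csv:
#   same_header assets/fasta_samplesheet.csv fasta_samplesheet.csv \
#       || echo "column headers differ from the template" >&2
```

The check only compares header lines; it does not validate the rows beneath them.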
Note
If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.
You can now run the pipeline from your terminal as shown below, or modify run_MULTILOCUS.sh if you want to submit it to an HPC cluster:
nextflow run main.nf \
-profile <conda,singularity.../institute> \
--input fasta_samplesheet.csv \
--reference_input reference_input1.csv \
--consensus test_reference_Genome.fasta \
--pid 75 \
--outdir <OUTDIR>
The --pid value of 75 is the default; adjust it as appropriate for your species.
Warning
Please provide pipeline parameters via the CLI or the Nextflow -params-file option. Custom config files, including those provided via the Nextflow -c option, can be used to supply any configuration your environment needs, except for parameters; see docs.
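As a minimal sketch of the -params-file route, the same parameters used in the command above can be written to a YAML file and passed to Nextflow in one option. The file name params.yaml is an arbitrary choice, and the outdir value is left as a placeholder for you to replace.

```shell
#!/usr/bin/env bash
# Write the pipeline parameters to a YAML file.
# The file name params.yaml is an assumption; any name works.
cat > params.yaml <<'EOF'
input: fasta_samplesheet.csv
reference_input: reference_input1.csv
consensus: test_reference_Genome.fasta
pid: 75
outdir: "<OUTDIR>"
EOF

# Then launch the pipeline with the file (profile left as a placeholder):
#   nextflow run main.nf -profile <profile> -params-file params.yaml
```

Keeping parameters in a file makes a run easier to repeat and to record alongside its results.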
multilocusblast was originally written by Anusha Reddy Ginni.
Thanks to these amazing people for their assistance in the development of this pipeline: SMorrison42 and hseabolt.
If you would like to contribute to this pipeline, please see the contributing guidelines.
An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.
This pipeline uses code and infrastructure developed and maintained by the nf-core community, reused here under the MIT license.
The nf-core framework for community-curated bioinformatics pipelines.
Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.