MosMitCRT

MOSquito MITochondrial Control Region Trimmer

This software takes mitochondrial genomes, annotates them with Prokka, and extracts the control sequence (all the bases between the Isoleucine tRNA and the 12S rRNA that don't code for anything). The control sequence is then matched against a reference database of known species' control sequence motifs.

Dependencies:

To run properly, this pipeline needs:

Recommended versions of theese dependencies are in the environment.yml file, for use with Conda virtual environments.

Options:

Option	Use	Default
--in	Specify the input file(s) in fasta format	You must specify an input
--motif	Specify the motif file to use	You must specify a motif file or use --nomotif
--out	Specify the ouput folder name	pipe_out
--nomotif	Override --motif and skip searching for motifs	Searches for motifs from the --motif file

Output files:

(named output)
- prokka-annotations
  - One folder for each annotation format, each containing annotation files for each processed sequence.
- control-sequences
  - individual
    - Individual fasta files for each extracted control sequence.
  - "all_sequences.fasta" contains all the control sequences in one fasta file.
- mast
  - Contains the MAST reports in html, json, and plaintext formats.
- "annotations.gff" contains annotations (and only annotations) in GFF3 format for the control sequences.
- "annSeq.gff" contains annotations in GFF3 format, with a ##FASTA section containing all sequence data, as well.

Test commad

This pipeline comes with some genomes from NCBI as test material, as well as some motifs to search for.

Test commands:

Normal usage:

nextflow run MosMitCRT --in "MosMitCRT/testData/mos*.fasta" --motif MosMitCRT/testData/motifs.txt

This command can also be run by using -profile test.

No motif search, check reverses and rotated sequence handling:

nextflow run MosMitCRT --in MosMitCRT/testData/testv3.fasta --nomotif

This command can be run by using -profile test2

Versioning:

versioning: X.Y.Z X: Major release version

Anything that adds an analysis step.
Things like adding a BLAST search, adding options for genome annotation programs, etc.
Changes the core functionality of the program. Y: Minor release version
Changing the interface or adding smaller features, things that don't affect the core functionality of the program.
Giving a command line option to add Prokka options, changing output folder strucutre, etc.
Optionally different, but produces the same type of output and accepts the same inputs. Z: Bugfix version
For when I make mistakes and have to push another version to change something that otherwise breaks the program.
Also used for back-end improvements that don't actually affect the usage or output.

Name		Name	Last commit message	Last commit date
Latest commit History 44 Commits
bin		bin
testData		testData
.travis.yml		.travis.yml
README.md		README.md
environment.yml		environment.yml
license.txt		license.txt
main.nf		main.nf
nextflow.config		nextflow.config

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MosMitCRT

Dependencies:

Options:

Output files:

Test commad

Test commands:

Versioning:

About

Releases

Packages

Languages

License

TMRHarrison/MosMitCRT

Folders and files

Latest commit

History

Repository files navigation

MosMitCRT

Dependencies:

Options:

Output files:

Test commad

Test commands:

Versioning:

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages