
The annotate subcommand

Guanliang MENG edited this page Jun 22, 2023 · 1 revision

Use this subcommand to annotate mitogenomes assembled by MitoZ or by any other assembler.

$ mitoz annotate -h
usage: mitoz annotate [-h] [--workdir <STR>] --outprefix <STR> [--thread_number <INT>] --fastafiles <STR>
                      [<STR> ...] [--fq1 <file>] [--fq2 <file>] [--profiles_dir <STR>] [--species_name <STR>]
                      [--template_sbt <file>] [--genetic_code <INT>]
                      [--clade {Chordata,Arthropoda,Echinodermata,Annelida-segmented-worms,Bryozoa,Mollusca,Nematoda,Nemertea-ribbon-worms,Porifera-sponges}]

Annotate PCGs, tRNA and rRNA genes.

optional arguments:
  -h, --help            show this help message and exit
  --workdir <STR>       workdir [./]
  --outprefix <STR>     output prefix [required]
  --thread_number <INT>
                        thread number [8]
  --fastafiles <STR> [<STR> ...]
                        fasta file(s). The length of sequence id should be <= 13 characters, and each sequence
                        should have 'topology=linear' or 'topology=circular' at the header line, otherwise it
                        is assumed to be 'topology=linear'. For example, '>Contig1 topology=linear' [required]
  --fq1 <file>          Fastq1 file if you want to visualize the depth distribution
  --fq2 <file>          Fastq2 file if you want to visualize the depth distribution
  --profiles_dir <STR>  Directory containing 'CDS_HMM/', 'MT_database/' and 'rRNA_CM/'.
                        [/home/gmeng/dev/MitoZ_private/mitoz/profiles]
  --species_name <STR>  species name to use in output genbank file ['Test sp.']
  --template_sbt <file>
                        The sqn template to generate the resulting genbank file. Go to
                        https://www.ncbi.nlm.nih.gov/genbank/tbl2asn2/#Template to generate your own template
                        file if you like. ['/home/gmeng/dev/MitoZ_private/mitoz/annotate/script/template.sbt']
  --genetic_code <INT>  which genetic code table to use? 'auto' means determined by '--clade' option. [auto]
  --clade {Chordata,Arthropoda,Echinodermata,Annelida-segmented-worms,Bryozoa,Mollusca,Nematoda,Nemertea-ribbon-worms,Porifera-sponges}
                        which clade does your species belong to? [Arthropoda]
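
When many samples are annotated from a script, it can help to assemble the command line programmatically. The following is a minimal sketch that uses only options listed in the help text above. The helper name `build_annotate_cmd` and the file names are hypothetical, and the sketch only builds and prints the command; it does not run MitoZ.

```python
import shlex

# Clade names copied from the `--clade` choices in the help text.
CLADES = {
    "Chordata", "Arthropoda", "Echinodermata", "Annelida-segmented-worms",
    "Bryozoa", "Mollusca", "Nematoda", "Nemertea-ribbon-worms",
    "Porifera-sponges",
}


def build_annotate_cmd(outprefix, fastafiles, clade="Arthropoda", thread_number=8):
    """Return the argument list for one `mitoz annotate` call.

    Hypothetical helper for illustration; the defaults mirror those
    shown in the help text ([Arthropoda], [8]).
    """
    if clade not in CLADES:
        raise ValueError(f"unknown clade: {clade}")
    if not fastafiles:
        raise ValueError("at least one fasta file is required")
    return [
        "mitoz", "annotate",
        "--outprefix", outprefix,
        "--thread_number", str(thread_number),
        "--clade", clade,
        # --fastafiles takes one or more paths, so it is placed last.
        "--fastafiles", *fastafiles,
    ]


# Hypothetical file names, for illustration only.
cmd = build_annotate_cmd("sampleA", ["sampleA.fasta", "sampleB.fasta"], clade="Chordata")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where MitoZ is installed.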

Note:

  • The --fastafiles option accepts multiple fasta files (i.e., multiple samples, with each fasta file containing ONE mitogenome). This is convenient when you want to annotate many mitogenomes with MitoZ.

  • Each sequence id must be at most 13 characters long, and each header line should include 'topology=linear' or 'topology=circular'; otherwise, the sequence is assumed to be 'topology=linear'. For example: '>Contig1 topology=linear'
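
The header rules above can be checked before running the annotation. The sketch below is not part of MitoZ: `check_header` is a hypothetical helper, and it assumes that the sequence id is the first whitespace-delimited token after `>`, which is the usual FASTA convention but may differ from how MitoZ parses headers internally.

```python
import re

MAX_ID_LEN = 13  # limit stated in the note above
TOPOLOGY_RE = re.compile(r"\btopology=(linear|circular)\b")


def check_header(header):
    """Return (seq_id, topology, notes) for one FASTA header line.

    Hypothetical helper; assumes the id is the first token after '>'.
    """
    if not header.startswith(">"):
        raise ValueError("not a FASTA header line")
    fields = header[1:].split()
    seq_id = fields[0] if fields else ""
    notes = []
    if not seq_id:
        notes.append("missing sequence id")
    elif len(seq_id) > MAX_ID_LEN:
        notes.append(f"id longer than {MAX_ID_LEN} characters")
    match = TOPOLOGY_RE.search(header)
    # Without an explicit topology field, fall back to linear,
    # as described in the note above.
    topology = match.group(1) if match else "linear"
    if match is None:
        notes.append("no topology field; treated as linear")
    return seq_id, topology, notes


print(check_header(">Contig1 topology=linear"))
# → ('Contig1', 'linear', [])
```

An empty notes list means the header satisfies both rules; a header such as `>AVeryLongSequenceId` would be reported for both its id length and its missing topology field.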
