CDS characterization in transcripts of Eukaryote species

CodAn (Coding sequence Annotator) is a computational tool designed to characterize the CDS and UTR regions in transcripts from any Eukaryote species.

Getting Started


Decompress the CodAn.tar.gz file:

tar -xf CodAn.tar.gz

Add the bin directory to your PATH:

export PATH=$PATH:path/to/CodAn/bin/
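Note that export only affects the current shell session; adding the same line to a shell startup file such as ~/.bashrc makes it persistent. As a minimal sketch, the following check reports whether the directory ended up on PATH. The variable names CODAN_BIN and PATH_STATUS are illustrative and not part of CodAn, and the path is the same placeholder used above.

```shell
# Placeholder path; replace with the actual location of the CodAn bin directory.
CODAN_BIN="path/to/CodAn/bin/"
export PATH="$PATH:$CODAN_BIN"

# Report whether the directory now appears as an entry in PATH.
case ":$PATH:" in
  *":$CODAN_BIN:"*) PATH_STATUS="on PATH" ;;
  *) PATH_STATUS="missing from PATH" ;;
esac
echo "CodAn bin directory is $PATH_STATUS"
```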


Predictive models

The predictive models are available in the subfolder "models". The folder contains all models designed for Eukaryote species (i.e., Fungi, Plants and Animals [Invertebrates and Vertebrates]). Models are provided for use with either full-length or partial transcripts.

Download the model that fits your needs, as described in the "models" folder, decompress the model file (using unzip), and pass the path to the decompressed model with the -m option.
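The decompression step might look like the sketch below. The archive name MODEL_NAME.zip is hypothetical, and the sketch assumes the archive unpacks into a folder with the same base name; check the "models" folder for the real file names and layout.

```shell
# Hypothetical archive name; use the model file you actually downloaded.
MODEL_ZIP="MODEL_NAME.zip"

# Assumption: the archive unpacks into a folder named after the archive.
MODEL_DIR="${MODEL_ZIP%.zip}"

# Extract quietly into the current directory if the archive is present.
if [ -f "$MODEL_ZIP" ]; then
  unzip -q "$MODEL_ZIP" -d .
fi

# This is the path to give to the -m option.
echo "Model path for -m: $PWD/$MODEL_DIR"
```

Running unzip -l on the archive first lists its contents, which shows the actual top-level folder name without extracting anything.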


Usage: [options]

  -h, --help            show this help message and exit
  -t file, --transcripts=file
                        Mandatory - input transcripts file (FASTA format),
  -m model, --model=model
                        Mandatory - path to model, /path/to/model
  -s string, --strand=string
                        Optional - strand of sequence to predict genes (plus,
                        minus or both) [default=both]
  -c int, --cpu=int     Optional - number of threads to be used [default=1]
  -o folder, --output=folder
                        Optional - path to output folder,
                        /path/to/output/folder/ if not declared, it will be
                        created at the transcripts input folder
  -b proteinDB, --blastdb=proteinDB
                        Optional - path to blastDB of known protein sequences,
  -H int, --HSP=int     Optional - used in the "-qcov_hsp_perc" option of
                        blastx [default=80]

Basic usage (predict CDS): -t transcripts.fa -o output_folder -m model

Alternative usage (predict CDS and run a BLAST search against a specific DB to annotate the predicted genes based on similarity): -t transcripts.fa -o output_folder -m model -b blast_DB
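To illustrate how the mandatory and optional flags from the help text combine, the sketch below assembles one command line and prints it for review instead of running it. CODAN_EXECUTABLE is a placeholder for the program in the CodAn bin directory, and the file names, strand and thread count are illustrative values only.

```shell
# Placeholder: set this to the CodAn executable found in the bin directory.
CODAN_CMD="CODAN_EXECUTABLE"

# Mandatory options: input transcripts (FASTA) and path to the model.
set -- -t transcripts.fa -m /path/to/model

# Optional options: output folder, plus strand only, 4 threads.
set -- "$@" -o output_folder -s plus -c 4

# Print the assembled command rather than executing it.
CODAN_CALL="$CODAN_CMD $*"
echo "$CODAN_CALL"
```

Leaving out -s and -c falls back to the documented defaults (both strands, one thread).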

To run this optional step, provide a protein DB built with the makeblastdb program from NCBI BLAST+. Users can also download pre-built protein DBs, such as swissprot (
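A protein DB can be built from a FASTA file of known proteins roughly as follows. This is a sketch, assuming NCBI BLAST+ is installed; proteins.fasta and proteins_db are illustrative names, not files shipped with CodAn.

```shell
# Illustrative names; replace with your own protein FASTA and database name.
PROTEIN_FASTA="proteins.fasta"
BLAST_DB="proteins_db"

# Build a protein database only if BLAST+ and the input file are available.
if command -v makeblastdb >/dev/null 2>&1 && [ -f "$PROTEIN_FASTA" ]; then
  makeblastdb -in "$PROTEIN_FASTA" -dbtype prot -out "$BLAST_DB"
fi

# BLAST tools refer to a database by this base name (no file extension),
# so it is presumably the value to pass to -b.
echo "-b $BLAST_DB"
```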


Follow the instructions in the quick tutorial to learn how to use CodAn and interpret the results.


If you use or discuss CodAn, please cite the preprint:

Nachtigall et al. CodAn: predictive models for the characterization of mRNA transcripts in Eukaryotes




To report bugs, to ask for help and to give any feedback, please contact Pedro G. Nachtigall:
