These are the flags specific to similarity searching with DIAMOND. They can be supplied on the command line (denoted CMD) or in the ini file (denoted INI).
- Specify any number of FASTA-formatted databases you would like to configure for EnTAP
- Not necessary if you already have DIAMOND-configured databases (.dmnd)
Specify which EnTAP database you'd like to use for execution (UniProt, Gene Ontology, and Taxonomy lookups)
- Binary Database (default) - This will be much quicker and is recommended
- SQL Database - Slower, but more broadly compatible across systems
This can be flagged multiple times (ex: ``--data-type 0 --data-type 1``)
This flag is not recommended unless you are experiencing issues with the EnTAP Binary Database
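As a rough sketch of what repeating a flag looks like when a command line is assembled programmatically, the Python helper below expands a list of values into repeated flag/value pairs. The helper name `repeat_flag` is hypothetical and not part of EnTAP; only the `data-type` flag and its values 0 and 1 come from the example above.

```python
def repeat_flag(flag, values):
    """Expand values into repeated flag/value pairs.

    Hypothetical helper for illustration; it is not part of EnTAP.
    """
    args = []
    for value in values:
        args.extend([flag, str(value)])
    return args


# Flag name and values are taken from the example in the text above.
data_type_args = repeat_flag("--data-type", [0, 1])
print(" ".join(data_type_args))
```

The resulting list can be appended to the rest of an argument list before the program is launched.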
- Specify :ref:`contaminant<tax-label>` level of filtering
- Multiple contaminants can be selected through repeated flags
- This flag enables :ref:`taxonomic<tax-label>` 'favoring' of hits that are closer to your target species or lineage. Any lineage in the NCBI Taxonomy database can be used, such as a genus, phylum, or species.
- Replace all spaces with underscores ('_'), for example ``--taxon homo_sapiens`` or ``--taxon primates``
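The underscore rule can be applied with a small helper before the value is passed to the taxon flag. This is a minimal sketch: the function name `format_taxon` is hypothetical, and the lowercasing only mirrors the examples above; whether EnTAP requires lowercase input is an assumption.

```python
def format_taxon(name):
    """Replace whitespace in a taxon name with underscores.

    Lowercasing mirrors the examples in the text; whether EnTAP requires
    lowercase is an assumption, not something the documentation states.
    """
    return "_".join(name.strip().lower().split())


print(format_taxon("Homo sapiens"))
print(format_taxon("primates"))
```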
- Specify Gene Ontology levels you would like to normalize to (ex: 0, 1, 2, 3, 4)
- A level of '0' indicates all levels will be printed
- Any number of these flags can be used
- Default: 1
- More information at: http://geneontology.org/page/ontology-structure
- Specify the E-value cutoff for similarity searching (scientific notation)
- Default: 10E-5
- Specify minimum target coverage for similarity searching
- Default: 50%
- Specify minimum query coverage for similarity searching
- Default: 50%
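To illustrate how the E-value and coverage thresholds might combine, here is a minimal Python sketch that filters alignment hits. The `Hit` record, its field names, and the coverage formula (aligned length divided by sequence length, as a percentage) are assumptions made for illustration; EnTAP's internal calculation may differ. The sketch treats the E-value cutoff as an upper bound, as is conventional for DIAMOND and BLAST, and uses the defaults listed above. Note that Python parses `10E-5` as 0.0001.

```python
from dataclasses import dataclass

# Defaults as listed in the text; Python reads "10E-5" as 0.0001.
DEFAULT_EVALUE = float("10E-5")
DEFAULT_TCOVERAGE = 50.0  # percent
DEFAULT_QCOVERAGE = 50.0  # percent


@dataclass
class Hit:
    """Hypothetical alignment record; field names are illustrative only."""
    evalue: float
    query_aligned: int   # aligned residues on the query
    query_length: int
    target_aligned: int  # aligned residues on the target
    target_length: int


def coverage(aligned, length):
    """Assumed definition: percent of the sequence covered by the alignment."""
    return 100.0 * aligned / length


def passes(hit, evalue=DEFAULT_EVALUE, tcov=DEFAULT_TCOVERAGE, qcov=DEFAULT_QCOVERAGE):
    """Keep a hit only if it meets all three thresholds (inclusive bounds assumed)."""
    return (
        hit.evalue <= evalue
        and coverage(hit.target_aligned, hit.target_length) >= tcov
        and coverage(hit.query_aligned, hit.query_length) >= qcov
    )


hits = [
    Hit(1e-30, 280, 300, 290, 310),  # strong hit with high coverage
    Hit(1e-2, 280, 300, 290, 310),   # E-value above the cutoff
    Hit(1e-30, 100, 300, 290, 310),  # query coverage of roughly 33%
]
kept = [h for h in hits if passes(h)]
print(len(kept))
```

Only the first hit survives here: the second fails the E-value cutoff and the third falls short of the query coverage threshold.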
Path to a file listing the terms you would like to be deemed "uninformative"
The file must contain one term per line
Example (defaults):
- conserved
- predicted
- unnamed
- hypothetical
- putative
- unidentified
- uncharacterized
- unknown
- uncultured
- uninformative
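The file format described above can be produced and checked with a few lines of Python. This is a sketch under stated assumptions: the file name `uninformative.txt` and the helper names are hypothetical, and the case-insensitive substring match in `is_uninformative` is only a guess at how such terms could be applied, not EnTAP's documented behavior.

```python
import tempfile
from pathlib import Path

# Default terms as listed in the text above.
DEFAULT_TERMS = [
    "conserved", "predicted", "unnamed", "hypothetical", "putative",
    "unidentified", "uncharacterized", "unknown", "uncultured", "uninformative",
]


def write_terms(path, terms):
    """Write one term per line, the format described in the text."""
    Path(path).write_text("\n".join(terms) + "\n", encoding="utf-8")


def load_terms(path):
    """Read the file back, skipping blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def is_uninformative(description, terms):
    """Assumed matching rule: case-insensitive substring match."""
    lowered = description.lower()
    return any(term.lower() in lowered for term in terms)


with tempfile.TemporaryDirectory() as tmp:
    terms_file = Path(tmp) / "uninformative.txt"  # hypothetical file name
    write_terms(terms_file, DEFAULT_TERMS)
    terms = load_terms(terms_file)

print(is_uninformative("PREDICTED: hypothetical protein", terms))
print(is_uninformative("ATP synthase subunit alpha", terms))
```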