
4. Output Directories and Files

Laura Carroll edited this page Feb 14, 2022 · 1 revision

A single BTyper3 run will deposit the following in your specified output directory (--output):

btyper3_final_results (directory): Final results directory in which BTyper3 deposits all of its output files. BTyper3 creates this directory in your specified output directory (--output).

your_genome_final_results.txt (file): Final tab-separated text file, one per input genome. BTyper3 creates this file, which has a header line (denoted with "#") followed by a single row of results for the input genome, with the following columns:

  • Column 1: #filename The path to the fasta file supplied as input.

  • Column 2: prefix The file name prefix used by BTyper3 for all results files.

  • Column 3: species(ANI) The B. cereus group genomospecies producing the highest average nucleotide identity (ANI) value, with the corresponding ANI value in parentheses. If the input genome does not share >= 92.5 ANI with any known B. cereus group species medoid genome, an asterisk is appended to the species name. If species assignment is not performed (--ani_species False), a designation of "(Species assignment not performed)" is given.

  • Column 4: subspecies(ANI) Assigned B. cereus group subspecies, with the corresponding ANI value in parentheses, if applicable. If the input genome does not meet any subspecies thresholds, a subspecies designation of "No subspecies" is given. If subspecies assignment is not performed (--ani_subspecies False), a designation of "(Subspecies assignment not performed)" is given.

  • Column 5: Pseudo_Gene_Flow_Unit(ANI) Assigned B. cereus group pseudo-gene flow unit, with the corresponding ANI value relative to the pseudo-gene flow unit medoid genome in parentheses. If the input genome does not fall within the observed ANI boundary for any previously delineated "true" gene flow unit, an asterisk is appended to the pseudo-gene flow unit name.

  • Column 6: Closest_Type_Strain(ANI) The B. cereus group species type strain that shares the highest ANI value with the query genome, with the ANI value reported in parentheses. Note: B. cereus group species are often proposed in the literature using unstandardized approaches (e.g., varying genomospecies thresholds, which may produce overlapping genomospecies). The type strain comparison method was added in BTyper3 v3.2.0, as users may still want to compare a query genome with the type strains of published B. cereus group species. However, interpret results with caution, as some B. cereus group genomes may belong to multiple species when type strain genomes are used. For more information, check out our review of B. cereus group taxonomy/nomenclature.

  • Column 7: anthrax_toxin(genes) Number of anthrax toxin-encoding genes detected in the input genome, out of the total number of anthrax toxin genes required for a genome to be assigned to biovar Anthracis (i.e., 3 genes; cya, lef, pagA). Anthrax toxin genes detected in the input genome are listed in parentheses.

  • Column 8: emetic_toxin_cereulide(genes) Number of cereulide synthetase-encoding genes detected in the input genome, out of the total number of cereulide synthetase genes required for a genome to be assigned to biovar Emeticus (i.e., 4 genes; cesABCD). Cereulide synthetase genes detected in the input genome are listed in parentheses.

  • Column 9: diarrheal_toxin_Nhe(genes) Number of non-hemolytic enterotoxin (Nhe)-encoding genes detected in the input genome, out of three (nheABC). Nhe-encoding genes detected in the input genome are listed in parentheses.

  • Column 10: diarrheal_toxin_Hbl(genes) Number of hemolysin BL (Hbl)-encoding genes detected in the input genome, out of four (hblABCD). Hbl-encoding genes detected in the input genome are listed in parentheses.

  • Column 11: diarrheal_toxin_CytK(top_hit) Highest-scoring Cytotoxin K (CytK)-encoding gene detected in the input genome (either cytK-1 or cytK-2). The highest-scoring CytK-encoding gene detected in the input genome is listed in parentheses.

  • Column 12: sphingomyelinase_Sph(gene) Sphingomyelinase (Sph)-encoding gene detected in the input genome (sph). The Sph-encoding gene detected in the input genome is listed in parentheses.

  • Column 13: capsule_Cap(genes) Number of B. anthracis-associated poly-γ-D-glutamic acid capsule (Cap)-encoding genes detected in the input genome, out of five (capABCDE). Cap-encoding genes detected in the input genome are listed in parentheses.

  • Column 14: capsule_Has(genes) Number of hyaluronic acid capsule (Has)-encoding genes detected in the input genome, out of three (hasABC). Has-encoding genes detected in the input genome are listed in parentheses.

  • Column 15: capsule_Bps(genes) Number of "B. cereus" exo-polysaccharide capsule (Bps)-encoding genes detected in the input genome, out of nine (bpsXABCDEFGH). Bps-encoding genes detected in the input genome are listed in parentheses.

  • Column 16: Bt(genes) Total number of Bacillus thuringiensis toxin (Bt toxin) genes detected in the input genome. Bt toxin genes detected in the input genome are listed in parentheses. Note: BTyper3 currently detects known Bt toxin genes (i.e., those present in the Bt toxin nomenclature database; accessed September 19, 2019) using translated nucleotide blast (tblastn). This approach is conservative to reflect the analyses conducted in the manuscript (i.e., to limit false positives).

  • Column 17: PubMLST_ST[clonal_complex](perfect_matches) Sequence type (ST) assigned using PubMLST's seven-gene multi-locus sequence typing (MLST) scheme for B. cereus s.l. Square brackets contain the name of the PubMLST clonal complex associated with the ST, if available/applicable. Parentheses contain the number of perfect allele matches (i.e., with 100% nucleotide identity and coverage) out of seven possible.

  • Column 18: Adjusted_panC_Group(predicted_species) panC group assigned using the adjusted, eight-group panC group assignment scheme proposed by Carroll, et al. panC sequences of effective and proposed B. cereus s.l. species are also included in the database but are assigned a species name (e.g., “Group_manliponensis”) rather than a number (i.e., Group_I to Group_VIII). Species associated with a panC group are listed in parentheses. If the query genome does not share >= 99% nucleotide identity and/or >= 80% coverage with one or more panC alleles in the database, the closest-matching panC group is reported with an asterisk.

  • Column 19: final_taxon_names Taxonomic assignment of the isolate, written from longest (species, subspecies [if applicable], and biovars [if applicable]) to shortest (biovars, if applicable) form. If the input genome does not share >= 92.5 ANI with any known B. cereus group species medoid genome (i.e., there is an asterisk appended to the species name in the "species(ANI)" column), a species designation of "(Species unknown)" is given (this designation is also used if species assignment is not performed, i.e., --ani_species False). If two or more anthrax toxin genes are detected in the input genome but one or more are missing, an asterisk is appended to the biovar (i.e., "Anthracis*"). Likewise, if two or more cereulide synthetase genes are detected but one or more are missing, an asterisk is appended to the biovar (i.e., "Emeticus*").
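The column layout above can be read programmatically. The following is a minimal sketch, not part of BTyper3 itself: it assumes only what is described above (a tab-separated file with one header line beginning with "#" and one result row). The helper names `read_final_results` and `split_parenthetical` are hypothetical, and the sample values are invented placeholders rather than real BTyper3 output.

```python
import io
import re


def read_final_results(handle):
    """Return {column_name: value} for one final results file.

    Assumes a header line starting with "#" followed by one result row,
    both tab-separated, as described in the column list above.
    """
    lines = [line.rstrip("\r\n") for line in handle if line.strip()]
    if len(lines) < 2:
        raise ValueError("expected a header line and one result row")
    header = lines[0].lstrip("#").split("\t")
    values = lines[1].split("\t")
    return dict(zip(header, values))


def split_parenthetical(field):
    """Split 'name(detail)' into ('name', 'detail').

    Returns (field, None) when the field has no trailing parentheses.
    A field that is entirely parenthesized, such as
    "(Species assignment not performed)", yields an empty name.
    """
    match = re.fullmatch(r"(.*?)\((.*)\)", field)
    if match is None:
        return field, None
    return match.group(1), match.group(2)


# Invented placeholder data: only the first three columns, with made-up values.
sample = (
    "#filename\tprefix\tspecies(ANI)\n"
    "/path/to/genome.fasta\tgenome\texample_species(97.1)\n"
)
record = read_final_results(io.StringIO(sample))
name, ani = split_parenthetical(record["species(ANI)"])
print(record["prefix"], name, ani)
```

With a real run, you would pass an open file handle for your_genome_final_results.txt instead of the in-memory sample. Because the parenthetical content differs by column (an ANI value, a gene list, a match count), any further parsing of the second element is left to the caller.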

species (directory): Directory in which BTyper3 deposits raw fastANI output files during species assignment. BTyper3 creates this directory within the btyper3_final_results directory within your specified output directory (output_directory/btyper3_final_results/species).

subspecies (directory): Directory in which BTyper3 deposits raw fastANI output files during subspecies assignment. BTyper3 creates this directory within the btyper3_final_results directory within your specified output directory (output_directory/btyper3_final_results/subspecies).

geneflow (directory): Directory in which BTyper3 deposits raw fastANI output files during pseudo-gene flow unit assignment. BTyper3 creates this directory within the btyper3_final_results directory within your specified output directory (output_directory/btyper3_final_results/geneflow).

typestrains (directory): Directory in which BTyper3 deposits raw fastANI output files during type strain comparison. BTyper3 creates this directory within the btyper3_final_results directory within your specified output directory (output_directory/btyper3_final_results/typestrains).

virulence (directory): Directory in which BTyper3 deposits raw blast output files during virulence gene detection. BTyper3 creates this directory within the btyper3_final_results directory within your specified output directory (output_directory/btyper3_final_results/virulence).

bt (directory): Directory in which BTyper3 deposits raw blast output files during Bt gene detection. BTyper3 creates this directory within the btyper3_final_results directory within your specified output directory (output_directory/btyper3_final_results/bt).

mlst (directory): Directory in which BTyper3 deposits raw blast output files during seven-gene MLST. BTyper3 creates this directory within the btyper3_final_results directory within your specified output directory (output_directory/btyper3_final_results/mlst).

panC (directory): Directory in which BTyper3 deposits raw blast output files during panC group assignment. BTyper3 creates this directory within the btyper3_final_results directory within your specified output directory (output_directory/btyper3_final_results/panC).

logs (directory): Directory in which BTyper3 deposits its log files for a run. BTyper3 creates this directory within the btyper3_final_results directory within your specified output directory (output_directory/btyper3_final_results/logs).
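As a quick way to inspect a finished run, the directory layout described above can be expressed with `pathlib`. This is a hedged sketch rather than a BTyper3 feature: the subdirectory names are taken from this page, while the function names `expected_paths` and `missing_subdirectories` are hypothetical. It also assumes, without confirmation from this page, that a subdirectory may be absent when the corresponding step was not run.

```python
from pathlib import Path

# Subdirectory names as listed on this page.
SUBDIRECTORIES = (
    "species",
    "subspecies",
    "geneflow",
    "typestrains",
    "virulence",
    "bt",
    "mlst",
    "panC",
    "logs",
)


def expected_paths(output_directory):
    """Map each subdirectory name to its path under btyper3_final_results."""
    results = Path(output_directory) / "btyper3_final_results"
    return {name: results / name for name in SUBDIRECTORIES}


def missing_subdirectories(output_directory):
    """List the subdirectory names that do not exist on disk."""
    paths = expected_paths(output_directory)
    return [name for name, path in paths.items() if not path.is_dir()]
```

For example, `missing_subdirectories("my_output")` returns the names of any listed directories not found under my_output/btyper3_final_results, where "my_output" stands in for whatever you passed to --output.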
