v3.3.3
- Fixed a warning related to the BLASTp `--seqidlist` parameter. For BLAST>=2.9, the TXT file with the sequence IDs is converted to binary format with `blastdb_aliastool`.
- The `Bio.Application` modules are deprecated and might be removed from future Biopython versions. Modified the function that calls MAFFT so that it uses the `subprocess` module instead of `Bio.Align.Applications.MafftCommandline`. Changed the Biopython version requirement to >=1.79.
- Added a `pyproject.toml` configuration file and simplified the instructions in `setup.py`. The use of `setup.py` as a command-line tool is deprecated, and the `pyproject.toml` configuration file allows packages to be built and installed through the recommended method.
- Updated the Dockerfile to install chewBBACA with `python3 -m pip install .` instead of the deprecated `python setup.py install` command.
- Removed FASTA header integer conversion before running BLASTp. This was done to avoid a warning from BLAST related to sequence header length exceeding 50 characters.
- The seqids and coordinates of the CDSs closest to contig tips are stored in a dictionary during gene prediction to simplify LOTSC and PLOT5/3 determination (in many cases this reduces runtime by ~20%).
- Limited the number of values stored in memory while creating the `results_contigsInfo.tsv` and `results_alleles.tsv` output files to reduce memory usage.
- The data for the missing classes is now added to the FASTA and TSV files per locus, instead of storing the complete per-input data, to reduce memory usage.
- The data for novel alleles is saved to files to reduce memory usage.
- Fixed the in-frame stop codon count values displayed in the reports created by the SchemaEvaluator module.
- The `UniprotFinder` module now exits cleanly if the output directory already exists.
- Improved the information printed to stdout by the CreateSchema and AlleleCall modules, added comments, and changed variable names to better match the data being stored.
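
For readers migrating their own code away from `Bio.Align.Applications.MafftCommandline`, the switch to `subprocess` described above can be sketched roughly as follows. This is not chewBBACA's actual function: the names `build_mafft_command` and `run_mafft`, and the choice of the `--auto` and `--quiet` MAFFT options, are assumptions made for illustration.

```python
import shutil
import subprocess


def build_mafft_command(input_fasta, mafft_path="mafft"):
    """Return the argument list for a basic MAFFT run (hypothetical helper)."""
    # `--auto` lets MAFFT pick an alignment strategy; `--quiet` suppresses
    # the progress messages that MAFFT writes to stderr.
    return [mafft_path, "--auto", "--quiet", input_fasta]


def run_mafft(input_fasta, output_fasta, mafft_path="mafft"):
    """Align `input_fasta` with MAFFT and write the result to `output_fasta`."""
    # Fail early with a clear message if the executable is not on PATH.
    if shutil.which(mafft_path) is None:
        raise FileNotFoundError(f"Could not find executable: {mafft_path}")
    command = build_mafft_command(input_fasta, mafft_path)
    # MAFFT prints the alignment to stdout, so redirect stdout to the file.
    with open(output_fasta, "w") as outfile:
        subprocess.run(command, stdout=outfile, stderr=subprocess.PIPE, check=True)
    return output_fasta
```

Passing the command as a list avoids shell quoting issues, and `check=True` raises `subprocess.CalledProcessError` if MAFFT exits with a non-zero status.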
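
The `--seqidlist` fix above relies on converting the text list of sequence IDs to binary format with `blastdb_aliastool`. A minimal sketch of that step is shown below, assuming the tool's `-seqid_file_in` and `-seqid_file_out` options. The helper names, and the way the BLAST version is parsed from the output of `blastp -version`, are illustrative assumptions rather than the code used by chewBBACA.

```python
import re
import subprocess


def parse_blast_version(version_text):
    """Extract (major, minor, patch) from text such as 'blastp: 2.9.0+'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError("Could not find a version number in the given text.")
    return tuple(int(value) for value in match.groups())


def build_seqidlist_command(txt_file, binary_file, tool_path="blastdb_aliastool"):
    """Return the argument list that converts a TXT seqid list to binary."""
    return [tool_path, "-seqid_file_in", txt_file, "-seqid_file_out", binary_file]


def convert_seqidlist(txt_file, binary_file, version_text):
    """Convert the list only for BLAST>=2.9; otherwise keep the TXT file."""
    if parse_blast_version(version_text) >= (2, 9, 0):
        command = build_seqidlist_command(txt_file, binary_file)
        subprocess.run(command, check=True, capture_output=True)
        return binary_file
    return txt_file
```

The returned path is the file that would then be passed to the BLASTp sequence ID list parameter.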
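
The dictionary of CDSs closest to contig tips mentioned above could look something like the sketch below. The record layout `(contig_id, cds_id, start, stop)`, the `"first"`/`"last"` keys, and the function name are hypothetical; the actual structure used during gene prediction may differ.

```python
def closest_to_tips(cds_records):
    """Map each contig to the CDSs closest to its 5' and 3' ends.

    `cds_records` is an iterable of (contig_id, cds_id, start, stop) tuples
    (an assumed layout, chosen for illustration).
    """
    tips = {}
    for contig_id, cds_id, start, stop in cds_records:
        record = (cds_id, start, stop)
        # The first CDS seen for a contig is initially closest to both tips.
        entry = tips.setdefault(contig_id, {"first": record, "last": record})
        # Smallest start position: closest to the beginning of the contig.
        if start < entry["first"][1]:
            entry["first"] = record
        # Largest stop position: closest to the end of the contig.
        if stop > entry["last"][2]:
            entry["last"] = record
    return tips


records = [
    ("contig1", "cds1", 120, 450),
    ("contig1", "cds2", 30, 200),
    ("contig1", "cds3", 900, 1500),
    ("contig2", "cds1", 10, 300),
]
tips = closest_to_tips(records)
```

With a lookup like this, checking whether a match lies near a contig tip only requires comparing against two stored CDSs per contig instead of scanning every predicted CDS again.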