We rolled out a framework over the last two releases to deal with issues arising from the introduction of the new BLAST database format and from databases created without the -parse_seqids option of makeblastdb. This took the form of SequenceServer trying to detect old-format and non-parse_seqids databases on startup and forcing users to upgrade the databases to the new format or rebuild them with the -parse_seqids option. We are changing our approach. SequenceServer will still detect old-format and non-parse_seqids databases on startup, but it will no longer force users to upgrade or rebuild BLAST databases. Instead, it will display a warning for each such database and launch as before. For non-parse_seqids databases, SequenceServer will additionally disable FASTA download links on the results page.
To rebuild BLAST databases and enable FASTA download links, use sequenceserver -m.
We had previously noticed that mixing old and new format databases can cause BLAST searches to fail. We have not been able to replicate these results with BLAST+ 2.12.0. If you encounter any such issue, please try upgrading old format databases with sequenceserver -m and report the issue.
Upgrading or rebuilding databases can be slow for very large databases and can fail if sequence ids are longer than 50 characters. The latter is a limitation of BLAST. sequenceserver -m informs users of these limitations and also advises backing up the databases before reformatting (inspired by Lukasz Sobala's experience).
Thanks to all the users who tried the initial approach and reported issues. Most of them are documented below. Special thanks to @Shellfishgene for being one of the first to critique the initial approach (#513).
The detection of old and new format databases has been revised to use the %v option of blastdbcmd -list_outfmt in BLAST+ 2.12.0. This is more robust than our initial approach which relied on file extensions and would miss some cases.
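As a rough illustration of version-based detection, here is a minimal Python sketch. It is not SequenceServer's actual implementation (which is written in Ruby). It assumes the listing was produced by something like blastdbcmd -list DIR -list_outfmt with a format of %f, a tab, then %v, so that each line holds a database path and its format version. The function names, the sample paths, and the choice of 5 as the current version are illustrative assumptions.

```python
def parse_db_versions(listing):
    """Map each database path to its format version.

    Assumes one database per line, with the path and the version
    separated by a tab (hypothetical layout chosen for this sketch).
    """
    versions = {}
    for line in listing.splitlines():
        line = line.strip()
        if not line:
            continue
        path, _, version = line.rpartition("\t")
        if not path or not version.isdigit():
            continue  # skip lines that do not match the assumed layout
        versions[path] = int(version)
    return versions


def old_format_dbs(versions, current=5):
    """Return the paths whose format version is older than `current`."""
    return sorted(path for path, v in versions.items() if v < current)


# Made-up sample listing, not real blastdbcmd output.
sample = "/db/genome\t5\n/db/proteins\t4\n"
print(old_format_dbs(parse_db_versions(sample)))
```

Reading the version reported by BLAST itself avoids guessing from file extensions, which is the weakness of the earlier approach described above.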
Database aliases created using blastdb_aliastool are now ignored by the database reformatting utility. This prevents a bug where the reformatting utility could 'corrupt' the larger database that was referenced by the alias (credit: Lukasz Sobala, #521; special thanks for providing a wonderfully detailed bug report).
Detection of non-parse_seqids option would fail in several cases (see issues #511, #512, #519). We have fixed the reported issues and an unreported issue with multi-part databases.
Attempting to reformat databases created without the -parse_seqids option using sequenceserver -m would end in the following error message: “Error: [makeblastdb] No sequences matched any of the taxids provided”. Fixed it.
SequenceServer should continue its startup routine after offering to create BLAST databases during the initial setup, but a programming error prevented that and it would instead quit after creating BLAST databases. Fixed it.
Fixed a rare issue where SequenceServer would classify a non-FASTA file as FASTA during initial setup (or when using sequenceserver -m) and offer to create BLAST databases from it (#567).
Fixed a rare issue where SequenceServer would fail to scan databases directory during startup (or when using sequenceserver -m) when running inside a docker container.
SequenceServer asks for an optional taxonomy id during database creation. However, a programming error accidentally made it compulsory (sorry!). We have made it optional again.
The plugin/extensions file (-r option) is now loaded before searching for BLAST binaries and databases so that these aspects of SequenceServer can also be customised through the extensions file for advanced use cases (credit: Elvin).
Ability to drag and drop a FASTA file to the search form was not working in Safari. Fixed it.
The form state was not correctly restored in some cases when using the browser's back button or the ‘Edit search’ link on the results page. Fixed it.
The database 'Select all' button would get stuck if a database was manually selected first and the 'Select all' was subsequently used (#562). Fixed it. Thanks to @MatthAlex for reporting the issue.
We have introduced an optional, experimental widget to display databases in tree format - very helpful if you have a long list of databases. The feature was contributed by @Bjoernsen (#520) and is similar to what was previously produced by lepbase (#307). See #520 for details. To activate the widget, add ":databases_widget: tree" to your SequenceServer config file (which is ~/.sequenceserver.conf by default). The tree widget additionally allows linking databases to an external page (how to do so will be documented on the website in due time).
Commas are now allowed in advanced params input so that multiple values can be provided to options such as -taxids (credit: Lukasz Sobala).
Added an entry for the very helpful -task option to the handy reference of command-line BLAST options included in the search form, accessed using the ? button next to the advanced params input field (#517).
Removed the twitter button from the footer and made the footer more compact.
On the results page we have reverted to showing the identity of the top HSP in the per-query hit table instead of the average identity of all HSPs for the matching database sequence (credit: Etienne Bucher, #506).
HSPs in Kablammo visualisation are no longer labelled if there are too many HSPs, as too many labels create clutter without adding much value (#518).
SequenceServer used to automatically hide the sidebar containing download links etc. if the BLAST search resulted in no hits. However, since the last few release candidates the sidebar also contains the links to edit the search or start a new one, so it is now always shown.
Ensure long query sequence ids wrap in the sidebar instead of shooting past the sidebar boundary (#571). Credit: @MatthAlex.
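To make the difference between the two numbers concrete, here is a small Python sketch. It assumes that the "top HSP" is the one with the highest bit score and that the average is a plain mean of per-HSP percent identities. SequenceServer's actual calculation may differ, and the Hsp tuple and function names are hypothetical.

```python
from collections import namedtuple

# Hypothetical minimal HSP record: bit score, identical positions, alignment length.
Hsp = namedtuple("Hsp", ["bit_score", "identity", "length"])


def percent_identity(hsp):
    """Percent of aligned positions that are identical."""
    return 100.0 * hsp.identity / hsp.length


def top_hsp_identity(hsps):
    """Identity of the highest-scoring HSP only (assumed meaning of 'top HSP')."""
    return percent_identity(max(hsps, key=lambda h: h.bit_score))


def average_identity(hsps):
    """Unweighted mean identity across all HSPs of a hit."""
    return sum(percent_identity(h) for h in hsps) / len(hsps)


# Made-up hit with one strong HSP and one weak, short HSP.
hsps = [Hsp(200.0, 95, 100), Hsp(50.0, 30, 50)]
print(top_hsp_identity(hsps))  # 95.0
print(average_identity(hsps))  # 77.5
```

In this made-up case a single weak secondary HSP pulls the average well below the identity of the best alignment, which is the kind of effect that showing the top HSP's identity avoids.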
The links to download FASTA and pairwise alignment of all hits are now disabled if the BLAST search results in no hits (#552).
We have introduced a new num_jobs setting. This is the number of concurrent BLAST searches that SequenceServer will run - the default value is 1. This is distinct from num_threads, which is passed to BLAST, and is the number of threads that each BLAST job will use.
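The relationship between the two settings can be sketched in Python as follows. This is an illustration only, not SequenceServer's job runner: the function names, the job tuples, and the injectable runner are assumptions made for the example, and blastn stands in for whichever BLAST program is used. The pool size plays the role of num_jobs, while num_threads is simply forwarded to each BLAST command.

```python
from concurrent.futures import ThreadPoolExecutor


def blast_command(query_file, db, num_threads):
    """Build a blastn command line; num_threads is handed to BLAST itself."""
    return ["blastn", "-query", query_file, "-db", db,
            "-num_threads", str(num_threads)]


def run_jobs(jobs, num_jobs, num_threads, runner):
    """Run at most `num_jobs` searches at a time.

    `runner` receives the command as a list; in real use it could be
    subprocess.run, here it is injectable so the sketch runs without BLAST.
    """
    def run_one(job):
        query_file, db = job
        return runner(blast_command(query_file, db, num_threads))

    with ThreadPoolExecutor(max_workers=num_jobs) as pool:
        return list(pool.map(run_one, jobs))  # results keep the input order


# Made-up jobs; " ".join only renders each command instead of executing it.
jobs = [("q1.fa", "/db/genome"), ("q2.fa", "/db/genome")]
results = run_jobs(jobs, num_jobs=1, num_threads=4, runner=" ".join)
print(results[0])
```

Under this model, up to num_jobs searches run at once and each may use up to num_threads threads, so the two values are worth choosing together with the number of available cores in mind.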
Made sure that SequenceServer writes default values of new configuration options to the configuration file for new users. Existing users can update their configuration file to see all possible configuration values by running sequenceserver -s.
The whichdb function, which can be used in link generators to determine which database a hit came from, was returning a CommandFailed error in some cases. Fixed it (credit: @jveera888, #529).
If you like to customise SequenceServer and use our docker image, there is now an option to build JS and CSS assets as part of the docker build step: just add --target=minify to your docker build command. Note that this requires BuildKit to be enabled (credit: Nathan Weeks).
Huge shout-out to @Iain-S for fixing several code style issues (JS and CSS), and revising our code linting framework (#531, #532).
Automatically check for incompatible databases on startup and prompt users to reformat them.
Reformatting databases now preserves taxonomy information embedded in the database (if any).
Add ability to use -taxids_map of makeblastdb during database creation. To use it place a '.taxids_map.txt' file next to the FASTA file.
Add 'Edit search' and 'New search' links to the report page (thanks to Tomas-Pluskal & TomMD for the push).
Add option to open BLAST results in a new tab.
The search form can now be cleared by reloading the page. This is relevant if you used the browser's back button or the 'Edit search' link and wanted to clear the form to start over.
Make it easier to pass command line arguments to Docker image. For example, number of BLAST threads can now be set as docker run ... wurmlab/sequenceserver sequenceserver -n 4 instead of docker run ... wurmlab/sequenceserver bundle exec bin/sequenceserver -d /db -n4 (yeah!)
Switch to using sequence id instead of accession for sequence retrieval. This fixes FASTA download for gnl|Morex|chr type sequence ids when using version 5 database (#475). Thanks to Eric Y for reporting the issue.
Ensure SequenceServer does not crash if it could not determine host IP, which can be the case if it is being run offline (#482). Thanks to Vladimir for reporting the issue.
Ensure error modal does not remove the rest of the content from the page.
JSON endpoints responded with content-type 'plain/text'. Change that to 'application/json'. Thanks to Richard Adams for reporting the issue.
Include hit title (the stuff after sequence id) in the summary table of hits per query. Of course, titles can be really long, so the text is truncated with ... when it overflows the table cell. The entire title is displayed in a tooltip. Feature requested by Niek Art.
Running sequenceserver -m will now automatically detect older V4 databases, and those created without the -parse_seqids option of makeblastdb, and offer to rebuild them. This works even if you deleted the original FASTA files. Database titles are preserved when rebuilding. However, taxonomy information in the database is unfortunately not preserved.
BLASTing a mix of older V4 and newer V5 databases causes an error. SequenceServer now catches this error and informs the user of it. Thanks to Massimiliano babbucci for reporting this issue.
The list of databases in the search form should be alphabetically sorted. This behaviour was lost in the rewrite leading to version 2.0 and has now been fixed. Thanks to Loraine Guéguen for reporting the issue.
Improve regular expression used to detect multi-part databases, so that databases that aren't multi-part are recognised and displayed in the interface (#465). Issue reported and fixed by Loraine Brillet-Guéguen.