@meren meren released this Dec 28, 2016 · 4133 commits to master since this release


We are happy to finally announce a new stable release of anvi'o.

Between August and December we made more than 500 commits to the repository that resulted in ~8,000 new lines of Python and JavaScript code, and the removal of ~3,500. These changes introduced fixes for many small bugs, and new features.

We originally intended to release this before XMAS, you know, as a "thank you" present from the anvi'o developers to the users of anvi'o, but that didn't work out. Still, we couldn't let that stop us from using the holiday theme!

image

OK. Here is an incomplete list of things that are worth mentioning.

If you are updating your anvi'o installation, please don't forget to refresh your browser window by pressing CTRL+SHIFT+R, or COMMAND+SHIFT+R so the cached browser files from the old version are updated as well.

A new pangenomic workflow

Until this version the anvi'o pangenomic workflow has mostly been exploiting the visualization capabilities of anvi'o. In this release, we go beyond that and offer you a much more powerful workflow with summary (#412), and protein cluster inspection (248fb14) capabilities.

An up-to-date user tutorial for our new pangenomic workflow is here:

http://merenlab.org/2016/11/08/pangenomics-v2/

We thank Jarrod J Scott for following the development very closely and testing new pangenomic features. His feedback has been very helpful in identifying problems and resolving them quickly.

Anvi'o contigs dbs can has NCBI COGs

Functional annotation continues to be a pain. Although we have been very excited about EggNOG mapper, and wrote hundreds of lines of code to make things easier for anvi'o users (i.e. a driver class and a client that can import results into an anvi'o contigs database), that endeavor ended up being a failure (due to some technical issues outside of anvi'o).

To make at least a simple option available, we decided to make anvi'o able to deal with NCBI COGs. Yes, they are not being updated any longer (because why should we have nice things anyway), but they still do quite a sufficient job. Here is a short document that describes how anvi'o contigs databases can be annotated with COGs:

http://merenlab.org/2016/10/25/cog-annotation/

A new, much less painful way to install anvi'o

Homebrew. Enough said!

Please see our updated installation instructions:

http://merenlab.org/2016/06/26/installation-v2/

New programs and scripts

  • anvi-gen-genomes-storage: Creates a new class of anvi'o databases, "genomes storage", for pangenomics.
  • anvi-display-pan: Just like anvi-interactive, but for anvi'o pan profiles.
  • anvi-setup-ncbi-cogs: Sets up NCBI COGs: downloads the data from the NCBI, reformats some files, and generates search databases for NCBI's blastp or DIAMOND.
  • anvi-run-ncbi-cogs: Uses your own resources to perform search, and stores results into the contigs database.
  • anvi-export-functions: Exports all the functional annotations from an anvi'o contigs database so you can import them using anvi-import-functions.
  • anvi-export-gene-calls: Exports gene calls from an anvi'o contigs database so you can use them later with the --external-gene-calls parameter when generating a new contigs database.
  • anvi-delete-collection: Removes a collection from an anvi'o profile or pan database.

And here are some scripts:

  • anvi-script-FASTA-to-contigs-db: For the lazy to go from a FASTA file to an anvi'o contigs database. You should NEVER use this and do it properly. But I use it all the time. It's OK. We can share the shame.
  • anvi-script-get-collection-info: Cool script that tells you completion and redundancy estimates for a bin or the entire contigs database.
  • anvi-script-snvs-to-interactive: Takes an anvi'o SNVs profile by anvi-gen-snv-profile, and generates an output for anvi-interactive in --manual mode. It is kinda cool.
  • anvi-script-upgrade-contigs-db-v6-to-v7, anvi-script-upgrade-contigs-db-v7-to-v8, anvi-script-upgrade-profile-db-v16-to-v17: You know what they are.
  • anvi-script-run-eggnog-mapper: It doesn't work. Because it calls eggnog-mapper, and then eggnog-mapper starts a server for fast search (which is awesome), and then when it is done it sends a SIGKILL, making every parent process die abruptly. So you get your results, but you die before you have a chance to store your results into the database. You can hack eggnog-mapper to not do that, but then you would have to kill every instance of eggnog-mapper manually later to clean the memory. Sigh.
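
For the curious, here is a minimal sketch of one general way a parent process can shield itself from a child that signals its own process group. It is not what anvi'o does, and it assumes (we have not verified this) that the kill is sent to the process group rather than to specific PIDs. The child command below is a hypothetical stand-in, not eggnog-mapper, and `run_isolated` is an illustrative name. POSIX only.

```python
import signal
import subprocess
import sys


def run_isolated(cmd):
    """Run cmd in its own session so group-wide signals it sends stay contained."""
    # start_new_session=True makes the child call setsid() before exec,
    # which puts it into a brand new process group of its own.
    return subprocess.run(cmd, start_new_session=True)


# Hypothetical stand-in for a misbehaving tool: it sends SIGKILL to
# every process in its own process group, itself included.
child = [
    sys.executable,
    "-c",
    "import os, signal; os.killpg(os.getpgrp(), signal.SIGKILL)",
]

result = run_isolated(child)

# The parent is still alive at this point. A negative return code means
# the child was terminated by the signal with that number.
print(result.returncode == -signal.SIGKILL)
```

If the child signals explicit parent PIDs instead, this kind of isolation would not help.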

Upgrading the way we deal with single-copy gene collections

We now maintain two single-copy gene collections. One from Campbell et al., which serves for the bacterial domain, and one from Rinke et al., which serves for the archaeal domain (#401). The new HMM infrastructure (#402) and the new learning module in anvi'o make it possible to do some other interesting things going forward.
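
To give a rough idea of how a single-copy gene collection turns into completion and redundancy estimates, here is a small sketch. The definitions used (completion as the share of collection genes seen at least once, redundancy as the share of extra copies) are one common convention and an assumption on our part, not a copy of the anvi'o code. The function name and gene names are made up for illustration.

```python
from collections import Counter


def completion_and_redundancy(hits, collection):
    """Return (percent completion, percent redundancy) for a list of gene hits.

    hits: gene names found in a bin, one entry per hit (repeats allowed).
    collection: set of gene names in the single-copy gene collection.
    """
    if not collection:
        raise ValueError("the single-copy gene collection is empty")

    # Count hits per gene, ignoring anything outside the collection.
    counts = Counter(gene for gene in hits if gene in collection)
    total = len(collection)

    # Genes seen at least once contribute to completion.
    completion = 100.0 * len(counts) / total
    # Every copy beyond the first contributes to redundancy.
    redundancy = 100.0 * sum(n - 1 for n in counts.values()) / total
    return completion, redundancy


# Hypothetical collection of four genes, and a bin where gene_B occurs twice.
collection = {"gene_A", "gene_B", "gene_C", "gene_D"}
hits = ["gene_A", "gene_B", "gene_B", "gene_C"]

print(completion_and_redundancy(hits, collection))  # (75.0, 25.0)
```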

More intuitive nomenclature

We have been using "portion-covered" to describe the proportion of nucleotide positions in a given contig that are covered by at least one read. The proper name for this is 'detection', since that is exactly what it measures. Well, that mistake is now fixed (#428).
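
As a toy illustration of the definition, here is a sketch that computes detection from a list of per-nucleotide coverage values. The function name and the coverage values are made up for the example. This is not how anvi'o stores or computes things internally.

```python
def detection(coverage):
    """Fraction of positions that are covered by at least one read."""
    if not coverage:
        return 0.0
    covered = sum(1 for depth in coverage if depth > 0)
    return covered / len(coverage)


# Per-nucleotide coverage of a hypothetical 10 nt contig.
coverage = [0, 0, 3, 5, 5, 2, 0, 1, 1, 0]

print(detection(coverage))             # 0.6
print(sum(coverage) / len(coverage))   # mean coverage: 1.7
```

Note that detection and mean coverage answer different questions: six of the ten positions see at least one read, regardless of how deep the coverage is at those positions.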

Prettier interfaces

We know. You are asking yourself "how much prettier can anvi'o get?". Right?! Well, thank you. As you know, besides the standard one, the anvi'o interactive interface has multiple modes for specialized operations, such as the pan mode for pangenomics, or the manual mode for ad hoc visualization tasks. Now the interface tells you which mode you are in, because that's the right thing to do:

image

Also if you have been using anvi'o for a while, we hope you will appreciate what we did with the left panel.

Other notable bug fixes and improvements

  • Tom finally won, and we dropped the requirement to have the name 'contig' in the A1:1 column of additional data files (#425).
  • Improved default values for the interface to make everyone's life much easier (#422).
  • anvi-get-sequences-for-hmm-hits can now return AA sequences with the --get-aa-sequences flag (#400).

Features we have in mind for the next release

  • We are aware of the performance issues with anvi-profile. Our next major goal is to take care of it.
  • We are also planning to leave Python 2 behind, and completely switch to Python 3.
  • We are planning to complement anvi-run-ncbi-cogs with a new program, anvi-run-pfams.
  • We are hoping to expand the new pangenomic workflow with carefully crafted analytical approaches that will deliver sweets from phylogenomics.
  • Finally we want to focus on anvi'server, and make it ready for prime-time.

Although we have no evidence that anyone is reading these release notes, we thank you very much for your patience with us. Please visit the installation page for instructions.