@meren meren released this Sep 21, 2018 · 180 commits to master since this release

A minor anvi'o release with bug fixes

v5.2 is a minor update over v5 margaret. Release notes for margaret are here:


(please refer to the installation document for up-to-date installation instructions).

@meren meren released this Jun 25, 2018 · 466 commits to master since this release

A minor anvi'o release with bug fixes

v5.1 is a minor release over v5 margaret. Release notes for margaret are here:


(please refer to the installation document for up-to-date installation instructions).

@meren meren released this Jun 29, 2018 · 478 commits to master since this release

We are happy to announce a new version of anvi'o, "margaret".

After nearly 1,500 changes that introduced about 15,000 new lines to the anvi'o codebase and removed about 4,000 from it, the current version includes many fixes to big and small bugs, as well as new features. This page intends to give you a summary of the most notable changes that come with margaret.


The codename is a small tribute to Margaret Oakley Dayhoff, an American physical chemist, who is known as the founder of bioinformatics. Dayhoff developed the first programmable computer methods to compare protein sequences, and in 1965 published a book titled "Atlas of Protein Sequence and Structure", which is considered today the first textbook of bioinformatics. The codename was suggested by Mick Watson, and won the popular vote on Twitter. Dayhoff sadly died at the early age of 57 in 1983, before bioinformatics emerged as a distinct field. However, her astonishing contributions to life sciences, such as the development of essential approaches for protein sequence comparison and evolutionary tree construction, still constitute some of the most common approaches in our bioinformatics toolkit.

Your new disconcerting toy: GC-content overlaid on reference contexts

Metagenomic read recruitment often results in wavy coverage patterns in the reference context. This phenomenon, which can be attributed to three major sources, can result in up to an order of magnitude coverage difference for genes within the same contig. While we are kind enough to leave alone, in their blissful world, those who work solely with metagenomic short reads to quantify functions in metagenomes, we wanted to include something in this version of anvi'o so you can overlay GC-content change throughout your contigs to see whether the variation you observe in the context of some of your key genes is largely driven by GC-content or not:


For now this offers only a qualitative insight into the extent to which variation in coverage can be explained by deterministic factors that have nothing to do with the biology of your system, but it shows that more quantitative insights into this could be useful. We will think about this going forward, and we are open to your suggestions!
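To make the idea a bit more tangible, here is a minimal Python sketch of computing GC-content in sliding windows along a contig, which is the kind of signal one would overlay on coverage. The function name `gc_content_windows`, the window size, and the step are our own illustrative choices, and this is not how anvi'o itself computes or displays GC-content.

```python
def gc_content_windows(sequence, window=100, step=50):
    """Return (start, gc_fraction) tuples for sliding windows along a sequence.

    Hypothetical helper for illustration only; not part of anvi'o.
    """
    sequence = sequence.upper()
    results = []
    for start in range(0, max(len(sequence) - window, 0) + 1, step):
        chunk = sequence[start:start + window]
        if not chunk:
            break
        gc = sum(1 for base in chunk if base in "GC")
        results.append((start, gc / len(chunk)))
    return results


# A toy contig: an AT-rich half followed by a GC-rich half.
contig = "AT" * 100 + "GC" * 100
profile = gc_content_windows(contig, window=100, step=100)
for start, gc in profile:
    print(start, round(gc, 2))
```

Plotting such a profile next to per-position coverage is one simple way to eyeball whether the two move together.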

A new anvi'o workflow management system for serious anvians

This new version of anvi'o includes a new program anvi-run-workflow, which provides an interface to our new module that implements snakemake-based anvi'o workflows.

These workflows offer accessible, reproducible, and comprehensible solutions for complex analyses that may include hundreds of samples. We have been using anvi-run-workflow every day in our lab since it first appeared in our master repository, and we are happy to make its power available to you as soon as we could.


There will be an extensive tutorial very soon, but until then you can send your questions to Alon :)

Single-codon variants for a more powerful framework to study microbial population genetics

Anvi'o already could make sense of single-amino acid variants (SAAVs) in environmental metagenomes. But working with SAAVs was limiting our ability to infer and quantify neutral processes that may not result in changes in the amino acid sequence. We changed our design so that anvi-profile can now characterize single-codon variants (SCVs) when the --profile-SCVs flag is declared. We updated our reference manual for variability analysis to include new sections describing SCVs and SAAVs.

With SNVs, SCVs, and SAAVs, anvi'o v5, deserving of its codename, offers a robust framework to investigate population genetics of environmental microbes, while SCVs and SAAVs leverage our ability to tease apart evolutionary forces acting upon them. We hope you enjoy these new toys, and feel free to get in touch with us if you have questions or suggestions.
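Why do SCVs see more than SAAVs? The sketch below, which is only an illustration and not anvi'o code, classifies a codon change as synonymous or non-synonymous using the standard genetic code. A synonymous change is visible at the codon level but disappears once codons are translated into amino acids. The function name `classify_codon_change` is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(reference_codon, variant_codon):
    """Label a codon change as identical, synonymous, or non-synonymous."""
    reference_codon = reference_codon.upper()
    variant_codon = variant_codon.upper()
    if reference_codon == variant_codon:
        return "identical"
    if CODON_TABLE[reference_codon] == CODON_TABLE[variant_codon]:
        return "synonymous"
    return "non-synonymous"


# CTT and CTC both encode leucine: a codon-level variant with no amino acid change.
print(classify_codon_change("CTT", "CTC"))
# GAA (glutamate) to GTA (valine): visible at both the codon and amino acid level.
print(classify_codon_change("GAA", "GTA"))
```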

Visualize environmental variation on protein structures through the new Structure DB

Our efforts to push the boundaries of investigations of environmental variation within microbial populations reach a new level in this release with a brand new ability about which we are very excited: linking variation to predicted protein structures.

With the new structure database and associated workflows, anvi’o can predict the tertiary structure of genes identified from a contigs database using the Protein Data Bank. Then, it can directly overlay onto the predicted protein structures the variability data from your metagenomes in the form of SCVs and SAAVs. All of this is accomplished in just two new programs, anvi-gen-structure-database and anvi-display-structure.

We believe that this nexus between structural biology and metagenomics will elevate environmental metagenomics into the realm of biophysics, and enable investigations into evolutionary processes driving the diversity of proteins that could not be learned from sequence analyses alone.

With these new advances come two new dependencies to additional open-source software, for which we are very grateful: MODELLER and DSSP.

Here is a teaser from the new interactive anvi-display-structure interface:


We will soon make available an extensive tutorial to describe this workflow in detail. Until then, you can send your questions to Evan and Ozcan.

Computing average nucleotide identity for genomes in pangenomes

This release also includes significant improvements for our comparative genomics and pangenomics workflows.

One of these improvements is the inclusion of a new program, anvi-compute-ani, to calculate the average nucleotide identities across a given set of genomes, which can be automatically added into any anvi'o pangenome.

For instance, this is an anvi'o pangenome of the 31 Prochlorococcus isolates we played with in our recent paper:


And this is what you get when you run anvi-compute-ani:


Mike Lee had suggested this as an option a long time ago. We are happy to finally deliver this functionality, which uses pyANI as a backend, and we are thankful to its developers.
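If you are wondering what an "average nucleotide identity" boils down to, one simplified way to think about it is a length-weighted mean of the percent identities of aligned fragments between two genomes. The sketch below shows only that arithmetic. It is not pyANI's algorithm, and the function name and the toy numbers are made up for illustration.

```python
def average_nucleotide_identity(alignments):
    """Length-weighted mean identity over aligned fragments.

    `alignments` is a list of (aligned_length, percent_identity) tuples.
    A simplification for illustration; real tools also filter fragments
    and account for alignment coverage.
    """
    total_length = sum(length for length, _ in alignments)
    if total_length == 0:
        return None
    weighted = sum(length * identity for length, identity in alignments)
    return weighted / total_length


# Made-up fragments between two genomes.
hits = [(1000, 99.0), (500, 96.0), (500, 98.0)]
ani = average_nucleotide_identity(hits)
print(round(ani, 2))
```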

We updated our tutorial on pangenomics to describe intermediate steps.

A new approach to explore functional enrichment in pangenomes

This version of anvi'o also includes a new analytical framework to study functional enrichment in a given pangenome based on any arbitrary organization of genomes. You simply define how you would like to partition your genomes, whether based on a phylogenetic tree or a dendrogram that anvi'o computed from gene cluster distributions, and this new tool finds functions that are enriched in those groups (i.e. functions that are characteristic of a given group of genomes, and predominantly absent from genomes outside this group).

This is done by the new program anvi-get-enriched-functions-per-pan-group, and Alon extended our current tutorial on pangenomics with an extensive description of how it works.
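The notion of "characteristic of a group and predominantly absent outside of it" can be pictured with a small Python sketch. This is not the statistic the program uses (please see the tutorial for that). It only compares the fraction of genomes that carry a function inside and outside a group. All names, the toy data, and the 0.75 threshold are hypothetical, and both the group and its complement are assumed to be non-empty.

```python
def function_occurrence_by_group(genome_functions, groups, target_group):
    """For each function, return (fraction inside group, fraction outside group)."""
    inside = [g for g, grp in groups.items() if grp == target_group]
    outside = [g for g, grp in groups.items() if grp != target_group]
    all_functions = set().union(*genome_functions.values())
    report = {}
    for function in sorted(all_functions):
        frac_in = sum(function in genome_functions[g] for g in inside) / len(inside)
        frac_out = sum(function in genome_functions[g] for g in outside) / len(outside)
        report[function] = (frac_in, frac_out)
    return report


# Toy pangenome: four genomes partitioned into two groups.
genome_functions = {
    "G1": {"f_core", "f_A"},
    "G2": {"f_core", "f_A"},
    "G3": {"f_core"},
    "G4": {"f_core", "f_B"},
}
groups = {"G1": "A", "G2": "A", "G3": "B", "G4": "B"}

report = function_occurrence_by_group(genome_functions, groups, "A")
# Arbitrary threshold on the difference between the two fractions.
enriched = [f for f, (frac_in, frac_out) in report.items() if frac_in - frac_out >= 0.75]
print(enriched)
```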

Native functional annotation options += PFAMs

If you have your own functional annotations for your genes in an anvi'o contigs database, it is quite straightforward to import them via the anvi-import-functions program. Anvi'o v3 had made available another program to automate the annotation process, anvi-run-ncbi-cogs, if you were fine with NCBI's Clusters of Orthologous Groups. This release contains a new program, anvi-run-pfams, to use the collection of HMMs produced by the European Bioinformatics Institute based on UniProt.

Tree modification through the interactive interface

It has been a challenge to deal with phylogenetic tree operations in the anvi'o interactive interface. This version includes a significant code refactoring effort, which makes it possible to have new toys that we could not have before. These new toys include basic tree editing and storage abilities such as re-rooting trees, and rotating and collapsing branches. You can even see the branch support values in the mouse tab of the anvi'o interactive interface. These functions are now available to you through the menu that appears when you click a branch in the interactive interface while pressing the Command or Control key:


A new HMM collection to estimate completion of eukaryotic bins

Since its conception, anvi’o has included single-copy core gene collections to assess the completion and redundancy of bacterial and archaeal bins. This release includes a collection to estimate the completion of eukaryotic bins that Tom Delmont, who recently left us physically to join the ranks of Genoscope, curated from the BUSCO collection.


See Tom's blog post for details and preliminary benchmarks (also, if you are finding these release notes too boring to read, you can try reading this one too).

If you are recovering tiny eukaryotic organisms from your metagenomes please help us improve this collection by reporting back your experiences with it.
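For those new to the concept, here is a rough sketch of how completion and redundancy can be estimated from hits to a single-copy core gene collection: completion as the percentage of expected genes found at least once, and redundancy as the percentage found more than once. These definitions are a common simplification that we assume here for illustration. The function name and the gene names are hypothetical, and anvi'o's own calculation may differ in its details.

```python
from collections import Counter


def completion_and_redundancy(scg_names, hits):
    """Return (completion %, redundancy %) for a bin.

    `scg_names` lists the expected single-copy core genes and `hits`
    lists the gene names detected in the bin (one entry per hit).
    """
    counts = Counter(hit for hit in hits if hit in scg_names)
    found_once_or_more = sum(1 for gene in scg_names if counts[gene] >= 1)
    found_more_than_once = sum(1 for gene in scg_names if counts[gene] > 1)
    n = len(scg_names)
    return 100.0 * found_once_or_more / n, 100.0 * found_more_than_once / n


# Toy collection of four genes; the bin has gene_2 twice and lacks gene_3.
scgs = ["gene_1", "gene_2", "gene_3", "gene_4"]
bin_hits = ["gene_1", "gene_2", "gene_2", "gene_4"]
completion, redundancy = completion_and_redundancy(scgs, bin_hits)
print(completion, redundancy)
```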

Importing metagenome-level short-read taxonomy and the enhanced stacked bar data type

While our efforts on shotgun metagenomes largely focus on genome-resolved strategies, we acknowledge that one could learn a lot from taxonomic annotation of short reads as an additional layer of information. In this release anvi'o comes with a new program, anvi-import-taxonomy-for-layers, with a KrakenHLL parser that can import short-read-level taxonomic annotations into anvi'o profiles. Thanks to the improved data groups, different levels of taxonomy would be available in the layers tab,


And could be visualized easily:


The best part is that our improved stacked bar data type in this release then would allow you to order your metagenomes based on the relative abundance of any given taxon at any given taxonomic level in those metagenomes according to short reads (the example below orders metagenomes in the infant gut dataset from Sharon et al. based on the increasing relative abundance of Enterococcus):


Here we would like to assume that you're saying to yourself "the example is boring, but the concept has promise". Thanks! We agree.
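The ordering itself is conceptually simple, and a short Python sketch may help: compute the relative abundance of one taxon in each metagenome and sort the metagenomes by it. The sample names and read counts below are invented for illustration (they are not from the infant gut dataset), and `order_samples_by_taxon` is a hypothetical helper rather than an anvi'o function.

```python
def order_samples_by_taxon(read_counts, taxon):
    """Return sample names sorted by increasing relative abundance of `taxon`."""
    def relative_abundance(sample):
        counts = read_counts[sample]
        total = sum(counts.values())
        return counts.get(taxon, 0) / total if total else 0.0

    return sorted(read_counts, key=relative_abundance)


# Invented short-read counts per taxon for three metagenomes.
read_counts = {
    "day_01": {"Enterococcus": 10, "Other": 90},
    "day_02": {"Enterococcus": 60, "Other": 40},
    "day_03": {"Enterococcus": 30, "Other": 70},
}
ordered = order_samples_by_taxon(read_counts, "Enterococcus")
print(ordered)
```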


A year ago I listened to Jeff Gordon's talk at the University of Chicago, which he started with this African proverb:

If you want to go fast, go alone. If you want to go far, go together.

This concept applies to scientific endeavors so well. Speed is transient, and teamwork is essential for major contributions. Fortunately anvi'o has been becoming more and more of a team effort. But looking at our release notes, I don't know whether we could go any faster from v4 to v5 either. This release was a result of significant intellectual and coding contributions from Alon Shaiber, Evan Kiefl, as well as Özcan Esen, whose guidance and hard work continue to keep this operation together. Altogether, they spent hours and hours on big and small features and issues, with an enthusiasm that can be best justified by curiosity and the desire to contribute to your journey in data-driven microbiology. I, Meren, who gets to write this release note one more time, thank them wholeheartedly.


As a team we also thank Jarrod Scott, Alexandra Campbell, Samantha Atkinson, Carlos Ruiz, Bryan Merrill, Mike Lee, Varun Srinivasan, and many others who asked for features and reported bugs with their endless patience with us.

We hope you find v5 useful for your research, and we certainly hope you will not run into any bugs we probably left in the code 😇

If you are interested in anvi'o but don't know where to start, catch us in one of our free workshops, or find us on our Slack channel.

@meren meren released this Feb 24, 2018 · 1933 commits to master since this release

We are happy to announce a new version of anvi'o, "rosalind".

After nearly 300 changes that introduced about 15,000 new lines, and removed about 7,500 from the anvi'o codebase, the current version includes many bug fixes, as well as some new features. This release note intends to give you a summary of the most important changes.


The codename is a small tribute to Rosalind Franklin, the British biophysicist whose work, among other advances in life sciences, led to the discovery of the DNA double helix. This codename was inspired by Emily Crossette's suggestion, 'esther', "after Esther Lederberg, who co-developed a replica plating method with her husband but was largely unrecognized and discriminated against as a woman scientist". Emily explained that her suggestion was to "celebrate how far we have come as a scientific community and look to the future". Yes. We fortunately did not stay where we were, but we are still far from where we could have been. We remember these women and many others with respect and gratitude, and understand our responsibility to make sure the younger generations of scientists will not suffer from the kinds of discrimination to which their professors were subjected.

An elegant way to upgrade anvi'o databases

Upgrading anvi'o databases is now simpler than ever. With this change, the number of excuses you can use to not switch to the newest version of anvi'o goes from "0" to "-1". Just saying.

We now have a single program, anvi-migrate-db, that upgrades any anvi'o database to the latest version in one step.

As a part of this change we replaced all HDF5 files, which resulted in a tremendous performance gain (especially in pangenomic operations that required access to the genome storage database), and an up to 10-fold reduction in disk storage needs (for auxiliary data files). As a result, the following changes took place:

No more CONTIGS.h5. The content of this file is now a part of the CONTIGS.db (yay for less clutter).

No more SAMPLES.db (more on this down below).

Genome storage and auxiliary data files now have .db extensions rather than .h5, as they are now SQLite databases instead of HDF5 files.
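Since .db files are now plain SQLite databases, you can recognize them by the 16-byte header every SQLite file starts with. The sketch below creates a throwaway database with Python's standard sqlite3 module and checks that header. The file names and the table are placeholders, not actual anvi'o files or schema.

```python
import os
import sqlite3
import tempfile

# Every SQLite 3 database file begins with this 16-byte header string.
SQLITE_MAGIC = b"SQLite format 3\x00"


def is_sqlite_file(path):
    """Return True if the file at `path` starts with the SQLite header."""
    with open(path, "rb") as handle:
        return handle.read(16) == SQLITE_MAGIC


with tempfile.TemporaryDirectory() as tmp:
    # A placeholder database; the table is not part of any anvi'o schema.
    db_path = os.path.join(tmp, "EXAMPLE.db")
    connection = sqlite3.connect(db_path)
    connection.execute("CREATE TABLE example (key TEXT, value TEXT)")
    connection.commit()
    connection.close()

    # A plain text file for comparison.
    text_path = os.path.join(tmp, "notes.txt")
    with open(text_path, "w") as handle:
        handle.write("not a database")

    db_check = is_sqlite_file(db_path)
    text_check = is_sqlite_file(text_path)

print(db_check, text_check)
```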

Improvements in the pangenomic workflow

We made multiple very critical improvements in our pangenomic workflow. Here is a list of them:

These are the gene clusters you are looking for. Now it is possible to "select" gene clusters programmatically both from the command line, and from the anvi'o interactive interface through the combination of filters. We thank Ryan Bartelme for pushing us to improve our pangenomic workflow as he once again did in #668. Gene clusters that match these filters are highlighted immediately on the interface, and can be added into any bin/collection for summary:

Search gene clusters by function. We also now have the capacity to search for gene clusters that describe genes with functions of interest through the command line as well as the interactive interface:

Parallel alignment. After identifying all gene clusters in a given pangenome, anvi'o by default would use muscle or famsa to store multiple sequence alignments for amino acid sequences in each gene cluster. This was one of the most time consuming steps of the pangenomic workflow. With v4, anvi'o uses as many cores as you wish anvi'o to use to parallelize amino acid alignments per gene cluster. It changes a lot.

Forced synteny. Gene clusters in a pangenome are by default organized based on their distribution across genomes (so that is the dendrogram in the center). However, with this version there are additional ways to order them, including ordering them by "synteny". In this forced organization you get to choose one of the genomes in your analysis from the "item orders" combo box, which tells anvi'o that you wish to order all gene clusters in your pangenome based on the order of genes in that genome. We found it to be an efficient way to study missing genomic loci, and other not-so-straightforward-to-spot phenomena.

Everything is better in color. Arguably, one of the most important improvements to the pangenomic workflow was the addition of an amino acid alignment conservancy coloring algorithm. This was done in #732 by Mahmoud Yousef, who is currently a second year Computer Science student at the University of Chicago. Mahmoud also very kindly wrote a blog post to explain the details of this algorithm with examples: http://merenlab.org/2018/02/13/color-coding-aa-alignments/.

Gene popups. Now you can click gene caller ids next to the amino acid sequence alignment in inspection pages, and enjoy these functional popups to access any information (#680):


Cleaner terminology. After consulting with the community, we replaced all instances of 'protein clusters' in our pangenomic workflow with 'gene clusters'.
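To picture what the forced synteny ordering mentioned above does, here is a small Python sketch: gene clusters are listed in the order their genes appear in one chosen genome. Placing clusters that are absent from that genome at the end is our assumption for this illustration, and the function and the toy identifiers are hypothetical rather than taken from anvi'o.

```python
def order_gene_clusters_by_synteny(reference_gene_order, gene_to_cluster, all_clusters):
    """Order gene clusters by where their genes occur in a reference genome."""
    ordered, seen = [], set()
    for gene in reference_gene_order:
        cluster = gene_to_cluster.get(gene)
        if cluster is not None and cluster not in seen:
            ordered.append(cluster)
            seen.add(cluster)
    # Assumption: clusters with no gene in the reference genome go last,
    # keeping their original relative order.
    ordered.extend(cluster for cluster in all_clusters if cluster not in seen)
    return ordered


# Toy example: genes g1..g3 as they occur along the chosen genome.
reference_gene_order = ["g1", "g2", "g3"]
gene_to_cluster = {"g1": "GC_3", "g2": "GC_1", "g3": "GC_3"}
all_clusters = ["GC_1", "GC_2", "GC_3"]

synteny_order = order_gene_clusters_by_synteny(
    reference_gene_order, gene_to_cluster, all_clusters
)
print(synteny_order)
```

Gaps in such an ordering are what make missing genomic loci easier to spot.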

Metapangenomics: linking pangenomes and metagenomes

Anvi'o comes with powerful analytical tools to study pangenomes and metagenomes. Now you can take things one step further with the same ease-of-use.

We define metapangenomics as the outcome of the analysis of pangenomes in conjunction with the environment where the abundance and prevalence of gene clusters and genomes are recovered through shotgun metagenomes. This version includes a new program, anvi-meta-pan-genome, that brings the power of metapangenomics into a single command line. Please read our paper on the Prochlorococcus metapangenome to see how this concept could apply to your research.

Improvements in the interactive interface

This release also includes multiple notable improvements in the interactive interface.

The 'max coverage' fix we all needed but didn't know. Inspection pages are great to investigate coverage data and single-nucleotide variants at single-nucleotide resolution. However, it was not easy to make visual sense of data when coverage values differed dramatically between samples, or when short but non-specific mapping pushed maximum coverage values too high to make sense of the actual population coverage in the context of long contigs. In v4 you will see additional buttons in the inspection pages to mitigate these kinds of visual imperfections. Here is some action for you skeptics:


Descriptions tab gets 1-up. One of the most useful features of the interactive interface is the "Descriptions" tab. Yes, we know you are not using it, but you should. Here is an example to see why you should use it (just wait until the page loads, and see all the information that will show up in the right panel): https://anvi-server.org/merenlab/dwh_o_desum. The description tab is extremely useful to take notes and store them in a profile database to remember later. With this new version, you will be able to point to an item (#715), which lets the reader see where it is on the display by highlighting it, or inspect it by clicking 'inspect':


Gene mode: a new, highly-resolved interactive mode to study genome bins

This is yet another way for you to examine your data in high-resolution.

We added a flag to anvi-interactive: --gene-mode. When you use this flag along with a collection and a bin name, it allows you to load the interactive interface in the "gene mode". In this mode every item is a gene, instead of a contig, and you can see the coverage, detection, non-outlier coverage, and non-outlier standard deviation of coverage statistics per gene, independently. You can use these data to order the genes, and order the samples. Nucleotide-level coverages, gene sequences, and even gene functions can also be explored in this mode. This allowed us to easily recognize genes that recruit a lot of non-specific mapping, and identify hyper-variable regions in our genomes. One can also search for genes with certain functions, and see their coverages, and the coverage of the genes that are next to them.
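As a rough illustration of what per-gene statistics of this kind can look like, the sketch below takes nucleotide-level coverage values for one gene and computes mean coverage, detection (the fraction of positions covered at least once), and a mean over "non-outlier" positions. The outlier rule used here (median absolute deviation with a 2.5 cutoff) is our assumption for the example, not a statement of how anvi'o defines non-outliers, and the function name is hypothetical.

```python
from statistics import mean, median


def gene_coverage_stats(coverage, mad_threshold=2.5):
    """Summarize nucleotide-level coverage values of a single gene."""
    med = median(coverage)
    # Median absolute deviation: a robust measure of spread.
    mad = median(abs(c - med) for c in coverage)
    non_outliers = [c for c in coverage if abs(c - med) <= mad_threshold * mad]
    return {
        "mean_coverage": mean(coverage),
        "detection": sum(1 for c in coverage if c > 0) / len(coverage),
        "non_outlier_mean_coverage": mean(non_outliers),
    }


# Ten positions: mostly ~10X, one spike from non-specific mapping, one gap.
coverage = [10, 10, 12, 11, 9, 10, 200, 10, 0, 10]
stats = gene_coverage_stats(coverage)
print(stats)
```

In this toy gene the plain mean is inflated by a single spike, while the non-outlier mean stays close to the coverage of most positions.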

Please refer to the help menu for the interactive interface (via anvi-interactive -h or here) to find out more about this mode.

We are excited about this new feature, and we plan to expand it in future versions of anvi'o. If you have any suggestions/complaints/compliments please leave a comment in this issue: #754. We will soon put a tutorial for this mode online, so stay tuned!

A new and elegant way to extend anvi'o displays: additional data tables

We made a major change in our design to simplify the way various data for items and layers can be imported into anvi'o profile or pan databases, and managed. This change opens doors for endless possibilities to manipulate additional data streams through interactive, command line, and application programmer interfaces. Please see a detailed description on this new framework here: http://merenlab.org/2017/12/11/additional-data-tables/.

Other improvements

Optional noise cutoffs for HMMs. This has been a long standing issue (detailed in #498). The current version allows user-defined noise cutoff terms and makes it easier for anyone who wishes to make anvi'o use their own HMM collection.

Anvi'o vignettes. All anvi'o programs, their categories, parameters, and help: http://merenlab.org/software/anvio/vignette/. This is big, guys.

Variability performance improvement. Anvi'o now relies on pandas (not the animal, sadly, but the library) to take care of variability operations. While this will not impact user experience much, the code is much more elegant now and we wanted you to know it. See the Pull Request at #660 for more details.

Anvi'o disco mode [ON]. We heard more than once that people do not realize that they need to click the 'Draw' button in the interface if there is no default state to load and draw everything automatically (#739). So, disco:


New scripts and programs

We have some new programs that come with this version. Click on their links to learn more about them.


We know that developing anvi'o would have been much less fun without its enthusiastic and engaged users. We are thankful for those, including Bryan Merrill, Rika Anderson, Mike Lee, Marta Royo-Llonch, Xabier Vázquez-Campos, Emily Crossette, Varun Srinivasan, Julie Reveillaud, Alban Mathieu, and others, who help us improve anvi'o with their science, patience, issue reports, and suggestions.

We are also thankful to our users who share their experiences and make anvi'o more accessible to the community, such as Elaina Graham, who recently wrote about importing GhostKOALA/KEGG annotations into anvi'o, and Bryan Merrill, who shared his experience with importing VirSorter annotations into anvi'o to study phages.

@meren meren released this Oct 5, 2017 · 2831 commits to master since this release

We are happy to announce a new version of anvi'o, "Eden".

After more than 300 changes that introduced about 6,000 new lines, and removed about 2,250 from the anvi'o codebase, the current version includes many bug fixes, as well as some new features. This release note intends to give you a summary of the most important changes.


The codename, which was suggested by @watsonar and won the popular vote, is to mark the arrival of the newest honorary member of our lab. Despite her very young age which doesn't even round to a positive integer, she managed to tip the scale of the gender diversity in our lab for the better. We thank Alon and Rebecca for allowing us to witness essential beauties of life through their happiness.

A new program: anvi-display-contigs-stats

Thanks to @ozcan's recent efforts, anvi'o now can give you basic stats about your contigs databases. Using anvi-display-contigs-stats you can generate insights from one or more contigs databases. Beyond obvious uses, we hope it will be useful to interactively compare different assemblers on the same dataset, or multiple genomes for each of which you have a contigs database.

Since a basic framework is now in place for such comparisons, we will be looking forward to hearing suggestions from anvi'o users to improve it further at every chance.

Here is what you see when you run it on the contigs database of our FMT study:

anvi-display-contigs-stats FMT-CONTIGS.db


And this is what you see when you run it on a bunch of single genomes:

anvi-display-contigs-stats c0328-Microgenomates.db \
                             c0319-Microgenomates.db \
                             c0091-Candidate_CPR3.db \
                             c0205-Parcubacteria.db \
                             c0661-Parcubacteria.db \


We hope you try it on your current contigs databases, and let us know about your suggestions.

A new versioning approach

We will no longer follow the standards of semantic versioning with anvi'o releases. The last anvi'o version was v2.4.0, this one v3, and the next one will likely be v4, unless there is an absolutely minor change that will not require you to update your v3 installation.

Why leave the field standards? Anvi'o is an essential tool for us to do science, and we know of multiple other groups besides ours that also use this platform quite rigorously to go after their own questions. Since this is not a media player, which would continue to play your favorite shows even if you don't keep it up-to-date, and since the excellence of everything we do with anvi'o depends on its accuracy, we want our users to update their installations every time there is a new version. When this is the case, the conventional versioning of software becomes rather irrelevant, and somewhat confusing. We are aware of the fact that installing and updating software could be quite a frustrating task, and we do our best to improve those steps for you. If you have any questions or suggestions, please send them to the anvi'o discussion group.

Making a release is quite a painful process, and I personally hate it passionately as it takes at least a full day from my life. We are only doing it when there is a need for it. So, please keep your anvi'o up-to-date while @ozcan, myself, and other developers and contributors do their best to try to keep it bug free for you :)

Bug fixes, new anvi'o programs, and new flags

This version includes a large number of bug fixes and minor improvements to available programs with additional flags.

New programs that became available with this release include anvi-delete-state, anvi-display-contigs-stats, and anvi-export-samples-db.

We also added new scripts to the anvi'o distribution, such as anvi-script-add-default-collection (to add a quick 'default' collection to access all splits in a profile database), and anvi-script-filter-fasta-by-blast (to remove weak hits from BLAST search outputs).

Some of the new flags we added to our existing programs include --return-codon-frequencies-instead, to get codon frequencies instead of amino acid frequencies from BAM files via anvi-get-aa-frequencies; --items-order, to explicitly provide an items order to anvi-interactive; --max-num-genes-missing-from-bin and --min-num-bins-gene-occurs, which remarkably improve the ability of anvi-get-sequences-for-hmm-hits to filter weak genomes or genes for better phylogenomic analyses (thanks to Ryan Bartelme's suggestions); and --align-with, so we can optionally use FAMSA instead of muscle. In our experience FAMSA works exceptionally better than muscle to align sequences within our protein clusters. We heard about FAMSA thanks to Antonio Fernandez-Guerra.
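The filtering idea behind the two gene and bin flags can be sketched in a few lines of Python. This is only our reading of the flag names: genes that occur in too few bins are dropped first, then bins that miss too many of the remaining genes are dropped. The order of these two steps, the function name, and the toy data are assumptions made for illustration. Please check the help menu of the program for its actual behavior.

```python
def filter_bins_and_genes(hits, genes, max_missing_per_bin, min_bins_per_gene):
    """Filter weak genes and weak bins before a phylogenomic analysis.

    `hits` maps each bin name to the set of genes found in it.
    """
    # Keep genes that occur in at least `min_bins_per_gene` bins.
    kept_genes = [
        gene for gene in genes
        if sum(gene in found for found in hits.values()) >= min_bins_per_gene
    ]
    # Keep bins that miss at most `max_missing_per_bin` of the kept genes.
    kept_bins = [
        bin_name for bin_name, found in hits.items()
        if sum(gene not in found for gene in kept_genes) <= max_missing_per_bin
    ]
    return kept_bins, kept_genes


# Toy HMM hits for three bins and three genes.
genes = ["gene_A", "gene_B", "gene_C"]
hits = {
    "bin_1": {"gene_A", "gene_B", "gene_C"},
    "bin_2": {"gene_A", "gene_B"},
    "bin_3": {"gene_A"},
}
kept_bins, kept_genes = filter_bins_and_genes(
    hits, genes, max_missing_per_bin=0, min_bins_per_gene=2
)
print(kept_bins, kept_genes)
```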

Contigs database version bump

We realized that were. You will be able to update your existing contigs databases without any pain via the upgrade script anvi-script-upgrade-contigs-db-v8-to-v9.

Genome storage version bump

We extended the functionality of anvi'o genome storages that are used for pangenomic workflows. They now also keep a copy of every gene they describe in DNA alphabet. Unfortunately this also requires an update, which can be done via anvi-script-upgrade-genomes-storage-v3-to-v4.

Thank you for your interest in anvi'o!

Please find anvi'o tutorials and installation instructions here:


@meren meren released this Jul 26, 2017 · 3156 commits to master since this release

We are happy to announce the new version of anvi'o, "Pyrenees".

After 350 changes in the codebase that introduced more than 4,500 lines of code and removed about 9,000, the current version includes many bug fixes, as well as some important additions to the repository. This release note will give you a summary of the most important changes.


The codename is to honor our friend and colleague, Tom Delmont, who is going to continue working with us from France. The Pyrenees is a mountain range in southwest Europe between France and Spain, where Tom spent most of his life.

Single-amino acid variants (SAAVs)

Since its very first version, anvi'o has been providing you with one of the most comprehensive frameworks to make sense of single-nucleotide variants (SNVs). Here is a tutorial for skeptics.

Although it has been in anvi'o for more than a year, we are very excited to officially announce the anvi'o workflow to investigate single-amino acid variants. More resources on SAAVs will soon be available, and we will keep you posted. But if you have been using the anvi-gen-variability-profile program to characterize single-nucleotide populations in your population genomes, all you need to do now is to add --engine AA to get single amino acid profiles.

Fluency in phylogenomics

Anvi'o now can speak phylogenomics. Here you will find an extensive tutorial with reproducible examples:


You now can do phylogenomic analyses in anvi'o for metagenomic bins or pangenomes thanks to others who pushed us for it. Luke McKay of Montana State University asked for gene concatenation for HMM hits in genomes from metagenomes, and we delivered it (47fd334, b9780a5, 29a3827, 3ee670e, fa7128a). Then, Ryan Bartelme of the University of Wisconsin-Milwaukee asked, in a private e-mail, for concatenated genes in protein clusters of pangenomes, and we delivered it, too (1726476, 3d07e8b, 99359bc). We didn't stop there. We implemented a simple driver for FastTree (227e892) for starters so you can immediately start playing with your data. As always, we are thankful for your suggestions.
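For readers who have not seen gene concatenation before, the following Python sketch shows the general idea: per-gene alignments are joined end to end for every genome, and a genome that lacks a gene gets gap characters of the matching length so all concatenated sequences stay equally long. It illustrates the concept only. The function name, the toy alignments, and the choice to process genes in sorted order are ours, not anvi'o's.

```python
def concatenate_alignments(alignments, genomes, gap="-"):
    """Concatenate per-gene alignments into one sequence per genome.

    `alignments` maps gene name -> {genome name -> aligned sequence}, where
    all sequences of a given gene have the same length.
    """
    concatenated = {genome: [] for genome in genomes}
    for gene in sorted(alignments):
        per_genome = alignments[gene]
        length = len(next(iter(per_genome.values())))
        for genome in genomes:
            # Genomes missing this gene are padded with gaps.
            concatenated[genome].append(per_genome.get(genome, gap * length))
    return {genome: "".join(parts) for genome, parts in concatenated.items()}


# Toy alignments: genome B lacks gene_2.
alignments = {
    "gene_1": {"A": "MK-L", "B": "MKAL"},
    "gene_2": {"A": "GG"},
}
supermatrix = concatenate_alignments(alignments, ["A", "B"])
print(supermatrix)
```

The resulting sequences are what a tree-building program would take as input.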

Identifying ribosomal RNAs everywhere

In most cases, getting ribosomal RNA genes out of isolate or metagenome-assembled genomes is not as straightforward as one would like, even when the assembler managed to assemble them.

Gene callers usually don’t perform well when it comes to identifying 16S or 23S rRNA genes, and using primer sequences for these regions is not exactly 2017. We added a new feature in anvi’o that reduces the recovery of rRNA gene sequences from isolate, single-cell, or metagenome-assembled genomes to a couple of key strokes. More information and examples are here:


The small but mighty 'push' button

This is something we are very excited about. You know how sometimes you have something on your interactive interface you would like to show to your colleagues, or share with everyone on the planet? Well, with this version you will be able to click on this little new cloud button on the interface,


and you will be able to send your interactive display directly to http://anvi-server.org or any other anvi'server instance online. After which you can share this interactive display in read-only mode privately with your colleagues, editors, or reviewers, or with everyone by making it public.

Here is an appropriate use of it in a recent paper in Cell Reports that gives another dimension to a static figure:


More functional side buttons

Now we have a way to deliver some important news to your doorstep:


More information on this new addition, including notes about possible privacy concerns is here:


Please consider sharing your thoughts and opinions on this.

We also made sure anvi'o can convey extensive descriptions of the displays shown in anvi'o. Here is an example:


The description tab gives access to an editor, in which you can describe the data using Markdown syntax. We hope that this practice catches on, so whenever someone looks at an anvi'o display, they know that there will be some information on the side panel to better understand the details.

Please find anvi'o tutorials and installation instructions here:


@meren meren released this Apr 20, 2017 · 3503 commits to master since this release


We are happy to announce the new version of anvi'o: "Ocean" (The code name for this release emphasizes our attachment to this fragile environment, and all of its inhabitants. This code name was suggested by @ShaiberAlon, and won the popular vote. Another suggestion was Samurai Champloo by @tdelmont, and although we all REALLY wanted it to win, it managed to get only a single vote. Real pity).


After 127 changes in the codebase that introduced more than 1,850 lines of code and removed 500, as usual, the current version includes many bug fixes, as well as some major changes. This release note will give you a summary of most important changes.

Splitting large anvi'o profiles into easy-to-share, self-contained, and fully functional pieces

We have always been interested in providing our users with the ability to share intermediate findings their analyses produce. However, anvi'o profiles can get too large too quickly, and it was not possible to share only a subset of a profile.

Now it is possible to create individual, self-contained anvi'o profile and contig databases for bins stored in anvi'o collections (#482). This way, instead of sharing the entire profile database and contigs database, you can share only a single genome bin from your profiles. This is done by our new program anvi-split, which creates a very comprehensive subset of your metagenome. It is so comprehensive that everything that can be done with your full data (from visualizing your data interactively to displaying or investigating SNVs) can be done with this subset, as well.

Leaner auxiliary data files

Now anvi'o uses GZIP compression on HDF5 files, which results in approximately 5 times smaller auxiliary data files (b83f747). The best part? It is backwards compatible!

A small gift for miners of single-nucleotide variants

The inspection page of the anvi'o interactive interface allows you to investigate mapping results in a comprehensive way. We have been looking at our inspection pages a lot, and it was always a pain to learn important details for a given single-nucleotide variant directly from the interface.

With this release (13f1fbd), you can click on those SNVs, and enjoy your new popup window that lists everything that is relevant:


Single tear.

Storing anvi'o displays in batch mode like a pro

Have you ever had tens of bins that you would like to visualize in the anvi'o interactive interface and save what you see in a figure? We had to, and it required lots of manual work. So we implemented a new parameter (e467360): --export-svg. When you provide a file name to this parameter in your anvi-interactive call, and if you have a state called default in your profile database, you will get an SVG file! So this becomes totally possible to do:

anvi-refine -c CONTIGS.db -p PROFILE.db -C MY_COLLECTION -b MY_BIN --export-svg MY_BIN.svg

If you have read the "Working with SVG files anvi'o generate" post, you also know that you can immediately convert this output into a PNG file, too:

inkscape --without-gui -f MY_BIN.svg --export-png MY_BIN.png -d 300 -D

Doing these two operations in a BASH loop allowed us to visualize all our bins in a recent project in a heartbeat and create a figure like this with minimal effort:


If we can do this, think about what you can do!
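For the curious, here is a minimal sketch of what such a loop could look like. It only reuses the two commands shown above; the bin names, the file names, and the small `run` helper with its `DRY_RUN` switch are placeholders we made up for illustration, so adjust them to your own project:

```shell
# Placeholder bin names; replace with the bins in your own collection.
BINS="Bin_01 Bin_02 Bin_03"

# With DRY_RUN=1 (the default here) commands are only printed, not run.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

export_bins() {
    for bin in $BINS; do
        # Export the display of this bin as an SVG (needs a state called 'default').
        run anvi-refine -c CONTIGS.db -p PROFILE.db -C MY_COLLECTION -b "$bin" --export-svg "$bin.svg"
        # Convert the SVG into a PNG at 300 dpi.
        run inkscape --without-gui -f "$bin.svg" --export-png "$bin.png" -d 300 -D
    done
}

export_bins
```

Printing the commands first is a cheap way to check the loop before letting it loose on tens of bins; set DRY_RUN=0 once the output looks right.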

While he was at it, @ozcan also fixed the alignment issue with our SVG outputs (f7cd4de) so our SVG exports center in Inkscape correctly.

More interface sweets

We know none of you use it, but we can now show samples and layer labels in phylogram view (#490) in case that's why you were not using it.

Also now you can use the shortcuts m and s to hide the settings and mouse panels (4447eec). The best part? They come back when you press them again! Magic.

Thanks to YOU

  • Thanks to Roxanne Beinart (@rbeinart), now anvi'o exports gene alignments in protein clusters (#494).
  • Thanks to Damien Courtine (@dcourtine), anvi'o will no longer complain about the unavailability of IP addresses and stuff (#485).
  • Thanks to John Eppley (@jmeppley), anvi'o will not fail to close files it opens (#487).
  • When Joe Vineis (@jvineis) complained about the interface glitches when there are too many bins in a collection, @tdelmont suggested that we should tell him to not use anvi'o with too many bins. But we didn't listen to him, and fixed it (#491).
  • When Joe Vineis came back and told us that anvi-summarize tasks were failing when custom HMMs are used, @tdelmont suggested that we respond to him by asking him to not use custom HMMs with anvi'o. We didn't listen to him, and fixed it, too (#484). Now you know!

Installation or upgrade

Please do not use the binaries below, and visit our installation page for details:



Come join our discussion group!


@meren meren released this Mar 7, 2017 · 3783 commits to master since this release


We are happy to announce the new version of anvi'o, "Spring" (a code name which was suggested by @ShaiberAlon, and won the majority vote).

After 300 changes in the codebase that introduced more than 9,500 lines of code and removed 3,000, the current version includes many bug fixes, as well as some fundamental changes to the repository that we are very happy about. This release note will give you a summary of most important changes.


Python 3 .. finally!

It has been a long-standing desire of ours to switch our Python 2 codebase to Python 3. Well, we did it (24ff580), and finally closed the issue #343 with peace of mind. If you are using Homebrew, the switch should be seamless. If you have been using anvi'o via a virtual environment, you will have to recreate your environment, which is a trivial task (please see the updated installation instructions if you need any guidance).

Much better anvi-profile

We completely revamped anvi'o's profiling program. Scalability-wise, this was one of the most critical bottlenecks of anvi'o. But thanks to @ozcan's efforts, it now can be parallelized, and its memory use can be adjusted depending on how much memory you have, and how quickly you want your analyses to finish. In fact, you now have (1c38887) a small info bar that tells you about your memory usage during profiling:


The number of threads, memory use, and the frequency of disk access are controlled by three new flags: --num-threads, --queue-size, and --write-buffer-size.
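As a rough sketch, a profiling call that sets all three flags could be put together like this. The three flag names come from this release; the numbers, the file names, and the `profile_cmd` helper are assumptions made up for illustration, and `-i` / `-c` are assumed to be the usual input options for the BAM file and the contigs database (please check `anvi-profile --help` before relying on any of it):

```shell
# Illustrative values only; they are not recommendations.
NUM_THREADS=8
QUEUE_SIZE=200
WRITE_BUFFER_SIZE=500

profile_cmd() {
    # $1: BAM file to profile, $2: contigs database
    echo "anvi-profile -i $1 -c $2 --num-threads $NUM_THREADS --queue-size $QUEUE_SIZE --write-buffer-size $WRITE_BUFFER_SIZE"
}

# Print the command for inspection; pipe the output to `sh` to actually run it.
profile_cmd SAMPLE_01.bam CONTIGS.db
```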

Please see the updated help menu of the program anvi-profile for more information about these flags, and/or take a look at the new blog post for the technical details behind this upgrade and some usage tips:


A revamped anvi'o interactive interface

If you have been using anvi'o for a while, you will probably like it a lot. Not only does the interface now look much more elegant, it is also responsive (a960b00), has panels that can be hidden (8ae2f56), shows legends in a separate tab (dd7c6bc) and lets you order them quickly (49cfa29) or change their colors in batch mode (f5f7885), and allows you to change the background opacity of numerical layers (ce15b8a). With these changes, the anvi'o interactive interface is likely to be one of the most comprehensive and flexible browser-based visualization environments.

If you are updating your anvi'o installation, please don't forget to refresh your browser window by pressing CTRL+SHIFT+R, or COMMAND+SHIFT+R so the cached browser files from the old version are updated as well.

Markdown descriptions

We often want to take notes as we work on data using the interactive interface. More often than that, we want to be able to communicate what the data is telling us to other people who will take a look at it later. We finally added a functionality to the interface so you can prepare or display descriptions in anvi'o profile databases (it supports Markdown syntax, so you can create pretty looking, readable summaries of the display):



We are very thankful for people who take their invaluable time to report bugs, or ask questions. We would like to acknowledge the following names for their direct contributions to the codebase:

  • Daniel Blankenberg for adding the missing parameter for anvi-gen-variability-matrix (#457).
  • Elmar Prüesse for ameliorating the hardcoded file name requirement for the centrifuge parser (#464).
  • John Eppley for fixing the NCBI blast / DIAMOND check (#441), and for embarrassing us (#443) (and by 'us', of course we mean 'Tom').

Installation or upgrade

Please do not use the binaries below, and visit our installation page for details:



Come join our discussion group!