Pairwise ANI (Average Nucleotide Identity) visualization tool. It visualizes the results of pairwise ANI comparisons among multiple genomes.
It first performs hierarchical (agglomerative) clustering with SciPy, then draws a clustermap with Seaborn, which accepts any Matplotlib colormap.
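The two steps described above can be sketched roughly as follows. This is an illustrative sketch, not pairwiseANIviz's own code: the ANI values, the genome labels, and the helper names `cluster_rows` and `plot_clustermap` are hypothetical, and the defaults simply mirror the documented `--method`, `--metric`, and `--colormap` defaults.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list
from scipy.spatial.distance import pdist

# Hypothetical 4-genome ANI matrix in percent; the values are made up.
genomes = ["A", "B", "C", "D"]
ani = np.array([
    [100.0, 98.5, 80.1, 79.8],
    [98.5, 100.0, 80.3, 80.0],
    [80.1, 80.3, 100.0, 97.2],
    [79.8, 80.0, 97.2, 100.0],
])


def cluster_rows(matrix, method="average", metric="euclidean"):
    """Return a SciPy linkage matrix computed over the rows of `matrix`."""
    return linkage(pdist(matrix, metric=metric), method=method)


def plot_clustermap(matrix, labels, row_linkage, cmap="Blues",
                    out_path="ani_clustermap.png"):
    """Draw a Seaborn clustermap that reuses a precomputed linkage."""
    # Plotting libraries are imported here so the clustering step above
    # can run without them.
    import matplotlib
    matplotlib.use("Agg")  # render to file, no display needed
    import pandas as pd
    import seaborn as sns

    df = pd.DataFrame(matrix, index=labels, columns=labels)
    # The ANI matrix is symmetric, so the same linkage is used for
    # rows and columns.
    grid = sns.clustermap(df, row_linkage=row_linkage,
                          col_linkage=row_linkage, cmap=cmap,
                          linewidths=0.5, linecolor="grey", figsize=(6, 6))
    grid.savefig(out_path)
    return grid


row_linkage = cluster_rows(ani)
# Genome labels in dendrogram leaf order.
leaf_order = [genomes[i] for i in leaves_list(row_linkage)]
```

Calling `plot_clustermap(ani, genomes, row_linkage)` would then write the figure to `ani_clustermap.png`; passing a different `cmap` name changes the colormap.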
- Supports various Matplotlib colormaps
- Taxonomic classification results can be included to distinguish different taxa
- An out-of-range ANI threshold can be set (e.g. 95% ANI) so that cells above it are drawn in red
- Multiple output formats (JPG, PNG, TIFF, SVG, PDF, EPS)
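One way to obtain this kind of out-of-range highlighting is Matplotlib's `Colormap.set_over`, which assigns a fixed color to values above the upper limit of the color scale. Whether pairwiseANIviz implements it this way is an assumption; the threshold of 95 and the lower limit of 70 below are illustrative values only.

```python
from matplotlib import colormaps  # available in Matplotlib >= 3.5
from matplotlib.colors import Normalize, to_rgba

threshold = 95.0  # hypothetical out-of-range threshold, in percent ANI

# Copy the registered colormap so the global one is left untouched.
cmap = colormaps["Blues"].copy()
cmap.set_over("red")

# Values strictly above vmax map to the "over" color.
norm = Normalize(vmin=70.0, vmax=threshold)

color_above = cmap(norm(97.0))   # above the threshold
color_below = cmap(norm(85.0))   # inside the normal range
```

With this approach only values strictly greater than the upper limit receive the "over" color, so a tool that wants `>=` semantics would need to handle the boundary value separately.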
1. Using different matplotlib colormaps
# Dependencies: Matplotlib, Seaborn, Scipy, Pandas
# Install pairwiseANIviz using pip
pip install pairwiseANIviz==1.0

usage: pairwiseANIviz [options] anifile
positional arguments:
anifile File containing pairwise ANI analysis result.
options:
-h, --help show this help message and exit
-v, --version Show pairwiseANIviz version number and exit.
-o OUTDIR, --outdir OUTDIR
Directory to save the output figures (default 'pairwiseANIviz').
--method {single,complete,average,weighted,centroid,median,ward}
Linkage method to use for calculating clusters (default 'average').
See https://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.linkage.html#scipy.cluster.hierarchy.linkage
--metric {braycurtis,canberra,chebyshev,cityblock,correlation,cosine,dice,euclidean,hamming,jaccard,jensenshannon,kulczynski1,mahalanobis,matching,minkowski,rogerstanimoto,russellrao,seuclidean,sokalmichener,sokalsneath,sqeuclidean,yule}
The distance metric to use (default 'euclidean').
See https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.pdist.html#scipy.spatial.distance.pdist
-cmap COLORMAP, --colormap COLORMAP
Matplotlib colormap used when drawing the heatmap of ANI values (default 'Blues').
See https://matplotlib.org/stable/users/explain/colors/colormaps.html
--figWidth FIGWIDTH Figure width (default '15').
--figHeight FIGHEIGHT
Figure height (default '15').
--linewidth LINEWIDTH
Line width of the main heatmap (default 0.5)
--linecolor LINECOLOR
Line color of the main heatmap (default 'grey').
--rowCluster Draw the row cluster.
--colCluster Draw the column cluster.
--annotation Show ANI values on the plot.
--outrangeValue OUTRANGEVALUE
Cells have ANI values over specific threshold set to red (eg. cells have ANI value >=0.95 set to red) (default 100).
-c CLASSIFICATIONFILE, --classificationFile CLASSIFICATIONFILE
File containing classification result generated by GTDBTk(https://github.com/Ecogenomics/GTDBTk).
-t {domain,phylum,class,order,family,genus,species}, --taxaLevel {domain,phylum,class,order,family,genus,species}
Taxa level illustrated on the plot.
Choose from "domain, phylum, class, order, family, genus, species".
Note that this parameter only works if classification result was input.
--colorPalette COLORPALETTE
Color palette used to return a specified number of evenly spaced hues which are then used to illustrate different taxa (default 'hls').
Note that this parameter only works if classification result was input.
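To illustrate what selecting a taxa level means, the sketch below pulls one rank out of a GTDB-style classification string, in which ranks are separated by `;` and prefixed with `d__`, `p__`, `c__`, `o__`, `f__`, `g__`, and `s__`. The helper name `taxon_at_level` and the `"unclassified"` fallback are hypothetical, and how pairwiseANIviz itself parses the classification file is not shown here.

```python
# Rank prefixes used in GTDB-style classification strings.
RANK_PREFIXES = {
    "domain": "d__",
    "phylum": "p__",
    "class": "c__",
    "order": "o__",
    "family": "f__",
    "genus": "g__",
    "species": "s__",
}


def taxon_at_level(classification, level):
    """Return the taxon name at `level`, or "unclassified" if it is empty or missing."""
    prefix = RANK_PREFIXES[level]
    for field in classification.split(";"):
        field = field.strip()
        if field.startswith(prefix):
            # An empty name after the prefix (e.g. "s__") means no assignment.
            return field[len(prefix):] or "unclassified"
    return "unclassified"


example = (
    "d__Bacteria;p__Pseudomonadota;c__Gammaproteobacteria;"
    "o__Enterobacterales;f__Enterobacteriaceae;g__Escherichia;s__"
)
genus = taxon_at_level(example, "genus")
species = taxon_at_level(example, "species")
```

Genomes that share the same extracted taxon would then receive the same color from the chosen palette.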
General usage
----------------
1. ANI result visualization **without classification info**:
$ pairwiseANIviz ani_result.txt
2. ANI result visualization **with classification info**:
$ pairwiseANIviz ani_result.txt --classificationFile classification_result.tsv
Runjia Ji, 2023
If you have any questions about using pairwiseANIviz, feel free to open an issue or contact me at jirunjia@gmail.com.
