rankratioviz


(Name subject to change.)

rankratioviz visualizes the output from a tool like songbird or DEICODE. It facilitates viewing a plot of feature rankings alongside a plot showing the log ratios of selected features' abundances within samples.

rankratioviz can be used standalone (as a Python 3 script that generates a folder containing an HTML/JS/CSS visualization) or as a QIIME 2 plugin (that generates a QZV file that can be visualized at view.qiime2.org or by using qiime tools view).

rankratioviz should work with most modern web browsers; Firefox or Chrome is recommended.

rankratioviz is still being developed, so backwards-incompatible changes might occur. If you have any questions, feel free to contact the development team at mfedarko@ucsd.edu.

Screenshot and Demo

[Screenshot of a rankratioviz visualization]

This visualization (which uses some of the Red Sea metagenome data, with ranks generated by songbird) can be viewed online here.

Installation and Usage

The following command will install the most up-to-date version of rankratioviz:

pip install git+https://github.com/fedarko/rankratioviz.git

rankratioviz requires Python 3.5 or later.

Temporary Caveat

Please make sure that your sample metadata fields do not contain any period or square bracket characters (.[]). This is due to Vega-Lite's special treatment of these characters. (Eventually rankratioviz should be able to handle these characters itself, but in the meantime renaming the affected fields is a necessary workaround.) See this issue for context.
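As a rough illustration of the caveat above, the following sketch flags metadata field names that contain a period or square bracket and proposes a replacement name. It assumes that "fields" refers to the column names of your sample metadata. The helper names (find_problem_fields, sanitize_field), the example field names, and the choice of "_" as a replacement are all hypothetical; none of this is part of rankratioviz.

```python
import re

# Characters that Vega-Lite treats specially: period and square brackets.
SPECIAL_CHARS = re.compile(r"[.\[\]]")


def find_problem_fields(field_names):
    """Return the field names that contain '.', '[' or ']'."""
    return [name for name in field_names if SPECIAL_CHARS.search(name)]


def sanitize_field(name, replacement="_"):
    """Replace each special character with a placeholder (hypothetical choice)."""
    return SPECIAL_CHARS.sub(replacement, name)


# Hypothetical metadata column names, for illustration only.
fields = ["depth", "temp.celsius", "site[1]"]
problems = find_problem_fields(fields)
renamed = {name: sanitize_field(name) for name in problems}
print(problems)
print(renamed)
```

If you rename fields this way, check that the new names do not collide with existing ones before saving the metadata file.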

Integration with metabolomics feature metadata

If you have a GNPS feature metadata file (where each row in the file has a parent mass and RTConsensus column), you can pass the -gnps (--assume-gnps-feature-metadata) command-line argument to rankratioviz's standalone script so that rankratioviz interprets the metadata file accordingly. Please note that this functionality is experimental, and that it is not yet available in the QIIME 2 plugin version of rankratioviz.

Tutorials

Examples of using rankratioviz (both inside and outside of QIIME 2) are available in rankratioviz's example Jupyter notebooks, which are located in the example_notebooks/ directory of the repository.

Interacting with a rankratioviz visualization

The two plots (one of feature rankings, and one of samples' log ratios) in a rankratioviz visualization are linked [1]: when a change is made to the selected features in a log ratio, both the rank plot and sample plot are accordingly modified.

To elaborate on that: clicking on two features in the rank plot sets a new numerator feature (determined from the first-clicked feature) and a new denominator feature (determined from the second-clicked feature) for the abundance log ratios in the sample plot.
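To make the numerator/denominator idea concrete, here is a minimal sketch of a per-sample log ratio between two features. It is not rankratioviz's actual code: the function name, the sample IDs, the counts, the use of the natural logarithm, and the decision to skip samples with a zero count are all assumptions made for illustration.

```python
import math


def log_ratio(numerator_counts, denominator_counts):
    """Compute ln(numerator / denominator) for each sample.

    Samples where either count is zero are skipped, because the log
    ratio is undefined there (an assumption of this sketch).
    """
    ratios = {}
    for sample, num in numerator_counts.items():
        den = denominator_counts.get(sample, 0)
        if num > 0 and den > 0:
            ratios[sample] = math.log(num / den)
    return ratios


# Hypothetical abundances: the first-clicked feature becomes the numerator,
# the second-clicked feature becomes the denominator.
first_clicked = {"S1": 20, "S2": 5, "S3": 0}
second_clicked = {"S1": 10, "S2": 5, "S3": 8}

ratios = log_ratio(first_clicked, second_clicked)
for sample in sorted(ratios):
    print(sample, ratios[sample])
```

In this example, sample S3 is left out of the result because the numerator feature has an abundance of zero there.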

You can also run textual queries over the various feature IDs in order to construct more complicated log ratios (e.g. "the log ratio of the combined abundances of all features that contain the text 'X' over the combined abundances of all features that contain the text 'Y'"). Although this method doesn't require you to manually select features on the rank plot, the rank plot is still updated to indicate the features used in the log ratios.
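The kind of textual query described above can be sketched as follows: select every feature whose ID contains a given piece of text, sum the selected features' abundances within each sample, and take the log ratio of the two sums. This is an illustrative sketch only. The table layout, function names, feature IDs, and the handling of zero sums are assumptions, not rankratioviz's implementation.

```python
import math


def select_features(feature_ids, text):
    """Return the feature IDs that contain the given text."""
    return [fid for fid in feature_ids if text in fid]


def combined_log_ratio(table, numerator_text, denominator_text):
    """Log ratio of summed abundances for two text-selected feature groups.

    `table` maps feature ID -> {sample ID: count}. Samples where either
    sum is zero are skipped (an assumption of this sketch).
    """
    num_features = select_features(table, numerator_text)
    den_features = select_features(table, denominator_text)
    samples = sorted({s for counts in table.values() for s in counts})
    ratios = {}
    for sample in samples:
        num = sum(table[f].get(sample, 0) for f in num_features)
        den = sum(table[f].get(sample, 0) for f in den_features)
        if num > 0 and den > 0:
            ratios[sample] = math.log(num / den)
    return ratios


# Hypothetical feature table, for illustration only.
table = {
    "Taxon|Staphylococcus_aureus": {"S1": 6, "S2": 1},
    "Taxon|Staphylococcus_epidermidis": {"S1": 2, "S2": 3},
    "Taxon|Cutibacterium_acnes": {"S1": 4, "S2": 0},
}

result = combined_log_ratio(table, "Staphylococcus", "Cutibacterium")
print(result)
```

Here the numerator combines both Staphylococcus features, so sample S1 gets ln((6 + 2) / 4), while S2 is skipped because its denominator sum is zero.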

Acknowledgements

Dependencies

Code files for the following projects are distributed within rankratioviz/support_file/vendor/. See the dependency_licenses/ directory for copies of these software projects' licenses (each of which includes a respective copyright notice).

The following software projects are required for rankratioviz's Python code to function, although they are not distributed with rankratioviz (and are instead installed alongside rankratioviz).

Testing Dependencies

For Python testing/style checking, rankratioviz uses pytest, pytest-cov, flake8, and black.

For JavaScript testing/style checking, rankratioviz uses Mocha, Chai, mocha-headless-chrome, nyc, jshint, and prettier.

rankratioviz also uses Travis-CI and Codecov.

Data Sources

The test data located in rankratioviz/tests/input/byrd/ is from this repository. This data, in turn, originates from Byrd et al.'s 2017 study on atopic dermatitis [2].

The test data located in rankratioviz/tests/input/sleep_apnea/ (and in example_notebooks/DEICODE_sleep_apnea/input/) is from this Qiita study, which is associated with Tripathi et al.'s 2018 study on sleep apnea [4].

Lastly, the data located in rankratioviz/tests/input/red_sea/ (and in example_notebooks/songbird_red_sea/input/, and shown in the screenshot above) was taken from the data/redsea/ folder of songbird's GitHub repository, and is associated with this paper [3].

Special Thanks

The design of rankratioviz was strongly inspired by EMPeror and q2-emperor, along with DEICODE. A big shoutout to Yoshiki Vázquez-Baeza for his help in planning this project, as well as to Cameron Martino for a ton of work on getting the code in a distributable state (and making it work with QIIME 2). Thanks also to Jamie Morton, who wrote the original code for producing rank plots from which this is derived.

References

[1] Becker, R. A. & Cleveland, W. S. (1987). Brushing scatterplots. Technometrics, 29(2), 127-142. (Section 4.1 in particular talks about linking visualizations.)

[2] Byrd, A. L., Deming, C., Cassidy, S. K., Harrison, O. J., Ng, W. I., Conlan, S., ... & NISC Comparative Sequencing Program. (2017). Staphylococcus aureus and Staphylococcus epidermidis strain diversity underlying pediatric atopic dermatitis. Science Translational Medicine, 9(397), eaal4651.

[3] Thompson, L. R., Williams, G. J., Haroon, M. F., Shibl, A., Larsen, P., Shorenstein, J., ... & Stingl, U. (2017). Metagenomic covariation along densely sampled environmental gradients in the Red Sea. The ISME Journal, 11(1), 138.

[4] Tripathi, A., Melnik, A. V., Xue, J., Poulsen, O., Meehan, M. J., Humphrey, G., ... & Haddad, G. (2018). Intermittent hypoxia and hypercapnia, a hallmark of obstructive sleep apnea, alters the gut microbiome and metabolome. mSystems, 3(3), e00020-18.

License

This tool is licensed under the BSD 3-clause license. Our particular version of the license is based on scikit-bio's license.