alpha and beta diversity without qiime file #35

Closed
khemlalnirmalkar opened this issue Feb 10, 2022 · 7 comments · Fixed by #36
Labels
enhancement New feature or request

Comments

@khemlalnirmalkar

Hi @sbslee,
Could you add code to compute alpha diversity (Shannon, observed, and evenness) and beta diversity (Bray-Curtis and Jaccard index), with some plots, from a plain text file?

This would be useful for anyone who is not using QIIME, or who has shotgun data that comes as a plain text file.
Of course, phylogenetic information can't be included here, but it would still be useful.
The vegan and phyloseq packages are available, but they don't come with straightforward code and proper explanation.
Thanks,
Khem

sbslee added a commit that referenced this issue Feb 10, 2022
@sbslee sbslee added the enhancement New feature or request label Feb 10, 2022
@sbslee
Owner

sbslee commented Feb 10, 2022

@khemlalnirmalkar,

Thanks for the suggestion! I see this is similar to your previous request in #21, correct? As you can see in 1fd24aa, I just updated the dokdo.alpha_diversity_plot method to accept pandas.DataFrame as well. You'd still need to import your text file (CSV, TSV, etc.) into a DataFrame object, but I'm assuming you're already familiar with that because it's the same solution I implemented for #21.

The example below shows how it works:

import dokdo
import matplotlib.pyplot as plt
%matplotlib inline
import pandas as pd
import seaborn as sns
sns.set()

qza_file = '/Users/sbslee/Desktop/dokdo/data/moving-pictures-tutorial/faith_pd_vector.qza'
metadata_file = '/Users/sbslee/Desktop/dokdo/data/moving-pictures-tutorial/sample-metadata.tsv'
text_file = '/Users/sbslee/Desktop/test-alpha-diversity.csv'

Plot the regular way:

dokdo.alpha_diversity_plot(qza_file, metadata_file, 'body-site')
plt.savefig('with-qza-file.png')

with-qza-file

Now with a text file:

df = pd.read_csv(text_file, index_col=0)
dokdo.alpha_diversity_plot(df, metadata_file, 'body-site')
plt.savefig('with-text-file.png')

with-text-file

If you are satisfied with the above, I will also update the dokdo.beta_2d_plot and dokdo.beta_3d_plot methods. I just wanted to hear your feedback before I commit.

Lastly, please note that these changes are only available in the 1.12.0-dev branch until the official version is released:

$ git clone https://github.com/sbslee/dokdo
$ cd dokdo
$ git checkout 1.12.0-dev
$ pip install -e .

@khemlalnirmalkar
Author

khemlalnirmalkar commented Feb 10, 2022

Hi @sbslee,
Thanks for your quick action.
This looks great, and I guess it's ready for beta diversity.
I am curious about the data structure of the text file used for this figure. Could you please share it?
Thanks again,
Cheers
Khem

@sbslee
Owner

sbslee commented Feb 10, 2022

Oops, forgot to attach the test input file (test-alpha-diversity.csv):

test-alpha-diversity.csv

The regular QZA file is included in the dokdo repository (dokdo/data/moving-pictures-tutorial/faith_pd_vector.qza).
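
The attachment itself is not reproduced in this thread. As a rough sketch only, and assuming the text file mirrors a QIIME 2 alpha diversity vector (sample IDs in the first column, followed by a single column of diversity values), a file of that shape could be produced like this. The sample IDs, the `sample-id` and `faith_pd` headers, and the values are all made up for illustration and are not taken from the actual test file.

```python
import csv
import io

# Hypothetical rows: (sample ID, alpha diversity value). Not real data.
rows = [
    ("sample-1", 6.1),
    ("sample-2", 2.3),
    ("sample-3", 9.4),
]

# Write the table as CSV text: one header row, then one row per sample.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["sample-id", "faith_pd"])  # assumed header names
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Read it back keyed by the first column (the sample IDs), which is the
# role index_col=0 plays in the pd.read_csv call shown earlier.
reader = csv.reader(io.StringIO(csv_text))
header = next(reader)
values_by_sample = {sample: float(value) for sample, value in reader}
print(values_by_sample)
```

For grouping by a metadata column such as 'body-site', the sample IDs in such a file would presumably need to match those in the metadata file.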

I will also leave a link to development documentation in case you want to check it out (there you will see that the method now accepts pandas.DataFrame as input):

https://dokdo.readthedocs.io/en/1.12.0-dev/dokdo_api.html#module-dokdo.api.alpha_diversity_plot

Finally, thanks for your feedback. I will update the other methods ASAP and get back to you.

sbslee added a commit that referenced this issue Feb 11, 2022
* :issue:`35`: Update the methods :meth:`alpha_diversity_plot`, 
:meth:`beta_2d_plot`, and :meth:`beta_3d_plot` to accept 
:class:`pandas.DataFrame` in case the input data was not generated from 
QIIME 2 (e.g. shotgun sequencing).
* Update the methods :meth:`beta_2d_plot` and :meth:`beta_3d_plot` to 
print out the proportions explained instead of embedding them in the 
PCoA plot.
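
For context on the "proportions explained" mentioned in this commit message: in a PCoA, the proportion explained by an axis is commonly taken as that axis's eigenvalue divided by the sum of all eigenvalues. The sketch below illustrates that arithmetic under the assumption that all eigenvalues are non-negative. The function name and the eigenvalues are hypothetical, and this is not claimed to be how Dokdo or QIIME 2 implements it.

```python
def proportions_explained(eigenvalues):
    """Return each eigenvalue as a fraction of the eigenvalue total.

    Assumes non-negative eigenvalues with a positive sum.
    """
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


# Made-up eigenvalues for illustration only.
eigenvalues = [4.0, 3.0, 2.0, 1.0]
proportions = proportions_explained(eigenvalues)

# Print in a style similar to the plotting output quoted in this thread.
for axis, proportion in enumerate(proportions, start=1):
    print(f"{proportion:.2%} for Axis {axis}")
```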
@sbslee
Owner

sbslee commented Feb 11, 2022

@khemlalnirmalkar,

The update is complete! Please let me know if there are additional methods you would like updated.

Input test files:
test-beta-diversity-2d.csv
test-beta-diversity-3d.csv

import dokdo
import matplotlib.pyplot as plt
%matplotlib inline
import pandas as pd
import seaborn as sns
sns.set()
qza_file = '/Users/sbslee/Desktop/dokdo/data/moving-pictures-tutorial/unweighted_unifrac_pcoa_results.qza'
metadata_file = '/Users/sbslee/Desktop/dokdo/data/moving-pictures-tutorial/sample-metadata.tsv'
dokdo.beta_2d_plot(qza_file, metadata=metadata_file, hue='body-site', figsize=(5, 5))
plt.savefig('beta-2d-plot-qza.png')
# Explained proportions computed by QIIME 2:
# 33.94% for Axis 1
# 25.90% for Axis 2

beta-2d-plot-qza

df = pd.read_csv('test-beta-diversity-2d.csv', index_col=0)
dokdo.beta_2d_plot(df, metadata=metadata_file, hue='body-site', figsize=(5, 5))
plt.savefig('beta-2d-plot-csv.png')

beta-2d-plot-csv

dokdo.beta_3d_plot(qza_file, metadata=metadata_file, hue='body-site', figsize=(7, 7))
plt.savefig('beta-3d-plot-qza.png')
# Explained proportions computed by QIIME 2:
# 33.94% for Axis 1
# 25.90% for Axis 2
# 6.63% for Axis 3

beta-3d-plot-qza

df = pd.read_csv('test-beta-diversity-3d.csv', index_col=0)
dokdo.beta_3d_plot(df, metadata=metadata_file, hue='body-site', figsize=(7, 7))
plt.savefig('beta-3d-plot-csv.png')

beta-3d-plot-csv

@khemlalnirmalkar
Author

Hi @sbslee,
Thank you so much for the update, this is great.
I was hoping for code to calculate alpha diversity (Shannon, observed, and evenness) and beta diversity (Bray-Curtis and Jaccard index) and then make plots.
Is that possible here? In R, this is generally done with vegan and also phyloseq (which doesn't have evenness).
Thanks,

@sbslee
Owner

sbslee commented Feb 11, 2022

Unfortunately, what you are describing here (i.e. performing diversity analyses for non-QIIME 2 data) is beyond the scope of Dokdo. Sure, Dokdo can be -- and has been -- extended to visualize data from non-QIIME 2 software, but it's a whole different story to support analyzing such data. Hope this makes sense!
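
For readers who need the alpha diversity metrics named in this thread, they can be computed outside Dokdo. The following is a minimal standard-library sketch, not part of Dokdo and not a substitute for vegan, phyloseq, or scikit-bio. The function names and the count table are hypothetical. The log base for Shannon is left as a parameter because tools differ in their defaults.

```python
import math


def observed_features(counts):
    """Count the features with a non-zero abundance."""
    return sum(1 for c in counts if c > 0)


def shannon(counts, base=2.0):
    """Shannon index H = -sum(p * log(p)) over non-zero proportions."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p, base)
    return h


def pielou_evenness(counts):
    """Pielou's evenness J = H / log(S); NaN with fewer than two features."""
    s = observed_features(counts)
    if s < 2:
        return float("nan")
    # H and log(S) use the same base, so the ratio is base-independent.
    return shannon(counts, base=math.e) / math.log(s)


# Made-up counts per sample (one value per taxon), for illustration only.
table = {
    "sample-1": [10, 10, 10, 10],
    "sample-2": [37, 2, 1, 0],
}

alpha = {
    sample: {
        "observed": observed_features(counts),
        "shannon": shannon(counts),
        "evenness": pielou_evenness(counts),
    }
    for sample, counts in table.items()
}
for sample, metrics in alpha.items():
    print(sample, metrics)
```

Each metric could then be placed in a one-column DataFrame indexed by sample ID and passed to dokdo.alpha_diversity_plot, assuming that is the layout the method expects.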

@khemlalnirmalkar
Author

Yes, I understand.
I was wondering whether vegan or phyloseq could be added to or imported into Dokdo to run diversity analyses on taxonomy files from non-QIIME 2 data, such as metagenomic or metatranscriptomic data.
If that isn't possible, no worries.
This new update is more than I was hoping for.
Thanks again,
Cheers
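
The two beta diversity measures requested in this thread can likewise be computed outside Dokdo. Below is a minimal standard-library sketch of Bray-Curtis dissimilarity and Jaccard distance (on presence/absence) between pairs of samples. The function names and the counts are hypothetical, and nothing here is part of Dokdo.

```python
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator if denominator else 0.0


def jaccard(u, v):
    """Jaccard distance on presence/absence: 1 - shared / union."""
    shared = sum(1 for a, b in zip(u, v) if a > 0 and b > 0)
    union = sum(1 for a, b in zip(u, v) if a > 0 or b > 0)
    return 1.0 - shared / union if union else 0.0


# Made-up counts per sample (same taxon order in every list).
table = {
    "sample-1": [1, 2, 0],
    "sample-2": [0, 2, 3],
    "sample-3": [4, 0, 0],
}

# Pairwise distances keyed by (sample_a, sample_b).
distances = {
    (a, b): {
        "bray_curtis": bray_curtis(table[a], table[b]),
        "jaccard": jaccard(table[a], table[b]),
    }
    for a, b in combinations(sorted(table), 2)
}
for pair, values in distances.items():
    print(pair, values)
```

The plotting examples in this thread take PCoA results as input, so a distance matrix like this would still need an ordination step before it could be plotted that way.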

@sbslee sbslee closed this as completed Feb 11, 2022
@sbslee sbslee linked a pull request Feb 11, 2022 that will close this issue