can we use dokdo without qiime files #21

Closed
khemlalnirmalkar opened this issue Jul 8, 2021 · 11 comments
Labels
enhancement New feature or request

Comments

@khemlalnirmalkar

Hi @sbslee,
Dokdo works great with qzv files.
Can we also use Dokdo with relative abundance data from txt files, instead of QIIME 2's qzv files?
Thanks,

@sbslee
Owner

sbslee commented Jul 8, 2021

@khemlalnirmalkar, that's an interesting suggestion.

Q1. If it's not a visualization file from QIIME 2, may I ask how you are generating your "txt files"?
Q2. I suppose a user can provide a pandas.DataFrame object as input instead of a qiime2.Visualization object. Do you think that will be sufficient for your case?

@khemlalnirmalkar
Author

Ans1: The data is from shotgun sequencing, but it's already expressed as relative abundance by taxonomy. It should not matter whether it is 16S or shotgun data.

Ans2: I was thinking of trying the same. I will give it a try and see whether it works. Dokdo is simple and easy to use, so I'm thinking of using it for plots of all my shotgun relative abundance data.

Thanks

@sbslee
Owner

sbslee commented Jul 8, 2021

@khemlalnirmalkar, I get it now. Thanks for the answers. As for Q2, the current dokdo.taxa_abundance_bar_plot method does not accept a pandas.DataFrame yet. I will create a development branch and try to implement just that. I will let you know here when it's done. In the meantime, you're more than welcome to tweak it yourself as well. You may find a better solution :)

@khemlalnirmalkar
Author

@sbslee That will be great, thank you so much.

sbslee added the enhancement label Jul 9, 2021
@sbslee
Owner

sbslee commented Jul 9, 2021

@khemlalnirmalkar,

Great news! I was able to update the dokdo.taxa_abundance_bar_plot method to accept a pandas.DataFrame as input. This was actually pretty easy because internally the method already extracts a .csv file from the QIIME 2 visualization and then converts it to a pandas.DataFrame. Therefore, all I needed to do was skip this part when the input is already a pandas.DataFrame. One thing to note is that the level option is ignored for DataFrame input, so the user needs to know which taxonomic level their input file was created from.

This update has been implemented in the 1.11.0-dev branch. In the future, I think it's possible to extend some of the Dokdo methods to support shotgun data in the same manner. Give it a try and let me know what you think.

import pandas as pd
import dokdo
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set()
dokdo.taxa_abundance_bar_plot('taxa-bar-plots.qzv',
                              figsize=(10, 7),
                              level=6,
                              count=8,
                              legend_short=True,
                              artist_kwargs=dict(show_legend=True,
                                                 legend_loc='upper left'))
plt.tight_layout()
plt.savefig('Input_Visualization.png')

[Image: Input_Visualization]

df = pd.read_csv('level-6.csv', index_col=0)
dokdo.taxa_abundance_bar_plot(df,
                              figsize=(10, 7),
                              count=8,
                              legend_short=True,
                              artist_kwargs=dict(show_legend=True,
                                                 legend_loc='upper left'))
plt.tight_layout()
plt.savefig('Input_DataFrame.png')

[Image: Input_DataFrame]
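
The "skip the extraction when the input is already a DataFrame" idea described in this comment can be pictured roughly as follows. This is only an illustrative sketch, not Dokdo's actual source: `resolve_taxa_table`, `extract_csv`, and `fake_extract` are hypothetical names, and the tiny CSV is made-up demo data.

```python
import io

import pandas as pd


def resolve_taxa_table(source, extract_csv):
    """Return a taxa table as a DataFrame.

    `source` is either a ready-made DataFrame or something that the
    hypothetical `extract_csv` callable can turn into a CSV path or
    file-like object (for example, by unpacking a visualization file).
    """
    if isinstance(source, pd.DataFrame):
        # Already tabular: skip the extraction step entirely.
        return source
    # Otherwise extract a CSV first, then load it with sample IDs as the index.
    return pd.read_csv(extract_csv(source), index_col=0)


# Stand-in extractor for the demo (assumption): a real one would unpack a .qzv.
def fake_extract(_path):
    return io.StringIO("index,g__Blautia,Group\nS1,0.4,A\nS2,0.6,B\n")


from_file = resolve_taxa_table("taxa-bar-plots.qzv", fake_extract)
from_frame = resolve_taxa_table(from_file, fake_extract)
print(from_frame is from_file)  # the DataFrame is passed through untouched
```

Because a DataFrame is passed through unchanged, nothing in this path can pick a taxonomic level, which is consistent with the note that the level option is ignored for DataFrame input.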

@khemlalnirmalkar
Author

test2.csv

Hi @sbslee,
Thanks for making this change and for your support.
I tried this with my data and it didn't go well.

I got an error:
TypeError: no numeric data to plot
This error doesn't make sense to me, so I am probably missing something.
I have attached one of my test files here.
I have ~5k taxa and that didn't work, and neither did the test file.
Could you please have a look? My entire dataset has the same format as this test file.

Thanks,
Khem

@sbslee
Owner

sbslee commented Jul 9, 2021

@khemlalnirmalkar,

That's because, in your current file, it's difficult to distinguish data columns (e.g. Prevotella_species) from metadata columns (e.g. Group). In the QIIME 2 visualization file, data columns are indicated by the presence of two consecutive underscores (__). For example, your Prevotella_species column would be s__Prevotella_species for species and g__Prevotella_species for genus.

That being said, when I changed the column names (e.g. Prevotella_species to s__Prevotella_species), it worked:

import pandas as pd
import dokdo
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set()
df = pd.read_csv('test2-modified.csv', index_col=0)
dokdo.taxa_abundance_bar_plot(df)
plt.savefig('test.png')

[Image: test]

test2-modified.csv

Can you try this and let me know if it works?
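
For a table with thousands of taxa, the renaming can be scripted instead of done by hand. The following is a minimal sketch under stated assumptions: `add_taxon_prefix` is a hypothetical helper (not part of Dokdo), the list of metadata columns must be supplied by the user, and `s__` is used only because the example column above is at species level.

```python
def add_taxon_prefix(columns, metadata_columns, prefix="s__"):
    """Map each column name to a name that marks taxa columns with `prefix`.

    Metadata columns and columns that already contain a double underscore
    are left unchanged; every other column is treated as a taxon.
    """
    metadata = set(metadata_columns)
    mapping = {}
    for name in columns:
        if name in metadata or "__" in name:
            mapping[name] = name
        else:
            mapping[name] = prefix + name
    return mapping


# Column names taken from the example discussed above.
columns = ["Prevotella_species", "Group"]
mapping = add_taxon_prefix(columns, metadata_columns=["Group"])
print(mapping)
# {'Prevotella_species': 's__Prevotella_species', 'Group': 'Group'}
```

The resulting dictionary can be passed to `df.rename(columns=mapping)` on the DataFrame read from the CSV file before handing it to dokdo.taxa_abundance_bar_plot.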

@sbslee
Owner

sbslee commented Jul 9, 2021

Here's another example CSV file that you can use as a template.

level-6.csv

@khemlalnirmalkar
Author

@sbslee Thanks for checking the file. I will make these changes to my original dataset and will let you know soon.
I had already tried it with the QIIME 2 example file, and it worked. Sorry, I forgot to mention that earlier. If you want, you can close the issue.

Thanks a lot,
Khem

@sbslee
Owner

sbslee commented Jul 9, 2021

No worries. Please feel free to reopen this issue if there is any problem.

sbslee closed this as completed Jul 9, 2021
@khemlalnirmalkar
Author

@sbslee It's working, thank you so much.
I hope that in the future you can add more cool plots and types of analyses for shotgun data :)
Khem
