Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Heatmap with prettified names? #24

Closed
mishkb opened this issue Jul 27, 2021 · 11 comments
Closed

Heatmap with prettified names? #24

mishkb opened this issue Jul 27, 2021 · 11 comments
Labels
enhancement New feature or request

Comments

@mishkb
Copy link

mishkb commented Jul 27, 2021

Hi @sbslee ,
Thanks for your very recent updates to heatmap (from clustermap) - just in time for what I need. Just wondering how I could incorporate your 'prettified names' function - to get taxa names as the y-axis labels such as is possible in the legend of the taxa_abundance_bar_plot output?

Thanks,
Michelle

@sbslee
Copy link
Owner

sbslee commented Jul 28, 2021

@mishkb,

Thanks for the great suggestion! I just added the pretty_taxa argument to the heatmap method in the 1.11.0-dev branch (the official release of 1.11.0 will be made within 10 days; I aim to make a Dokdo release every month). When you set pretty_taxa=True it will display only the smallest taxonomic rank like how it's done in the taxa_abundance_bar_plot method with the legend_short argument:

import dokdo
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set()

fig, [ax1, ax2] = plt.subplots(1, 2, figsize=(15, 7))

qza_file = '/Users/sbslee/Desktop/dokdo/data/moving-pictures-tutorial/table-l3.qza'

dokdo.heatmap(qza_file,
              normalize='log10',
              flip=True,
              ax=ax1)

dokdo.heatmap(qza_file,
              normalize='log10',
              flip=True,
              pretty_taxa=True,
              ax=ax2)

plt.tight_layout()
plt.savefig('heatmap.png')

heatmap

Is this what you had in your mind? Let me know if you need more help.

@sbslee sbslee added the enhancement New feature or request label Jul 28, 2021
@mishkb
Copy link
Author

mishkb commented Jul 28, 2021

This is wonderful - thanks for your prompt response. Based on your figures, this is just what I was looking for. I'll give it a whirl today.

@mishkb
Copy link
Author

mishkb commented Jul 28, 2021

OK @sbslee ,I've been able to implement that new function - thank you! Now I'm wondering...

  1. is possible to change the x-label to another label from the metadata file (e.g. at the moment the heatmap uses the first column as the label [#Sample_ID or similar], but I'd like to use an alternate column i.e. Sample_name.
  2. Is it possible to control the number of taxa that are plotted in the heatmap? I'd really only like to visualize to top taxa (20-50 say), as is possible for the taxa_abundance_bar_plot

Michelle

@sbslee
Copy link
Owner

sbslee commented Jul 29, 2021

@mishkb,

Thanks for the great ideas. I think I can implement both fairly easily, but let's take one step at a time. I just added the label_columns parameter to the dokdo.heatmap method. It works the same way as it does for the dokdo. taxa_abundance_bar_plot method.

import dokdo
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set()

fig, [ax1, ax2] = plt.subplots(1, 2, figsize=(15, 7))

qza_file = '/Users/sbslee/Desktop/dokdo/data/moving-pictures-tutorial/table-l3.qza'
metadata_file = '/Users/sbslee/Desktop/dokdo/data/moving-pictures-tutorial/sample-metadata.tsv'

dokdo.heatmap(qza_file,
              metadata=metadata_file,
              normalize='log10',
              flip=True,
              pretty_taxa=True,
              ax=ax1)

dokdo.heatmap(qza_file,
              metadata=metadata_file,
              normalize='log10',
              flip=True,
              pretty_taxa=True,
              label_columns=['sample-id', 'body-site'],
              ax=ax2)

plt.tight_layout()

heatmap

If you are happy with above, we can move on to the next (i.e. showing top taxa). Let me know what you think.

@mishkb
Copy link
Author

mishkb commented Jul 29, 2021

Works exactly as I was hoping - thanks @sbslee!
Does the heatmap function automatically align labels vertically or horizontally depending on the figure size/space available? (I think I have to look more at the plt.tight_layout() function and matplotlib.

@sbslee
Copy link
Owner

sbslee commented Jul 29, 2021

@mishkb,

Glad to hear that works for you. As for the rotation of tick labels, AFAIK, it is automatically determined by the seaborn.heatmap method (which is used by dokdo.heatmap). BTW, you can easily change the rotation of tick labels in the following way:

ax = dokdo.heatmap(...)

for ticklabel in ax.get_xticklabels():
    ticklabel.set_rotation(90)

It sounds like we are good to move on to the next task: showing only the top taxa. Stay tuned.

@mishkb
Copy link
Author

mishkb commented Jul 29, 2021

@sbslee I have just realised the 'prettified names" function actually truncates species names to just show the s__name, not g__Genus; s__species name. I just noticed this in the taxa_abundance_bar_plot plot and obviously, it also is the same now in heatmaps if I use that option.

Is there a way to return the last two ranks - so in your example,

>>> a = 'd__Bacteria;p__Actinobacteriota;c__Actinobacteria;o__Actinomycetales;f__Actinomycetaceae;g__Actinomyces;s__Schaalia_radingae'
>>> print(dokdo.pname(a))
g__Actinomyces;s__Schaalia_radingae

I am using taxonomy from NCBI that separates the genus names and species names - so with the 'prettified names', it is impossible to tell which genus the species come from.

(BTW, I can make this a separate issue if that is better).

@sbslee
Copy link
Owner

sbslee commented Jul 29, 2021

@mishkb,

No worries about opening another issue. And thanks for the great suggestion. I myself have come across that issue before. The good news is I just updated the pname method to allow returning of more than one taxonomic rank:

import dokdo
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set()

a = 'd__Bacteria;p__Actinobacteriota;c__Actinobacteria;o__Actinomycetales;f__Actinomycetaceae;g__Actinomyces;s__Schaalia_radingae'
dokdo.pname(a)

Will produce the usual:

's__Schaalia_radingae'

However, if you provide the levels option:

a = 'd__Bacteria;p__Actinobacteriota;c__Actinobacteria;o__Actinomycetales;f__Actinomycetaceae;g__Actinomyces;s__Schaalia_radingae'
dokdo.pname(a, levels=[6, 7])

It will give what you want:

'g__Actinomyces;s__Schaalia_radingae'

I extended this to the heatmap and taxa_abundance_bar_plot methods by adding the pname_kws argument to both methods:

fig, [ax1, ax2] = plt.subplots(1, 2, figsize=(15, 7))

qza_file = '/Users/sbslee/Desktop/dokdo/data/moving-pictures-tutorial/table-l3.qza'

dokdo.heatmap(qza_file,
              normalize='log10',
              flip=True,
              ax=ax1)

dokdo.heatmap(qza_file,
              normalize='log10',
              flip=True,
              pretty_taxa=True,
              pname_kws=dict(levels=[2, 3]),
              ax=ax2)

plt.tight_layout()

heatmap

qzv_file = '/Users/sbslee/Desktop/dokdo/data/moving-pictures-tutorial/taxa-bar-plots.qzv'
dokdo.taxa_abundance_bar_plot(qzv_file,
                              figsize=(10, 7),
                              level=6,
                              count=8,
                              legend_short=True,
                              pname_kws=dict(levels=[5, 6]),
                              artist_kwargs=dict(show_legend=True,
                                                 legend_loc='upper left'))
plt.tight_layout()

barplot

Updated documentations:

pname: https://dokdo.readthedocs.io/en/1.11.0-dev/dokdo_api.html#module-dokdo.api.pname
heatmap: https://dokdo.readthedocs.io/en/1.11.0-dev/dokdo_api.html#heatmap
taxa_abundance_bar_plot: https://dokdo.readthedocs.io/en/1.11.0-dev/dokdo_api.html#taxa-abundance-bar-plot

Does this work for you?

@mishkb
Copy link
Author

mishkb commented Jul 29, 2021

That looks great to me @sbslee - I hope yourself and others will find it useful as well. I really appreciate you making these adjustments and answering my questions.

@sbslee
Copy link
Owner

sbslee commented Jul 30, 2021

@mishkb,

No problem! I also added the count argument to the heatmap method; it's similar to the count argument in the taxa_abundance_bar_plot method:

import dokdo
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set()

fig, [ax1, ax2] = plt.subplots(1, 2, figsize=(15, 7))

qza_file = '/Users/sbslee/Desktop/dokdo/data/moving-pictures-tutorial/table-l3.qza'
metadata_file = '/Users/sbslee/Desktop/dokdo/data/moving-pictures-tutorial/sample-metadata.tsv'

dokdo.heatmap(qza_file,
              metadata=metadata_file,
              normalize='log10',
              flip=True,
              pretty_taxa=True,
              ax=ax1)

dokdo.heatmap(qza_file,
              metadata=metadata_file,
              normalize='log10',
              flip=True,
              pretty_taxa=True,
              count=20,
              ax=ax2)

plt.tight_layout()

heatmap

Let me know if this is what you wanted.

@mishkb
Copy link
Author

mishkb commented Jul 30, 2021

Fantastic @sbslee! This has worked perfectly with my data. Thank you again for adding those options to your code. I'm so grateful that I found Dokdo and that you are still actively working on it.

@sbslee sbslee closed this as completed Jul 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants