Questions about legend and colorbar settings #3

Closed
faridrashidi opened this issue Jan 29, 2023 · 21 comments

@faridrashidi

Dear Wubin,

Thank you very much for creating this incredibly useful package. I have three questions regarding some settings.

  1. How can I set frameon to False for categorical legends?
  2. How can I set colorbar attributes such as aspect or orientation?
  3. Apparently, if row_split or col_split is not set to None, the axes are visible. How can one turn them off in this case?

Thank you in advance for your time.

@DingWB
Owner

DingWB commented Jan 29, 2023

Thank you for your questions. Those are very good questions.
(1). For the legend keywords, the AnnotationBase subclasses (such as anno_label, anno_simple, anno_scatterplot, etc.) accept a legend_kws parameter. Let's take an example:

row_ha = HeatmapAnnotation(label=anno_label(df.AB, merge=True,rotation=15),
                           AB=anno_simple(df.AB,add_text=True),axis=1,
                           CD=anno_simple(df.CD,add_text=True,colors={'C':'red','D':'green','G':'blue'},
                                            legend_kws={'frameon':False}),
                           Exp=anno_boxplot(df_box, cmap='turbo'),
                           Scatter=anno_scatterplot(df_scatter), TMB_bar=anno_barplot(df_bar),
                           )

In the above example, I added legend_kws to CD=anno_simple(df.CD, ...), but legend_kws only works for categorical legends.

(2). Similarly, you can also add the parameters orientation and aspect to legend_kws.
(3). What do you mean by "turn off"? If you don't need row_split or col_split, just set them to None. If you want to remove the gap between split rows and columns, you can set row_split_gap or col_split_gap in ClusterMapPlotter.
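
To illustrate what frameon=False in point (1) changes, here is a minimal sketch in plain matplotlib. It assumes (this is not confirmed in this thread) that the legend_kws entries are passed on to matplotlib's legend; the figure size and category colors are only illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.patches import Patch

# Same category-to-color mapping as the CD annotation in the example above.
colors = {"C": "red", "D": "green", "G": "blue"}
handles = [Patch(facecolor=color, label=name) for name, color in colors.items()]

fig, ax = plt.subplots(figsize=(2, 2))
ax.set_axis_off()
# frameon=False removes the box drawn around the legend entries.
legend = ax.legend(handles=handles, title="CD", frameon=False, loc="center")
```

Comparing this with frameon=True on the same handles shows the only difference is the surrounding box.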

I hope these answers will be helpful to you. Thanks.

@faridrashidi
Author

Thanks @DingWB for your quick response. A few more questions:

Regarding (2), I added legend_kws={'orientation':"horizontal"} to Exp in the above example, but I don't know why it looks like the following. Do you have any suggestions?

[screenshot: SCR-20230128-tk0]

In (3), I meant setting the visibility of the axes spines to False. For instance, in the second cell of the provided Jupyter notebook example ("2. A quick example"), if you change to col_split=None and row_split=None, the axes spines appear.

@faridrashidi
Author

faridrashidi commented Jan 29, 2023

Two more quick questions and I would really appreciate your suggestion:

  1. Does PyComplexHeatmap support a discrete matrix, similar to the following example from R ComplexHeatmap?

  2. Can PyComplexHeatmap create annotation text labels similar to the following example from R ComplexHeatmap, i.e. only a few important rows are annotated with text?

[image]

@DingWB
Owner

DingWB commented Jan 29, 2023

  1. You can use anno_simple to plot discrete data.
  2. anno_label is what you are looking for.

@DingWB
Owner

DingWB commented Jan 29, 2023

For the horizontal legend, I haven't implemented it yet. But you can plot the legend separately, just as I showed in example.ipynb, using "horizontal" in legend_kws.
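
As a stopgap, a horizontal colorbar can also be drawn on its own with plain matplotlib. The sketch below is not part of PyComplexHeatmap; the value range (-2 to 2) and the label "Exp" are assumptions chosen to resemble the example, and they would need to match the limits actually used in the plot.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

# A wide, short axes that holds nothing but the colorbar.
fig, cax = plt.subplots(figsize=(3, 0.6))

# The mappable carries the colormap and the data range; no image is needed.
mappable = ScalarMappable(norm=Normalize(vmin=-2, vmax=2), cmap="turbo")
cbar = fig.colorbar(mappable, cax=cax, orientation="horizontal")
cbar.set_label("Exp")
```

Calling fig.savefig(...) with bbox_inches="tight" afterwards would give a small standalone file that can be placed next to the heatmap.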

For the axes spines, could you please give me an example? How can I reproduce it? It would be better if you could provide your code.

@faridrashidi
Author

faridrashidi commented Jan 29, 2023

Here's an example:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from PyComplexHeatmap import *

df = pd.DataFrame(['AAAA1'] * 5 + ['BBBBB2'] * 5, columns=['AB'])
df['CD'] = ['C'] * 3 + ['D'] * 3 + ['G'] * 4
df['EF'] = ['E'] * 6 + ['F'] * 2 + ['H'] * 2
df['F'] = np.random.normal(0, 1, 10)
df.index = ['sample' + str(i) for i in range(1, df.shape[0] + 1)]
df_box = pd.DataFrame(np.random.randn(10, 4), columns=['Gene' + str(i) for i in range(1, 5)])
df_box.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
df_bar = pd.DataFrame(np.random.uniform(0, 10, (10, 2)), columns=['TMB1', 'TMB2'])
df_bar.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
df_scatter = pd.DataFrame(np.random.uniform(0, 10, 10), columns=['Scatter'])
df_scatter.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
df_heatmap = pd.DataFrame(np.random.randn(50, 10), columns=['sample' + str(i) for i in range(1, 11)])
df_heatmap.index = ["Fea" + str(i) for i in range(1, df_heatmap.shape[0] + 1)]
df_heatmap.iloc[1, 2] = np.nan

plt.figure(figsize=(6, 12))
row_ha = HeatmapAnnotation(label=anno_label(df.AB, merge=True,rotation=15),
                           AB=anno_simple(df.AB,add_text=True),axis=1,
                           CD=anno_simple(df.CD,add_text=True),
                           Exp=anno_boxplot(df_box, cmap='turbo'),
                           Scatter=anno_scatterplot(df_scatter), TMB_bar=anno_barplot(df_bar),
                           )
no_spine = ClusterMapPlotter(data=df_heatmap, top_annotation=row_ha, col_split=2, row_split=3, col_split_gap=0.5,
                     row_split_gap=1,label='values',row_dendrogram=True,show_rownames=True,show_colnames=True,
                     tree_kws={'row_cmap': 'Dark2'})

plt.figure(figsize=(6, 12))
with_spine = ClusterMapPlotter(data=df_heatmap, top_annotation=row_ha, col_split=None, row_split=None, col_split_gap=0.5,
                     row_split_gap=1,label='values',row_dendrogram=True,show_rownames=True,show_colnames=True,
                     tree_kws={'row_cmap': 'Dark2'})

@DingWB
Owner

DingWB commented Jan 29, 2023

I see. I understand what you mean now. You want to remove the border of the heatmap. To simplify, you can add one line of code: with_spine.ax_heatmap.set_axis_off().

I also modified my code so that ax_heatmap.set_axis_off() is applied by default when row_split and col_split are None. So, you can either add with_spine.ax_heatmap.set_axis_off() or uninstall the package (pip uninstall PyComplexHeatmap) and reinstall it (pip install git+https://github.com/DingWB/PyComplexHeatmap).
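
For reference, here is a small plain-matplotlib sketch of the two ways to get rid of the border. It uses a random 5x5 mesh as a stand-in for the heatmap axes; with PyComplexHeatmap, the same calls would be made on ax_heatmap as described above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
fig, (ax_off, ax_spines) = plt.subplots(1, 2, figsize=(6, 3))
for ax in (ax_off, ax_spines):
    ax.pcolormesh(rng.standard_normal((5, 5)), cmap="viridis")

# Option 1: hide everything the axes draws (spines, ticks, tick labels).
ax_off.set_axis_off()

# Option 2: hide only the border lines and keep the ticks and labels.
for spine in ax_spines.spines.values():
    spine.set_visible(False)
```

Option 2 is the one to use if row or column tick labels are drawn on the same axes and should stay visible.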

BTW, in my example, in anno_label, the line of anno_label is longer than that in your plot. Could you please help me check the reason? Did you set

plt.rcParams['figure.dpi'] = 100
plt.rcParams['savefig.dpi']=300

When you save it as a PDF file, does it look normal in the PDF? If it still doesn't work, you can also change the height parameter to 10 (for example) in label=anno_label(df.AB, merge=True,rotation=15,height=10).

[image]

@DingWB
Owner

DingWB commented Jan 29, 2023

If the color bars overlap with each other in the figure legend area, you can increase the gap between legends by adding the parameter legend_gap (for example, legend_gap=10; the unit is mm) to ClusterMapPlotter. This might help with the issue in #3 (comment).
[image]

@faridrashidi
Author

faridrashidi commented Jan 29, 2023

The issue is not because of the overlap. It's because when you use horizontal mode it doesn't rotate it. Here is an example:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from PyComplexHeatmap import *

df = pd.DataFrame(['AAAA1'] * 5 + ['BBBBB2'] * 5, columns=['AB'])
df['CD'] = ['C'] * 3 + ['D'] * 3 + ['G'] * 4
df['EF'] = ['E'] * 6 + ['F'] * 2 + ['H'] * 2
df['F'] = np.random.normal(0, 1, 10)
df.index = ['sample' + str(i) for i in range(1, df.shape[0] + 1)]
df_box = pd.DataFrame(np.random.randn(10, 4), columns=['Gene' + str(i) for i in range(1, 5)])
df_box.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
df_bar = pd.DataFrame(np.random.uniform(0, 10, (10, 2)), columns=['TMB1', 'TMB2'])
df_bar.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
df_scatter = pd.DataFrame(np.random.uniform(0, 10, 10), columns=['Scatter'])
df_scatter.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
df_heatmap = pd.DataFrame(np.random.randn(50, 10), columns=['sample' + str(i) for i in range(1, 11)])
df_heatmap.index = ["Fea" + str(i) for i in range(1, df_heatmap.shape[0] + 1)]
df_heatmap.iloc[1, 2] = np.nan

plt.figure(figsize=(6, 12))
row_ha = HeatmapAnnotation(label=anno_label(df.AB, merge=True,rotation=15),
                           AB=anno_simple(df.AB,add_text=True),axis=1,
                           CD=anno_simple(df.CD,add_text=True, legend_kws={"labelcolor": "black"}),
                           Exp=anno_boxplot(df_box, cmap='turbo', legend_kws={'orientation':"horizontal"}),
                           Scatter=anno_scatterplot(df_scatter), TMB_bar=anno_barplot(df_bar),
                           )
no_spine = ClusterMapPlotter(data=df_heatmap, top_annotation=row_ha, row_split=5, col_split_gap=0.5,
                     row_split_gap=1)

[image]

Changing its aspect doesn't work either, meaning the colorbar width cannot be changed at all. Perhaps that's because, as you mentioned, you haven't implemented this feature yet.

It also seems that when you set some attributes in legend_kws, such as labelcolor, the default is not overridden. See the example above.

@DingWB
Owner

DingWB commented Jan 29, 2023

Yes, I haven't implemented the horizontal figure legend. This is why I suggested you plot the figure legend separately using row_ha.plot_legends() and rotate it yourself in the pdf editor.
For the second question, there is a parameter in legend_kws named color_text (default True), which means the color of the label text will be the same as the marker in the legend. If you want to use a custom color instead of the marker color, you should first turn off color_text and then set labelcolor. For example, you can set

legend_kws={"color_text": False}  # if color_text is False, 'black' is the default

# Or, if you want to use another custom color (such as blue) instead of black, set

legend_kws={'color_text':False,'labelcolor':'blue'}

Or would you happen to have any suggestions about how to make the labelcolor of legend more convenient for users?

By the way, would you like to contribute to this package? If so, you could help me implement the horizontal figure legend.

@faridrashidi
Author

faridrashidi commented Jan 29, 2023

I see, got it. I didn't know there is color_text. I think it's already convenient so don't have any suggestions.

I'm trying to understand the code a bit better which may take a while but I'd be happy to contribute to this package later.

There are also a couple of things for which I haven't found a solution. For example,

  1. When you set vmin and vmax in anno_simple, the settings are not reflected in the colorbar:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from PyComplexHeatmap import *

df = pd.DataFrame(['AAAA1'] * 5 + ['BBBBB2'] * 5, columns=['AB'])
df['CD'] = ['C'] * 3 + ['D'] * 3 + ['G'] * 4
df['EF'] = ['E'] * 6 + ['F'] * 2 + ['H'] * 2
df['F'] = np.random.normal(0, 1, 10)
df.index = ['sample' + str(i) for i in range(1, df.shape[0] + 1)]
df_box = pd.DataFrame(np.random.randn(10, 4), columns=['Gene' + str(i) for i in range(1, 5)])
df_box.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
df_bar = pd.DataFrame(np.random.uniform(0, 10, (10, 2)), columns=['TMB1', 'TMB2'])
df_bar.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
df_scatter = pd.DataFrame(np.random.uniform(0, 10, 10), columns=['Scatter'])
df_scatter.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
df_heatmap = pd.DataFrame(np.random.randn(50, 10), columns=['sample' + str(i) for i in range(1, 11)])
df_heatmap.index = ["Fea" + str(i) for i in range(1, df_heatmap.shape[0] + 1)]
df_heatmap.iloc[1, 2] = np.nan

plt.figure(figsize=(6, 12))
row_ha = HeatmapAnnotation(label=anno_label(df.AB, merge=True,rotation=15),
                           AB=anno_simple(df.AB,add_text=True),axis=1,
                           CD=anno_simple(df.CD,add_text=True),
                           Exp=anno_simple(df_box, cmap='turbo', vmin=0, vmax=1),
                           Scatter=anno_scatterplot(df_scatter), TMB_bar=anno_barplot(df_bar),
                           )
no_spine = ClusterMapPlotter(data=df_heatmap, top_annotation=row_ha, row_split=5, col_split_gap=0.5,
                     row_split_gap=1)

  2. There is no way to change the width of a color bar. Say, in the above example, if I want to narrow all the color bars, is there any solution?

There are a couple more; I'll try to write them up in this issue later.

@DingWB
Owner

DingWB commented Jan 29, 2023

That is an excellent question. Please try:

anno_simple(df_box.Gene1,vmin=0,vmax=1,legend_kws={'vmin':0,'vmax':1}),

vmin and vmax control the heatmap, while legend_kws={'vmin':0,'vmax':1} controls the range in the figure legend.
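
The reason two settings are needed can be seen in plain matplotlib: a colorbar built from its own mappable does not inherit the limits of the drawn cells. The sketch below is an illustration under that assumption, not a description of PyComplexHeatmap's internals; the sample values are made up.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

values = np.array([[-1.5, 0.2, 0.8, 2.0]])

fig, (ax, cax) = plt.subplots(
    1, 2, figsize=(4, 1.5), gridspec_kw={"width_ratios": [10, 1]}
)
# The drawn cells: values below 0 or above 1 saturate at the ends of the colormap.
mesh = ax.pcolormesh(values, cmap="turbo", vmin=0, vmax=1)

# The colorbar is built from a separate mappable, so it needs the same limits.
legend_mappable = ScalarMappable(norm=Normalize(vmin=0, vmax=1), cmap="turbo")
cbar = fig.colorbar(legend_mappable, cax=cax)
```

If the two ranges differ, the colors in the legend no longer correspond to the colors in the cells, so they should normally be set to the same values.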

@DingWB
Owner

DingWB commented Jan 29, 2023

Since you are already familiar with almost all of the parameters, could you please help me with the documentation when you have time later, if you want to contribute? @faridrashidi

@faridrashidi
Author

faridrashidi commented Jan 29, 2023

Sure, wherever I find something missing, I'll update it.

Regarding my question in #3 (comment), suppose you have the following data. When I plot it, the values are treated as continuous, although they are only 0, 1 and 2. Is there any trick to make the heatmap discrete?

import numpy as np
import pandas as pd
from PyComplexHeatmap import *

data = pd.DataFrame(
    np.random.randint(0, 3, (5, 5)),
    columns=["a", "b", "c", "d", "e"],
    index=["c1", "c2", "c3", "c4", "c5"],
)

ClusterMapPlotter(data=data)

Also, I still don't have a solution for this:
There is no way to change the width of a color bar. Say in the above example if I want to narrow down all color bar widths, is there any solution?

@DingWB
Owner

DingWB commented Jan 29, 2023

Maybe you can try:

data = pd.DataFrame(
    np.random.randint(0, 3, (5, 5)),
    columns=["a", "b", "c", "d", "e"],
    index=["c1", "c2", "c3", "c4", "c5"],
)
plt.figure()
row_ha = HeatmapAnnotation(df=data,plot=True,plot_legend=False)
plt.show()
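
Another possible route, sketched here in plain matplotlib rather than with PyComplexHeatmap, is a discrete colormap: a ListedColormap with one color per value plus a BoundaryNorm whose bin edges sit halfway between the integers. The three colors are arbitrary choices, and whether ClusterMapPlotter accepts such a cmap and norm is not confirmed in this thread.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import BoundaryNorm, ListedColormap

rng = np.random.default_rng(0)
data = rng.integers(0, 3, size=(5, 5))  # values are only 0, 1 and 2

# One color per discrete value.
cmap = ListedColormap(["lightgray", "orange", "firebrick"])
# Bin edges halfway between the integers map each value to its own color.
norm = BoundaryNorm([-0.5, 0.5, 1.5, 2.5], cmap.N)

fig, ax = plt.subplots(figsize=(3, 3))
mesh = ax.pcolormesh(data, cmap=cmap, norm=norm)
# Ticks at the integers give the colorbar three labeled blocks.
cbar = fig.colorbar(mesh, ax=ax, ticks=[0, 1, 2])
```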

@faridrashidi
Author

Yeah, that's an option, but I thought maybe PyComplexHeatmap could automatically detect discrete/object values and create a discrete legend as well.

Btw, it would be nice to have a function that returns the example datasets you often use, similar to seaborn; see https://seaborn.pydata.org/generated/seaborn.load_dataset.html

@faridrashidi
Author

faridrashidi commented Jan 29, 2023

Is there any way to move the title of the color bar from its right side to its left? Because the tick labels are on the right, it might look nicer if the title could be placed on the left.

@DingWB
Owner

DingWB commented Jan 29, 2023

For the width of the legend in #3 (comment): the width is hard-coded to 4.5 mm. Is it necessary to turn the width into a parameter?

The label side of the colormap legend is also hard-coded to be the same as legend_side. By default, legend_side='right', so the default label side for the colormap legend title is also right. But you can change it to the left using the following script (for example, if you want to change the label side to the left for Exp):

# for example, cm is the output of ClusterMapPlotter.
for ax in cm.legend_axes[0].figure.axes: #cm.ax.figure.axes
    if ax.get_ylabel() == "Exp":
        #ax.yaxis.set_ticks_position('left') #you can also change the tick to the left
        ax.yaxis.set_label_position('left')

@DingWB
Owner

DingWB commented Jan 29, 2023

@faridrashidi I have already added a parameter legend_width into the ClusterMapPlotter and HeatmapAnnotation. Please uninstall the package and install it again. See the legend_gap and legend_width in updated examples.ipynb

@faridrashidi
Author

Wow, that's so nice!

I followed your suggestion for annotating rows or columns with text in #3 (comment) and came up with the following example. You may add it to examples.ipynb if you'd like.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from PyComplexHeatmap import *

df = pd.DataFrame(['AAAA1'] * 5 + ['BBBBB2'] * 5, columns=['AB'])
df['CD'] = ['C'] * 3 + ['D'] * 3 + ['G'] * 4
df['EF'] = ['E'] * 6 + ['F'] * 2 + ['H'] * 2
df['F'] = np.random.normal(0, 1, 10)
df.index = ['sample' + str(i) for i in range(1, df.shape[0] + 1)]
df_box = pd.DataFrame(np.random.randn(10, 4), columns=['Gene' + str(i) for i in range(1, 5)])
df_box.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
df_bar = pd.DataFrame(np.random.uniform(0, 10, (10, 2)), columns=['TMB1', 'TMB2'])
df_bar.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
df_scatter = pd.DataFrame(np.random.uniform(0, 10, 10), columns=['Scatter'])
df_scatter.index = ['sample' + str(i) for i in range(1, df_box.shape[0] + 1)]
df_heatmap = pd.DataFrame(np.random.randn(50, 10), columns=['sample' + str(i) for i in range(1, 11)])
df_heatmap.index = ["Fea" + str(i) for i in range(1, df_heatmap.shape[0] + 1)]
df_heatmap.iloc[1, 2] = np.nan


df_rows = pd.DataFrame(index=df_heatmap.index)
df_rows['selected'] = np.where(df_heatmap['sample4'] > 0.5, df_heatmap.index, None)


plt.figure(figsize=(6, 12))
row_ha = HeatmapAnnotation(selected=anno_label(df_rows.selected, colors="black"), AB=anno_simple(df_heatmap.sample1), axis=0)
ClusterMapPlotter(data=df_heatmap, left_annotation=row_ha, row_split=5, col_split_gap=0.5, row_split_gap=1)
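
The key step in the example above is the np.where call that builds the label column. Here is a tiny standalone sketch of it with made-up scores, so the result is easy to inspect; the feature names and the 0.5 threshold simply mirror the example.

```python
import numpy as np
import pandas as pd

# Hypothetical per-row scores, standing in for df_heatmap['sample4'].
scores = pd.Series([0.9, -0.3, 0.6, 0.1], index=["Fea1", "Fea2", "Fea3", "Fea4"])

# Keep the row name where the condition holds; otherwise leave the entry missing.
selected = pd.Series(np.where(scores > 0.5, scores.index, None), index=scores.index)

# Only the rows that passed the threshold carry a label.
labeled_rows = selected.dropna().tolist()
```

The example above passes such a column to anno_label and relies on the missing entries being left unlabeled.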

@DingWB
Owner

DingWB commented Jan 30, 2023

Thank you, this is an excellent example. I have modified it and added it to the example.ipynb and documentation (see anno_label section in https://dingwb.github.io/PyComplexHeatmap/build/html/notebooks/examples.html#anno_label:).
