Adding orders of magnitude to fractional values in plot_bar_stacked #43

magruca · 2018-11-27T15:39:39Z

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

type: bug

type: feature

Environment:

Chartify version(s): beta (pip install)
Operating System(s): Linux 4.19.3-300.fc29.x86_64 x86_64
Python version(s): 3.7

What happened:
I'm getting an extra two orders of magnitude (or more if I add a decimal place to the tick format) for non-integer values in stacked bar charts.

What you expected to happen:
When setting the y-axis tick format to '0%', I should get percentage values ranging from 0% to 100%.

How to reproduce it (as minimally and precisely as possible):
Plot all chromosome data using stacked bar plot

RdGy = chartify.color_palettes['RdGy']
shifted_RdGy = RdGy.shift_palette('black', percent=20)
shifted_RdGy.show()

(chartify.Chart(blank_labels=True,
                x_axis_type='categorical')
 .style.set_color_palette('diverging', palette=shifted_RdGy)
 .plot.bar_stacked(
     data_frame=results,
     categorical_columns='#ID',
     categorical_order_by='labels',
     categorical_order_ascending=True,
     numeric_column='Covered_percent',
     stack_column='sample',
     normalize=False)
 .set_legend_location('outside_right', orientation='vertical')
 .axes.set_yaxis_tick_format('0.0%')
 .axes.set_xaxis_tick_orientation('vertical')
 .show('png'))

Anything else we need to know?:
Included the df and ipynb if you want to replicate.
Allen_concatData.txt
pileup.zip

cphalpert · 2018-11-28T15:22:22Z

Hi @magruca

It looks like percent formatting is working as expected (see screenshot below).

Percents are expected to be expressed as decimals with 0.0 - 1.0 mapping to 0% to 100%. E.g. 0.1 will be displayed as 10%.

In your case you can divide the Covered percent by 100 to get it to work.
results['Covered_percent'] = results['Covered_percent'] / 100.
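
As a rough illustration of the fraction convention described above, the sketch below uses Python's built-in `%` format type, which also multiplies by 100 before appending the percent sign. This is only an analogy: it is an assumption that the chart's tick formatter behaves the same way for these inputs, and the three values are taken from the `Covered_percent` sample later in this thread.

```python
# Values as stored in Covered_percent, i.e. already on a 0-100 scale.
raw_values = [6.91, 5.53, 5.38]

# Percent formatting multiplies by 100, so 0-100 inputs gain two orders of magnitude.
as_is = [format(v, ".1%") for v in raw_values]

# Dividing by 100 first puts the inputs on the expected 0.0-1.0 scale.
rescaled = [format(v / 100, ".1%") for v in raw_values]

print(as_is)     # ['691.0%', '553.0%', '538.0%']
print(rescaled)  # ['6.9%', '5.5%', '5.4%']
```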

magruca · 2018-11-28T15:56:55Z

The problem I'm having isn't with the normal bar plot but with the stacked bar plot. Your explanation of the percent function makes sense, and the base functions might be the same between both plots -- I haven't looked into it -- but with or without the percent, I'm getting an extra couple orders of magnitude:

Code (same as above, just removing the axis tick format):

(chartify.Chart(blank_labels=True,
                x_axis_type='categorical')
 .style.set_color_palette('diverging', palette=shifted_RdGy)
 .plot.bar_stacked(
     data_frame=results,
     categorical_columns='#ID',
     categorical_order_by='labels',
     categorical_order_ascending=True,
     numeric_column='Covered_percent',
     stack_column='sample',
     normalize=False)
 .set_legend_location('outside_right', orientation='vertical')
 .axes.set_xaxis_tick_orientation('vertical')
 .show('png'))

with Covered_percent data formatted as such:

6.91
5.53
5.38
4.03
4.89

I do not have the same issue with a standard bar plot:

RdGy = chartify.color_palettes['RdGy']
shifted_RdGy = RdGy.shift_palette('black', percent=20)
shifted_RdGy.show()

color_order = [
    'SRR1105736', 'SRR1224574', 'SRR1105738', 'SRR1105739',
    'SRR1105740', 'SRR1105737', 'SRR1105741', 'SRR1224573',
]
sample_order = [
    'SRR1105736', 'SRR1224574', 'SRR1105738', 'SRR1105739',
    'SRR1105740', 'SRR1105737', 'SRR1105741', 'SRR1224573',
]

ch = chartify.Chart(blank_labels=True, y_axis_type='categorical')
ch.style.set_color_palette('sequential', palette=shifted_RdGy)
ch.axes.set_yaxis_label('Sample ID')
ch.axes.set_xaxis_label('Chr1 Coverage (%)')
ch.plot.bar(
    data_frame=results,
    categorical_columns='sample',
    numeric_column='Covered_percent',
    categorical_order_by=sample_order,
    categorical_order_ascending=False,
    color_column='sample',
    color_order=color_order)
ch.axes.set_xaxis_tick_format('0%')
ch.show('png')

Thanks for the help!

cphalpert · 2018-11-28T16:08:43Z

Are you sure that it's not an issue with the input data?

Summing over ID in your sample data gives me totals in the 100-200 range, which is consistent with what's shown in the stacked bar graph that you shared:

results.groupby('#ID')['Covered_percent'].sum()
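
To make the summing behaviour concrete, here is a small plain-Python sketch. The rows and sample names are made up for illustration and are not taken from the attached data. It compares the total height of each stack with the largest single segment in it.

```python
from collections import defaultdict

# Hypothetical rows: (chromosome ID, sample, covered percent on a 0-100 scale).
rows = [
    ("chr1", "sample_a", 32.0),
    ("chr1", "sample_b", 30.0),
    ("chr1", "sample_c", 25.0),
    ("chr2", "sample_a", 20.0),
    ("chr2", "sample_b", 15.0),
]

stack_height = defaultdict(float)     # what a stacked bar shows: the sum per ID
tallest_segment = defaultdict(float)  # the largest single sample value per ID
for chrom, _sample, pct in rows:
    stack_height[chrom] += pct
    tallest_segment[chrom] = max(tallest_segment[chrom], pct)

print(dict(stack_height))     # {'chr1': 87.0, 'chr2': 35.0}
print(dict(tallest_segment))  # {'chr1': 32.0, 'chr2': 20.0}
```

With several samples per ID, the stack total can easily exceed 100 even though no single value does.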

magruca · 2018-11-28T16:48:20Z

Ah, I see, thank you! I was imagining a stacked bar plot in which each categorical value's maximum is set by its largest sample value (e.g. 32% for chr1 in this case), with the bars layered behind one another as if ordered along the z-axis, rather than a plot in which the sample values are summed.

cphalpert · 2018-11-28T17:22:38Z

You could use this if the order of the largest sample is consistent across #ID:

sample_order = results.groupby('sample')['Covered_percent'].max().sort_values().index
xaxis_order = ['chr' + str(number) for number in range(1, 21)] + ['chrX', 'chrY']

results = results.sort_values(['#ID', 'Covered_percent'])
# transform keeps the original row index, so the result aligns with `results` on assignment
results['incremental_percent'] = results.groupby('#ID')['Covered_percent'].transform(lambda x: x - x.shift().fillna(0))

# Plot all chromosome data using stacked bar plot

RdGy = chartify.color_palettes['RdGy']
shifted_RdGy = RdGy.shift_palette('black', percent=20)
shifted_RdGy.show()

ch = (chartify.Chart(blank_labels=True,
                x_axis_type='categorical')
 .style.set_color_palette('diverging', palette=shifted_RdGy)
     )

ch.plot.bar_stacked(
     data_frame=results,
     categorical_columns='#ID',
     categorical_order_by=xaxis_order,
     numeric_column='incremental_percent',
     stack_column='sample',
     stack_order=sample_order,
     normalize=False)
ch.set_legend_location('outside_right', orientation='vertical')
# ch.axes.set_yaxis_tick_format('0%')
ch.axes.set_xaxis_tick_orientation('vertical')
ch.show()
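
The incremental trick above can be checked on a single group with plain Python. This is a minimal sketch with made-up numbers, and `incremental` is a hypothetical helper name rather than part of chartify or pandas. Each sorted value is replaced by its difference from the previous one, so the stacked segments add up to the group's maximum instead of its sum.

```python
def incremental(values):
    """Return the ascending-sorted values as increments over the previous value."""
    ordered = sorted(values)
    previous = [0.0] + ordered[:-1]  # the first value is measured from zero
    return [round(cur - prev, 2) for cur, prev in zip(ordered, previous)]


coverage = [32.0, 25.0, 30.0]  # hypothetical per-sample values for one #ID
steps = incremental(coverage)

print(steps)       # [25.0, 5.0, 2.0]
print(sum(steps))  # 32.0, the largest sample value
```

Because each segment is drawn on top of the smaller ones, this only gives a consistent picture when the samples rank in the same order for every #ID, which is the caveat stated above.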

If the order changes I'd try something like:


RdGy = chartify.color_palettes['RdGy']
shifted_RdGy = RdGy.shift_palette('black', percent=20)
shifted_RdGy.show()

ch = (chartify.Chart(blank_labels=True,
                x_axis_type='categorical')
 .style.set_color_palette('diverging', palette=shifted_RdGy)
     )

ch.plot.parallel(
     data_frame=results,
     categorical_columns='#ID',
     categorical_order_by=xaxis_order,
     numeric_column='Covered_percent',
     color_column='sample',)
ch.set_legend_location('outside_right', orientation='vertical')
# ch.axes.set_yaxis_tick_format('0%')
ch.axes.set_xaxis_tick_orientation('vertical')
ch.show()

magruca · 2018-11-28T18:13:54Z

Gotcha that makes sense -- thanks for the help! Sorry for the confusion on my end.

cphalpert closed this as completed Nov 28, 2018
