plotting fails in pandas==0.25.0 #52

Closed
BioinfoTongLI opened this issue Jul 23, 2019 · 9 comments

@BioinfoTongLI

The same code works in pandas==0.24.0.
After updating to 0.25.0, it fails and returns the following error:

Traceback (most recent call last):
......
  File "/home/tongli/miniconda3/envs/maars/lib/python3.7/site-packages/dabest/_classes.py", line 1235, in plot
    out = EffectSizeDataFramePlotter(self, **all_kwargs)
  File "/home/tongli/miniconda3/envs/maars/lib/python3.7/site-packages/dabest/plotter.py", line 377, in EffectSizeDataFramePlotter
    **group_summary_kwargs)
  File "/home/tongli/miniconda3/envs/maars/lib/python3.7/site-packages/dabest/plot_tools.py", line 162, in gapped_lines
    quantiles = data.groupby(x)[y].quantile([0.25, 0.75])\
  File "/home/tongli/miniconda3/envs/maars/lib/python3.7/site-packages/pandas/core/groupby/groupby.py", line 1908, in quantile
    interpolation=interpolation,
  File "/home/tongli/miniconda3/envs/maars/lib/python3.7/site-packages/pandas/core/groupby/groupby.py", line 2248, in _get_cythonized_result
    func(**kwargs)  # Call func to modify indexer values in place
  File "pandas/_libs/groupby.pyx", line 692, in pandas._libs.groupby.group_quantile
TypeError: must be real number, not list
@josesho josesho added the bug label Jul 24, 2019
@josesho
Member

josesho commented Jul 24, 2019

I also see this error, after upgrading to pandas==0.25.

The last traceback frame pointing to dabest is:

~/anaconda3/envs/bleeding-edge/lib/python3.7/site-packages/dabest/plot_tools.py in gapped_lines(data, x, y, type, offset, ax, line_color, gap_width_percent, **kwargs)
    160 
    161     medians   = data.groupby(x)[y].median().reindex(index=group_order)
--> 162     quantiles = data.groupby(x)[y].quantile([0.25, 0.75])\
    163                                   .unstack()\
    164                                   .reindex(index=group_order)

Something has changed in df.groupby().quantile(). I will let you know when this is fixed. If you squash this earlier, feel free to do a pull request!

Thanks,
Joses
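
A possible interim workaround, sketched here as an assumption rather than as the fix that was eventually merged: request each quantile with a scalar `q` (the traceback above only complains about a list being passed) and assemble the columns by hand. The helper name `grouped_quartiles` and the toy data are hypothetical; the result has the same shape as `.quantile([0.25, 0.75]).unstack()`, so a `.reindex(index=group_order)` could still follow it.

```python
import pandas as pd


def grouped_quartiles(data, x, y, qs=(0.25, 0.75)):
    """Per-group quantiles of column `y`, one column per requested quantile.

    Hypothetical helper: calls groupby(...).quantile() once per scalar q
    instead of passing a list of quantiles.
    """
    grouped = data.groupby(x)[y]
    # Each scalar call returns a Series indexed by group; the dict keys
    # become the column labels of the concatenated frame.
    return pd.concat({q: grouped.quantile(q) for q in qs}, axis=1)


# Toy data, made up for illustration only.
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "value": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
})
quartiles = grouped_quartiles(df, "group", "value")
print(quartiles)
```

Whether the scalar form behaves correctly on every affected pandas release has not been verified in this thread.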

@josesho josesho added the urgent tackle these issues first label Jul 24, 2019
@josesho josesho added this to the v0.2.5 milestone Jul 24, 2019
@BioinfoTongLI
Author

Hello, FYI, this issue is not caused by dabest.
Something broke when groupby.quantile was Cythonized in pandas 0.25.
pandas-dev/pandas#20405 (comment)
pandas-dev/pandas#27526 (comment)
I believe the best short-term solution is to have people stay on pandas 0.24.

@josesho
Member

josesho commented Jul 30, 2019

Ah ok, thanks @BioinfoTongLI for following this up! Let's wait for pandas=0.25.1 to be released.

@schlegelp

Just to add that I am still getting an error (albeit different from OP) with pandas 0.25.1 and dabest 0.2.4 running the iris example in the README:

In [6]: iris_dabest.mean_diff.plot();
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-6-be7fec275da2> in <module>
----> 1 iris_dabest.mean_diff.plot();

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/dabest/_classes.py in plot(self, color_col, raw_marker_size, es_marker_size, swarm_label, contrast_label, swarm_ylim, contrast_ylim, custom_palette, swarm_desat, halfviolin_desat, halfviolin_alpha, float_contrast, show_pairs, group_summaries, group_summaries_offset, fig_size, dpi, swarmplot_kwargs, violinplot_kwargs, slopegraph_kwargs, reflines_kwargs, group_summary_kwargs, legend_kwargs)
   1233         del all_kwargs["self"]
   1234 
-> 1235         out = EffectSizeDataFramePlotter(self, **all_kwargs)
   1236 
   1237         return out

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/dabest/plotter.py in EffectSizeDataFramePlotter(EffectSizeDataFrame, **plot_kwargs)
    375                          gap_width_percent=1.5,
    376                          type=group_summaries, ax=rawdata_axes,
--> 377                          **group_summary_kwargs)
    378 
    379 

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/dabest/plot_tools.py in gapped_lines(data, x, y, type, offset, ax, line_color, gap_width_percent, **kwargs)
    160 
    161     medians   = data.groupby(x)[y].median().reindex(index=group_order)
--> 162     quantiles = data.groupby(x)[y].quantile([0.25, 0.75])\
    163                                   .unstack()\
    164                                   .reindex(index=group_order)

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/groupby/groupby.py in quantile(self, q, interpolation)
   1951             indices = np.concatenate(arrays)
   1952             assert len(indices) == len(result)
-> 1953             return result.take(indices)
   1954 
   1955     @Substitution(name="groupby")

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/series.py in take(self, indices, axis, is_copy, **kwargs)
   4430 
   4431         indices = ensure_platform_int(indices)
-> 4432         new_index = self.index.take(indices)
   4433 
   4434         if is_categorical_dtype(self):

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/indexes/multi.py in take(self, indices, axis, allow_fill, fill_value, **kwargs)
   2030             allow_fill=allow_fill,
   2031             fill_value=fill_value,
-> 2032             na_value=-1,
   2033         )
   2034         return MultiIndex(

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/indexes/multi.py in _assert_take_fillable(self, values, indices, allow_fill, fill_value, na_value)
   2058                 taken = masked
   2059         else:
-> 2060             taken = [lab.take(indices) for lab in self.codes]
   2061         return taken
   2062 

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/indexes/multi.py in <listcomp>(.0)
   2058                 taken = masked
   2059         else:
-> 2060             taken = [lab.take(indices) for lab in self.codes]
   2061         return taken
   2062 

IndexError: index 6 is out of bounds for size 6

@josesho josesho reopened this Sep 6, 2019
@josesho josesho mentioned this issue Sep 6, 2019
@BioinfoTongLI
Author

@josesho My bad, I shouldn't have closed this...
I thought it would be solved quickly...
Clearly that is not the case. I saw that the issue has been closed in the pandas repo.
Could someone try a nightly build of pandas to see if it works?
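
For anyone testing a nightly or a newer release, a small probe along these lines may help. It is only a sketch: the function name `list_quantile_works` and the toy frame are made up, and it catches just the two exception types reported in this thread (`TypeError` and `IndexError`). A frame this small may not trigger the failure on every affected version, so a passing result is a hint rather than proof.

```python
import pandas as pd


def list_quantile_works():
    """Return True if groupby(...).quantile() accepts a list of quantiles."""
    # Toy data, made up for illustration only.
    df = pd.DataFrame({
        "group": ["a", "a", "a", "b", "b", "b"],
        "value": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
    })
    try:
        # Same call pattern as dabest's gapped_lines() in the tracebacks.
        result = df.groupby("group")["value"].quantile([0.25, 0.75]).unstack()
    except (TypeError, IndexError):
        return False
    # Expect one row per group and one column per quantile.
    return result.shape == (2, 2)


print(pd.__version__, list_quantile_works())
```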

@josesho
Member

josesho commented Sep 6, 2019

The pandas developers tried a patch with 0.25.1, but there's still a problem...

I've opened an issue with them: pandas-dev/pandas#28312.

I think the best option for now is to stick with pandas==0.24.
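
If pinning is not practical, a runtime check is one alternative. The sketch below is an assumption-laden illustration, not part of dabest: `REPORTED_BROKEN`, `release_tuple`, and `is_reported_broken` are hypothetical names, and the version set contains only the two releases reported as failing in this thread (0.25.0 and 0.25.1).

```python
import re

# Releases reported as failing in this thread. Later 0.25.x releases are
# deliberately left out because the thread does not report them failing.
REPORTED_BROKEN = {(0, 25, 0), (0, 25, 1)}


def release_tuple(version):
    """Parse the leading 'major.minor[.patch]' part of a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    major, minor, patch = match.groups()
    # A missing patch component is treated as 0.
    return (int(major), int(minor), int(patch or 0))


def is_reported_broken(version):
    """True if `version` is one of the releases listed in REPORTED_BROKEN."""
    return release_tuple(version) in REPORTED_BROKEN


for candidate in ("0.24.0", "0.25.0", "0.25.1", "0.25.3"):
    print(candidate, is_reported_broken(candidate))
```

A caller could pass `pandas.__version__` to `is_reported_broken` and warn or raise before plotting.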

@akdor1154

This seems to work with pandas 0.25.3 (the pandas issue reckons it was fixed back in 0.25.2, but I haven't tested that). Any chance of relaxing the version restriction?

@josesho
Member

josesho commented Dec 20, 2019

Hi @akdor1154, sorry for the delay; I just got back in the office. Thanks for the heads-up, I'll check out the latest pandas version and fix it ASAP over the Christmas break.

@josesho
Member

josesho commented Dec 26, 2019

Closed with #89

@josesho josesho closed this as completed Dec 26, 2019