BUG: Change of behavior between 1.1.5 and 1.2.0 in plot functions #38736

Closed
2 of 3 tasks
pcolazurdo opened this issue Dec 27, 2020 · 6 comments · Fixed by #46413
Labels
good first issue Needs Tests Unit test(s) needed to prevent regressions Regression Functionality that used to work in a prior pandas version Visualization plotting
Comments

@pcolazurdo

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.

Code Sample, a copy-pastable example

import pandas as pd
from pandas import Timestamp
import matplotlib.pyplot as plt

a= {'Argentina': {Timestamp('2020-11-17 00:00:00'): 56.0},
 'Brazil': {Timestamp('2020-11-17 00:00:00'): 103.0},
 'Guatemala': {Timestamp('2020-11-17 00:00:00'): 99.0},
 'Ireland': {Timestamp('2020-11-17 00:00:00'): 53.0},
 'Italy': {Timestamp('2020-11-17 00:00:00'): 19.0},
 'Spain': {Timestamp('2020-11-17 00:00:00'): 49.0},
 'Sweden': {Timestamp('2020-11-17 00:00:00'): 42.0},
 'US': {Timestamp('2020-11-17 00:00:00'): 86.0},
 'United Kingdom': {Timestamp('2020-11-17 00:00:00'): 32.0}}

b = {'Val': {'Afghanistan_confirmed': 50886},
 'Country': {'Afghanistan_confirmed': 'Afghanistan'},
 'Population': {'Afghanistan_confirmed': 37.171922},
 'Ratio': {'Afghanistan_confirmed': 1368.9364784527418}}
doubling_df = pd.DataFrame().from_dict(a)
df = pd.DataFrame().from_dict(b)
_ = doubling_df.plot()
_ = df.sort_values(by='Ratio', ascending=False)[:1]['Ratio'].plot.bar(color='blue')

Stack Trace

Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/matplotlib/axis.py", line 1523, in convert_units
ret = self.converter.convert(x, self.units, self)
File "/usr/local/lib/python3.8/site-packages/matplotlib/category.py", line 61, in convert
unit.update(values)
File "/usr/local/lib/python3.8/site-packages/matplotlib/category.py", line 211, in update
cbook._check_isinstance((str, bytes), value=val)
File "/usr/local/lib/python3.8/site-packages/matplotlib/cbook/__init__.py", line 2246, in _check_isinstance
raise TypeError(
TypeError: 'value' must be an instance of str or bytes, not a datetime.datetime

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/pablo/bug_report.py", line 22, in <module>
_ = df.sort_values(by='Ratio', ascending=False)[:1]['Ratio'].plot.bar(color='blue')
File "/usr/local/lib/python3.8/site-packages/pandas/plotting/_core.py", line 1113, in bar
return self(kind="bar", x=x, y=y, **kwargs)
File "/usr/local/lib/python3.8/site-packages/pandas/plotting/_core.py", line 955, in __call__
return plot_backend.plot(data, kind=kind, **kwargs)
File "/usr/local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/__init__.py", line 61, in plot
plot_obj.generate()
File "/usr/local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py", line 280, in generate
self._make_plot()
File "/usr/local/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py", line 1433, in _make_plot
ax.xaxis.update_units(self.ax_index)
File "/usr/local/lib/python3.8/site-packages/matplotlib/axis.py", line 1466, in update_units
default = self.converter.default_units(data, self)
File "/usr/local/lib/python3.8/site-packages/matplotlib/category.py", line 107, in default_units
axis.set_units(UnitData(data))
File "/usr/local/lib/python3.8/site-packages/matplotlib/axis.py", line 1541, in set_units
self.callbacks.process('units')
File "/usr/local/lib/python3.8/site-packages/matplotlib/cbook/__init__.py", line 229, in process
self.exception_handler(exc)
File "/usr/local/lib/python3.8/site-packages/matplotlib/cbook/__init__.py", line 81, in _exception_printer
raise exc
File "/usr/local/lib/python3.8/site-packages/matplotlib/cbook/__init__.py", line 224, in process
func(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/matplotlib/lines.py", line 648, in recache_always
self.recache(always=True)
File "/usr/local/lib/python3.8/site-packages/matplotlib/lines.py", line 652, in recache
xconv = self.convert_xunits(self._xorig)
File "/usr/local/lib/python3.8/site-packages/matplotlib/artist.py", line 175, in convert_xunits
return ax.xaxis.convert_units(x)
File "/usr/local/lib/python3.8/site-packages/matplotlib/axis.py", line 1525, in convert_units
raise munits.ConversionError('Failed to convert value(s) to axis '
matplotlib.units.ConversionError: Failed to convert value(s) to axis units: array([datetime.datetime(2020, 11, 17, 0, 0)], dtype=object)

Problem description

The code above runs on pandas 1.1.5 but crashes on 1.2.0 with the stack trace shown. A workaround on 1.2.0 is to call plt.close('all') between the two plot calls, but this change of behavior is not documented in the release notes, so it appears to be a bug.

This happens only in a headless Python setup (in the real code I use get_figure to save the images). In a Jupyter notebook the same code works as expected.
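The workaround described above might look like the following minimal sketch. It assumes the non-interactive Agg backend as a stand-in for the headless setup, uses reduced copies of the two data sets from the report, and writes the images to in-memory buffers; the helper names `render_png` and `plot_both_with_close` are hypothetical and not part of pandas or matplotlib.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, standing in for a headless setup

import matplotlib.pyplot as plt
import pandas as pd


def render_png(ax):
    """Save the figure that owns `ax` to PNG bytes via get_figure()."""
    buf = io.BytesIO()
    ax.get_figure().savefig(buf, format="png")
    return buf.getvalue()


def plot_both_with_close():
    # Reduced stand-ins for doubling_df and df['Ratio'] from the report.
    line_df = pd.DataFrame(
        {"Argentina": [56.0], "Brazil": [103.0]},
        index=[pd.Timestamp("2020-11-17")],
    )
    ratio = pd.Series({"Afghanistan_confirmed": 1368.9364784527418}, name="Ratio")

    line_png = render_png(line_df.plot())
    plt.close("all")  # discard the datetime axes before the next plot call

    bar_png = render_png(ratio.plot.bar(color="blue"))
    plt.close("all")
    return line_png, bar_png


line_png, bar_png = plot_both_with_close()
```

Closing every figure after saving it means the bar plot cannot land on axes that still hold the datetime line from the first call.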

Expected Output

Output of pd.show_versions()

INSTALLED VERSIONS

commit : 3e89b4c
python : 3.8.7.final.0
python-bits : 64
OS : Linux
OS-release : 4.19.121-linuxkit
Version : #1 SMP Tue Dec 1 17:50:32 UTC 2020
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.2.0
numpy : 1.19.4
pytz : 2020.5
dateutil : 2.8.1
pip : 20.3.3
setuptools : 51.1.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.3.3
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.5.4
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None

@pcolazurdo pcolazurdo added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Dec 27, 2020
@rhshadrach rhshadrach added Regression Functionality that used to work in a prior pandas version and removed Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Dec 28, 2020
@rhshadrach rhshadrach added this to the 1.2.1 milestone Dec 28, 2020
@rhshadrach
Member

Thanks for reporting this! Further investigations and PRs to fix are most welcome.

@pcolazurdo
Author

I'm working on bisecting it - I'll try to get more details later today

@pcolazurdo
Author

The problematic commit is

commit fb379d8266492f917ed880f7619f3d0d9bc7c8db
Author: nrebena <nicolas.rebena@gmail.com>
Date:   Sat Nov 21 13:37:18 2020 +0100
Inconsistent indexes for tick label plotting (#28733)
* TST: Test for issues #26186 and #11465

* BUG: Generate the tick position in BarPlot using convert tools from matlab.

Generate the tick position in BarPlot using convert tools from matlab.

* TST: Modify tests/plotting/test_frame.test_bar_categorical

Ticklocs are now float also for categorical bar data (as they are
position on the axis). The test is changed to compare to a array of
np.float.

* TST: Fix test for windows OS

* TST: Add test for plotting MultiIndex bar plot

A fix to issue #26186 revealed no tests existed about plotting a bar
plot for a MultiIndex, but a section of the user guide visualization
did. This section of the user guide is now in the test suite.

* BUG: Special case for MultiIndex bar plot

* DOC: Add whatsnew entry for PR #28733

* CLN: Clean up in code and doc

* CLN: Clean up test_bar_numeric

* DOC Move to whatsnew v1.1

* FIX: Make tick dtype int for backwards compatibility

* DOC: Improve whatsnew message

* ENH: Add UserWarning when plotting bar plot with MultiIndex

* CLN: Remove duplicate code line

* TST: Capture UserWarning for Bar plot with MultiIndex

* TST: Improve test explanation

* ENH: Raise UserWarning only if redrawing on existing axis with data

* DOC: Move to whatsnew v1.2.0

Co-authored-by: Marco Gorelli <m.e.gorelli@gmail.com>

:040000 040000 2efae8e0b0ae2a2f34b8f01577d66bd7d8265791 fb5e7deb8677c0429df95d45818390c9ae739dbf M doc
:040000 040000 ea9a7fe4deec6846018cadc307c3b666e726f65f b91c7ddf9fe29db1ad7ecce5249ac69227f77913 M pandas

I don't understand the code well enough to provide a valid PR.

@mzeitlin11
Member

Thanks for bisecting this @pcolazurdo! From that, looks like the regression can be isolated to the added unit update lines like:

ax.xaxis.update_units(self.ax_index)

That call makes matplotlib try to convert the datetime x-data left on the axes by the first plot call to the string (categorical) units set by the second plot call, which is the ConversionError in the stack trace. Calling plt.show() or plt.close() between the plotting calls clears the axis data, so nothing fails.
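The failure chain in the stack trace (units updated, existing artists re-converted, conversion error) can be illustrated with a toy model in plain Python. This is only a sketch of the mechanism: `ToyAxis`, its methods, and `second_plot_fails` are hypothetical names, and none of this is matplotlib's actual implementation.

```python
from datetime import datetime


class ConversionError(Exception):
    """Raised when stored x-data does not fit the axis units (toy version)."""


class ToyAxis:
    """Toy stand-in for a reused x-axis; not matplotlib's real classes."""

    def __init__(self):
        self.units = None   # becomes "category" once string data sets the units
        self.artists = []   # raw x-data of everything already drawn

    def plot(self, xdata):
        self.artists.append(list(xdata))

    def update_units(self, data):
        # String data switches the axis to category units ...
        if all(isinstance(v, str) for v in data):
            self.units = "category"
        # ... and every artist already on the axis is re-converted.
        if self.units == "category":
            for xdata in self.artists:
                for value in xdata:
                    if not isinstance(value, (str, bytes)):
                        raise ConversionError(
                            f"Failed to convert value(s) to axis units: {value!r}"
                        )

    def clear(self):
        # Rough analogue of plt.close() / plt.show() discarding the old data.
        self.artists.clear()
        self.units = None


def second_plot_fails(clear_between):
    axis = ToyAxis()
    axis.plot([datetime(2020, 11, 17)])  # first call: datetime x-data
    if clear_between:
        axis.clear()
    try:
        axis.update_units(["Afghanistan_confirmed"])  # second call: string index
    except ConversionError:
        return True
    return False
```

In this model `second_plot_fails(False)` reproduces the error and `second_plot_fails(True)` does not, which mirrors why clearing the figure between the two calls avoids the crash.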

cc @nrebena if you have thoughts here

@simonjayhawkins
Member

The problematic commit is

#28733 has been reverted. This issue still needs tests to prevent a regression.
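A regression test for this could look roughly like the sketch below. It is an assumption about what such a test might check, not the test that was eventually merged: the function name is hypothetical, the data is a reduced copy of the report's, and the Agg backend is assumed so it runs headless.

```python
import matplotlib

matplotlib.use("Agg")  # assumed headless backend

import matplotlib.pyplot as plt
import pandas as pd


def test_bar_after_datetime_line_plot():
    try:
        # First call: line plot with a datetime index.
        line_df = pd.DataFrame(
            {"Argentina": [56.0]}, index=[pd.Timestamp("2020-11-17")]
        )
        line_df.plot()

        # Deliberately no plt.close() here: the second call must not raise
        # even if figures from the first call are still open.
        ratio = pd.Series({"Afghanistan_confirmed": 1368.9364784527418}, name="Ratio")
        ax = ratio.plot.bar(color="blue")

        # One value in the Series should give exactly one bar.
        assert len(ax.patches) == 1
    finally:
        plt.close("all")
```

The test only checks that the second call completes and draws one bar; it does not pin down tick positions or labels.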

@simonjayhawkins simonjayhawkins modified the milestones: 1.2.1, Contributions Welcome Jan 18, 2021
@purna135
Contributor

Hello, I would like to work on this issue if it's not entirely finished! I noticed that it's still open.
