Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unyt2.5.0 breaks matplotlib's errorbar function #125

Closed
JBorrow opened this issue Jan 21, 2020 · 22 comments
Closed

Unyt2.5.0 breaks matplotlib's errorbar function #125

JBorrow opened this issue Jan 21, 2020 · 22 comments

Comments

@JBorrow
Copy link
Contributor

JBorrow commented Jan 21, 2020

  • unyt version: 2.5.0
  • Python version: 3.7.4
  • Operating System: MacOS Catalina, RHEL 7(?)

Description

unyt v2.5.0 is unable to create matplotlib plots that have an unyt_quantity as the axis limit when using the errorbar function, when a scatter is provided in the required 2XN format as a list of two unyt arrays.

What I Did

Example script (matplotlib3.1.2 and unyt2.5.0):

import matplotlib.pyplot as plt
import unyt

x = unyt.unyt_array([8, 9, 10], "cm")
y = unyt.unyt_array([8, 9, 10], "kg")
# It is convenient often to supply the 2XN required array
# in this format
y_scatter = [
    unyt.unyt_array([0.1, 0.2, 0.3], "kg"),
    unyt.unyt_array([0.1, 0.2, 0.3], "kg"),
]

x_lims = (unyt.unyt_quantity(5, "cm"), unyt.unyt_quantity(12, "cm"))
y_lims = (unyt.unyt_quantity(5, "kg"), unyt.unyt_quantity(12, "kg"))

plt.errorbar(x, y, yerr=y_scatter)
plt.xlim(*x_lims)
plt.ylim(*y_lims)

plt.show()

Output:

python3 test.py
Traceback (most recent call last):
  File "/private/tmp/env/lib/python3.7/site-packages/matplotlib/axis.py", line 1550, in convert_units
    ret = self.converter.convert(x, self.units, self)
  File "/private/tmp/env/lib/python3.7/site-packages/unyt/mpl_interface.py", line 105, in convert
    return value.to(*unit)
AttributeError: 'list' object has no attribute 'to'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "test.py", line 14, in <module>
    plt.errorbar(x, y, yerr=y_scatter)
  File "/private/tmp/env/lib/python3.7/site-packages/matplotlib/pyplot.py", line 2554, in errorbar
    **({"data": data} if data is not None else {}), **kwargs)
  File "/private/tmp/env/lib/python3.7/site-packages/matplotlib/__init__.py", line 1599, in inner
    return func(ax, *map(sanitize_sequence, args), **kwargs)
  File "/private/tmp/env/lib/python3.7/site-packages/matplotlib/axes/_axes.py", line 3430, in errorbar
    barcols.append(self.vlines(xo, lo, uo, **eb_lines_style))
  File "/private/tmp/env/lib/python3.7/site-packages/matplotlib/__init__.py", line 1599, in inner
    return func(ax, *map(sanitize_sequence, args), **kwargs)
  File "/private/tmp/env/lib/python3.7/site-packages/matplotlib/axes/_axes.py", line 1176, in vlines
    x = self.convert_xunits(x)
  File "/private/tmp/env/lib/python3.7/site-packages/matplotlib/artist.py", line 180, in convert_xunits
    return ax.xaxis.convert_units(x)
  File "/private/tmp/env/lib/python3.7/site-packages/matplotlib/axis.py", line 1553, in convert_units
    f'units: {x!r}') from e
matplotlib.units.ConversionError: Failed to convert value(s) to axis units: [unyt_quantity(8, 'cm'), unyt_quantity(9, 'cm'), unyt_quantity(10, 'cm')]

Even wrapping the list in a call to unyt.unyt_array doesn't save the day.

@ngoldbaum
Copy link
Member

Unfortunately this is likely a bug in matplotlib. Maybe we need to make the matplotlib support opt-in somehow. Ping @l-johnston.

@JBorrow
Copy link
Contributor Author

JBorrow commented Jan 21, 2020

Personally I would prefer the matplotlib integration to be opt-in (at least until a major version revision) as it's quite a change to the way that you interact with the library.

Is there a way that there can be a top-level flag that gets set as to whether or not to respect the matplotlib integration?

@ngoldbaum
Copy link
Member

We could maybe do it like astropy with a context manager.

Sorry for the breakage! I did bump the minor version number to indicate the new feature but didn’t anticipate that we would need to test the entire matplotlib API to make sure everything works with ever possible type of plot.

@l-johnston would you be OK with me temporarily disabling matplotlib support and then cutting a new release to unbreak downstream users? We can then work on coming up with a way to add this without risking breakage for people who have already written code that make plots using unyt types.

@l-johnston
Copy link
Contributor

I am OK with temporarily disabling the feature.

@ngoldbaum
Copy link
Member

OK great, I have a doctors appointment this morning and won’t be able to get to it until this afternoon. If one of you would like to put in a PR in the meantime please feel free.

@l-johnston
Copy link
Contributor

The issue is that matplotlib simply passes the yerr=y_scatter list directly to our convert function since matplotlib knows that the y-axis is of the unyt_array type.

For parameters like yerr, matplotlib describes them as array_like where array_like means a sequence such as list or tuple. I have updated our convert function to work with list and tuple. @JBorrow , will supporting list and tuple be sufficient for your use cases?

@JBorrow
Copy link
Contributor Author

JBorrow commented Jan 21, 2020

Yes, it should be - but note that I also get crashing behaviour with the following:

import matplotlib.pyplot as plt
import unyt

x = unyt.unyt_array([8, 9, 10], "cm")
y = unyt.unyt_array([8, 9, 10], "kg")
# It is convenient often to supply the 2XN required array
# in this format
y_scatter = unyt.unyt_array([[0.1, 0.2, 0.3], [0.1, 0.2, 0.3]], "kg")

x_lims = (unyt.unyt_quantity(5, "cm"), unyt.unyt_quantity(12, "cm"))
y_lims = (unyt.unyt_quantity(5, "kg"), unyt.unyt_quantity(12, "kg"))

plt.errorbar(x, y, yerr=y_scatter)
plt.xlim(*x_lims)
plt.ylim(*y_lims)

plt.show()

I do not know however how this affects any of our other plotting routines (e.g. 2d histograms, imshow() with units, pcolormesh, etc.)

@l-johnston
Copy link
Contributor

With my update, your code example produces the plot:
fig1

I believe this is correct. I'll check some other plot types.

@JBorrow
Copy link
Contributor Author

JBorrow commented Jan 21, 2020

Can I also ask what controls what type of bracket goes around the units? This doesn't seem to be documented in the matplotlib docs.

@ngoldbaum
Copy link
Member

I think you want the content of this answer: https://stackoverflow.com/questions/42202845/settling-error-bar-default-rcparams-matplotlib

@l-johnston
Copy link
Contributor

l-johnston commented Jan 21, 2020

@JBorrow , the axis label format where the unit is surrounded by parentheses is common. My intent here is to provide the user with a good starting point for generating the complete axis label without having to deal with the latex syntax. The workflow proceeds as:

>>> x = [0, 0.01, 0.02]*s
>>> y = [3,4,5]*K
>>> ax = plt.figure().add_subplot()
>>> ax.plot(x, y, xunits="ms")
>>> unit_str = ax.xaxis.get_label().get_text()
>>> ax.set_xlabel("time " + unit_str)

The x-axis label would then be "time (ms)" after re-drawing.

@l-johnston
Copy link
Contributor

I checked hist, hist2d, imshow and they each can work using the update I have. For a histogram, the bin edges have to be specified in the same units as the data:

>>> data = np.random.normal(size=10000)*m
>>> bin_edges = np.linspace(data.min(), data.max(), 50)
>>> plt.hist(data, bins=bin_edges)

This produces the expected plot. But, an integer number for bins produces an exception where matplotlib is calculating the bin edges. This error occurs elsewhere in the matplotlib code, not our conversion interface.

I suspect there will be many such scenarios where matplotlib is performing calculations where one operand has units and one operand is unitless like the hist example.

@ngoldbaum
Copy link
Member

Maybe the simplest approach would be to remove the mpl_interface import from
the top-level unyt namespace and tell people that if they want to activate the plotting support they need to explicitly import unyt.mpl_interface. We could also in a second pass or in the same PR that removes the top-level import add a context manager that allows people to temporarily activate the plotting support for a block of code.

@l-johnston
Copy link
Contributor

I like the context manager option. I'll give it a try.

@JBorrow
Copy link
Contributor Author

JBorrow commented Jan 21, 2020

@JBorrow , the axis label format where the unit is surrounded by parentheses is common. My intent here is to provide the user with a good starting point for generating the complete axis label without having to deal with the latex syntax. The workflow proceeds as:

>>> x = [0, 0.01, 0.02]*s
>>> y = [3,4,5]*K
>>> ax = plt.figure().add_subplot()
>>> ax.plot(x, y, xunits="ms")
>>> unit_str = ax.xaxis.get_label().get_text()
>>> ax.set_xlabel("time " + unit_str)

The x-axis label would then be "time (ms)" after re-drawing.

Yeah, this is what I was getting at - really whether or not the units are written as (m) [m], or / m, as different people like different styles. I was wondering if this was a matplotlib thing and so could be customised using a stylesheet bug if this is done within unyt then I guess we're stuck with one?

In that case, can I make a case for square brackets, rather than rounded?

@l-johnston
Copy link
Contributor

The label style is up to us. Of the three styles, / m is technically correct being consistent with the SI standard, (m) is the most common, and [m] disfavored from a formal quantity notation perspective. I chose the most common. It's really just a starting point allowing the user to quickly see the plot and understand it.

@ngoldbaum
Copy link
Member

We could probably make that a keyword argument in the context manager’s initializer.

@l-johnston
Copy link
Contributor

ok. now the label style is settable in the CM:

>>> x = [0,1,2]*s
>>> y = [3,4,5]*m
>>> with MplUnitsCM(label_style="[]"):
...    plt.plot(x, y)
...    plt.show()

will produce:
fig2

>>> x = [0,1,2]*s
>>> y = [3,4,5]*m
>>> with MplUnitsCM(label_style="/"):
...    plt.plot(x, y)
...    plt.show()

will produce:
fig3

@JBorrow
Copy link
Contributor Author

JBorrow commented Jan 21, 2020

Would you be at all interested in adding a name field to the unyt.unyt_array class? I have implemented in this in a few libraries that use unyt and find it very useful. Then you can, on these axes, use the names of the (last) arrays that were plotted to form the axes labels.

E.g. you may have an object hierarchy like data.densities: unyt.unyt_array that has a field data.densities.name which is Densities. When this is plotted, you can automatically have Densities $[g cm^{-3}]$ written on the axes.

Perhaps this is out of scope of this change or the library as a whole, so feel free to ignore this.

@ngoldbaum
Copy link
Member

I think that’s not a bad idea and would probably make a lot of things in yt simpler too. That said that discussion makes more sense in the context of a new GitHub issue.

@ngoldbaum
Copy link
Member

closed by #126

@JBorrow
Copy link
Contributor Author

JBorrow commented Jan 22, 2020

Thanks both for the quick response to this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants