Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mplhep styles with contourf log scale causes hang #362

Closed
matthewfeickert opened this issue Feb 18, 2022 · 8 comments
Closed

mplhep styles with contourf log scale causes hang #362

matthewfeickert opened this issue Feb 18, 2022 · 8 comments

Comments

@matthewfeickert
Copy link
Member

matthewfeickert commented Feb 18, 2022

(Available as a GitHub Gist here)

I'm trying to do an interpolation of some points on a grid with mplhep v0.3.21 for an analysis at the moment. However, if I try to use matplotlib.pyplot.contourf with log spaces color levels using mplhep will cause a warning along the lines of

# Locator attempting to generate 18497 ticks ([0.15405694150420948, ..., 999.9912470033628]), which exceeds Locator.MAXTICKS (1000).

and it will then hang (I don't know if it resolves, as I kill it with a KeyboardInterrupt when it is clear it won't finish in a second or so).

If I comment out the use of plt.style.use(mplhep.style.ATLAS) (this happens with mplhep.style.ROOT too) then things proceed as normal and produce the plot in example.py.

example.py:
import matplotlib.colors as colors
import mplhep
import numpy as np
from matplotlib import pyplot as plt
from matplotlib import ticker
from matplotlib.ticker import LogFormatterSciNotation
from numpy import random
from numpy.random import PCG64, SeedSequence
from scipy.interpolate import LinearNDInterpolator

# generate observed values (x, y)
x_step = 250
x_range = np.arange(750, 3000 + x_step, step=x_step)
y_step = 150
y_range = np.arange(700, 2000 + y_step, step=y_step)

bit_generator = PCG64(SeedSequence(0))
rng = random.default_rng(bit_generator)

x = []
y = []
for step in x_range:
    choose_n = rng.integers(low=0, high=y_range.size - 2)
    for value in y_range[: choose_n + 2]:
        x.append(step)
        y.append(value)

x = np.asarray(x)
y = np.asarray(y)

# Generate uniform data on the interval [1e-3, 100.]
# Uniform [a,b) = (b - a) * random_sample() + a
uniform_range = [1e-3, 100.0]
z = (uniform_range[1] - uniform_range[0]) * rng.random(x.size) + uniform_range[0]

# Generate a 2D grid
x_coords = np.linspace(min(x), max(x), 100)
y_coords = np.linspace(min(y), max(y), 100)
x_grid, y_grid = np.meshgrid(x_coords, y_coords)  # 2D grid for interpolation

# Interpolate with function of choice across the grid
# between the known values
interp_func = LinearNDInterpolator(list(zip(x, y)), z)
interp_Z = interp_func(x_grid, y_grid)

# plt.style.use(mplhep.style.ATLAS)  # Uncommenting causes hang
# Locator attempting to generate 18497 ticks ([0.15405694150420948, ..., 999.9912470033628]), which exceeds Locator.MAXTICKS (1000).
fig, ax = plt.subplots()

# plot interpolated values
real_valued_Z = interp_Z[~np.isnan(interp_Z)]
step_exp = 0.25
levels_exp = np.arange(
    np.floor(np.log10(real_valued_Z.min()) - 1),
    np.ceil(np.log10(real_valued_Z.max()) + 1) + step_exp,
    step=step_exp,
)
levels = np.power(10, levels_exp)

cs = ax.contourf(
    x_grid,
    y_grid,
    interp_Z,
    cmap="PuBu_r",
    levels=levels,
    norm=colors.LogNorm(vmin=levels.min(), vmax=levels.max()),
)
formatter = LogFormatterSciNotation(10, labelOnlyBase=False)
cbar = fig.colorbar(cs, format=formatter)

# plot observed values
scatter_size = 15.0
ax.scatter(x, y, s=scatter_size, color="white", edgecolors="black")

ax.set_xlabel(r"$x$")
ax.set_ylabel(r"$y$")
cbar.set_label(r"$z(x, y)$")

file_types = ["png", "pdf"]
for extension in file_types:
    fig.savefig(f"example.{extension}")

example

What is interesting, is if I attempt to get around this by using pcolormesh and then contour to create a similar-ish type of plot there is no issue and things work as expected as can be seen in example_pmesh.py.

example_pmesh.py:
import matplotlib.colors as colors
import mplhep
import numpy as np
from matplotlib import pyplot as plt
from matplotlib import ticker
from matplotlib.ticker import LogFormatterSciNotation
from numpy import random
from numpy.random import PCG64, SeedSequence
from scipy.interpolate import LinearNDInterpolator

# generate observed values (x, y)
x_step = 250
x_range = np.arange(750, 3000 + x_step, step=x_step)
y_step = 150
y_range = np.arange(700, 2000 + y_step, step=y_step)

bit_generator = PCG64(SeedSequence(0))
rng = random.default_rng(bit_generator)

x = []
y = []
for step in x_range:
    choose_n = rng.integers(low=0, high=y_range.size - 2)
    for value in y_range[: choose_n + 2]:
        x.append(step)
        y.append(value)

x = np.asarray(x)
y = np.asarray(y)

# Generate uniform data on the interval [1e-3, 100.]
# Uniform [a,b) = (b - a) * random_sample() + a
uniform_range = [1e-3, 100.0]
z = (uniform_range[1] - uniform_range[0]) * rng.random(x.size) + uniform_range[0]

# Generate a 2D grid
x_coords = np.linspace(min(x), max(x), 100)
y_coords = np.linspace(min(y), max(y), 100)
x_grid, y_grid = np.meshgrid(x_coords, y_coords)  # 2D grid for interpolation

# Interpolate with function of choice across the grid
# between the known values
interp_func = LinearNDInterpolator(list(zip(x, y)), z)
interp_Z = interp_func(x_grid, y_grid)

plt.style.use(mplhep.style.ATLAS)  # Here there is no problem
fig, ax = plt.subplots()

# plot interpolated values
real_valued_Z = interp_Z[~np.isnan(interp_Z)]
step_exp = 0.25
levels_exp = np.arange(
    np.floor(np.log10(real_valued_Z.min()) - 1),
    np.ceil(np.log10(real_valued_Z.max()) + 1) + step_exp,
    step=step_exp,
)
levels = np.power(10, levels_exp)

pcm = ax.pcolormesh(
    x_grid,
    y_grid,
    interp_Z,
    cmap="PuBu_r",
    norm=colors.LogNorm(vmin=levels.min(), vmax=levels.max()),
    shading="auto",
)
formatter = LogFormatterSciNotation(10, labelOnlyBase=False)
cbar = fig.colorbar(pcm, format=formatter)
cs = ax.contour(
    x_grid,
    y_grid,
    interp_Z,
    cmap="PuBu_r",
)

# plot observed values
scatter_size = 15.0
ax.scatter(x, y, s=scatter_size, color="white", edgecolors="black")

ax.set_xlabel(r"$x$")
ax.set_ylabel(r"$y$")
cbar.set_label(r"$z(x, y)$")

file_types = ["png", "pdf"]
for extension in file_types:
    fig.savefig(f"example_pmesh.{extension}")

example_pmesh

@andrzejnovak Do you have any idea why this could be happening, and if there is a way to avoid it or fix it?

cc @kratsg

@andrzejnovak
Copy link
Member

andrzejnovak commented Feb 18, 2022

Oh sneaky, since the MRE doesn't seem to use any mplhep functions I think this is a matplotlib error actually. Probably one of the rcParams in

"axes.formatter.limits": "-2, 4",
"axes.formatter.use_mathtext": True,
"axes.autolimit_mode": "round_numbers",
"axes.unicode_minus": False,
"axes.xmargin": 0.0,
# x/y axis label locations
"xaxis.labellocation": "right",
"yaxis.labellocation": "top",
# xticks
"xtick.direction": "in",
"xtick.minor.visible": True,
"xtick.top": True,
"xtick.bottom": True,
"xtick.major.top": True,
"xtick.major.bottom": True,
"xtick.minor.top": True,
"xtick.minor.bottom": True,
"xtick.labelsize": "large",
"xtick.major.size": 5,
"xtick.minor.size": 3,
"xtick.color": "#000000",
# yticks
"ytick.direction": "in",
"ytick.left": True,
"ytick.right": True,
"ytick.major.left": True,
"ytick.major.right": True,
"ytick.minor.left": True,
"ytick.minor.right": True,
"ytick.minor.visible": True,
"ytick.labelsize": "large",
"ytick.major.size": 14,
"ytick.minor.size": 7,

Could you test them with your code and https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.rc_context.html#matplotlib.pyplot.rc_context ?

@matthewfeickert
Copy link
Member Author

@andrzejnovak Ah good point! Indeed, the problem is

"ytick.minor.visible": True,

If I use the matplotlib.rc_context to override mplhep.style.ATLAS for just this then things work as expected

from matplotlib import rc_context

...

plt.style.use(mplhep.style.ATLAS)

rc_options = {"ytick.minor.visible": False}  # yticks
with rc_context(rc_options):
    fig, ax = plt.subplots()
    ...
Updated example.py:
import matplotlib.colors as colors
import mplhep
import numpy as np
from matplotlib import pyplot as plt
from matplotlib import rc_context
from matplotlib.ticker import LogFormatterSciNotation
from numpy import random
from numpy.random import PCG64, SeedSequence
from scipy.interpolate import LinearNDInterpolator

# generate observed values (x, y)
x_step = 250
x_range = np.arange(750, 3000 + x_step, step=x_step)
y_step = 150
y_range = np.arange(700, 2000 + y_step, step=y_step)

bit_generator = PCG64(SeedSequence(0))
rng = random.default_rng(bit_generator)

x = []
y = []
for step in x_range:
    choose_n = rng.integers(low=0, high=y_range.size - 2)
    for value in y_range[: choose_n + 2]:
        x.append(step)
        y.append(value)

x = np.asarray(x)
y = np.asarray(y)

# Generate uniform data on the interval [1e-3, 100.]
# Uniform [a,b) = (b - a) * random_sample() + a
uniform_range = [1e-3, 100.0]
z = (uniform_range[1] - uniform_range[0]) * rng.random(x.size) + uniform_range[0]

# Generate a 2D grid
x_coords = np.linspace(min(x), max(x), 100)
y_coords = np.linspace(min(y), max(y), 100)
x_grid, y_grid = np.meshgrid(x_coords, y_coords)  # 2D grid for interpolation

# Interpolate with function of choice across the grid
# between the known values
interp_func = LinearNDInterpolator(list(zip(x, y)), z)
interp_Z = interp_func(x_grid, y_grid)

plt.style.use(mplhep.style.ATLAS)

# c.f. https://github.com/scikit-hep/mplhep/issues/362
rc_options = {"ytick.minor.visible": False}  # yticks
with rc_context(rc_options):
    fig, ax = plt.subplots()

    # plot interpolated values
    real_valued_Z = interp_Z[~np.isnan(interp_Z)]
    step_exp = 0.25
    levels_exp = np.arange(
        np.floor(np.log10(real_valued_Z.min()) - 1),
        np.ceil(np.log10(real_valued_Z.max()) + 1) + step_exp,
        step=step_exp,
    )
    levels = np.power(10, levels_exp)

    cs = ax.contourf(
        x_grid,
        y_grid,
        interp_Z,
        cmap="PuBu_r",
        levels=levels,
        norm=colors.LogNorm(vmin=levels.min(), vmax=levels.max()),
    )
    formatter = LogFormatterSciNotation(10, labelOnlyBase=False)
    cbar = fig.colorbar(cs, format=formatter)

    # plot observed values
    scatter_size = 15.0
    ax.scatter(x, y, s=scatter_size, color="white", edgecolors="black")

    ax.set_xlabel(r"$x$")
    ax.set_ylabel(r"$y$")
    cbar.set_label(r"$z(x, y)$")

    file_types = ["png", "pdf"]
    for extension in file_types:
        fig.savefig(f"example.{extension}")

Similarly, I can get things to fail without using mplhep at all with just

from matplotlib import rc_context

...

rc_options = {"ytick.minor.visible": False}  # yticks
with rc_context(rc_options):
    fig, ax = plt.subplots()
    ...

I've updated the Gist too.

@matthewfeickert
Copy link
Member Author

matthewfeickert commented Feb 18, 2022

"ytick.minor.visible": True is used by all of the LHC experiment styles

"ytick.minor.visible": True,

"ytick.minor.visible": True,

"ytick.minor.visible": True,

"ytick.minor.visible": True,

and as mplhep.style.ROOT is mplhep.style.CMS

ROOT = CMS # Leave as default

that's why we see it there too.

This is at least functionally resolved (:+1: ), but I'll follow up with matplotlib people to see if I am writing bad code here for what I want to do.

@andrzejnovak
Copy link
Member

Thanks @matthewfeickert

@matthewfeickert
Copy link
Member Author

matthewfeickert commented Feb 18, 2022

Fun note! While making these examples using generated random data so I could make these examples public I learned that

from numpy import random
random.seed(0)

is considered bad and we should instead be doing things like

from numpy import random
from numpy.random import PCG64, SeedSequence
rng = random.default_rng(PCG64(SeedSequence(0)))  # Generator(PCG64) at 0x7F00A8E519E0
rng.random()  # 0.6369616873214543

Seems this is related to NEP 19 and there's a blog post more on this (Stop using numpy.random.seed()).

@andrzejnovak
Copy link
Member

@matthewfeickert can we close this?

@matthewfeickert
Copy link
Member Author

matthewfeickert commented Mar 22, 2022

can we close this?

Yeah, I'll reopen if needed. Once I get a better summary of what is happening in matplotlib I can add a note here as well.

@matthewfeickert
Copy link
Member Author

matthewfeickert commented Mar 29, 2022

So @tacaswell was indeed right (from the IRIS-HEP Slack chats)

Can you check with the main branch? We may have fixed that recently. We made big improvements to how ticks in colorbars works (now basically the same as everything else) but that exposed just how many "special" things were in there

as going following the instructions in matplotlib/matplotlib#9994 (comment) on how to find the nightly matplotlib wheels that GHA builds, which I'll recap here:

  1. Go to the Build CI wheels GHA workflow on matplotlib's GitHub and select the latest PR build.
  2. Select the "wheels" artifact at the bottom of the build page and download it (might as well do this manually rather than use curl as there is no stable URL that you can consistently hit with this method):

matplotlib-artifact

  1. Unzip the zipfile
$ mkdir matplotlib_wheels
$ unzip wheels.zip -d matplotlib_wheels/
  1. Install the wheel that matches your CPython install. e.g, for me:
$ python -m pip install --upgrade matplotlib_wheels/*cp39-cp39-manylinux*x86_64.whl
$ python -m pip show matplotlib
Name: matplotlib
Version: 3.6.0.dev1922+gccf5115389
Summary: Python plotting package
Home-page: https://matplotlib.org
Author: John D. Hunter, Michael Droettboom
Author-email: matplotlib-users@python.org
License: PSF
Location: /home/feickert/.pyenv/versions/3.9.6/envs/mplhep-bug/lib/python3.9/site-packages
Requires: cycler, fonttools, kiwisolver, numpy, packaging, pillow, pyparsing, python-dateutil
Required-by: mplhep

With this preview of the next matplotlib release my example file with the mplhep defaults (the minimal edits to get that back are to use

$ git diff
diff --git a/example.py b/example.py
index 1d29ef4..d31f19e 100644
--- a/example.py
+++ b/example.py
@@ -46,7 +46,7 @@ interp_Z = interp_func(x_grid, y_grid)
 plt.style.use(mplhep.style.ATLAS)
 
 # c.f. https://github.com/scikit-hep/mplhep/issues/362
-rc_options = {"ytick.minor.visible": False}  # yticks
+rc_options = {"ytick.minor.visible": True}  # yticks
 with rc_context(rc_options):
     fig, ax = plt.subplots()
 

)

then things work fine. 👍

example


edit: matplotlib now has nightly wheels, so to get them just do

$ python -m pip install \
  --upgrade \
  --pre \
  --index-url https://pypi.anaconda.org/scipy-wheels-nightly/simple \
  --extra-index-url https://pypi.org/simple \
  matplotlib

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants