Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lazy_loading matplotlib submodules can cause errors #1721

Closed
agucova opened this issue Jun 14, 2023 · 11 comments · Fixed by #1722
Closed

lazy_loading matplotlib submodules can cause errors #1721

agucova opened this issue Jun 14, 2023 · 11 comments · Fixed by #1722
Labels
Upstream/dependency bug Another package is causing us trouble!
Milestone

Comments

@agucova
Copy link

agucova commented Jun 14, 2023

Librosa uses lazy_loader to lazily load some libraries. This implementation has the known flaw (see here, here and here) of not triggering certain import side effects, effectively breaking libraries that rely on this, like matplotlib. This can cause weird and hard to replicate bugs, like this one, which was filed initially against matplotlib and then to matplotlib-inline.

My best guess is that bugs related to this are probably more widespread than would be apparent, as users would report these bugs to the wrong projects most of the time.

It's not clear to me there is a good solution to this. As far as I know, the only reliable solution would be to patch matplotlib itself to support lazy loading, like numpy did.

To Reproduce

In the following case, one should be able to access pyplot (or run one of its methods): 1

import librosa.display

import matplotlib
import matplotlib.pyplot

matplotlib.pyplot

Instead, one gets:

AttributeError: module 'matplotlib' has no attribute 'pyplot'

Just removing the librosa.display import makes it work again.

Edit: Removed reference to importlib's implementation, as I thought Librosa was using it (and not lazy_loader).

Footnotes

  1. Yes, both imports are necessary to replicate because the python import system is weird.

@agucova
Copy link
Author

agucova commented Jun 14, 2023

@bmcfee Why was this marked off topic? AFAIK this is not an upstream problem, lazy_loader is working as expected (note I've corrected a reference to importlib which was wrong).

@bmcfee
Copy link
Member

bmcfee commented Jun 14, 2023

@agucova sorry for that, you responded before I was able to finish typing the response below.

I'm flagging this as off topic, not because it's irrelevant, but because there is nothing we can do from within librosa to resolve this issue. As far as I can tell, we're doing things correctly here and the problems lie in other packages. If there's something incorrect about how we're using lazy-loader that could be improved, I'd be happy to fix it.

I do agree that it's a bigger problem. So far, it's mostly been users of pyinstaller raising issues about it, and from what I can tell, a patch has already been implemented over there (but not yet released).

In the specific example you provide, you could resolve the issue by importing librosa.display last instead of first. (Confirmed on my dev machine that this works.) This is probably good style anyway, as lower-level imports usually should come first, even if it's not supposed to really matter. We could add some language in the documentation to advise this use pattern to avoid problems if folks think that would be helpful, but I expect that users would probably not easily find this.

As far as I know, the only reliable solution would be to patch matplotlib itself to support lazy loading, like numpy did.

Not knowing much of the details here, I think this might be reasonable. Relying on import side effects seems like an antipattern, though I'm sure they have good reasons for doing it.

@agucova
Copy link
Author

agucova commented Jun 14, 2023

@bmcfee sorry! I got ahead of myself.

In the specific example you provide, you could resolve the issue by importing librosa.display last instead of first. (Confirmed on my dev machine that this works.) This is probably good style anyway, as lower-level imports usually should come first, even if it's not supposed to really matter. We could add some language in the documentation to advise this use pattern to avoid problems if folks think that would be helpful, but I expect that users would probably not easily find this.

As much as I agree with this, the example provided is only implemented in one module for simplicity of replication.

The original bug that prompted me to notify the maintainers here (even though I'm not a librosa user), involved another library (matplotlib-inline) importing matplotlib themselves after the user had imported librosa.display, so this issue wouldn't be able to be fixed by the user.

I'm not sure what librosa users could do to work around that particular bug unless matplotlib chooses to fix this (or matplotlib-inline implements a workaround only for librosa users, which seems unlikely). AFAIK this affects all Jupyter Notebook users that import librosa.display.

Anyways, I'm in a crusade to warn users of weird errors like this to save them some time (as I lost ~2 hours when debugging the same bug in a package I was maintaining). I've filed issues with both importlib and lazy_loader so they can document this flaw.

@bmcfee
Copy link
Member

bmcfee commented Jun 14, 2023

As much as I agree with this, the example provided is only implemented in one module for simplicity of replication.

The original bug that prompted me to notify the maintainers here (even though I'm not a librosa user), involved another library (matplotlib-inline) importing matplotlib themselves after the user had imported librosa.display, so this issue wouldn't be able to be fixed by the user.

Ah, fair enough. Thanks for clarifying. At any rate, I think it's unreasonable to expect end-users to have a full working knowledge of the dependency tree for every package, so it's probably not a practicable solution to have them import in a specific order anyway.

I'm not sure what librosa users could do to work around that particular bug unless matplotlib chooses to fix this (or matplotlib-inline implements a workaround only for librosa users, which seems unlikely). AFAIK this affects all Jupyter Notebook users that import librosa.display.

I guess this would also come up with every other package that uses lazy-loader and matplotlib (eg, scikit-image, networkx, etc)? That seems like a growing list, so an mpl-side fix wouldn't necessarily be specific to us.

Maybe the better question is: is there something we could do to fix it, short of reverting the use of lazy-loader entirely?

@bmcfee
Copy link
Member

bmcfee commented Jun 14, 2023

Looking at this a bit more closely, it's possible that I was conflating this specific behavior with the more general issue that's been popping up with pyinstaller. The latter has to do with lazy loading full stop, while it seems like maybe this one could be isolated only to lazy-loading across package boundaries. Would that make sense?

If that's the case, I think we could potentially roll back the lazy loading of matplotlib modules:

if TYPE_CHECKING:
import matplotlib
import matplotlib.cm as mcm
import matplotlib.axes as mplaxes
import matplotlib.ticker as mplticker
import matplotlib.pyplot as plt
from matplotlib.collections import QuadMesh, PolyCollection
from matplotlib.lines import Line2D
from matplotlib.path import Path as MplPath
from matplotlib.markers import MarkerStyle
from matplotlib.colors import Colormap
else:
matplotlib = lazy.load("matplotlib")
mcm = lazy.load("matplotlib.cm")
mplaxes = lazy.load("matplotlib.axes")
mplticker = lazy.load("matplotlib.ticker")
plt = lazy.load("matplotlib.pyplot")

and just import them directly. I think in practice this wouldn't change anything for our users; we can still lazy-load librosa.display from __init__.py so that users don't have to explicitly import the submodule, but the display module could remain optional so that users don't need to install / import matplotlib if they're not using it. I'd be okay with that compromise if it smooths things over for users.

The only wriggle here is that if we're going to forbid lazy loading across package boundaries, we should probably do it across the board, right? We have a few other optional dependencies that are lazy-loaded (resampling libraries, mainly). However, we lazy-load the entire package there, so maybe the cross-package submodule issue would not be a problem?

I'm curious for @stefanv's input on this.

@stefanv
Copy link

stefanv commented Jun 15, 2023

I responded on scientific-python/lazy-loader#55 (comment)

I don't yet understand whether it is a flaw in the lazy loader, but perhaps rather matplotlib relying on side-effects of the standard import mechanism (which makes perfect sense, since up to now there has been only one such import mechanism) but I'll look whether we can build in a workaround.

I only see the behavior for matplotlib, but I'd like to distill which assumption is at play.

@bmcfee
Copy link
Member

bmcfee commented Jun 15, 2023

Thanks @stefanv ! I mainly wanted to make sure this wasn't just me misunderstanding how this is supposed to work, ie whether lazyloading submodules from other packages is specifically discouraged. But having read through the spec again, I don't think we're really doing anything wrong here.

That said, I'll still look into the solution proposed above just for the case of matplotlib as a stopgap solution until the dust settles.

@bmcfee bmcfee added Upstream/dependency bug Another package is causing us trouble! and removed off topic Issues unrelated to librosa labels Jun 15, 2023
@bmcfee bmcfee changed the title Use of LazyLoader breaks other libraries lazy_loading matplotlib submodules can cause errors Jun 15, 2023
@bmcfee bmcfee added this to the 0.10.1 milestone Jun 15, 2023
@bmcfee
Copy link
Member

bmcfee commented Jun 15, 2023

Took the liberty of updating the title here, since it does (for us) appear to be much more specifically related to matplotlib and not lazy-loading in general. The PR #1722 fixes this while retaining lazy-loading benefits in general from our side, so I'll slot this into the next minor release (0.10.1).

@bluenote10
Copy link
Contributor

Just got bitten by this as well. Looks like it can manifest itself in other ways as well. For instance, this

import librosa.display
import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 1)

errors with AttributeError: module 'matplotlib' has no attribute 'axes' instead.

In the specific example you provide, you could resolve the issue by importing librosa.display last instead of first. (Confirmed on my dev machine that this works.) This is probably good style anyway, as lower-level imports usually should come first, even if it's not supposed to really matter.

Please note that this isn't really a sustainable solution. When working in larger teams, CI enforced standardization of imports via isort is becoming more and more standard. These days, modern IDEs can largely solve the problem of auto-managing imports, resulting in a pretty big productivity boost. One simply shouldn't have to spend time manually maintaining imports, especially not having to worry about a certain order. In general, Python imports should be order independent. Anything along "it only works in this particular order" isn't sustainable, because it is very fragile and a highly likely waste of developer time.

Note that since isort defaults to alphabetical ordering, the enforced order implies that librosa.display comes before matplotlib.pyplot, which makes it particularly likely to end up with the wrong order here.

I'll slot this into the next minor release (0.10.1).

Thanks for fixing this!

@stefanv
Copy link

stefanv commented Jun 18, 2023

Please note that this isn't really a sustainable solution.

We are all well aware of this. I am trying to see if I can implement a solution in lazy_loader, but if not we may remove the .load function or warn about these side-effects.
The lazy_loader package is really mostly about internal lazy loading of library submodules, which is not affected.

@KilnOfTheSecondFlame
Copy link

Note that since isort defaults to alphabetical ordering, the enforced order implies that librosa.display comes before matplotlib.pyplot, which makes it particularly likely to end up with the wrong order here.

For anyone having this exact problem (as we did in our project), you can at least tell isort to skip the import of matplotlib and preserve the order until this is properly handled:

import matplotlib.pyplot as plt  # isort:skip

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Upstream/dependency bug Another package is causing us trouble!
Development

Successfully merging a pull request may close this issue.

5 participants