Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve headlessness detection for backend selection. #17396

Merged
merged 1 commit into from
Oct 11, 2020

Conversation

anntzer
Copy link
Contributor

@anntzer anntzer commented May 13, 2020

We currently check the $DISPLAY environment variable to autodetect
whether we should auto-pick a non-interactive backend on Linux, but that
variable can be set to an "invalid" value. A realistic use case is for
example a tmux session started interactively inheriting an initially
valid $DISPLAY, but to which one later reconnects e.g. via ssh, at which
point $DISPLAY becomes invalid.

Before this PR, something like

DISPLAY=:123 MPLBACKEND= MATPLOTLIBRC=/dev/null python -c 'import pylab'

(where we unset matplotlibrc to force backend autoselection) would crash
when we select qt and qt fails to initialize as $DISPLAY is invalid (qt
unconditionally abort()s via qFatal() in that case).
With this PR, we correctly autoselect a non-interactive backend.

Note that we were already relying on $DISPLAY being correctly set
before, and this PR also doesn't return "invalid display" if we can't
load X11, so this should not make anything worse on Wayland (at worst
we'll just fail to detect headlessness like before).

Also supporting Wayland was easy enough, so this also closes #18377.

PR Summary

PR Checklist

  • Has Pytest style unit tests
  • Code is Flake 8 compliant
  • New features are documented, with examples if plot related
  • Documentation is sphinx and numpydoc compliant
  • Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
  • Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

display = os.environ.get('DISPLAY')
if display is None or not re.search(r':\d', display):
raise RuntimeError('Invalid DISPLAY variable')
if is_x11_build and matplotlib._c_internal_utils.invalid_display():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this check also need to land in wx/tk/gtk flavors of the backends?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the important fix is in cbook._get_running_interactive_framework() because that's what'll determine fallback. The one in qt5 only affects people who directly instantiate qt canvases without going through the pyplot machinery (but then it's just unlikely that they're going through this path to create their QApplication; likely they're doing it themselves and this check doesn't help). tk and gtk throw proper exceptions when DISPLAY is invalid. wx apparently also abort()s, but as for qt I doubt that a check would really help.

dlsym(libwayland_client, "wl_display_connect");
int (* wl_display_disconnect)(struct wl_display*) =
dlsym(libwayland_client, "wl_display_disconnect");
if (wl_display_connect && wl_display_connect

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be wl_display_connect && wl_display_disconnect?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, thanks

#ifdef _WIN32
#include <Objbase.h>
#include <Shobjidl.h>
#include <Windows.h>
#endif

static PyObject* mpl_display_is_invalid(PyObject* module)
{
#ifdef __linux__
Copy link

@mstoeckl mstoeckl Aug 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would still make sense to check that environment variables DISPLAY, or WAYLAND_DISPLAY exist before hand. First, to avoid libwayland's behavior of defaulting to connect to 'wayland-0', even if no environment variables are set; second, because dlopen is very slow -- one one computer, calling dlopen followed by dlclose on libX11.so and libwayland-client.so costs between 1.4 msec (marginal cost), 2 msec (initial cost, hot cache), and 6 msec (uncached, initial cost). Importing matplotlib already costs me 240msec every script load. (Aside: It's not necessary to check env variable WAYLAND_SOCKET, because the socket named by it cannot be reused over two successive wl_display_connect calls.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check the speed of getenv()? (I genuinely do not know whether it is much faster than dlopen().)

Copy link

@mstoeckl mstoeckl Aug 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getenv is very fast, since the environment variables are provided to a program at the start, just like the command line arguments. My benchmark of getenv uses around 0.0002 msec to call both getenv("WAYLAND_DISPLAY") and getenv("DISPLAY").

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, then.

@anntzer anntzer force-pushed the guillotine branch 2 times, most recently from 9aa515f to 3e17318 Compare August 30, 2020 19:01
@anntzer
Copy link
Contributor Author

anntzer commented Aug 31, 2020

I inverted the logic to display_is_valid as now we should(?) support all relevant cases (X and Wayland), and this behavior is much easier to document.

src/_c_internal_utils.c Outdated Show resolved Hide resolved
src/_c_internal_utils.c Outdated Show resolved Hide resolved
src/_c_internal_utils.c Outdated Show resolved Hide resolved
src/_c_internal_utils.c Outdated Show resolved Hide resolved
We currently check the $DISPLAY environment variable to autodetect
whether we should auto-pick a non-interactive backend on Linux, but that
variable can be set to an "invalid" value.  A realistic use case is for
example a tmux session started interactively inheriting an initially
valid $DISPLAY, but to which one later reconnects e.g. via ssh, at which
point $DISPLAY becomes invalid.

Before this PR, something like
```
DISPLAY=:123 MPLBACKEND= MATPLOTLIBRC=/dev/null python -c 'import pylab'
```
(where we unset matplotlibrc to force backend autoselection) would crash
when we select qt and qt fails to initialize as $DISPLAY is invalid (qt
unconditionally abort()s via qFatal() in that case).
With this PR, we correctly autoselect a non-interactive backend.
@anntzer
Copy link
Contributor Author

anntzer commented Sep 1, 2020

yes to all

@@ -83,7 +148,7 @@ static PyMethodDef functions[] = {
"always returns None."},
{"Win32_SetForegroundWindow",
(PyCFunction)mpl_SetForegroundWindow, METH_O,
"Win32_SetForegroundWindow(hwnd)\n--\n\n"
"Win32_SetForegroundWindow(hwnd, /)\n--\n\n"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the slash doing here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This marks hwnd as being positional-only (https://www.python.org/dev/peps/pep-0570/), which it is (being implemented with METH_O).

@timhoffm timhoffm merged commit 4c35d32 into matplotlib:master Oct 11, 2020
@anntzer anntzer deleted the guillotine branch October 11, 2020 17:19
QuLogic added a commit to QuLogic/matplotlib that referenced this pull request Dec 9, 2020
Since matplotlib#17396, backends now check for Wayland settings as part of
headless detection. However, the headless tests do not override those
settings. When running on Wayland, they thus think the display exists
when they are not supposed to.
@QuLogic QuLogic mentioned this pull request Dec 9, 2020
3 tasks
QuLogic added a commit to QuLogic/matplotlib that referenced this pull request Dec 9, 2020
Since matplotlib#17396, backends now check for Wayland settings as part of
headless detection. However, the headless tests do not override those
settings. When running on Wayland, they thus think the display exists
when they are not supposed to.
QuLogic added a commit to QuLogic/matplotlib that referenced this pull request Dec 9, 2020
Since matplotlib#17396, backends now check for Wayland settings as part of
headless detection. However, the headless tests do not override those
settings. When running on Wayland, they thus think the display exists
when they are not supposed to.
QuLogic added a commit to QuLogic/matplotlib that referenced this pull request Dec 10, 2020
Since matplotlib#17396, backends now check for Wayland settings as part of
headless detection. However, the headless tests do not override those
settings. When running on Wayland, they thus think the display exists
when they are not supposed to.
@rnhmjoj
Copy link
Contributor

rnhmjoj commented May 25, 2021

I came here by bisecting: this commit breaks the automatic selection of an interactive backend for me.
When running matplotlib (built with either Tk, GTK3, QT5) in an X session and with a valid DISPLAY variable, the agg backend is chosen.

EDIT: It looks like the problem is in the mpl_display_is_valid function.
EDIT 2: It's the dlopen() failing because it can't find the shared object (NixOS here). I don't think there is any other way beside patching.

@anntzer
Copy link
Contributor Author

anntzer commented Jul 18, 2021

Sorry for not noticing your message earlier (feel free to ping us if no one replies, sometimes no one watches old threads...). Your diagnosis seems reasonable, but is there any way to find the path to libX11.so/libwayland-client.so on nixos without linking them? attn @QuLogic, perhaps you may know a bit about this?

@QuLogic
Copy link
Member

QuLogic commented Jul 19, 2021

I think the point of NixOS is to hide all libraries between packages, unless they need them. So unless you can find a way mark X/Wayland as needed by Python/Matplotlib, I'm not sure what else can be done.

@rnhmjoj
Copy link
Contributor

rnhmjoj commented Jul 19, 2021

That's correct: there is no way to detect the library at runtime and it's the way is supposed to be.
There is something that could be done, though: if the python package were to fail at build time or signal in some way which shared libraries it needs at runtime, it could prevent this kind of problem in the future. I'm not too familiar with setuptools, so I'm not sure it can be done.

@anntzer
Copy link
Contributor Author

anntzer commented Jul 28, 2021

Would it work for nixos to simply force _c_internal_utils to link with libX11.so/libwayland-client.so? (this could be done easily by patching setupext.py to add the right entry under libraries=...) I don't know think we want to do that by default (as we certainly don't want to force a dependency on either libraries on other OSes), though.

@jicksaw
Copy link

jicksaw commented Dec 2, 2022

For any other timetravellers, the detection issue on NixOS was apparently fixed in NixOS/nixpkgs#124439. The fix requires matplotlib to be built with enableTk/Gtk/Qt.

I got here because you can't easily enable it when using poetry2nix nix-community/poetry2nix#514 (comment) and just using MPLBACKEND=tkagg had no effect. My workaround is to boldly mpl.use("tkagg")

I guess MPLBACKEND=tkagg would work if it was assumed the session is interactive when one of the interactive backends is set in it? Dunno what other effects that would have.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Matplotlib picks a headless backend on Linux if Wayland is available but X11 isn't
7 participants