[Bug]: Python interpreter becomes slow at reading inputs after plotting with matplotlib #27515

Closed
aradley opened this issue Dec 14, 2023 · 7 comments · Fixed by #27528

@aradley

aradley commented Dec 14, 2023

Bug summary

I am experiencing an issue on two separate computers where after plotting with matplotlib the Python interpreter becomes very slow at reading new inputs. This is disruptive for coding and debugging.

I have provided a video below to show how the issue can be reproduced. This video was recorded on a new M3 Mac with a fresh install of Python and Visual Studio Code. The problem is reproducible outside of Visual Studio Code, i.e. when I run Python from the terminal, and I also experienced this issue on a (slightly) older M2 Mac.

screen-recording_2lZ84DeT.mp4

Python version: 3.9.6
Installed packages:
anndata 0.10.3
array-api-compat 1.4
cESFW 0.0.1
contourpy 1.2.0
cycler 0.12.1
dill 0.3.7
exceptiongroup 1.2.0
fonttools 4.46.0
h5py 3.10.0
importlib-resources 6.1.1
joblib 1.3.2
kiwisolver 1.4.5
llvmlite 0.41.1
matplotlib 3.8.2
multiprocess 0.70.14
natsort 8.4.0
numba 0.58.1
numpy 1.26.2
p-tqdm 1.4.0
packaging 23.2
pandas 2.1.4
pathos 0.3.1
Pillow 10.1.0
pip 23.3.1
plotly 5.18.0
ppft 1.7.6.7
pynndescent 0.5.11
pyparsing 3.1.1
python-dateutil 2.8.2
pytz 2023.3.post1
scikit-learn 1.3.2
scipy 1.11.4
seaborn 0.13.0
setuptools 69.0.2
six 1.16.0
tenacity 8.2.3
threadpoolctl 3.2.0
tqdm 4.66.1
tzdata 2023.3
umap-learn 0.5.5
wheel 0.42.0
zipp 3.17.0

Note that cESFW is one of my packages, which you can install from https://github.com/aradley/cESFW, but I would be very surprised if this was the issue as it uses basic Python packages.

Code for reproduction

import numpy as np
import matplotlib.pyplot as plt
import time

# Before plotting, reading and creating the below array is very fast.
start_1 = time.time()

Test_Array_1 = np.array(["An incredibly simple array that has lots of characters in it to show how slow VSCode starts to run after plotting something.",
                       "Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah",
                       "Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah",
                       "Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah"])

end_1 = time.time()
print(end_1 - start_1)

# Create and show a simple plot
plt.plot(np.arange(10))
plt.show()

# Now if I create the same array as before, the python interpreter takes much longer to read the input.
start_2 = time.time()

Test_Array_2 = np.array(["An incredibly simple array that has lots of characters in it to show how slow VSCode starts to run after plotting something.",
                       "Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah",
                       "Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah",
                       "Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah"])

end_2 = time.time()
print(end_2 - start_2)
# After plotting, the interpreter is orders of magnitude slower (ratio given by the calculation below)
print((end_2 - start_2)/(end_1 - start_1))

Actual outcome

Outputs easily obtained from above code.

Expected outcome

Outputs easily obtained from above code.

Additional information

No response

Operating system

macOS 14.1.2 (23B2091)

Matplotlib Version

3.8.2

Matplotlib Backend

MacOSX

Python version

3.9.6

Jupyter version

No response

Installation

pip

@tacaswell
Member

How exactly are you running this code? Are you sending the whole script at once or copy-pasting one line at a time into a shell (I'm on poor internet and cannot watch the video)?

One theory is that this sounds like the "app nap" issue again.

My memory of the problem is that macOS will "nap" applications that you are not using. When you show the plot, Python becomes a GUI application and is now eligible for napping. When you close the window, it looks to the OS like you are no longer using that application (because how can you use a program without a GUI?), and so Python only wakes up every so often, making it slow.

An alternate theory is that this is the event hook for macOS we added behaving badly.

Things to test:

  • If you downgrade Matplotlib, does it start working better?
  • If you use a different backend, does it start working better?
  • If you use IPython (or any non-readline based prompt), does it work better?
  • Does this depend on putting a small number of long strings into numpy arrays (they are dtype <U124, which, while valid, is an odd thing to put in an array)?
  • If you add plt.ion() to the top, does it get better?
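
A minimal sketch of how the backend and plt.ion() checks above could be tried in one place. The helper name try_backend is hypothetical (it is not part of Matplotlib), and "agg" is just one non-GUI backend picked for illustration; the helper returns None if Matplotlib is not installed.

```python
import importlib.util


def try_backend(name="agg", interactive=True):
    """Select a Matplotlib backend and report what is actually in effect.

    Hypothetical helper for illustration only. Returns None when
    Matplotlib is not installed, otherwise (backend name, interactive flag).
    """
    if importlib.util.find_spec("matplotlib") is None:
        return None

    import matplotlib

    matplotlib.use(name)  # switch away from the default (e.g. MacOSX) backend

    import matplotlib.pyplot as plt

    if interactive:
        plt.ion()   # interactive mode on
    else:
        plt.ioff()  # interactive mode off

    return matplotlib.get_backend(), plt.isinteractive()


print(try_backend("agg"))
```

Running the reproduction once per backend and once with and without interactive mode should help narrow down whether the slowdown is specific to the MacOSX backend.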

Contrary to my expectations, running this as a script on Linux shows a small effect (a 6-8x slowdown, but like 1e-6 vs 8e-6). Adding plt.ion() removes the slowdown, but using plt.ion() with plt.show(block=True) brings the slowdown back.

import numpy as np
import matplotlib.pyplot as plt
import time
import gc

# plt.switch_backend("agg")
plt.ion()
# Before plotting, reading and creating the below array is very fast.
pre = []
for j in range(50):
    start_1 = time.monotonic()
    for _ in range(100):
        Test_Array_1 = np.array(
            [
                "An incredibly simple array that has lots of characters in it to show how slow VSCode starts to run after plotting something.",
                "Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah",
                "Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah",
                "Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah",
            ]
        )

    end_1 = time.monotonic()
    print(end_1 - start_1)
    pre.append(end_1 - start_1)
# Create and show a simple plot
plt.plot(np.arange(10))
plt.show(block=True)
# gc.collect()
post = []
for j in range(50):
    # Now if I create the same array as before, the python interpreter takes much longer to read the input.
    start_2 = time.monotonic()
    for _ in range(100):
        Test_Array_2 = np.array(
            [
                "An incredibly simple array that has lots of characters in it to show how slow VSCode starts to run after plotting something.",
                "Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah",
                "Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah",
                "Blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah",
            ]
        )

    end_2 = time.monotonic()
    print(end_2 - start_2)
    post.append(end_2 - start_2)
# After plotting, the interpreter is magnitudes slower (ratio given by below calculation)
print(f"{np.mean(pre)=}, {np.std(pre)}")
print(f"{np.mean(post)=}, {np.std(post)}")

print(np.mean(post) / np.min(pre))

print(Test_Array_1.dtype)

is the slightly modified script I've been testing with via python test.py

@aradley
Author

aradley commented Dec 14, 2023

I just downgraded to matplotlib 3.6 (a random choice for the downgrade) and it removed the problem. The problem only started a month or so ago, so it appears to be an issue with the new version?

@aradley
Author

aradley commented Dec 14, 2023

Adding "plt.ion()" did not improve the issue for matplotlib==3.8.2

@oscargus
Contributor

Thanks for checking!

It turns out that there are modifications to the mac backend in both 3.8.1 (#27221 and #26970) and 3.8.2 (#27290) that, at least in theory and without knowing the details, could cause something like this. Would it be possible for you to downgrade to 3.8.1 (and if the problem is still there, to 3.8.0)? (Note that 3.7.4 has the same fix as 3.8.2, so if the problem is still present on 3.8.0, 3.7.3 is the next to try.)

@aradley
Author

aradley commented Dec 14, 2023

The problem is not fixed in 3.8.1, appears improved but not gone in 3.8.0 (seems weird but that's what I found), and appears entirely gone in 3.7.3.

@oscargus
Contributor

Thanks for checking again! Then there are a few more PRs that could have caused it. I think @greglucas is the one with the most knowledge in the area (and he has authored some of those PRs as well).

@greglucas
Contributor

The reason for this is that we are running the main runloop for 0.01 seconds for each new character we get, which adds up in a hurry in this situation. I just put up a PR to remove that pause in #27528, which sped things up for me locally.
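
A back-of-the-envelope sketch of how that pause adds up. It assumes the added cost is simply 0.01 seconds times the number of characters typed or pasted at the prompt and ignores everything else; the function and constant names are illustrative and not part of Matplotlib.

```python
# Assumed per-character pause, taken from the 0.01 s runloop figure above.
PAUSE_PER_CHAR = 0.01  # seconds


def estimated_input_delay(text: str, pause: float = PAUSE_PER_CHAR) -> float:
    """Rough estimate of the extra time spent while the prompt reads `text`."""
    return len(text) * pause


# One long line, similar in spirit to the strings in the reproduction script.
pasted = 'x = "' + "Blah " * 25 + '"\n'
print(f"{len(pasted)} characters -> about {estimated_input_delay(pasted):.2f} s of added delay")
```

Under this assumption a single line of a bit over a hundred characters already costs more than a second, which fits with pasted multi-line input feeling very slow.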

@QuLogic added this to the v3.8.3 milestone Dec 16, 2023