Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrapping the menu_and_tabs example into a @ui.page("/") causes continuous 100% CPU load #2482

Closed
pplno opened this issue Jan 31, 2024 · 14 comments · Fixed by #2867
Closed

Wrapping the menu_and_tabs example into a @ui.page("/") causes continuous 100% CPU load #2482

pplno opened this issue Jan 31, 2024 · 14 comments · Fixed by #2867
Assignees
Labels
bug Something isn't working
Milestone

Comments

@pplno
Copy link

pplno commented Jan 31, 2024

Description

When I take the menu_and_tabs example and wrap it into a @ui.page("/") like below, the process then runs with continous 100% CPU load when Tab A or Tab C are active. The CPU load drops to 0% when Tab B is active. The issue doesn't persist for the original menu_and_tabs example.

I reproduced the issue in a fresh .venv with nicegui==1.4.13 under Python 3.10 and 3.12

from nicegui import ui
@ui.page("/")
def index_page() -> None:

    with ui.header().classes(replace='row items-center') as header:
        ui.button(on_click=lambda: left_drawer.toggle(), icon='menu').props('flat color=white')
        with ui.tabs() as tabs:
            ui.tab('A')
            ui.tab('B')
            ui.tab('C')

    with ui.footer(value=False) as footer:
        ui.label('Footer')

    with ui.left_drawer().classes('bg-blue-100') as left_drawer:
        ui.label('Side menu')

    with ui.page_sticky(position='bottom-right', x_offset=20, y_offset=20):
        ui.button(on_click=footer.toggle, icon='contact_support').props('fab')

    with ui.tab_panels(tabs, value='A').classes('w-full'):
        with ui.tab_panel('A'):
            ui.label('Content of A')
        with ui.tab_panel('B'):
            ui.label('Content of B')
        with ui.tab_panel('C'):
            ui.label('Content of C')

ui.run()

edit: in the meantime I have reproduced the issue on a second machine.
The smallest code to reproduce the problem appears to be the ui.tabs themselves

from nicegui import ui

@ui.page("/")
def index_page() -> None:
    
    with ui.tabs():
        ui.tab('A')
        ui.tab('B')
        ui.tab('C')
        ui.tab('D')

ui.run()
@pplno pplno changed the title Wrapping the menu_and_tabs example into a @ui.page("/") causes continuous 100% CPU load when Wrapping the menu_and_tabs example into a @ui.page("/") causes continuous 100% CPU load Jan 31, 2024
@falkoschindler
Copy link
Contributor

Wow @pplno, that's very strange.
I couldn't reproduce it immediately though.
What OS and browser are you using? Does the CPU remain at 100% when closing the tab, even after waiting?

@pplno
Copy link
Author

pplno commented Feb 1, 2024

I'm on Windows 11 with Chrome 121.0.6167.140 and Edge 121.0.2277.83. Both browser show the same behaviour.
The CPU goes down to 0% when you close the tab, but not when unfocusing it.

In the meantime the routing / paging itself turned out to be the culprit, as I can reproduce the behaviour even without any tabs on the page (see minimal code below). As mentioned above, the tabs itself without paging also show no increased CPU load.

from nicegui import ui

@ui.page('/')
def index_page():
    ui.label('Test')

ui.run()

However, by continously refreshing the page for n times I can bring down the CPU load to 0%, just to make it pop again after n+1. I cannot see any pattern to the number of times n that it takes though. No difference between hard and soft refreshes.

Also, my initial comment that there are "innocent" tabs was wrong, as I tried to reproduce it and the "innocent" tab actually change erratically. I guess it's just cycling through refreshes until the magic n-th refresh makes a tab act innocent.

Edit: third machine (Ryzen 9 5900x), Windows 10, fresh .venv with nicegui==1.4.13 and Python 3.12.1 with Chrome 121.0.6167.140, completely different, now more or less open home network: same behaviour.

Just to clarify: with 100% CPU load I'm referring to 100% of a single core the process is running on, e.g. ~8% of my 12-core Ryzen 9 5900x - but still enough to make my fans spin on all 3 machines.
Am I missing something here? Is it just the expected load for a nicegui application?

@falkoschindler falkoschindler added the help wanted Extra attention is needed label Feb 2, 2024
@rodja
Copy link
Member

rodja commented Feb 2, 2024

Is it just the expected load for a nicegui application?

No, this should not happen. Can anyone else reproduce this issue?

@anjomro
Copy link

anjomro commented Feb 8, 2024

I stumbled over the same or a very similar problem. I was trying to create a page with many cards (~200) that consisted of an image loaded from an url and a title. When the initial request was done and the all the elements were visible (except the images which are loaded lazily) the page froze completely. Even Firefox showed a popup notifying me about a performance issue with the page and advised to stop the process. Similar to the described issue one CPU core is maxed out for the duration of the freeze (~30s).

I wrote this simple showcase to see if it made a difference if the mentioned @ui.path() decorator was used. The navbar / header is mainly there to see if the page is stuck, as you can see that very easily by hovering the mouse over a tab or button and see if it focuses correctly when hovering.

from nicegui import ui

def spam():
    with ui.header().classes(replace='row items-center') as header:
        ui.button(on_click=lambda: left_drawer.toggle(), icon='menu').props('flat color=white')
        with ui.tabs() as tabs:
            ui.tab('A')
            ui.tab('B')
            ui.tab('C')
        with ui.link(target="/foo"):
            ui.button("Foo")

    with ui.row():
        for i in range(200):
            with ui.link(target=f"/doc/{i}"):
                with ui.card().tight().classes("w-72"):
                    ui.image(f"https://picsum.photos/300/200?{i}").classes("h-full")
                    # element.get_img_ui(7)
                    with ui.card_section():
                        ui.label(f"Title {i}")


@ui.page("/foo")
def foo():
    spam()


spam()

ui.run()

And indeed there is a significant difference:

CPU Usage on / (Without @ui.path())
nicegui-without-path

CPU Usage on /foo (With @ui.path()) - Stuck for ~15s
nicegui-with-path

I've conducted the tests on Linux Mint 21 (uses Ubuntu 22.04 LTS as base) with Firefox 121 and an AMD Ryzen 5 5600G.

I've used the Firefox Profiler for making a recording of the calls during the loading when the problem was appearing (different code than the one above). I disabled prod_js so that the method names are meaningful.

https://share.firefox.dev/3SPlbGz

@oidex
Copy link

oidex commented Feb 28, 2024

@rodja An additional nugget of info:
Running into the same problem as the OP on windows 10, python 3.11, nicegui 1.4.17.
However, running the same code in a docker container works without excessive CPU usage.

@afullerx
Copy link
Contributor

afullerx commented Apr 2, 2024

I'm also seeing the issue of the original poster. (Windows 10, Python 3.12, NiceGUI 1.4.19). Upon loading a @ui.page() generated page, the process randomly ends up in either a normal state (using ~0.1% of my CPU) or a high-CPU state (maxing out one CPU core). Strangely, whichever state it ends up in, displaying a dialog box causes it to toggle into the other state. I reproduced it with Firefox, Chrome, and Native-mode.

After using VizTracer to capture a trace of the program in both normal and max-CPU states. I discovered that the CPU hogging code was in the Outbox.loop() function in outbox.py. There were two separate tasks running this loop, consuming the entire CPU core. After putting in some code to instrument the function, I determined the following line is the culprit:

await asyncio.sleep(0.01)

(from Outbox.loop() in outbox.py)

There seems to be a bug in the asyncio event loop or sleep function where it gets into a state where sleep times are not respected. This results in the two tasks running this loop being resumed approximately every 50μs instead of 10ms as intended. Consequently, these two tasks take turns being executed on every iteration of the event loop, consuming all spare cycles of the core while allowing the interface to stay responsive.

A workaround I've found is to set the sleep duration on the above line to 0.016 (16ms). I could easily trigger the bug with all sleep times 15ms or less, but never for sleeps of 16ms or greater. I assume this has something to do with the standard Windows timer resolution of ~15.6ms.

I think this is probably a Windows only issue. If changing the sleep duration is unacceptable for all platforms, perhaps it could be changed on startup or installation for Windows users only.

If anyone is interested, I can provide the captured traces of both normal and CPU-maxing behavior.

Also, the issue in #2482 (comment) seems to be unrelated.

@rodja
Copy link
Member

rodja commented Apr 2, 2024

Wow, thanks for pointing into the direction of Windows timer resolution. I found the following:

@afullerx
Copy link
Contributor

afullerx commented Apr 2, 2024

Yah, I saw those when trying to figure this out. This seems to be a more serious bug though as I'm seeing a complete collapse in the functioning of asyncio.sleep. It goes from returning in 10-11ms as specified in the call to returning in < 50μs and the task resuming in the next asyncio loop iteration. Thats 10,000μs vs 50. A factor of 200 isn't just an inaccuracy in the Windows clock. In this case, it's the difference between <1% CPU usage and 100%. And there's also the issue of it toggling between completely normal behavior and non-functional behavior at random.
But yes, hopefully they fix it and, until then, at least it's documented here that any Windows users encountering this issue can try changing

await asyncio.sleep(0.01) 

to

await asyncio.sleep(0.016) 

in the Outbox.loop() function of outbox.py. This completely resolved the issue for me.

Love this project so just wanted to do my part. It would be interesting to know if this solution worked for @pplno or @oidex. I have a feeling it's affecting many others without them realizing it as you'd have to be monitoring your CPU usage, notice your fans spinning up, or your battery draining because the NiceGUI interface does continue to function.

@falkoschindler falkoschindler added the bug Something isn't working label Apr 2, 2024
@falkoschindler falkoschindler added this to the 1.4.21 milestone Apr 2, 2024
@falkoschindler falkoschindler self-assigned this Apr 3, 2024
@falkoschindler falkoschindler removed the help wanted Extra attention is needed label Apr 3, 2024
@afullerx
Copy link
Contributor

afullerx commented Apr 5, 2024

@falkoschindler I see you assigned this issue to yourself. You might want to hold off doing anything as I'm going to submit a small PR with a better solution than what I outlined above. It'll take me a few days because I need to get the test suite up and running.

@falkoschindler
Copy link
Contributor

falkoschindler commented Apr 5, 2024

@afullerx Ah, thanks for giving me a heads up!

    • I was about to replace the asyncio.sleep calls in outbox.py with the use of asyncio.Event, so that we don't need check the outbox in regular intervals but only when items have been added.
    • Furthermore, while at it, I wanted to remove the check_interval for methods involving JavaScript calls with asyncio events as well.
    • Last but not least I wanted to replace the sleep of 0.01s in native_mode.py with a more robust delay of 0.016s.

What do you have in mind?

@afullerx
Copy link
Contributor

afullerx commented Apr 5, 2024

@falkoschindler Wow, Number 1 is exactly what I was going to submit. It's done, just need to fully test. It has the added benefit that it drops CPU usage when idle by 80% on my system. Then 3 would not be necessary but may help avoid future issues with the broken asyncio.sleep method on Windows.

@falkoschindler
Copy link
Contributor

Great! I'm looking forward to your pull request. You can focus on number 1 and I'll add 2 and maybe 3 later. 🙂

@falkoschindler
Copy link
Contributor

For number 2 (removing the check_interval) I created PR #2827. But since this PR solves only part of this issue, I didn't link both tickets so that the issue doesn't get auto-closed. Sorry for the back-and-forth.

@me21
Copy link
Contributor

me21 commented Apr 12, 2024

Newer CPython issue link (after migration to GitHub): python/cpython#75720

@falkoschindler falkoschindler modified the milestones: 1.4.22, 1.4.23 Apr 15, 2024
falkoschindler added a commit that referenced this issue Apr 16, 2024
…2482) (#2867)

* Add asyncio events to outbox

* Moved event clearing as a minor optimization

* clean up timeout argument

* Change timeout duration

* Handle connection timeout

* Add log import

* remove extraneous whitespace

* code review and improvement

* Add asyncio.TimeoutError for Python < 3.11

* Fix test hangs

---------

Co-authored-by: Falko Schindler <falko@zauberzeug.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants