
High CPU usage on Windows #5591

Closed
blagoev opened this issue Jun 15, 2022 · 4 comments

Comments

blagoev (Contributor) commented Jun 15, 2022

We have been seeing high CPU usage on Windows for some time now. The problem was first noticed on our CI, which runs on GitHub Actions with two-core VMs. We managed to reproduce it locally by restricting the process affinity to two cores, simulating the GitHub Actions environment (a sketch of that step is shown below), and captured the profile information that follows.
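For reference, a minimal sketch of restricting a process to two cores on Windows, assuming the Win32 SetProcessAffinityMask API; the 0x3 mask (the first two logical cores) is illustrative, not necessarily the mask used for this report. The same effect can be achieved from Task Manager or with start /affinity 0x3 <app>.exe.

    // Hypothetical reproduction helper (not part of the report): pin the
    // current process to two logical cores before running the workload.
    #include <windows.h>
    #include <cstdio>

    int main()
    {
        // 0x3 = logical cores 0 and 1; chosen for illustration only.
        if (!SetProcessAffinityMask(GetCurrentProcess(), 0x3)) {
            std::printf("SetProcessAffinityMask failed: %lu\n", GetLastError());
            return 1;
        }
        // ... start the workload that exhibits the high CPU usage here ...
        return 0;
    }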

This appears to be the offending function:

Function Name | Total CPU [unit, %] | Self CPU [unit, %] | Module
--- | --- | --- | ---
realm::util::network::Service::IoReactor::wait_and_advance | 35143 (97.53%) | 7557 (20.97%) | realm_dart

The code with the highest CPU usage is:

            } while (ret == 0 &&
                     (duration_cast<milliseconds>(steady_clock::now() - started).count() < max_wait_millis));

This takes 89% of the execution time.

Here is a screenshot of the profiling session:
[profiling session screenshot]

Note that because of a debug assertion on Windows, the SDK is actually paused and not running, but these background threads continue to peg the CPU.
Here is another screenshot, taken with Process Explorer:
[Process Explorer screenshot]

The code is prefixed with this comment, which shows that there is a special path for Windows in wait_and_advance:

// Windows does not have a single API call to wait for pipes and
// sockets with a timeout. So we repeatedly poll them individually
// in a loop until max_wait_millis has elapsed or an event happened.
//
// FIXME: Maybe switch to Windows IOCP instead.

// Following variable is the poll time for the sockets in
// milliseconds. Adjust it to find a balance between CPU usage and
// response time:
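To make the cost of this approach concrete, here is a simplified, self-contained sketch of the pattern the comment describes; the function and variable names are illustrative and this is not the actual realm-core implementation. With a per-iteration poll time near zero, the loop degenerates into a busy wait, which matches the profile above.

    // Illustrative sketch only: a wait loop that polls handles with a ~0 ms
    // timeout spins the CPU until max_wait_millis elapses.
    #include <chrono>
    #include <cstdio>

    using namespace std::chrono;

    // Stand-in for polling each pipe/socket handle individually with a short
    // timeout; returning 0 means "no event yet".
    static int poll_handles_once(int /*poll_time_millis*/)
    {
        return 0;
    }

    int main()
    {
        const long max_wait_millis = 1000; // overall wait budget
        const int poll_time_millis = 0;    // near-zero poll => busy loop
        const auto started = steady_clock::now();
        long iterations = 0;
        int ret;
        do {
            ret = poll_handles_once(poll_time_millis);
            ++iterations;
        } while (ret == 0 &&
                 duration_cast<milliseconds>(steady_clock::now() - started).count() < max_wait_millis);
        // With no sleep between iterations this runs millions of times per
        // second, which is what shows up as ~100% CPU in the profile.
        std::printf("poll iterations in %ld ms: %ld\n", max_wait_millis, iterations);
        return 0;
    }

A common mitigation for this pattern is to sleep for the poll interval between iterations (or raise the interval), trading response time for CPU; a proper fix is to block in the kernel, for example via IOCP, as discussed in the comments below.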

We think this is high priority, since it drives the CPU to 100% and makes any Realm Sync operation run very slowly.

EDIT: This is observed on Windows.

blagoev changed the title from "High CPU usage" to "High CPU usage on Windows" on Jun 15, 2022
nirinchev (Member) commented:

Note that if this turns out to be difficult to fix, we can also just prioritize platform networking for data and get rid of our custom networking implementation altogether.

sync-by-unito bot commented Jun 15, 2022

➤ Jonathan Reams commented:

I think fixing this issue is an unknown amount of work right now. Just "switching to Windows IOCP" would likely solve it, but it would also be a substantial rewrite of how we do networking on Windows. We could also pull in ASIO - which our networking library is somewhat based on, and which actually has good IOCP support - and use that as an alternate networking implementation on Windows. I have no good estimates of what that would do for binary size or performance, but it won't peg any CPUs - at least not for the same reasons. We could also just bump up the priority of platform networking projects on the platforms that support Windows.

Regardless of what we do, I think this is likely a forever bug that's not going to be solvable within the next month or two without re-shuffling some other priorities.
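For illustration, a minimal sketch of what an Asio-based wait with a timeout looks like (standalone Asio shown; the same API exists under boost::asio). This is not realm-core's networking layer and the socket setup is simplified. On Windows, Asio dispatches this through IOCP, so the thread blocks in the kernel for the duration instead of polling each handle in a user-space loop.

    // Illustrative Asio example (standalone Asio, C++11 or later); not the
    // realm-core implementation.
    #include <asio.hpp>
    #include <chrono>
    #include <cstdio>
    #include <system_error>

    int main()
    {
        asio::io_context io;

        // Open a TCP socket; in a real client it would be connected to a peer.
        asio::ip::tcp::socket sock(io, asio::ip::tcp::v4());

        // Ask to be notified when the socket becomes readable; on Windows this
        // is backed by IOCP rather than a user-space poll loop.
        sock.async_wait(asio::ip::tcp::socket::wait_read, [](std::error_code ec) {
            if (!ec)
                std::puts("socket readable");
        });

        // Block for up to 100 ms waiting for completions; the thread sleeps in
        // the kernel instead of spinning.
        io.run_for(std::chrono::milliseconds(100));
        return 0;
    }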

sync-by-unito bot commented Jun 21, 2022

➤ James Stone commented:

We have merged a mitigation in #5594. I am reducing the priority of this, but we can keep it open until we have a better long-term solution.

sync-by-unito bot closed this as completed on Apr 18, 2023
sync-by-unito bot commented Apr 18, 2023

➤ marysiapietraszewska commented:

Will be fixed by introducing platform networking

github-actions bot locked this as resolved and limited conversation to collaborators on Mar 21, 2024