Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not send duplicated wakeups to main loop #1082

Closed
wants to merge 1 commit into from

Conversation

blahgeek
Copy link
Contributor

Problem

On macOs, when a process produces lots of outputs (e.g. seq 1 1000000000), Kitty's refresh rate would go down significantly (down to ~1fps).

Cause

The wait_for_events() function in main loop would take much more time when there's lots of outputs, to handle multiple wakeup events from IO thread.

Solution

Only send single wakeup to main thread on each loop.

@kovidgoyal
Copy link
Owner

incrementing and comparing integers is not atomic, so you cant really use it for cross-thread synchronization. And while you could possibly use atomic operations for integer increment and comparison, wait_for_events() and wakeup_main_loop() themselves are not atomic. which means you could end up missing wakeups if the timing is not right

@kovidgoyal
Copy link
Owner

also as far as I know, wait_for_events() should wakeup just once even if there are multiple calls to wakeup_main_loop(). Is that not the case on macOS?

@blahgeek
Copy link
Contributor Author

Each call to wakeup_main_loop() dispatches an event to main loop and wait_for_events() would need to handle all of them, which takes a long time (~0.5 seconds in macOS while running seq, not sure about linux).

Atomic integer is not required here since only one thread would alter the value, the other thread just reads.

There's indeed small chances of missing wakeups (when a wakeup is to be sent between wait_for_events() and main_loop_id += 1), but there's always timeout for waiting.

I would try to work out another solution without causing missing wakeups if I got some free time.

@kovidgoyal
Copy link
Owner

yeah but the event is an empty event, it does not actually require any handling, beyond reading it off the queue. I'm surprised that is expensive in cocoa. It might be possible to further optimize it in _glfwPlatformPollEvents in cocoa_window.m Check the event type and if it is NSEventTypeApplicationDefined dont call sendEvent with it.

@blahgeek
Copy link
Contributor Author

I tried to not send out NSEventTypeApplicationDefined events in _glfwPlatformPollEvents, it doesn't help much.

@kovidgoyal
Copy link
Owner

Ok well I suppose we can use something along the lines of what you propose, the chances of delayed input processing are pretty low even without using mutexes, which might be expensive.

@Luflosi
Copy link
Contributor

Luflosi commented Oct 21, 2018

Even if only one thread is writing and one thread is reading, without using mutexes there will still be data races and thus undefined behaviour. See https://en.wikipedia.org/wiki/Race_condition#Software

@blahgeek
Copy link
Contributor Author

Yes you're right. Memory fence should be used.

@kovidgoyal
Copy link
Owner

An alternate approach is to make use of the fact that the main thread only parses input if input_delay time has passed. So there is no need for the IO loop to wakeup the main loop more often than that. I dont however know if the input delay is large enough to materially affect performance or not.

@blahgeek
Copy link
Contributor Author

blahgeek commented Oct 23, 2018

Does this commit increases maximum input delay to be input_delay * 2?

Edit: more than that. Given input_delay=100, if first poll call takes 99 ms and second poll call takes forever, the wakeup would never be triggered?

@kovidgoyal
Copy link
Owner

poll now takes a timeout. So the second poll call can take at most 1ms in your example. and why would maximum input delay double? just as poll now takes a timeout in the I/O loop, the main loop check for input delay also takes a timeout (sets maximum wait time).

@blahgeek
Copy link
Contributor Author

Ah, my bad, I though both IO loop and main loop (parse function) would wait for input_delay, I was wrong, they use new_input_at timestamp.

👍

@blahgeek
Copy link
Contributor Author

For the poll timeout, I would suggest using ceil(time_delta) to prevent unnecessary multiple timeouts.

@kovidgoyal
Copy link
Owner

the poll timeout is wrong anyway right now, the value needs to be converted from seconds to milliseconds

@kovidgoyal
Copy link
Owner

fixed and also used ceil

@kovidgoyal
Copy link
Owner

kovidgoyal commented Oct 23, 2018 via email

@kovidgoyal
Copy link
Owner

Oh and not to mention that in your scenario above the main_loop would set max_wait_time to 1 ms and wakeup itself 1ms later

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants