Fix output stuttering using a ticket lock #10653

lhecker · 2021-07-14T00:58:56Z

SRWLOCK, as used by std::shared_mutex, is a inherently unfair mutex
and makes no guarantee whatsoever whether a thread may acquire the lock
in a timely manner. This is problematic for our renderer which relies on
being able to acquire the lock in a timely and predictable manner.
Drawing stalls of up to one minute have been observed in tests.

This issue can be solved with a primitive ticket lock, which is 10x
slower than a SRWLOCK but still sufficiently fast for our use case
(10M locks per second per thread). It's likely that any non-trivial lock
duration will diminish the difference to "negligible".

Validation Steps Performed

It still blends ✔️

skyline75489 · 2021-07-14T04:52:29Z

May I ask what exactly is the "stuttering" you mentioned? Also how can I repro it locally.

zadjii-msft · 2021-07-14T13:34:24Z

Are there any before/after benchmarks on this?

WSLUser · 2021-07-14T14:44:36Z

Would something like https://github.com/AlexeyAB/object_threadsafe be a good alternative that also improves speed?

lhecker · 2021-07-14T14:51:26Z

@WSLUser While that project shows very impressive benchmark numbers, it's an unfair lock design as well and would lead to the same issue that the SRWLOCK has in our use case.

lhecker · 2021-07-14T16:53:48Z

@zadjii-msft I've finished benchmarking the new lock.

Before:

Function Name	Total CPU [unit, %]	Self CPU [unit, %]
ControlCore::_connectionOutputHandler	7863 (12.83 %)	2 (0.00 %)
┣ StateMachine::ProcessString	7840 (12.79 %)	308 (0.50 %)
┣ [External Call] RtlAcquireSRWLockExclusive	11 (0.02 %)	11 (0.02 %)
┣ [External Call] RtlReleaseSRWLockExclusive	5 (0.01 %)	5 (0.01 %)
┣ _ReceivedOutputHandlers	5 (0.01 %)	0 (0.00 %)

After:

Function Name	Total CPU [unit, %]	Self CPU [unit, %]
ControlCore::_connectionOutputHandler	7592 (13.10 %)	1 (0.00 %)
┣ StateMachine::ProcessString	7572 (13.07 %)	284 (0.49 %)
┣ til::ticket_lock::lock	15 (0.03 %)	1 (0.00 %)
┣ [External Call] RtlpWakeByAddress	2 (0.00 %)	2 (0.00 %)
┣ _ReceivedOutputHandlers	2 (0.00 %)	0 (0.00 %)

The difference appears to be within the margin of error.

I'd love to make a video of the stuttering/freezes, but they only occur when the stars align, since it heavily depends on how the scheduler feels like on any given day. It's pretty spectacular when it happens though, because it seemingly freezes the screen for multiple seconds.
In fact I'm about 100% certain that this is exactly what happened in the later parts of the refterm v2 YouTube video, when WT seemingly freezes when printing long lines. If you didn't see it yet, I recommend checking out those freezes - it's pretty "spectacular" indeed.

src/cascadia/PublicTerminalCore/HwndTerminal.cpp

src/inc/til/ticket_lock.h

DHowett · 2021-07-14T16:59:17Z

src/inc/til/ticket_lock.h

+            for (;;)
+            {
+                const auto current = _now_serving.load(std::memory_order_acquire);
+                if (current == ticket)


there is provably no case where now_serving will jump multiple steps above the expected ticket value, correct?

It is possible, if someone where to call unlock() multiple times.
But if you don't, it works exactly like the Wikipedia article describes the algorithm.

DHowett

It's interesting to consider the work we've done to make sure we were using a reader/writer lock correctly, and possibly how little of an impact it actually had on our performance. Is there a possible net loss for operations that require reading, as they now contend with operations that require writing? Ideally, reading would be much more common and therefore "fast-pathable".

lhecker · 2021-07-14T17:39:02Z

@DHowett A shared lock, like the one we used before this PR, would only help with concurrent reads, as locking the mutex as a reader prevents any writers from progressing.
And I don't believe we have any high-frequency concurrent reads into our console buffer. We only use it for:

Retrieving the current selected text - low read rate
When hovering or clicking links - low read rate
When hovering a cell (I guess? it's UpdateHoveredCell which I don't understand what it's for) - low read rate
Accessibility cursor notifications (CurrentCursorPosition event) - might be high-ish read rate?
Rendering - high read rate, high lock duration

We can't really build a lock that fast-paths reads, because that would lead to an unfair lock again.
However we could build a lock which allows concurrent reads, but the lock design would then be much more complicated than it currently is.

DHowett

I love this.

As a note, I edited your pull request body to match the commit message style of the repo.

DHowett · 2021-07-14T19:51:47Z

src/cascadia/TerminalCore/Terminal.hpp

+    //
+    // But we can abuse the fact that the surrounding members rarely change and are huge
+    // (std::function is like 64 bytes) to create some natural padding without wasting space.
+    til::ticket_lock _readWriteLock;


If you add the alignas now, would that make it safer (in case somebody changes the order) and not compromise the layout by adding wasted padding space?

I ask because... this is a micro-optimization somebody could easily accidentally remove or not fully understand and fail to maintain.

alignas would add 56 bytes of padding space and force _readWriteLock to be 64 bytes large.
I had hoped that the comment I added there would be enough to prevent others from accidentally changing the order. It someone would move it to be next to the later members in this class it would double the overhead of the ticket lock (from 0.03% to 0.06% CPU usage), but I think the comment should be an okay-ish reminder to not do that. I hope?

I think the comment is OK. I wonder if we'd ever be possessed to remove all the std::functions though and then the trick wouldn't work anymore.

DHowett · 2021-07-14T19:58:42Z

@msftbot make sure @miniksa signs off

ghost · 2021-07-14T19:58:46Z

Hello @DHowett!

Because you've given me some instructions on how to help merge this pull request, I'll be modifying my merge approach. Here's how I understand your requirements for merging this pull request:

I'll only merge this pull request if it's approved by @miniksa

If this doesn't seem right to you, you can tell me to cancel these instructions and use the auto-merge policy that has been configured for this repository. Try telling me "forget everything I just told you".

miniksa · 2021-07-14T23:26:58Z

src/cascadia/TerminalCore/Terminal.cpp

 {
-    return std::shared_lock<std::shared_mutex>(_readWriteLock);
+    return std::unique_lock{ _readWriteLock };


If there's no difference any more between Locking for Writing and Reading thanks to the mechanics of the ticket lock... wouldn't it make sense to have one call the other to make that super clear until the future revision where you build that more complicated fair read/write lock for funsies?

miniksa · 2021-07-14T23:39:41Z

src/cascadia/TerminalCore/Terminal.hpp

+    //
+    // But we can abuse the fact that the surrounding members rarely change and are huge
+    // (std::function is like 64 bytes) to create some natural padding without wasting space.
+    til::ticket_lock _readWriteLock;


I think the comment is OK. I wonder if we'd ever be possessed to remove all the std::functions though and then the trick wouldn't work anymore.

ghost · 2021-08-31T17:09:48Z

🎉Windows Terminal Preview v1.11.2421.0 has been released which incorporates this pull request.:tada:

Handy links:

Fix output stuttering using a ticket lock

e683f21

lhecker force-pushed the dev/lhecker/ticket-lock branch from 903f3c6 to e683f21 Compare July 14, 2021 01:13

lhecker added Area-Quality Stability, Performance, Etc. Product-Terminal The new Windows Terminal. labels Jul 14, 2021

DHowett requested a review from miniksa July 14, 2021 16:55

DHowett reviewed Jul 14, 2021

View reviewed changes

src/cascadia/PublicTerminalCore/HwndTerminal.cpp Show resolved Hide resolved

DHowett reviewed Jul 14, 2021

View reviewed changes

Address reviewer comments

b52a9c1

DHowett approved these changes Jul 14, 2021

View reviewed changes

ghost added the AutoMerge Marked for automatic merge by the bot when requirements are met label Jul 14, 2021

miniksa approved these changes Jul 14, 2021

View reviewed changes

ghost merged commit f68324c into main Jul 14, 2021

ghost deleted the dev/lhecker/ticket-lock branch July 14, 2021 23:41

miniksa mentioned this pull request Jul 15, 2021

SSH from bash can freeze the console when non regular characters are outputed. #53

Closed

carlos-zamora mentioned this pull request Aug 12, 2021

Prevent deadlock in UIA Move API #10937

Merged

lhecker mentioned this pull request Aug 30, 2021

console running WSL is unresponsive when running program with lots of output #198

Closed

This pull request was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix output stuttering using a ticket lock #10653

Fix output stuttering using a ticket lock #10653

lhecker commented Jul 14, 2021 •

edited by DHowett

Loading

skyline75489 commented Jul 14, 2021

zadjii-msft commented Jul 14, 2021

WSLUser commented Jul 14, 2021

lhecker commented Jul 14, 2021

lhecker commented Jul 14, 2021 •

edited

Loading

DHowett Jul 14, 2021

lhecker Jul 14, 2021 •

edited

Loading

DHowett left a comment

lhecker commented Jul 14, 2021 •

edited

Loading

DHowett left a comment

DHowett Jul 14, 2021

DHowett Jul 14, 2021

lhecker Jul 14, 2021

miniksa Jul 14, 2021

DHowett commented Jul 14, 2021

ghost commented Jul 14, 2021

miniksa Jul 14, 2021

miniksa Jul 14, 2021

ghost commented Aug 31, 2021

Fix output stuttering using a ticket lock #10653

Fix output stuttering using a ticket lock #10653

Conversation

lhecker commented Jul 14, 2021 • edited by DHowett Loading

Validation Steps Performed

skyline75489 commented Jul 14, 2021

zadjii-msft commented Jul 14, 2021

WSLUser commented Jul 14, 2021

lhecker commented Jul 14, 2021

lhecker commented Jul 14, 2021 • edited Loading

Before:

After:

DHowett Jul 14, 2021

Choose a reason for hiding this comment

lhecker Jul 14, 2021 • edited Loading

Choose a reason for hiding this comment

DHowett left a comment

Choose a reason for hiding this comment

lhecker commented Jul 14, 2021 • edited Loading

DHowett left a comment

Choose a reason for hiding this comment

DHowett Jul 14, 2021

Choose a reason for hiding this comment

DHowett Jul 14, 2021

Choose a reason for hiding this comment

lhecker Jul 14, 2021

Choose a reason for hiding this comment

miniksa Jul 14, 2021

Choose a reason for hiding this comment

DHowett commented Jul 14, 2021

ghost commented Jul 14, 2021

miniksa Jul 14, 2021

Choose a reason for hiding this comment

miniksa Jul 14, 2021

Choose a reason for hiding this comment

ghost commented Aug 31, 2021

lhecker commented Jul 14, 2021 •

edited by DHowett

Loading

lhecker commented Jul 14, 2021 •

edited

Loading

lhecker Jul 14, 2021 •

edited

Loading

lhecker commented Jul 14, 2021 •

edited

Loading