Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

We've been trying to reach you about your WriteCharsLegacy's extended Emoji support #15567

Merged

Conversation

lhecker
Copy link
Member

@lhecker lhecker commented Jun 16, 2023

This is a complete rewrite of the old WriteCharsLegacy function
which is used when VT mode is disabled as well as for all interactive
console input handling on Windows. The previous code was almost
horrifying in some aspects as it first wrote the incoming text into a
local buffer, stripping/replacing any control characters. That's not
particular fast and never was. It's unknown why it was like that.

It also measured the width of each glyph to correctly determine the
cursor position and line wrapping. Presumably this used to work quite
well in the original console code, because it would then just copy
that local buffer into the destination text buffer, but with the
introduction of the broken and extremely slow OutputCellIterator
abstraction this would end up measuring all text twice and cause
disagreements between WriteCharsLegacy's idea of the cursor position
and OutputCellIterator's cursor position. Emoji input was basically
entirely broken. This PR fixes it by passing any incoming text
straight to the TextBuffer as well as by using its cursor positioning
facilities to correctly implement wrapping and backspace handling.

Backspacing over Emojis and an array of other aspects still don't work
correctly thanks to cmdline.cpp, but it works quite a lot better now.

Related to #8000
Closes #8839
Closes #10808

Validation Steps Performed

  • Printing various Unicode text ✅
  • On an fgets() input line
    • Typing text works ✅
    • Inserting text works anywhere ✅
    • Ctrl+X is translated to ^X ✅
    • Null is translated to ^@ ✅
      This was tested by hardcoding the OutputMode to 3 instead of 7.
    • Backspace only advances to start of the input ✅
    • Backspace deletes the entire preceding tab ✅
    • Backspace doesn't delete whitespace preceding a tab ✅
    • Backspacing a force-wrapped wide glyph unwraps the line break ✅
    • Backspacing ^X deletes both glyphs ✅
    • Backspacing a force-wrapped tab deletes trailing whitespace ✅
  • When executing
    fputs("foo: ", stdout);
    fgets(buffer, stdin);
    pressing tab and then backspace does not delete the whitespace
    that follows after the "foo:" string (= sOriginalXPosition).

@lhecker lhecker added Product-Conhost For issues in the Console codebase Area-Output Related to output processing (inserting text into buffer, retrieving buffer text, etc.) Area-Server Down in the muck of API call servicing, interprocess communication, eventing, etc. Product-Terminal The new Windows Terminal. labels Jun 16, 2023
@DHowett
Copy link
Member

DHowett commented Jun 16, 2023

Does this change any of the bugs in #10808?

@zadjii-msft
Copy link
Member

please never rename this PR

src/host/cmdline.h Outdated Show resolved Hide resolved
DHowett added a commit that referenced this pull request Jun 22, 2023
Base automatically changed from dev/lhecker/8000-bool-cleanup to main June 22, 2023 23:24
Copy link
Member

@carlos-zamora carlos-zamora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. I just want to let this sit a little longer while you address some of the comments before approving. I also want to play with a build of it for a bit while you do that. Other than that, this is basically an approve haha.

src/host/_stream.cpp Outdated Show resolved Hide resolved
src/host/_stream.cpp Outdated Show resolved Hide resolved
@@ -142,565 +161,270 @@ using Microsoft::Console::VirtualTerminal::StateMachine;
const til::CoordType sOriginalXPosition,
const DWORD dwFlags,
_Inout_opt_ til::CoordType* const psScrollY)
try
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might just be because I'm looking at the diff, but there used to be a lot of comments walking through the code here before. Since this is a pretty high-traffic area, I'd like to see more comments, if possible. At least breaking it down into steps so that it's easier to maintain in the future.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally didn't find the old comments helpful when making this change, which is why I wrote them from scratch in the new code. But I only documented the parts that I felt like where particularly confusing, which is primarily the backspace handling for tab characters. I avoided commenting something like "Ctrl-X gets written out as ^X in interactive mode" because that's just describing what it does and not why. There's no reason for why it does many of the things the way it does. ☹️ Are there any parts where you felt like a comment would be needed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! Here's a few that would be nice:

  • L180: what is "Delayed EOL Wrap" and what we're expected to do when we encounter it (move back to left boundary)
  • L193: what is "ENABLE_PROCESSED_OUTPUT" mode? Why do we do _writeCharsLegacyUnprocessed instead of the regular one?

That's kinda it, honestly. The switch-case statement is pretty nice for the rest of this function!

src/host/_stream.cpp Outdated Show resolved Hide resolved
Comment on lines -844 to +546
WC_LIMIT_BACKSPACE,
0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's kind of funny... Backspace in regular VT mode (and usual console mode) just means "move the cursor one column backwards, but that's it". There's no line wrapping or anything else. It's just backwards cursor movement.

WC_DESTRUCTIVE_BACKSPACE used to control whether this backwards movement overwrote the characters with whitespace. I renamed that flag to WC_INTERACTIVE because "destructive backspace" describes the implementation and not the wanted behavior. This is especially so since WC_DESTRUCTIVE_BACKSPACE always implied that WC_LIMIT_BACKSPACE was not set. It's an interactive prompt after all, so backspacing needs to be able to wrap lines back up.

WC_LIMIT_BACKSPACE was used everywhere where WC_DESTRUCTIVE_BACKSPACE wasn't used.

This turned these two flags basically redundant and as implying the opposites. The new WC_INTERACTIVE combines them into one and now better describes the actual intention of the caller. And so this spot now gets a 0 because it's non-interactive, whereas all the cmdline.cpp parts get WC_INTERACTIVE.

@microsoft-github-policy-service microsoft-github-policy-service bot added Issue-Bug It either shouldn't be doing this or needs an investigation. Priority-2 A description (P2) Impact-Correctness It be wrong. Impact-Visual It look bad. labels Jun 25, 2023
@github-actions

This comment has been minimized.

@lhecker
Copy link
Member Author

lhecker commented Jun 25, 2023

@j4james While writing this PR I remembered a flaw in the new TextBuffer::WriteLine API. ROW::SetWrapForced is called whenever the resulting RowWriteState::columnEnd exceeds the given columnLimit:

if (state.columnEnd >= state.columnLimit)
{
r.SetWrapForced(wrapAtEOL);
}

But what if a margin is set? Wouldn't this update the flag even if you aren't even writing past the line width, because the columnLimit parameter is then the right margin and not the line width?

auto lineWidth = textBuffer.GetLineWidth(cursorPosition.y);
if (cursorPosition.x <= rightMargin && cursorPosition.y >= topMargin && cursorPosition.y <= bottomMargin)
{
lineWidth = std::min(lineWidth, rightMargin + 1);
}

Was this intentional when you added margin support? I don't think it should do that since that flag is only for reflowing text on resize. I'd be happy to fix it if it wasn't (it would also simplify this PR, hence my question).

@j4james
Copy link
Collaborator

j4james commented Jun 25, 2023

Was this intentional when you added margin support?

No. I was aware that it wasn't correct, but it's always been broken, regardless of margins, so it wasn't something I wanted to address in that PR.

When it comes to VT output, the wrap flag can't be set in the WriteLine call, because you can write up to the edge of the screen without actually wrapping. It's only when the delayed EOL wrap triggers a line feed that we should be setting that flag. I'm not sure about the legacy console APIs, since they don't have the concept of delayed wrap, but I suspect they may also be better off setting the flag outside the WriteLine call.

There are also a number of edge cases we need to worry about which don't necessarily have obvious answers. Like the handling of double-width/double-height lines, and what happens when you wrap outside the horizontal margins. If we want to get wrapping working correctly in all scenarios, I think we may need to spec it first. Although if there are some easy fixes you want to include in this PR, I'd say go for it.

@lhecker
Copy link
Member Author

lhecker commented Jun 26, 2023

Interesting... When I remove the SetWrapForced() call from the adaptDispatch path, an array of unit tests break, among others ConptyOutputTests::InvalidateUntilOneBeforeEnd. It's because it changes the lineWrapped parameter to PaintBufferLine which is now false... As it should be? How did the "exact wrap" bug fix work if we still had this (my) broken code in there? I'm not sure I get it 100%. 🤔

Edit: I also don't get the broken VIM reflow unit tests. Why do they assert on wrap being set, when vim doesn't wrap? It just writes up to the line end and then emits a CUP. There's no wrap. I don't get it...

@j4james
Copy link
Collaborator

j4james commented Jun 27, 2023

How did the "exact wrap" bug fix work if we still had this (my) broken code in there?

I wasn't aware there was an exact wrap bug fix. The issue is still open an easily reproducible (see #3088). As for why the unit tests are failing, I have no idea. But I'd recommend trying to get things working correctly in conhost first before looking at the conpty side of things, because that introduces a whole new set of problems when it comes to line wrapping.

Comment on lines +144 to +145
// TODO: A row should not be marked as wrapped just because we wrote the last column.
// It should be marked whenever we write _past_ it (above, _DoLineFeed call). See GH#15602.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I put it as a code documentation so if the bus(-factor) hits all of us, it's still there and documented. #15602 is the "todo workitem" basically. 😅

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// TODO: A row should not be marked as wrapped just because we wrote the last column.
// It should be marked whenever we write _past_ it (above, _DoLineFeed call). See GH#15602.
// TODO GH#15602: A row should not be marked as wrapped just because we wrote the last column.
// It should be marked whenever we write _past_ it (above, _DoLineFeed call).

Ah, sorry. I saw the bare TODO and just flagged it and moved on hahaha. Here's a code suggestion to make it match the general styling :P

@@ -142,565 +161,270 @@ using Microsoft::Console::VirtualTerminal::StateMachine;
const til::CoordType sOriginalXPosition,
const DWORD dwFlags,
_Inout_opt_ til::CoordType* const psScrollY)
try
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! Here's a few that would be nice:

  • L180: what is "Delayed EOL Wrap" and what we're expected to do when we encounter it (move back to left boundary)
  • L193: what is "ENABLE_PROCESSED_OUTPUT" mode? Why do we do _writeCharsLegacyUnprocessed instead of the regular one?

That's kinda it, honestly. The switch-case statement is pretty nice for the rest of this function!

Copy link
Member

@carlos-zamora carlos-zamora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I trust you to add the comments (1) if you feel it's necessary and or (2) in a well manner, so I'm gonna go ahead and sign off so it's off my radar. Fantastic work, my dude :)

Copy link
Member

@DHowett DHowett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is likely to break a lot of stuff that we can't ever know about... but I think that trying to tackle it is the right thing to do. Honestly. We can't let it stay rotted.

}
while (it != end)
{
const auto nextControlChar = std::find_if(it, end, [](const auto& wch) { return !IS_GLYPH_CHAR(wch); });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I bet you're going to vectorize this ;P

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now this function is a little faster than the VT parser, because it's less complex, but if that changes... 😄

Copy link
Member

@DHowett DHowett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I only have questions!

// subtract the previously determined amount of ' ' from the '\t'.
glyphCount -= gsl::narrow_cast<til::CoordType>(backupEnd - backupBeg);

// Can the above code leave glyphCount <= 0? Let's just not find out!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOL.

}
else
// Control chars in interactive mode where previously written out
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Control chars in interactive mode where previously written out
// Control chars in interactive mode were previously written out

.columnBegin = pos.x,
.columnLimit = previousColumn,
};
textBuffer.Write(pos.y, textBuffer.GetCurrentAttributes(), state);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting, but not necessarily concerning. We're writing the spaces with the current attribute rather than the one that we're overstriking.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, that's also the same as the current implementation. 🫤
But I think this is fine because the branch is pretty much exclusive for COOKED_READ and it only supports one attribute with which all text is written. (I think?)

// Now that the wide glyph is presumably gone, we can move up a line.
if (pos.x == 0 && pos.y != 0 && textBuffer.GetRowByOffset(pos.y - 1).WasDoubleBytePadded())
{
moveUp();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought moveUp handled this, but I am guessing that we don't want to always move up just only sometimes

Copy link
Member Author

@lhecker lhecker Jun 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's say the previous loop moved from a pos.x of 2 to a pos.x of 0, like when you delete the first (wide) glyph of a line. This would satisfy the first condition here. Normally we wouldn't unwrap the line now, but if the line is only wrapped because it was wide-glyph padded, then the previous padding whitespace doesn't actually exist. 😅

AdjustCursorPosition(screenInfo, CursorPosition, dwFlags & WC_KEEP_CURSOR_VISIBLE, psScrollY);
}
const auto pos = cursor.GetPosition();
const auto tabCount = gsl::narrow_cast<size_t>(NUMBER_OF_SPACES_IN_TAB(pos.x));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is NUMBER_OF_SPACES_IN_TAB used anywhere else? If not, should we just move it into this file to improve mental locality?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it's used in a number of other places as well. I'd keep it for a bit, because it's a fun historical artifact. 😅

// since you just tabbed yourself past the end of the row, set the wrap
textBuffer.GetRowByOffset(cursor.GetPosition().y).SetWrapForced(true);
textBuffer.GetRowByOffset(pos.y).SetWrapForced(false);
pos.y = pos.y + 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we did an adjustment earlier for delayed cursor position. Does this accidentally make us commit TWO newlines when we're in delayed and emit a LF?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I have a test build of terminal off a pre-merge branch, I might be able to try to repro this...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't repro the thing I was worried about, but now I don't understand why ;D

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code that I thought would break:

#include <windows.h>
int main() {
	CONSOLE_SCREEN_BUFFER_INFO csbi;
	HANDLE h = GetStdHandle(STD_OUTPUT_HANDLE);
	GetConsoleScreenBufferInfo(h, &csbi);
	DWORD m;
	GetConsoleMode(h, &m);
	SetConsoleMode(h, ENABLE_PROCESSED_OUTPUT|DISABLE_NEWLINE_AUTO_RETURN);
	WriteConsoleW(h, L"\r\n", 2, NULL, NULL);
	for(int i = 0; i < csbi.dwSize.X-1; ++i) {
		WriteConsoleW(h, L".", 1, NULL, NULL);
	}
	WriteConsoleW(h, L"X", 1, NULL, NULL);
	WriteConsoleW(h, L"\n", 1, NULL, NULL);
	WriteConsoleW(h, L"\rGAP?\r\n", 7, NULL, NULL);
	SetConsoleMode(h, m);
	return 0;
}

what I was worried would happen:

.....X
<- this blank line should not be here
GAP

It didn't happen. But why?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're missing the ENABLE_WRAP_AT_EOL_OUTPUT flag. Although this case doesn't work correctly over conpty anyway, because there isn't a VT equivalent to the Windows console style of wrapping (at least as far as I know). I think we'd have to manually force the cursor position to move to the next line to make it work.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realised now that it's weirder than I initially thought. Over conpty the blank line ends up below the "GAP" text, i.e. something like this:

.....X
GAP
<- extra blank line here

Although I suppose that does make sense. The "GAP" is rendered in the wrong place, but conhost still knows where the cursor is meant to be, so the next time it actually needs to move the cursor (e.g. to draw the prompt), it's going to set it to the correct position.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. It's a combo flag. Ugh, I should know this stuff. Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, it does introduce a gap! But it also does so on conhost 22621. It likely doesn't matter. It's not a new bug, it's an existing caveat.

Copy link
Member Author

@lhecker lhecker Jun 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, as far as I can tell this works identical to the current WCL:

if (cursor.IsDelayedEOLWrap() && fWrapAtEOL)
{
const auto coordDelayedAt = cursor.GetDelayedAtPosition();
cursor.ResetDelayEOLWrap();
// Only act on a delayed EOL if we didn't move the cursor to a different position from where the EOL was marked.
if (coordDelayedAt.x == CursorPosition.x && coordDelayedAt.y == CursorPosition.y)
{
CursorPosition.x = 0;
CursorPosition.y++;
AdjustCursorPosition(screenInfo, CursorPosition, WI_IsFlagSet(dwFlags, WC_KEEP_CURSOR_VISIBLE), psScrollY);
CursorPosition = cursor.GetPosition();
}
}
// As an optimization, collect characters in buffer and print out all at once.
XPosition = cursor.GetPosition().x;
til::CoordType i = 0;
auto LocalBufPtr = LocalBuffer;
while (*pcb < BufferSize && i < LOCAL_BUFFER_SIZE && XPosition < coordScreenBufferSize.width)
{

It first does the same delay wrap branch and then enters the primary loop which may assemble LFs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear, when using the legacy console APIs you're meant to get a gap. The delayed wrap thing is a VT-specific feature that doesn't apply here. We should possibly even be dropping these delayed wrap checks from WriteCharsLegacy because that flag should never be set in normal legacy usage - it's only likely to occur if you're mixing and matching VT output with legacy output, in which case there's no expected "correct" behavior.

The bug is that we're not getting the gap in the right place in Windows Terminal, at least as far as I can see.

@lhecker lhecker added the AutoMerge Marked for automatic merge by the bot when requirements are met label Jun 30, 2023
@microsoft-github-policy-service microsoft-github-policy-service bot merged commit 94e6b91 into main Jun 30, 2023
15 checks passed
@microsoft-github-policy-service microsoft-github-policy-service bot deleted the dev/lhecker/8000-WriteCharsLegacy branch June 30, 2023 19:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Output Related to output processing (inserting text into buffer, retrieving buffer text, etc.) Area-Server Down in the muck of API call servicing, interprocess communication, eventing, etc. AutoMerge Marked for automatic merge by the bot when requirements are met Impact-Correctness It be wrong. Impact-Visual It look bad. Issue-Bug It either shouldn't be doing this or needs an investigation. Priority-2 A description (P2) Product-Conhost For issues in the Console codebase Product-Terminal The new Windows Terminal.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WriteCharsLegacy: Accumulated bugs strange behavior with tab[char] +line wrap + backspace in conhost/conpty
5 participants