Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REP of SP characters produces incorrect results #15390

Open
jdebp opened this issue May 20, 2023 · 2 comments
Open

REP of SP characters produces incorrect results #15390

jdebp opened this issue May 20, 2023 · 2 comments
Labels
Area-Output Related to output processing (inserting text into buffer, retrieving buffer text, etc.) Help Wanted We encourage anyone to jump in on these. Issue-Bug It either shouldn't be doing this or needs an investigation. Product-Terminal The new Windows Terminal.
Milestone

Comments

@jdebp
Copy link

jdebp commented May 20, 2023

Windows Terminal version

1.17.1023

Details

1.16.10261 had significant problems with REP of Unicode block graphics and other such characters. That's mostly fixed in 1.17.1023, with one exception: REP of Supplementary Plane characters still produces incorrect results, with the characters repeated in the wrong places on the line or sometimes not at all.

I've experienced this with MouseText characters like U+0001FB81 in particular, and have not (of course!) tested every SP code point.

I did some screenshots.

@jdebp jdebp added Issue-Bug It either shouldn't be doing this or needs an investigation. Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting labels May 20, 2023
@j4james
Copy link
Collaborator

j4james commented May 20, 2023

As currently implemented, REP just repeats the last wchar_t that was output (which on Windows is 16-bit), so for anything outside the BMP, you're just going to be repeating the second half of a surrogate pair.

If we don't want to support supplementary planes, maybe we should filter out anything in the surrogate range, rather than writing out garbage. That way apps could at least detect whether it was supported or not.

@carlos-zamora carlos-zamora added Area-Output Related to output processing (inserting text into buffer, retrieving buffer text, etc.) Product-Terminal The new Windows Terminal. and removed Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting labels Jun 7, 2023
@carlos-zamora carlos-zamora added this to the Terminal v1.20 milestone Jun 7, 2023
@microsoft-github-policy-service microsoft-github-policy-service bot added the In-PR This issue has a related PR label Jun 12, 2023
@lhecker lhecker removed the In-PR This issue has a related PR label Jun 12, 2023
@j4james
Copy link
Collaborator

j4james commented Jul 24, 2023

@lhecker Regarding your comment here: #15751 (comment)

I'm not entirely sure which part in our code base doesn't handle surrogate pairs, etc., but I suspect it can be discovered by setting a breakpoint on AdeptDispatch::_FillRect.

This issue isn't related to FillRect - it's a limitation of the StateMachine implementation, which works with 16-bit wchar_t elements. In the case of the REP operation, it's repeating the last such character. See here:

case CsiActionCodes::REP_RepeatCharacter:
// Handled w/o the dispatch. This function is unique in that way
// If this were in the ITerminalDispatch, then each
// implementation would effectively be the same, calling only
// functions that are already part of the interface.
// Print the last graphical character a number of times.
if (_lastPrintedChar != AsciiChars::NUL)
{
const size_t repeatCount = parameters.at(0);
std::wstring wstr(repeatCount, _lastPrintedChar);
_dispatch->PrintString(wstr);
}

If that happens to be the second half a surrogate pair, it's not going to work correctly.

But this is just one symptom of the problem. For another example, compare the difference between these two statements, which use an emoji as the final character of a CSI sequence:

printf '[\e[✅]\n'
printf '[\e[💓]\n'

The first outputs [], while the second shows an error glyph [�]. Admittedly this is a ridiculous edge case, but if you are working on a rewrite of the state machine code, it would nice if we could fix the underlying problem, and not just hack in a workaround for REP. Not essential though.

@zadjii-msft zadjii-msft modified the milestones: Terminal v1.20, Backlog Oct 4, 2023
@zadjii-msft zadjii-msft added the Help Wanted We encourage anyone to jump in on these. label Oct 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Output Related to output processing (inserting text into buffer, retrieving buffer text, etc.) Help Wanted We encourage anyone to jump in on these. Issue-Bug It either shouldn't be doing this or needs an investigation. Product-Terminal The new Windows Terminal.
Projects
None yet
Development

No branches or pull requests

5 participants