Unicode escaping for method names is sometimes broken #2651

Closed
bradwilson opened this issue Jan 22, 2023 Discussed in #2583 · 1 comment · Fixed by #2657

Comments

@bradwilson
Member

Discussed in #2583

Originally posted by tsawyer999 September 15, 2022
Hello,

I am trying to display the name for a test:

Parameter "RETURN VALUE" is always added

I tried these two approaches, but neither produced the expected result.

   [Fact]
   public void Parameter_X22RETURN_VALUEX22_is_always_added()

Result:

Parameter "RETURN VALUEX22 is always added

   [Fact]
   public void Parameter_X22RETURN_VALUE_X22_is_always_added()

Result:

Parameter "RETURN VALUE_" is always added

Can someone give me a hand to achieve the desired result?

Thank you very much!

This is a bug in the encoding routine in DisplayNameFormatter:

    static void TryConsumeEscapeSequence(
        FormatContext context,
        char @char,
        int allowedLength)
    {
        var escapeSequence = new char[allowedLength];
        var consumed = 0;
        while (consumed < allowedLength && context.HasMoreText)
        {
            var nextChar = context.ReadNext();
            escapeSequence[consumed++] = nextChar;
            if (IsHex(nextChar))
                continue;
            context.Buffer.Append(@char);
            context.Buffer.Append(escapeSequence, 0, consumed);
            return;
        }
        context.Buffer.Append(char.ConvertFromUtf32(HexToInt32(escapeSequence)));
    }

Line 216 adds the literal characters that it has already consumed to the output string. Unfortunately, the logic fails because of these steps:

  • It sees the U in VALUEX22 and tries to interpret it as the start of a Unicode escape (U followed by 4 hex digits)
  • It then sees the E, which is a valid hex digit, and goes back for another character
  • It then sees the X, which is not a valid hex digit; it adds UEX to the output, then loops back to look for more escapes.

Unfortunately, because it consumed the X (and put it literally into the output), the algorithm now resumes at 22, which of course is not an escape sequence, so it puts those two characters into the output as literals.

The expected encoding is Parameter "RETURN VALUE" is always added

The actual encoding is Parameter "RETURN VALUEX22 is always added
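
One way to avoid consuming the X prematurely is to look ahead and confirm that all of the required hex digits are present before consuming any of them. The following is a minimal, self-contained sketch of that idea, not the actual fix. `LookAheadFormatterSketch`, `Format`, and `IsHexRun` are hypothetical names, the sketch works on a plain string rather than `FormatContext`, it handles only the X (2 hex digit) and U (4 hex digit) escapes plus underscore-to-space replacement, and accepting lower-case hex digits is an assumption.

```csharp
using System;
using System.Text;

static class LookAheadFormatterSketch
{
    // Hypothetical helper, not xunit's API. Decodes X## and U#### escapes and
    // turns underscores into spaces, checking the hex digits before consuming them.
    public static string Format(string name)
    {
        var buffer = new StringBuilder();
        var i = 0;

        while (i < name.Length)
        {
            var c = name[i];
            // X is followed by 2 hex digits, U by 4; anything else is not an escape.
            var length = c == 'X' ? 2 : c == 'U' ? 4 : 0;

            if (length > 0 && IsHexRun(name, i + 1, length))
            {
                var codePoint = Convert.ToInt32(name.Substring(i + 1, length), 16);

                // char.ConvertFromUtf32 rejects lone surrogates, so such a
                // sequence falls through and is kept as literal text.
                if (codePoint < 0xD800 || codePoint > 0xDFFF)
                {
                    buffer.Append(char.ConvertFromUtf32(codePoint));
                    i += 1 + length;
                    continue;
                }
            }

            // Not a complete escape: emit only this one character, so the next
            // character (for example an X that starts a real escape) is re-examined.
            buffer.Append(c == '_' ? ' ' : c);
            i++;
        }

        return buffer.ToString();
    }

    // True when text[start .. start + length) exists and is all hex digits.
    static bool IsHexRun(string text, int start, int length)
    {
        if (start + length > text.Length)
            return false;

        for (var i = start; i < start + length; i++)
            if (!Uri.IsHexDigit(text[i]))
                return false;

        return true;
    }
}
```

With this sketch, `LookAheadFormatterSketch.Format("Parameter_X22RETURN_VALUEX22_is_always_added")` returns `Parameter "RETURN VALUE" is always added`: the U in VALUE is emitted on its own because EX22 is not four hex digits, and the following X22 is then still available to be decoded.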

@koenigst
Contributor

I will create a PR for this. Any thoughts on how the look-ahead should be implemented?

koenigst added a commit to koenigst/xunit that referenced this issue Jan 29, 2023