
Invalid bit masking in KeyCode - KeyCode.CharMask is missing 65536 values #3287

Open · dodexahedron opened this issue Mar 3, 2024 · 10 comments
Labels: bug, testing (Issues related to testing), v2 (For discussions, issues, etc. relevant for v2), work-in-progress
Milestone: v2

dodexahedron (Collaborator) commented Mar 3, 2024

So, while working on the hotkey stuff, Key and KeyCode became very relevant.

And simplifying the logic to follow the Unicode rules produced a small number of test failures, yet the binary math said the tests were wrong, not the code.

So I took a look at the values.

KeyCode.CharMask is defined as 0x000F_FFFF, which is 20 bits.

KeyCode.MaxCodePoint is 0x0010_FFFF, which is correct per the Unicode spec, but that's 21 bits.

KeyCode.CharMask, however, is not the correct mask for that. It drops the high bit (bit 21), which cuts off the 65536 valid code points from 0x10_0000 to 0x10_FFFF.

Here's the binary explanation:

// Bit positions are labeled in columns: of the two label rows under each value, the first holds the tens digit and the second holds the ones digit of the position number.
// For example:
//              0
//              1
// means 1
// and
//              1
//              5
// means 15.
//
// Now for the breakdown...
//
// KeyCode.MaxCodePoint:
// 0x0010_FFFF = 0b_0000_0000_0001_0000_1111_1111_1111_1111
// 10s                           2 2111 1111 1110 0000 0000
//  1s                           1 0987 6543 2109 8765 4321
// To mask that, you must have a 1 in every column. Columns with 0 are dropped.
// This is a 21-bit value.
// But our value for KeyCode.CharMask is:
// 0x000F_FFFF = 0b_0000_0000_0000_1111_1111_1111_1111_1111
// 10s                             2111 1111 1110 0000 0000
//  1s                             0987 6543 2109 8765 4321
// This is a 20-bit value, which means a max value of 1_048_575
//
// But the highest bit is dropped.
//
// In mask terms, dropping that top bit drops half of the 21-bit space.
// Values above 0x10_FFFF (0x11_0000 to 0x1F_FFFF) aren't defined, so bits 17-20 still need to be in the mask,
// but we just won't care about them when bit 21 is set, and they CANNOT be used for any other purpose.
//
// 0x10_FFFF = 1_114_111 in decimal. Thus, it can represent 1_114_112 code points (0 counts).
// 0x0F_FFFF = 1_048_575 in decimal. Thus, it can represent 1_048_576 code points.
// Thus, 0x1_0000 or 65536 values are missing, which CharMask should be able to cover.
//
// You can also simply look at it from the left (most significant bits), which is how we do it in
// networking (subnet masks are 32-bit binary masks against the address).
//
// The mask is missing the 12th bit from the left of the 32-bit value, i.e. bit 21, worth 2^20.
// The valid code points with that bit set run from 0x10_0000 to 0x10_FFFF, a 16-bit range.
// So 2^16 = 65536 values are lost.
//
// The correct mask for 21 bits is 0x1F_FFFF, because those 4 bits still matter when bit 21 is 0.
//
// While the MaxCodePoint value is correct, because the standard defines it to be so,
// the values from 0x10_0000 to 0x10_FFFF (inclusive) are missing.
// Those are the 65536 missing values.
//
// Then, we have KeyCode.SpecialMask, which is 0xFFF0_0000:
// 0xFFF0_0000 = 0b_1111_1111_1111_0000_0000_0000_0000_0000
// 10s              3332 2222 2222 2111 1111 1110 0000 0000
//  1s              2109 8765 4321 0987 6543 2109 8765 4321
//
// That's the correct complement of 0x0F_FFFF.
// But 0x0F_FFFF isn't correct, so this is also not correct.
//
// The correct value is:
// 0xFFE0_0000 = 0b_1111_1111_1110_0000_0000_0000_0000_0000
// 10s              3332 2222 2222 2111 1111 1110 0000 0000
//  1s              2109 8765 4321 0987 6543 2109 8765 4321
//
// In network parlance, that'd be a /11 (an IPv4 subnet mask of 255.224.0.0)

So it's an off-by-one error, but the "one" is a single bit: the count of missing values, 65536, is a 1 in the 17th bit position.

As a result, anything that uses CharMask or SpecialMask will lose (or steal, for SpecialMask) the highest-order bit of the character value, if that bit is set.
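A minimal repro of the data loss, using plain uint math with the values from above:

// Masking the highest valid code point with the current 20-bit mask silently
// truncates it to U+FFFF; the proposed 21-bit mask preserves it.
const uint brokenCharMask = 0x000F_FFFF;
const uint correctCharMask = 0x001F_FFFF;

uint cp = 0x0010_FFFF; // U+10FFFF, the highest valid code point

Console.WriteLine ($"{cp & brokenCharMask:X}");  // FFFF   (corrupted)
Console.WriteLine ($"{cp & correctCharMask:X}"); // 10FFFF (preserved)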

That's a data corruption bug. Treating the binary values correctly causes 2 KeyTests test cases to fail, along with almost 200 other tests in other fixtures, because either the test case values or the code depend on the broken masks. I have not yet begun that investigation, but I imagine it's a little of both and that a fairly small number of changes in a small number of places will fix them.

Fixing only the values of the two masks, to CharMask = 0x1F_FFFF and SpecialMask = 0xFFE0_0000 (i.e., moving the 21st bit to CharMask, where it belongs, and dropping it from SpecialMask), resolves the KeyTest failures and quite a few of the other failures, leaving a few dozen more to track down.
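In enum terms, that's the following (a sketch of just the affected members, assuming the enum's existing [Flags]/uint shape; everything else in KeyCode is unchanged):

[Flags]
public enum KeyCode : uint
{
    // 21 bits: enough for any Unicode code point up to MaxCodePoint.
    CharMask = 0x001F_FFFF,

    // The remaining high 11 bits carry modifiers/special keys.
    SpecialMask = 0xFFE0_0000,

    // Unchanged; the Unicode standard defines this value.
    MaxCodePoint = 0x0010_FFFF
}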

Now I just need to diagnose the root cause of the other test failures. Most seem to be around the F keys, which are commonly used in test cases for non-printable keyboard input.

The fixes for this are important to the work I'm already doing in TextFormatter, so they're just going in that branch for now.

dodexahedron self-assigned this Mar 3, 2024
dodexahedron added the bug, work-in-progress, testing, and v2 labels Mar 3, 2024
dodexahedron added this to the v2 milestone Mar 3, 2024
dodexahedron (Collaborator, Author) commented Mar 3, 2024

I have other problems with that enum that probably won't matter until a future version of Unicode, but I'm ignoring those for now. This is fine for v2, and so long as the masks are defined correctly, shifting things left by 1 shouldn't be terribly difficult as a break-fix if that happens before some other solution is implemented, or before Microsoft switches to UTF-8 and back-ports it to all Windows versions (hey, we all have dreams, right?).

Although...

That being said, .NET 8 uses UTF-8 by default for console encoding. So... has anyone tried treating things as UTF-8 natively yet?

You have to work with Span&lt;byte&gt; more than string, but not having to do all that damn conversion all over the place would cut out something like half the work the library has to do.
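For example (illustrative only, not current library code): with .NET 8's u8 literals you can emit UTF-8 bytes straight to stdout, with no string conversion in sight.

using System.Text;

Console.OutputEncoding = Encoding.UTF8;

// A u8 literal is a ReadOnlySpan<byte>; no char/string round-trip happens.
ReadOnlySpan<byte> topBorder = "┌─────┐"u8;

using Stream stdout = Console.OpenStandardOutput ();
stdout.Write (topBorder);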

dodexahedron (Collaborator, Author) commented Mar 3, 2024

I think this belongs in this issue more than in #3275, but it's related to both.

So, here's what I did on this end of things and what the result is:

As with any time I've touched multi-type files, types in CursesDriver and ConsoleDriver got split out to their own files.

That's of course a non-change to the program.

But I ended up looking at CursesDriver.ProcessInput (I don't remember the original reason, but it doesn't matter because the problem is fixed).

There were lots of points where cleanup was possible, but there was also a ton of copying going on.

It looks like the HandleEscSeqResponse method was written with performance in mind, given that everything was passed to it by ref (kudos to whoever did that in the first place).

Unfortunately, that's not how the compiler actually compiled the code, and there were still plenty of places where value types were copied just for temporary use and then discarded.

So, I created a private ref struct to encapsulate the data relevant to those code paths.

A ref struct is a stack-only type. Neither an instance of a ref struct nor any of its properties or fields can ever escape its ref-safe context, which basically means it can never outlive the original referent.

This is how Span<T> is defined, so if you are familiar with that at all, the restrictions are the same. Here's a doc for some details: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/builtin-types/ref-struct

Anyway...

This allows the method and the methods it calls to operate on the same references to stack memory, resulting in a serious speedup and reduction in allocations.

A ref struct can never exist on the heap, so it's a guarantee that there are no heap allocations if you're working with a ref struct.

They can't even be referenced BY things on the heap (which is why they cannot be elements of an array, for example).
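For example, using Span&lt;int&gt; as the ref struct, all of these are rejected at compile time:

class Holder
{
    // Won't compile: a ref struct can't be a field of a class,
    // because class instances live on the heap.
    //private Span<int> _span;

    private void Demo ()
    {
        // Won't compile: a ref struct can't be boxed...
        //object boxed = new Span<int> ();

        // ...and can't be an array element, since arrays live on the heap too.
        //Span<int> [] spans = new Span<int> [4];
    }
}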

So...

I turned the 5 values that ProcessInput handles and passes to that HandleEscSeqResponse method into this ref struct:

/// <summary>
/// A private byref-like struct to simplify calls and keep everything truly on the stack.
/// </summary>
/// <remarks>
/// This is a ref struct. It cannot escape the stack, and the compiler will enforce that if you try.
/// </remarks>
private ref struct EscapeSequenceData
{
    public ref int code;
    public ref KeyCode k;
    public ref int wch;
    public Key keyEventArgs;
    public ConsoleKeyInfo [] cki;
}

It has two reference types in it, which is why those two aren't also ref. That would be an unnecessary extra layer of indirection.

Then I refactored the HandleEscSeqResponse into a method that takes one ref to that type and returns the same ref at the end, doing all its work in-place on that struct.
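Roughly this shape (the body here is elided/hypothetical; the point is one ref in, the same ref out, with all work done in place on the caller's stack data):

private ref EscapeSequenceData HandleEscSeqResponse (ref EscapeSequenceData data)
{
    // ... parse the escape sequence, writing results through data.code,
    // data.k, and data.wch, which are refs into ProcessInput's locals ...
    return ref data;
}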

Just to show progression, I also kept the same variable names, initially, in the ProcessInput method, but made them refs to their corresponding fields in the struct, meaning now the old code was already using the new struct.

Once that was done and verified passing, I removed those now-redundant references, one by one, until all that was left was the struct and one strange place where a separate int (wch2) is defined that I didn't want to mess with in the moment, so it was left alone. I'm sure it can probably get nuked too, but whatev.

Then I did some more cleanup in the ProcessInput, mostly around using switches instead of large if/else constructs.

The end result is that, versus the code before these changes, the tests covering them show somewhere between a 10x and 20x speedup, plus a reduction in heap allocations and copies during those methods significant enough to be visible in the profiler, even though the values involved are all small and seemingly innocent. That basically means the compiler must have been generating wrapper classes for the old handler that took 5 refs.

With the ref struct and the arrays being collection expressions, the compiler can very aggressively optimize that, now.

Now...

Not to oversell that speedup, though... On the test machine, we're talking about those tests taking 0.004 seconds instead of 0.07 seconds, for example. That's within the realm of human perception, especially for a fast typist, but it isn't as if this made the whole program noticeably faster beyond a few dozen milliseconds of input lag.

But hey, every bit helps, right?

I have plenty more to do in there before I'm actually finished with it and go back to the TextFormatter class.

I've already reduced the number of warnings, suggestions, and tips from the analyzer for CursesDriver from an almost full scroll bar (when I started) to this, at this point:
[screenshot: remaining analyzer messages for CursesDriver]

After I finish with this, though, I think I'll put in a PR at a working commit so others can rebase and whatnot, if it isn't too difficult to separate (it might be).

tig (Collaborator) commented Mar 3, 2024

Re HandleEscSeqResponse, see:

That code is fragile, not well tested, and hard to understand. We need a robust ANSI esc sequence parsing engine...

tig (Collaborator) commented Mar 3, 2024

Oh, and of all the driver refactoring work I did in the past 6 months, CursesDriver has been the most neglected. Note that it still does not support TrueColor.

dodexahedron (Collaborator, Author) commented:

Ha, yeah, there's a lot of original code in there. It's some of the most obviously different.

And the PInvoke mechanism with pre-caching the delegates and whatnot did make sense back then, but is now just needless complexity. I'm finding it hard not to just go ahead and address whatever I run into in there. 😅

dodexahedron (Collaborator, Author) commented Mar 3, 2024

Honestly, some of the old generated code in there makes some sense, and the intent behind the KeyCode enum also makes some sense (more than the old stuff). Obviously, the class full of constants is a C-ism; the code generator that was used didn't understand that it makes more sense in C# as an enum. But oddly enough there's value both ways, so I'm torn on that one.

But the way the compiler handles the uint enum, it's actually treating literals of the enum as values to be cast at run-time, not as constants, and that's not something I expected to see.

I think the best solution lies somewhere in between, honestly.

The way we use that enum all over the place, we need to be able to treat it as a numeric value. But uint enums can't do that without a run-time cast (oddly, unless they are subjected to an arithmetic operation, in which case they ARE implicitly numeric).

I'm wondering if it's a caveat of uint (seems likely), so I'm gonna run to the docs to check the spec on that. I know uint and ulong aren't CLS compliant, at least, so I wouldn't be surprised at all if they're less capable than their signed brethren.

If that turns out to be the case, we might be able to just turn it into a long, for now, and isolate both 32-bit halves for each purpose.

But...

I think the best ultimate solution is in a well-defined struct that keeps things as more primitive structures and exposes an enum (with implicit cast operators) for seamless use both as a numeric value (also using cast operators) and as an enumerated value, for the convenience it gives for things like configuration with the built-in behaviors of enums.
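Hypothetically, something along these lines (KeyValue is a made-up name for the sketch):

// The struct owns the raw bits and converts seamlessly in both directions.
public readonly struct KeyValue
{
    private readonly uint _value;

    public KeyValue (uint value) { _value = value; }

    // Seamless use as a numeric value...
    public static implicit operator uint (KeyValue k) => k._value;

    // ...and as the enum, for configuration and other enum-friendly uses.
    public static implicit operator KeyCode (KeyValue k) => (KeyCode)k._value;
    public static implicit operator KeyValue (KeyCode k) => new ((uint)k);
}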

Now... Since it's all just 32-bit binary values in the current code, I'm using Unsafe.As&lt;int,KeyCode&gt;, Unsafe.As&lt;int,uint&gt;, etc. to do the equivalent of a C-style static cast of one pointer type to another, which is pretty much the cheapest way to convert a type in dotnet (or anywhere else, really), and is perfectly fine so long as you are sure the values are compatible. Since we know they're compatible, both being 32-bit values, it's not "unsafe" at all. :)
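For example:

using System.Runtime.CompilerServices;

int raw = 0x0010_FFFF;

// Reinterpret the same 32 bits as another type; no conversion code runs.
KeyCode asKey = Unsafe.As<int, KeyCode> (ref raw);
uint asUInt = Unsafe.As<int, uint> (ref raw);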

dodexahedron (Collaborator, Author) commented Mar 3, 2024

Hell, it may even be advantageous to store two 32-bit values inside a Vector64 or a wider type, to enable some wicked fast operations over sequences of values (which would end up getting SSE-ified by the compiler at that point). Not to mention we have direct access to all those wide instructions via those classes, which are great for a lot of that bitwise math.

However the data is stored, we can expose it in any way we like, for the public API.

It's probably a good idea for us to abstract away the implementation a little bit, anyway, to protect against breakage around version changes.

Actually...

Thinking about it a little more...

That's actually very likely a case where the strategy of doing multiple possible cases and just selecting the right one is significantly faster than anything else.

What I mean is: if there's a method, like the one I'm working on right now, that has a bunch of conditionals based on the same or slightly different values, each of which then does more math based on that decision, you can load the values and constants into a Vector128 or something and apply all of those operations at once, at the start. Then each conditional is just a compare and a jump, and the value is already computed, all in fewer clock cycles, even though the processor threw away 3/4 of the values it calculated.
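A toy sketch of that pattern (the masks here are made up for illustration):

using System.Runtime.Intrinsics;

static uint Classify (uint key)
{
    // Apply all four candidate masks at once, up front...
    Vector128<uint> candidates =
        Vector128.Create (key) & Vector128.Create (0x001F_FFFFu, 0xFFE0_0000u, 0x0000_00FFu, 0xFFFF_FF00u);

    // ...so the conditional below is just a compare and a lane pick;
    // the other computed lanes simply get thrown away.
    return (key & 0xFFE0_0000u) == 0 ? candidates.GetElement (0) : candidates.GetElement (1);
}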

dodexahedron (Collaborator, Author) commented Mar 3, 2024

Damn. The compiler behavior around those enums is exactly as the first line describing enum conversions states (emphasis mine):

For any enumeration type, there exist explicit conversions between the enumeration type and its underlying integral type. If you cast an enum value to its underlying type, the result is the associated integral value of an enum member.

That seriously sucks! :(
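Concretely (KeyCode.A is assumed here just for illustration):

KeyCode k = KeyCode.A;

uint u = (uint)k; // fine: explicit conversion
//uint v = k;     // won't compile: no implicit conversion to the underlying type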

That even makes things like the Curses class, with its switch to convert from one enum to another, more expensive than it seems. Extremely minor, of course, in the grand scheme of things, but I'm just wondering how I didn't know this (or, more likely, how I forgot it) in the 23+ years I've been using C#. 😆

I suppose I do tend to avoid large enums in favor of formal structs or classes, because of the lack of flexibility in enums such as not being able to write operators. Extension methods are a pity prize. I want operators, damn it!

dodexahedron (Collaborator, Author) commented Mar 3, 2024

On the topic of the Curses interop....

That old code was from the mono days.

I bet we can do a clean implementation in less work than it would take to untangle that thing.

The majority of that class is PInvokes and some code that is honestly partially redundant to other, newer, better code we have elsewhere.

Maybe I should shift my strategy in that direction, rather than trying too hard to fix this thing up?

dodexahedron (Collaborator, Author) commented:

Side note: this is one of several open issues (plus some unreported ones) that #3443 is partially aimed at helping to resolve.
