Extract correct value in HIDParse when on big-endian LP64 host #1016

nbriggs · 2021-04-15T19:52:51Z

Closes issue #1015

Fixes case of wrong values being extracted on 64-bit hosts because of a dependency on unsigned long being 32-bit in drivers/hidparser.c on big-endian hosts.

Maintainer UPDATE: Practical problem detailed in #778

jimklimov · 2021-04-19T18:28:16Z

Interesting catch, trying to wrap my head around it - did not dig in this part of code before. Answering to both the issue and PR here :)

As for the questioned intent, I assume it was to convert the number representation on the wire into one suitable for computation on current architecture - although I can't say OTOH whether the development was or was not tested on various architectures. At least seems the addition of ever greater numbers (after shifting the byte in the buffer) ensures that this byte lands into the memory layout proper for the CPU for the larger word.

I am not sure what to take about the comment about values that "will be treated as signed" given the use of unsigned temporary holder and byte-by-byte copy. Is that about the further consumer of this conversion, outside the PRed lines?

Looking how the original routine is tuned for up-to 4 byte numbers, could an uint32_t valTmp be a better portable choice? It seems like the original code went to great lengths to ensure that all bits of the buffer are zeroed except for the least-significant byte copied from the input. I am not sure the suggested code would behave identically across various compilers and their optimizations - I can imagine it placing the byte from array into earlier "dirty" long-sized memory area (and shifting that altogether), or using a temporary byte-sized buffer (per array type) and shifting it by 8 bits into... same thing? zero?

Also, just looking at it, can 8*i be simplified to i<<3 or would that be counter-productive in regard to different layouts of words in a sequence of bytes on different systems? ;)

jimklimov · 2021-04-19T18:31:12Z

CCing @aquette as one of few people who significantly contributed to this part of the codebase and hopefully better grasping what was supposed to go on here.

To me currently it seems like ensuring that valTmp is always 32-bit should be sufficient.

nbriggs · 2021-04-19T19:02:21Z

Hi @jimklimov -- thanks for looking at this. Yeah, the intent is to convert from the little-endian wire format to the host format, and to deal with alignment issues by doing a byte-by-byte transfer to pParser->Value for further processing.

The explanation about signed vs unsigned: pParser->Value is declared long, so signed. Consider only the case of a little-endian system: On an ILP32 system: copying 1 or 2 bytes cannot set the sign bit of Value, however copying 4 bytes can potentially set the sign bit if it was set in the input. On an LP64 system: none of the copies can affect the sign bit of Value since it's 8 bytes.

Then consider the big-endian ILP32 systems: the code use memcpy of each individual byte to move it to the most-significant byte of valTmp and then shifts it right by an appropriate amount. Since valTmp is declared unsigned, this can never produce a negative value regardless of whether it's 1, 2, or 4 bytes moved. When that's assigned to (signed) pParser->Value what was the expected result in the 4-byte case, compared to the little-endian case?

In the big-endian LP64 case: the "appropriate amount" to shift is just wrongly calculated, hence the bug I ran into.

jimklimov · 2021-04-19T19:11:20Z

Now wondering if pParser->Value should be re-declared int32_t to match the 4 bytes max that it gets from the wire.

On LP64, would assigning a byte-range value to a 32-bit word, and shifting by a certain amount of bits, do the right thing or not?

And whether shifting is the unambiguously right thing for the conversion purpose actually, vs. just multiplying by a factor (1, 256, 256*256, 256*256*256) assuming the wire protocol guarantees byte order in the ReportDesc array vs. significance in the bigger word...

nbriggs · 2021-04-19T19:16:39Z

I was wondering if pParser->Value should be declared uint32_t... but you also have to know what the expected behavior is in the 4-byte case, which was NOT obvious to me on reading the USB-IF HID procotol spec. There's a fair amount of code in the driver that is trying to cope with differing uses of these bytes, casting them in various ways, and adjusting for the range offsets that are sometimes provided.

nbriggs · 2021-04-19T19:35:24Z

The HID Device Class spec is available as PDF here. The HID Power Device guidance is here -- and it only uses 8, 16, and 24-bit values (the BatteryCapacity). Since the communications spec seems to only support 1, 2, and 4-byte values I presume the BatteryCapacity is transmitted as a 4-byte value, and it will never be negative.

I suspect that in the context of NUT, the HID parser code won't ever deal with a 32-bit number with the sign bit set.

jimklimov · 2021-04-19T19:37:52Z

I think the convention of using the same (logically highest) bit for signedness is the same on x86(_64) and SPARC(64), despite the different order of bytes in memory, so casting int/uint is "safe" (e.g. taking the bytes from wire representation into the temporary variable all processed as unsigned, which simplifies the math compared to mixing four signed bytes somehow, and casting the result to signed if needed). Unfortunately I do not currently have access to a native machine to check that in practice.

I wonder if something like https://github.com/illumos/illumos-gate/blob/9ecd05bdc59e4a1091c51ce68cce2028d5ba6fd1/usr/src/uts/common/sys/byteorder.h#L110 can be applied.

nbriggs · 2021-04-19T19:55:47Z

The only difference is the order that the values from a register are stored in successive bytes in memory. I have access to a (real Solaris 11.4) SPARC system, with an [APC - no; it's really a ups.model = "IBM 1000VA/750W Tower UPS UPS LI T 1000"] UPS on USB, which is why I found this problem.

If we agreed that the values constructed out of the bytes in the report should always be positive, then

                        for (int i = 0; i < ItemSize[pParser->Item & SIZE_MASK]; i++) {
                            pParser->Value += pParser->ReportDesc[(pParser->Pos)+i] << (8*i);
                        }

will do the right thing independent of the endian-ness and word size... except that there's the possibility of overflow (or wrapping) on an ILP32 system if the 4th byte of the ReportDesc is greater than 0x7F, unless we change pParser->Value to an unsigned value.

nbriggs · 2021-04-19T21:19:11Z

In the hid1_11.pdf (device class spec linked above), we find:

5.8 Format of Multibyte Numeric Values
Multibyte numeric values in reports are represented in little-endian format, with the least significant byte at the lowest address. The Logical Minimum and Logical Maximum values identify the range of values that will be found in a report. If Logical Minimum and Logical Maximum are both positive values then a sign bit is unnecessary in the report field and the contents of a field can be assumed to be an unsigned value. Otherwise, all integer values are signed values represented in 2’s complement format. Floating point values are not allowed.
The least significant bit in a value is stored in bit 0, the next more significant in bit 1 and so on up to the size of the value.

Which explains some of the gyrations -- such as the function FormatValue, which is used to reinterpret the bits in pParser->Value as signed 1, 2, or 4-byte values. The code in FormatValue will produce the same result independent of whether the input value is signed or unsigned. I'm still looking to see whether there are other uses of pParser->Value that are sensitive to its signedness.

jimklimov · 2021-04-21T09:48:16Z

...and down went Alice into the rabbit hole ;)

For a quick attempt, did you try to change those ints/longs to particular-width types - did that or did not "just" fix the practical problem (and if possible to test, did it break or not behavior on x86/x86_64 builds)?

nbriggs · 2021-04-21T18:49:02Z

After reading the code (and looking at the git history for hidparser.c and hidtypes.h), I believe that the pParser->Value field should be uint32_t, and that the big- and little-endian paths in hidparser.c should be coalesced, extracting the value byte-by-byte, shifting (left) and adding, which also won't have any dependency on the size of the data type. I'll also update any functions which work with Value as a long to match the new uint32_t (e.g., FormatValue)

I'll run the experiment on the SPARC system in both 32- and 64- bit compilation mode (with the IBM UPS), and also make sure that it compiles and runs on an x86 (not x64) with an "APC RS 1500G" UPS.

If that produces the expected result, I'll update this branch and let you know... OK?

jimklimov · 2021-04-21T20:10:31Z

Sounds great, thank you for the effort and investigation!

…

On Wed, Apr 21, 2021, 20:49 Nick Briggs ***@***.***> wrote: After reading the code (and looking at the git history for hidparser.c and hidtypes.h), I believe that the pParser->Value field should be uint32_t, and that the big- and little-endian paths in hidparser.c should be coalesced, extracting the value byte-by-byte, shifting (left) and adding, which also won't have any dependency on the size of the data type. I'll also update any functions which work with Value as a long to match the new uint32_t (e.g., FormatValue) I'll run the experiment on the SPARC system in both 32- and 64- bit compilation mode (with the IBM UPS), and also make sure that it compiles and runs on an x86 (not x64) with an "APC RS 1500G" UPS. If that produces the expected result, I'll update this branch and let you know... OK? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#1016 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAMPTFDKUFJ372U5OQXPXFDTJ4M3DANCNFSM43AEW6UA> .

nbriggs · 2021-04-21T22:13:26Z

It gets a little stranger -- if I compile the original code, none of my changes, in 32-bit mode on the SPARC system it produces a number of values which are Size: 32, Value: -1, such as

[D1] Path: UPS.OutletSystem.Outlet.[2].DelayBeforeShutdown, Type: Feature, ReportID: 0x8d, Offset: 0, Size: 32, Value: -1

and my modified driver, when compiled in 32-bit mode, produces the exact same result.

The original driver, compiled in 64-bit mode on the SPARC fails miserably -- the problem that sent me here in the first place -- but my modified driver, compiled in 64-bit mode on the SPARC produces:

[D1] Path: UPS.OutletSystem.Outlet.[2].DelayBeforeShutdown, Type: Feature, ReportID: 0x8d, Offset: 0, Size: 32, Value: 0

I've verified that it's NOT changing the type of Value from long to uint32_t, or changing the parameter type in FormatValue, so there's still some detective work to be done.

I will update the branch to have the changes as I proposed them in the previous note, and then I'll work forward from there.

nbriggs · 2021-04-21T23:14:08Z

Gah... Github! I didn't intend to close the PR, I just pushed an update to the branch. I guess I'll generate another PR when I'm completely ready.

jimklimov requested a review from aquette April 19, 2021 18:29

nbriggs closed this Apr 21, 2021

nbriggs force-pushed the big-endian-HIDParse branch from 8b61446 to 8877335 Compare April 21, 2021 23:11

nbriggs mentioned this pull request Apr 23, 2021

Extract values in HIDParse correctly when compiled in 64-bit (LP64) mode #1024

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Extract correct value in HIDParse when on big-endian LP64 host #1016

Extract correct value in HIDParse when on big-endian LP64 host #1016

nbriggs commented Apr 15, 2021 •

edited by jimklimov

Loading

jimklimov commented Apr 19, 2021

jimklimov commented Apr 19, 2021

nbriggs commented Apr 19, 2021

jimklimov commented Apr 19, 2021 •

edited

Loading

nbriggs commented Apr 19, 2021

nbriggs commented Apr 19, 2021

jimklimov commented Apr 19, 2021

nbriggs commented Apr 19, 2021 •

edited

Loading

nbriggs commented Apr 19, 2021

jimklimov commented Apr 21, 2021

nbriggs commented Apr 21, 2021

jimklimov commented Apr 21, 2021 via email

nbriggs commented Apr 21, 2021

nbriggs commented Apr 21, 2021

Extract correct value in HIDParse when on big-endian LP64 host #1016

Extract correct value in HIDParse when on big-endian LP64 host #1016

Conversation

nbriggs commented Apr 15, 2021 • edited by jimklimov Loading

jimklimov commented Apr 19, 2021

jimklimov commented Apr 19, 2021

nbriggs commented Apr 19, 2021

jimklimov commented Apr 19, 2021 • edited Loading

nbriggs commented Apr 19, 2021

nbriggs commented Apr 19, 2021

jimklimov commented Apr 19, 2021

nbriggs commented Apr 19, 2021 • edited Loading

nbriggs commented Apr 19, 2021

jimklimov commented Apr 21, 2021

nbriggs commented Apr 21, 2021

jimklimov commented Apr 21, 2021 via email

nbriggs commented Apr 21, 2021

nbriggs commented Apr 21, 2021

nbriggs commented Apr 15, 2021 •

edited by jimklimov

Loading

jimklimov commented Apr 19, 2021 •

edited

Loading

nbriggs commented Apr 19, 2021 •

edited

Loading