Inconsistent file version generation between big and little endian CPU architectures #637

Techn0core · 2021-08-03T15:50:09Z

It looks like the git commit hash is having it's digits 1&2 swapped with 3&4, and then converted to base 10.

A commit starting with a1b2 becomes x.y.z.45729, which converted back to base 16 is b2a1. Do I have that right?

Personally I think it would be very convenient to just convert it without the swap, so we can use a simple base 16 to 10 conversion to get the first 4 digits of the source commit hash, and advertise this option in the readme.md. Was there a reason for this swapping? Could it be added as a setting to version.json?

Thanks,

AArnott · 2021-08-07T20:57:52Z

That's an interesting idea. First of all, you know you can use the nbgv tool to convert from a version number to a commit ID, yes? nbgv get-commits 1.2.3.45729 will do that for you.

The other slight complication is that this commit ID in the 4th integer position is not exactly 16-bits -- because 0xffff is not an allowed value. So there is always the ambiguity when you see 65534 as the last integer that it may mean the original commit ID started with fffe or ffff, since we force an ffff value to be represented the same as fffe given the WinPE file version limitation. nbgv get-commits already takes this into account when matching the commit id.

When converting the commit ID to a decimal number, we take the first 16 bits, subtract one if its value is 0xffff, and then represent it as a ushort value directly. On little-endian systems, a 16-bit value is stored in memory backwards, so if the commit ID starts with 0xabcd, a ushort with that exact same memory will interpret the number as 0xcdab (52,651 in decimal). On a big-endian system, I suspect the same 16-bit lead to the commit ID would end up creating a decimal value of 43,981.

That difference between endian systems is a problem, since NB.GV strives for deterministic versioning. So for the sake of a consistent version generation no matter which CPU is running NB.GV, it does seem like a standard should be adopted, and for the use case you mention, choosing big-endian makes sense. That is to say, we should swap the two bytes around if on a little-endian system before interpreting as a ushort.

AArnott · 2021-08-07T21:58:02Z

One complication is that this will be a breaking change for version=commit matching, since the nbgv tool will now be interpreting the 4th integer in the version as a big endian representation of the first two bytes of the commit ID instead of little endian, which means no version will match that was produced with an older version of NB.GV.

I may have to adjust the tool to recognize both for sake of backward compatibility.

… endian This fixes a non-deterministic version computation that varied based on endianness of the processor. Most computers use little-endian, which produced a decimal version that when converted back to hex with a typical calculator would have the first two commit ID bytes in swapped positions. Now with this change, we "fix" the endianness. We choose big endian since it won't swap the first two bytes like little endian did. But this presents another problem: all the generated versions now have a different value for the 4th integer component, and `nbgv get-commits` will no longer match a commit when it built using the little-endian version of NB.GV. I'll fix this in a subsequent commit so that this CLI tool will match on commits allowing the order to be swapped. Fixes #637

AArnott added the bug label Aug 7, 2021

AArnott changed the title ~~Could we get the calculated portion of the file version to be a straight hexadecimal to decimal conversion?~~ Inconsistent file version generation between big and little endian CPU architectures Aug 7, 2021

AArnott self-assigned this Aug 7, 2021

AArnott mentioned this issue Aug 8, 2021

Always generate commit ID component of version as big endian #638

Merged

AArnott added this to the v3.5 milestone Aug 8, 2021

AArnott closed this as completed in #638 Aug 8, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Inconsistent file version generation between big and little endian CPU architectures #637

Inconsistent file version generation between big and little endian CPU architectures #637

Techn0core commented Aug 3, 2021 •

edited

Loading

AArnott commented Aug 7, 2021

AArnott commented Aug 7, 2021

Inconsistent file version generation between big and little endian CPU architectures #637

Inconsistent file version generation between big and little endian CPU architectures #637

Comments

Techn0core commented Aug 3, 2021 • edited Loading

AArnott commented Aug 7, 2021

AArnott commented Aug 7, 2021

Techn0core commented Aug 3, 2021 •

edited

Loading