Tout est Terrible. Endian problems with original RFC 4122 + case problems #119
Comments
@safinaskar, I have reviewed your errata for the original RFC 4122 regarding big endian and little endian, and I am fairly certain the current Draft 04 has no ambiguity on that topic as it pertains to the items in this specific draft. Please let me know if you feel otherwise, but I went out of my way to provide concise verbiage around this topic.
This is a pretty good problem and the topic of "text encodings change then or provide alternatives" is being worked under Also CCing @ben221199, who has been reviewing the original RFC 4122 for a fresh coat of paint on the v1 through v5 topics from that RFC.
I see I'm mentioned. The standard in this repository will add UUIDv6 and only that (as far as I know). This standard will update RFC 4122, because it adds version 6 to RFC 4122. I'm also working on another standard. That standard will deprecate RFC 4122 (and also this standard, I think), because it will redefine what was already in RFC 4122. The idea is that the standard redefines all existing UUID variants (Apollo, NCS, Microsoft, Full-UUID, etc.) and versions (v1-v6) and registers the variants and versions directly in an IANA registry, so that there will be no doubt about them. If you (@safinaskar) think that RFC 4122 (and its errata) aren't fully correct, that standard is the place to fix it, as I think I want to register different serialization types too. Also, I want to post a timeline here, because I think this could be the order in which to release these standards (but the second and third could also be swapped):
Hope this explanation helps a bit.
Cool, thanks!
Well, of course not? The dashed-hex form of a UUID is one of infinitely many encodings of the actual UUID, which is a specific sequence of 16 bytes. The UUID is the bytes, not the string — all encoded forms of a UUID must be decoded before they can be compared.
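To make the point above concrete, here is a minimal sketch in Python, assuming the standard-library `uuid` module as the decoder. The helper name `same_uuid` is made up for illustration. It compares two textual forms by decoding both to their 16 bytes first, rather than comparing the strings.

```python
import uuid


def same_uuid(a: str, b: str) -> bool:
    """Compare two textual UUIDs by their decoded 16-byte values.

    Hypothetical helper: the strings themselves are never compared.
    """
    return uuid.UUID(a).bytes == uuid.UUID(b).bytes


reference = "6ba7b810-9dad-11d1-80b4-00c04fd430c8"

# Different spellings that uuid.UUID accepts for the same 16 bytes.
spellings = [
    "6BA7B810-9DAD-11D1-80B4-00C04FD430C8",           # uppercase
    "{6ba7b810-9dad-11d1-80b4-00c04fd430c8}",         # braces
    "urn:uuid:6ba7b810-9dad-11d1-80b4-00c04fd430c8",  # URN form
    "6ba7b8109dad11d180b400c04fd430c8",               # no dashes
]

for text in spellings:
    # The strings differ, but the decoded bytes do not.
    assert text != reference
    assert same_uuid(text, reference)
```

A plain string comparison would report every one of these spellings as different from the reference.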
Is a sequence of bytes represented as ASCII hexadecimal not invariant to endianness? Given a dashed-hex string representation of a UUID e.g.
?
@peterbourgon, in 2013 I tried to write a C program that converts a UUID from its binary representation to ASCII (and vice versa) based on RFC 4122, and I failed because of endian problems. So the text is ambiguous. Unfortunately I don't remember the exact details, i.e. I don't remember which parts of the standard caused problems. But you can still see my erratum, and one editor's response to it, in the RFC errata tracker ( https://www.rfc-editor.org/errata/eid3546 ): he acknowledges that problems exist.
I can imagine that GUIDs can do some strange things. Everything is normal, except when you are Microsoft: Bytes: This is HxD, which only uses Microsoft's GUID encoding, not the normal UUID encoding, so you don't get the expected Wikipedia mentions it too:
Note that this quote mentions variants in combination with encoding and COM/OLE. I think this text is partly written by an idiot.
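For anyone who wants to see the two byte layouts side by side, here is a small sketch using Python's standard `uuid` module, which exposes both as `bytes` (all fields big-endian) and `bytes_le` (first three fields little-endian, the layout Microsoft tools use for GUIDs). The sample value is arbitrary and chosen only so the swapped bytes are easy to spot.

```python
import uuid

# Arbitrary sample value, picked so each byte is recognisable.
u = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")

# Big-endian layout: the bytes appear in the same order as the text.
print(u.bytes.hex())     # 00112233445566778899aabbccddeeff

# Mixed-endian layout: the first three fields (4, 2 and 2 bytes) are
# byte-swapped, the remaining 8 bytes are left untouched.
print(u.bytes_le.hex())  # 33221100554477668899aabbccddeeff

# Reading a buffer with the wrong assumption yields a different UUID.
raw = u.bytes_le
print(uuid.UUID(bytes_le=raw))  # 00112233-4455-6677-8899-aabbccddeeff
print(uuid.UUID(bytes=raw))     # 33221100-5544-7766-8899-aabbccddeeff
```

So the same 16 bytes in a hex editor can correspond to two different dashed-hex strings, depending on which layout the tool assumes.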
Wow! You learn something new, and frequently terrible, every day.
Topic moved: it is covered as part of the errata work at https://github.com/ietf-wg-uuidrev/rfc4122bis
In 2013 I reported this erratum against the original UUID RFC (RFC 4122): https://www.rfc-editor.org/errata/eid3546 . In short, big endian and little endian are not used consistently in the original RFC, so (in my opinion) it is impossible to write a working translator from the binary UUID form to the textual form and vice versa based on the RFC text alone. That problem is still not fixed. So I propose replacing the original UUID RFC with a newer version instead of merely adding new UUID types. Obviously, all existing errata ( https://www.rfc-editor.org/errata_search.php?rfc=4122 ) should be considered when creating this new version.
Back in 2013 I tried to write my own UUID C library based on RFC text alone, and I failed precisely because of that endian problem.
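For reference, here is a sketch of the kind of translator in question, written in Python rather than C for brevity. It rests on one assumption that should be read as such: every field of the binary form is stored most significant byte first (network byte order), which is the reading most implementations take of RFC 4122. The function names `uuid_to_text` and `text_to_uuid` are made up for this example.

```python
import struct

# Assumed layout (all big-endian): time_low (4 bytes), time_mid (2),
# time_hi_and_version (2), clock_seq_hi_and_reserved (1),
# clock_seq_low (1), node (6).
_LAYOUT = struct.Struct(">IHHBB6s")


def uuid_to_text(raw: bytes) -> str:
    """Format 16 bytes as lowercase 8-4-4-4-12 dashed hex."""
    if len(raw) != 16:
        raise ValueError("a binary UUID is exactly 16 bytes")
    time_low, time_mid, time_hi, seq_hi, seq_low, node = _LAYOUT.unpack(raw)
    return (f"{time_low:08x}-{time_mid:04x}-{time_hi:04x}-"
            f"{seq_hi:02x}{seq_low:02x}-{node.hex()}")


def text_to_uuid(text: str) -> bytes:
    """Parse dashed hex (either case) back into 16 bytes."""
    raw = bytes.fromhex(text.replace("-", ""))
    if len(raw) != 16:
        raise ValueError("expected 32 hex digits")
    return raw


raw = bytes.fromhex("00112233445566778899aabbccddeeff")
print(uuid_to_text(raw))  # 00112233-4455-6677-8899-aabbccddeeff
```

Under this assumption the text form is simply the bytes in order, so the conversion round-trips. If any field were meant to be little-endian, the `>` in the layout string is the only thing that would have to change, which is exactly why the RFC wording matters.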
Also, the original RFC has a second problem: some implementations generate textual UUIDs in uppercase, some in lowercase. This complicates textual comparison of UUIDs and sometimes leads to bugs. My proposal: textual UUIDs should always be generated in lowercase (i.e. consuming uppercase UUIDs is OK, but generating them is not). This proposal justifies replacing the original RFC, too. If you are not convinced, then please read this text: "Tout est Terrible" ( https://ferd.ca/tout-est-terrible.html ). The author talks about various problems and how they complicate writing ordinary programs, including uppercase/lowercase UUIDs. (Okay, if you are not convinced, please at least mandate lowercase UUIDs for new types only.)
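The rule proposed above (accept either case on input, emit only lowercase) can be sketched in a few lines of Python. The function name `to_canonical` and the regular expression are illustrative choices for this example and are not taken from any RFC text.

```python
import re

# Accept hex digits in either case on input.
_DASHED_HEX = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def to_canonical(text: str) -> str:
    """Validate a dashed-hex UUID and return it in lowercase."""
    if _DASHED_HEX.fullmatch(text) is None:
        raise ValueError(f"not a dashed-hex UUID: {text!r}")
    return text.lower()


upper = "6BA7B810-9DAD-11D1-80B4-00C04FD430C8"
lower = "6ba7b810-9dad-11d1-80b4-00c04fd430c8"

# The raw strings differ, the canonical forms do not.
assert upper != lower
assert to_canonical(upper) == to_canonical(lower) == lower
```

If every generator emitted the lowercase form, plain string comparison of generated UUIDs would be safe, and the normalization step would only be needed for text arriving from outside.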