-
-
Notifications
You must be signed in to change notification settings - Fork 564
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UUID v1 is somewhat predictable #578
Comments
Yes, this is why we chose UUID v1. It's extremely expensive for postgres to write an id that is smaller than an existing id because it has to split the pages - this reduces write performance significantly. Because UUIDv1 is prefixed with the date stamp it is (generally) an increasing value and thus fairly safe to use as the PK unless you're expecting hundreds or thousands of inserts per second (and even in those cases the pages are probably still in memory so splitting them is less expensive than cold reads). Neither option we recommend give you cryptographically secure random IDs; for that I'd recommend using a second column with a unique index and a simple int/bigint primary key, then don't expose the primary key. We do not provide any help in achieving this however 😞 |
@benjie some of the comments to that article mention a couple of points:
|
Interesting, thanks for researching this! Personally I don't buy that the whole point of UUID is to not have sequential guessable identifiers - the reason UUIDv1 ids include the mac address is to make sure they don't class with other machines acting in good faith; it's not to protect against nefarious actors. UUIDv4 on the other hand is intended to give a truly random non-guessable identifier. I've filed an issue against the docs site to research this further as Bill Moran does not cite any authoritative sources backing his claims; however his reasoning seems sound and so with the lack of any further evidence I'd definitely support someone removing that particular paragraph. If you're looking for high-performance seemingly random IDs another option is a Feistel cipher - this will result in less work to generate (no randomness required), less storage required, and faster performance due to significantly smaller indexes (even with |
I'm going to go with uuid v4, myself. Thank you for the talk on this. Will leave this open for you to finish as you see fit. |
Closing as it's open on the website repo |
I'm not commenting on this issue as a whole. I just came across this thread from searching for some general UUID stuff. @benjie But just FYI on UUIDv1 in general, it's a common misconception that they're sequential. They're not. Yeah there's a timestamp before the MAC address, but it's kind of akin to using Generate a bunch of UUIDv1s within a few seconds. You'll see that the first 8 characters (time) move very quickly, and the middle parts (date) are the slower ones. That's why there's alternatives out there like "comb UUIDs" and "UUIDv6" which are similar to UUIDv1 except they've put things in the right order (year first). No idea why they did the timestamp backwards for UUIDv1, seems silly to me. Also I agree that the point of UUIDs has nothing to do with guessability. Especially any version that isn't v4. And even then, known issues in various UUIDv4 generators can make them guessable too. As far as I've read, the order doesn't matter much in postgres, even UUIDv4 is ok compared to other databases. But sequential still feels like a good idea to me, especially seeing you might be also storing copies of the data outside postgres too. I've made my own version of UUID that is (I think) is a good mix of v1 + comb + v4. Consists of timestamp (in the right order, I've even made the date human readable), node (not tied to hardware) and also random at the end. I've heard of UUIDv4 (very) occasionally conflicting, so at least with my version the conflict will happen on the same node immediately and fail early, rather that accepting the new data and finding you have conflicting IDs later on when it comes time to merge. |
I also ended up here while searching and I think this article is relevant to the discussion: https://www.2ndquadrant.com/en/blog/sequential-uuid-generators/
|
https://www.graphile.org/postgraphile/postgresql-schema-design/#postgres-roles
We suggest v1mc in the docs, though as you can see here uuid1 is much less random than uuid4
https://www.sohamkamani.com/blog/2016/10/05/uuid1-vs-uuid4/
The text was updated successfully, but these errors were encountered: