Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generator doesn't actually guarantee 80-bits of UIDs per millisecond? #80

Open
vlovich opened this issue May 1, 2024 · 3 comments
Open

Comments

@vlovich
Copy link

vlovich commented May 1, 2024

I could be horribly misreading the code, but on every new millisecond it'll reseed the random state with new entropy. If the increment moves the entropy to RAND_MAX (i.e. 2^80), the generate function returns an error. My statistics is fuzzy, but I think that means on average you can actually onl increment ~2^40 times since the randomness will be close to the half-way point to RAND_MAX half the time leaving only 40 bits of increment. You would also expect that a non-infrequent amount of time you would have very few generations available if the random number generated is or is close to RAND_MAX.

Shouldn't Generator remember the starting entropy so that on hitting RAND_MAX it simply wraps and only returns an error if it were to increment back to the original random state?

@dylanhart
Copy link
Owner

This is more of a design problem with the original spec & reference implementation. The design basically assumes ~1 ulid per millisecond per process. You're totally correct that it can do better. I chose to implement this behavior in the same way as the reference implementation rather than to try and improve it myself.

Shouldn't Generator remember the starting entropy so that on hitting RAND_MAX it simply wraps and only returns an error if it were to increment back to the original random state?

No because then the function wouldn't be monotonically increasing which is the useful property being implemented here. If you don't care about this property then creating random ulids without a generator is valid as well. I think that overflowing into the next millisecond is what's needed to keep the property valid, but that adds incorrect timestamps.

However, feel free to implement the described behavior into your codebase if it solves your needs. Everything on Ulid should be public and writing a custom Generator to fit your needs should be easy.

@vlovich
Copy link
Author

vlovich commented May 1, 2024

That's a good point. This is the part of the homepage that threw me:

If, in the extremely unlikely event that, you manage to generate more than 2^80 ULIDs within the same millisecond, or cause the random component to overflow with less, the generation will fail.

The "within the same millisecond" part reads non-sensically to me given these design properties. What's it trying to say?

@dylanhart
Copy link
Owner

See ulid/spec#11 for discussion. I don't think that line is accurate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants