Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Guarantee a minimum number of IDs before overflow of the random component #39

Open
TomMD opened this issue Jan 9, 2020 · 5 comments
Open

Comments

@TomMD
Copy link

TomMD commented Jan 9, 2020

The actual number of ULIDs I can get in a millisecond might only be 1 (with absurd probability 2^-80). We could guarantee at least 2^79 ULIDs by requiring the first random generated in a millisecond starts with a zero bit (implementors would merely mask it, I'm sure).

This has the negative impact of only 79 bits of randomness instead of 80 so if there are 1e12ULIDs being generated across different devices in the same millisecond then we'll get a collision whp (vs the previous expectation of 7.8e11 ULIDs in one millisecond)

@ad-si
Copy link

ad-si commented Jan 12, 2020

I think monotonicity guarantees should completely be removed from the spec as they make the ulid() function impure. This can lead to very unpredictable behavior. (e.g. where different calls to the function accidentally do or don't share the same state and modify their value's in dangerous ways). I'd rather recommend the implementations to provide a second ulidMonoton() function and a function to manually increment the bits of the random part if there is a possibility of non monotonicity. What do you think @alizain?

@Bacco
Copy link

Bacco commented May 8, 2020

@TomMD your proposal seems to have the best balance point considering randomness, easy of implementation and safe range to accomodate successive calls. Thought the same today, came to open a similar issue and then saw yours. +1

@wtarreau
Copy link

wtarreau commented Mar 8, 2024

I thought exactly the same when first reading the spec: if you generate a high random value at the beginning of the millisecond you risk to overflow, and it would be sufficient to just mask the topmost bit from the RNG to avoid this. An UID generator that can fail is a problem for a lot of software, and the loss of that bit solves that problem entirely without significantly impacting the randomness. Plus there are two extra bits never used in the base32 representation. It would also make sense to pad the timestamp on the left (129 or 130 bits) and keep one or two bits to count overflows between RAND and TS if needed. But I tend to think that keeping 128 bits is more important than trying not to lose one bit of entropy.

@joonatanu-softwerk
Copy link

joonatanu-softwerk commented Jul 26, 2024

I was reading this spec and and many implementations and came to this scenario as well. (Found also issue #11 )

The random part may generate something very close to the last value of random part. Therefore an "overflow error" may appear. Because all of it is beginning with random itself, i have no guarantees of how many ULIDs can i generate in the same millisecond. Sometimes it is just 1, sometimes it is 2^80.

I would suggest the following change:

  • Allow random part to overflow into timestamp component
  • Instead of checking for the "same" millisecond, check for the "same or future" millisecond of the last generated ULID
  • Use the MAX(LastUlidTimePart, CurrentTimePart) as the time part for newly generated ULID

In that unfortunate scenario, when random produced the (close to) last value into the random part, this allows a whole new set of ULIDs to be generated and after the random part overflow into timestamp part, with just 1ms into the future, we now are guaranteed to have 2^80 ULIDs available before another overflow happens. That is near-impossible to do in this already "half-spent" single millisecond.

The time component will catch up in the end and in the mean time, all ULIDs are still orderable and generatable. And we have removed the random-error from specification that user can not predict.

@Seramis
Copy link

Seramis commented Aug 30, 2024

I have created an implementation of ULID in C# that allows overflow into timestamp part: https://github.com/ByteAether/Ulid

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants