v7 should have a counter option? #717

rogusdev · 2023-11-05T16:01:52Z

Per the new draft "With this method rand_a section of UUIDv7 SHOULD be utilized as fixed-length dedicated counter bits that are incremented by one for every UUID generation."

That's in the "Fixed-Length Dedicated Counter Bits (Method 1)" section under https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format-04#monotonicity_counters which is directly linked from https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format-04#section-5.2 under the rand_a explanation.

To my reading, that means it should have a counter, like v6 (and v1), but instead the current v7 builder implementation https://docs.rs/uuid/latest/src/uuid/v7.rs.html#47-53 is just random for rand_a. Am I missing something on this one? If not, and this is a desirable feature, I might look into putting up a PR to add the counter support.

KodrAus · 2023-11-15T23:31:51Z

Hi @rogusdev 👋

I'd be keen to go catch up on some of the discussion around that. Those sections look like recommendations rather than requirements, so I think uuid's current implementation that fills the entire 74 bits of rand_a and rand_b with random data is still compliant.

It seems a bit overkill to me to need a clock, a source of randomness, and a monotonic counter to generate V7 UUIDs, but we could consider adding *_monotonic variants of the v7 methods that accept a ClockSequence like v1 and v6 do.

KodrAus · 2023-11-16T02:18:10Z

The test vector for UUIDv7 appears to use a fully random value for rand_a and rand_b: https://datatracker.ietf.org/doc/html/draft-peabody-dispatch-new-uuid-format-04#name-example-of-a-uuidv7-value

I think we should consider supporting this, but don't think we need to block stabilizing the current APIs on it.

rogusdev · 2023-11-16T05:27:58Z

From my perspective, while the main point of uuid v7 is to put the clock at the front, so that db indexes can btree on them properly, counters add additional safety mechanisms that are quite to my taste: if you are generating ids in a single thread, you are guaranteed that no duplicates can happen, as long as you generate fewer ids per millisecond than the counter limit. While purely random values are certainly extremely unlikely to produce duplicates, I am a big fan of guarantees.

Which is in fact why they put it into the RFC, as that is also a feature in many other popular id libraries.
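To make the single-thread guarantee concrete, here is a minimal sketch of a fixed-length counter that fits the 12 bits of rand_a. The `V7Counter` type, `COUNTER_MAX`, and `next_for` are hypothetical names for illustration, not part of the uuid crate, and the caller is assumed to supply the current Unix time in milliseconds.

```rust
/// Largest value that fits in the 12-bit rand_a field.
pub const COUNTER_MAX: u16 = 0x0FFF;

/// Hypothetical single-thread state for a counter stored in rand_a.
#[derive(Debug, Default)]
pub struct V7Counter {
    last_millis: u64,
    counter: u16, // only the low 12 bits are ever used
}

impl V7Counter {
    /// Returns the value to place in rand_a for an ID created at `millis`,
    /// or `None` once all 4096 values for that millisecond are used up.
    pub fn next_for(&mut self, millis: u64) -> Option<u16> {
        if millis > self.last_millis {
            // New millisecond: restart the count.
            self.last_millis = millis;
            self.counter = 0;
            return Some(0);
        }
        // Same millisecond (or the clock stepped back): keep counting up.
        if self.counter == COUNTER_MAX {
            return None;
        }
        self.counter += 1;
        Some(self.counter)
    }
}
```

Within one millisecond every returned value is distinct, so two IDs from the same generator cannot collide until the counter is exhausted; what to do on `None` (wait for the next millisecond, or borrow from it) is left to the caller.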

rogusdev · 2023-11-16T05:31:42Z

That said, "SHOULD" is indeed a recommendation, rather than requirement, and they list multiple alternatives. So I support having a separate set of ctor functions to call for the counter version(s).

sergeyprokhorenko · 2023-11-28T23:24:27Z

Hi Ashley,

I'm Sergey Prokhorenko, a contributor to rfc4122bis and a counter enthusiast.

Some developers implement the counter:

UUIDv7 with the counter for PostgreSQL is currently being developed in C by Andrey Borodin:

But it seems to me that Rust is also a good tool for such development.

KodrAus · 2023-12-13T23:11:55Z

We've currently actually got access to a counter in our Timestamp type, because that's where it gets stashed for v1 and v6 UUIDs. How does the following sound to you:

1. Don't feature gate Timestamp.counter on v1 or v6; just make it always available.
2. Use the counter value if it's non-zero in Uuid::new_v7, otherwise use random bytes.
3. Add a Builder::from_timestamp_millis_counter(millis: u64, counter: u16, random_bytes: [u8; 6]) method.

An alternative for 2. would be to add Uuid::new_v7_counter and Uuid::now_v7_counter methods that use the counter.
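For discussion, here is a rough sketch of how a timestamp, a counter, and random bytes could be packed into the v7 layout. It is not the proposed `Builder` method: `encode_v7_with_counter` is a hypothetical free function, and it assumes a 12-bit counter in rand_a plus 8 caller-supplied random bytes for rand_b, because a u16 counter plus 6 random bytes (64 bits) does not map one-to-one onto the 74 bits of rand_a and rand_b.

```rust
/// Hypothetical helper: packs a 48-bit timestamp, version 7, a 12-bit counter
/// (rand_a), the variant bits, and 62 bits of supplied randomness (rand_b).
pub fn encode_v7_with_counter(millis: u64, counter: u16, random: [u8; 8]) -> [u8; 16] {
    let mut bytes = [0u8; 16];
    // unix_ts_ms: the low 48 bits of the timestamp, big-endian.
    bytes[..6].copy_from_slice(&millis.to_be_bytes()[2..]);
    // ver (0b0111) followed by the top 4 bits of the 12-bit counter.
    bytes[6] = 0x70 | ((counter >> 8) as u8 & 0x0F);
    // The remaining 8 bits of the counter.
    bytes[7] = counter as u8;
    // rand_b: copy the random bytes, then overwrite the top 2 bits with var (0b10).
    bytes[8..].copy_from_slice(&random);
    bytes[8] = 0x80 | (bytes[8] & 0x3F);
    bytes
}
```

Counter bits above the low 12 are silently dropped in this sketch; a real API would need to decide whether to reject, truncate, or widen the counter into rand_b.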

sergeyprokhorenko · 2023-12-14T18:47:33Z

> We've currently actually got access to a counter in our Timestamp type, because that's where it gets stashed for v1 and v6 UUIDs. How does the following sound to you:
>
> 1. Don't feature gate Timestamp.counter on v1 or v6; just make it always available.

Like the vast majority, I am convinced that only the seventh version is worthy of attention. There is no need to waste time and effort implementing other versions.

> 2. Use the counter value if it's non-zero in Uuid::new_v7, otherwise use random bytes.

I'm against this. A counter initialized to zero increases the collision probability. Worse, the counter value may be the same as the random value used instead of the counter at the beginning of the millisecond.

I prefer a counter that is initialized to a random value at the beginning of each millisecond. But this may reduce the actual capacity of the counter. Therefore, the leftmost bit of the counter should be initialized to zero and/or the timestamp should be incremented when the counter overflows.

> 3. Add a Builder::from_timestamp_millis_counter(millis: u64, counter: u16, random_bytes: [u8; 6]) method.

The timestamp in the seventh version is shorter, and you missed the ver and var segments.
My suggestion:

| Segment length, bits | Field in RFC | Segment content |
|---|---|---|
| 48 | unix_ts_ms | Timestamp |
| 4 | ver | Version |
| 1 | rand_a | Counter segment initialized to zero |
| 11 | rand_a | Counter segment initialized with a pseudorandom number |
| 2 | var | Variant |
| 30 | rand_b | Counter segment initialized with a pseudorandom number |
| 32 | rand_b | UUIDv7 segment filled with a pseudorandom number |
| 64 | | Optional segment to the right of the UUID in key database columns |

> An alternative for 2. would be to add Uuid::new_v7_counter and Uuid::now_v7_counter methods that use the counter.

If we are talking about initializing the counter every millisecond with a random value and incrementing within a millisecond, then I am for it.
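The scheme described in this comment can be sketched roughly as follows: a 42-bit counter (12 bits in rand_a, 30 bits in rand_b) that is reseeded with its leftmost bit cleared at the start of each millisecond, incremented within a millisecond, and that bumps the timestamp when it overflows. `V7Generator` and its method are hypothetical names, and because the standard library has no random number generator, the random inputs `reseed` and `tail` are assumed to come from the caller.

```rust
/// Counter width in the suggested layout: 12 bits in rand_a + 30 bits in rand_b.
pub const COUNTER_BITS: u32 = 42;
pub const COUNTER_MASK: u64 = (1 << COUNTER_BITS) - 1;

/// Hypothetical generator state: the last timestamp used and the counter.
#[derive(Debug, Default)]
pub struct V7Generator {
    pub millis: u64,
    pub counter: u64,
}

impl V7Generator {
    /// `reseed` and `tail` are caller-supplied random values (an assumption
    /// of this sketch); `now_millis` is the current Unix time in milliseconds.
    pub fn generate(&mut self, now_millis: u64, reseed: u64, tail: u32) -> [u8; 16] {
        if now_millis > self.millis {
            // New millisecond: random start with the leftmost counter bit zero,
            // which leaves at least 2^41 increments before overflow.
            self.millis = now_millis;
            self.counter = reseed & (COUNTER_MASK >> 1);
        } else {
            // Same millisecond, or the clock went backwards: count up.
            self.counter += 1;
            if self.counter > COUNTER_MASK {
                // Overflow: move the timestamp forward and reseed.
                self.millis += 1;
                self.counter = reseed & (COUNTER_MASK >> 1);
            }
        }
        let hi = (self.counter >> 30) as u16; // top 12 counter bits -> rand_a
        let lo = (self.counter & 0x3FFF_FFFF) as u32; // low 30 bits -> rand_b
        let mut bytes = [0u8; 16];
        bytes[..6].copy_from_slice(&self.millis.to_be_bytes()[2..]);
        bytes[6] = 0x70 | (hi >> 8) as u8; // ver + counter bits 41..38
        bytes[7] = hi as u8;
        // var (0b10) followed by the low 30 counter bits.
        bytes[8..12].copy_from_slice(&(0x8000_0000u32 | lo).to_be_bytes());
        // Final 32 bits: plain randomness.
        bytes[12..].copy_from_slice(&tail.to_be_bytes());
        bytes
    }
}
```

Since the timestamp and counter occupy the leading bytes and the variant bits are constant, IDs produced by one generator compare in generation order as byte arrays, regardless of the random tail.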

KodrAus · 2024-01-16T22:53:29Z

This library already implements v1 and v6 versions and has some established API conventions that we need to remain consistent with. I think we should be able to support everything suggested with some new Uuid::new_v7_counter and Uuid::now_v7_counter methods that accept an impl ClockSequence. The implementation of that ClockSequence is what would handle time and randomness.

sergeyprokhorenko · 2024-01-16T23:19:12Z

I highly recommend looking at the UUIDv7 implementation in PostgreSQL v17 by Andrey Borodin

sergeyprokhorenko · 2024-01-17T16:45:15Z

Patch UUID v7 (commitfest record)

KodrAus · 2024-01-18T04:24:07Z

Thanks @sergeyprokhorenko 👍 Having a good reference implementation to point at will definitely be helpful

sergeyprokhorenko · 2024-01-24T19:54:02Z

Another good implementation:
https://github.com/LiosK/uuid7-rs/tree/8b6362ac9bd4b0a24639ebe8365e4f6f6cacec75
https://crates.io/crates/uuid7

It is mentioned in this benchmark.

Due to the excessively long counter (initialized with a random number every millisecond), less entropy is generated, which requires a lot of resources.

But it's worth trying to replace rand::rngs::OsRng with openssl-rand for better performance.

sergeyprokhorenko · 2024-02-16T11:18:19Z

Patch UUID v7 (commitfest record)

This patch has already been successfully tested: https://ardentperf.com/2024/02/03/uuid-benchmark-war/

brianbruggeman · 2024-03-20T14:15:00Z

Thought/Question: Why not add a builder pattern with defaults? That doesn't eliminate the backwards compatibility and can improve clarity around generating the UUID.

ehiggs · 2024-04-14T22:30:47Z

The API in the Rust ULID crate provides monotonicity using an increment(&self) -> Option<Ulid> method. This means you can create a batch of IDs by generating a single ulid and then incrementing its random part to create the rest of the batch.

If you know that you are generating a batch then an increment method should be "sufficient logic for organizing the creation order of those one-thousand UUIDs" (quoting from the RFC).

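Here is an example:

```rust
use ulid::Ulid;
use uuid::Uuid;

fn main() {
    // Ten independent v7 UUIDs: the random part differs for each one.
    let mut uuids = Vec::with_capacity(10);
    for _ in 0..10 {
        uuids.push(Uuid::now_v7());
    }
    for uuid in &uuids {
        println!("{:<12} {uuid}", "uuidv7:");
    }
    println!();

    // Ten ULIDs derived from a single starting value by incrementing
    // the random part, so the whole batch shares one timestamp.
    let mut ulids = Vec::with_capacity(10);
    let mut ulid = Ulid::new();
    for _ in 0..10 {
        ulids.push(ulid);
        // increment() returns None if the random part would overflow.
        ulid = ulid.increment().unwrap();
    }
    for ulid in &ulids {
        println!("{:<12} {ulid}", "ulid:");
    }
}
```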

Is taking blatant inspiration from the excellent work in the ULID crate ok? Of course. ULID is the very first time-based unique ID referenced in the new RFC.

Note that this also gives more control over creation of batches. If one creates a batch of 1000 UUIDs, using an internal counter you have no control over whether they will be in the same millisecond. This means you will have part of the batch spread across different timestamps and the random component will jump around for each millisecond. Using an increment you can keep the millisecond and the random component stable. It's not required in the specification but seems useful to be able to trivially see if UUIDs are part of the same batch.
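As a rough illustration of what such an increment could look like for a v7 UUID, the sketch below treats rand_a and rand_b as one 74-bit number and adds one to it, leaving the timestamp, version, and variant bits untouched. `increment_v7` is a hypothetical free function operating on raw bytes, modeled on the ULID method described above; it is not an existing uuid crate API.

```rust
/// Hypothetical increment for a v7 UUID given as 16 big-endian bytes.
/// Returns `None` when all 74 random bits are already set.
pub fn increment_v7(bytes: [u8; 16]) -> Option<[u8; 16]> {
    const RAND_A_SHIFT: u32 = 64; // rand_a occupies bits 75..64
    const RAND_A_MASK: u128 = 0x0FFF << RAND_A_SHIFT;
    const RAND_B_BITS: u32 = 62; // rand_b occupies bits 61..0
    const RAND_B_MASK: u128 = (1 << RAND_B_BITS) - 1;

    let value = u128::from_be_bytes(bytes);
    // Gather rand_a (high 12 bits) and rand_b (low 62 bits) into one number.
    let rand_a = (value & RAND_A_MASK) >> RAND_A_SHIFT;
    let rand = (rand_a << RAND_B_BITS) | (value & RAND_B_MASK);
    let next = rand + 1;
    if next >> 74 != 0 {
        return None; // the random part is exhausted
    }
    // Scatter the result back around the fixed timestamp, ver, and var bits.
    let fixed = value & !(RAND_A_MASK | RAND_B_MASK);
    let out = fixed | ((next >> RAND_B_BITS) << RAND_A_SHIFT) | (next & RAND_B_MASK);
    Some(out.to_be_bytes())
}
```

Starting from one generated UUID and calling this repeatedly would give a batch that shares a single timestamp and sorts in creation order, which is the property argued for above.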

KodrAus mentioned this issue Nov 15, 2023

UUIDv6–v8 Support #523

Closed

rogusdev closed this as completed Nov 16, 2023

rogusdev reopened this Nov 16, 2023
