[v6] Python implementation increments timestamp instead of clock sequence #4

nerg4l · 2021-06-09T21:32:26Z

While having a look at the python prototype and reading the "UUIDv6 Basic Creation Algorithm" section of the draft I noticed a difference between them.

From Section 4.3.4.

If the state was available, but the saved timestamp is less than
or equal to the current timestamp, increment the clock sequence
value.

prototypes/python/new_uuid.py

Lines 35 to 39 in 98bc0c1

    
           if _last_timestamp is not None and timestamp <= _last_timestamp: 
        
               timestamp = _last_timestamp + 1 
        
           _last_timestamp = timestamp 
        
           if clock_seq is None: 
        
               clock_seq = random.getrandbits(14) # instead of stable storage

I think the implementation in the linked code segment increments the timestamp when it should increment the clock sequence instead.

kyzer-davis · 2021-06-10T13:08:43Z

Hey @nerg4l,
I had noticed this too when forking the UUIDv1 code from the official Python repo.
https://github.com/python/cpython/blob/3.9/Lib/uuid.py#L688

I have been meaning to fix it in v6 prototype but have not found the time.
After I can submit PR to have the Python folks update their code too since the UUIDv1 creation algorithm is basically the same for that step in RFC 4122: https://datatracker.ietf.org/doc/html/rfc4122#section-4.2.1

kyzer-davis · 2021-06-10T17:46:37Z

@nerg4l
I pushed a branch uuidv7-python which should fix the v1/v6 issue where timestamp is incremented instead of the clock sequence.
This is simply re-used code from my UUIDv8 timestamp check.

Sequence Counter Logic:

sequenceCounter always starts at 0 instead of random value Python UUIDv1 was setting before
Compare _last_timestamp to current timestamp for this UUID generation
If current timestamp is less than or equal increment sequenceCounter by one (shouldn't ever be less than)
If current timestamp is greater, reset sequenceCounter to 0
Ensure we are updating _last_timestamp and _last_sequence for future UUID generation iterations.

Testing v6 with Dev Debugs enabled seems to increment properly but let me know what you think.

nerg4l · 2021-06-10T19:58:45Z

I think it is better to manage the sequence this way. The previous logic altered the time segment which meant whenever someone would read it from the generated uuid they would get an incorrect result.

Just remove clock_seq from the paramtere list. Currently that parameter is not respected. It gets a new value assigned in the method.

RFC 4122 - Section 4.1.5 says the following about clock sequence:

The clock sequence MUST be originally (i.e., once in the lifetime of
a system) initialized to a random number to minimize the correlation
across systems. This provides maximum protection against node
identifiers that may move or switch from system to system rapidly.
The initial value MUST NOT be correlated to the node identifier.

I don't know how important to respect this requirement in case of v6. The 48 random bit should be enough to avoid collisions.

kyzer-davis · 2021-06-10T20:55:53Z

@nerg4l
Yeah, I just made a commit to drop clock_seq variable from v1/v6. The goal with this v6 prototype was to detail that you only needed to change a few lines of code to change an existing v1 implementation over to v6. Little did I know Python's v1 implementation was non-compliant with RFC4122...

As for the v6:
Our current 01 draft re-uses the logic from the section you specified in RFC4122. However, in v1 the Sequence also doubles as entropy since the Node usually may contain a static 48-bit MAC. But in v6 our Node is random as such there is no reason to randomize the Seq to start.

Furthermore, if I recall in testing random Seq start in v8 prototype I found that random sequence counter greatly increased the change the Seq would rollover causing out of order UUIDs. Or if the rollover isn't handled properly overflows where the Seq becomes larger than the space it was meant to encompass resulting in UUIDs larger than 128 bits.

Thus is the reason why I stated v7 and v8 like so:

The clock sequence MUST start at zero and increment monotonically for
each new UUID created on by the application on the same timestamp.
When the timestamp increments the clock sequence MUST be reset to
zero. The clock sequence MUST NOT rollover or reset to zero unless
the timestamp has incremented. Care MUST be given to ensure that an
adequate sized clock sequence is selected for a given application
based on expected timestamp precision and expected UUID generation
rates.

I will discuss with Brad and update draft 02 where required to state that the sequence should start at 0 and increase by 1. Exactly like we do in v7/v8.

nerg4l · 2021-06-11T12:15:21Z

Furthermore, if I recall in testing random Seq start in v8 prototype I found that random sequence counter greatly increased the change the Seq would rollover causing out of order UUIDs. Or if the rollover isn't handled properly overflows where the Seq becomes larger than the space it was meant to encompass resulting in UUIDs larger than 128 bits.

I had the same observation while writing an implementation for v6.

I will discuss with Brad and update draft 02 where required to state that the sequence should start at 0 and increase by 1. Exactly like we do in v7/v8.

I think that would be a great addition. It would also make the sequence reusable between v6, v7, and v8.

When you discuss this don't forget to take into consideration the rule for systems/languages which cannot provide 100-nanosecond timestamps (e.g. JavaScript in the browser). RFC 4122 - Section 4.2.1.2. suggests the following:

A high resolution timestamp can be simulated by keeping a count of
the number of UUIDs that have been generated with the same value of
the system time, and using it to construct the low order bits of the
timestamp. The count will range between zero and the number of
100-nanosecond intervals per system time interval.

At the moment, there are 3 ways which comes to my mind.

Option 1: Increment low order bits, Sequence always 0
Option 2: Increment sequence, Low order bits not modified
Option 3: Increment low order until it reaches the precision of the system time and then increment sequence

The most optimal would be Option 3 but it might add a lot of unnecessary complexity.

Draft 01 Update (#2 #4 #5 and update Readme with new prototype links)

kyzer-davis added a commit that referenced this issue Jun 10, 2021

Fix #4 in v1/v6, Update UUIDv7 ontop of #2, add v8 timestamp clarity

d7e0fc9

nerg4l closed this as completed Jun 17, 2021

kyzer-davis mentioned this issue Jul 12, 2021

Draft 01 Update #6

Merged

kyzer-davis added a commit that referenced this issue Jul 12, 2021

Merge pull request #6 from uuid6/uuidv7-python

4861c23

Draft 01 Update (#2 #4 #5 and update Readme with new prototype links)

kyzer-davis mentioned this issue Jul 12, 2021

Draft 02 updates uuid6/uuid6-ietf-draft#14

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[v6] Python implementation increments timestamp instead of clock sequence #4

[v6] Python implementation increments timestamp instead of clock sequence #4

nerg4l commented Jun 9, 2021

kyzer-davis commented Jun 10, 2021

kyzer-davis commented Jun 10, 2021

nerg4l commented Jun 10, 2021

kyzer-davis commented Jun 10, 2021

nerg4l commented Jun 11, 2021

[v6] Python implementation increments timestamp instead of clock sequence #4

[v6] Python implementation increments timestamp instead of clock sequence #4

Comments

nerg4l commented Jun 9, 2021

kyzer-davis commented Jun 10, 2021

kyzer-davis commented Jun 10, 2021

nerg4l commented Jun 10, 2021

kyzer-davis commented Jun 10, 2021

nerg4l commented Jun 11, 2021