Adding Python Prototype for v7 #2

fabiolimace · 2021-05-22T02:46:08Z

Added v7 Python prototype plus testing

kyzer-davis · 2021-05-24T13:15:50Z

Thanks for the pull. A quick glance with my morning coffee looks good. I will grab a copy this afternoon and run through some deeper reviews and some testing.

Appreciate the help!

fabiolimace · 2021-05-25T04:18:08Z

You can modify it or refuse it if you think it is not a compliant implementation.

There are some aspects of this implementation that can be problematic:

It does not have a clock sequence (sequence counter). I did it without clock seq to avoid more complexity.
The node id changes all the time. Should it be a static random node id for the entire session?
The "devDebugs" IF is incomplete. I didn't have time to finish.
The subsec bits are encoded by multiplying the fractional part by 2 ** subsec_bits. I assume that if the decoding is done by dividing the subsec by 2 ** subsec_bits, the encoding must be done by multiplying the fractional part by 2 ** subsec_bits. Is it required for encoding? Can the encoding be relaxed as long as the decoding derives a value "as close to the correct value as possible"?
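To make the last question concrete, here is a minimal sketch of the encode/decode pair described above. The function names and the 30-bit default are assumptions for illustration only and are not taken from new_uuid.py.

```python
def encode_subsec(fraction, subsec_bits=30):
    """Scale a fractional second in [0, 1) to an integer of subsec_bits bits."""
    return int(fraction * 2 ** subsec_bits)


def decode_subsec(subsec, subsec_bits=30):
    """Inverse of encode_subsec: recover an approximate fractional second."""
    return subsec / 2 ** subsec_bits


fraction = 0.123456789            # hypothetical fractional part of a timestamp
encoded = encode_subsec(fraction)
decoded = decode_subsec(encoded)

# The round trip loses less than one unit of the chosen precision.
print(encoded, decoded, fraction - decoded)
```

Because the encoder truncates, the decoded value is never larger than the original fraction and differs from it by less than 2 ** -subsec_bits.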

kyzer-davis · 2021-05-25T14:27:26Z

Ah, thanks for the heads up.
I will work in a clock sequence.
Node can change each time that is fine.
I was only doing UUID generation in these prototypes so no need to handle decoding at the moment.

kyzer-davis · 2021-06-10T17:37:03Z

@fabiolimace I updated v7 in branch uuidv7-python

Added clock sequence
More comments for those that may follow along with this code
More tests and dev debug sections

Testing I did seems okay. Let me know what you think. (Note: I did replace your splices because I am terrible with bitwise operations). I also added f4b6a3/uuid-creator to the readme table for UUIDv6

@bradleypeabody In testing this implementation I found that we may need to update draft 01. Nanoseconds (NS) only need 30 bits for subsec, and our NS example in 4.4.4.1. UUIDv7 Encoding used too many bits. It is worth double-checking the Millisecond and Microsecond examples too.

All 12 bits of scenario subsec_a is fully dedicated to providing
sub-second encoding for the Nanosecond precision (nsec).

All 12 bits of subsec_b have been dedicated to providing sub-
second encoding for the Nanosecond precision (nsec).

The first 14 bit of the subsec_seq_node dedicated to providing
sub-second encoding for the Nanosecond precision (nsec).

It is an easy fix: we just need to give 8 bits back to the Random part of subsec_seq_node and update the text like so:

The first 6 bit of the subsec_seq_node dedicated to providing
sub-second encoding for the Nanosecond precision (nsec).

Finally the remaining 48 bits in the subsec_seq_node section are
layout is filled out with random data to pad the length and
provide guaranteed uniqueness (rand).
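The 30-bit figure can be checked with a quick sketch. It assumes the sub-second value is stored as a plain nanosecond count, and `split_nsec` is a hypothetical helper that follows the 12/12/6 split described above.

```python
NS_PER_SEC = 1_000_000_000

# Largest nanosecond value that can occur within one second.
max_ns = NS_PER_SEC - 1
bits_needed = max_ns.bit_length()
print(bits_needed)


def split_nsec(nsec):
    """Split a nanosecond count into subsec_a (12), subsec_b (12), subsec_c (6) bit strings."""
    subsec_binary = f'{nsec:030b}'
    subsec_a = subsec_binary[:12]     # upper 12 bits
    subsec_b = subsec_binary[12:24]   # middle 12 bits
    subsec_c = subsec_binary[24:]     # lower 6 bits, start of subsec_seq_node
    return subsec_a, subsec_b, subsec_c


a, b, c = split_nsec(max_ns)
print(len(a), len(b), len(c))
```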

fabiolimace · 2021-06-12T11:45:10Z

@kyzer-davis Now it's much better. I also wanted to replace the bitwise operations. Thanks!

I think this patch can fix the problem of bad decoding that forced the use of padding:

--- OLD/new_uuid.py
+++ NEW/new_uuid.py
@@ -170,7 +170,7 @@
 def uuid7(devDebugs=False, returnType="hex"):
     """Generates a 128-bit version 7 UUID with nanoseconds precision timestamp and random node
 
-    example: 60c26bbe-0728-7f46-9602-bcf7423f3cb7
+    example: 060c4735-8bcb-7726-a200-1fd41eaa8a29
 
     format: unixts|subsec_a|version|subsec_b|variant|subsec_seq_node
 
@@ -217,8 +217,7 @@
 
     ### Binary Conversions
     ### Need subsec_a (12 bits), subsec_b (12-bits), and subsec_c (leftover bits starting subsec_seq_node)
-    unixts = f'{sec:032b}'
-    unixts = unixts + "0000" # Pad end with 4 zeros to get 36-bit
+    unixts = f'{sec:036b}'
     subsec_binary = f'{subsec:030b}'
     subsec_a =  subsec_binary[:12] # Upper 12
     subsec_b_c = subsec_binary[-18:] # Lower 18
@@ -263,7 +262,7 @@
     _last_uuid_int = UUIDv7_int
 
     # Convert Hex to Int then splice in dashes
-    UUIDv7_hex = hex(int(UUIDv7_bin, 2))[2:]
+    UUIDv7_hex = f'{UUIDv7_int:032x}'
     UUIDv7_formatted = '-'.join(
         [UUIDv7_hex[:8], UUIDv7_hex[8:12], UUIDv7_hex[12:16], UUIDv7_hex[16:20], UUIDv7_hex[20:32]])

If you want to test the UUID time you can apply these changes to testing_v6.py and testing_v7.py :

testing_v6.py

--- OLD/testing_v6.py
+++ NEW/testing_v6.py
@@ -1,5 +1,6 @@
 import new_uuid
 import random
+import time
 
 """
 Testing:
@@ -17,16 +18,24 @@
 showUUIDs = False # True to view the generated UUID returnType and lists
 clock_seq = None # Set Clock Sequence
 
+def extractSeconds(uuid):
+	uuid_hex = uuid.replace('-', '')
+	timestamp = uuid_hex[:12] + uuid_hex[13:16]
+	return int((int(timestamp, 16) - 0x01b21dd213814000) / 10000000)
+
 def v6Tests(showUUIDs):
     counter = 0
     testList = []
     masterDict = {}
+    
+    start = int(time.time())
     while counter < 1000:
         # UUIDv6 = new_uuid.uuid1(devDebugs, returnType)
         UUIDv6 = new_uuid.uuid6(devDebugs, returnType)
         testList.append(UUIDv6)
         masterDict[UUIDv6] = counter
         counter += 1
+    end = int(time.time())
 
     if showUUIDs:
         print("\n")
@@ -54,6 +63,9 @@
         if masterDict[UUID] != counter:
             failCount+=1
             print('{0}: {1}'.format(str(counter), UUID))
+        elif not (extractSeconds(UUID) >= start and extractSeconds(UUID) <= end):
+            failCount+=1
+            print('{0}: {1} {2}'.format(str(counter), UUID, time.ctime(extractSeconds(UUID))))
         counter+= 1
     if failCount == 0:
         print("+ No Failures Observed")

testing_v7.py

--- OLD/testing_v7.py
+++ NEW/testing_v7.py
@@ -1,5 +1,6 @@
 import new_uuid
 import random
+import time
 
 """
 Testing:
@@ -17,15 +18,25 @@
 
 showUUIDs = False # True to view the generated UUID returnType and lists
 
+def extractSeconds(uuid):
+	uuid_hex = uuid.replace('-', '')
+	uuid_int = int(uuid_hex, 16)
+	uuid_bin = f'{uuid_int:0128b}'
+	time_bin = uuid_bin[:36]
+	return int(time_bin, 2)
+    
 def v7Tests(showUUIDs):
     counter = 0
     testList = []
     masterDict = {}
+    
+    start = int(time.time())
     while counter < 1000:
         UUIDv7 = new_uuid.uuid7(devDebugs, returnType)
         testList.append(UUIDv7)
         masterDict[UUIDv7] = counter
         counter += 1
+    end = int(time.time())
 
     if showUUIDs:
         print("\n")
@@ -53,6 +64,9 @@
         if masterDict[UUID] != counter:
             failCount+=1
             print('{0}: {1}'.format(str(counter), UUID))
+        elif not (extractSeconds(UUID) >= start and extractSeconds(UUID) <= end):
+            failCount+=1
+            print('{0}: {1} {2}'.format(str(counter), UUID, time.ctime(extractSeconds(UUID))))
         counter+= 1
     if failCount == 0:
         print("+ No Failures Observed")

The file testing_v8.py doesn't need to test the UUID time, since it depends on the implementation.

And thank you for the inclusion of the uuid-creator!

Draft 01 Update (#2 #4 #5 and update Readme with new prototype links)

fabiolimace · 2021-08-09T16:03:45Z

@kyzer-davis

I think we can avoid the timestamp padding by making two changes in the file new_uuid.py.

Change 1:

     ### Binary Conversions
     ### Need subsec_a (12 bits), subsec_b (12-bits), and subsec_c (leftover bits starting subsec_seq_node)
(-)  unixts = f'{sec:032b}'
(-)  unixts = unixts + "0000" # Pad end with 4 zeros to get 36-bit
     subsec_binary = f'{subsec:030b}'

     ### Binary Conversions
     ### Need subsec_a (12 bits), subsec_b (12-bits), and subsec_c (leftover bits starting subsec_seq_node)
(+)  unixts = f'{sec:036b}'
     subsec_binary = f'{subsec:030b}'

Change 2:

     # Convert Hex to Int then splice in dashes
(-)  UUIDv7_hex = hex(int(UUIDv7_bin, 2))[2:]
     UUIDv7_formatted = '-'.join(

     # Convert Hex to Int then splice in dashes
(+)  UUIDv7_hex = f'{UUIDv7_int:032x}' # int to hex
     UUIDv7_formatted = '-'.join(

After these changes the UUID is generated with the right length (36) without padding:

before: 60c26bbe-7287-f469-602b-cf7423f3cb7
after:  060c4735-8bcb-7726-a200-1fd41eaa8a29

The padding can result in different time when one tries to call uuid.get_time().
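A small sketch of why the decoded time can differ, assuming a decoder that reads the whole 36-bit field as one integer. The timestamp value is hypothetical.

```python
sec = 1623354302  # hypothetical Unix timestamp in seconds

right_padded = f'{sec:032b}' + '0000'  # zeros appended at the least significant end
left_padded = f'{sec:036b}'            # zeros added at the most significant end

# Reading all 36 bits as one integer gives different results.
decoded_right = int(right_padded, 2)   # sec shifted left by 4 bits
decoded_left = int(left_padded, 2)     # sec unchanged

print(decoded_right == sec * 16, decoded_left == sec)
```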

kyzer-davis · 2021-08-09T17:11:26Z

@fabiolimace

After these changes the UUID is generated with the right length (36) without padding:

Both methods end up padding the 32-bit Unix timestamp to 36 bits. The difference is that my current implementation pads the least significant bits (the end) and your proposed change pads the most significant bits (the start). Note the leading 0 in your final UUID.
My preference has always been to pad in the least significant position and avoid leading 0s. I actually just published #21 earlier today detailing this in the V02 draft.

Change 2:

This is only required because change number 1 causes int(UUIDv7_bin, 2) to drop the leading 0s you padded earlier. That is somewhat counter-intuitive, since f'{UUIDv7_int:032x}' then re-pads them.
With the current padding in the least significant position, you can use either UUIDv7_hex = hex(int(UUIDv7_bin, 2))[2:] or UUIDv7_hex = f'{UUIDv7_int:032x}', since both yield the same 32 hex characters.
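The effect on the hex length can be reproduced with a short sketch. The timestamp and the filler for the remaining 92 bits are placeholders, not output from new_uuid.py.

```python
sec = 1623354302   # hypothetical Unix timestamp in seconds
rest = '1' * 92    # placeholder for the other 92 bits of the UUID

right_bin = f'{sec:032b}' + '0000' + rest  # padding in the least significant position
left_bin = f'{sec:036b}' + rest            # padding in the most significant position

# hex() drops leading zero digits, so the left-padded value comes out one digit short.
right_hex = hex(int(right_bin, 2))[2:]
left_hex_short = hex(int(left_bin, 2))[2:]

# A fixed-width format spec restores the leading zero.
left_hex_fixed = f'{int(left_bin, 2):032x}'

print(len(right_hex), len(left_hex_short), len(left_hex_fixed))
```

Note that the right-padded case only reaches 32 digits here because the leading hex digit of this timestamp is non-zero; the fixed-width format is the safer choice in either layout.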

The padding can result in different time when one tries to call uuid.get_time()

The current implementation of uuid.get_time() will likely not be able to handle full UUIDv7 parsing until it is extended. By explicitly detailing the padding position this makes future extension of that easier. That is, if the spec is ratified as an official RFC.
With the current padding the decoder can always assume the first 32-bits of UUIDv7 are valid 32-bit Unix epoch. Decoding the remaining 4 bits along with the subsequent sub-second precision found in the rest of the UUIDv7 layout I would leave up to the implementer of the decoder.

Adding Python Prototype for v7

9b796f4

Added v7 Python prototype plus testing

fabiolimace mentioned this pull request May 22, 2021

Some ideas for UUIDv7 uuid6/uuid6-ietf-draft#11

Closed

Update README.md

70febd7

kyzer-davis added a commit that referenced this pull request Jun 10, 2021

Fix #4 in v1/v6, Update UUIDv7 ontop of #2, add v8 timestamp clarity

d7e0fc9

kyzer-davis mentioned this pull request Jul 12, 2021

Draft 01 Update #6

Merged

kyzer-davis added a commit that referenced this pull request Jul 12, 2021

Merge pull request #6 from uuid6/uuidv7-python

4861c23

Draft 01 Update (#2 #4 #5 and update Readme with new prototype links)

kyzer-davis merged commit 70febd7 into uuid6:main Jul 12, 2021

nerg4l mentioned this pull request Jul 12, 2021

Draft 02 updates uuid6/uuid6-ietf-draft#14

Merged

This was referenced Aug 9, 2021

V02: Fixed Typos and added 32 to 36 bit padding clarification uuid6/uuid6-ietf-draft#21

Closed

Discussion: UUIDv7 subsecond precision encoding uuid6/uuid6-ietf-draft#24

Open

This was referenced Dec 30, 2021

Python uuidv7 still currently right-pads with four zeros to turn 32 bit timestamp to 36 bit timestamp #8

Closed

Update v7 to draft 02 #16

Merged
