Clarify how to represent fractional seconds in UUIDv7 #44

stevesimmons · 2021-12-19T20:35:21Z

Some of the UUIDv7 implementations I've seen on GitHub have not implemented fractional seconds correctly. For instance, some put the number of milliseconds directly into the 12-bit field subsec_a, thereby using only the 0-999 subset of the full 0-4095 range.
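To illustrate the difference, here is a minimal sketch contrasting the two approaches for a 12-bit field. The function names are hypothetical and not taken from the draft or from any particular implementation; the scaled variant assumes the value is truncated rather than rounded.

```python
SUBSEC_A_BITS = 12  # width of subsec_a in the draft layout


def subsec_a_unscaled(msec: int) -> int:
    """Store the millisecond count as-is: only 0..999 of 0..4095 is used."""
    return msec


def subsec_a_scaled(msec: int) -> int:
    """Scale msec / 1000 onto the full 12-bit range as a binary fraction."""
    return msec * 2**SUBSEC_A_BITS // 1000


for msec in (0, 500, 999):
    print(msec, subsec_a_unscaled(msec), subsec_a_scaled(msec))
# 0 0 0
# 500 500 2048
# 999 999 4091
```

With scaling, half a second maps to 2048, i.e. exactly half of the 12-bit range, which is what a decoder reading the field as a binary fraction would expect.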

To prevent this, could the initial descriptions of the subsec_a and subsec_b fields (around L568 of the -02.txt draft) make it more explicit that these must use the full range, with an example of what not to do?

Currently this isn't spelled out (though discussed in issue #24). For instance:

Section 4.4.1. - UUIDv7 Timestamp Usage

Additional sub-second precision (millisecond, nanosecond,
microsecond, etc) MAY be provided for encoding and decoding in the
remaining bits in the layout. [note: but doesn't say how!]

Section 4.4.4.1. - UUIDv7 Encoding

All 12 bits of scenario subsec_a is fully dedicated to millisecond
information (msec). [note: it isn't clear what "fully dedicated" means here]

It's only once the reader gets down to L845 that the requirements are spelled out:

Section 4.4.4.2. UUIDv7 Decoding

Similarly as per Figure 2, the sub-second precision values lie within
subsec_a, subsec_b, and subsec_seq_node which are all interpreted as
sub-second information after skipping over the version (ver) and
(var) bits. These concatenated sub-second information bits are
interpreted in a way where most to least significant bits represent a
further division by two. This is the same normal place notation used
to express fractional numbers, except in binary. For example, in
decimal ".1" means one tenth, and ".01" means one hundredth. In this
subsec field, a 1 means one half, 01 means one quarter, 001 is one
eighth, etc. This scheme can work for any number of bits up to the
maximum available, and keeps the most significant data leftmost in
the bit sequence.
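The place-value reading described in that paragraph can be sketched as follows. The helper name bits_to_fraction is hypothetical, and the bit string stands in for the concatenated sub-second bits after the version and variant bits have been skipped.

```python
from fractions import Fraction


def bits_to_fraction(bits: str) -> Fraction:
    """Interpret a string of '0'/'1' characters as a binary fraction in [0, 1)."""
    total = Fraction(0)
    for position, bit in enumerate(bits, start=1):
        if bit == "1":
            # The n-th bit from the left is worth 1 / 2**n of a second.
            total += Fraction(1, 2**position)
    return total


print(bits_to_fraction("1"))    # 1/2
print(bits_to_fraction("01"))   # 1/4
print(bits_to_fraction("001"))  # 1/8
print(bits_to_fraction("11"))   # 3/4
```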

As an additional suggestion, it would be very helpful if the text could include some examples of UUIDs and their min/max implied timestamps, to serve as test cases in unit tests.
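As a rough idea of what such test data could look like, the sketch below computes the smallest and largest nanosecond values implied by a given sub-second field value. It is only an illustration: implied_ns_range is a hypothetical helper, and it assumes the encoder truncates, i.e. field = ns * 2**bits // 10**9, which the draft text quoted above does not state.

```python
def implied_ns_range(field: int, bits: int) -> tuple[int, int]:
    """Return the (min, max) nanosecond values that encode to `field`.

    Assumes a truncating encoder: field = ns * 2**bits // 10**9.
    """
    scale = 10**9
    # -(-a // b) is ceiling division on integers.
    lo = -(-field * scale // 2**bits)
    # One nanosecond before the first value that maps to field + 1.
    hi = -(-(field + 1) * scale // 2**bits) - 1
    return lo, hi


print(implied_ns_range(0, 12))     # (0, 244140)
print(implied_ns_range(2048, 12))  # (500000000, 500244140)
print(implied_ns_range(4095, 12))  # (999755860, 999999999)
```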

oittaa · 2022-01-01T19:27:50Z

I think it might be a good idea to include code examples, in at least a few of the most common programming languages, showing how to encode and decode those timestamps. Basically every programming language I know expresses high-precision timestamps as integers. For example: https://docs.python.org/3/library/time.html#time.time_ns

oittaa · 2022-01-01T22:18:47Z

Python code example

# Enough to represent nanoseconds from time.time_ns()
SUBSEC_BITS = 30
SUBSEC_DECIMAL_DIGITS = 9


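def subsec_encode(
    value: int,
    subsec_bits: int = SUBSEC_BITS,
    subsec_decimal_digits: int = SUBSEC_DECIMAL_DIGITS,
) -> int:
    # Scale the decimal fraction value / 10**digits to a binary fraction
    # with subsec_bits bits, rounding down.
    return value * 2 ** subsec_bits // 10 ** subsec_decimal_digits


def subsec_decode(
    value: int,
    subsec_bits: int = SUBSEC_BITS,
    subsec_decimal_digits: int = SUBSEC_DECIMAL_DIGITS,
) -> int:
    # Inverse of subsec_encode(): -(-a // b) is ceiling division, which
    # compensates for the rounding down done during encoding.
    return -(-value * 10 ** subsec_decimal_digits // 2 ** subsec_bits)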


def test_millisecond():
    print("Testing millisecond conversions.")
    for i in range(10 ** 3):
        if i % 10 ** 2 == 0:
            print(f"{i=} ...")
        assert i == subsec_decode(subsec_encode(i, 10, 3), 10, 3)


def test_microsecond():
    print("Testing microsecond conversions.")
    for i in range(10 ** 6):
        if i % 10 ** 5 == 0:
            print(f"{i=} ...")
        assert i == subsec_decode(subsec_encode(i, 20, 6), 20, 6)


def test_nanosecond():
    print("Testing nanosecond conversions.")
    for i in range(10 ** 9):
        if i % 10 ** 6 == 0:
            print(f"{i=} ...")
        assert i == subsec_decode(subsec_encode(i, 30, 9), 30, 9)


def main():
    import time

    timestamp = time.time_ns()
    unixts, ns = divmod(timestamp, 10 ** SUBSEC_DECIMAL_DIGITS)
    subsec = subsec_encode(ns)
    subsec_to_ns = subsec_decode(subsec)
    print(f"{timestamp=}")
    print(f"{unixts=}")
    print(f"{ns=}")
    print(f"{subsec=}")
    print(f"{subsec_to_ns=}")
    test_millisecond()
    test_microsecond()
    test_nanosecond()


if __name__ == "__main__":
    main()

Output is something like:

timestamp=1641075060289465000
unixts=1641075060
ns=289465000
subsec=310810677
subsec_to_ns=289465000
Testing millisecond conversions.
i=0 ...

The main() and test_...() functions are only here to show that it actually works; the actual documentation would probably only need subsec_encode() and subsec_decode(). The encoding and decoding should work correctly even if you change SUBSEC_BITS to something like the above-mentioned 12 (together with SUBSEC_DECIMAL_DIGITS = 3) for millisecond precision.
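As a quick check of that claim, the sketch below repeats the two conversion functions with explicit parameters and tests the round trip for a few bit widths. The round_trips helper is hypothetical and added only for illustration. The round trip holds whenever 2**bits >= 10**digits, and it cannot hold with fewer bits because there are then fewer field values than decimal values.

```python
def subsec_encode(value: int, bits: int, digits: int) -> int:
    # Decimal fraction -> binary fraction, rounding down.
    return value * 2**bits // 10**digits


def subsec_decode(value: int, bits: int, digits: int) -> int:
    # Binary fraction -> decimal fraction, rounding up (ceiling division).
    return -(-value * 10**digits // 2**bits)


def round_trips(bits: int, digits: int) -> bool:
    """True if every decimal value survives encode followed by decode."""
    return all(
        v == subsec_decode(subsec_encode(v, bits, digits), bits, digits)
        for v in range(10**digits)
    )


print(round_trips(12, 3))  # True: 4096 >= 1000
print(round_trips(10, 3))  # True: 1024 >= 1000
print(round_trips(9, 3))   # False: 512 < 1000
```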

fabiolimace · 2022-02-05T18:52:39Z

Great code example, @oittaa!

kyzer-davis added the Draft 03 IETF Draft 03 Work label Jan 31, 2022

kyzer-davis added the UUIDv7 All things UUIDv7 related label Feb 7, 2022

kyzer-davis added UUIDv8 All things UUIDv8 related and removed UUIDv7 All things UUIDv7 related labels Feb 23, 2022

kyzer-davis mentioned this issue Feb 23, 2022

Draft 03 PR #58

Merged

kyzer-davis linked a pull request Feb 23, 2022 that will close this issue

Draft 03 PR #58

Merged

kyzer-davis closed this as completed in #58 Mar 1, 2022
