# uuid - Universally Unique Identifiers

Purpose: The uuid module implements Universally Unique Identifiers as described in RFC 4122.

RFC 4122 defines a system for creating universally unique identifiers for resources in a way that does not require a central registrar. UUID values are 128 bits long and, as the reference guide says, “can guarantee uniqueness across space and time.” They are useful for generating identifiers for documents, hosts, application clients, and other situations where a unique value is necessary. The RFC is specifically focused on creating a Uniform Resource Name namespace and covers three main algorithms:

* Using IEEE 802 MAC addresses as a source of uniqueness
* Using pseudo-random numbers
* Using well-known strings combined with cryptographic hashing

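For the MAC-based algorithm, the seed value is combined with the system clock and a clock sequence value used to maintain uniqueness in case the clock is set backwards. The random and name-based algorithms fill the same fields with random data or hash output instead.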

## UUID 1 - IEEE 802 MAC Address

UUID version 1 values are computed using the MAC address of the host. The uuid module uses getnode() to retrieve the MAC value of the current system.

In [1]:
# uuid_getnode.py
import uuid

print(hex(uuid.getnode()))

0x20c9d0851d37


If a system has more than one network card, and so more than one MAC, any one of the values may be returned.
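
The integer returned by getnode() is easier to compare with the output of network tools when it is written as colon-separated octets. The following is a minimal sketch; format_mac() is a helper name invented for this illustration and is not part of the uuid module. Note that, according to the library documentation, getnode() falls back to a random 48-bit number with the multicast bit set when no hardware address can be found, so the value is not guaranteed to be a real MAC.

```python
import uuid


def format_mac(node):
    """Render a 48-bit node value as six colon-separated hex octets."""
    raw = node.to_bytes(6, byteorder='big')
    return ':'.join('{:02x}'.format(octet) for octet in raw)


# The node value shown in the sample output above.
print(format_mac(0x20c9d0851d37))

# The value reported for the current host; it varies between machines.
print(format_mac(uuid.getnode()))
```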

To generate a UUID for a host, identified by its MAC address, use the uuid1() function. The node identifier argument is optional; omit it to use the value returned by getnode().

The components of the UUID object returned can be accessed through read-only instance attributes. Some attributes, such as hex, int, and urn, are different representations of the UUID value.

In [2]:
# uuid_uuid1.py
import uuid

u = uuid.uuid1()

print(u)
print(type(u))
print('bytes   :', repr(u.bytes))
print('hex     :', u.hex)
print('int     :', u.int)
print('urn     :', u.urn)
print('variant :', u.variant)
print('version :', u.version)
print('fields  :', u.fields)
print('  time_low            : ', u.time_low)
print('  time_mid            : ', u.time_mid)
print('  time_hi_version     : ', u.time_hi_version)
print('  clock_seq_hi_variant: ', u.clock_seq_hi_variant)
print('  clock_seq_low       : ', u.clock_seq_low)
print('  node                : ', u.node)
print('  time                : ', u.time)
print('  clock_seq           : ', u.clock_seq)

2702fc8c-f7d9-11e6-9ce4-20c9d0851d37
<class 'uuid.UUID'>
bytes   : b"'\x02\xfc\x8c\xf7\xd9\x11\xe6\x9c\xe4 \xc9\xd0\x85\x1d7"
hex     : 2702fc8cf7d911e69ce420c9d0851d37
int     : 51855398765196879779474783405932354871
urn     : urn:uuid:2702fc8c-f7d9-11e6-9ce4-20c9d0851d37
variant : specified in RFC 4122
version : 1
fields  : (654507148, 63449, 4582, 156, 228, 36051158900023)
  time_low            :  654507148
  time_mid            :  63449
  time_hi_version     :  4582
  clock_seq_hi_variant:  156
  clock_seq_low       :  228
  node                :  36051158900023
  time                :  137069350715849868
  clock_seq           :  7396
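
The time attribute of a version 1 UUID counts 100-nanosecond intervals since 15 October 1582, the start of the Gregorian calendar, so it can be turned back into a date. The sketch below assumes the timestamp is in UTC, as RFC 4122 describes; uuid1_datetime() is a hypothetical helper written for this example, not a function provided by the module.

```python
import datetime
import uuid

# Start of the Gregorian calendar, the epoch used by version 1 timestamps.
GREGORIAN_EPOCH = datetime.datetime(
    1582, 10, 15, tzinfo=datetime.timezone.utc)


def uuid1_datetime(u):
    """Return the UTC datetime encoded in a version 1 UUID."""
    if u.version != 1:
        raise ValueError('not a version 1 UUID: {}'.format(u))
    # u.time counts 100 ns ticks; timedelta resolves microseconds,
    # so the last decimal digit of precision is dropped.
    return GREGORIAN_EPOCH + datetime.timedelta(microseconds=u.time // 10)


# The UUID printed in the sample output above.
sample = uuid.UUID('2702fc8c-f7d9-11e6-9ce4-20c9d0851d37')
print(uuid1_datetime(sample))
```

Applied to the sample value, the result falls on 21 February 2017, which is when the example output was generated.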


Because of the time component, each call to uuid1() returns a new value.

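In this output, the time component (at the beginning of the string) changes with each call. The clock sequence (the fourth group) also differs between the values in this run, while the node identifier at the end stays the same.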

In [3]:
# uuid_uuid1_repeat.py
import uuid

for i in range(3):
    print(uuid.uuid1())

270a8850-f7d9-11e6-889e-20c9d0851d37
270b0ec6-f7d9-11e6-b786-20c9d0851d37
270b43d0-f7d9-11e6-8e20-20c9d0851d37


Because each computer has a different MAC address, running the sample program on different systems will produce entirely different values. This example passes explicit node IDs to simulate running on different hosts.

In addition to a different time value, the node identifier at the end of the UUID also changes.

In [4]:
# uuid_uuid1_othermac.py
import uuid

for node in [0x1ec200d9e0, 0x1e5274040e]:
    print(uuid.uuid1(node), hex(node))

270f25ae-f7d9-11e6-8875-001ec200d9e0 0x1ec200d9e0
270f38e6-f7d9-11e6-a4c9-001e5274040e 0x1e5274040e


## UUID 3 and 5 - Name-Based Values

It is also useful in some contexts to create UUID values from names instead of random or time-based values. Versions 3 and 5 of the UUID specification use cryptographic hash values (MD5 or SHA-1, respectively) to combine namespace-specific seed values with names. There are several well-known namespaces, identified by pre-defined UUID values, for working with DNS, URLs, ISO OIDs, and X.500 Distinguished Names. New application-specific namespaces can be defined by generating and saving UUID values.

To create a UUID from a DNS name, pass uuid.NAMESPACE_DNS as the namespace argument to uuid3() or uuid5():

In [5]:
# uuid_uuid3_uuid5.py
import uuid

hostnames = ['google.com', 'baidu.com']

for name in hostnames:
    print(name)
    print('  MD5   :', uuid.uuid3(uuid.NAMESPACE_DNS, name))
    print('  SHA-1 :', uuid.uuid5(uuid.NAMESPACE_DNS, name))
    print()

google.com
  MD5   : 9a74c83e-2c09-3513-a74b-91d679be82b8
  SHA-1 : 64ee70a4-8cc1-5d25-abf2-dea6c79a09c8

baidu.com
  MD5   : dde57628-1ca0-3389-9f0d-6b94ce706474
  SHA-1 : 6121f649-ca8e-5e6f-847d-580647b71c0c
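
To make the hashing step concrete, the sketch below recomputes a version 5 value by hand: the 16 bytes of the namespace UUID are concatenated with the UTF-8 encoded name, hashed with SHA-1, and the first 16 bytes of the digest become the UUID once the version and variant bits are set. manual_uuid5() is a name made up for this illustration; in practice uuid5() should be used directly.

```python
import hashlib
import uuid


def manual_uuid5(namespace, name):
    """Recompute a version 5 UUID by hand (illustration only)."""
    digest = hashlib.sha1(namespace.bytes + name.encode('utf-8')).digest()
    # Keep the first 16 bytes; passing version=5 makes the UUID
    # constructor overwrite the version and variant bits.
    return uuid.UUID(bytes=digest[:16], version=5)


name = 'google.com'
print(manual_uuid5(uuid.NAMESPACE_DNS, name))
print(uuid.uuid5(uuid.NAMESPACE_DNS, name))
```

Both lines print the same value. Replacing SHA-1 with MD5 and version=5 with version=3 gives the corresponding uuid3() result.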



The UUID value for a given name in a namespace is always the same, no matter when or where it is calculated.

Values for the same name in different namespaces are different.

In [6]:
# uuid_uuid3_repeat.py
import uuid

namespace_types = sorted(
    n
    for n in dir(uuid)
    if n.startswith('NAMESPACE_')
)
name = 'www.google.com'

for namespace_type in namespace_types:
    print(namespace_type)
    namespace_uuid = getattr(uuid, namespace_type)
    print(' ', uuid.uuid3(namespace_uuid, name))
    print(' ', uuid.uuid3(namespace_uuid, name))
    print()

NAMESPACE_DNS
  de87628d-5377-3ba7-b31b-cde1cc8d423f
  de87628d-5377-3ba7-b31b-cde1cc8d423f

NAMESPACE_OID
  053a4fab-f488-379b-a15f-77392e34e705
  053a4fab-f488-379b-a15f-77392e34e705

NAMESPACE_URL
  d407cf76-ed73-3579-959e-78c80e8d4579
  d407cf76-ed73-3579-959e-78c80e8d4579

NAMESPACE_X500
  d1ac503c-9b89-3f89-92a6-20f6a8bd743a
  d1ac503c-9b89-3f89-92a6-20f6a8bd743a
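
An application-specific namespace is just another UUID that is generated once and then saved. One possible approach, sketched here, is to derive it from a name in a well-known namespace so that it can be reproduced later. The domain name, the INVENTORY_NAMESPACE constant, and the item_id() helper are all hypothetical and exist only for this example.

```python
import uuid

# Hypothetical application namespace, derived once from a made-up
# domain name and then treated as a constant.
INVENTORY_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, 'inventory.example.com')


def item_id(sku):
    """Return the stable identifier for a SKU within the namespace."""
    return uuid.uuid5(INVENTORY_NAMESPACE, sku)


# The repeated SKU maps to the same UUID as its first occurrence.
for sku in ['widget-001', 'widget-002', 'widget-001']:
    print(sku, item_id(sku))
```

Because the namespace differs from NAMESPACE_DNS, the identifiers do not collide with values another program computes for the same strings in the DNS namespace.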



## UUID 4 - Random Values

Sometimes host-based and namespace-based UUID values are not “different enough.” For example, in cases where a UUID is intended to be used as a hash key, a more random sequence of values with more differentiation is desirable to avoid collisions in the hash table. Having values with fewer common digits also makes it easier to find them in log files. To add greater differentiation in UUIDs, use uuid4() to generate them using random input values.

The source of randomness depends on the Python version. Current CPython releases build the value from 16 bytes read from os.urandom(). Older releases first tried a random-UUID function from libuuid (or uuid.dll) if the library could be loaded, and otherwise fell back to os.urandom() or the random module.
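
A short sketch of uuid4() in use follows. Since the values come from random input, the output is different on every run and no sample output is shown.

```python
import uuid

# Generate a few random UUIDs; no two runs print the same values.
values = [uuid.uuid4() for _ in range(3)]

for u in values:
    print(u, 'version', u.version)
```

Unlike the uuid1() results shown earlier, these values share no common node suffix. Only the version digit at the start of the third group is fixed, and the first digit of the fourth group is limited to 8, 9, a, or b by the variant bits.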

## Working with UUID Objects

In addition to generating new UUID values, it is possible to parse strings in standard formats to create UUID objects, making it easier to handle comparisons and sorting operations.

Surrounding curly braces are removed from the input, as are dashes (-). If the string has a prefix containing urn: and/or uuid:, it is also removed. The remaining text must be a string of 32 hexadecimal digits (16 bytes), which are then interpreted as a UUID value.

In [7]:
# uuid_uuid_objects.py
import uuid


def show(msg, values):
    print(msg)
    for v in values:
        print(' ', v)
    print()

input_values = [
    'urn:uuid:f2f84497-b3bf-493a-bba9-7c68e6def80b',
    '{417a5ebb-01f7-4ed5-aeac-3d56cd5037b0}',
    '2115773a-5bf1-11dd-ab48-001ec200d9e0',
]

show('input_values', input_values)

uuids = [uuid.UUID(s) for s in input_values]
show('converted to uuids', uuids)

uuids.sort()
show('sorted', uuids)

input_values
  urn:uuid:f2f84497-b3bf-493a-bba9-7c68e6def80b
  {417a5ebb-01f7-4ed5-aeac-3d56cd5037b0}
  2115773a-5bf1-11dd-ab48-001ec200d9e0

converted to uuids
  f2f84497-b3bf-493a-bba9-7c68e6def80b
  417a5ebb-01f7-4ed5-aeac-3d56cd5037b0
  2115773a-5bf1-11dd-ab48-001ec200d9e0

sorted
  2115773a-5bf1-11dd-ab48-001ec200d9e0
  417a5ebb-01f7-4ed5-aeac-3d56cd5037b0
  f2f84497-b3bf-493a-bba9-7c68e6def80b
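
The parsing rules can be checked directly: different spellings of the same value compare equal once converted, and text that does not reduce to 32 hexadecimal digits raises ValueError. In this sketch, try_parse() is a hypothetical convenience wrapper, not part of the uuid module.

```python
import uuid

canonical = uuid.UUID('417a5ebb-01f7-4ed5-aeac-3d56cd5037b0')

variants = [
    '{417a5ebb-01f7-4ed5-aeac-3d56cd5037b0}',
    'urn:uuid:417a5ebb-01f7-4ed5-aeac-3d56cd5037b0',
    '417a5ebb01f74ed5aeac3d56cd5037b0',
]
for text in variants:
    print(uuid.UUID(text) == canonical, text)

# The same value can also be rebuilt from its other representations.
print(uuid.UUID(int=canonical.int) == canonical)
print(uuid.UUID(bytes=canonical.bytes) == canonical)


def try_parse(text):
    """Return a UUID, or None when the text is not a valid UUID string."""
    try:
        return uuid.UUID(text)
    except ValueError:
        return None


# Too few hexadecimal digits, so this input is rejected.
print(try_parse('417a5ebb-01f7-4ed5'))
```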

