## Overview
We need a system that generates IDs with the following characteristics:
- contains only numeric values
- is unique
- is sortable
- is 64 bits in length

## Approach
- Use a database's *auto_increment* feature. The problem is that a traditional RDBMS favors consistency over availability, and this scenario needs a more available system.
- Use a distributed system where each node generates UUIDs. UUIDs have a very low probability of collision and can be generated independently, without coordination between servers. Version 1 UUIDs, for example, combine a timestamp with the node's MAC address. However, UUIDs are 128 bits long, so they do not meet the 64-bit size requirement of the problem statement.
- Snowflake ID

## Snowflake ID
A Snowflake ID is 64 bits long and, like a UUID, requires no coordination between servers.  
<img src="images/snowflake_id.png" />

**Reserved bit:** 1 bit at the start set to 0.  
**Timestamp:** 41 bits holding the number of milliseconds elapsed since a chosen epoch. The Unix epoch starts on January 1st 1970; Snowflake IDs at Discord use a custom epoch starting on January 1st 2015. 41 bits last for about 69 years.  
**Datacenter and Machine IDs:** 5 bits each, giving 32 possible values for the datacenter ID and 32 for the machine ID.  
**Sequence:** 12 bits holding an incrementing counter that is reset every millisecond. 12 bits allow 4096 unique IDs per millisecond per machine.
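
The layout above can be sketched as a small generator. This is a minimal illustration, not a reference implementation: the class name `SnowflakeIdGenerator`, the helper methods, and the choice to throw when the clock moves backwards are all assumptions. It also assumes the 5/5 bit split for datacenter and machine IDs and uses the 2015 epoch mentioned above.

```java
final class SnowflakeIdGenerator {
    // Assumed layout: 1 reserved | 41 timestamp | 5 datacenter | 5 machine | 12 sequence
    static final long CUSTOM_EPOCH_MS = 1420070400000L; // 2015-01-01T00:00:00Z
    static final int SEQUENCE_BITS = 12;
    static final int MACHINE_BITS = 5;
    static final int DATACENTER_BITS = 5;
    static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;     // 4095
    static final long MAX_MACHINE = (1L << MACHINE_BITS) - 1;       // 31
    static final long MAX_DATACENTER = (1L << DATACENTER_BITS) - 1; // 31
    static final int MACHINE_SHIFT = SEQUENCE_BITS;                      // 12
    static final int DATACENTER_SHIFT = MACHINE_SHIFT + MACHINE_BITS;    // 17
    static final int TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS; // 22

    private final long datacenterId;
    private final long machineId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeIdGenerator(long datacenterId, long machineId) {
        if (datacenterId < 0 || datacenterId > MAX_DATACENTER
                || machineId < 0 || machineId > MAX_MACHINE) {
            throw new IllegalArgumentException("datacenterId and machineId must be in 0..31");
        }
        this.datacenterId = datacenterId;
        this.machineId = machineId;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            // Hypothetical policy: refuse to generate rather than risk duplicates.
            throw new IllegalStateException("Clock moved backwards");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // All 4096 values used in this millisecond: wait for the next one.
                while ((now = System.currentTimeMillis()) <= lastTimestamp) {
                    Thread.onSpinWait();
                }
            }
        } else {
            sequence = 0; // new millisecond, reset the counter
        }
        lastTimestamp = now;
        return ((now - CUSTOM_EPOCH_MS) << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (machineId << MACHINE_SHIFT)
                | sequence;
    }

    // Helpers that split an ID back into its fields.
    static long timestampOf(long id)  { return (id >>> TIMESTAMP_SHIFT) + CUSTOM_EPOCH_MS; }
    static long datacenterOf(long id) { return (id >>> DATACENTER_SHIFT) & MAX_DATACENTER; }
    static long machineOf(long id)    { return (id >>> MACHINE_SHIFT) & MAX_MACHINE; }
    static long sequenceOf(long id)   { return id & MAX_SEQUENCE; }

    public static void main(String[] args) {
        SnowflakeIdGenerator generator = new SnowflakeIdGenerator(1, 2);
        long id = generator.nextId();
        System.out.println(id + " generated at Unix ms " + timestampOf(id));
    }
}
```

Because the timestamp occupies the most significant bits, IDs from one generator sort by creation time. The 41-bit field overflows after 2^41 ms, roughly 69.7 years past the chosen epoch.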

## UUID
Though UUIDs don't fulfill the criteria mentioned, they are worth discussing. UUIDs are 128 bits in size and have an extremely low probability of duplication. They are referred to as GUIDs in Microsoft systems. UUIDs don't require coordination between nodes in a distributed system to remain unique. There are multiple versions, as documented below:

### Version 1 and 6
Both are based on the time and the MAC address. Components:
- 48 bits of MAC address.
- 60 bits of timestamp. The timestamp is the number of 100-nanosecond intervals since midnight 15 October 1582. This is divided into:
  - time_low: the low bits of the timestamp (32 bits in v1, 12 bits in v6)
  - time_mid: the middle 16 bits of the timestamp
  - time_high: the high bits of the timestamp (12 bits in v1, 32 bits in v6)
- 14 bits of clock sequence, which starts at a random number and goes up by one if time goes backwards, so that things like clock drift or leap seconds don't lead to collisions. (The field occupies 16 bits; the remaining 2 bits hold the variant.)
- 4 bits for version

UUID v6 is newer and is sortable, because it stores the timestamp bits from most to least significant. Both versions are bad in terms of privacy since they contain the MAC address.

Java doesn't provide a built-in way to generate UUID v1 or v6.

![UUID v1 and v6](./images/UUID_v1_v6.png)
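
Although the JDK cannot generate version 1 UUIDs, `java.util.UUID` can decode their fields through `timestamp()`, `clockSequence()` and `node()`. The sketch below reads them from a sample v1 value chosen for illustration and converts the 1582-based timestamp to a Unix-based `Instant`. The class name `UuidV1Fields` and the helper `instantOf` are hypothetical.

```java
import java.time.Instant;
import java.util.UUID;

final class UuidV1Fields {
    // Number of 100-ns intervals between 1582-10-15 and 1970-01-01.
    static final long GREGORIAN_TO_UNIX_OFFSET = 0x01B21DD213814000L;

    // Converts the 60-bit v1 timestamp to an Instant.
    // UUID.timestamp() throws UnsupportedOperationException for non-v1 UUIDs.
    static Instant instantOf(UUID v1) {
        long hundredNanos = v1.timestamp() - GREGORIAN_TO_UNIX_OFFSET;
        return Instant.ofEpochSecond(hundredNanos / 10_000_000L,
                (hundredNanos % 10_000_000L) * 100L);
    }

    public static void main(String[] args) {
        // Sample version 1 UUID used only for illustration.
        UUID sample = UUID.fromString("c232ab00-9414-11ec-b3c8-9f6bdeced846");
        System.out.println("version:        " + sample.version());   // 1
        System.out.println("created at:     " + instantOf(sample));  // 2022-02-22T19:22:22Z
        System.out.printf("clock sequence: %04x%n", sample.clockSequence()); // 33c8
        System.out.printf("node (MAC):     %012x%n", sample.node());         // 9f6bdeced846
    }
}
```

The node value printed at the end is the privacy concern: anyone holding the UUID can read the generating machine's MAC address from it.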

### Version 2
Very similar to v1, but not used often.

### Version 3 and 5
Are based on hashing a name within a namespace, such as a DNS name or a URL. Version 3 uses MD5 whereas version 5 uses SHA-1. These are deterministic and name based.

Java's built-in `UUID.nameUUIDFromBytes` generates a v3 UUID from a byte array, but there is no built-in way to generate UUID v5.
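
A version 5 UUID can be assembled by hand from the JDK's SHA-1 `MessageDigest`. The sketch below hashes the namespace bytes followed by the name, keeps the first 128 bits, and then overwrites the version and variant bits. The class name `UuidV5` and the method `nameBased` are hypothetical; the DNS namespace constant is the one predefined in RFC 4122.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

final class UuidV5 {
    // Predefined namespace for DNS names (RFC 4122).
    static final UUID NAMESPACE_DNS =
            UUID.fromString("6ba7b810-9dad-11d1-80b4-00c04fd430c8");

    static UUID nameBased(UUID namespace, String name) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            // Hash input is the 16 namespace bytes followed by the name bytes.
            sha1.update(ByteBuffer.allocate(16)
                    .putLong(namespace.getMostSignificantBits())
                    .putLong(namespace.getLeastSignificantBits())
                    .array());
            byte[] hash = sha1.digest(name.getBytes(StandardCharsets.UTF_8));

            // Keep only the first 16 of the 20 SHA-1 bytes.
            ByteBuffer first16 = ByteBuffer.wrap(hash, 0, 16);
            long msb = first16.getLong();
            long lsb = first16.getLong();

            msb = (msb & 0xFFFFFFFFFFFF0FFFL) | 0x0000000000005000L; // version = 5
            lsb = (lsb & 0x3FFFFFFFFFFFFFFFL) | 0x8000000000000000L; // variant = 10
            return new UUID(msb, lsb);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    public static void main(String[] args) {
        // The same namespace and name always produce the same UUID.
        System.out.println(nameBased(NAMESPACE_DNS, "example.com"));
        System.out.println(nameBased(NAMESPACE_DNS, "example.com"));
    }
}
```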

### Version 4
Is composed almost entirely of random bits. Components:
- 4 bits for version
- 2 bits for variant
- 122 bits of random data

Version 4 is usually the version meant when talking about UUIDs. It can be generated in Java as:

```java
import java.util.UUID;

// randomUUID() returns a version 4 (random) UUID.
UUID uuid = UUID.randomUUID();
System.out.println(uuid);
```

Example output:

```
27c0b626-cd0e-4225-81aa-87af6082aa6b
```


![UUID v4](./images/UUID_v4.png)

### Version 7
Designed specifically for use in distributed environments and databases. Like v1 and v6, it is time based, but it counts milliseconds from the Unix epoch instead of 100-nanosecond intervals from 1582. The other difference is that the node component is replaced with random data. Like UUID v6, it is sortable.
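
A version 7 UUID can be sketched by hand as follows. The bit layout here is an assumption taken from RFC 9562, not from the text above: a 48-bit Unix millisecond timestamp, 4 version bits, 12 random bits, 2 variant bits, and 62 random bits. The class name `UuidV7` and its methods are hypothetical.

```java
import java.security.SecureRandom;
import java.util.Random;
import java.util.UUID;

final class UuidV7 {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Builds a v7 UUID from an explicit timestamp and random source.
    static UUID create(long unixMillis, Random random) {
        long randA = random.nextLong() & 0x0FFFL;              // 12 random bits
        long randB = random.nextLong() & 0x3FFFFFFFFFFFFFFFL;  // 62 random bits
        long msb = ((unixMillis & 0xFFFFFFFFFFFFL) << 16)      // 48-bit timestamp
                | 0x7000L                                      // version = 7
                | randA;
        long lsb = 0x8000000000000000L | randB;                // variant = 10
        return new UUID(msb, lsb);
    }

    // Convenience overload using the current time and a secure random source.
    static UUID create() {
        return create(System.currentTimeMillis(), RANDOM);
    }

    // Reads the millisecond timestamp back out of a v7 UUID.
    static long unixMillisOf(UUID v7) {
        return v7.getMostSignificantBits() >>> 16;
    }

    public static void main(String[] args) {
        UUID id = create();
        System.out.println(id + " created at Unix ms " + unixMillisOf(id));
    }
}
```

Since the timestamp leads, the string forms of two v7 UUIDs created in different milliseconds sort in creation order. Values created within the same millisecond are ordered only by their random bits in this sketch.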