Add UUID module #546

Merged
merged 7 commits into from
Nov 2, 2023
Changes from 5 commits
2 changes: 2 additions & 0 deletions Changelog.md
@@ -1,6 +1,8 @@
## development

- Add your change HERE
- Add `FFaker::UUID` `.uuidv4`, `.uuidv6`, `.uuidv7`, and `.uuidv8` [@stilist]
- Deprecate `FFaker::Guid.guid` in favor of `FFaker::UUID` methods [@stilist]
- Limit FFaker::BankUS.routing_number first two digits [@professor]

# 2.23.0
1 change: 1 addition & 0 deletions lib/ffaker.rb
@@ -76,6 +76,7 @@ def self.bothify(masks)
'ua' => 'UA',
'uk' => 'UK',
'us' => 'US',
'uuid' => 'UUID',
'vn' => 'VN'
}

8 changes: 7 additions & 1 deletion lib/ffaker/guid.rb
@@ -5,8 +5,14 @@ module Guid
extend ModuleUtils
extend self

+ # Because this method uses arbitrary hexadecimal characters it is likely to
+ # generate invalid UUIDs--UUIDs must have a version (1-8) at bits 48-51,
+ # and bits 64-65 must be 0b10.
+ #
+ # @deprecated Often generates invalid UUIDs. Use {UUID} instead.
def guid
- FFaker.hexify('########-####-####-####-############')
+ warn '[guid] is deprecated. Use the UUID.uuidv4 method instead.'
+ FFaker::UUID.uuidv4
end
end
end
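
The deprecation note above says that arbitrary hex strings rarely form valid UUIDs. A minimal sketch of that check follows, assuming a hypothetical helper named `rfc_uuid?` that is not part of FFaker or this PR. If every hex digit is uniformly random, the version nibble lands in 1-8 half the time and the variant bits equal 0b10 a quarter of the time, so roughly one string in eight would pass.

```ruby
# Hypothetical helper, not part of FFaker: reports whether a hyphenated hex
# string carries a version nibble in 1..8 and the 0b10 variant bits.
def rfc_uuid?(string)
  return false unless string.match?(/\A\h{8}-\h{4}-\h{4}-\h{4}-\h{12}\z/)

  value = string.delete('-').to_i(16)
  version = (value >> 76) & 0b1111 # bits 48-51, counted from the most significant bit
  variant = (value >> 62) & 0b11   # bits 64-65
  (1..8).cover?(version) && variant == 0b10
end

rfc_uuid?('00000000-0000-4000-8000-000000000000') # version 4, variant 0b10
rfc_uuid?('ffffffff-ffff-ffff-ffff-ffffffffffff') # version 15, variant 0b11
```

The first call passes both checks; the second fails both, which is the kind of value the old `hexify` mask could produce.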
175 changes: 175 additions & 0 deletions lib/ffaker/uuid.rb
@@ -0,0 +1,175 @@
# frozen_string_literal: true

require 'date'

module FFaker
# A UUID is a 128-bit value (16 bytes), usually written as 32 hexadecimal
# digits in five hyphen-separated groups (36 characters in total):
# `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`.
#
# @note This generates lowercase strings, but UUIDs are case-insensitive.
#
# @see https://www.rfc-editor.org/rfc/rfc4122#section-4
# @see https://datatracker.ietf.org/doc/draft-ietf-uuidrev-rfc4122bis/
module UUID
extend ModuleUtils
extend self

# > UUID version 4 is meant for generating UUIDs from truly-random or
# > pseudo-random numbers.
def uuidv4
uuid = 0
# random_a
# > The first 48 bits of the layout that can be filled with random data
# > as specified in Section 6.9. Occupies bits 0 through 47 (octets 0-5).
uuid |= rand(2**48) << 80
# ver
# > The 4 bit version field as defined by Section 4.2, set to 0b0100 (4).
# > Occupies bits 48 through 51 of octet 6.
uuid |= 0b0100 << 76
# random_b
# > 12 more bits of the layout that can be filled random data as per
# > Section 6.9. Occupies bits 52 through 63 (octets 6-7).
uuid |= rand(2**12) << 64
# var
# > The 2 bit variant field as defined by Section 4.1, set to 0b10.
# > Occupies bits 64 and 65 of octet 8.
uuid |= 0b10 << 62
# random_c
# > The final 62 bits of the layout immediately following the var field
# > to be filled with random data as per Section 6.9. Occupies bits
# > 66 through 127 (octets 8-15).
uuid |= rand(2**62)

as_string(uuid)
end

# > UUID version 6 is a field-compatible version of UUIDv1 Section 5.1,
# > reordered for improved DB locality. It is expected that UUIDv6 will
# > primarily be used in contexts where UUIDv1 is used. Systems that do not
# > involve legacy UUIDv1 SHOULD use UUIDv7 instead.
def uuidv6
timestamp = rand(2**60)

uuid = 0
# time_high
# > The most significant 32 bits of the 60 bit starting timestamp.
# > Occupies bits 0 through 31 (octets 0-3).
# @note Shifts 28 bits to remove `time_mid` and `time_low`.
uuid |= (timestamp >> 28) << 96
# time_mid
# > The middle 16 bits of the 60 bit starting timestamp. Occupies bits 32
# > through 47 (octets 4-5).
# @note Shifts 12 bits to remove `time_low`.
uuid |= ((timestamp >> 12) & ((2**16) - 1)) << 80
# ver
# > The 4 bit version field as defined by Section 4.2, set to 0b0110 (6).
# > Occupies bits 48 through 51 of octet 6.
uuid |= 0b0110 << 76
# time_low
# > 12 bits that will contain the least significant 12 bits from the 60
# > bit starting timestamp. Occupies bits 52 through 63 (octets 6-7).
uuid |= (timestamp & ((2**12) - 1)) << 64
# var
# > The 2 bit variant field as defined by Section 4.1, set to 0b10.
# > Occupies bits 64 and 65 of octet 8.
uuid |= 0b10 << 62
# clk_seq
# > The 14 bits containing the clock sequence. Occupies bits 66 through
# > 79 (octets 8-9).
#
# (earlier in the document)
# > The clock sequence and node bits SHOULD be reset to a pseudo-random
# > value for each new UUIDv6 generated; however, implementations MAY
# > choose to retain the old clock sequence and MAC address behavior from
# > Section 5.1.
uuid |= rand(2**14) << 48
# node
# > 48 bit spatially unique identifier. Occupies bits 80 through 127
# > (octets 10-15).
uuid |= rand(2**48)

as_string(uuid)
end

# > UUID version 7 features a time-ordered value field derived from the
# > widely implemented and well known Unix Epoch timestamp source, the
# > number of milliseconds since midnight 1 Jan 1970 UTC, leap seconds
# > excluded. UUIDv7 generally has improved entropy characteristics over
# > UUIDv1 Section 5.1 or UUIDv6 Section 5.6.
def uuidv7
timestamp = rand(2**48)

uuid = 0
# unix_ts_ms
# > 48 bit big-endian unsigned number of Unix epoch timestamp in
# > milliseconds as per Section 6.1. Occupies bits 0 through 47 (octets
# > 0-5).
uuid |= timestamp << 80
# ver
# > The 4 bit version field as defined by Section 4.2, set to 0b0111 (7).
# > Occupies bits 48 through 51 of octet 6.
uuid |= 0b0111 << 76
# rand_a
# > 12 bits pseudo-random data to provide uniqueness as per Section 6.9
# > and/or optional constructs to guarantee additional monotonicity as
# > per Section 6.2. Occupies bits 52 through 63 (octets 6-7).
uuid |= rand(2**12) << 64
# var
# > The 2 bit variant field as defined by Section 4.1, set to 0b10.
# > Occupies bits 64 and 65 of octet 8.
uuid |= 0b10 << 62
# rand_b
# > The final 62 bits of pseudo-random data to provide uniqueness as per
# > Section 6.9 and/or an optional counter to guarantee additional
# > monotonicity as per Section 6.2. Occupies bits 66 through 127 (octets
# > 8-15).
uuid |= rand(2**62)

as_string(uuid)
end

# > UUID version 8 provides an RFC-compatible format for experimental or
# > vendor-specific use cases. The only requirement is that the variant and
# > version bits MUST be set as defined in Section 4.1 and Section 4.2.
# > UUIDv8's uniqueness will be implementation-specific and MUST NOT be
# > assumed.
# >
# > [...] To be clear: UUIDv8 is not a replacement for UUIDv4 Section 5.4
# > where all 122 extra bits are filled with random data.
def uuidv8
uuid = 0
# custom_a
# > The first 48 bits of the layout that can be filled as an
# > implementation sees fit. Occupies bits 0 through 47 (octets 0-5).
uuid |= rand(2**48) << 80
# ver
# > The 4 bit version field as defined by Section 4.2, set to 0b1000 (8).
# > Occupies bits 48 through 51 of octet 6.
uuid |= 0b1000 << 76
# custom_b
# > 12 more bits of the layout that can be filled as an implementation
# > sees fit. Occupies bits 52 through 63 (octets 6-7).
uuid |= rand(2**12) << 64
# var
# > The 2 bit variant field as defined by Section 4.1, set to 0b10.
# > Occupies bits 64 and 65 of octet 8.
uuid |= 0b10 << 62
# custom_c
# > The final 62 bits of the layout immediately following the var field
# > to be filled as an implementation sees fit. Occupies bits 66 through
# > 127 (octets 8-15).
uuid |= rand(2**62)

as_string(uuid)
end

private

def as_string(uuid)
uuid.to_s(16)
.rjust(32, '0')
.gsub(/(.{8})(.{4})(.{4})(.{4})(.{12})/, '\1-\2-\3-\4-\5')
end
end
end
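
The `uuidv7` method above fills `unix_ts_ms` with a random 48-bit number, which suits fake data. For comparison, here is a minimal sketch of the same layout driven by a real `Time`, plus the reverse step of reading the timestamp back. The names `uuidv7_at` and `uuidv7_timestamp_ms` are hypothetical and not part of FFaker; the sketch uses Ruby's stdlib `Random` rather than FFaker's seeded generator.

```ruby
# Hypothetical sketch, not part of FFaker: builds a UUIDv7-shaped string from
# a given Time, using the same shifts as the module above.
def uuidv7_at(time, random: Random.new)
  # Milliseconds since the Unix epoch, truncated to 48 bits.
  unix_ts_ms = (time.to_r * 1000).floor & ((2**48) - 1)

  uuid = 0
  uuid |= unix_ts_ms << 80         # unix_ts_ms: bits 0-47
  uuid |= 0b0111 << 76             # ver: bits 48-51
  uuid |= random.rand(2**12) << 64 # rand_a: bits 52-63
  uuid |= 0b10 << 62               # var: bits 64-65
  uuid |= random.rand(2**62)       # rand_b: bits 66-127

  uuid.to_s(16)
      .rjust(32, '0')
      .gsub(/(.{8})(.{4})(.{4})(.{4})(.{12})/, '\1-\2-\3-\4-\5')
end

# Recovers the millisecond timestamp from the first 48 bits.
def uuidv7_timestamp_ms(string)
  string.delete('-').to_i(16) >> 80
end
```

Because the timestamp occupies the most significant bits, strings built this way from increasing times sort in time order, which is the property the quoted draft describes for UUIDv7.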
2 changes: 1 addition & 1 deletion test/test_guid.rb
@@ -8,7 +8,7 @@ class TestGuid < Test::Unit::TestCase
assert_methods_are_deterministic(FFaker::Guid, :guid)

def test_guid
- assert_match(/[A-F0-9]{8}-[A-F0-9]{4}-[A-F0-9]{4}-[A-F0-9]{4}-[A-F0-9]{12}/,
+ assert_match(/\A[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}\z/,
FFaker::Guid.guid)
end
end
73 changes: 73 additions & 0 deletions test/test_uuid.rb
@@ -0,0 +1,73 @@
# frozen_string_literal: true

require_relative 'helper'

class TestUUID < Test::Unit::TestCase
include DeterministicHelper

assert_methods_are_deterministic(
FFaker::UUID,
:uuidv4, :uuidv6, :uuidv7, :uuidv8
)

def setup
@tester = FFaker::UUID
end

# @see https://stackoverflow.com/a/38191104
def test_uuidv4
raw_uuid = @tester.uuidv4
assert_format(raw_uuid)

uuid = uuid_to_integer(raw_uuid)
assert_version(uuid, 0b0100)
assert_variant(uuid, 0b10)
end

def test_uuidv6
raw_uuid = @tester.uuidv6
assert_format(raw_uuid)

uuid = uuid_to_integer(raw_uuid)
assert_version(uuid, 0b0110)
assert_variant(uuid, 0b10)
end

def test_uuidv7
raw_uuid = @tester.uuidv7
assert_format(raw_uuid)

uuid = uuid_to_integer(raw_uuid)
assert_version(uuid, 0b0111)
assert_variant(uuid, 0b10)
end

def test_uuidv8
raw_uuid = @tester.uuidv8
assert_format(raw_uuid)

uuid = uuid_to_integer(raw_uuid)
assert_version(uuid, 0b1000)
assert_variant(uuid, 0b10)
end

private

# Matches structure of all UUID versions.
def assert_format(uuid)
assert_match(/\A[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[89ab][0-9a-f]{3}-[0-9a-f]{12}\z/,
uuid)
end

def assert_version(uuid, version)
assert_equal(version, (uuid >> 76) & 0b1111)
end

def assert_variant(uuid, variant)
assert_equal(variant, (uuid >> 62) & 0b11)
end

def uuid_to_integer(uuid)
uuid.delete('-').to_i(16)
end
end
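
The shift amounts 76 and 62 in `assert_version` and `assert_variant` follow from the field positions quoted in the module comments: a field that starts at big-endian bit `first_bit` and is `width` bits wide sits `128 - (first_bit + width)` bits above the least significant bit. A minimal sketch of that arithmetic, assuming a hypothetical helper `uuid_field` that is not part of this PR:

```ruby
# Hypothetical helper, not part of the PR: extracts `width` bits starting at
# big-endian bit offset `first_bit` from a 128-bit integer.
def uuid_field(value, first_bit, width)
  shift = 128 - (first_bit + width)
  (value >> shift) & ((1 << width) - 1)
end

uuid = '00000000-0000-4000-8000-000000000000'.delete('-').to_i(16)
version = uuid_field(uuid, 48, 4) # shift is 128 - 52 = 76
variant = uuid_field(uuid, 64, 2) # shift is 128 - 66 = 62
```

For the sample string, `version` is 4 and `variant` is 0b10, matching what the test helpers compute with their hard-coded shifts.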