Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

uuid5 #28761

Merged
merged 6 commits into from Sep 26, 2018

Conversation

5 participants
@matbesancon
Copy link
Contributor

commented Aug 19, 2018

Create UUIDs version 5 using the SHA1 hashing of another UUID (namespace) and string (name).
Special thanks to @ScottPJones for helping out with the last bits :trollface:

@iamed2

This comment has been minimized.

Copy link
Contributor

commented Aug 20, 2018

I suggest including the provided namespaces from the RFC as constants, and testing with UUID5s generated by, e.g., Python (which also provides the namespaces from the RFC as constants).

@matbesancon

This comment has been minimized.

Copy link
Contributor Author

commented Aug 20, 2018

I tried comparing the result of the Python lib with the current implementation and of course it's different, I'll look into it

@matbesancon

This comment has been minimized.

Copy link
Contributor Author

commented Aug 20, 2018

@iamed2 following your advice, Python comparisons are now integrated in the tests and should pass for uuid5. PyCall in tests seemed like a pain, so I just commented the reproduction steps in Python.

I also changed the input to take the actual bytes of both the namespace and name (which was initially different from the Python implementation)

function uuid5(ns::UUID, name::String)
nsbytes = zeros(UInt8, 16)
nsv = ns.value
for idx in Base.OneTo(16)

This comment has been minimized.

Copy link
@matbesancon

matbesancon Aug 20, 2018

Author Contributor

I have a feeling this loop is not as efficient as it can get, if some bit wizards have suggestions

@matbesancon

This comment has been minimized.

Copy link
Contributor Author

commented Aug 20, 2018

The results produced on the standard namespaces are the same as the rust ones:

extern crate uuid;
use uuid::Uuid;

fn main() {
    let refs = vec![&uuid::NAMESPACE_DNS, &uuid::NAMESPACE_OID, &uuid::NAMESPACE_URL, &uuid::NAMESPACE_X500];
    let pairs = refs
        .iter()
        .map(|u|
            (u, Uuid::new_v5(u, "julia"))
        );
    for (u1,u2) in pairs {
        println!("{0} - {1}", u1, u2);
    }
}
@iamed2

This comment has been minimized.

Copy link
Contributor

commented Aug 20, 2018

Excellent :)

@matbesancon

This comment has been minimized.

Copy link
Contributor Author

commented Aug 23, 2018

ok "we" have one tiny problem, uuidgen just got v5 on Ubuntu18.04 (not sure when it was integrated) and yields a slightly different result at the 17th byte for some results, not all. The results in the lib are still consistent with Python and Rust, just the OS uuidgen yields something different.

$ uuidgen --sha1
$ uuidgen --sha1 --namespace @dns --name julia
00ca23ad-40ef-500c-8910-157de3950d07
$ uuidgen --sha1 --namespace @oid --name julia
b7bf72b0-fb4e-538b-952a-3be296f07f6d
$ uuidgen --sha1 --namespace @url --name julia
997cd5be-4705-5439-9fe6-d77b18d612e5
$ uuidgen --sha1 --namespace @x500 --name julia
993c6684-82e7-5cdb-9d46-9bff0362e6a9

Python (same as Julia and Rust)

In [1]: import uuid
In [2]: uuid.uuid5(uuid.NAMESPACE_DNS,"julia")
Out[2]: UUID('00ca23ad-40ef-500c-a910-157de3950d07')
In [3]: uuid.uuid5(uuid.NAMESPACE_OID,"julia")
Out[3]: UUID('b7bf72b0-fb4e-538b-952a-3be296f07f6d')
In [4]: uuid.uuid5(uuid.NAMESPACE_URL,"julia")
Out[4]: UUID('997cd5be-4705-5439-9fe6-d77b18d612e5')
In [5]: uuid.uuid5(uuid.NAMESPACE_X500,"julia")
Out[5]: UUID('993c6684-82e7-5cdb-bd46-9bff0362e6a9')

Rust source: https://gist.github.com/matbesancon/287e8d19d63ea2a14fc6459372753fe1

@StefanKarpinski

This comment has been minimized.

Copy link
Member

commented Aug 24, 2018

Sigh. It seems like matching Python and Rust is the way to go.

@iamed2

This comment has been minimized.

Copy link
Contributor

commented Aug 24, 2018

I believe this indicates the variant, and I think you can ignore that digit.

@iamed2

This comment has been minimized.

Copy link
Contributor

commented Aug 24, 2018

(they shouldn't be different though, as both Python and Linux' uuid-gen claim to follow RFC-4122)

@matbesancon

This comment has been minimized.

Copy link
Contributor Author

commented Aug 25, 2018

@iamed2

This comment has been minimized.

Copy link
Contributor

commented Aug 25, 2018

Ohhh, I see:

From the RFC:

The following table lists the contents of the variant field, where the letter "x" indicates a "don't-care" value.

1     0     x    The variant specified in this document.

So in that hex digit b and 9 (1011 and 1001) should be considered equivalent, same with a and 8 (1010 and 1000).

@matbesancon

This comment has been minimized.

Copy link
Contributor Author

commented Aug 25, 2018

thanks for clarifying this, I was trying to find where to file a bug for uuidgen

@iamed2

This comment has been minimized.

Copy link
Contributor

commented Aug 26, 2018

No problem, stuff like this eats at me until I figure it out 😄

@matbesancon

This comment has been minimized.

Copy link
Contributor Author

commented Sep 26, 2018

should this get the 1.1 flag, since this is extending the exposed API of UUID?

@fredrikekre fredrikekre added this to the 1.1 milestone Sep 26, 2018

@StefanKarpinski

This comment has been minimized.

Copy link
Member

commented Sep 26, 2018

Yes, that's correct. It's a feature so it goes in a minor release.

@StefanKarpinski StefanKarpinski merged commit 8eca2c2 into JuliaLang:master Sep 26, 2018

4 checks passed

ci/circleci: build-i686 Your tests passed on CircleCI!
Details
ci/circleci: build-x86_64 Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
julia freebsd ci Build done
Details

ararslan added a commit that referenced this pull request Sep 26, 2018

Add SHA to UUIDs dependencies in Project.toml
A dependency on SHA was added to UUIDs in #28761 but the Project.toml
was not updated to reflect that, resulting in warnings while building
Julia.

ararslan added a commit that referenced this pull request Sep 26, 2018

Add SHA to UUIDs dependencies in Project.toml (#29375)
A dependency on SHA was added to UUIDs in #28761 but the Project.toml
was not updated to reflect that, resulting in warnings while building
Julia.

fredrikekre added a commit that referenced this pull request Nov 30, 2018

fredrikekre added a commit that referenced this pull request Dec 1, 2018

fredrikekre added a commit that referenced this pull request Dec 1, 2018

fredrikekre added a commit that referenced this pull request Dec 3, 2018

fredrikekre added a commit that referenced this pull request Dec 4, 2018

fredrikekre added a commit that referenced this pull request Dec 4, 2018

fredrikekre added a commit that referenced this pull request Dec 5, 2018

Addition of NEWS and compat admonitions for important
changes between Julia 1.0 and 1.1, including:

- Custom .css-style for compat admonitions.

- Information about compat annotations to CONTRIBUTING.md.

- NEWS.md entry for PRs #30090, #30035, #30022, #29978,
  #29969, #29858, #29845, #29754, #29638, #29636, #29615,
  #29600, #29506, #29469, #29316, #29259, #29178, #29153,
  #29033, #28902, #28761, #28745, #28708, #28696, #29997,
  #28790, #29092, #29108, #29782

- Compat annotation for PRs #30090, #30013, #29978,
  #29890, #29858, #29827, #29754, #29679, #29636, #29623,
  #29600, #29440, #29316, #29259, #29178, #29157, #29153,
  #29033, #28902, #28878, #28761, #28708, #28156, #29733,
  #29670, #29997, #28790, #29092, #29108, #29782, #25278

- Documentation for broadcasting CartesianIndices (#30230).
- Documentation for Base.julia_cmd().
- Documentation for colon constructor of CartesianIndices (#29440).
- Documentation for ^(::Matrix, ::Number) and ^(::Number, ::Matrix).

- Run NEWS-update.jl.

Co-authored-by: Morten Piibeleht <morten.piibeleht@gmail.com>
Co-authored-by: Fredrik Ekre <ekrefredrik@gmail.com>

fredrikekre added a commit that referenced this pull request Dec 5, 2018

Addition of NEWS and compat admonitions for important
changes between Julia 1.0 and 1.1, including:

- Custom .css-style for compat admonitions.

- Information about compat annotations to CONTRIBUTING.md.

- NEWS.md entry for PRs #30090, #30035, #30022, #29978,
  #29969, #29858, #29845, #29754, #29638, #29636, #29615,
  #29600, #29506, #29469, #29316, #29259, #29178, #29153,
  #29033, #28902, #28761, #28745, #28708, #28696, #29997,
  #28790, #29092, #29108, #29782

- Compat annotation for PRs #30090, #30013, #29978,
  #29890, #29858, #29827, #29754, #29679, #29636, #29623,
  #29600, #29440, #29316, #29259, #29178, #29157, #29153,
  #29033, #28902, #28878, #28761, #28708, #28156, #29733,
  #29670, #29997, #28790, #29092, #29108, #29782, #25278

- Documentation for broadcasting CartesianIndices (#30230).
- Documentation for Base.julia_cmd().
- Documentation for colon constructor of CartesianIndices (#29440).
- Documentation for ^(::Matrix, ::Number) and ^(::Number, ::Matrix).

- Run NEWS-update.jl.

Co-authored-by: Morten Piibeleht <morten.piibeleht@gmail.com>
Co-authored-by: Fredrik Ekre <ekrefredrik@gmail.com>

fredrikekre added a commit that referenced this pull request Dec 5, 2018

Compat admonitions and NEWS for Julia 1.1 (#30230)
Addition of NEWS and compat admonitions for important changes between Julia 1.0 and 1.1, including:

- Custom .css-style for compat admonitions.

- Information about compat annotations to CONTRIBUTING.md.

- NEWS.md entry for PRs #30090, #30035, #30022, #29978,
  #29969, #29858, #29845, #29754, #29638, #29636, #29615,
  #29600, #29506, #29469, #29316, #29259, #29178, #29153,
  #29033, #28902, #28761, #28745, #28708, #28696, #29997,
  #28790, #29092, #29108, #29782

- Compat annotation for PRs #30090, #30013, #29978,
  #29890, #29858, #29827, #29754, #29679, #29636, #29623,
  #29600, #29440, #29316, #29259, #29178, #29157, #29153,
  #29033, #28902, #28878, #28761, #28708, #28156, #29733,
  #29670, #29997, #28790, #29092, #29108, #29782, #25278

- Documentation for broadcasting CartesianIndices (#30230).
- Documentation for Base.julia_cmd().
- Documentation for colon constructor of CartesianIndices (#29440).
- Documentation for ^(::Matrix, ::Number) and ^(::Number, ::Matrix).

- Run NEWS-update.jl.


Co-authored-by: Morten Piibeleht <morten.piibeleht@gmail.com>
Co-authored-by: Fredrik Ekre <ekrefredrik@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.