Adding header tag formats to generate UUIDS #269

Closed
n3npq opened this issue Jul 20, 2017 · 5 comments

@n3npq
Contributor

n3npq commented Jul 20, 2017

UUIDs provide a common format for identification and database retrieval.

The attached patch adds header tag formats to RPM queries.

  • UUIDv1 timestamps (for events like build/install times):
    $ ./rpm -q --qf '%{buildtime:uuidv1}\n' bash
    c60fd500-bc0a-11e7-804e-003048b801de
  • UUIDv3 namespace identifiers based on MD5 (like package/header digests):
    $ ./rpm -q --qf '%{sigmd5:uuidv3}\n' bash
    c24fd63e-14c4-32d1-84f5-168a1f2908db
  • UUIDv4 random nonces (overkill because random, but added for completeness):
    $ ./rpm -q --qf '%{sigmd5:uuidv4}\n' bash
    b7a15d96-c4cd-4e41-8598-6463375cb39f
  • UUIDv5 namespace identifiers based on SHA1 (like package/header digests):
    $ ./rpm -q --qf '%{sigmd5:uuidv5}\n' bash
    9292a557-f445-5f36-9b79-8e79a6efaaa2
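
To illustrate how a build or install time can be carried inside a UUIDv1, here is a minimal Python sketch. It is not the patch's code: the helper names `uuid1_from_unix_time` and `unix_time_from_uuid1` are hypothetical, and the zero defaults for `node` and `clock_seq` are placeholders. It only relies on the standard UUIDv1 layout, where the timestamp counts 100 ns ticks since 1582-10-15.

```python
import uuid

# Offset between the UUID epoch (1582-10-15) and the Unix epoch, in 100 ns ticks.
UUID_EPOCH_OFFSET = 0x01B21DD213814000
TICKS_PER_SECOND = 10_000_000


def uuid1_from_unix_time(seconds, node=0, clock_seq=0):
    """Build a version 1 UUID whose timestamp field encodes `seconds`.

    `node` and `clock_seq` default to 0 here purely for illustration.
    """
    ticks = seconds * TICKS_PER_SECOND + UUID_EPOCH_OFFSET
    time_low = ticks & 0xFFFFFFFF
    time_mid = (ticks >> 32) & 0xFFFF
    time_hi_version = ((ticks >> 48) & 0x0FFF) | 0x1000       # version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80   # RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


def unix_time_from_uuid1(u):
    """Recover whole Unix seconds from the timestamp field of a UUIDv1."""
    return (u.time - UUID_EPOCH_OFFSET) // TICKS_PER_SECOND


if __name__ == "__main__":
    u = uuid1_from_unix_time(1500000000)
    print(u, unix_time_from_uuid1(u))
```

With only one-second tag values as input, the sub-second bits of the timestamp are always zero, which is why finer-grained timestamps would improve the granularity.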

The UUIDv3/UUIDv5 namespaces can be configured through optional macros (defaults shown below):

    %_uuid_auth    http://rpm.org
    %_uuid_path    /packages

For reference, the actual text used in a namespace UUID for, say, Sha1header looks like:

    http://rpm.org/packages/Sha1header/XXXXXXXXXXXXXXXXXXXXXXXX

This is essentially a prefix built from %_uuid_auth and %_uuid_path (to make the namespace unique), followed by a tag name and a tag value. The string is then digested with MD5 or SHA1 and encoded as a UUIDv3 or UUIDv5, respectively.
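
The construction described above can be sketched in Python as follows. This is an illustration under assumptions, not the patch itself: the function names `namespace_name` and `tag_uuid` are hypothetical, and the sketch assumes the string is hashed under the RFC 4122 URL namespace (`uuid.NAMESPACE_URL`), which the patch may or may not do. Its output is therefore not expected to match the rpm examples above.

```python
import uuid

# Defaults quoted above for the %_uuid_auth and %_uuid_path macros.
UUID_AUTH = "http://rpm.org"
UUID_PATH = "/packages"


def namespace_name(tag, value, auth=UUID_AUTH, path=UUID_PATH):
    """Join prefix, tag name, and tag value into the text that gets digested."""
    return f"{auth}{path}/{tag}/{value}"


def tag_uuid(tag, value, version=5):
    """Derive a name-based UUID (v3 = MD5, v5 = SHA1) from a tag value.

    ASSUMPTION: the RFC 4122 URL namespace is used; the real patch may
    use a different namespace UUID or digest the string directly.
    """
    name = namespace_name(tag, value)
    if version == 3:
        return uuid.uuid3(uuid.NAMESPACE_URL, name)
    if version == 5:
        return uuid.uuid5(uuid.NAMESPACE_URL, name)
    raise ValueError("only name-based versions 3 and 5 are supported")


if __name__ == "__main__":
    digest = "0123456789abcdef"  # placeholder tag value, not a real digest
    print(namespace_name("Sha1header", digest))
    print(tag_uuid("Sha1header", digest, version=3))
    print(tag_uuid("Sha1header", digest, version=5))
```

Because the result depends only on the input string, the same tag value always maps to the same UUID, which is what makes it usable as a lookup key.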

UUIDs will be the starting point for an RPM+LMDB implementation used as header retrieval keys.

You will need the acinclude.m4 file from issue #257 to use the patch below. Adding the high-resolution timestamps from issue #197 would improve the UUIDv1 granularity (not implemented).

@n3npq
Contributor Author

n3npq commented Jul 20, 2017

rpm_uuid.patch.gz

@n3npq
Contributor Author

n3npq commented Aug 6, 2017

The rpm_uuid.patch is now applied to an RPM tree at rpm5@c8c72fb

@Conan-Kudo
Member

From @n3npq:

UUID's will be the starting point for a RPM+LMDB implementation used as header retrieval keys.

It didn't seem like you used this with the LMDB implementation (ed9de19), though you mention it as a starting point. I'm guessing you're not currently using it? Or did I miss something?

@n3npq
Contributor Author

n3npq commented Aug 17, 2017

Changing indices to be octets (rather than integers) is a prerequisite to using UUIDs as a join key.

There are many problems (endianness, exposure of hdrNum/tagNum in the RPM API) that need to be solved to use a UUID as an octet string for LMDB (and for BDB). See other issues for work-in-progress.
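
As a small illustration of the endianness problem, the Python sketch below packs integer header numbers into 4-byte keys and sorts them byte by byte. It assumes a store that compares keys as plain octet strings; `hdr_num` is a hypothetical stand-in for hdrNum, and the 4-byte width is chosen only for the example.

```python
import struct


def key_little_endian(hdr_num):
    """4-byte key in little-endian order (what an x86 host stores natively)."""
    return struct.pack("<I", hdr_num)


def key_big_endian(hdr_num):
    """4-byte key in big-endian (network) order."""
    return struct.pack(">I", hdr_num)


hdr_nums = [1, 2, 255, 256, 257, 65536]

# Byte-wise comparison of big-endian keys agrees with numeric order.
by_big_endian = sorted(hdr_nums, key=key_big_endian)

# Byte-wise comparison of little-endian keys scrambles numeric order.
by_little_endian = sorted(hdr_nums, key=key_little_endian)

if __name__ == "__main__":
    print("big-endian keys:   ", by_big_endian)
    print("little-endian keys:", by_little_endian)
```

A key that is written in host byte order also changes its octets when the database moves between architectures, which is a second reason to fix the byte order before treating indices as octet strings.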

Ideally, a UUIDv1 (which has time-ordered properties) should be used as the retrieval key throughout all RPM backend databases. One immediately obvious and useful side effect is that "rpm -qa" will present results in install order by default, which is more deterministic (and less confusing imho) than the order returned by iterating over random hash buckets.
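
One detail worth keeping in mind: in the standard UUIDv1 layout the low 32 bits of the timestamp come first, so the raw 16 octets do not always compare in time order. The Python sketch below shows one possible way to derive a key that does; the helpers `uuid1_from_ticks` and `time_ordered_key` are hypothetical, and nothing here is taken from the patch or from any RPM backend.

```python
import uuid


def uuid1_from_ticks(ticks, node=0):
    """Build a UUIDv1 from a raw 60-bit timestamp (illustrative helper)."""
    return uuid.UUID(fields=(ticks & 0xFFFFFFFF,
                             (ticks >> 32) & 0xFFFF,
                             ((ticks >> 48) & 0x0FFF) | 0x1000,  # version 1
                             0x80, 0x00, node))                  # variant, clock_seq


def time_ordered_key(u):
    """16-octet key: big-endian timestamp first, then clock_seq and node."""
    return u.time.to_bytes(8, "big") + u.bytes[8:]


earlier = uuid1_from_ticks(0x0FFFFFFFF)   # time_low is all ones
later = uuid1_from_ticks(0x100000005)     # time_low wrapped around to 5

by_raw_bytes = sorted([later, earlier], key=lambda u: u.bytes)
by_time_key = sorted([later, earlier], key=time_ordered_key)

if __name__ == "__main__":
    print("raw octets:      ", by_raw_bytes)
    print("time-ordered key:", by_time_key)
```

Sorting on the rearranged key returns the entries in timestamp order, which is the behaviour an install-ordered "rpm -qa" would need.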

@n3npq
Contributor Author

n3npq commented Aug 17, 2017

Meanwhile, this patch adds a --queryformat modifier to generate UUIDs from tag values, which is an entirely orthogonal and mostly non-intrusive use case compared with converting hdrNum/tagNum to an octet string.

@n3npq closed this as completed Aug 21, 2018