Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add a generated OID db to const-oid #451

Merged
merged 1 commit into from
Feb 27, 2022

Conversation

npmccallum
Copy link
Contributor

This databse is generated from the official IANA registry.

Signed-off-by: Nathaniel McCallum nathaniel@profian.com

@npmccallum
Copy link
Contributor Author

@carl-wallace @tarcieri Please review.

@tarcieri
Copy link
Member

This seems interesting but might be something good to feature gate.

Regarding that, perhaps it would be possible to precompute the output provided it isn’t too large.

@npmccallum
Copy link
Contributor Author

  1. Can we keep the NamedOid type ungated? I have additional plans for that type.

  2. I prefer the build-time generation. I have bad experiences with checked-in, generated code.

@npmccallum
Copy link
Contributor Author

@tarcieri I have updated the PR to gate the db module (and the build script) on the db feature. I have, however, retained the NamedOid type as ungated.

@carl-wallace
Copy link
Contributor

carl-wallace commented Feb 26, 2022

A few questions (then I will be out until tomorrow so expect delay in any reply).

  1. is intent for this to stay LDAP centric or add other stuff (i.e., user principal name, OCSP, algorithm OIDs, etc)?
  2. I'd prefer for lookups like this to fail over to dot notation, but then you lose the str.
  3. What is the objection to pre-computation for this particular generated item? These things change almost never. There could be a test to affirm no change, perhaps, to get at similar dynamic end.
  4. Would this replace flat lists of OIDs or augment them? If replace and not pre-generated, we'd need something to generate docs.
  5. Would this be better in its own crate, a la https://github.com/rusticata/oid-registry?

@npmccallum
Copy link
Contributor Author

  1. There is no intention to keep it LDAP centric. I just started there because a useful datafile was available.

  2. I don't understand your point here.

  3. There's no standard way to do pre-computation in Rust. So you have to invent a mechanism. Then you have to remember to make sure to run rustfmt on it. Then people will try to patch the generated code (regardless of the warnings you put at the top of the file).

  4. I think it replaces them. My goal is to have types that associate the NamedOid types with other newtypes (such as RDNSequence; see feat: add new NewType derive #450) in higher level constructs.

  5. Maybe. But it is pretty unintrusive where it is. And even if you enable the db feature, compilation is fast and adds no additional dependencies. We could also rework how the db gate works to just remove the query interface. Then you get the benefits of standard constant names, and everything gets compiled out if unused.

@tarcieri
Copy link
Member

tarcieri commented Feb 27, 2022

Can we keep the NamedOid type ungated? I have additional plans for that type.

That's fine

I prefer the build-time generation. I have bad experiences with checked-in, generated code.

There are two circumstances where I think build-time generation is helpful:

  1. If the build output changes frequently
  2. If the build output varies depending on the target, e.g. generating bindings to C libraries

If the data is mostly static and not target-dependent, I think it's needless overhead.

This is a feature that isn't relevant to any of const-oid's current users, and it's a fairly deep dependency of the RustCrypto ecosystem, receiving 10,000-20,000 downloads a day.

The current implementation is very small and quick-to-compile. Adding a build script is another compilation unit (i.e. a crate) which gets pipelined with the other compilation units of the const-oid package. That's not to say I've benchmarked but build scripts are notorious for slowing down build speeds, and the current feature-gating approach still means you have to build an empty crate before you can build the const-oid crate proper even in the case the db feature is disabled.

I don't think this particular data set is high maintenance enough to justify adding a build script. My expectation is we'll generate the code once and call it a day. If it turns out we're doing it frequently and it's actually a problem, we can revisit.

As far as how to structure the conversion, I'd suggest just checking in a crate underneath const-oid, and doing:

[workspace]
members = ["."]

...to put it in its own workspace. cargo run it, and check in the output.

Alternatively maybe look at something like cargo xtask or cargo make.

@carl-wallace
Copy link
Contributor

  1. My point re: item 2 is when reaching for a friendly name I pretty much always want dot notation (i.e., 1.2.3.4) when there is no friendly name known. This looked to return None (likely to return str instead of String). This mechanism could be wrapped, or maybe a function that returned a String with dot notation used as a default could be added.
  2. People can do what they want, but for something so static, build time generation doesn't seem necessary to me.

@npmccallum
Copy link
Contributor Author

@carl-wallace @tarcieri I have now pushed a new version. It is basically a rewrite.

  1. The output is now pre-generated.
  2. There is now a Database::resolve() method which does what I think @carl-wallace is asking for.
  3. We now also generate OIDs from RFC 5280.
  4. Symbol name generation is improved greatly.
  5. I removed the feature gate.

Why Remove the Feature Gate?

  1. Since the Rust code is now pre-generated, build times should be near the original speeds.
  2. If the Database interface is not used, only used symbols will be retained by the compiler.
  3. If the Database interface is used, you only pay the price on the kinds you use it for.

Therefore, this roughly follows the maxim "pay only for what you use."

@tarcieri
Copy link
Member

tarcieri commented Feb 27, 2022

Since the Rust code is now pre-generated, build times should be near the original speeds.

This is true. By my measurements debug builds are 70% slower, and release builds 20% slower.

However, what this really bloats is the resulting rlib size. It went from 137kB to 1MB.

Personally I'd still prefer keeping the feature gate. const-oid is otherwise quite small and used in embedded contexts, and the extra bloat of this database is irrelevant to all of the crate's existing users.

Since it can be feature gated at a module level, it's also very simple to do, so I don't see any disadvantages in doing it.

If the Database interface is not used, only used symbols will be retained by the compiler.

It's not quite that simple. The compiler will build and retain all code regardless of use at the level of an individual crate, as it effectively builds code with the equivalent of -ffunction-sections and -fdata-sections. This is then handed off to the linker for dead code elimination (via e.g. --gc-sections), so unused symbols add to link times.

@npmccallum
Copy link
Contributor Author

@tarcieri Feature gate re-added.

Comment on lines +10 to +11
tree: BTreeMap<u8, Kind>,
docs: HashMap<u8, (Ident, TokenStream)>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this makes sense, but looks a little weird

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What looks weird?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly just using BTreeMap for one and HashMap for the other

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is by design. The sorting in the BTreeMap makes the output predicable. OTOH, the HashMap makes the lookups faster.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's mostly a moot point, but FWIW the performance seems identical with BTreeMap in place of HashMap

This databse is generated from the official IANA LDAP registry and from
RFC 5280.

Signed-off-by: Nathaniel McCallum <nathaniel@profian.com>
@tarcieri tarcieri merged commit 59b0212 into RustCrypto:master Feb 27, 2022
@tarcieri tarcieri mentioned this pull request Mar 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants