Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split Digest::Base from Digest::Algorithm #9496

Closed
wants to merge 2 commits into from

Conversation

bcardiff
Copy link
Member

Digest::Base holds the state of a digest algorithm.
Digest::Algorithm defines the API to be used as entry point.

Crystal native implementation like Digest::MD5, Digest::SHA1 are now Digest::Algorithm.

OpenSSL::Digest do not follow the same design asn native implementation. The user needs to input the name of the algorithm. This cause that the class methods to not work as they were defined.

An attempt to extend the class methods to forward constructor arguments as:

OpenSSL::Digest.hexdigest("SHA256") do |ctx|
  ctx.update(data)
end

will not work because other class methods that were defined in Digest::Base that has arguments.

Digest::SHA1.digest(data)
OpenSSL::Digest.digest(data, "SHA256") # Should the constructor be specified before/after. It's weird.

So I think the cleaner path to avoid changing the expected API is to decouple Digest::Base as a helper class to keep the state of the algorithm, and Digest::Algorithm to be used as entry point.

All Digest::Algorithm can be created without argument. So the class methods can be well defined. OpenSSL::Digest does not fall in this case. For convenience OpenSSL::Digest::SHA1/SHA256/SHA512 are defined as shameless wrapper to a OpenSSL::Digest.

Finally, it is now possible to be able to have a registry of algorithm by name as defined by String => Digest::Algorithm.class.

Maybe if the OpenSSL module got redesigned things can be simplified since the noise is introduced by it's difference with the native implementation.

Ref: #9483
Follow-up: #8426

Digest::Base holds the state of a digest algorithm.
Digest::Algorithm defines the API to be used as entry point.

Add OpenSSL::Digest::SHA1/256/512 for convenience
@bcardiff bcardiff added this to the 0.35.1 milestone Jun 17, 2020
spec/std/digest/algorithm.cr Outdated Show resolved Hide resolved
Copy link
Member

@straight-shoota straight-shoota left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is the right solution.
Differtiating between Digest::Algorithm and Digest::Base is really confusing and doesn't make much sense in terms of the API. It ends up exposing class methods on Digest::Algorithm that can't be used on that type, only on implementing types. That's why these methods are in macro inherinted, so that they only show up in implementing classes.
I suppose it would be a better solution to just make that macro inherited an opt-in macro which inheriting types need to call explicitly (or maybe the inherited hook could check whether a new method is defined and this could be automatically opt-in).
That would leave us with a single type (we could consider renaming it to Digest::Algorithm), clearly defined class methods and an overall concise API.

I'm also skeptical of the OpenSSL types. Are they really necessary/useful? We've plans to refactor to a generic TLS API anyways, so we could try to avoid adding more APIs to OpenSSL namespace.

@bcardiff
Copy link
Member Author

I tried something like that in bcardiff@c1fde24 but I didn't like that the optional module end up acting as the type to be used in a collection of algorithm.

I think that when OpenSSL module is reworked, OpenSSL::Digest implementation might change and then ::Digest::Base could be merged with ::Digest::Algorithm if all the instances can be created without arguments. When that happens, we can leave ::Digest::Algorithm as base class.

We can also duplicate a bit of code a don't make OpenSSL::Digest related to ::Digest but that was part of the goal in #8426 that many where happy about.

@straight-shoota
Copy link
Member

bcardiff@c1fde24 is not really what I meant.
Anyways, for 0.35.1 we should keep the changes as minimal as possible and not introduce a new concept in the API. The actual problem can be fixed trivially by removing the type restriction on the block arguments. I have posted an implementation at #9500.

It's unfortunate that the OpenSSL implementations' class method API is not compatible so we can't use an abstract Digest::Base class type as digest interface. We probably shouldn't have added the instance-level compatibility in #8426 without refactoring the OpenSSL implementation to also match the class-level API.
Until that's done, we need either of these compromises:

There's no ideal solution. But I'd pick the latter one because it's the least changes. I'm also not aware anyone showed a concrete use case for this. So it probably also has the least impact.

@asterite
Copy link
Member

I'm sorry, but all the alternatives proposed here are wrong.

@bcardiff already told me what we should ideally do, and I love that. It's just that it seems we can't do that change in 0.35.1 because it doesn't align with 0.35.0.

The ideal solution is:

  • we remove OpenSSL::Digest
  • we introduce Digest::SHA244, Digest::MD4, etc. (one for each algorithm in openssl) and have them inherit Digest::Base. Internally the use openssl, but that's an implementation detail
  • we need to move the existing Digest::SHA1 and Digest::SHA256 to the Crystal namespace because they were introduced only because of the without_openssl flag needed to compile the compiler (because of crystal play). These implementations are not safe (maybe I implemented them incorrectly?) and they are slower than their openssl counterpart. In fact, the existing Digest::SHA256 should be completely removed because it's not used anywhere in the compiler, and the implementation from openssl is superior (faster, probably more secure).
  • rename Digest::Base to Digest: there's no need for a Base type, the existing Digest module is empty (no methods), so we could move them there

Then the API becomes super clean: Digest is the base of all digest. All class methods work. There's one digest class per digest type. And there's no mention of OpenSSL in the API at all, which is ideal because that's just an implementation detail (we could eventually implement the algorithms ourselves and ditch openssl).

@straight-shoota
Copy link
Member

straight-shoota commented Jun 18, 2020

This proposal sounds great!
But such a big refactor should be avoided in a patch release. For 0.35.1 we need a quick and simple fix. That comes down to one of the compromises I mentioned. They're not wrong, they're just insufficient temporary solutions. And #9500 is the easiest one of them.

@bcardiff bcardiff removed this from the 0.35.1 milestone Jun 18, 2020
@bcardiff bcardiff closed this Jun 18, 2020
@bcardiff bcardiff mentioned this pull request Oct 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants