New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't deal with Encodings internally #48
Comments
Take a look at the commit that added it. The encoding system just replaced type conversion boilerplate code for on the input arguments: At the same time, the encoding system let us change the test vectors from ugly string literals to hex strings easily.
If the ability to easily use bytes isn't impacted, what's the problem with doing both? Supporting hex helped us out as library authors. As I've said many times, all you have to do is pretend the encoding system isn't there, and you will get bytes in and bytes out by default without even trying. This is why I find your complaints much-ado-about-nothing. Now, if the default encoding were hex, instead of bytes, or if I tacked some crazy nonstandard things onto the outputs like "$xsalsa20poly1305$hexencoded$cyphertextgo" that would be a reason to complain.
I don't, and understanding how the library works will be important to any of its prospective users, whether they're leveraging it through Krypt, your hypothetical library, or bare-to-the-metal.
That's great, and I've been talking to @emboss about writing a Krypt provider for RbNaCl. Have you considered helping out with Krypt instead of writing a competing library? |
I'm not normally one to cite Node, but here's a similar API: http://nodejs.org/api/crypto.html#crypto_cipher_update_data_input_encoding_output_encoding Not saying that's a good thing, just that this sort of thing isn't entirely unprecedented. |
To be fair, that's because the existing conversion boilerplate in a lot of cases already expected use hex as a default.
It ties every single API down to having to accept an encoding parameter. One that, IMHO, will probably end up becoming a named parameter / options hash, necessitating a breaking API change. It makes every API responsible for something that's conceptually outside of its scope.
As a user of the library, I can ignore it. As someone who's interested in contributing to the library itself, I think it's a bad decision. It's not catastrophic, I can't think of any vulnerabilities it would introduce, and it certainly seems like a clean implementation of the concept. But it places the responsibility of encoding in the wrong place (along the lines of It's a solution to the problem, but not the right solution to the problem.
If users can simply ignore documentation to use a cryptographic library in dangerous ways, they will do so regularly, emphatically, and without hesitation. Simply refer to the seemingly-infinite number of webpages out there that demonstrate using crypto in ECB mode, or using static nonces. For instance, the highest-voted answer on the topic on StackOverflow encourages exactly this. And the author's response when called on it? "In most commercial encryption settings, though, [using a new initialization vector every time] is a bit of overkill." And this is a contributor with over 16k reputation: an experienced developer by any metric. Good documentation is not enough for a programmer without a background in cryptography to use cryptographic libraries correctly. The only way is to effectively force them to use it correctly (as much as is possible) by default. I am utterly convinced that any library which expects users to generate their own nonces and ensure they are used only once, or that allows users to use any string of appropriate length as a cryptographic key is simply not fit for use by cryptographic novices. Put another way, if bad cryptography could have been solved by adequate documentation, it would have been done so by now.
Krypt appears to have completely different goals than I do. Exposing concepts like PKCS5 and HMAC to novices is the wrong way to go in my opinion, and I have no interest in having a library with pluggable backends. My primary goal is a library where the default is to be extremely hard to use incorrectly. It should support high-level use-cases like "hide this data" or "prove that this data hasn't been tampered with", as opposed to merely providing good choices for cryptographic primitives. Details like unique, one-use nonces or keys that must be generated from cryptographically random sources should be opaque and require active effort to bypass built-in protections. As Thomas Ptacek said, "I don't care what you write, but if you go into it thinking that the dangerous stuff is in the primitives like the AES core, and if you just stick to the glue you'll be safe, you're gonna have a bad time." By supporting high-level use-cases, I intend to reduce the glue code users have to write to implement tasks involving crypto to near-zero.
Encoding aside, I think anyone who looks at that API can agree that it should probably not be considered a template on which to build crypto APIs. Better might be to consider NaCl itself. Or OpenSSL. OpenSSL allows for encoding, but only on its command-line interface and not the API itself. The only "encodings" the OpenSSL APIs support directly are to convert things like keys to standardized interchange formats: PEM, PKCS7, PKCS12, etc. On the other hand, Google's KeyCzar's Python bindings treat all cryptographic inputs as byte buffers, and all cryptographic outputs as Base64 strings. The Java bindings have overloaded method calls that input bytes and output bytes, input Strings and output Base64 strings, and input byte buffers and output byte buffers. Keys are managed out-of-process by a command-line tool that stores keys internally with metadata about their acceptable uses (a protected key store is something an extremely-high-level crypto library should provide). I'm not sure what that points to other than: other libraries have approached this in various ways, and there's no well-trod ground here. |
Well, I'm afraid we'll have to agree to disagree. That said, RbNaCl is shipped, the API is 1.0, and I'd like to honor Semantic Versioning if possible. If the encoding API were to be removed (which I probably wouldn't agree with) it would have to be in version 2.0. So if discussion of shipping RbNaCl 2.0 comes up, that would be the time to broach the issue. Feel free to reopen this ticket at this time. I'm afraid you've missed the window here though.
PKCS5 is perhaps a bit obsolete, but it's a very important feature (used by Rails among other things). JRuby recently vendored Krypt to provide PKCS5 support to Rails. Sodium will ship the PHC winner... how is it any different from that besides PKCS5 being a NIST standard (and a bit of a build-your-own PHF toolkit instead of a standard like PHC)? HMAC is definitely useful to end-users as the symmetric equivalent of digital signatures. djb saw fit to include it in NaCl. Overall, I think you're just being too hyperbolic. NaCl's goals are the same as yours, yet you seem to be arguing that djb didn't go far enough, and that by including things like HMAC (complete with built-in constant time comparison functions) he's exposing user to potential troubles. You could make that case for Poly1305, but I think NaCl includes HMAC because it has fewer sharper edges, while still being useful. I wish you good luck on your novice-oriented crypto library. You will probably find me contributing to Krypt instead though. |
To be clear, I'm not arguing against supporting things like PKCS5, HMAC, etc. But those are cryptographic implementation details and have no business being exposed to end-users who are just trying to do things like "check that this data hasn't been forged" or "hide this data from eavesdroppers". When I'm writing a webapp, I'm not constantly having to deal with TCP packets. It's been abstracted away from me. Nobody would rightfully consider a web framework that directly exposed TCP details to be "high-level", or suitable for use by average developers. And yet we expect novices to understand, choose between, and correctly use PKCS5, PKCS7, PKCS12, AES, SHA2, HMAC, PBKDF2, bcrypt, scrypt, CSPRNGs, IVs, keys, nonces, salts, et al ad infinitum. NaCl, libsodium, and RbNaCl are a huge step in the right direction. But they are still relatively low-level libraries that are extremely dangerous if left in the wrong hands. |
I do kinda agree with @stouset here. Hell, I was never terribly happy with the encoding in the first place. They do complicate the API and for not that much gain. It's particularly noticable on the |
Again, they certainly made incorporating hex literals as the test vectors much easier |
Here is what I expect you will find if you try to remove the Encoding system: There's very little code, and in fact, very little repeated code, most of it in constructors. The serialization component is DRY. You will need to add #to_s conversions every place the encoding system is used to ensure proper conversion to strings. For every line of code dealing with encodings you delete, you will likely need to add another line to the test suite that handles converting the test vectors from hex to binary. Is this really helping, or is it churn? And it this feature really so... not helpful to end users of RbNaCl that it's worth taking away and hiding from them? You keep making claims about how it harms the codebase. I think it really helped converting the test vectors to hex, and that conversion made the code a lot more readable. I think there are further ways the encoding system could be combined with some of the existing argument checking code to further simplify input validation. I would veto any attempts to change the test vectors from hex back to binary string literals. |
|
No objection from me either on keeping those as hex (That was a sensible transition), but it's a trivial two lines of helper code in the specs (see https://github.com/cryptosphere/rbnacl/blob/master/spec/spec_helper.rb#L17) to make a make a method which returns the raw test vector when given a name, which also eliminates the 'sometimes we hex decode a test vector for e.g. let(:key) { test_vector :secret_key }
# and
let(:box) { SecretBox.new(key) }
# instead of
let(:key) { TestVectors[:secret_key] }
# and
let(:box) { SecretBox.new(key, :hex) } In the latter case, many methods need to be aware the vectors are hex encoded. In the former, they need no such knowledge. I also suspect it will break in a probably not uncommon case with the API as it is: Your private key is a |
I've brought it up incidentally in other bug reports, but it's worth bringing up separately: I don't believe that RbNaCl should concern itself with encodings.
It's a nice thought, but it has a lot of downsides IMHO. Every class that takes input or produces output now needs to have boilerplate code handling encoding or decoding. It makes the API less "clean". And it's incidental to (what I believe) should be the main focus of this library: crypto.
Cryptographic primitives understand one concept: bytes. Encoding data is, conceptually, a concern of transmission and storage layers. By handling encodings directly, this library takes on responsibilities which I believe are best suited for handling by other layers of an application.
For ease of use by tests (as @tarcieri mentioned earlier), a simple pair of helpers to convert between hex and binary is an easy solution.
In my mind, there is a clear separation of concerns for different parts of the "stack".
I still think that RbNaCl is far too low-level for an average programmer to use safely (and that's okay!). There's still too much that can go wrong: users can use poor sources of randomness for keys, they can reuse nonces, they can use a key across cryptographic contexts, and all sorts of other rookie mistakes. Having documentation that warns against these sorts of things has been repeatedly proven not to be good enough.
I'd much rather RbNaCl be considered a building-block for experienced folk to build "crypto for the masses" on top of, than for it to try and take on too much responsibility.
The text was updated successfully, but these errors were encountered: