Replace Emacs' MD5 implementation with a Rust crate #117

Open
Wilfred opened this Issue Jan 27, 2017 · 8 comments

4 participants

@Wilfred
Owner
Wilfred commented Jan 27, 2017

No description provided.

@Wilfred Wilfred added the help wanted label Jan 27, 2017
@moosingin3space

I'd like to claim this one since I've been working with the hashing code in Emacs.

@gregkatz

Any reason you're not using rust-crypto? Looks like that can do the SHA stuff as well as MD5.

@moosingin3space

Rust-crypto doesn't look active -- it's been 8 months since any commits were made.

@Wilfred
Owner
Wilfred commented Jan 31, 2017

@moosingin3space it's yours :)

@briansmith
briansmith commented Feb 2, 2017 edited

I suggest doing something different:

  1. Replace the C MD5 implementation with an Emacs Lisp implementation, like MD4 is implemented (see https://github.com/Wilfred/remacs/blob/aa5b1277ebad491a361a7bc8d71650310227764c/lisp/md4.el). In fact, it's already been implemented in https://github.com/mkhattab/old-emacs/blob/master/vendor/flim/md5.el.

  2. Incrementally replace uses of MD5 with SHA-{256, 384, 512} where possible.

MD5 is a broken algorithm, and uses of it should be replaced with uses of other algorithms. In the meantime, I think replacing the C MD5 implementation with the Emacs Lisp implementation would solve the dependency issue in an acceptable way.

@moosingin3space

I like @briansmith's idea. We could also do this to resolve the SHA-224 issue (as mentioned here).

@Wilfred
Owner
Wilfred commented Feb 4, 2017

MD5 is a great example of where we can show off Rust's advantages. It's a well-defined algorithm that gives us a nice speedup when implemented as a primitive.
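To give a feel for what such a primitive involves, here is a minimal, unoptimized sketch of MD5 following RFC 1321, using only the Rust standard library. It is not the code Remacs ships, nor the API of any crate: `md5_hex` and `SHIFTS` are names made up for illustration, and the sine-derived constants are computed at run time for brevity, where a real implementation would probably hardcode the table.

```rust
// Per-step left-rotation amounts, as listed in RFC 1321.
const SHIFTS: [u32; 64] = [
    7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22,
    5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20,
    4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23,
    6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21,
];

/// Hypothetical helper: MD5 digest of `input` as a lowercase hex string.
fn md5_hex(input: &[u8]) -> String {
    // Padding: a 0x80 byte, zeros up to 56 mod 64, then the message
    // length in bits as a little-endian u64.
    let mut msg = input.to_vec();
    msg.push(0x80);
    while msg.len() % 64 != 56 {
        msg.push(0);
    }
    msg.extend_from_slice(&(input.len() as u64).wrapping_mul(8).to_le_bytes());

    // Initial state words A, B, C, D.
    let mut state: [u32; 4] = [0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476];

    for chunk in msg.chunks_exact(64) {
        // Decode the 64-byte block into sixteen little-endian 32-bit words.
        let mut m = [0u32; 16];
        for (i, w) in chunk.chunks_exact(4).enumerate() {
            m[i] = u32::from_le_bytes([w[0], w[1], w[2], w[3]]);
        }

        let [mut a, mut b, mut c, mut d] = state;
        for i in 0..64 {
            // Each group of 16 steps uses its own mixing function and
            // its own order of visiting the message words.
            let (f, g) = match i / 16 {
                0 => ((b & c) | (!b & d), i),
                1 => ((d & b) | (!d & c), (5 * i + 1) % 16),
                2 => (b ^ c ^ d, (3 * i + 5) % 16),
                _ => (c ^ (b | !d), (7 * i) % 16),
            };
            // K[i] = floor(2^32 * |sin(i + 1)|), computed on the fly.
            let k = (((i + 1) as f64).sin().abs() * 4294967296.0) as u32;
            let sum = a.wrapping_add(f).wrapping_add(k).wrapping_add(m[g]);
            a = d;
            d = c;
            c = b;
            b = b.wrapping_add(sum.rotate_left(SHIFTS[i]));
        }

        // Fold this block's result back into the running state.
        state[0] = state[0].wrapping_add(a);
        state[1] = state[1].wrapping_add(b);
        state[2] = state[2].wrapping_add(c);
        state[3] = state[3].wrapping_add(d);
    }

    // The digest is the four state words serialized little-endian.
    let mut hex = String::new();
    for word in state.iter() {
        for byte in word.to_le_bytes().iter() {
            hex.push_str(&format!("{:02x}", byte));
        }
    }
    hex
}
```

Calling `md5_hex(b"abc")` should give `900150983cd24fb0d6963f7d28e17f72`, the RFC 1321 test vector. All the arithmetic is plain `u32` operations (`wrapping_add`, `rotate_left`), which is a large part of why a native primitive can be fast.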

We've generally avoided refactorings that make Remacs slower, see #105. Note that moving to a pure elisp implementation would require refactoring secure-hash (which currently needs a function it can call at the C/Rust level).

The elisp MD5 implementation linked is super-cool, but the code is actually using md5.so via FFI and only falls back to the pure elisp implementation if it has to. The elisp implementation is much slower, partly because it has to represent 32-bit ints as a cons pair.
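As a rough illustration of that overhead, the sketch below models a 32-bit word as a pair of 16-bit halves and adds two such words with a manual carry, next to the single native operation Rust offers. It assumes the cons pair holds a high half and a low half; the `Halves` type and the `split`, `join`, and `add_halves` helpers are hypothetical names for this example and are not taken from md5.el.

```rust
/// A 32-bit word split into two 16-bit halves, mimicking a
/// `(high . low)` cons pair. Illustrative only, not from md5.el.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Halves {
    high: u32, // only the low 16 bits are meaningful
    low: u32,  // only the low 16 bits are meaningful
}

/// Split a native word into its two halves.
fn split(word: u32) -> Halves {
    Halves { high: word >> 16, low: word & 0xffff }
}

/// Reassemble the halves into a native word.
fn join(h: Halves) -> u32 {
    (h.high << 16) | h.low
}

/// Addition modulo 2^32 done half by half, propagating the carry by hand.
fn add_halves(x: Halves, y: Halves) -> Halves {
    let low = x.low + y.low; // may need 17 bits
    let carry = low >> 16;
    let high = (x.high + y.high + carry) & 0xffff; // drop overflow past bit 31
    Halves { high, low: low & 0xffff }
}

/// The same addition on a native 32-bit word: one wrapping add.
fn add_native(x: u32, y: u32) -> u32 {
    x.wrapping_add(y)
}
```

For any inputs, `join(add_halves(split(x), split(y)))` equals `add_native(x, y)`, but the split version does several operations per addition (and, in elisp, allocates a fresh pair for the result), and MD5 performs a few hundred such additions per 64-byte block.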
