Skip to content
This repository

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Add SHA-1 message digest #221

Closed
wants to merge 28 commits into from

12 participants

Kai Nacke Masahiro Nakagawa Piotr Szturmaj David Simcha David Nadlinger Andrei Alexandrescu Brad Roberts Daniel Murphy Lars T. Kyllingstad Jonathan M Davis Alex Rønne Petersen Johannes Pfau
Kai Nacke

SHA-1 is an important message digest. E.g. it is used by git. This implementation features:

  • optimized standard implementation, based on a verbatim copy of std/md5.d E.g. if compiled with -O -inline -release then there is not single function call or loop inside function transform.
  • if SSSE3 support is detected then a special assembler function is used. This gives a speedup of 1.4 on 32 bit and about 1.8 on 64 bit

The 64 bit implementation is a bit clumsy because of some issues with the dmd compiler (issues 6459 and 5355). Supports 32 bit Windows and 32 bit and 64 Linux. MacOS X, FreeBSD are untested, but should also work.

Kai Nacke Add SHA-1 message digest
SHA-1 is an important message digest. E.g. it is used by git. This implementation features:
- optimized standard implementation. E.g. if compiled with -O -inline -release then there is not single function call or loop.
- if SSSE3 support is detected then a special assembler function is used. This gives a speedup of 1.4 on 32 bit and about 1.8 on 64 bit
The 64 bit implementation is a bit clumsy because of some issues with the dmd compiler (issues 6459 and 5355).
Supports 32 bit Windows and 32 bit and 64 Linux. MacOS X, FreeBSD are untested.
4592a4b
David Simcha
Collaborator

Nice work! I do think it's silly, though, to put sha1 and md5 in separate modules, especially in the top level std package. I think we should schedule std.md5 for deprecation and do one of the following:

  1. Combine this code and std.md5 into a new module called std.hash or std.digest or something. This would be publicly imported by std.md5 while it still exists.

  2. Create a new package called std.hash and move this code to std.hash.sha1 and move md5 to std.hash.md5. (Or, s/hash/digest or whatever. The exact name isn't what's important.) Again, std.hash.md5 would be publicly imported by std.md5 until std.md5 is removed.

Kai Nacke

I would prefer package std.digest. The assembler module already sits in std/internal/digest.

Masahiro Nakagawa
Collaborator

I agree dsimcha's comment.

In addition, I propose the API change.
I think SHA1_CTX(and MD5_CTX) is bad name on D and SHA1_CTX seems to be an OutputRange.

Sample code is below:

struct SHA1
{
    void put(const void[] input);
    // alias if needed
    alias put update;
}
Kai Nacke Created new package std.digest.
- Moved sha1 and md5 to new package
- Introduced dummy modul std.md5
7ec2946
Andrei Alexandrescu

I think digest is too narrow a package. We should define package crypt, and the digest implementations would be part of it. Later on we can add stream encryption to the same package.

Andrei Alexandrescu

line length

added some commits August 29, 2011
Kai Nacke Renamed package digest to crypt. ddbb69e
Kai Nacke Removed files in package digest.
- Removed files in std/digest
- Removed files in /std/internal/digest
- Added changed makefiles
2550298
Kai Nacke Issue 6459 seems to be solved. 4030fa3
Piotr Szturmaj

Please consider adapting my implementation of SHA family functions: https://github.com/pszturmaj/dmisc/blob/master/sha.d. There are almost all SHA flavors, but they lack SSE implementation.

David Simcha
Collaborator

Ping? Are we still waiting on anything?

std/crypt/md5.d
((100 lines not shown))
  100
+
  101
+void sum(ref ubyte[16] digest, in void[][] data...)
  102
+{
  103
+    MD5_CTX context;
  104
+    context.start();
  105
+    foreach (datum; data)
  106
+    {
  107
+        context.update(datum);
  108
+    }
  109
+    context.finish(digest);
  110
+}
  111
+
  112
+// /******************
  113
+//  * Prints a message digest in hexadecimal to stdout.
  114
+//  */
  115
+// void printDigest(const ubyte digest[16])
1
David Nadlinger Collaborator

Huh, seems like a temporary debugging helper? Is it still in there for a reason?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
std/crypt/md5.d
((161 lines not shown))
  161
+
  162
+version(unittest) import std.stdio;
  163
+unittest
  164
+{
  165
+    string a = "Mary has ", b = "a little lamb";
  166
+    int[] c = [ 1, 2, 3, 4, 5 ];
  167
+    string d = getDigestString(a, b, c);
  168
+    assert(d == "F36625A66B2A8D9F47270C00C8BEFD2F", d);
  169
+}
  170
+
  171
+/**
  172
+ * Holds context of MD5 computation.
  173
+ *
  174
+ * Used when data to be digested is buffered.
  175
+ */
  176
+struct MD5_CTX
1
David Nadlinger Collaborator

Not a very D-ish name, same for all the helper functions ROTATE_LEFT, …

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
David Nadlinger
Collaborator

Ah, I see, these were carried over from the original std.md5

Andrei Alexandrescu
Owner

Could you please rebase so we can merge this? Thanks!

Piotr Szturmaj

Do we really want comprehensive std.crypto or just few hashes? I like this SHA1 implementation because of SSSE3 but SHA1 is old and is not recommended for current cryptography, just for backward compatibility. What about SHA 224, 256, 384 or 512?

What about polymorphism - Imagine using HMAC with a hash which is known only at run time. Static ifs are not suitable because some hashes take parameters of great aplitude.

Brad Roberts
Owner

As discussed in the newsgroups a while ago, I'm against having either of these in phobos (I know that md5 has been there essentially forever). There is a set of import files for openssh in deimos which is a far better path, imho. Re-invention and dealing with all the touchy security implications of having our own implementation of this sort of code is a bad idea.

Piotr Szturmaj

This is not re-invention, just implementation of well known algorithms in D. You are probably thinking about some side-channel attacks, which are practically impossible to overcome, even in OpenSS[H|L]. D makes writing correct and safe code easy. Why not make use of that in crypto code?

Btw. I begun working on this some time ago: http://prowiki.org/wiki4d/wiki.cgi?CryptoDevel.

Brad Roberts
Owner

I'm purposely not talking about specific failure modes since there are many. The entire category is very very hard to get right and the cost of failure is high. There's no way our implementation of any of this category of code will receive the same scrutiny and attention as openssl will get.

For those that want to help improve things in that space, please contribute directly to openssl. You'll get a lot more help with proving them safe and correct and more people will benefit.

Should there be a higher level, more D-like, layer on top of openssl? Absolutely, please. There's just about nothing friendly about openssl's api.

Daniel Murphy
Collaborator

What about code that doesn't need to be secure? If I'm computing the hash of a file for verification, do I care as long as the algorithm works? Do we really want a dependency on openssl for tasks this simple? At a quick look Java and Python seem to have these available, although they could be using openssl internally for all I know.

Kai Nacke

IMHO, there are 2 different views which both should be supported.

If you talk about security then you really talk about a plugin architecture which makes it easy to replace algorithms and implementations. In other words, you need a framework like the Java Crypto Engine. E.g., OpenSSL is great. But I use mainly Windows. Maybe I can't believe that Unix hackers can write secure Windows code, too. ;-) With such a plugin architecture I would be free to choose between a native D implementation, OpenSSL or the stuff provided by Windows.

But: I played with git and simply needed a SHA1 implementation. I did not care about security but wanted a D interface and a fast implementation. That's all my pull request is about. As others already mentioned, this could be used to built something like a "D Security Provider".
If someone creates the mentioned plugin architecture then we could have both.

Andrei Alexandrescu
Owner

OK, let's see. Is @redstar still around? If so, please rebase and let's get this puppy in. Thanks!

added some commits August 25, 2011
Kai Nacke Add SHA-1 message digest
SHA-1 is an important message digest. E.g. it is used by git. This implementation features:
- optimized standard implementation. E.g. if compiled with -O -inline -release then there is not single function call or loop.
- if SSSE3 support is detected then a special assembler function is used. This gives a speedup of 1.4 on 32 bit and about 1.8 on 64 bit
The 64 bit implementation is a bit clumsy because of some issues with the dmd compiler (issues 6459 and 5355).
Supports 32 bit Windows and 32 bit and 64 Linux. MacOS X, FreeBSD are untested.
75af4b6
Kai Nacke Merge remote-tracking branch 'origin/sha1' into sha1
Conflicts:
	posix.mak
	win32.mak
a30ae94
Kai Nacke

Yes, I am still around. I rebased this pull request. Hopefully, this works now. I had some difficulties with both makefiles.

Lars T. Kyllingstad
Collaborator

Before this is merged, please take into account pull request #585 (and the associated discussion).

Jonathan M Davis
Collaborator
jmdavis commented May 28, 2012

Given the situation with crc32 as discussed in pull request #585 (i.e. the fact that it's a hash function but not really cryptographic), I'd strongly argue that we should go with std.hash rather than std.crypt.

Alex Rønne Petersen
Collaborator
alexrp commented June 01, 2012

@redstar Ping? I think we need some input on whether you're OK with using std.hash instead of std.crypt so we can have both this and #585 merged.

Kai Nacke

No problem. I just renamed the package from std.crypt to std.hash.

Jonathan M Davis
Collaborator

Why are so many of the functions in this pull request using the wrong naming conventions? Functions should be camelCased (e.g. decode, not Decode, and rotateLeft, not ROTATE_LEFT ), as should variable names. Types should be PascalCased.

Kai Nacke

I wanted to have the same interface as std.md5. Therefore the unusual names. But I can change the new classes to have names with the right conventions.

Jonathan M Davis
Collaborator

std.md5 really shouldn't have those names either. It's only that way, because it's old. And actually, since it's being moved to std.hash.md5, it's names should be fixed. That may mean leaving std.md5 where it is (still scheduled for deprecation - just without doing the public import of std.hash.md5), if aliases in std.md5 aren't enough (and at first glance, it doesn't look like they are, since it's not just free functions), but it gives us the opportunity to fix its naming scheme. Having a module which uses an alternate naming scheme makes it harder to remember and use its functions, and we already went to a fair bit of work to fix most of the rest of Phobos' names so that they were consistent (though obviously std.md5 was missed).

So, please fix both std.hash modules so that they follow the correct naming scheme (PascalCase for types and camelCase for pretty much everything else).

Kai Nacke

Ok, I start changing the names.

Jonathan M Davis
Collaborator

Ok, I start changing the names.

Thanks.

Jonathan M Davis
Collaborator

A thread has been started in the newsgroup about cleaning up and standardizing the API for hash functions:

http://forum.dlang.org/post/js1e5p$1q6d$1@digitalmars.com

Unfortunately, the first message seems to have been lost from the forum somehow, but the first reply retains enough of it that you can figure out what it said.

Johannes Pfau
jpf91 commented June 24, 2012

I posted a follow up message regarding the new API, please see
http://forum.dlang.org/thread/js7bd7$7sr$1@digitalmars.com#post-js7bd7:247sr:241:40digitalmars.com

I also included your SHA1 implementation. Because of the new API I had to modify the file structure a lot, but I made sure you're still listed as the original author in the documentation and in the git log, I hope that's OK.

However, you didn't add a license to your module, is it OK to release it under the boost license?

And if you could do me a favor: Could you please test the SSSE3 code again? I modified the transform function signature, so it's:

transformSSSE3(uint[5]* state, const(ubyte[64])* buffer)

Imho that is better than silently assuming the length. However, code which did this before:

state[0]

now has to be rewritten into this:

(*state)[0]

And I can't check the SSSE3 code right now.

Andrei Alexandrescu
Owner

We have now the recently approved proposal that includes SHA-1.

Andrei Alexandrescu andralex closed this July 08, 2012
Jonathan M Davis
Collaborator

Approved proposal? A redesign of the hash stuff has been under discussion, but I'm not aware of anything being "approved" with relation to that. It's still being sorted out and is probably going to need a full review in the newsgroup. That being said, no, we don't need this pull request anymore, because a fuller redesign is being worked on.

Johannes Pfau
jpf91 commented July 09, 2012

The SHA-1 implementation used in the redesign is actually this implementation, so closing this pull request is probably fine.

I think the redesign is ready for review now, so I'll add std.hash to the review queue if we decide that a full review is necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Showing 28 unique commits by 1 author.

Aug 25, 2011
Kai Nacke Add SHA-1 message digest
SHA-1 is an important message digest. E.g. it is used by git. This implementation features:
- optimized standard implementation. E.g. if compiled with -O -inline -release then there is not single function call or loop.
- if SSSE3 support is detected then a special assembler function is used. This gives a speedup of 1.4 on 32 bit and about 1.8 on 64 bit
The 64 bit implementation is a bit clumsy because of some issues with the dmd compiler (issues 6459 and 5355).
Supports 32 bit Windows and 32 bit and 64 Linux. MacOS X, FreeBSD are untested.
4592a4b
Aug 26, 2011
Kai Nacke Created new package std.digest.
- Moved sha1 and md5 to new package
- Introduced dummy modul std.md5
7ec2946
Aug 29, 2011
Kai Nacke Renamed package digest to crypt. ddbb69e
Aug 30, 2011
Kai Nacke Removed files in package digest.
- Removed files in std/digest
- Removed files in /std/internal/digest
- Added changed makefiles
2550298
Aug 31, 2011
Kai Nacke Issue 6459 seems to be solved. 4030fa3
Sep 24, 2011
Kai Nacke Reverted last commit.
Issue 6459 is not fixed!
4d31a27
Oct 10, 2011
Kai Nacke Merge branch 'master' into sha1
Conflicts:
	win32.mak
a1bba6b
Jan 20, 2012
Kai Nacke Merge branch 'master' into sha1
Conflicts:
	std/md5.d
	win32.mak
ff3b2a5
Kai Nacke Updated example code. 3be3dc4
Apr 24, 2012
Kai Nacke Add SHA-1 message digest
SHA-1 is an important message digest. E.g. it is used by git. This implementation features:
- optimized standard implementation. E.g. if compiled with -O -inline -release then there is not single function call or loop.
- if SSSE3 support is detected then a special assembler function is used. This gives a speedup of 1.4 on 32 bit and about 1.8 on 64 bit
The 64 bit implementation is a bit clumsy because of some issues with the dmd compiler (issues 6459 and 5355).
Supports 32 bit Windows and 32 bit and 64 Linux. MacOS X, FreeBSD are untested.
75af4b6
Kai Nacke Merge remote-tracking branch 'origin/sha1' into sha1
Conflicts:
	posix.mak
	win32.mak
a30ae94
Apr 25, 2012
Kai Nacke Merge branch 'master' into sha1 49de85a
May 10, 2012
Kai Nacke Merge branch 'master' into sha1
Conflicts:
	std/md5.d
929a3dd
May 13, 2012
Kai Nacke Merge remote-tracking branch 'upstream/master' into sha1 b2c2e9f
May 14, 2012
Kai Nacke Merge remote-tracking branch 'upstream/master' into sha1 dba5740
May 15, 2012
Kai Nacke Merge remote-tracking branch 'upstream/master' into sha1 b9d179e
May 21, 2012
Kai Nacke Merge remote-tracking branch 'upstream/master' into sha1 29748b9
May 22, 2012
Kai Nacke Merge remote-tracking branch 'upstream/master' into sha1 9f997a2
May 23, 2012
Kai Nacke Merge remote-tracking branch 'upstream/master' into sha1 e4002ea
May 24, 2012
Kai Nacke Merge remote-tracking branch 'upstream/master' into sha1 7e09c7a
May 26, 2012
Kai Nacke Merge remote-tracking branch 'upstream/master' into sha1 c4bd212
May 28, 2012
Kai Nacke Merge remote-tracking branch 'upstream/master' into sha1 ea897b8
May 29, 2012
Kai Nacke Merge remote-tracking branch 'upstream/master' into sha1 f2ba4a8
May 30, 2012
Kai Nacke Merge remote-tracking branch 'upstream/master' into sha1 0ff5e1f
Jun 01, 2012
Kai Nacke Merge remote-tracking branch 'upstream/master' into sha1 5b63c22
Jun 04, 2012
Kai Nacke Merge remote-tracking branch 'upstream/master' into sha1 6eccf3f
Jun 05, 2012
Kai Nacke Merge branch 'master' into sha1
Conflicts:
	win32.mak
b038959
Kai Nacke Rename std.crypt to std.hash to be compatible with #585. d0c7437
Something went wrong with that request. Please try again.