Add SHA-3 and SHAKE (Keccak) support #60317
Comments
Today the latest crypto hash function was announced by NIST [1]. I suggest that we include the new hash algorithm in 3.4 once it lands in OpenSSL. The Keccak site also has a reference implementation in C and Assembler [2]. It may take some effort to integrate the reference implementation as it contains several optimized backends for X86, X86_64, SIMD and various ARM platforms. [1] http://www.nist.gov/itl/csd/sha-100212.cfm |
We have MD5, SHA1, sha256 and sha512 implemented for use when OpenSSL is not available. Can we do the same with SHA-3? I would suggest adopting the reference implementation without extensive optimizations, since we will get those once OpenSSL has them. So we might implement SHA-3 now and integrate the OpenSSL implementation later, when it becomes available. This is interesting, for instance, because many users of Python 3.4 will not have an up-to-date OpenSSL system library. |
I've done some experiments with the reference implementation and adapted the code of sha1module.c for sha3: https://bitbucket.org/tiran/pykeccak So far the code just compiles (64bit only) but doesn't work properly yet. I may need to move away from the NIST interface and use the sponge interface directly. |
For what it's worth, I've built a working C-based sha3-module that is available here: https://github.com/bjornedstrom/python-sha3 Note that I've only tested this on Python 2, for Python 3 YMMV. Best regards |
Hello Björn, thanks for the information. Your package didn't turn up on Google when I started my experiment. Perhaps it's too new? Your code and mine have lots of similarities. I was amused when I saw that you had the same issue with the block size attribute. At first I set it to 200 (1600 / 8) but eventually I didn't implement it. My code does everything in C with a separate constructor for each flavor of SHA-3. It's compatible with Python 2.6 to 3.4 and uses the optimized code for 32 and 64bit platforms. Oh, and my code is now working properly. Feel free to review the module. I'll upload the test code later. |
Release 0.1 of pysha3 [1] is out. I've tweaked the C module to make it compatible with Python 2.6 to 3.4. The module and its tests run successfully under Linux and Windows. So far I've tested Linux X86_64 (2.7, 3.2, 3.3, 3.4), Windows X86 (2.6, 2.7, 3.2, 3.3) and Windows X86_64 (2.6, 2.7, 3.2, 3.3). Please review Modules/sha3module.c and ignore all version specific #if blocks. For Python 3.4 I'm going to remove all blocks for Python < 3.3. |
Can't you post a patch here? |
How about a sandbox repos? |
Good, you can click the "create patch" button when it's ready :) |
Antoine pointed out that the code contains C++ comments and exports a lot of functions. The latest patch has all // comments replaced, marks all functions and globals as static and #includes the C files directly. |
Please review the latest patch. I've included Gregory as he is the creator of hashlib. |
The highlights of the next patch are
|
I've documented the optimization options of Keccak. The block also contains a summary of my modifications to the reference code. http://hg.python.org/sandbox/cheimes/file/57948df78dbd/Modules/_sha3/sha3module.c#l22 |
New patch. I've removed the dependency on uint64 types. On platforms without a uint64 type the module is using the 32bit implementation with interleave tables. By the way the SSE / SIMD instructions aren't useful. They are two to four times slower. |
don't worry about optimization settings in python itself for now. the canonical optimized version will be in a future openssl version. now that it has been declared the standard it will get a *lot* more attention in the next few years. as it is, we _may_ want to replace this reference implementation with one from libtomcrypt in the future when it gets around to implementing it just so that the code for all of our bundled hash functions comes from the same place. |
New changeset 11c9a894680e by Christian Heimes in branch 'default': |
The code has landed in default. Let's see how the build bots like my patch and the reference implementation. |
_sha3 is not being built on Windows, so importing hashlib fails:
>>> import hashlib
ERROR:root:code for hash sha3_224 was not found.
Traceback (most recent call last):
  File "C:\Repos\cpython-dirty\lib\hashlib.py", line 109, in __get_openssl_constructor
    f = getattr(_hashlib, 'openssl_' + name)
AttributeError: 'module' object has no attribute 'openssl_sha3_224'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Repos\cpython-dirty\lib\hashlib.py", line 154, in <module>
    globals()[__func_name] = __get_hash(__func_name)
  File "C:\Repos\cpython-dirty\lib\hashlib.py", line 116, in __get_openssl_constructor
    return __get_builtin_constructor(name)
  File "C:\Repos\cpython-dirty\lib\hashlib.py", line 104, in __get_builtin_constructor
    raise ValueError('unsupported hash type ' + name)
ValueError: unsupported hash type sha3_224
... |
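For readers hitting a similar "unsupported hash type" error, the availability of an algorithm can be probed at run time instead of failing on first use. This is only a minimal sketch, assuming an interpreter where `hashlib` itself imports cleanly; `usable_algorithms` is a hypothetical helper and not part of hashlib.

```python
import hashlib


def usable_algorithms(names):
    """Map each algorithm name to True if hashlib can construct it.

    Hypothetical helper for illustration only. hashlib.new() raises
    ValueError when neither the OpenSSL wrapper nor a built-in module
    provides the requested name.
    """
    result = {}
    for name in names:
        try:
            hashlib.new(name)
        except ValueError:
            result[name] = False
        else:
            result[name] = True
    return result


if __name__ == "__main__":
    for name, ok in usable_algorithms(["sha256", "sha3_224", "no_such_hash"]).items():
        print(name, "available" if ok else "missing")
```

`hashlib.algorithms_guaranteed` and `hashlib.algorithms_available` give the same information as sets, without constructing any objects.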
I've pushed a fix about 5 minutes ago. The module wasn't compiled in debug builds due to an error in the project file. Please update your copy and try again. |
6cf6b8265e57 and 8172cc8bfa6d have fixed the issue on my VM. I didn't notice the issue as I only tested hashlib with the release builds, not the debug builds. Sorry for that. |
Ah. I did not even notice there was _sha3.vcxproj. Is there any particular reason for not making it part of python3.dll like _sha1, _sha256, _sha512 are? (I thought it was only modules with special link requirements that became separate DLLs.) |
The module is rather large (about 190 KB) because the optimized SHA-3 implementation isn't optimized for size. For this reason I'd like to keep the module out of the main binary for now. |
Please do not go forward until NIST publishes its SHA-3 specification document. We don't know yet what parameters they will finally choose when making Keccak SHA-3. |
NIST has published a tentative schedule for SHA-3 standardization. They expect to publish in the second quarter of 2014. See http://csrc.nist.gov/groups/ST/hash/sha-3/timeline_fips.html and http://csrc.nist.gov/groups/ST/hash/sha-3/sha-3_standardization.html |
As long as the reference Keccak code is going to live in the python stdlib anyway, I would /greatly/ appreciate it if the Keccak sponge function was directly exposed instead of just the fixed parameters used for SHA-3. A Keccak sponge can have a much wider range of rates/capacities, and after absorption can have any number of bytes squeezed out. The ability to get an unbounded number of bytes out is very useful and I've written some code that uses that behavior. I ended up having to write my own Keccak python library since none of the other SHA-3 libraries exposed this either. |
Hi Aaron, it's a tempting idea but I have to decline. The API is deliberately limited to the NIST interface. Once OpenSSL gains SHA-3 support we are going to use it in favor of the reference implementation. I don't expect OpenSSL to provide the full sponge API. I'd also like to keep all options open so I can switch to a different and perhaps smaller implementation in the future. The reference implementation is huge and the binary is more than 400 KB. For comparison the SHA-2 384 + 512 module's binary is just about 60 KB on a 64bit Linux system. Once a new API has been introduced it's going to take at least two minor Python releases and about four to five years to remove it. But I could add a more flexible interface to Keccak's sponge to my standalone sha3 module https://pypi.python.org/pypi/pysha3 ... |
https://pypi.python.org/pypi/cykeccak/ is what I've written to do this, for reference. Honestly I hope that the Keccak sponge is directly exposed in openssl (or any other SHA-3 implementation) because of its utility beyond SHA-3. If the source of some other implementation is going to be bundled with python anyway, it shouldn't be difficult to expose the sponge bits. |
I should clarify, I don't speak for 2.7. The rules there are a little different and it's up to Benjamin to decide. But please don't add new features to 3.4 and 3.5. |
Remember that FIPS 202 slightly changes some parts of the Keccak submission that won the competition, so test results are different. I updated my standalone SHA3 module, for anyone who is interested in using this now in Python 2 and 3. |
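To make the difference concrete, here is a small sketch comparing the digest of the empty message. It assumes Python 3.6 or later, where `hashlib.sha3_256` follows FIPS 202. hashlib does not offer the pre-standard Keccak-256 variant, so its widely published empty-message digest is quoted as a constant instead of being computed.

```python
import hashlib

# SHA3-256 of the empty message as standardized in FIPS 202.
fips202_empty = hashlib.sha3_256(b"").hexdigest()

# Commonly cited digest of the empty message under the original
# Keccak-256 submission (pre-standard padding). Quoted, not computed,
# because hashlib has no constructor for this variant.
KECCAK256_EMPTY = "c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470"

if __name__ == "__main__":
    print("FIPS 202 SHA3-256:", fips202_empty)
    print("Keccak-256       :", KECCAK256_EMPTY)
    print("identical        :", fips202_empty == KECCAK256_EMPTY)
```

Test vectors generated against the competition-era code therefore cannot be reused for the standardized functions.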
The authors of Keccak have released a new version of the Keccak Code Package, http://keccak.noekeon.org/reorganized_code.html . The new package makes it much easier to integrate Keccak in Python. I'm working on a new patch with SHA3 and SHAKE support. |
This patch implements SHA-3 and SHAKE for Python 3.6. The algorithm is provided by a slightly modified copy of the Keccak Code Package. I had to replace C++ comments and perform some minor cleanups. |
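As a rough illustration of what the SHAKE support looks like from the caller's side, the sketch below uses the `hashlib.shake_128` constructor as it shipped in Python 3.6. Unlike the fixed-size SHA-3 functions, `digest()` and `hexdigest()` take the requested output length in bytes. The message is an arbitrary placeholder.

```python
import hashlib

message = b"arbitrary example input"

# A SHAKE object has no fixed digest size; the caller picks the length.
xof = hashlib.shake_128(message)
short_output = xof.digest(16)   # 16 bytes
long_output = xof.digest(64)    # 64 bytes from the same absorbed state

# Output is a stream: a shorter request is a prefix of a longer one.
shares_prefix = long_output[:16] == short_output

if __name__ == "__main__":
    print(short_output.hex())
    print(long_output.hex())
    print("short output is a prefix of long output:", shares_prefix)
```

This covers the "any number of bytes squeezed out" use case raised earlier in the thread, although the rate and capacity remain fixed by the chosen SHAKE variant.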
Is there any guidance or recommendation on how to use the SHAKE variants? |
Christian: any interest in proposing this for 2.7? We could ask Benjamin. It could still make 2.7.11; rc1 should be tagged in about a month. |
Is there any good reason 2.7 needs this? They are available via pypi as
|
Larry, Antoine, |
New patch:
|
comments added to the code review. |
Patch 3 addresses GPS' code review. |
I don't think this is sufficient motivation. Each new API is a permanent maintenance and documentation burden. It is also a burden to every new user seeing the module and trying to decide which offering to use. We should provide tools that we know people need and err on the side of economy. I asked a room full of network engineers about SHAKE and not a single one of them had heard of it, so I think it would be premature to add to the standard library. |
Why would a network engineer know about a new variable length hashing algorithm? It's not really within their problem domain. |
I'm not sure why one would pick and choose here—SHAKE is part of the NIST |
The maintenance burden is minimal. All six algorithms are just variants of the same KeccakP-1600 sponge construction with different initialization parameters for rate, capacity, delimiter and output size. The SHAKEs have no default output length and use a different delimiter than the SHA3s. https://github.com/gvanas/KeccakCodePackage/blob/master/Modes/KeccakHash.h#L34 |
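The shared construction can be seen from Python by deriving the sponge parameters of each variant. This is a sketch under one assumption: that `block_size` on the hashlib objects reports the sponge rate in bytes, which matches the values I know for current CPython builds. `sponge_parameters` is a hypothetical helper, not a hashlib API.

```python
import hashlib

STATE_BITS = 1600  # width of the Keccak-p[1600] permutation

VARIANTS = ("sha3_224", "sha3_256", "sha3_384", "sha3_512",
            "shake_128", "shake_256")


def sponge_parameters(name):
    """Return (rate_bits, capacity_bits) for a hashlib SHA-3/SHAKE name.

    Assumes block_size is the sponge rate in bytes; the capacity is
    whatever remains of the 1600-bit state.
    """
    rate_bits = hashlib.new(name).block_size * 8
    return rate_bits, STATE_BITS - rate_bits


if __name__ == "__main__":
    for name in VARIANTS:
        rate, capacity = sponge_parameters(name)
        print(f"{name:10s} rate={rate:4d} capacity={capacity:4d}")
```

Only the rate/capacity split, the delimiter and the output length differ between the six functions; the permutation underneath is the same.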
New changeset f8700ee4aef0 by Christian Heimes in branch 'default': |
New changeset 4971ca2960c7 by Christian Heimes in branch 'default': |
New changeset e8884dcace9f by Christian Heimes in branch 'default': |
New changeset 68df416e94ba by Christian Heimes in branch 'default': |
A buildbot is complaining about strict aliasing: In file included from /buildbot/buildarea/3.x.ware-gentoo-x86.installed/build/Modules/_sha3/sha3module.c:113:0: |
New changeset ddc95a9bc2e0 by Christian Heimes in branch 'default': |
New changeset e5871ffe9ac0 by Christian Heimes in branch 'default': |
Christian, since the code is now integrated in Python 3.6+ (with some bugfixes AFAICS), could you consider updating your bitbucket package to match it? It would be helpful as a backport package for older Python versions. |