New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speck, Android and Linux kernel interop #585

Closed
noloader opened this Issue Feb 13, 2018 · 3 comments

Comments

Projects
None yet
2 participants
@noloader
Collaborator

noloader commented Feb 13, 2018

The Linux kernel added Speck-64 and Speck-128 for opportunistic encryption on low resource devices that would otherwise have to disable encryption on the file system. Also see [PATCH v2 0/5] crypto: Speck support on the kernel-crypto mailing list.

The problem is we are not going to interop well with the kernel. The problem is due to confusion in the Simon and Speck algorithmic description and the test vectors. The algorithmic description and test vectors did not quite align, and we went down one path (follow the test vectors) and the kernel went down another path (follow the algorithmic description).

It is possible to to interop with the kernel using our implementation but... (1) it is not readily apparent how to do it, and (2) there's a loss of efficiency when doing it. The interop can happen now by providing an adapter class. If my estimates are correct, then we could loose 2 to 4 cpb on a conversion via the adapter. That means SSE4 could drop from 2.1 cpb to about 4 or 6 cpb.

Also not readily apparent, the same interop issue affect Simon, too. At this point in time we are not aware of anyone implementing Simon and diverging due to the algorithmic description versus test vector results.

Simon and Speck has been part of the library since December 2017. We are probably running between the cracks, but we can likely make changes and take action before the 6.1 release and anyone notices.

This will track our handling of the issue and document how we close the gap.


Also see Speck, Android and Linux kernel interop on the Crypto++ mailing list.

@ebiggers

This comment has been minimized.

ebiggers commented Feb 13, 2018

It is possible to to interop with the kernel using our implementation but... (1) it is not readily apparent how to do it, and (2) there's a loss of efficiency when doing it. If my estimates are correct, then we could loose 2 to 4 cpb on a conversion. That means SSE4 could drop from 2.1 cpb to about 4 or 6 cpb.

Why would it be slower? The data only needs to be packed/unpacked once, not once per cipher round. With SSE2 you need _mm_unpacklo_epi64() (PUNPCKLQDQ) and _mm_unpackhi_epi64 (PUNPCKHQDQ) anyway to de-interleave the (x, y), and similarly for interleaving them again at the end. So it should be easy to relabel the registers such that the word order is (y, x). Also since the byte order is now little endian the byte swap via _mm_shuffle_epi8(), i.e. PSHUFB, isn't needed anymore which would make the new convention faster actually.

Also just to clarify, the above linked LKML thread is for v1 of the patchset which used the (x, y) word order. But in v2 (https://www.mail-archive.com/linux-crypto@vger.kernel.org/msg30925.html) it has been changed to (y, x) since that's what the Speck designers said is the intended order.

@noloader

This comment has been minimized.

Collaborator

noloader commented Feb 13, 2018

@ebiggers,

Why would it be slower? The data only needs to be packed/unpacked once, not once per cipher round

My bad... I was referring to an adapter class to frame the data. That would accommodate both behaviors with minimal changes. Let me get that added above.

The other options are (1) add a second implementation so both behaviors are stand alone and perform at speed, (2) provide only one implementation and align with the kernel's implementation by modifying the existing implementation. A variation on (2) is, provide the adapter class on top of the updated Speck implementation so the old behavior is still available through the adapter but slower.

I'm learning towards (2). We'll modify the Crypto++ implementation to align with the kernel's implementation. The modified Speck will interop and run at speed. Those who need the incorrect behavior can get it through an adapter class that runs a little slower.

Also just to clarify, the above linked LKML thread is for v1 of the patchset which used the (x, y) word order. But in v2 ... it has been changed to (y, x) since that's what the Speck designers said is the intended order.

OK, thanks. The reference has been changed.

noloader added a commit that referenced this issue Feb 14, 2018

Remove Simon and Speck ciphers (GH #585)
We recently learned our Simon and Speck implementation was wrong. The removal will stop harm until we can loop back and fix the issue.
The issue is, the paper, the test vectors and the ref-impl do not align. Each produces slightly different result. We followed the test vectors but they turned out to be wrong for the ciphers.
We have one kernel test vector but we don't have a working implementation to observe it to fix our implementation. Ugh...
@noloader

This comment has been minimized.

Collaborator

noloader commented Feb 14, 2018

Simon and Speck were removed at Commit 15b14cc61890. This is a stop-gap measure to ensure we don't do any harm. We added the following text to the Simon and Speck wiki pages:

Note: Simon and Speck were removed after Crypto++ 6.0 was released at Commit 15b14cc61890. Our implementation was wrong because we failed to follow the algorithmic description presented in the paper. Instead, we used the reference implementation and the test vectors which turned out to be incorrect. Also see Issue 585.

The original files for the Simon and Speck implementation are available in simon-speck.zip. They are in GitHub for folks who need them.

We will loop back around to this issue when we have the time to untangle things. We really need (1) a good set of test vectors; and (2) a working reference implementation. We don't have either at the moment.

We have one test vector from the Linux kernel, but our implementation failed to arrive at the known answer. We did some knob turning but failed to arrive at the answer. Plugging the kernel test vector into the reference implementation also failed to arrive at the answer.

Here is the kernel test vector provided by email. More should be available in the patch set for testmgr.c on the kernel mailing list.

        {
                .key    = "\x00\x01\x02\x03\x04\x05\x06\x07"
                          "\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
                          "\x10\x11\x12\x13\x14\x15\x16\x17"
                          "\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f",
                .klen   = 32,
                .input  = "\x70\x6f\x6f\x6e\x65\x72\x2e\x20"
                          "\x49\x6e\x20\x74\x68\x6f\x73\x65",
                .ilen   = 16,
                .result = "\x43\x8f\x18\x9c\x8d\xb4\xee\x4e"
                          "\x3e\xf5\xc0\x05\x04\x01\x09\x41",
                .rlen   = 16,
        },

@noloader noloader closed this Feb 15, 2018

@noloader noloader added bug and removed Enhancement labels Feb 15, 2018

noloader added a commit that referenced this issue Feb 15, 2018

noloader added a commit that referenced this issue Feb 18, 2018

Re-add Simon and Speck test vectors (GH #585)
Of the 200+ test vectors only 10 are semi-authentic. The ten are from the Simon and Speck paper but they had permutations applied to them so they worked with the algorithms described in the paper. The remaining 200 or so were generated with Crypto++ using straight C++ code. The library generated the test vectors because we don't have a reference implementation

noloader added a commit that referenced this issue Feb 19, 2018

Re-add Simon and Speck, enable SSE (GH #585)
This commit re-adds Simon and Speck. The commit includes C++, SSSE3 and SSE4. NEON, Aarch32 and Aarch64 are disabled at the moment.

noloader added a commit that referenced this issue Feb 19, 2018

Re-add Simon and Speck, enable NEON and Aarch64 (GH #585)
This commit re-adds Simon and Speck. The commit includes NEON, Aarch32 and Aarch64

noloader added a commit that referenced this issue Feb 19, 2018

Remove previous Simon and Speck zip file (GH #585)
The implementation was incorrect. The zip file was added to allow access for users who needed it.

noloader added a commit that referenced this issue Feb 19, 2018

noloader added a commit that referenced this issue Feb 20, 2018

Fix ODR violation in AdvancedProcessBlocks_{ARCH} (GH #585)
The ALTIVEC function required an inline declaration. Lack of inline caused the self test failure. Two NEON functions needed the same. We also cleaned up constants in unnamed namespaces
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment