-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Silicon Labs: Add cryptographic acceleration support #4825
Conversation
Why is AES-192 much faster than AES-128? upd: I see, it's unsupported ( |
@amq Correct. Sadly, the benchmark application doesn't actually check the validity of the operation. That's why one needs to run the mbed OS tests for mbed TLS in addition (though not a bad idea regardless). |
@stevew817 The travis errors were resolved in #4839. Could you rebase so that travis can pass? |
@stevew817 Thanks for the PR. We use the PR titles in release notes, which are formatted with markdown. The |
Initial commit of mbed TLS hardware acceleration drivers for Silicon Labs parts
18e0947
to
1e51dfa
Compare
@theotherjimmy Rebased and renamed. |
Excelent. Let's do some reviewing! |
PR is ready for review @RonEld @yanesca @sbutcher-arm @andresag01 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is a mixture of several features and topics. It would have been better if this PR was split to several PRs
There are some recurring comments on the critical section, that probably only needs expanation, but there are other comments that need your addressing.
@hanno-arm please review as well
|
||
#include "em_device.h" | ||
|
||
#if defined(CRYPTO_PRESENT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this Macro is undefined, you will probably have link error, because you will have undefined functions - the aes functions will not be compiled in mbedtls/src/aes.c because of the MBEDTLS_AES_ALT
, and they will not be compiled in this file because of this macro.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MBEDTLS_AES_ALT is only defined when CRYPTO_PRESENT is defined (see mbedtls_device.h)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stevew817: I think you are correct, but given that the dependency is MBEDTLS_AES_ALT on CRYPTO_PRESENT and not the other way around (as in mbedtls_device.h
) I think that this preprocessor check should be around #if defined(MBEDTLS_AES_ALT)
and not the other way around. At the moment this is not an issue, but if in the future this changes, then it is much easier to make a mistake...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at mbedtls_device.h
I would think that MBEDTLS_AES_ALT
is also defined when AES_PRESENT
is defined. And if I understand correctly in this caseCRYPTO_PRESENT
can't be defined.
Could you please help me understand
- why are the contents of
crypto_aes.c
protected by#if defined(CRYPTO_PRESENT)
guards, but the contents ofaes_aes.c
are not protected by#if defined(AES_PRESENT)
macros? - why are
mbedtls_aes_decrypt()
andmbedtls_aes_encrypt()
implemented inaes_aes.c
but not incrypto_aes.c
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- aes_aes.c is protected by AES_PRESENT (line 28)
- mbedtls_aes_encrypt and _decrypt not being present in crypto_aes was an oversight, fixed.
{ | ||
uint32_t temp[4]; | ||
CRYPTO_DataRead(reg, temp); | ||
memcpy((void*)val, temp, 16); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is better to do sizeof(temp)
rather than 16.
if ((uint32_t)val & 0x3) | ||
{ | ||
uint32_t temp[4]; | ||
memcpy(temp, val, 16); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is better to do sizeof(temp)
rather than 16.
Can you assure that val
is of length 16?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. If you look through the usages, all calls to DataWrite/DataRead happen with buffers that are checked to be 16 bytes. They have to, since the Data registers are 128 bits, and don't accept partial reads/writes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
|
||
if ( ( 128UL != keybits ) && ( 256UL != keybits ) ) { | ||
/* Unsupported key size */ | ||
return( MBEDTLS_ERR_AES_INVALID_KEY_LENGTH ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keep in mind that once Mbed-TLS/mbedtls#964 will be approved and merged, this error should probably be changed to MBEDTLS_ERR_AES_FEATURE_UNAVAILABLE
for 192 bit size key.
But since the PR is not merged and fully approved, it is not final yet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering the PR hasn't made it in, I can't return the new error code in my port...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course, this comment was only an FYI, so will know that you will probably need to change this in the future
device->CMD = CRYPTO_CMD_INSTR_AESENC; | ||
while ((device->STATUS & CRYPTO_STATUS_INSTRRUNNING) != 0); | ||
|
||
CORE_ENTER_CRITICAL(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this is a critical section.
What resource are you protecting? Is it device? if yes, then you should protect also the device->WAC = 0
and device->CTRL = 0
a few lines before
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The resource we are protecting with critical sections is writing to or reading from 128-bit, 256-bit or 512-bit registers in our accelerator. Since that operation is done with actionable reads/writes from/to a single 32-bit register, they have to be protected against preemption.
The other registers (such as WAC and CTRL) are regular registers, which we get control over by acquiring the device. In the case of preemption, these values will be backed up and put back by the preempting code, so their use is safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok , thanks
crypto->SEQCTRLB = 0; | ||
|
||
/* Put the state back */ | ||
CORE_ENTER_CRITICAL(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment on the critical section
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as before on the consecutive reads/writes needed to load a full 128-bit value
size_t blocks = ilen / 64; | ||
crypto_sha_update_state( ctx->state, input, blocks, CRYPTO_SHA2 ); | ||
input += blocks * 64; | ||
ilen -= blocks * 64; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't ctx->total[0] be decreased with what was processed already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ctx->total is an upcounter to count the amount of SHA'd bytes, and it is updated with the total amount of data passed to this function (ilen) earlier in this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have you tested a use case of Doing twice update of 65 bytes, and then finish?
I think I understand what you did.
Isn't it better to have only one counter, that will only tell the size of the "residue buffer"? Once this "counter" will reach 64, you will call crypto_sha_update_state
, and set this residue buffer size to zero
mbedtls_sha1_process( ctx, ctx->buffer ); | ||
input += fill; | ||
ilen -= fill; | ||
left = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't ctx->total[0] be decreased with what was processed already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ctx->total is an upcounter to count the amount of SHA'd bytes, and it is updated with the total amount of data passed to this function (ilen) earlier in this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I understand what you did.
Isn't it better to have only one counter, that will only tell the size of the "residue buffer"? Once this "counter" will reach 64, you will call crypto_sha_update_state
, and set this residue buffer size to zero
* \param ctx SHA-1 context | ||
* \param output SHA-1 checksum result | ||
*/ | ||
void mbedtls_sha1_finish( mbedtls_sha1_context *ctx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is very much similar as mbedtls_sha256_finish
.
Consider combining the two, when possible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering the context struct and output length are different, I don't really see the point of adding yet another function call to the stack for maybe a few bytes of code space gain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
#define MBEDTLS_AES_ALT | ||
|
||
#define MBEDTLS_ECP_INTERNAL_ALT | ||
#define ECP_SHORTWEIERSTRASS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be defined in this file. It's configuration specific
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay.
* Use _C flags at compile time in SHA to avoid compiling in unconfigured features * Don't define ECP_SHORTWEIERSTRASS since it is part of the application's configuration
@RonEld I think I answered all of your questions and committed your feedback. |
@0xc0170 why did uvisor CI fail? |
retest uvisor |
Jenkins error, needs restart |
@0xc0170 Did the retest work? It looks to be still failing... |
@mazimkhan Can you please restart uvisor? Seems like my last comment did not |
@RonEld did I satisfy your concerns? |
@stevew817 I am still not sure why you can't set the key buffer in teh context when you set the key, and when doing the specific cryptographic operation, you set the key in the HW register |
@RonEld This only applies when setting a decryption key. For AES on our parts, the 'key' needs to be pre-processed when you are doing a decryption operation. This pre-processing is what you are seeing in setkey_dec. |
@stevew817 Thanks for the reference
Work, or will you need to load the Cipher key again between every operation? |
@RonEld We don't read back the key register after aes_crypt_cbc, and the next call will put the key from the context in the keybuf register, so the operation you mentioned will work. |
OK, thanks |
@RonEld Do I need a review from anyone else? |
@stevew817 Yes. I am not hte final gate keeper for mbed TLS |
@adbridge done. |
@mazimkhan @hanno-arm please confirm you are happy with the rework |
@adbridge: @Patater is stepping in for @mazimkhan during his paternity leave. I'll synchronize with @sbutcher-arm to see if this needs a third review with @RonEld's still present from the time before the reshuffling. |
@Patater Could you give this the final once over then please. This has been outstanding for 3 months |
/morph build |
Build : SUCCESSBuild number : 460 Triggering tests/morph test |
Test : SUCCESSBuild number : 280 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking pretty good to me; it's close. I have a number of questions and style/maintainability suggestions.
In general, the coding style is not consistent in this PR. Please consider using a consistent coding style (preferably Mbed OS style).
aes_lock(); | ||
AES_ECB128( output, | ||
input, | ||
16, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to use MBEDTLS_AES_BLOCK_SIZE
or AES_BLOCKSIZE
instead of "16" wherever applicable in this PR? (Might require making a new define in this file)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering the 16 byte size is also hardcoded in the function signature, I think it should be fine hardcoding it here as well. The blocksize is never -ever- going to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
The main reason for doing this would be readability and not because we'd want to change it in the future.
processed++; | ||
} | ||
|
||
crypto_management_release(device); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's weird that crypto_management_acquire
is called inside the else
but crypto_management_release
is called outside the else. Due to the use of continue
in the if
body, consider removing the else
or moving the crypto_management_release
inside the else
body.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you read the code wrong? _acquire and _release are both at the same level inside the else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
I now see crypto_management_release
is inside the else body.
n = ( n + 1 ) & 0x0F; | ||
processed++; | ||
continue; | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to the continue
in the if
body, consider if this else
is really needed.
Similar crypto_management_acquire
and crypto_management_release
imbalance is here as in mbedtls_aes_crypt_cfb128
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment here... Are we looking at different versions of the code? I really do see acquire on line 498, which is inside the else and on the same level as the release on line 543.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
I now see crypto_management_release
is inside the else body.
for (size_t i = 0; i < iterations; i++ ) { | ||
device->CMD = CRYPTO_CMD_INSTR_DATA1TODATA0; | ||
device->CMD = CRYPTO_CMD_INSTR_AESENC; | ||
while ((device->STATUS & CRYPTO_STATUS_INSTRRUNNING) != 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wish we had a way to avoid these busy loops.
See #5206 to track this issue.
However, even if this was addressed in Mbed OS through, for example, a new Mbed-OS-specific feature, could you make use of it? You mention this code must run on other platforms than Mbed OS, but I'd hate to see potential energy efficiency gains thrown away for the sake of portability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be a future feature best implemented using the mbedTLS threading primitives (if the api would support semaphores in addition to mutexes, since the release would happen from IRQ).
For this specific case, I doubt it would make a whole lot of sense since an AES block operation only takes about 70 clock cycles anyways. Context-switching away and back again would be more expensive. Halting the core might be an alternative.
For efficiency when using accelerators, how about adding an API for multi-block ECB operations? Setting up and tearing down the accelerator for each block is expensive.
#define ECC_BIGINT_SIZE_IN_BITS (256) | ||
#define ECC_BIGINT_SIZE_IN_BYTES (ECC_BIGINT_SIZE_IN_BITS/8) | ||
#define ECC_BIGINT_SIZE_IN_32BIT_WORDS (ECC_BIGINT_SIZE_IN_BYTES/sizeof(uint32_t)) | ||
#define EC_BIGINT_COPY(X, Y) memcpy(X, Y, sizeof(ecc_bigint_t)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please place parenthesis around all invoked macro parameters.
- #define EC_BIGINT_COPY(X, Y) memcpy(X, Y, sizeof(ecc_bigint_t));
+ #define EC_BIGINT_COPY(X, Y) memcpy((X), (Y), sizeof(ecc_bigint_t));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay
* limitations under the License. | ||
*/ | ||
|
||
#ifndef MBEDTLS_CRYPTO_MANAGEMENT_H |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use MBEDTLS
prefix here, please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay
|
||
/* Load the data block(s) */ | ||
for ( size_t i = 0; i < blocks; i++ ) { | ||
if ((uint32_t)(&data[i*64]) & 0x3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider making a #define
for 64
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Can I point out that the mbed TLS 'official' software implementation doesn't use a define for SHA blocksize, either? And the blocksize is hardcoded as part of the API?
|
||
if ( is224 != 0 ) { | ||
ctx->is224 = true; | ||
memcpy(ctx->state, init_state_sha224, 32); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider making a #define
for 32
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
targets/targets.json
Outdated
@@ -2320,7 +2320,7 @@ | |||
"inherits": ["EFM32"], | |||
"extra_labels_add": ["EFM32GG", "1024K", "SL_AES"], | |||
"core": "Cortex-M3", | |||
"macros": ["EFM32GG990F1024", "TRANSACTION_QUEUE_SIZE_SPI=4"], | |||
"macros": ["EFM32GG990F1024", "TRANSACTION_QUEUE_SIZE_SPI=4", "MBEDTLS_CONFIG_HW_SUPPORT"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we put this on EFM32
or do not all EFM32
have hardware support now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, it can be moved to the EFM32 generic target.
targets/targets.json
Outdated
@@ -2794,7 +2794,7 @@ | |||
}, | |||
"EFR32MG12P332F1024GL125": { | |||
"inherits": ["EFM32"], | |||
"extra_labels_add": ["EFR32MG12", "1024K", "SL_RAIL", "SL_CRYPTO"], | |||
"extra_labels_add": ["EFR32MG12", "1024K", "SL_RAIL", "SL_CRYPTO", "MBEDTLS_CONFIG_HW_SUPPORT"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is MBEDTLS_CONFIG_HW_SUPPORT
added to extra_labels_add
on this target instead of to macros
like on other targets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops. Fixed.
@stevew817 FYI, some new comments received. Could you please check? |
@ashok-rao @Patater Will do. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
The commits looks good but the message does not say much what is it fixing. Can you fix it? /morph build |
Build : FAILUREBuild number : 502 |
Build : SUCCESSBuild number : 507 Triggering tests/morph test |
/morph test |
Exporter Build : SUCCESSBuild number : 121 |
Test : SUCCESSBuild number : 317 |
Initial commit of mbed TLS hardware acceleration drivers for Silicon Labs parts
Description
This commit adds HW acceleration drivers for mbed TLS on Silicon Labs parts.
Hardware support:
EFM32ZG/HG/WG/GG/LG:
EFR32MG1/EFM32PG1:
EFR32MG12/EFM32PG12:
Status
READY
Todos
Deploy notes
None