Fault mitigation #5646

jenswi-linaro · 2022-11-14T08:25:24Z

This is #5247 rebased and updated.

The use case is to make buf_ta_open() more resilient to fault injection attacks on the hardware, more specifically glitching attacks. The basic assumption is that a glitch can cause one or more instructions in sequence to act as nops when executed. Anything can happen, of course, but I'm using this approximation to have something more concrete to work with.
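
To make the "instructions act as nops" model concrete, here is a minimal sketch in C. It is not code from this PR: the names `check_once`, `check_twice` and `SKETCH_OK` are made up for illustration, and how a given check actually compiles depends on the compiler and target.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical success value: harder to produce by accident than 0. */
#define SKETCH_OK 0x5aa5c33cU

/*
 * Fragile under the nop model: success is decided by a single comparison,
 * so skipping that one decision may be enough to reach "return 0".
 */
static int check_once(const uint8_t *a, const uint8_t *b, size_t n)
{
	if (memcmp(a, b, n))
		return -1;
	return 0;
}

/*
 * Somewhat more robust: the comparison result is kept in two volatile
 * variables and both are tested, so one skipped branch is not sufficient.
 * A real implementation must also keep the compiler from merging the two
 * calls, which this sketch does not attempt.
 */
static uint32_t check_twice(const uint8_t *a, const uint8_t *b, size_t n)
{
	volatile int r1 = memcmp(a, b, n);
	volatile int r2 = memcmp(a, b, n);

	if (r1 != 0)
		return 0;
	if (r2 != 0)
		return 0;
	return SKETCH_OK;
}
```

The second variant only raises the bar for a single glitch; it says nothing about how the routines in this PR are implemented.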

The fault mitigations are supposed to add zero overhead unless enabled and if enabled have a modest overhead.

In this PR I've made fault mitigations enabled by default with CFG_CORE_FAULT_MITIGATION?=y in mk/config.mk. Should it be disabled by default instead?

ldts · 2022-11-18T06:04:26Z

@jenswi-linaro are there any performance tests you plan to run? I have several boards (xilinx, nxp, stm32...) that I could use if you need help. Not that performance is more critical than security in this case, so it may be irrelevant.

ldts · 2022-11-18T08:26:52Z

lib/libutils/ext/include/fault_mitigation.h

+ *	This is implicit since we're normally trying to protect things post
+ *	boot and booting takes quite some time.
+ *
+ * [1] https://www.riscure.com/uploads/2020/05/Riscure_Whitepaper_Fault_Mitigation_Patterns_final.pdf


@jenswi-linaro the link is not visible.

There is a copy at https://web.archive.org/web/20220616035354/https://www.riscure.com/uploads/2020/05/Riscure_Whitepaper_Fault_Mitigation_Patterns_final.pdf perhaps we could use that link?

Thanks, I'll update.

jenswi-linaro · 2022-11-18T10:20:30Z

@jenswi-linaro are there any performance tests you plan to run? I have several boards (xilinx, nxp, stm32...) that I could use if you need help. Not that performance is more critical than security in this case, so it may be irrelevant.

@ldts, thanks for the offer. I haven't planned any performance testing because I believe the added overhead will not be noticeable combined with signature verification. That said, if you notice anything odd please let me know.

jenswi-linaro · 2022-11-21T08:44:18Z

Rebased to resolve a merge conflict.

etienne-lms

minor things in commit "Basic fault mitigation routines"

etienne-lms · 2022-11-19T04:33:09Z

lib/libutils/ext/include/fault_mitigation.h

+ * While a function is executed it can update its state as a way of keeping
+ * track of important passages inside the function. Before the function
+ * returns with for instance ftmn_return_res() to check that the
+ * accumulated state matches the expected state.


Sentence sounds strange.

| (...). ~~Before~~ When the function
| * returns with for instance ftmn_return_res() ~~to~~ it is checked that the
| * accumulated state matches the expected state.

?

etienne-lms · 2022-11-19T04:40:27Z

lib/libutils/ext/include/fault_mitigation.h

+ * FTMN_PANIC() - FTMN specific panic function
+ *
+ * This function is called whenever the FTMN function detects an
+ * inconsistency.  An inconsistency is able to occur if the system is


2 space chars

etienne-lms · 2022-11-19T04:41:20Z

lib/libutils/ext/include/fault_mitigation.h

+ *
+ * This function is called whenever the FTMN function detects an
+ * inconsistency.  An inconsistency is able to occur if the system is
+ * subject to an fault injection attack, in this case doing a panic() isn't


etienne-lms · 2022-11-19T04:43:39Z

lib/libutils/ext/include/fault_mitigation.h

+#ifdef __KERNEL__
+#define FTMN_PANIC()	panic();
+#else
+#define FTMN_PANIC()	TEE_Panic(0);


A specific ID could be helpful when printed in generic core debug trace

Personally, I never look at the TEE_Panic ID. Do you have any particular ID in mind?

TEE_ERROR_SECURITY or TEE_ERROR_BAD_STATE could be good candidates.
But I admit it will not be obvious that it comes specifically from the fault injection countermeasure.

The only occasion when I have found specific values to be helpful in TEE_Panic() is when running negative tests in CI, when the xtest output is visible but not the secure world log. Not very applicable to this case I suppose. But FWIW either way is OK with me (0 or something else).

I don't care which we take either. @etienne-lms which do you prefer?

Not willing to track glitch attacks on CI runs? :)
Ok, 0 is fine.

etienne-lms · 2022-11-19T04:48:39Z

lib/libutils/ext/include/fault_mitigation.h

+static inline void __ftmn_callee_done(struct ftmn_func_arg *arg,
+				      unsigned long my_hash, unsigned long res)
+{
+	if (IS_ENABLED(CFG_CORE_FAULT_MITIGATION))


Any reason for not testing arg here, unlike the neighbouring functions?

Well spotted, I'll fix.
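
The guard being discussed can be pictured roughly as below. This is a minimal sketch under stated assumptions: `SKETCH_FAULT_MITIGATION` stands in for the build option, and `struct sketch_func_arg` and `sketch_callee_done()` are simplified, hypothetical stand-ins rather than the definitions in fault_mitigation.h.

```c
#include <stddef.h>

/* Hypothetical stand-in for the build option: 1 = enabled, 0 = disabled. */
#ifndef SKETCH_FAULT_MITIGATION
#define SKETCH_FAULT_MITIGATION 1
#endif

/* Hypothetical, simplified state shared between caller and callee. */
struct sketch_func_arg {
	unsigned long hash;
	unsigned long res;
};

/*
 * Record the callee's result. The constant part of the condition lets the
 * compiler drop the whole body when the feature is disabled, and the NULL
 * test covers callers that did not supply any state.
 */
static inline void sketch_callee_done(struct sketch_func_arg *arg,
				      unsigned long my_hash,
				      unsigned long res)
{
	if (SKETCH_FAULT_MITIGATION && arg) {
		arg->hash ^= my_hash;
		arg->res = arg->hash ^ res;
	}
}
```

With the pointer test in place, passing NULL is simply a no-op instead of a crash.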

etienne-lms · 2022-11-22T10:49:27Z

lib/libutils/ext/include/fault_mitigation.h

+ *
+ * The passed result will be stored in the struct ftmn_func_arg struct
+ * supplied by the caller. This function can be called any number of times
+ * by the callee, provided that one of the FTMN_CALLEE_DONE_XXX() function has


s/function/functions/

etienne-lms · 2022-11-22T10:50:44Z

lib/libutils/ext/include/fault_mitigation.h

+ * __ftmn_get_tsd_func_arg().
+ *
+ * The FTMN_CALLE_* functions only work with the real function name so the
+ * old hash must be removed and replaces with the new for the calling


s/replaces/replaced/

etienne-lms · 2022-11-22T10:51:24Z

lib/libutils/ext/include/fault_mitigation.h

+			       (my_old_hash), FTMN_FUNC_HASH(__func__))
+
+/*
+ * FTMN_SET_CHECK_RES() - records a result in local checked state


s/record/Record/

Ditto for a few macros below.

etienne-lms · 2022-11-22T10:55:34Z

core/tests/ftmn_boot_tests.c

+#include <fault_mitigation.h>
+#include <initcall.h>
+#include <kernel/thread.h>
+#include <types_ext.h>


etienne-lms · 2022-11-22T10:59:35Z

core/kernel/ree_fs_ta.c

@@ -54,6 +54,7 @@
 #include <tee/tee_ta_enc_manager.h>
 #include <tee/uuid.h>
 #include <utee_defines.h>
+#include <fault_mitigation.h>


jenswi-linaro · 2022-11-23T10:31:31Z

Addressed all comments except the one concerning TEE_Panic(0).

jforissier

LGTM. I find the modified code a bit difficult to read unfortunately, but I suppose there is not much that can be done. The impact could have been much worse ;)

A couple of comments below, then for the whole:

Acked-by: Jerome Forissier <jerome.forissier@linaro.org>

jforissier · 2022-11-24T13:30:45Z

lib/libutils/ext/include/fault_mitigation.h

+
+typedef int (*ftmn_memcmp_t)(const void *p1, const void *p2, size_t nb);
+
+/* The default hash used when xoring the result in struct ftmn_check */


How are these values chosen? More or less randomly I suppose? Any particular property they'd better have?

No, just random.
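
One way to read "the default hash used when xoring the result" is sketched below. The constant, the struct and both helpers are hypothetical and chosen only for this illustration; the point is that a never-written (all zero) state does not decode as the success value 0.

```c
#include <stdbool.h>

/* Arbitrary non-zero constant for this sketch; no special property needed. */
#define SKETCH_DEFAULT_HASH 0x9c478bf6UL

/* Hypothetical checked state: the result is stored XOR-ed with the hash. */
struct sketch_check {
	unsigned long res;
};

/* Store a result in masked form. */
static void sketch_set_check_res(struct sketch_check *c, unsigned long res)
{
	c->res = SKETCH_DEFAULT_HASH ^ res;
}

/* Unmask the stored value and compare it with the expected result. */
static bool sketch_expect_res(const struct sketch_check *c,
			      unsigned long expected)
{
	return (c->res ^ SKETCH_DEFAULT_HASH) == expected;
}
```

If the store is skipped by a glitch, a zero-initialized struct fails the check for an expected result of 0, which a plain unmasked field would not.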

jforissier · 2022-11-24T13:46:18Z

lib/libutils/ext/include/fault_mitigation.h

+{
+#if defined(CFG_CORE_FAULT_MITIGATION) && defined(__KERNEL__)
+	return &thread_get_tsd()->ftmn_arg;
+#elif defined(CFG_CORE_FAULT_MITIGATION)


CFG_CORE should not be used for user-space features IMO. How about CFG_FAULT_MITIGATION?

Good point.

Adds basic fault mitigation routines designed to help protecting from fault injection attacks on the hardware. This is by no means bullet proof, but it should at least improve the situation. These routines focus on verifying that a function has been called and that the returned value matches the result from the function. This is done by having a handshake between the caller and the callee where also the return value is transmitted in a separate channel. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>

Adds some simple test for the fault mitigation routines. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>

Adds fault mitigation in mbedtls_rsa_rsassa_pss_verify_ext() by using the macro FTMN_CALLEE_DONE_MEMCMP() instead of memcmp() when checking that the hash in the RSA signature is matching the expected value. FTMN_CALLEE_DONE_MEMCMP() saves on success the result in a thread local storage if fault mitigations was enabled when the function was called. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>

…fy() Adds fault mitigation in mbedtls_rsa_rsassa_pkcs1_v15_verify() by using the macro FTMN_CALLEE_DONE_MEMCMP() instead of just mbedtls_safer_memcmp() when checking that the hash in the RSA signature is matching the expected value. FTMN_CALLEE_DONE_MEMCMP() saves on success the result in a thread local storage if fault mitigations was enabled when the function was called. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>

Adds fault mitigations in crypto_acipher_rsassa_verify() by checking that the internal call to memcmp() when verifying the hash in the RSA signature was called and was successful. The internal call to memcmp() records the result of the comparison if successful. This is double checked against the normal return value from the called pk_info->verify_func(). If the normal return value is OK then the recorded return value must match or we're likely subject to a fault injection attack and we're triggering a panic. If the normal return value isn't OK we don't care about the recorded value, it's overridden by a new error code. In this case we don't know if we're subject to a fault injection attack or not, the important thing is to make sure that the calling function doesn't miss the error. This fault mitigation is only enabled when the calling function has enabled fault mitigations and CFG_CORE_FAULT_MITIGATION is 'y'. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>

Adds fault mitigations in crypto_acipher_rsassa_verify() and dependent functions in libTomCrypt in order to include the critical final memcompare. This fault mitigation is only enabled when the calling function has enabled fault mitigations and CFG_CORE_FAULT_MITIGATION is 'y'. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>

…a_verify() Adds a stubbed fault mitigation for the drivers version of crypto_acipher_rsassa_verify(). End the function with FTMN_CALLEE_DONE() to record that the function was indeed called and a redundant copy of the return value. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>

Adds fault mitigations to shdr_verify_signature() and shdr_verify_signature2(). shdr_verify_signature() and shdr_verify_signature2() are called using the wrapper FTMN_CALL_FUNC() which verifies that the correct function was called and that the return value hasn't been tampered with. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>

Adds and enables fault mitigation in buf_ta_open() to check both the signature of the TA and then also the hash of the TA before returning success. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>

Adds and enables fault mitigation in ree_fs_ta_open() to check the signature of the TA before returning success. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
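
The caller/callee handshake described in the commit messages above can be sketched as follows. This is an illustration under stated assumptions, not the FTMN implementation: all `sketch_` names and both hash constants are hypothetical, and abort() stands in for the panic.

```c
#include <stdlib.h>

/* Hypothetical identifiers for the caller and the callee. */
#define SKETCH_HASH_CALLER 0x6d3a91e5UL
#define SKETCH_HASH_VERIFY 0x2f1b77d4UL

/* Hypothetical side channel shared by caller and callee. */
struct sketch_arg {
	unsigned long hash;
	unsigned long res;
};

/* Callee: returns its result normally and records a masked copy of it. */
static unsigned long sketch_verify(struct sketch_arg *arg, int sig_ok)
{
	unsigned long res = sig_ok ? 0 : 1;

	if (arg) {
		arg->hash ^= SKETCH_HASH_VERIFY;
		arg->res = arg->hash ^ res;
	}
	return res;
}

/* Caller: on success, cross-check the return value with the recorded copy. */
static unsigned long sketch_call_verify(int sig_ok)
{
	struct sketch_arg arg = { .hash = SKETCH_HASH_CALLER, .res = 0 };
	unsigned long res = sketch_verify(&arg, sig_ok);

	if (res == 0) {
		/* Proves that the expected callee ran and marked the state. */
		if (arg.hash != (SKETCH_HASH_CALLER ^ SKETCH_HASH_VERIFY))
			abort();
		/* Proves that the recorded result agrees with the return value. */
		if ((arg.res ^ arg.hash) != res)
			abort();
	}
	/* On error the recorded value is ignored; the error is propagated. */
	return res;
}
```

A success that arrives only through the normal return value, without the matching recorded state, is treated as an attack, which mirrors the rule stated for crypto_acipher_rsassa_verify().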

jenswi-linaro · 2022-11-24T15:53:00Z

Squashed and tags applied.

etienne-lms · 2022-11-25T07:37:50Z

LGTM.
Acked-by: Etienne Carriere <etienne.carriere@linaro.org>

jenswi-linaro · 2022-11-25T08:10:41Z

Tag applied.

rejoiceliberty · 2023-01-11T12:11:13Z

core/crypto/signed_hdr.c

 	size_t hash_size = 0;
 	size_t hash_algo = 0;

 	if (shdr->magic != SHDR_MAGIC)
-		return TEE_ERROR_SECURITY;
+		goto err;


I changed this to 'goto err' on OP-TEE 3.14 and ran 'xtest -t regression 1008' and encountered an 'unhandled pageable abort' error. After analysis, I think it is because crypto_acipher_alloc_rsa_public_key is not executed but crypto_acipher_free_rsa_public_key is executed. Not sure if the latest version is OK.

Looking at the (latest) implementation, I think it currently works fine as key is 0 initialized and all the possible public key free callbacks (ltc, mbedtls, versal, caam and se050) behave nicely in such a context. Yet, I may have missed something and, all in all, it looks a bit fragile. I think you're right: the shdr_verify_signature() error exit sequence should not always call crypto_acipher_free_rsa_public_key().

The error is really because the key was not initialized to 0 on the old version. After initialization, all xtests passed. Thanks!

For info: looking into struct rsa_keypair management, a few other places assume it is safe to call crypto_acipher_free_rsa_public_key() on a key ref that was zero initialized and not allocated or only partially allocated, for example sw_crypto_acipher_alloc_rsa_keypair() in lib/libtomcrypt/rsa.c and lib/libmbedtls/core/rsa.c. So, contrary to my comment above, the current implementation is fine.
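
The pattern discussed in this sub-thread, a single error label that frees a key which may never have been allocated, can be sketched as below. The key type, the magic value and all function names are hypothetical simplifications; the sketch only shows why zero initialization makes the unconditional free safe.

```c
#include <stdlib.h>

/* Hypothetical magic value for this sketch. */
#define SKETCH_MAGIC 0x1234abcdU

/* Hypothetical, simplified public key with two heap-allocated parts. */
struct sketch_pubkey {
	unsigned char *e;
	unsigned char *n;
};

static int sketch_alloc_key(struct sketch_pubkey *k, size_t len)
{
	k->e = calloc(1, len);
	k->n = calloc(1, len);
	return (k->e && k->n) ? 0 : -1;
}

/* Safe on a zeroed or partially allocated key: free(NULL) is a no-op. */
static void sketch_free_key(struct sketch_pubkey *k)
{
	free(k->e);
	free(k->n);
	k->e = NULL;
	k->n = NULL;
}

static int sketch_verify_hdr(unsigned int magic)
{
	struct sketch_pubkey key = { 0 }; /* the initialization that matters */
	int res = -1;

	if (magic != SKETCH_MAGIC)
		goto err; /* key was never allocated */
	if (sketch_alloc_key(&key, 16))
		goto err; /* key may be partially allocated */
	res = 0;
err:
	sketch_free_key(&key);
	return res;
}
```

Without the `{ 0 }` initializer the early `goto err` would hand uninitialized pointers to free(), which is the kind of failure reported above for the older version.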

ldts reviewed Nov 18, 2022

jenswi-linaro force-pushed the fault_mitigation branch from aba492b to af7b7bf Compare November 21, 2022 08:43

etienne-lms reviewed Nov 22, 2022

jforissier reviewed Nov 24, 2022

jenswi-linaro added 10 commits November 24, 2022 16:47

core: add fault mitigation tests

6497acc

Adds some simple test for the fault mitigation routines. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>

core: add fault mitigations in ree_fs_ta_open()

705df78

Adds and enables fault mitigation in ree_fs_ta_open() to check the signature of the TA before returning success. Acked-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>

jenswi-linaro force-pushed the fault_mitigation branch from 6a93132 to 705df78 Compare November 24, 2022 15:52

jforissier merged commit 2d7720f into OP-TEE:master Nov 25, 2022

jenswi-linaro deleted the fault_mitigation branch November 25, 2022 11:14

rejoiceliberty reviewed Jan 11, 2023
