-
Notifications
You must be signed in to change notification settings - Fork 180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PAL/Linux-SGX] Re-init attestation key after AESM restarted #1881
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 6 files at r1, all commit messages.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 6 files at r1, all commit messages.
Reviewable status: all files reviewed, 8 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @dimakuv)
Suggestion:
Re-init attestation key after an AESM restart
-- commits
line 15 at r1:
ditto later in this sentence
Suggestion:
sending
pal/src/host/linux-sgx/enclave_ocalls.h
line 136 at r1 (raw file):
* * \param is_epid True if EPID is used, false if DCAP/ECDSA is used. * \param[out] qe_targetinfo Retrieved Quoting Enclave's target info; buffer must be submitted
I'd drop this note altogether, isn't it obvious? It's a single-level pointer, so there's no way for the function to allocate a buffer by itself and return it.
Suggestion:
buffer must be provided
pal/src/host/linux-sgx/enclave_ocalls.h
line 141 at r1 (raw file):
* \returns 0 on success, negative Linux error code otherwise. * * The obtained QE Target Info is not validated in any way (i.e., this function does not check
Suggestion:
this function does not check whether
pal/src/host/linux-sgx/enclave_ocalls.h
line 144 at r1 (raw file):
* Target Info contents make sense). */ int ocall_get_qe_targetinfo(bool is_epid, sgx_target_info_t* qe_targetinfo);
Double checking if I understand the security model here correctly: qe_targetinfo is completely untrusted, and thus we can't assume that we're talking directly to a real QE (we can be talking to a fake one, or be MitM-ed).
This is not a problem, because only the authentic QE has access to the keys to generate a valid quote (so, the fake QE can't), and our report targets whatever we got in qe_targetinfo
and the MitMer can't re-encrypt it and then send it to a real QE (due to how EREPORT works).
Is this right?
pal/src/host/linux-sgx/enclave_ocalls.c
line 2199 at r1 (raw file):
if (!ocall_qe_ti_args) { sgx_reset_ustack(old_ustack); return -ENOMEM;
Almost all other places return -EPERM
in this case.
pal/src/host/linux-sgx/enclave_platform.c
line 8 at r1 (raw file):
/* this function assumes proper synchronization by callers, such that * g_pal_linuxsgx_state.qe_targetinfo is not racey */
This should be documented in the header, not here. Also, it's not only about calling this function multiple times concurrently, but also about anything else which could be reading g_pal_linuxsgx_state.qe_targetinfo
at the same time.
pal/src/host/linux-sgx/enclave_platform.c
line 8 at r1 (raw file):
/* this function assumes proper synchronization by callers, such that * g_pal_linuxsgx_state.qe_targetinfo is not racey */
Do we actually have that synchronization in place? Also, it's quite a surprising/risky interface. Why not just use a lock?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 7 files reviewed, 8 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @llly, and @mkow)
-- commits
line 2 at r1:
Will do during final rebase
Previously, mkow (Michał Kowalczyk) wrote…
ditto later in this sentence
Will do during final rebase
pal/src/host/linux-sgx/enclave_ocalls.h
line 136 at r1 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
I'd drop this note altogether, isn't it obvious? It's a single-level pointer, so there's no way for the function to allocate a buffer by itself and return it.
Done.
pal/src/host/linux-sgx/enclave_ocalls.h
line 141 at r1 (raw file):
* \returns 0 on success, negative Linux error code otherwise. * * The obtained QE Target Info is not validated in any way (i.e., this function does not check
Done.
pal/src/host/linux-sgx/enclave_ocalls.h
line 144 at r1 (raw file):
This is correct. Though one of your phrases is vague:
the MitMer can't re-encrypt it and then send it to a real QE
What do you mean by re-encrypt? The MitMer can't replace the measurements in the intercepted SGX Report (like MRENCLAVE), because then the MAC of the SGX Report won't correspond to the overridden values. And the MitMer cannot generate the new MAC because she doesn't have the corresponding SGX key (with which the SGX Report was signed). And if the MitMer would call EREPORT instruction, then she won't be able to use victim-enclave measurements, because EREPORT + EINIT disallow this.
pal/src/host/linux-sgx/enclave_ocalls.c
line 2199 at r1 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Almost all other places return
-EPERM
in this case.
Done, though this doesn't sound right. But here it's not important which error code we return, so whatever.
pal/src/host/linux-sgx/enclave_platform.c
line 8 at r1 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Do we actually have that synchronization in place? Also, it's quite a surprising/risky interface. Why not just use a lock?
We always had this assumption:
gramine/libos/src/fs/dev/attestation.c
Lines 13 to 15 in 929bb9d
* This pseudo-FS interface is not thread-safe. It is the responsibility of the application to | |
* correctly synchronize concurrent accesses to the pseudo-files. We expect attestation flows to | |
* be generally single-threaded and therefore do not introduce synchronization here. |
Even if we would need a lock, it shouldn't be at this level, but rather inside LibOS.
pal/src/host/linux-sgx/enclave_platform.c
line 8 at r1 (raw file):
Done.
it's not only about calling this function multiple times concurrently, but also about anything else which could be reading
g_pal_linuxsgx_state.qe_targetinfo
at the same time
Yes, but the only function that operates on this global var's field is this one (sgx_get_quote()
). Do you want me to add something additionally?
Jenkins, test this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 4 files at r2, all commit messages.
Reviewable status: 6 of 7 files reviewed, 4 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @llly)
pal/src/host/linux-sgx/enclave_platform.c
line 8 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Done.
it's not only about calling this function multiple times concurrently, but also about anything else which could be reading
g_pal_linuxsgx_state.qe_targetinfo
at the same timeYes, but the only function that operates on this global var's field is this one (
sgx_get_quote()
). Do you want me to add something additionally?
Yes. Right now that line is:
sgx_target_info_t qe_targetinfo; /* received from untrusted host, use carefully */
This comment sounds like this field was received from the host and was constant afterwards. I can easily see how we could add some functionality using this and not notice that it can change.
Actually, maybe it would be cleaner if we moved that field from struct pal_linuxsgx_state
to a private variable in enclave_platform.c
and initialize in on ECALL_ENCLAVE_START
? (via calling something like initialize_quoting()
which would copy the enclave start argument into that private, global variable)
It should be obvious then that this variable is only for the use of sgx_quote()
and would be much harder to misuse.
This way you could also lock accesses to it properly and remove that tricky assumption from sgx_get_quote()
.
pal/src/host/linux-sgx/enclave_platform.c
line 8 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
We always had this assumption:
gramine/libos/src/fs/dev/attestation.c
Lines 13 to 15 in 929bb9d
* This pseudo-FS interface is not thread-safe. It is the responsibility of the application to * correctly synchronize concurrent accesses to the pseudo-files. We expect attestation flows to * be generally single-threaded and therefore do not introduce synchronization here. Even if we would need a lock, it shouldn't be at this level, but rather inside LibOS.
See above. I think it's obvious from the comment you quoted that e.g. setting /dev/attestation/user_report_data
and reading /dev/attestation/quote
cannot be concurrent, but it would still be surprising to me that I cannot read /dev/attestation/quote
concurrently (because it's just reading).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 5 of 9 files reviewed, 4 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @llly and @mkow)
pal/src/host/linux-sgx/enclave_platform.c
line 8 at r1 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Yes. Right now that line is:
sgx_target_info_t qe_targetinfo; /* received from untrusted host, use carefully */This comment sounds like this field was received from the host and was constant afterwards. I can easily see how we could add some functionality using this and not notice that it can change.
Actually, maybe it would be cleaner if we moved that field from
struct pal_linuxsgx_state
to a private variable inenclave_platform.c
and initialize in onECALL_ENCLAVE_START
? (via calling something likeinitialize_quoting()
which would copy the enclave start argument into that private, global variable)
It should be obvious then that this variable is only for the use ofsgx_quote()
and would be much harder to misuse.
This way you could also lock accesses to it properly and remove that tricky assumption fromsgx_get_quote()
.
Done.
pal/src/host/linux-sgx/enclave_platform.c
line 8 at r1 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
See above. I think it's obvious from the comment you quoted that e.g. setting
/dev/attestation/user_report_data
and reading/dev/attestation/quote
cannot be concurrent, but it would still be surprising to me that I cannot read/dev/attestation/quote
concurrently (because it's just reading).
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r3, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @dimakuv)
pal/src/host/linux-sgx/enclave_platform.c
line 14 at r3 (raw file):
int init_qe_targetinfo(void* uptr_qe_targetinfo) { /* no need to acquire g_qe_targetinfo_lock at this point since there is a single thread */
Suggestion:
since there is only a single thread */
pal/src/host/linux-sgx/enclave_platform.c
line 34 at r3 (raw file):
if (retries) { /* new attempt, need to update QE target info */ ret = ocall_get_qe_targetinfo(/*is_epid=*/!!spid, &g_qe_targetinfo);
nitpick: you could OCALL into a local variable, and then only memcpy
under the spinlock
...which you actually do anyways, below, so just swap these two :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 8 of 9 files reviewed, 4 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @mkow)
pal/src/host/linux-sgx/enclave_platform.c
line 14 at r3 (raw file):
int init_qe_targetinfo(void* uptr_qe_targetinfo) { /* no need to acquire g_qe_targetinfo_lock at this point since there is a single thread */
Done.
pal/src/host/linux-sgx/enclave_platform.c
line 34 at r3 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
nitpick: you could OCALL into a local variable, and then only
memcpy
under the spinlock
...which you actually do anyways, below, so just swap these two :)
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions, "fixup! " found in commit messages' one-liners (waiting on @dimakuv)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 6 files at r1, 2 of 4 files at r2, 3 of 4 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions, "fixup! " found in commit messages' one-liners (waiting on @dimakuv)
Gramine uses the AESM daemon to request SGX Quotes from the Quoting Enclave. AESM runs in the background and may be restarted. For some reason, after a restart AESM "forgets" the platform's attestation key and requires an additional INIT_QUOTE_REQUEST before it is able to generate SGX Quotes again. Moreover, the Quoting Enclave's Target Info may change at this point, so Gramine must update its cached QE Target Info. This behavior is applicable only to DCAP attestation and is documented in Intel SGX PSW/DCAP software. Gramine didn't implement this behavior and thus failed with "AESM service returned error 42". This commit complies with the SGX PSW/DCAP documentation and adds the sending of INIT_QUOTE_REQUEST + updating QE Target Info + retrying the SGX quote retrieval. Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@intel.com>
d6e6bb9
to
8a804e5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @mkow)
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Will do during final rebase
Done.
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Will do during final rebase
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @mkow)
a discussion (no related file):
After final rebase, tested this PR one last time on a DCAP system -- everything works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @mkow)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved
Description of the changes
Gramine uses the AESM daemon to request SGX Quotes from the Quoting Enclave. AESM runs in the background and may be restarted. For some reason, after a restart AESM "forgets" the platform's attestation key and requires an additional INIT_QUOTE_REQUEST before it is able to generate SGX Quotes again. Moreover, the Quoting Enclave's Target Info may change at this point, so Gramine must update its cached QE Target Info. This behavior is applicable only to DCAP attestation and is documented in Intel SGX PSW/DCAP software. Gramine didn't implement this behavior and thus failed with "AESM service returned error 42".
This PR complies with the SGX PSW/DCAP documentation and adds the send of INIT_QUOTE_REQUEST + update of QE Target Info + retry of SGX quote retrieval.
This is a re-creation of #1877, with a better design. Closes #1877.
How to test this PR?
Change
CI-Examples/python/scripts/sgx-quote.py
to this:Apply the above Python script, run it in one terminal, and in another terminal do
sudo systemctl restart aesmd
. Without this PR, Gramine will fail. With this PR, Gramine will continue execution of the endless loop.This change is