Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SPE: Fix up atomic usage #9617

Merged
merged 1 commit into from Feb 19, 2019

Conversation

Projects
None yet
9 participants
@kjbracey-arm
Copy link
Contributor

commented Feb 5, 2019

Description

PSA SPE code was using atomics, but not consistently. On the assumption
that the atomics are needed, correct the code to be properly atomic.

  • Tighten up table full case - new_handle was left with a bogus value,
    and surrounding code (as a result?) used loop index to assert success.
    Make handle the sole output of the loop, and correct and use it.
  • Ensure handle in table is cleared last, with atomic store to release.
  • Make skipping of the invalid handle in handle generator loop atomic.
  • Use atomic load on state assert, and don't re-read on failure.
  • Tighten up types, and avoid casts by using new signed atomics.

Pull request type

[ ] Fix
[X] Refactor
[ ] Target update
[ ] Functionality change
[ ] Docs update
[ ] Test update
[ ] Breaking change
SPE: Fix up atomic usage
PSA SPE code was using atomics, but not consistently. On the assumption
that the atomics are needed, correct the code to be properly atomic.

* Tighten up table full case - new_handle was left with a bogus value,
  and surrounding code (as a result?) used loop index to assert success.
  Make handle the sole output of the loop, and correct and use it.
* Ensure handle in table is cleared last, with atomic store to release.
* Make skipping of the invalid handle in handle generator loop atomic.
* Use atomic load on state assert, and don't re-read on failure.
* Tighten up types, and avoid casts by using new signed atomics.
@kjbracey-arm

This comment has been minimized.

Copy link
Contributor Author

commented Feb 5, 2019

Requires preceding PR #9600 for atomic updates.

@kjbracey-arm kjbracey-arm requested a review from pan- Feb 5, 2019

@orenc17

This comment has been minimized.

Copy link
Contributor

commented Feb 5, 2019

@alzix
Copy link
Contributor

left a comment

@kjbracey-arm , thanks for taking a look at our code.
I have couple of small remarks.

uint32_t tmp_handle;
do {
tmp_handle = core_util_atomic_incr_u16(&(handle_mgr->handle_generator), 1);
} while (tmp_handle == PSA_HANDLE_MGR_INVALID_HANDLE);

This comment has been minimized.

Copy link
@alzix

alzix Feb 6, 2019

Contributor

PSA_HANDLE_MGR_INVALID_HANDLE defined to be uint16_t.
i think it is better to compare to PSA_NULL_HANDLE instead.
PSA_HANDLE_MGR_INVALID_HANDLE is only used for static handle_generator initialisation.

Suggested change
} while (tmp_handle == PSA_HANDLE_MGR_INVALID_HANDLE);
} while (tmp_handle == PSA_NULL_HANDLE);

This comment has been minimized.

Copy link
@kjbracey-arm

kjbracey-arm Feb 6, 2019

Author Contributor

Yes, this one was debatable - all those constants are so coupled in their assumptions, it's a bit hard to tell which to use. I went for the PSA_HANDLE_MGR_INVALID_HANDLE as it is the direct 16-bit value of the handle_generator, despite having just popped it into a uint32_t ready for composition.

I think in principle PSA_NULL_HANDLE doesn't have to be PSA_HANDLE_MGR_INVALID_HANDLE zero-extended - if it wasn't my form is correct. (It might have been simpler if the invalidness was represented by a too-big value in the slot indicator, rather than a special generator count value, but maybe having the null value be 0 is convenient).

This comment has been minimized.

Copy link
@alzix

alzix Feb 6, 2019

Contributor

PSA_NULL_HANDLE is part of PSA FF spec. The purpose of having it as it simplifies programming model. For example it allows us to call psa_close() regardless of psa_connect() status, which simplifies error handling logic.
PSA_HANDLE_MGR_INVALID_HANDLE is zero as it is initialises our handle_generator.
The connection between them is rather occasional.

handle_mgr->handles_pool[pool_ix].handle_owner = PSA_HANDLE_MGR_INVALID_FRIEND_OWNER;
handle_mgr->handles_pool[pool_ix].handle_friend = PSA_HANDLE_MGR_INVALID_FRIEND_OWNER;
core_util_atomic_store_s32(&(handle_mgr->handles_pool[pool_ix].handle), PSA_NULL_HANDLE);

This comment has been minimized.

Copy link
@alzix

alzix Feb 6, 2019

Contributor

i do agree that .handle_owner should be set last, as it returns the handle to the pool. And your change is actually fixing a bug in our implementation. Great catch!
But IMO the atomic store is redundant here. .handle is an aligned 32bit value thus single bus transaction will be sufficient for storing the value.
The code here written with an assumption that handle is owned as long as it not set to PSA_NULL_HANDLE. Thus only the owner will be accessing it. If this assumption is wrong we need to rewrite the whole code. Otherwise single store instruction is sufficient.

This comment has been minimized.

Copy link
@kjbracey-arm

kjbracey-arm Feb 6, 2019

Author Contributor

The handle must be written last. Only the atomic store will guarantee that - it has a memory ordering semantic. Without the atomic store, the compiler is free to reorder those 3 stores, so might not perform the handle write last.

This comment has been minimized.

Copy link
@alzix

alzix Feb 6, 2019

Contributor

thanks for explaining it. You are correct!

This comment has been minimized.

Copy link
@kjbracey-arm

kjbracey-arm Feb 6, 2019

Author Contributor

Aside from the memory ordering, which is the most likely practical issue, there's the formal issue that doing a non-atomic write to NULL at the same time as a cas reads it is undefined behaviour. In theory there could be tearing effects. (On a 16-bit CPU this 32-bit handle would tear. On ARM a 64-bit variable would tear).

Plain reads and writes are only permitted if there is no possibility of a write and a read happening at the same time - either due being ordered against other synchronised accesses, like the handle_friend here is ordered by the atomic handle accesses, or by mutexes and the like.

// Handle generator uses only 16 bits, and wraps.
// The reason for this is that we use the 16 upper bits to store the handle's index in the handles pool (for performance reasons)
uint16_t handle_generator; /* A counter supplying handle numbers. */
uint16_t pool_size; /* The maximum number of handles that pool can contain. */

This comment has been minimized.

Copy link
@alzix

alzix Feb 6, 2019

Contributor

i would really prefer to have uint32_t for a size.
This will waste 4B in RAM but i will sleep better when next time this code will be modified/maintained.
Alternatively, please add casting in the for loop compare statement.

This comment has been minimized.

Copy link
@kjbracey-arm

kjbracey-arm Feb 6, 2019

Author Contributor

There's a fundamental assumption throughout all the shifting and masking that these are both 16 bits, so I felt that making very sure that we don't have more than 16 bits in the state variables seemed like a good plan in itself, aside from the memory saving.

A cast isn't necessary - there's no problem comparing different width types of the same signedness - but the loop could certainly be changed to using uint16_t.

This comment has been minimized.

Copy link
@kjbracey-arm

kjbracey-arm Feb 6, 2019

Author Contributor

Ah, no, pool_ix needs to be 32-bit, else casted before the left shift. Bleurgh.

{
if (*current_state != expected_state) {
uint8_t actual_state = core_util_atomic_load_u8(current_state);

This comment has been minimized.

Copy link
@alzix

alzix Feb 6, 2019

Contributor

I'm not convinced atomic load is necessary here.

This comment has been minimized.

Copy link
@kjbracey-arm

kjbracey-arm Feb 6, 2019

Author Contributor

It only wouldn't be necessary if you know for certain no other threads could be modifying it. If you knew you were the "owner" due to a previous cas, so you knew that you were in a state that no-one else would modify, you'd be fine.

But as this is an assert, a lot of your assertable cases would actually then make this undefined behaviour if it was a plain load. Getting the atomic load gets you to the assert in a defined way in the event of failure.

I don't actually know if the atomics are needed at all here - I guess it's a sort of mechanism where exactly 1 thread can transition from IDLE to ACTIVE, and then it gets control?

This comment has been minimized.

Copy link
@kjbracey-arm

kjbracey-arm Feb 6, 2019

Author Contributor

Note that the atomic load isn't actually particularly expensive on current Mbed OS platforms - it will just be inlined as a normal LDRB, with the compiler being a bit stricter about ordering.

This comment has been minimized.

Copy link
@alzix

alzix Feb 6, 2019

Contributor

atomic operations at handler allocations are needed for sure.
state change is also required to be atomic since it changed in SPM context and asserted in secure partition context.
atomic for write in destroy handle operation, you have convinced me it is needed to make sure compiler wont optimize stuff.
Atomic read still looks redundant to me.
perhaps may only be valid when using buffered writes...
Better safe than sorry :)

This comment has been minimized.

Copy link
@kjbracey-arm

kjbracey-arm Feb 6, 2019

Author Contributor

state change is also required to be atomic since it changed in SPM context and asserted in secure partition context.

So those are different threads, not just the other side of the security barrier in the same thread? If they are different threads, then the non-atomic read is undefined behaviour.

I think you can construct a convincing argument why the undefined behaviour won't cause a problem in this case, based on assumptions about the platform and how limited the compiler's hostility is, but compilers really are getting quite mean about exploiting undefined behaviour these days, so I really don't want to go anywhere near it. Making the code legal doesn't cost much, and serves as a documentation reminder. Plus atomics are hard enough to reason about as it is without trying to do "unsynchronised atomics".

Even if you never see a problem in practice on current hardware, one day some mean person is going to run you against a testing tool like this which would chew you straight out for that unsynchronised load.

This comment has been minimized.

Copy link
@alzix

alzix Feb 6, 2019

Contributor

ok

@alzix

This comment has been minimized.

Copy link
Contributor

commented Feb 6, 2019

@0xc0170, @NirSonnenschein travis failure seems to be irrelevant to the changes.

@alzix

alzix approved these changes Feb 6, 2019

Copy link
Contributor

left a comment

LGTM

@NirSonnenschein

This comment has been minimized.

Copy link
Contributor

commented Feb 7, 2019

starting CI

@cmonr

This comment has been minimized.

Copy link
Contributor

commented Feb 14, 2019

@pan- Would you be able to look at this any time soon?

@pan-

This comment has been minimized.

Copy link
Member

commented Feb 14, 2019

@cmonr Not today, but tomorrow I can.

@orenc17

This comment has been minimized.

Copy link
Contributor

commented Feb 18, 2019

@pan- review please

@0xc0170 0xc0170 added needs: CI and removed needs: review labels Feb 19, 2019

@0xc0170

This comment has been minimized.

Copy link
Member

commented Feb 19, 2019

While we wait for the final review, CI started

@mbed-ci

This comment has been minimized.

Copy link

commented Feb 19, 2019

Test run: SUCCESS

Summary: 12 of 12 test jobs passed
Build number : 1
Build artifacts

@pan-

pan- approved these changes Feb 19, 2019

@cmonr cmonr added ready for merge and removed needs: CI labels Feb 19, 2019

@cmonr cmonr merged commit 19474fc into ARMmbed:master Feb 19, 2019

27 checks passed

continuous-integration/jenkins/pr-head This commit looks good
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
jenkins-ci/build-ARM Success
Details
jenkins-ci/build-ARMC6 Success
Details
jenkins-ci/build-GCC_ARM Success
Details
jenkins-ci/build-IAR Success
Details
jenkins-ci/cloud-client-test Success
Details
jenkins-ci/dynamic-memory-usage RTOS ROM(+0 bytes) RAM(+0 bytes)
Details
jenkins-ci/exporter Success
Details
jenkins-ci/greentea-test Success
Details
jenkins-ci/mbed2-build-ARM Success
Details
jenkins-ci/mbed2-build-GCC_ARM Success
Details
jenkins-ci/mbed2-build-IAR Success
Details
jenkins-ci/unittests Success
Details
travis-ci/astyle Local astyle testing has passed
Details
travis-ci/docs Local docs testing has passed
Details
travis-ci/doxy-spellcheck Local doxy-spellcheck testing has passed
Details
travis-ci/events Passed, runtime is 9318 cycles (-438 cycles)
Details
travis-ci/gitattributestest Local gitattributestest testing has passed
Details
travis-ci/include_check Local include_check testing has passed
Details
travis-ci/licence_check Local licence_check testing has passed
Details
travis-ci/littlefs Passed, code size is 8408B (+0.00%)
Details
travis-ci/psa-autogen Local psa-autogen testing has passed
Details
travis-ci/tools-py2.7 Local tools-py2.7 testing has passed
Details
travis-ci/tools-py3.5 Local tools-py3.5 testing has passed
Details
travis-ci/tools-py3.6 Local tools-py3.6 testing has passed
Details
travis-ci/tools-py3.7 Local tools-py3.7 testing has passed
Details

@cmonr cmonr removed the ready for merge label Feb 19, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.