This repository has been archived by the owner on Jan 26, 2024. It is now read-only.

data race in memory update function #35

Open
psychocoderHPC opened this issue Jul 11, 2022 · 0 comments

IMO the memory counting functions contain data races that can cause the free-memory counter freeMem_ to underflow.

https://github.com/ROCm-Developer-Tools/ROCclr/blob/90f1f61a9d6c28ffd2f844dc773e921444752e47/device/rocm/rocdevice.cpp#L2086-L2104

  • Case 1
    • Two threads, each allocating 2MiB, reach line 2091, and the check is false for both because freeMem_ is 3MiB.
    • Both threads then decrement the counter freeMem_ in line 2101, which results in an underflow.
  • Case 2
    • Two threads, one deallocating 2MiB and the other allocating 5MiB, run while freeMem_ is 4MiB.
    • The allocating thread (5MiB) evaluates the check in line 2091 and enters the if body at line 2096.
    • The second thread, which deallocates 2MiB, executes line 2088; freeMem_ is now 6MiB.
    • The first thread then executes line 2098, which sets freeMem_ to zero.
    • The result is that we lose 2MiB of memory that could potentially be allocated but is no longer available because of the data race.
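
Both interleavings can be replayed deterministically on a single thread. The following is only a minimal sketch of the pattern described above, not the ROCclr code: `freeMem`, `exceedsFree`, `replayCase1` and `replayCase2` are hypothetical names, and the counter is assumed to be an unsigned atomic.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

constexpr std::size_t MiB = 1024 * 1024;

// Hypothetical stand-in for freeMem_; the real type in ROCclr may differ.
std::atomic<std::size_t> freeMem{0};

// Models the check described for line 2091.
bool exceedsFree(std::size_t size) { return size > freeMem.load(); }

// Case 1: both checks run before either decrement.
std::size_t replayCase1() {
  freeMem.store(3 * MiB);
  bool aTooLarge = exceedsFree(2 * MiB);  // thread A sees 3 MiB: false
  bool bTooLarge = exceedsFree(2 * MiB);  // thread B sees 3 MiB: false
  assert(!aTooLarge && !bTooLarge);
  if (!aTooLarge) freeMem.fetch_sub(2 * MiB);  // A: 3 MiB -> 1 MiB
  if (!bTooLarge) freeMem.fetch_sub(2 * MiB);  // B: 1 MiB -> wraps around
  return freeMem.load();
}

// Case 2: a free lands between the check and the reset to zero.
std::size_t replayCase2() {
  freeMem.store(4 * MiB);
  bool tooLarge = exceedsFree(5 * MiB);  // allocating thread sees 4 MiB: true
  freeMem.fetch_add(2 * MiB);            // freeing thread: 4 MiB -> 6 MiB
  if (tooLarge) freeMem.store(0);        // reset based on the stale check
  return freeMem.load();
}
```

In this model `replayCase1()` ends with a wrapped-around value far above 3MiB, and `replayCase2()` ends at zero even though 2MiB were just returned.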

Possible solution:

  • When memory is freed, freeMem_ is loaded into a register and an atomic CAS is used to reset the variable freeMem_.
  • Line 2101 must be guarded by an atomic CAS too, so that other threads cannot reduce the value of freeMem_ at the same moment, which would result in a variant of Case 1.
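
One possible reading of this proposal, as a hedged sketch rather than a patch: the check, the reset to zero and the subtraction are folded into a single compare-and-swap loop, so a concurrent change to the counter forces a retry instead of being overwritten. The names `freeMem`, `onAlloc` and `onFree` are hypothetical and do not come from ROCclr.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

constexpr std::size_t MiB = 1024 * 1024;

// Hypothetical stand-in for freeMem_; assumed to be an unsigned atomic.
std::atomic<std::size_t> freeMem{0};

// Allocation path: decide and update in one atomic step.
// Returns false when the counter had to be clamped to zero.
bool onAlloc(std::size_t size) {
  std::size_t seen = freeMem.load();  // snapshot held in a local ("register")
  std::size_t next = 0;
  bool fits = false;
  do {
    fits = size <= seen;
    next = fits ? seen - size : 0;  // never computes a negative result
    // On failure, compare_exchange_weak reloads `seen` and the loop
    // re-evaluates the check against the fresh value.
  } while (!freeMem.compare_exchange_weak(seen, next));
  return fits;
}

// Deallocation path: a plain atomic add is enough here, because the
// allocation path no longer overwrites the counter blindly.
void onFree(std::size_t size) { freeMem.fetch_add(size); }

// Sequential replay of the Case 1 inputs: the second request is clamped.
std::size_t replayTwoAllocs() {
  freeMem.store(3 * MiB);
  bool first = onAlloc(2 * MiB);   // 3 MiB -> 1 MiB
  bool second = onAlloc(2 * MiB);  // 2 MiB > 1 MiB: clamp to 0
  assert(first && !second);
  return freeMem.load();
}
```

With this structure the counter cannot wrap around, and a free that happens between the load and the CAS makes the CAS fail, so the freed amount is taken into account on the retry.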