Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure cancellation safety #56

Closed
GTwhy opened this issue Apr 22, 2022 · 0 comments
Closed

Ensure cancellation safety #56

GTwhy opened this issue Apr 22, 2022 · 0 comments

Comments

@GTwhy
Copy link
Collaborator

GTwhy commented Apr 22, 2022

Panic or undefined behavior will happen if the future dropped(cancelled) during the execution of ibv_post_send. #47

GTwhy added a commit to GTwhy/async-rdma that referenced this issue May 15, 2022
Let `event_listener` holds the `Arc`s of the `LocalMrInner`s
that are being used by RDMA ops to ensure cancellation safety.

`LocalMr` was replaced with New struct `LocalMrInner`.
Because every struct that can use APIs should hold an `Arc`
of the mr's metadata, but previous `LocalMr` can't hold itself.

Fixes: datenlord#47 datenlord#56
GTwhy added a commit to GTwhy/async-rdma that referenced this issue May 19, 2022
Let `event_listener` holds the `Arc`s of the `LocalMrInner`s
that are being used by RDMA ops to ensure cancellation safety.

`LocalMr` was replaced with New struct `LocalMrInner`.
Because every struct that can use APIs should hold an `Arc`
of the mr's metadata, but previous `LocalMr` can't hold itself.

Fixes: datenlord#47 datenlord#56
GTwhy added a commit to GTwhy/async-rdma that referenced this issue May 20, 2022
Let `event_listener` holds the `Arc`s of the `LocalMrInner`s
that are being used by RDMA ops to ensure cancellation safety.

`LocalMr` was replaced with New struct `LocalMrInner`.
Because every struct that can use APIs should hold an `Arc`
of the mr's metadata, but previous `LocalMr` can't hold itself.

Add `MrState` to avoid potential race condition

When the RDMA operations future were canceled but the related
mr is still being used by netdev or kernel. This is a potential
race condition. So we add `MrState` to `LocalMrInner`, to make
sure we can get slice or mutable slice only if the mr is readable
or writeable.

Fixes: datenlord#47 datenlord#56
GTwhy added a commit to GTwhy/async-rdma that referenced this issue May 23, 2022
Let `event_listener` holds the `Arc`s of the `LocalMrInner`s
that are being used by RDMA ops to ensure cancellation safety.

`LocalMr` was replaced with New struct `LocalMrInner`.
Because every struct that can use APIs should hold an `Arc`
of the mr's metadata, but previous `LocalMr` can't hold itself.

Add `RwLock` to avoid potential race condition

When the RDMA operations future were canceled but the related
mr is still being used by netdev or kernel. This is a potential
race condition. So we add `Rwlock` to `LocalMrInner`, to make
sure we can get slice or mutable slice only if the mr is readable
or writeable.

Replace hashmap to reduce time overhead and ensure guards drop after
remove immediately to unlock the rwlock.`std::collections::HashMap`
has less time overhead than `LockFreeCuckooHash` in the low concurrency
situation. `LockFreeCuckooHash` can't drop guards immediately after remove
them and that led to a deadlock.

Add some unchecked methods without lock, which should be put into unsafe
trait later.

Integrate common `HashMap` methods in hashmap_utilities.

Fixes: datenlord#47 datenlord#56
rogercloud pushed a commit that referenced this issue May 24, 2022
Let `event_listener` holds the `Arc`s of the `LocalMrInner`s
that are being used by RDMA ops to ensure cancellation safety.

`LocalMr` was replaced with New struct `LocalMrInner`.
Because every struct that can use APIs should hold an `Arc`
of the mr's metadata, but previous `LocalMr` can't hold itself.

Add `RwLock` to avoid potential race condition

When the RDMA operations future were canceled but the related
mr is still being used by netdev or kernel. This is a potential
race condition. So we add `Rwlock` to `LocalMrInner`, to make
sure we can get slice or mutable slice only if the mr is readable
or writeable.

Replace hashmap to reduce time overhead and ensure guards drop after
remove immediately to unlock the rwlock.`std::collections::HashMap`
has less time overhead than `LockFreeCuckooHash` in the low concurrency
situation. `LockFreeCuckooHash` can't drop guards immediately after remove
them and that led to a deadlock.

Add some unchecked methods without lock, which should be put into unsafe
trait later.

Integrate common `HashMap` methods in hashmap_utilities.

Fixes: #47 #56
@GTwhy GTwhy closed this as completed May 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant