-
Notifications
You must be signed in to change notification settings - Fork 567
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue 246 continuations wiring (II) #248
Issue 246 continuations wiring (II) #248
Conversation
I am worried about the depth of changes in this PR as they break subtle invariants in the system. We will step back and (1) fix the multi-async bug; (2) make sure your use case gets high performance. I think these goals can be accomplished without modifying the mono-threaded session access invariant. |
Also, FASTER is a latch-free library. We try to not take locks anywhere, except in one case during checkpointing, when forcing a session to move from one version to another. This is achieved by ensuring that only the mono-threaded session accesses/updates any state. Thus, we need the async continuations to simply be added to the async response queue (which already uses a latch-free ConcurrentQueue implementation), from which the mono-threaded session will remove elements. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the user does RMWAsync followed by ReadAsync with waitForCommit true. We would require the older RMW to complete as well, as part of this wait for commit.
cs/src/core/Index/FASTER/FASTER.cs
Outdated
|
||
await diskRequest.ValueTaskOfT; | ||
|
||
lock (clientSession.ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not want to take the overhead of locking the client session. This is the reason FASTER is mono-threaded. All work for a session is done on a single thread, and this PR breaks that contract.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the user does RMWAsync followed by ReadAsync with waitForCommit true. We would require the older RMW to complete as well, as part of this wait for commit.
That should be ensured by ClientSession.WaitForCommitAsync
(that went unchanged), no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If user does this sequence, the expectation is that all operations until that point are committed:
s.RMWAsync();
await s.ReadAsync(waitForCommit: true);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not want to take the overhead of locking the client session. This is the reason FASTER is mono-threaded. All work for a session is done on a single thread, and this PR breaks that contract.
I see.
However, currently there's other places where there's shared memory, which current synchronization techniques are equivalent to locks, even with the possibility of a context switch. Those are ConcurrentDictionaries and ConcurrentQueues.
Would that be better if we switched the lock
to Interlocked
calls and a Spinner?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If user does this sequence, the expectation is that all operations until that point are committed:
s.RMWAsync();
await s.ReadAsync(waitForCommit: true);
This is happening.
RMWAsync will put an entry inside the pending requests collection, that in turn will be monitored by the CompletePendingAsync
call inside WaitForCommitAsync
if this is about FASTER.cs:389
This would have ate least these consequences:
In the end, we would need to use other type of queue, that would signal (by a task) every awaiter, so they can all just discard a continuation thats not theirs and re await for the next completion, until theirs come around, and they can return. That would produce a task for every completion multiplied by maybe a new state machine for every awaiter loop interation. |
cs/src/core/Index/FASTER/FASTER.cs
Outdated
try | ||
{ | ||
do | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This loop is not identical in behavior to the original code. For example, I notice that InternalRefresh would not be called in this tight loop, resulting in never getting out of CPR. The debug asserts for version correctness are also skipped. We need to ensure that all code paths are functionally identical to the older case.
Why not use an async equivalent of HandleOperationStatus? This keeps the main FasterKV.ReadAsync method small so it can be inlined, leaving the complexity of uncommon cases to a sub-method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This loop is not identical in behavior to the original code. For example, I notice that InternalRefresh would not be called in this tight loop, resulting in never getting out of CPR. The debug asserts for version correctness are also skipped. We need to ensure that all code paths are functionally identical to the older case.
You're right. Thank you! The last commit just fixed this.
Why not use an async equivalent of HandleOperationStatus? This keeps the main FasterKV.ReadAsync method small so it can be inlined, leaving the complexity of uncommon cases to a sub-method.
It was done this way because HandleOperationStatus has a lot of cases to thread.
Actually, any status different from Success or NotFound will cause it to be called.
It will even trigger the disk IO if needed.
That's where we need to fork paths.
When the disk IO is needed, the ReadAsync path will do the new wiring, thus we cannot call HandleOperationStatus at this point, and should just prepare for the await.
It's important to notice that the only changed path was the ReadAsync one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, current code will error out the Read if 2 consecutive CPR_SHIFT_DETECTED happen.
Should we keep this? I think this could be in error.
What is keeping the overall state to be in CPR_SHIFT_DETECTED twice, while being on other states in between the two tries, in a classic ABA Problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yesterday's comments handled.
cs/src/core/Index/FASTER/FASTER.cs
Outdated
try | ||
{ | ||
do | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This loop is not identical in behavior to the original code. For example, I notice that InternalRefresh would not be called in this tight loop, resulting in never getting out of CPR. The debug asserts for version correctness are also skipped. We need to ensure that all code paths are functionally identical to the older case.
You're right. Thank you! The last commit just fixed this.
Why not use an async equivalent of HandleOperationStatus? This keeps the main FasterKV.ReadAsync method small so it can be inlined, leaving the complexity of uncommon cases to a sub-method.
It was done this way because HandleOperationStatus has a lot of cases to thread.
Actually, any status different from Success or NotFound will cause it to be called.
It will even trigger the disk IO if needed.
That's where we need to fork paths.
When the disk IO is needed, the ReadAsync path will do the new wiring, thus we cannot call HandleOperationStatus at this point, and should just prepare for the await.
It's important to notice that the only changed path was the ReadAsync one.
cs/src/core/Index/FASTER/FASTER.cs
Outdated
try | ||
{ | ||
do | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, current code will error out the Read if 2 consecutive CPR_SHIFT_DETECTED happen.
Should we keep this? I think this could be in error.
What is keeping the overall state to be in CPR_SHIFT_DETECTED twice, while being on other states in between the two tries, in a classic ABA Problem?
… it. It then is the user role to await every ReadAsync call eventually
As per last discussions, a new struct is wrapping the compute completion logic, so the user can trigger it. |
BTW, Thank you @badrishc for the patience and the high level discussions about this issue. |
/// <summary> | ||
/// Async Operation ValueTask backer | ||
/// </summary> | ||
public FasterAsyncOperation<AsyncIOContext<Key, Value>> asyncOperation; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a good reason to use all the complicated logic in FasterAsyncOperation vs a TaskCompletionSource that we set completed when IO is done?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should avoid introducing new code if we can avoid it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
TCS will yield a task (class) while our FAO will yield and back a ValueTask (struct).
Its one allocation less.
Lots of code inside .net core itself are using IValueTaskSource as is FasterAsyncOperation, as to avoid allocating a Task when possible.
Properly Wire ReadAsync Continuations.
Every commit tries to stage a step towards the PR objective.
The main point of concern is the RelaxedCPR treatment on cs/src/core/Index/FASTER/FASTER.cs
Fixes #246