Issue 246 continuations wiring (II) #248

ThiagoT1 · 2020-02-21T21:23:41Z

Important:

Sync methods' behavior and API are unchanged.

The only control flow that changed is ReadAsync's. Other async methods are unchanged.

Properly Wire ReadAsync Continuations.

Every commit tries to stage a step towards the PR objective.

The main point of concern is the RelaxedCPR treatment in cs/src/core/Index/FASTER/FASTER.cs.

Fixes #246


ThiagoT1 · 2020-02-26T04:57:36Z

The last commit was made in an attempt to meet the FASTER concurrency model requirements.

All tests are passing, including comprehensive ones on dependent libs, like FasterDictionary.

cs/src/core/Allocator/AsyncIOContext.cs

cs/src/core/Index/FASTER/FASTER.cs

cs/src/core/Index/FASTER/FASTERThread.cs

badrishc · 2020-02-26T19:36:12Z

I am worried about the depth of changes in this PR as they break subtle invariants in the system. We will step back and (1) fix the multi-async bug; (2) make sure your use case gets high performance. I think these goals can be accomplished without modifying the mono-threaded session access invariant.

badrishc · 2020-02-26T23:17:53Z

Also, FASTER is a latch-free library. We try to not take locks anywhere, except in one case during checkpointing, when forcing a session to move from one version to another. This is achieved by ensuring that only the mono-threaded session accesses/updates any state.

Thus, we need the async continuations to simply be added to the async response queue (which already uses a latch-free ConcurrentQueue implementation), from which the mono-threaded session will remove elements.
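A minimal sketch of the pattern described above, assuming nothing about FASTER's real types: I/O callback threads only enqueue into a ConcurrentQueue, and the session thread alone dequeues and updates session state. `SessionSketch`, `OnIoCompleted` and `DrainResponses` are hypothetical names used for illustration.

```csharp
using System.Collections.Concurrent;

// Hypothetical session; the names are illustrative and not FASTER's API.
public sealed class SessionSketch
{
    // Latch-free queue shared with I/O completion threads.
    private readonly ConcurrentQueue<int> responses = new ConcurrentQueue<int>();

    // Session-private state: only the session thread touches it, so no lock is taken.
    private long sum;
    public long Sum => sum;

    // May be called from any thread, for example an I/O completion callback.
    public void OnIoCompleted(int value) => responses.Enqueue(value);

    // Must be called only from the thread that owns the session.
    public int DrainResponses()
    {
        int drained = 0;
        while (responses.TryDequeue(out int value))
        {
            sum += value;   // state update happens on the owning thread only
            drained++;
        }
        return drained;
    }
}
```

The only cross-thread interaction is the queue itself; everything the session mutates stays on one thread.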

badrishc

What if the user does RMWAsync followed by ReadAsync with waitForCommit true? We would require the older RMW to complete as well, as part of this wait for commit.

badrishc · 2020-02-26T19:29:22Z

cs/src/core/Index/FASTER/FASTER.cs

+
+                    await diskRequest.ValueTaskOfT;
+
+                    lock (clientSession.ctx)


We do not want to take the overhead of locking the client session. This is the reason FASTER is mono-threaded. All work for a session is done on a single thread, and this PR breaks that contract.

> What if the user does RMWAsync followed by ReadAsync with waitForCommit true? We would require the older RMW to complete as well, as part of this wait for commit.

That should be ensured by ClientSession.WaitForCommitAsync (which went unchanged), no?

If the user does this sequence, the expectation is that all operations up to that point are committed:

s.RMWAsync();
await s.ReadAsync(waitForCommit: true);

> We do not want to take the overhead of locking the client session. This is the reason FASTER is mono-threaded. All work for a session is done on a single thread, and this PR breaks that contract.

I see.

However, there are currently other places with shared memory whose synchronization techniques are equivalent to locks, even with the possibility of a context switch. Those are the ConcurrentDictionary and ConcurrentQueue instances.

Would it be better if we switched the lock to Interlocked calls and a spinner?
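For reference, a sketch of what "Interlocked calls and a spinner" could look like, built only from `Interlocked.CompareExchange`, `Volatile.Write` and `SpinWait`. `SpinGate` is a hypothetical name and this is not code from the PR. It is still mutual exclusion, so it may not address the objection about touching session state from more than one thread.

```csharp
using System.Threading;

// Hypothetical spin-based gate; an illustration, not part of FASTER.
public sealed class SpinGate
{
    private int held; // 0 = free, 1 = held

    public void Enter()
    {
        var spinner = new SpinWait();
        // Try to flip 0 -> 1 atomically; back off while another thread holds the gate.
        while (Interlocked.CompareExchange(ref held, 1, 0) != 0)
        {
            spinner.SpinOnce();
        }
    }

    // Publishes the release so a spinning thread can acquire the gate.
    public void Exit() => Volatile.Write(ref held, 0);
}
```

Callers would wrap the critical section in `Enter()` and `Exit()` with a `try`/`finally`, exactly as with a monitor lock.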

> If the user does this sequence, the expectation is that all operations up to that point are committed:
>
> s.RMWAsync();
> await s.ReadAsync(waitForCommit: true);

This is happening.
RMWAsync puts an entry into the pending requests collection, which in turn is monitored by the CompletePendingAsync call inside WaitForCommitAsync.

ThiagoT1 · 2020-02-26T23:34:06Z

> This is achieved by ensuring that only the mono-threaded session accesses/updates any state.

If this is about the lock (clientSession.ctx) at FASTER.cs:389, it was added because Native32.Read callbacks are not executed in a mono-threaded way, and the continuation in the PR follows the same execution path.
Thus, that code will also be executed in a multi-threaded way, hence the lock. Locking also happens inside ConcurrentQueue, by the way, albeit by way of Interlocked methods and spinners.

> Thus, we need the async continuations to simply be added to the async response queue (which already uses a latch-free ConcurrentQueue implementation), from which the mono-threaded session will remove elements.

This would have at least these consequences:

The AsyncQueue releases only a single awaiter every time an enqueue happens, so:
Even if this awaiter completed someone else's continuation and stored it somewhere, the actual awaiter would still be hung on DequeueAsync (maybe forever).
To counter that, we would need a timeout (lots of allocation, because it is a task array with at least two tasks in it).
The other option would be to re-enqueue the continuation on the AsyncQueue. Then we could have endless racing between several awaiters, plus a new Task and state machine on every loop.

In the end, we would need another type of queue, one that signals every awaiter (by a task), so they can all discard a continuation that is not theirs and await again for the next completion, until theirs comes around and they can return.

That would produce a task for every completion, multiplied by possibly a new state machine for every awaiter loop iteration.
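The single-wakeup behavior argued above can be illustrated with a stand-in. The sketch below uses `SemaphoreSlim` rather than FASTER's AsyncQueue (an assumption made purely for illustration): two awaiters wait, one signal arrives, and only one awaiter resumes while the other stays pending.

```csharp
using System.Threading;
using System.Threading.Tasks;

// Stand-in demo: SemaphoreSlim plays the role of "an item was enqueued".
public static class SingleWakeupDemo
{
    // Returns how many of the two awaiters completed after a single signal.
    public static int CompletedAfterOneSignal()
    {
        var signal = new SemaphoreSlim(0);
        Task first = signal.WaitAsync();     // awaiter A
        Task second = signal.WaitAsync();    // awaiter B

        signal.Release();                    // one enqueue means one wakeup
        Task.WhenAny(first, second).Wait();  // block until some awaiter resumes

        int completed = 0;
        if (first.IsCompleted) completed++;
        if (second.IsCompleted) completed++;
        return completed;
    }
}
```

If the resumed awaiter finds a completion that belongs to the other one, that other awaiter has no signal of its own to wake up on, which is the hang described above.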

badrishc · 2020-03-03T04:08:58Z

cs/src/core/Index/FASTER/FASTER.cs

+            try
+            {
+                do
+                {


This loop is not identical in behavior to the original code. For example, I notice that InternalRefresh would not be called in this tight loop, resulting in never getting out of CPR. The debug asserts for version correctness are also skipped. We need to ensure that all code paths are functionally identical to the older case.

Why not use an async equivalent of HandleOperationStatus? This keeps the main FasterKV.ReadAsync method small so it can be inlined, leaving the complexity of uncommon cases to a sub-method.


cs/src/core/Index/FASTER/FASTER.cs

ThiagoT1

Yesterday's comments handled.

ThiagoT1 · 2020-03-04T00:47:44Z

> This loop is not identical in behavior to the original code. For example, I notice that InternalRefresh would not be called in this tight loop, resulting in never getting out of CPR. The debug asserts for version correctness are also skipped. We need to ensure that all code paths are functionally identical to the older case.

You're right. Thank you! The last commit just fixed this.

> Why not use an async equivalent of HandleOperationStatus? This keeps the main FasterKV.ReadAsync method small so it can be inlined, leaving the complexity of uncommon cases to a sub-method.

It was done this way because HandleOperationStatus has a lot of cases to handle.

Actually, any status other than Success or NotFound will cause it to be called.
It will even trigger the disk IO if needed.

That's where we need to fork paths.

When disk IO is needed, the ReadAsync path does the new wiring, so we cannot call HandleOperationStatus at this point and should just prepare for the await.

It's important to note that the only changed path was the ReadAsync one.

ThiagoT1 · 2020-03-04T00:50:35Z

By the way, the current code will error out the Read if two consecutive CPR_SHIFT_DETECTED statuses happen.

Should we keep this? I think this could be a mistake.

What prevents the overall state from being CPR_SHIFT_DETECTED on both tries while being in other states in between, as in a classic ABA problem?

ThiagoT1 · 2020-03-07T15:18:42Z

As per the last discussions, a new struct now wraps the completion compute logic, so the user can trigger it.
The completion lock is gone, and the user must call ReadAsyncResult.CompleteRead in a single-threaded manner.
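A rough sketch of the idea, under the assumption that the awaited part only waits for I/O and the returned struct defers the session-side work until the caller asks for it. `DeferredRead`, `Complete` and `DeferredReadDemo` are hypothetical names; the actual ReadAsyncResult in the PR may be shaped differently.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the PR's ReadAsyncResult; the shape is an assumption.
public readonly struct DeferredRead<TOutput>
{
    private readonly byte[] record;
    private readonly Func<byte[], TOutput> finish;

    public DeferredRead(byte[] record, Func<byte[], TOutput> finish)
    {
        this.record = record;
        this.finish = finish;
    }

    // Caller contract: invoke from the thread that owns the session.
    public TOutput Complete() => finish(record);
}

public static class DeferredReadDemo
{
    // Awaits the I/O only; the completion logic is not run on the I/O thread.
    public static async ValueTask<DeferredRead<int>> ReadAsync(Task<byte[]> diskRead)
    {
        byte[] record = await diskRead.ConfigureAwait(false);
        return new DeferredRead<int>(record, bytes => BitConverter.ToInt32(bytes, 0));
    }
}
```

With this split, no lock is needed as long as the caller honors the single-threaded contract when invoking `Complete()`.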

ThiagoT1 · 2020-03-07T16:00:06Z

BTW,

Thank you @badrishc for the patience and the high level discussions about this issue.

badrishc · 2020-03-08T05:07:39Z

cs/src/core/Allocator/AsyncIOContext.cs

+        /// <summary>
+        /// Async Operation ValueTask backer
+        /// </summary>
+        public FasterAsyncOperation<AsyncIOContext<Key, Value>> asyncOperation;


Is there a good reason to use all the complicated logic in FasterAsyncOperation vs a TaskCompletionSource that we set completed when IO is done?

We should avoid introducing new code if we can avoid it.

Yes.
A TCS yields a Task (a class), while our FAO yields and backs a ValueTask (a struct).
It's one allocation fewer.

Lots of code inside .NET Core itself uses IValueTaskSource, as FasterAsyncOperation does, to avoid allocating a Task when possible.
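To make the allocation argument concrete, here is a small sketch of a reusable ValueTask backer built on `ManualResetValueTaskSourceCore<T>` (available since .NET Core 3.0). `ReusableOperation` is a hypothetical name and is not the PR's FasterAsyncOperation; it only shows the general IValueTaskSource pattern.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Sources;

// Hypothetical reusable operation; not the PR's FasterAsyncOperation.
public sealed class ReusableOperation<T> : IValueTaskSource<T>
{
    // Mutable struct holding the state; it must not be a readonly field.
    private ManualResetValueTaskSourceCore<T> core;

    // Hands out a ValueTask bound to the current version token (no Task allocated).
    public ValueTask<T> AsValueTask() => new ValueTask<T>(this, core.Version);

    public void SetResult(T value) => core.SetResult(value);
    public void SetException(Exception error) => core.SetException(error);

    // Makes the same object usable for the next operation.
    public void Reset() => core.Reset();

    T IValueTaskSource<T>.GetResult(short token) => core.GetResult(token);

    ValueTaskSourceStatus IValueTaskSource<T>.GetStatus(short token) => core.GetStatus(token);

    void IValueTaskSource<T>.OnCompleted(Action<object> continuation, object state,
        short token, ValueTaskSourceOnCompletedFlags flags)
        => core.OnCompleted(continuation, state, token, flags);
}
```

A TaskCompletionSource would allocate a new Task per operation, whereas an object like this can be reset and reused, provided each ValueTask it hands out is consumed only once.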

cs/src/core/Index/FASTER/FASTERImpl.cs

cs/src/core/Index/FASTER/FASTERThread.cs

ThiagoT1 and others added 6 commits February 19, 2020 17:38

Merge pull request #1 from microsoft/master

e158d14

Rebase fork

Ignoring serialNo on ReadAsync continuations

36909de

Removed code that could eat readAsync return.

27539d0

Refactored for clarity (DRY)

c6c4468

Read never causes RETRY

c3e1e44

Properly Wire ReadAsync Continuations

34756ad

ThiagoT1 requested a review from badrishc February 21, 2020 21:43

ThiagoT1 added 3 commits February 21, 2020 22:23

RMW and ReadAsync do not share the continuation execution path

ff75ef7

Ensure mono threaded (but not FIFO) continuations

95d1039

Exposes a way for pending reads to be tracked, and their completion a…

63bd56b

…waited for. Tests passing.

badrishc reviewed Feb 26, 2020

cs/src/core/Allocator/AsyncIOContext.cs (outdated)

cs/src/core/Index/FASTER/FASTER.cs (outdated)

cs/src/core/Index/FASTER/FASTERThread.cs

SerialNum increment place fix + Revert AsyncIOContext to struct

36d27fa

badrishc reviewed Feb 26, 2020

Merge branch 'master' into issue_246_continuations_wiring_II

c9e314c

badrishc reviewed Mar 3, 2020

cs/src/core/Index/FASTER/FASTER.cs (outdated)

ThiagoT1 added 2 commits March 3, 2020 21:32

Fix Loop Behavior when CPR_SHIFT_DETECTED

b84226b

RelaxedCPR always true on ReadAsync

ba53db9

ThiagoT1 commented Mar 4, 2020

Wrapping completions compute inside a struct, so the user can trigger…

647d561

… it. It then is the user role to await every ReadAsync call eventually

ThiagoT1 added 4 commits March 7, 2020 14:30

Dont eat exceptions

ce42b35

Short Circuit Completion

582263a

Prevent multiple completions of the same read.

2847a95

A space

142cff9

badrishc reviewed Mar 8, 2020

cs/src/core/Index/FASTER/FASTERImpl.cs (outdated)

badrishc reviewed Mar 8, 2020

cs/src/core/Index/FASTER/FASTERThread.cs

ThiagoT1 and others added 4 commits March 8, 2020 08:35

Fixed pendingContext.Id attribution

2cc6186

Make AsyncCountDown lock free

abe95aa

Added random read benchmark for sync/async and batched/unbatched.

35908ca

updated root namespace for benchmark project.

d7e2bd7

badrishc merged commit 26569b2 into microsoft:master Mar 9, 2020

badrishc mentioned this pull request Sep 24, 2020

[C#] Async version of RMW that returns status #339

Merged
