Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tx fixes #4616

Merged
merged 6 commits into from May 23, 2018
Merged

Tx fixes #4616

merged 6 commits into from May 23, 2018

Conversation

sebastianburckhardt
Copy link
Contributor

Several bug fixes for bugs that caused hangs in my randomized tests

  • fix bug in commit logic that causes hangs in some situations
  • fix 2 bugs in azure table transactional state storage (one in tracing, one causing hangs)
  • make TA notify participants of aborts for read-only transactions

also, I made tracing in azure table transactional state storage more uniform.

Copy link
Contributor

@jason-bragg jason-bragg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments, mostly looks good.
Are there any test cases covering these areas? How did you find these issues?

@@ -102,6 +105,9 @@ public Task<ITransactionInfo> StartTransaction(bool readOnly, TimeSpan timeout)
if (logger.IsEnabled(LogLevel.Debug))
logger.Debug($"{stopwatch.Elapsed.TotalMilliseconds:f2} timeout {transactionInfo.TransactionId} prepare responses");

foreach(var p in participants)
p.Key.Abort(transactionInfo.TransactionId).Ignore();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the agent make this call? Isn't TM authoritative? What if tm has committed transaction the, but this times out?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the logic for read-only transaction. For those, there is no TM, instead the TA "commits" the transaction after it receives all the prepare responses. These responses are sent by the participants after they unlock, and after they persist the clock.

@@ -93,6 +93,9 @@ public Task<ITransactionInfo> StartTransaction(bool readOnly, TimeSpan timeout)
if (logger.IsEnabled(LogLevel.Debug))
logger.Debug($"{stopwatch.Elapsed.TotalMilliseconds:f2} fail {transactionInfo.TransactionId} prepare response status={status}");

foreach (var p in participants)
p.Key.Abort(transactionInfo.TransactionId).Ignore();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we wait for the aborts? If not, won't this allow future calls to fail because they may depend on the aborted operation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point of these "aborts" is simply to tell the grains that they can unlock now. All locks time out pretty quickly (7 sec), but it is of course better to send the explicit abort messages, so that any transactions waiting for the lock can proceed immediately.

@sebastianburckhardt
Copy link
Contributor Author

All of these bugs were found by the randomized consistency tests I am currently developing. These checks are not yet ready to check in - I haven't even gotten to the serializability checking yet - but they are already flushing out lots of bugs. I found one more yesterday afternoon, see the extra commit from this morning.

I will create a separate PR for these tests when they are ready.

@jason-bragg
Copy link
Contributor

@dotnet-bot test Ubuntu please

@jason-bragg
Copy link
Contributor

@dotnet-bot test Ubuntu16.04 Release Build please

@jason-bragg jason-bragg merged commit 44a0050 into dotnet:master May 23, 2018
@github-actions github-actions bot locked and limited conversation to collaborators Dec 5, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants