Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix nondeterminism on Activity/Suborchestration failures #145

Merged
merged 8 commits into from
Jan 24, 2020

Conversation

ConnorMcMahon
Copy link

@ConnorMcMahon ConnorMcMahon commented Jan 14, 2020

When we throw errors in the JS layer, we need some way to embed the orchestration replay state in the errors so that the C# extension can replay behavior. This is necessary so that the Durable Task does not throw an non-determinism exception.

This also requires a fix on the C# extension to be able to read this embedded data from the RPC exception. Here is the PR for the fix in Durable 1.x: Azure/azure-functions-durable-extension#1171.

Resolve #125

When we wrap exceptions from the C# layer (i.e. Activity or
SubOrchestration failures), we cannot fail the function execution. If we
do, the C# layer will not have the chance to replay all of the actions
performed by the JS SDK, meaning we will result in a non-determinism
error.

To avoid this, we mark the JS execution as successful, and put the
burden on the C# shim for out-of-proc languages to surface these
failures.
@antempus
Copy link

@ConnorMcMahon Awesome 👏

To avoid this, we mark the JS execution as successful and put the burden on the C# shim for out-of-proc languages to surface these
failures.

Not knowing how C# shim works under the hood yet, could we have a situation where the shim sees a failure and mistakenly treats it as a successful execution to continue processing?

A cursory review of the changes seems to indicate that's not a possibility, but just a curiosity.

@ConnorMcMahon
Copy link
Author

ConnorMcMahon commented Jan 15, 2020

@antempus

At least in the latest version of both Durable 1.x and Durable 2.x, that should not be the case.

Given that the code to correctly handle this has been in the C# out-of-proc logic from the beginning, this was a misunderstanding in how the JS-to-C# layer transfers data in the function failure case. I have now fixed the Durable JS tests so that they accurately reflect the data contract that C# expects.

That being said, it would be a good idea to add some tests to the C# side of things to make sure it does not regress its side of the contract.

package-lock.json Outdated Show resolved Hide resolved
src/orchestrator.ts Outdated Show resolved Hide resolved
Copy link
Collaborator

@cgillum cgillum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some feedback related to class names and null/undefined handling that I'd like to discuss.

src/durablefailure.ts Outdated Show resolved Hide resolved
src/outofprocerrorwrapper.ts Outdated Show resolved Hide resolved
src/outofprocerrorwrapper.ts Outdated Show resolved Hide resolved
src/orchestrator.ts Outdated Show resolved Hide resolved
src/durableerror.ts Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Exception in activity function fails orchestrator with non-deterministic error
4 participants