-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix nondeterminism on Activity/Suborchestration failures #145
Conversation
When we wrap exceptions from the C# layer (i.e. Activity or SubOrchestration failures), we cannot fail the function execution. If we do, the C# layer will not have the chance to replay all of the actions performed by the JS SDK, meaning we will result in a non-determinism error. To avoid this, we mark the JS execution as successful, and put the burden on the C# shim for out-of-proc languages to surface these failures.
@ConnorMcMahon Awesome 👏
Not knowing how C# shim works under the hood yet, could we have a situation where the shim sees a failure and mistakenly treats it as a successful execution to continue processing? A cursory review of the changes seems to indicate that's not a possibility, but just a curiosity. |
At least in the latest version of both Durable 1.x and Durable 2.x, that should not be the case. Given that the code to correctly handle this has been in the C# out-of-proc logic from the beginning, this was a misunderstanding in how the JS-to-C# layer transfers data in the function failure case. I have now fixed the Durable JS tests so that they accurately reflect the data contract that C# expects. That being said, it would be a good idea to add some tests to the C# side of things to make sure it does not regress its side of the contract. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some feedback related to class names and null/undefined handling that I'd like to discuss.
When we throw errors in the JS layer, we need some way to embed the orchestration replay state in the errors so that the C# extension can replay behavior. This is necessary so that the Durable Task does not throw an non-determinism exception.
This also requires a fix on the C# extension to be able to read this embedded data from the RPC exception. Here is the PR for the fix in Durable 1.x: Azure/azure-functions-durable-extension#1171.
Resolve #125