Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failed to deserialize SqlException from TaskActivity #697

Closed
expertsender-marcinsynak opened this issue Apr 10, 2019 · 18 comments
Closed

Failed to deserialize SqlException from TaskActivity #697

expertsender-marcinsynak opened this issue Apr 10, 2019 · 18 comments
Labels

Comments

@expertsender-marcinsynak

Error similar to the one fixed in #459 is still happening in some cases in v1.8.0.

I have a much simpler case where simple orchestrator calls an activity that executes some SQL query. The query fails and activity throws SqlException. It seems that this kind of exception is especially nasty to serialize:

Microsoft.Azure.WebJobs.FunctionFailedException: The activity function 'GetViewers' failed: "Failed to deserialize exception from TaskActivity: {"$type":"System.Data.SqlClient.SqlException, System.Data.SqlClient","ClassName":"System.Data.SqlClient.SqlException","Message":"Invalid object name 'ActiveViewer_1_19'."

The whole stack trace is very long, but I think this is an interesting part:

System.InvalidCastException: Unable to cast object of type 'Newtonsoft.Json.Linq.JValue' to type 'System.Guid'. at System.Data.SqlClient.SqlException..ctor(SerializationInfo si, StreamingContext sc) at lambda_method(Closure , Object[] ) at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.CreateISerializable(JsonReader reader, JsonISerializableContract contract, JsonProperty member, String id)

There is indeed a Guid field in SqlException and it seems it causes the serialization issue:

"ClientConnectionId":"b8c66db2-dfcc-47f1-9f33-f3863f07b387"

This should be very easy to reproduce. Just make an activity that executes wrong SQL.

@ghost ghost added the Needs: Triage 🔍 label Apr 10, 2019
@cgillum cgillum added this to the Triage milestone Apr 12, 2019
@tompostler
Copy link

I believe this is because Azure Functions (and Durable Tasks) run with Newtonsoft v9. Newtonsoft v10 has a fix to load assemblies by partial name match (lines 80-81: JamesNK/Newtonsoft.Json@4605894#diff-0bb45c7f4b1341bf2e93b461f66bd0f5).

However, I'm struggling to confirm that a later version of Newtonsoft would fix this issue. No level of static or runtime binding redirects that I can define appear to affect the v9 Newtonsoft usage.

@ConnorMcMahon
Copy link
Contributor

We can potentially fix this for the v2 release of Durable Functions extension. We should at least be able to fix this for Functions V2. Functions V1 extensions have more limitations about what dependencies they can use due to the lack of separation of app domains in the Functions V1 runtime.

@ghost
Copy link

ghost commented Jan 21, 2020

I just hit this problem as well. However the message in my case is the following:

Unable to find a constructor to use for type System.Data.SqlClient.SqlException. A class should either have a default constructor, one constructor with arguments or a constructor marked with the JsonConstructor attribute. Path 'ClassName', line 1, position 81.

I am using .NET Core 3.1.
Is there something that I can use to work around this issue?

@ConnorMcMahon
Copy link
Contributor

I would try injecting an IErrorSerializerSettingsFactory (introduced in v2.1) with custom JsonConverter objects on it to get around this issue. We will be adding official documentation for how to do this soon (#1208).

@brendon-r
Copy link

I'm seeing the same error message as @b10-dslappendel recently.

@ConnorMcMahon could you elaborate on your workaround?

@ConnorMcMahon
Copy link
Contributor

This stack overflow answer shows how to use dependency injection to use customer serialization settings for messages. You can follow a nearly identical solution, but with IErrorSerializerSettingsFactory to support custom JSON serialization for exceptions.

@junalmeida
Copy link

This is also a thing for me because it invalidates the code in RetryOptions.Handle when using the method CallActivityWithRetryAsync.

Is there any estimates?

@ConnorMcMahon
Copy link
Contributor

@junalmeida, we are unlikely to change our own serialization strategy, as that can have broad reaching implications. This is why we exposed the IErrorSerializerSettingsFactory via dependency injection, so that you can provide your own application with the serialization settings you need to handle errors thrown by your application in a robust way.

@junalmeida
Copy link

junalmeida commented Oct 28, 2020

Currently my fast-track workaround is to catch the SqlException, determine if it is a transient one to then throw another exception that durable functions can handle.

I understand that a fix on that may cause side effects, however, I don't think keeping it as it is today is an acceptable path, because the internal serializer used causes an exception and obfuscates the real one, clearly a bug or at least an incompatibility between two Microsoft owned libraries. Maybe at least a plan to fix on the next major version release?

@ConnorMcMahon
Copy link
Contributor

@junalmeida, I would challenge the notion that this is a bug, and assert this is a general limitation. Certain types do not serialize/deserialize cleanly, and one of the requirements for Durable Functions is that the data that is persisted (be that orchestration/activity inputs, outputs or exceptions) have to be serializable and deserializable. I think we need to more clearly document these limitations, but at the end of the day, there is not much we can do if the data given to us cannot be serialized/deserialized ina consistent way.

There are sets of settings with Newtonsoft that would potentially work with a wider variety of types that we could consider down the road in a Durable Functions v3.x (which is not planned anytime soon), but these often have a tradeoff of being larger strings and being more rigid and fragile to things like type renames. Essentially, serialization is a hard problem to make work in a generic fashion, so we chose what we deem to be reasonable defaults, and allow apps to customize the serialization/deserialization logic to their own needs.

We are looking into solutions that would handle this more gracefully, but for now, the answer is to provide your own serialization.

@ConnorMcMahon
Copy link
Contributor

As for this specific exception, it looks like the owner of that library are working to make it work better with serialization in their new major version: dotnet/SqlClient#524.

@junalmeida
Copy link

I understand that certain types can't serialize/deserialize cleanly, but we are not talking about user-generated code, we are talking about the very well known SqlException, and the impact that failing to deserialize it causes on the RetryOptions.Handle method, a piece of code supposed to be supported by this library, currently not fully working as expected due to this bug or limitation.

I agree with you that these limitations should be better documented, however do you think that stating that SqlException can't be thrown is something to be well received by your users? I don't think so, but this is a call on your team to decide. I can speak for myself and I don't agree that is acceptable to add a custom code to deal with a limitation between two libraries owned by same vendor, specially touching the serialization subject, one of the main foundations Durable Functions is built on top of.

Don't get me wrong, this library is perfect, works very well on a variety of scenarios I need today. I just gently disagree on a decision to not fixing a limitation caused by the combination of two libraries owned by the same vendor that are more than supposedly meant to be used together.

@ConnorMcMahon
Copy link
Contributor

That's completely fair @junalmeida. My main hesitation is that there are alot of official Microsoft libraries, and making sure we serialize all of the exceptions from each of them (even just the most commonly used ones) would put a fairly large burden on us as a development team, especially since new libraries are being created all of the time.

Ideally, each library's developers should ensure that their data/exceptions is more easily serializable with common serialization libraries. In this specific case, it looks like the SqlException defined in versions greater than v2.0.0 of Microsoft.Data.SqlClient works much better with Newtonsoft than the version defined inthe legacy System.Data.SqlClient namespace.

@robs
Copy link

robs commented Jan 27, 2021

I'm getting a similar issue with "Failed to deserialize" when using context.CallHttpAsync if the target endpoint doesn't exist.

It looks like it's because the string passed to JsonDataConverter.Deserialize isn't the serialized exception, it's just the exception message.

Change line 174 in DurableTask.Core\TaskActivityDispatcher.cs
from
? $"Unhandled exception while executing task: {e}\n\t{e.StackTrace}"
to
new JsonDataConverter().Serialize(e)

That resolves the raft of "Failed to deserialize" errors in the logs because it's now the serialized exception instead of a message.
I appreciate that creating a new JsonDataConverter isn't the right way to do that but thought I'd mention it as it's nothing to do with the serializability of the exception.

@ConnorMcMahon
Copy link
Contributor

@cgillum, taking a look at the code @robs linked to here, this does seem like a bug in DurableTask.Core, as the details we pass into TaskFailedEvent should be a serialized exception, not a constructed string like at line 174.

I'm a little concerned about that change, considering it would impact all Durable Task customers, but I don't see any situation where this would be negative unless customers have written around this bug by explicitly catching TaskFailedExceptionDeserializationException. Thoughts?

@cgillum
Copy link
Collaborator

cgillum commented Jan 27, 2021

I'm a little concerned about that change, considering it would impact all Durable Task customers

@ConnorMcMahon I'm a little less concerned about this because this particular catch block is primarily intended to catch infrastructure exceptions and not those thrown by activity functions. Rather, the catch (TaskFailureException) block above this is meant to handle exceptions in customer code. That said, the way the exception is thrown doesn't make a lot of sense to me and I think it would be useful to fix it so that orchestrations can properly handle the exception. I acknowledge that this would be a breaking change though if existing DTFx users took a dependency on this buggy behavior.

I'm getting a similar issue with "Failed to deserialize" when using context.CallHttpAsync if the target endpoint doesn't exist.

This probably means we need to add a catch block in TaskHttpActivityShim.cs that catches exceptions from HttpClient and rethrows them as a TaskFailureException, just like we do when we encounter OperationCanceledException. @bachuv has some context on this from the timeout work she did.

@ConnorMcMahon
Copy link
Contributor

ConnorMcMahon commented Jan 27, 2021

I think the best course of action would be to split @robs concerns into two separate issues. One in the durabletask repo to track improving the infrastructure exceptions handling, and another in this repo to track the CallHttpAsync() exception.

I'll try to file these today, and close this issue, because I believe the original issue was addressed as well, and these are completely separate.

@ConnorMcMahon
Copy link
Contributor

Both issues created.

Since they were unrelated to the initial issue, and that initial issue has a workaround, I'm going to close this issue for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

7 participants