Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deserializing type hierarchies doesn't seem to work #8200

Closed
erik-kallen opened this issue May 28, 2024 · 10 comments
Closed

Deserializing type hierarchies doesn't seem to work #8200

erik-kallen opened this issue May 28, 2024 · 10 comments
Labels
8.x Relates to 8.x client version Category: Not an issue

Comments

@erik-kallen
Copy link

Elastic.Clients.Elasticsearch version: 8.12.0

Elasticsearch version: 8.5.2

.NET runtime version: 8

Operating system version: Windows 10

Description of the problem including expected versus actual behavior:

Deserializing response type hierarchies doesn't seem to work. My code:

[JsonDerivedType(typeof(Derived1), typeDiscriminator: "d1")]
[JsonDerivedType(typeof(Derived2), typeDiscriminator: "d2")]
public abstract class Base
{
}

public class Derived1 : Base {}
public class Derived2 : Base {}

...

var response = await _client.MultiSearchAsync<Base>(req => req.AddSearch(...));

**Expected behavior**
The responses should be deserialized to the correct derived type

**Actual behavior**

An exception with this stack trace:

Elastic.Transport.UnexpectedTransportException : Unable to deserialize union.
----> System.Text.Json.JsonException : Unable to deserialize union.
at Elastic.Transport.DefaultHttpTransport1.ThrowUnexpectedTransportException[TResponse](Exception killerException, List1 seenExceptions, RequestData requestData, TResponse response, RequestPipeline pipeline)
at Elastic.Transport.DefaultHttpTransport`1.RequestCoreAsync[TResponse](Boolean isAsync, HttpMethod method, String path, PostData data, RequestParameters requestParameters, OpenTelemetryData openTelemetryData, CancellationToken cancellationToken)
at ...

--JsonException
at Elastic.Clients.Elasticsearch.Serialization.UnionConverter.DerivedUnionConverterInner3.Read(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options) in /_/src/Elastic.Clients.Elasticsearch.Shared/Serialization/UnionConverter.cs:line 127 at System.Text.Json.Serialization.JsonConverter1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value, Boolean& isPopulatedValue)
at System.Text.Json.Serialization.JsonCollectionConverter2.OnTryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, TCollection& value) at System.Text.Json.Serialization.JsonConverter1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value, Boolean& isPopulatedValue)
at System.Text.Json.Serialization.Metadata.JsonPropertyInfo1.ReadJsonAndSetMember(Object obj, ReadStack& state, Utf8JsonReader& reader) at System.Text.Json.Serialization.Converters.ObjectDefaultConverter1.OnTryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value)
at System.Text.Json.Serialization.JsonConverter1.TryRead(Utf8JsonReader& reader, Type typeToConvert, JsonSerializerOptions options, ReadStack& state, T& value, Boolean& isPopulatedValue) at System.Text.Json.Serialization.JsonConverter1.ReadCore(Utf8JsonReader& reader, JsonSerializerOptions options, ReadStack& state)
at System.Text.Json.Serialization.Metadata.JsonTypeInfo1.DeserializeAsync(Stream utf8Json, CancellationToken cancellationToken) at Elastic.Transport.DefaultResponseBuilder1.SetBodyCoreAsync[TResponse](Boolean isAsync, ApiCallDetails details, RequestData requestData, Stream responseStream, String mimeType, CancellationToken cancellationToken)
at Elastic.Transport.DefaultResponseBuilder1.ToResponseAsync[TResponse](RequestData requestData, Exception ex, Nullable1 statusCode, Dictionary2 headers, Stream responseStream, String mimeType, Int64 contentLength, IReadOnlyDictionary2 threadPoolStats, IReadOnlyDictionary2 tcpStats, CancellationToken cancellationToken) at Elastic.Transport.HttpTransportClient.RequestCoreAsync[TResponse](Boolean isAsync, RequestData requestData, CancellationToken cancellationToken) at Elastic.Transport.DefaultRequestPipeline1.CallProductEndpointCoreAsync[TResponse](Boolean isAsync, RequestData requestData, CancellationToken cancellationToken)
at Elastic.Transport.DefaultHttpTransport`1.RequestCoreAsync[TResponse](Boolean isAsync, HttpMethod method, String path, PostData data, RequestParameters requestParameters, OpenTelemetryData openTelemetryData, CancellationToken cancellationToken)


@erik-kallen erik-kallen added 8.x Relates to 8.x client version Category: Bug labels May 28, 2024
@flobernd
Copy link
Member

flobernd commented May 28, 2024

Hi @erik-kallen,

this is not a bug in the Elasticsearch client. The root exception is not correctly propagated it seems:

System.NotSupportedException: 'Deserialization of types without a parameterless constructor, a singular parameterized constructor, or a parameterized constructor annotated with 'JsonConstructorAttribute' is not supported.

This indicates that the descriminator value is missing and the deserializer tries to instantiate the abstract Base class.

Could you please share the code that you use to index your entities? It's important to make sure you actually serialize a Base object and not Derived1/Derived2. Otherwise System.Text.Json does not emit the descriminator property.

For example:

JsonSerializer.Serialize(new Derived1()); // <- {}
JsonSerializer.Serialize((Base)new Derived1()); // <- {"$type":"d1"}

This is a bit unintuitive, but unfortunately the way STJ works internally.

If you - for some reason - can't cast to the base class before indexing, the following might serve as a workaround as well (untested):

public class Derived1 : Base
{
	[JsonPropertyName("$type")]
	public string Discriminator => "d1";
}

public class Derived2 : Base
{
	[JsonPropertyName("$type")]
	public string Discriminator => "d2";
}

Please let me know if that solves your issue.

@erik-kallen
Copy link
Author

No, I checked and the $type property was present in the response.

@erik-kallen
Copy link
Author

But you are right, my contrived repro does, indeed, work. So there is something else that triggers the problem somewhere. I'll investigate further.

@erik-kallen
Copy link
Author

@flobernd The problem has to do with SourceConfig. I'm not sure whether I'd call this a bug in ElasticSearch.Net, but it is an issue nevertheless. This will actually fail:

[TestFixture]
public class TestIt
{
    [JsonDerivedType(typeof(Derived1), typeDiscriminator: "D1")]
    [JsonDerivedType(typeof(Derived2), typeDiscriminator: "D2")]
    abstract class Base
    {
    }
    class Derived1 : Base
    {
        public string? P1 { get; set; }
    }
    class Derived2 : Base
    {
        public string? P2 { get; set; }
    }

    [Test]
    public async Task DoTestIt()
    {
        var client = Universe.ElasticClient;
        var settings = new JsonSerializerOptions { PropertyNamingPolicy = JsonNamingPolicy.CamelCase };
        var s = JsonSerializer.Serialize<Base>(new Derived1(), settings);
        var d = JsonSerializer.Deserialize<Base>(s, settings);

        await client.Indices.DeleteAsync("test");
        await client.Indices.RefreshAsync();
        await client.Indices.CreateAsync<Base>(req => req.Index("test"));
        await client.IndexManyAsync(new Base[] {  new Derived1 { P1 = "hello" }, new Derived2 { P2 = "world" } }, "test");
        await client.Indices.RefreshAsync();
        var response = await client.MultiSearchAsync<Base>(req => req.Indices("test").AddSearch(new SearchRequestItem(new MultisearchBody() { Source = new SourceConfig(new SourceFilter { Includes = new[] { (Field)"$type", Prop<Derived1>(d => d.P1) } }) })));
        var firstItem = response.Responses.Single();
        firstItem.Match(success =>
        {
            var docs = success.HitsMetadata.Hits.ToList();
        }, fail => Assert.Fail(fail.ToString()!));
    }
}

The problem is that in the presence of a SourceConfig, Elastic will return fields out of order, which the deserializer is not happy with.

@erik-kallen
Copy link
Author

Seems like this can be fixed in .net 9: dotnet/runtime#72604

@flobernd
Copy link
Member

flobernd commented May 28, 2024

@erik-kallen Thanks for investigating. I can see why STJ wants the descriminator as the first property for performance reasons, but it's obviously a blocker in this case.

We had to solve a similar issue internally:

Due to the use of code-generation it's possible for us to implement a solution without a noticable performance hit. We have a custom converter for all polymorphic classes that collects the value of all possible properties in local variables. At the very end, we create the actual instance of the class based on the discriminator and initialize all properties.

A more generic approach would look like this:

using System;
using System.Text.Json;
using System.Text.Json.Serialization;

using Elastic.ClientGenerator.Schema.Values;

using ZySharp.Validation;

namespace Elastic.ClientGenerator.Schema.Serialization;

#pragma warning disable CA1812

internal sealed class SchemaValueConverter :
    JsonConverter<SchemaValue>

#pragma warning restore CA1812
{
    public override bool CanConvert(Type typeToConvert)
    {
        ValidateArgument.For(typeToConvert, nameof(typeToConvert), v => v.NotNull());

        return typeToConvert.IsAssignableFrom(typeof(SchemaValue));
    }

    public override SchemaValue? Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        ValidateArgument.For(typeToConvert, nameof(typeToConvert), v => v.NotNull());

#pragma warning disable CA2000
        if (!JsonDocument.TryParseValue(ref reader, out var doc))
#pragma warning restore CA2000
        {
            throw new JsonException("Failed to parse JsonDocument");
        }

        if (!doc.RootElement.TryGetProperty("kind", out var kind))
        {
            throw new JsonException("Failed to read 'kind' property.");
        }

        var kindValue = kind.Deserialize<SchemaValueKind>();
        var rootElement = doc.RootElement.GetRawText();

        doc.Dispose();

        return kindValue switch
        {
            SchemaValueKind.InstanceOf => JsonSerializer.Deserialize<SchemaValueInstanceOf>(rootElement, options),
            SchemaValueKind.ArrayOf => JsonSerializer.Deserialize<SchemaValueArrayOf>(rootElement, options),
            SchemaValueKind.UnionOf => JsonSerializer.Deserialize<SchemaValueUnionOf>(rootElement, options),
            SchemaValueKind.DictionaryOf => JsonSerializer.Deserialize<SchemaValueDictionaryOf>(rootElement, options),
            SchemaValueKind.UserDefinedValue => JsonSerializer.Deserialize<SchemaValueUserDefinedValue>(rootElement, options),
            SchemaValueKind.LiteralValue => JsonSerializer.Deserialize<SchemaValueLiteralValue>(rootElement, options),
            _ => throw new NotSupportedException($"Unsupported '{typeToConvert.Name}' variant")
        };
    }

    public override void Write(Utf8JsonWriter writer, SchemaValue value, JsonSerializerOptions options)
    {
        throw new NotImplementedException();
    }
}
[JsonConverter(typeof(SchemaValueConverter))]
public abstract record SchemaValue
{
    ...

This obviously comes with a performance hit due to the double deserialization. Maybe it helps anyways.

@flobernd
Copy link
Member

Since we are using the STJ NuGet package anyways, I'm happy to upgrade to 9.0.0 as soon as there is a stable version. For now I guess, we have to live with the workarounds.

There is nothing I can do from my side.

@erik-kallen
Copy link
Author

I agree, I don't think you should change anything.

But would it be feasible to make the JsonSerializerOptions configurable?

Also, what do you mean with that you are using the STJ NuGet package? Won't that cause issues if someone is using a .net version whose STJ version doesn't match the one you are referencing from Nuget?

@flobernd
Copy link
Member

I agree, I don't think you should change anything.

👍 Going to close this issue for now.

But would it be feasible to make the JsonSerializerOptions configurable?

This is already possible (a little bit hidden) by using the ElasticsearchClientSettings constructor overload that accepts a SourceSerializerFactory lambda:

var settings = new ElasticsearchClientSettings(new SingleNodePool(new Uri("...")),
    (serializer, settings) =>
        new DefaultSourceSerializer(settings, options =>
        {
            options.AllowTrailingCommas = true;
        }));

Also, what do you mean with that you are using the STJ NuGet package? Won't that cause issues if someone is using a .net version whose STJ version doesn't match the one you are referencing from Nuget?

It's a transient dependency pulled in by Elastic.Transport and so far there havn't been any problems I'm aware of:
https://www.nuget.org/packages/Elastic.Transport/#dependencies-body-tab

The package is only pulled in, if needed. For net8 targets, the native framework is used, but e.g. net6 references the NuGet package.

@erik-kallen
Copy link
Author

Then I guess upgrading STJ to 9.0 wouldn't be a good idea. But perhaps there could be an option to change this setting if on .net9? Or perhaps that's not necessary because there is a workaround available so the user can specify the settings they want.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
8.x Relates to 8.x client version Category: Not an issue
Projects
None yet
Development

No branches or pull requests

2 participants