Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Buffered deserialisation #774

Merged
merged 32 commits into from Apr 3, 2023

Conversation

JJ11teen
Copy link
Contributor

@JJ11teen JJ11teen commented Feb 2, 2023

Hello!

This is a draft PR to add support for deserialising yaml into different types, depending on values found during deserialisation. Specifically, it addresses scenarios mentioned in: #36 #343, #605, #731.

The foundational code included in this PR was written by @atruskie in a gist available here. I am opening this PR with their permission.

Notably, this PR only includes a subset of the functionality, specifically the ability to discern type by looking for a key that has the desired type encoded in its value. This PR does not include the functionality to discern type by looking at the presence of unique keys. It does lay the foundation to bring in this additional functionality in the future if desired.

My use case is very similar to the kubernetes use case (where there is a "kind" key), and this PR should satisfy the kubernetes use case.

In addition to less functionality, I have also:

  • Removed specific use case types that @atruskie included as examples in the gist.
  • Added functionality to specify a max depth/length for buffering values during deserialisation.
  • Added a helper method to DeserializerBuilder.

Here is an example of usage:

var deserializer = new DeserializerBuilder()
    .WithNamingConvention(CamelCaseNamingConvention.Instance)
    .WithBufferedNodeDeserializer(options => {
        options.AddKeyValueDiscriminator<UserSpecifiedBaseClass>(
            "kind",
            ("deployment", typeof(UserSpecifiedSubclassA)),
            ("service", typeof(UserSpecifiedSubclassB)));
    })
    .Build();

I wanted to put this up and gauge interest before working on this more, but the remaining things I know I need to do before merging:

  • Comments/code style changes - I have yet to read up on coding conventions but will do
  • Tests - I have only tested on net7 currently
  • Anything else?

Some other things that could be done in this or future PRs:

  • Add back in the other use of buffered deserialisation from @atruskie's gist?
  • Handle various case-translations for keys/values?

It is also worth mentioning @EdwardCooke that this also works with the AOT branch we've been testing, it just needs a similar helper method on the StaticDeserializerBuilder as that file is not in the codebase yet.

Let me know if this is desired and I'll keep working on it, and thanks for maintaining this library (:

@atruskie
Copy link
Contributor

atruskie commented Feb 2, 2023

Love this; let me know if you want some help to finish it off.

I'd say that the resolver you didn't implement is equally useful and is probably worth adding. I found it more common that yaml files did not include a discriminator key.

Or if you don't agree, I think it would be better to provide a configuration option method to add a custom ITypeDiscriminator so others can extend as they like.

@EdwardCooke
Copy link
Collaborator

This is great! I've seen a few questions/issues related to this. It'll be nice to have it built in.

A few things I see right off the top that would be great to have addressed

  • Comments on public classes and methods
  • Adding a way to add a custom IValueTypeDescriminator would be a good thing and allow for additional flexibility
  • It may be beneficial to mark your methods on your KeyValueTypeDiscriminator as virtual so people can use that class and extend it easily.
  • It looks like the determined class has to be implementing a base class, and only the inherited class is checked. That may not always be the case. Example use cases where that would not be sufficient.
    • The property being assigned may be an object type, not requiring it to be abstract or doing any kind of inheritance.
    • It may also be the class itself, not abstract or inherited.
    • Deep inheritance (class c inherits b, which inherits a and the property type is a).
    • Implementing an interface and the property type is that interface
    • One possible solution would be to have the discriminator do the check to see if it can assign itself to the type, adds more flexibility. And in the default implementation it would call the isassignablefrom method on the type. Pretty sure that's what it is.

For the tests, add them to the tests project. It runs them against net47, net 3.1 (for netstandard2.0), net6.0 and net7.0.

@JJ11teen
Copy link
Contributor Author

JJ11teen commented Feb 3, 2023

Thanks both, good to know there is appetite for this! 😊 I'll work on the things mentioned and let you know when its ready for a closer look (:

@JJ11teen
Copy link
Contributor Author

JJ11teen commented Feb 3, 2023

Okay, I think I have addressed what I consider "design" changes at this point. Still have comments & tests to do though (:
Changes:

  • Renamed "IValueTypeDiscriminator" -> "ITypeDiscriminator" and moved into a sub namespace/directory

  • Added in @atruskie's other "UniqueKeyTypeDiscriminator" functionality

  • Pulled out "TryFindMappingEntry" into existing "ParserExtensions" file

  • Added "AddTypeDiscriminator" to Options class to enable users to add their own custom ITypeDiscriminator implementation more easily

  • Updated "BufferedNodeDeserializer" to use "IsAssignableFrom", and have removed null & identity type checks - but have kept the concept of a "BaseType" on all discriminators. I believe this supports the scenarios you described @EdwardCooke:

    • The property being assigned may be an object type, not requiring it to be abstract or doing any kind of inheritance.
    • It may also be the class itself, not abstract or inherited.
    • Deep inheritance (class c inherits b, which inherits a and the property type is a).
    • Implementing an interface and the property type is that interface
  • However I stopped short of having the discriminator doing the assignment check, as my understanding is that this would result in everything (ie the whole yaml stream) being buffered. The current implementation only starts buffering when the expected type has an associated discriminator. Users can still explicitly use object for the BaseType as you describe, but it is the users responsibility. We can make the BaseType default to object if you think this is a common enough use case and always buffering by default is worth it?

@JJ11teen
Copy link
Contributor Author

JJ11teen commented Feb 3, 2023

Actually the BaseType = object scenario is more complex than I previously thought - won't get to it today but will keep thinking (:

@JJ11teen
Copy link
Contributor Author

JJ11teen commented Feb 6, 2023

Okay, I have reworked things such that BaseType = object is now supported (see tests).

In order to accomplish this, I have made the BufferedNodeDeserializer now wrap the entire list of NodeDeserializers that the Builder has specified. I found this was needed as otherwise the DictionaryNodeDeserializer would take precedence when BaseType=object, and the BufferedNodeDeserializer was never reached. I also tried just inserting the previous implementation earlier in the builder's deserializer list, but that failed when the deserialized type had a Dictionary as a sub type (as in the case for kubernetes labels/annotations). I'm wondering if you're aware of a better way to handle this NodeDeserializer dependency than what I've got now @EdwardCooke ?

@JJ11teen
Copy link
Contributor Author

JJ11teen commented Feb 6, 2023

I can't think of a better solution without changing the signature for INodeDeserializer.

public interface INodeDeserializer
{
    bool Deserialize(IParser reader, Type expectedType, Func<IParser, Type, object?> nestedObjectDeserializer, out object? value);
}

The only output of each INodeDeserializer is the deserialized value, (and whether or not the deserializer was able to deserialize any value). This is at odds with the desire for the BufferedNodeDeserializer, which is to "Buffer and deserialize to determine a target type, then reset the buffer and redirect to an appropriate existing INodeDeserializer that matches the newly determined target type".

This is why the BufferedNodeDeserializer needs to wrap all other INodeDeserializers. However all the others also need to exist outside of the BufferedNodeDeserializer for cases when buffering isn't needed/desired.

I don't like the duplication of all of the INodeDeserializers but I think it is a trade off between that and more significantly impacting the existing codebase/functionality/user expectations. Does anyone have any ideas for a more elegant solution? 😂

@JJ11teen JJ11teen marked this pull request as ready for review February 7, 2023 01:21
@JJ11teen
Copy link
Contributor Author

Hey @EdwardCooke, just a ping that this is ready for review (:
I rebased on master and added the helper to the static builder yesterday too.

@EdwardCooke
Copy link
Collaborator

Great! I’ll look at it in greater detail when I get some time. Most likely this weekend.

@JJ11teen
Copy link
Contributor Author

Thanks for the review @EdwardCooke! Will aim to get back to this over the coming week 😊

@JJ11teen
Copy link
Contributor Author

JJ11teen commented Mar 5, 2023

I think I've addressed everything @EdwardCooke! 😊
@atruskie if you have time it would be great to get your eyes over the change to TryFindMappingEntry and how I have described it's functionality in the xml doc. I've given it a go with my understanding but I may have missed a quirk of it's original design/intention.

Copy link
Contributor

@atruskie atruskie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fantastic!

A few notes:

  • Your tests don't cover a fall through case (i.e. two registrations of AddKeyValueTypeDiscriminator or AddUniqueKeyTypeDiscriminator) but the code seems to support it. I.e. would this work:
            var bufferedDeserializer = new DeserializerBuilder()
                .WithNamingConvention(CamelCaseNamingConvention.Instance)
                .WithBufferedNodeDeserializer(options => {
                    options.AddKeyValueTypeDiscriminator<KubernetesResource>(
                        "kind",
                        new Dictionary<string, Type>()
                        {
                            { "Namespace", typeof(KubernetesNamespace) },
                        })
                       .AddKeyValueTypeDiscriminator<KubernetesResource>(
                        "kind",
                        new Dictionary<string, Type>()
                        {
                            { "Service", typeof(KubernetesService) }
                        });
                    },
                    maxDepth: 3,
                    maxLength: 40)
                .Build();
  • The name BufferedNodeDeserializer is probably one I would change (I know I chose the name originally, my bad) since it doesn't describe what it's doing but rather how - and how is unimportant. Suggestion: TypeDiscriminatingNodeDeserializer.

I'm so hyped you did the work for this. There's no way I would have had the capacity, so thank you!

YamlDotNet/Core/ParserExtensions.cs Outdated Show resolved Hide resolved
YamlDotNet/Core/ParserExtensions.cs Outdated Show resolved Hide resolved
@JJ11teen
Copy link
Contributor Author

Thanks @atruskie! Have made those changes (:

@EdwardCooke
Copy link
Collaborator

Once the latest comments are fixed we should be good to merge this in.

@JJ11teen
Copy link
Contributor Author

All done @EdwardCooke! (:

@EdwardCooke
Copy link
Collaborator

I’ll merge this in when I get time, should be this week.

@EdwardCooke EdwardCooke merged commit 9a5900f into aaubry:master Apr 3, 2023
@pmikstacki
Copy link

Hi! When it will be available as nuget package?

@aaubry
Copy link
Owner

aaubry commented Apr 16, 2023

This feature has been released in version 13.1.0.

@rcdailey
Copy link

I have just pulled down the 13.1.0 release and tried using this, but the lack of documentation is killing me.

  • This PR didn't serve as a helpful form of documentation due to the change in class names over time
  • There's no documentation markdown anywhere that has tutorials or the like to help people get started with this.
  • Some public interfaces, like ITypeDiscriminatingNodeDeserializerOptions, nave no C# documentation so it's unclear, for example, what the difference is between AddUniqueKeyTypeDiscriminator and AddKeyValueTypeDiscriminator.

Don't get me wrong, I'm not here just to complain. Just pointing out something that I think will make it difficult for users to adopt this new feature. I'm sure if I swim into the unit tests I'll eventually figure it out, but hours are pretty valuable these days...

In any case, I appreciate the months you folks spent on getting this ready.

@EdwardCooke
Copy link
Collaborator

Thanks for the feedback, I'll work on building out some documentation in the wiki, I know this particular feature was a bit complex in nature. I'll also work on getting those missing XML comments on the interface.

@EdwardCooke
Copy link
Collaborator

Here's a little wiki page documenting how to use this new feature

https://github.com/aaubry/YamlDotNet/wiki/Deserialization---Type-Discrimators

Copy link

@R3ven6eR R3ven6eR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • [ ]

@rcdailey
Copy link

Here's a little wiki page documenting how to use this new feature

aaubry/YamlDotNet/wiki/Deserialization---Type-Discrimators

For future visitors, this is the correct link: https://github.com/aaubry/YamlDotNet/wiki/Deserialization---Type-Discriminators

Thanks again for the examples!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants