Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

object array not serialized. #26

Closed
bj0 opened this issue May 6, 2013 · 38 comments
Closed

object array not serialized. #26

bj0 opened this issue May 6, 2013 · 38 comments

Comments

@bj0
Copy link

bj0 commented May 6, 2013

If I try to serialize an object array like so:

var s = new Serializer();
var tw = new StringWriter();
s.Serialize(tw, new { Name="test", Item=new object[] { 1, 2, "3"} } );
var output = tw.ToString()

I get output like this:

Name: test
Item: 
- {}
- {}
- {}

It seems that it does not properly get the boxed values from the object array.

@roji
Copy link
Contributor

roji commented May 7, 2013

Antoine, I can confirm this, will work on a pull request for a fix.

@roji
Copy link
Contributor

roji commented May 7, 2013

@aaubry, I've looked into this and I just want your opinion before I go ahead and do a semi-major refactor...

The problem starts in FullObjectGraphTraversalStrategy.SerializeList; for any non-generic list, the item type passed to Traverse() is object, and Traverse() than attempts to serialize the item types (which are boxed integers) as objects, and this doesn't work.

What I'm not sure about, is why Traverse() receives a type parameter externally in the first place. Instead of determining the type outside Traverse() each time (in complicated and sometimes wrong ways), I would like to have Traverse() simply accept a value and extract its type internally with GetType(). This would simplify everything and fix this bug.

Here's another example of why the current system is not so good... Say you have a generic list List, which contains some instances of B (where B inherits A). Under the current system, the type used for serialization is A (the generic parameter of the list), and so instances of B are serialized as A. This could provoke some bad errors if the type is consulted during serialization (e.g. property attributes like aliasing, ignore).

Makes sense to you? Can you think of a good reason why types should be determined outside Traverse()?

@roji
Copy link
Contributor

roji commented May 14, 2013

Hey @aaubry, what do you think of the above? Will it be easier to understand if I just implement my proposal and submit a pull request?

@cdhunt
Copy link

cdhunt commented May 15, 2013

Would this also affect a List<> in a class?

public class IisSite
    {
        public string Id { get; set; }

        public string Name { get; set; }

        [Display(Name = "AppPool Name")]
        public string AppPool { get; set; }

        [Display(Name = "Home Directory")]
        public string Path { get; set; }

        [Display(Name = "Host Names")]
        public List<String> HostHeaders = new List<string>();

        [Display(Name = "Virtual Directories")]
        public StringDictionary VirtualDirectories = new StringDictionary();
    }

Right now, the strings get serialized fine, but there is nothing serialized for List or StringDictionary.

@roji
Copy link
Contributor

roji commented May 15, 2013

I'm not sure, I don't think so... The problem I described above is related only to non-generic lists containing primitives (since their type is assumed to be object and object serialization of primitives doesn't work).

Are you sure it isn't because the strings are properties, but the list and dictionary are fields? If I remember correctly YamlDotNet only serializes properties...

@cdhunt
Copy link

cdhunt commented May 15, 2013

That makes sense. Thanks.

@roji
Copy link
Contributor

roji commented May 15, 2013

Sure. When you have some time let me know if my proposal makes sense, or if I should just write it up as a pull request to make it more understandable.

@cdhunt
Copy link

cdhunt commented May 15, 2013

I see Traverse(propertyDescriptor.Name, typeof(string), visitor, currentDepth); where the type value is statically passed. I assume that propertyDescriptor.Name is a string so the value of type doesn't need to be static for that case. Your proposal make sense as far as I understand it, but it looks like quite a bit of work is involved to remove the type parameter.

@roji
Copy link
Contributor

roji commented May 15, 2013

That's right, although the idea would be to remove the type parameter from all Traverse methods...

It's indeed a bit of work and a non-trivial change, let me give it a shot and we'll see how it goes. As I said, this would resolve a number of other bugs as well, so I think it's worth it...

@aaubry
Copy link
Owner

aaubry commented May 15, 2013

You can't remove the type parameter from the Traverse methods. You can't use the actual type of the object, at least not always, because in many cases that's not the intended behavior. When you are working with explicit interface implementations or with internal classes that inherit from a base you will encounter issues.
Consider the following types:

public interface ICustomer {
    int Id { get; }
    string Name { get; }
}

public class DbCustomer : ICustomer {
    public int CustomerId { get; set; }
    public string Name { get; set; }
    int ICustomer.Id { get { return CustomerId; } }
    public List<Order> Orders { get; set; }
}

If you serialize a List that contains instances of DbCustomer, you will get unwanted properties CustomerId and Orders, and you won't get Id.

Here's another example:

public class Customer {
    public int Id { get; set; }
    public string Name { get; set; }
}

public class DbCustomerRepository {
    public List<Customer> GetCustomers() {
        // Return a list of DbCustomer with the Repository property set to this.
    }

    private class DbCustomer : Customer {
        public DbCustomerRepository Repository { get; set; }
    }
}

We should see how other serializers address this issue. I believe that the current behavior is consistent with serializers such as DataContractSerializer and XmlSerializer.

On the other hand, I agree that it the other behavior is sometimes desired. There should be a way to enable it, but I am not sure how it should be done. In the worse case, one could need both behaviors in different parts of the same object graph.

And, of course, when the detected type is Object, it seems to always makes sense to use the actual type.

What do you think?

@roji
Copy link
Contributor

roji commented May 17, 2013

Hi Antoine.

I think we have a "philosophical" issue here :) In your first example, I don't think I'd consider the CustomerId and Orders properties unwanted, to the contrary - I would expect/want them to be serialized... :) Did not understand the second example though.

So the question is if YamlDotNet should serialize according to actual types encountered in the runtime traversal of the object graph (which I was proposing), or according to statically declared types of properties (the current behavior).

  • I've checked and I think XmlSerializer looks at runtime and not static types, here's an example: https://gist.github.com/roji/5597851. Although the list is of type List, the DerivedClass's property it still emitted. I can confirm that protobuf.net (https://code.google.com/p/protobuf-net/) has the same behavior, and I'm pretty sure DataSerializer has the same as well (although I haven't confirmed).
  • The current behavior seems to create an inconsistency between generic and non-generic collections. Consider what happens when we serialize List and when we serialize List; for List, we have no type that can be used, and so the item type is inferred via GetType() (in GetObjectType()) - "polymorphic" serialization. For List, we serialize as object, which has no properties and will always come out empty. I think this is an argument in favor of looking at the actual runtime type...
  • Note that issue Derived classes serialization with YamlDotNet.UnitTests.RepresentationModel #14 complains about the lack of support for derived classes, and includes some tests on this.
  • Having said all this, I can definitely see cases where you wouldn't want to serialize according to the runtime type (the current behavior). IMHO the default behavior should be to serialize according to the object's actual type, but allow a SerializeAs type override for special cases where you want to ignore subclasses (or just serialize using a different type). I think this would correspond to people's intuitions (serialize object in the graph by their actual types) and fit most scenarios out of the box, while allowing an override when needed.

    Hope this wasn't too long (probably was though). Let me know your thoughts...

@aaubry
Copy link
Owner

aaubry commented May 21, 2013

That is a convincing argument and I agree that many times it is better to
use the actual type. There are some use cases, however where it is
necessary to have the current behavior. I think we need to offer both
options with the default being to use the actual type.

I am currently on vacation and I will have difficulty following this thread
until the end of this week. I need to take a look at the code to get a
better sense of what changes are required.

@roji
Copy link
Contributor

roji commented May 21, 2013

Thanks for taking the time to read through and respond...

Is it OK if I work on this and submit a pull request in the coming days, or do you prefer to work on this yourself?

Regarding the override mechanism (not to use the actual type), here is a thought... We are starting to have many different attributes YamlDotNet (YamlAlias, YamlIgnore). How about consolidating these into a single [[YamlMember]](a bit like .NET's [[DataMember]]), with attributes like Ignore, Alias, SerializeAsType (which would be the new "don't use actual type"), etc.

Tell me what you think, or just enjoy your vacation :)

@cdhunt
Copy link

cdhunt commented May 21, 2013

Some more reference material.

Json.Net is a very well documented serializer.

The authors of Service Stack TypeSerializer claim it to be pretty much the best text serializer.
https://code.google.com/p/servicestack/wiki/TypeSerializer

@roji This bug is affecting me. I'd be happy to test your proposed fix.

@roji
Copy link
Contributor

roji commented May 21, 2013

@cdhunt, will try to work on a fix for this ASAP. Unfortunately your original issue is part of a much larger design decision/discussion.

@cdhunt
Copy link

cdhunt commented May 21, 2013

No rush. Testing is about all I'm good for so just offering that sort of help if/when it's needed.

@aaubry
Copy link
Owner

aaubry commented May 29, 2013

I think this should be easily fixed by making minor changes to FullObjectGraphTraversalStrategy. I think we can safely alter the default behavior to always use the actual type of the object.

For the rare cases where the base type / interface is desired, we can provide an attribute, e.g.:

public interface ICustomer {
    int Id { get; }
    string Name { get; }
}

[SerializeAs(typeof(ICustomer))]
public class DbCustomer : ICustomer {
    public int CustomerId { get; set; }
    public string Name { get; set; }
    int ICustomer.Id { get { return CustomerId; } }
    public List<Order> Orders { get; set; }
}

The Type parameter is still needed on the Traverse methods because if the actual type is different from the declared type, we need to add a tag to indicate it. Otherwise, it won't be possible to deserialize the document.

@roji, are you working on this, or should I do it?

@roji
Copy link
Contributor

roji commented May 30, 2013

@aaubry, I'll work on it and submit a pull request later in the next few days.

Agree with everything you say... I'll rename the type parameter on the Traverse methods to typeOverride.

One question though (raised above): how about creating a single YamlMember attribute, rather than keep adding things like YamlIgnore, YamlAlias? My inspiration here would be Protobuf (https://code.google.com/p/protobuf-net/wiki/Attributes). What do you think?

roji pushed a commit to roji/YamlDotNet that referenced this issue May 30, 2013
roji pushed a commit to roji/YamlDotNet that referenced this issue May 30, 2013
Previously, serialization happened according to the properties's
declared type. This created problems for cases where the actual value
type differed from the declared type (polymorphism, ints boxed as
objects...).

Closes aaubry#26, aaubry#14
roji pushed a commit to roji/YamlDotNet that referenced this issue May 30, 2013
- Allows manual control of the type used when serializing
- Added unit tests
- Related to aaubry#26
- NOTE: Although serialization works great, the two unit tests currently
  fail because of round-tripping issues related to the new polymorphism
@roji
Copy link
Contributor

roji commented May 30, 2013

@aaubry, take a look at my work in the last commit. We have a small problem when trying to roundtrip, though...

Although the serializer is now properly aware of everything, the deserializer isn't. So if a property of type Parent contains an actual value of type Child, the deserializer fails since it tries to instantiate a Parent and assign it the Child property from the yaml (and the property doesn't exist).

Here's what I propose... In any case where the actual type differs from the property type, we can add a yaml tag to indicate the actual class name (with !). What do you think?

@aaubry
Copy link
Owner

aaubry commented May 31, 2013

Yes, if the actual type that is serialized is different from the statically known type, it is necessary to add a tag with the type.

Speaking of tags, the last time I checked the spec, it mentioned that a tag should always be a valid URI. The current scheme for .NET type names is to use the full name of the type, which is not an URI. I think we need to fix this problem. Maybe simply use some custom scheme, so instead of:

!<!MyType%2C%20MyAssembly>

we would use:

!<!clr:MyType%2C%20MyAssembly>

What do you think?

@aaubry
Copy link
Owner

aaubry commented May 31, 2013

Also, I agree with YamlMemberAttribute. I forgot to mention that :)

@roji
Copy link
Contributor

roji commented May 31, 2013

I seem to remember a distinction between local tags, which aren't required to be URIs, and global tags which are... Let me dig in the specs a little about this. In any case I'll work on integrating the tag for when the static type differs from the runtime type.

I'll also add the functionality of the YamlAlias attribute into YamlMember and mark YamlAlias obsolete.

@roji
Copy link
Contributor

roji commented Jun 4, 2013

OK @aaubry, here's some more work: roji@1975255

When the runtime type being serialized differs from the static (e.g. property) type, a tag is emitted with the full qualified type name. This allows us to roundtrip while preserving all types. All unit tests pass now.

Some comments:

  • It seems OK to do whatever we want in YAML local tags (tags that being with !), they don't have to be URIs... It might be a bit nicer to define a global tag prefix for YamlDotNet and put all .NET types under it, but it really doesn't seem very important...
  • The type tags are only emitted if Roundtrip is requested, since they're useless otherwise.
  • Yes, the tag is really really ugly and long (run the new tests). We may be able to use tag handles to make things more pretty, but until we look into it it the current situation seem good enough - it's not a horrible price to pay for full roundtripping, and only where the type is necessary (polymorphism).
  • The tag generation should probably be centralized somewhere in the serializer, for now I just put code like "!" + Type.AssemblyQualifiedName in the visitors.

Let me know what you think about all of this... and @cdhunt, if you want to test this and confirm it solves your problem that would be great.

@aaubry
Copy link
Owner

aaubry commented Jun 11, 2013

I haven't managed to fully review you commit, but it seems to me that you are making the IObjectGraphVisitor interface too specific. Adding emitTag does not seem to make sense because it is only useful to the emitter visitor. Also, I think it is misplaced because there are other places where it is necessary to emit the tag, like in a sequence.

@roji
Copy link
Contributor

roji commented Jun 13, 2013

I see what you mean @aaubry, let me try to make it a bit cleaner.

It's a little tricky since three things affect the decision of whether to emit the tag:

  1. We need to be in roundtrip mode - I'll have the ObjectEmittingVisitor accept this flag in its constructor, this is clean.
  2. Whether the actual type differs from the static type. This means I'll have to add the static type to the visitor API.
  3. Whether the SerializeAs feature is used and how - I'll think about this a little bit more.

So I'll submit another attempt in a couple days...

@roji
Copy link
Contributor

roji commented Jun 19, 2013

@aaubry, I've refactored the polymorphic serialization to better respect the boundaries between the traversal strategy and the visitors. Let me know what you think:

https://github.com/roji/YamlDotNet/tree/bug26

PS Because some actual functionality moved from the traversal strategy to the visitors, there is now some substantial duplication between EmittnigObjectGraphVisitor and AnchorAssigningObjectGraphVisitor - I'm thinking they may need to be merged into one class which has emitAnchors true/false and roundtrip true/false. We can talk about this later...

@roji
Copy link
Contributor

roji commented Aug 23, 2013

Hey @aaubry, I kind forgot this was left in a hanging state... Have you had any time to look at what I've done?

@aaubry
Copy link
Owner

aaubry commented Aug 25, 2013

Yes, I know. Sorry about that. I did start looking at your changes, but I have been having some personal issues and was unable to dedicate enough time to this. I expect to have more time during this week and I promise to finish analyzing your code and give you feedback.

@roji
Copy link
Contributor

roji commented Aug 25, 2013

No problem Antoine, was just checking... Whenever you have time. Hope everything is OK.

aaubry added a commit that referenced this issue Aug 29, 2013
@aaubry
Copy link
Owner

aaubry commented Aug 29, 2013

So, I have looked at your branch, and I have made some adjustments that I have committed into a branch. I did those changes because I think some of the changes introduced responsibilities to types that were already responsible for something else.

One such issue was with FullObjectGraphTraversalStrategy. Its purpose is to traverse an object graph, but it also had to know about YamlMemberAttribute and serializeAsType.

AnchorAssigningObjectGraphVisitor and EmittingObjectGraphVisitor also had a problem because they had to take care of adding the tag when there's already a class that takes care of that, TypeAssigningEventEmitter.

The changes that I made are an attempt to fix these issues. There was already ITypeDescriptor for taking care of obtaining metadata about a type. I added the logic of selecting the runtime type there, then I had to adapt the remainder of the code to it. TypeAssigningEventEmitter is now in charge of adding the tag when needed.

I would like your feedback on this code. This is work in progress. There's some stuff that needs to be refactored and at least one type, ITypeDescriptor, is poorly named and needs to be renamed.

@roji
Copy link
Contributor

roji commented Aug 31, 2013

Thanks for taking the time and review so seriously. I'll look over your changes in the next two days.

@aaubry
Copy link
Owner

aaubry commented Sep 29, 2013

@roji, did you have a chance to look at this? If you agree, I would like to merge this into the master branch.

@roji
Copy link
Contributor

roji commented Sep 29, 2013

@aaubry, really sorry, I had an unexpectedly busy time recently. I promise to get around and look within the next few days, if that somehow still doesn't happen you're welcome to merge.

@roji
Copy link
Contributor

roji commented Sep 30, 2013

@aaubry, I finally managed to take a look, here are some comments.

Your code definitely fits the traversal strategy and visitor pattern much better than what I did. I admit that to my taste, there is an over-abundance of interfaces hiding descriptors, resolvers and other such abstractions which create what feels like an overly abstract / engineered system that is a bit difficult to follow... but I guess it really is a matter of taste. But overall it does feel cleaner.

I see that you provide a configurable option for static vs. dynamic type resolving. I understand the the desire to give as much flexibility as possible to users but this seems a bit too much for me... It's a pretty deep/complicated difference that most users won't really understand, and I've yet to see a convincing use case where static resolving makes more sense (or another serializer that provides this feature). But it's no big deal :)

One last thing - a significantly larger number of objects seem to be created in this new versions. I'm thinking mainly of all the IObjectDescriptors, but there may be others. I'm wondering if this isn't starting to affect performance a little, it might be a good idea to do a bit of comparative profiling.

@aaubry
Copy link
Owner

aaubry commented Oct 2, 2013

Thanks for taking the time to look at this. The point of the abstractions you mention is to be able to continue adding features by plugging-in other implementations without having to change the existing code. I agree that it introduces some complexity, but the opposite would be to have tightly-coupled code that is difficult to change.

The option for static typing is there just for compatibility with the old behavior, in case somebody relies on it. I agree that this is a rather obscure feature and maybe I should remove it.

The amount of objects created is a valid concern. I will look into it and see what could be done to improve this area. I will also create some tests to compare the performance of the new version versus the previous one.

@roji
Copy link
Contributor

roji commented Oct 2, 2013

OK. Hopefully the memory/performance thing won't be a real issue. Please let me know if I can help in any way.

aaubry added a commit that referenced this issue Oct 17, 2013
Conflicts:
	YamlDotNet.Core.Test/EmitterTests.cs
	YamlDotNet.RepresentationModel.Test/SerializationTests.cs
@aaubry
Copy link
Owner

aaubry commented Oct 20, 2013

I have merged the code that fixes this issue and published a prerelease package with the new features. I have yet to test the performance of this version versus the previous one. As soon as I am confident that there are no issues, and if no bugs are detected, I will publish a release package.

@aaubry aaubry closed this as completed Aug 30, 2016
@tomkerkhove
Copy link

Is there a sample where this is illustrated? Would be perfect if it did serialization & deserialization.

This is something I currently have to do but a bit hard to find how to achieve this.

Example:

items:
  - name: "abc"
    type: "Foo"
    description: "This is a sample"
  - name: "xyz"
    type: "Bar"
    url: "https://google.com"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants