New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
object array not serialized. #26
Comments
Antoine, I can confirm this, will work on a pull request for a fix. |
@aaubry, I've looked into this and I just want your opinion before I go ahead and do a semi-major refactor... The problem starts in FullObjectGraphTraversalStrategy.SerializeList; for any non-generic list, the item type passed to Traverse() is object, and Traverse() than attempts to serialize the item types (which are boxed integers) as objects, and this doesn't work. What I'm not sure about, is why Traverse() receives a type parameter externally in the first place. Instead of determining the type outside Traverse() each time (in complicated and sometimes wrong ways), I would like to have Traverse() simply accept a value and extract its type internally with GetType(). This would simplify everything and fix this bug. Here's another example of why the current system is not so good... Say you have a generic list List, which contains some instances of B (where B inherits A). Under the current system, the type used for serialization is A (the generic parameter of the list), and so instances of B are serialized as A. This could provoke some bad errors if the type is consulted during serialization (e.g. property attributes like aliasing, ignore). Makes sense to you? Can you think of a good reason why types should be determined outside Traverse()? |
Hey @aaubry, what do you think of the above? Will it be easier to understand if I just implement my proposal and submit a pull request? |
Would this also affect a List<> in a class? public class IisSite
{
public string Id { get; set; }
public string Name { get; set; }
[Display(Name = "AppPool Name")]
public string AppPool { get; set; }
[Display(Name = "Home Directory")]
public string Path { get; set; }
[Display(Name = "Host Names")]
public List<String> HostHeaders = new List<string>();
[Display(Name = "Virtual Directories")]
public StringDictionary VirtualDirectories = new StringDictionary();
} Right now, the strings get serialized fine, but there is nothing serialized for List or StringDictionary. |
I'm not sure, I don't think so... The problem I described above is related only to non-generic lists containing primitives (since their type is assumed to be object and object serialization of primitives doesn't work). Are you sure it isn't because the strings are properties, but the list and dictionary are fields? If I remember correctly YamlDotNet only serializes properties... |
That makes sense. Thanks. |
Sure. When you have some time let me know if my proposal makes sense, or if I should just write it up as a pull request to make it more understandable. |
I see |
That's right, although the idea would be to remove the type parameter from all Traverse methods... It's indeed a bit of work and a non-trivial change, let me give it a shot and we'll see how it goes. As I said, this would resolve a number of other bugs as well, so I think it's worth it... |
You can't remove the type parameter from the Traverse methods. You can't use the actual type of the object, at least not always, because in many cases that's not the intended behavior. When you are working with explicit interface implementations or with internal classes that inherit from a base you will encounter issues. public interface ICustomer {
int Id { get; }
string Name { get; }
}
public class DbCustomer : ICustomer {
public int CustomerId { get; set; }
public string Name { get; set; }
int ICustomer.Id { get { return CustomerId; } }
public List<Order> Orders { get; set; }
} If you serialize a List that contains instances of DbCustomer, you will get unwanted properties CustomerId and Orders, and you won't get Id. Here's another example: public class Customer {
public int Id { get; set; }
public string Name { get; set; }
}
public class DbCustomerRepository {
public List<Customer> GetCustomers() {
// Return a list of DbCustomer with the Repository property set to this.
}
private class DbCustomer : Customer {
public DbCustomerRepository Repository { get; set; }
}
} We should see how other serializers address this issue. I believe that the current behavior is consistent with serializers such as DataContractSerializer and XmlSerializer. On the other hand, I agree that it the other behavior is sometimes desired. There should be a way to enable it, but I am not sure how it should be done. In the worse case, one could need both behaviors in different parts of the same object graph. And, of course, when the detected type is Object, it seems to always makes sense to use the actual type. What do you think? |
Hi Antoine. I think we have a "philosophical" issue here :) In your first example, I don't think I'd consider the CustomerId and Orders properties unwanted, to the contrary - I would expect/want them to be serialized... :) Did not understand the second example though. So the question is if YamlDotNet should serialize according to actual types encountered in the runtime traversal of the object graph (which I was proposing), or according to statically declared types of properties (the current behavior).
Having said all this, I can definitely see cases where you wouldn't want to serialize according to the runtime type (the current behavior). IMHO the default behavior should be to serialize according to the object's actual type, but allow a SerializeAs type override for special cases where you want to ignore subclasses (or just serialize using a different type). I think this would correspond to people's intuitions (serialize object in the graph by their actual types) and fit most scenarios out of the box, while allowing an override when needed. Hope this wasn't too long (probably was though). Let me know your thoughts... |
That is a convincing argument and I agree that many times it is better to I am currently on vacation and I will have difficulty following this thread |
Thanks for taking the time to read through and respond... Is it OK if I work on this and submit a pull request in the coming days, or do you prefer to work on this yourself? Regarding the override mechanism (not to use the actual type), here is a thought... We are starting to have many different attributes YamlDotNet (YamlAlias, YamlIgnore). How about consolidating these into a single [[YamlMember]](a bit like .NET's [[DataMember]]), with attributes like Ignore, Alias, SerializeAsType (which would be the new "don't use actual type"), etc. Tell me what you think, or just enjoy your vacation :) |
Some more reference material. Json.Net is a very well documented serializer.
The authors of Service Stack TypeSerializer claim it to be pretty much the best text serializer. @roji This bug is affecting me. I'd be happy to test your proposed fix. |
@cdhunt, will try to work on a fix for this ASAP. Unfortunately your original issue is part of a much larger design decision/discussion. |
No rush. Testing is about all I'm good for so just offering that sort of help if/when it's needed. |
I think this should be easily fixed by making minor changes to FullObjectGraphTraversalStrategy. I think we can safely alter the default behavior to always use the actual type of the object. For the rare cases where the base type / interface is desired, we can provide an attribute, e.g.: public interface ICustomer {
int Id { get; }
string Name { get; }
}
[SerializeAs(typeof(ICustomer))]
public class DbCustomer : ICustomer {
public int CustomerId { get; set; }
public string Name { get; set; }
int ICustomer.Id { get { return CustomerId; } }
public List<Order> Orders { get; set; }
} The Type parameter is still needed on the Traverse methods because if the actual type is different from the declared type, we need to add a tag to indicate it. Otherwise, it won't be possible to deserialize the document. @roji, are you working on this, or should I do it? |
@aaubry, I'll work on it and submit a pull request later in the next few days. Agree with everything you say... I'll rename the type parameter on the Traverse methods to typeOverride. One question though (raised above): how about creating a single YamlMember attribute, rather than keep adding things like YamlIgnore, YamlAlias? My inspiration here would be Protobuf (https://code.google.com/p/protobuf-net/wiki/Attributes). What do you think? |
- Allows manual control of the type used when serializing - Added unit tests - Related to aaubry#26 - NOTE: Although serialization works great, the two unit tests currently fail because of round-tripping issues related to the new polymorphism
@aaubry, take a look at my work in the last commit. We have a small problem when trying to roundtrip, though... Although the serializer is now properly aware of everything, the deserializer isn't. So if a property of type Parent contains an actual value of type Child, the deserializer fails since it tries to instantiate a Parent and assign it the Child property from the yaml (and the property doesn't exist). Here's what I propose... In any case where the actual type differs from the property type, we can add a yaml tag to indicate the actual class name (with !). What do you think? |
Yes, if the actual type that is serialized is different from the statically known type, it is necessary to add a tag with the type. Speaking of tags, the last time I checked the spec, it mentioned that a tag should always be a valid URI. The current scheme for .NET type names is to use the full name of the type, which is not an URI. I think we need to fix this problem. Maybe simply use some custom scheme, so instead of:
we would use:
What do you think? |
Also, I agree with YamlMemberAttribute. I forgot to mention that :) |
I seem to remember a distinction between local tags, which aren't required to be URIs, and global tags which are... Let me dig in the specs a little about this. In any case I'll work on integrating the tag for when the static type differs from the runtime type. I'll also add the functionality of the YamlAlias attribute into YamlMember and mark YamlAlias obsolete. |
OK @aaubry, here's some more work: roji@1975255 When the runtime type being serialized differs from the static (e.g. property) type, a tag is emitted with the full qualified type name. This allows us to roundtrip while preserving all types. All unit tests pass now. Some comments:
Let me know what you think about all of this... and @cdhunt, if you want to test this and confirm it solves your problem that would be great. |
I haven't managed to fully review you commit, but it seems to me that you are making the IObjectGraphVisitor interface too specific. Adding emitTag does not seem to make sense because it is only useful to the emitter visitor. Also, I think it is misplaced because there are other places where it is necessary to emit the tag, like in a sequence. |
I see what you mean @aaubry, let me try to make it a bit cleaner. It's a little tricky since three things affect the decision of whether to emit the tag:
So I'll submit another attempt in a couple days... |
@aaubry, I've refactored the polymorphic serialization to better respect the boundaries between the traversal strategy and the visitors. Let me know what you think: https://github.com/roji/YamlDotNet/tree/bug26 PS Because some actual functionality moved from the traversal strategy to the visitors, there is now some substantial duplication between EmittnigObjectGraphVisitor and AnchorAssigningObjectGraphVisitor - I'm thinking they may need to be merged into one class which has emitAnchors true/false and roundtrip true/false. We can talk about this later... |
Hey @aaubry, I kind forgot this was left in a hanging state... Have you had any time to look at what I've done? |
Yes, I know. Sorry about that. I did start looking at your changes, but I have been having some personal issues and was unable to dedicate enough time to this. I expect to have more time during this week and I promise to finish analyzing your code and give you feedback. |
No problem Antoine, was just checking... Whenever you have time. Hope everything is OK. |
So, I have looked at your branch, and I have made some adjustments that I have committed into a branch. I did those changes because I think some of the changes introduced responsibilities to types that were already responsible for something else. One such issue was with FullObjectGraphTraversalStrategy. Its purpose is to traverse an object graph, but it also had to know about YamlMemberAttribute and serializeAsType. AnchorAssigningObjectGraphVisitor and EmittingObjectGraphVisitor also had a problem because they had to take care of adding the tag when there's already a class that takes care of that, TypeAssigningEventEmitter. The changes that I made are an attempt to fix these issues. There was already ITypeDescriptor for taking care of obtaining metadata about a type. I added the logic of selecting the runtime type there, then I had to adapt the remainder of the code to it. TypeAssigningEventEmitter is now in charge of adding the tag when needed. I would like your feedback on this code. This is work in progress. There's some stuff that needs to be refactored and at least one type, ITypeDescriptor, is poorly named and needs to be renamed. |
Thanks for taking the time and review so seriously. I'll look over your changes in the next two days. |
@roji, did you have a chance to look at this? If you agree, I would like to merge this into the master branch. |
@aaubry, really sorry, I had an unexpectedly busy time recently. I promise to get around and look within the next few days, if that somehow still doesn't happen you're welcome to merge. |
@aaubry, I finally managed to take a look, here are some comments. Your code definitely fits the traversal strategy and visitor pattern much better than what I did. I admit that to my taste, there is an over-abundance of interfaces hiding descriptors, resolvers and other such abstractions which create what feels like an overly abstract / engineered system that is a bit difficult to follow... but I guess it really is a matter of taste. But overall it does feel cleaner. I see that you provide a configurable option for static vs. dynamic type resolving. I understand the the desire to give as much flexibility as possible to users but this seems a bit too much for me... It's a pretty deep/complicated difference that most users won't really understand, and I've yet to see a convincing use case where static resolving makes more sense (or another serializer that provides this feature). But it's no big deal :) One last thing - a significantly larger number of objects seem to be created in this new versions. I'm thinking mainly of all the IObjectDescriptors, but there may be others. I'm wondering if this isn't starting to affect performance a little, it might be a good idea to do a bit of comparative profiling. |
Thanks for taking the time to look at this. The point of the abstractions you mention is to be able to continue adding features by plugging-in other implementations without having to change the existing code. I agree that it introduces some complexity, but the opposite would be to have tightly-coupled code that is difficult to change. The option for static typing is there just for compatibility with the old behavior, in case somebody relies on it. I agree that this is a rather obscure feature and maybe I should remove it. The amount of objects created is a valid concern. I will look into it and see what could be done to improve this area. I will also create some tests to compare the performance of the new version versus the previous one. |
OK. Hopefully the memory/performance thing won't be a real issue. Please let me know if I can help in any way. |
Conflicts: YamlDotNet.Core.Test/EmitterTests.cs YamlDotNet.RepresentationModel.Test/SerializationTests.cs
I have merged the code that fixes this issue and published a prerelease package with the new features. I have yet to test the performance of this version versus the previous one. As soon as I am confident that there are no issues, and if no bugs are detected, I will publish a release package. |
Is there a sample where this is illustrated? Would be perfect if it did serialization & deserialization. This is something I currently have to do but a bit hard to find how to achieve this. Example: items:
- name: "abc"
type: "Foo"
description: "This is a sample"
- name: "xyz"
type: "Bar"
url: "https://google.com" |
If I try to serialize an object array like so:
I get output like this:
It seems that it does not properly get the boxed values from the object array.
The text was updated successfully, but these errors were encountered: