Serialization of Actions, Funcs, delegates and events. #11

Open
rikimaru0345 opened this issue Dec 29, 2018 · 2 comments

@rikimaru0345
Owner

commented Dec 29, 2018

tl;dr: Static methods: 🆗 Everything else: WIP 🚧 (work in progress)

Can Ceras serialize my Action, Func, event field, ... ?

At the moment only references to static methods are supported.
That means:

Action myAction = SomeStaticClass.MyStaticMethod;

ceras.Serialize(myAction); // works fine

Static methods are a special case. It's trivial because the only thing one really needs to remember is the target method (and ceras already serializes MethodInfos).
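To illustrate why the static case is so easy, here is a minimal sketch that uses only plain .NET reflection. It is not how Ceras does it internally; `StaticDelegateSketch`, `Describe`, `Rebuild`, and the body of `SomeStaticClass` are made-up names for illustration. It assumes the method has no overloads and that the declaring type can be found again through `Type.GetType`.

```csharp
using System;
using System.Reflection;

public static class SomeStaticClass
{
    public static int Calls;
    public static void MyStaticMethod() => Calls++;
}

// Hypothetical helper, not part of Ceras.
public static class StaticDelegateSketch
{
    // All that has to be remembered: delegate type, declaring type, method name.
    public static (string DelegateType, string DeclaringType, string MethodName) Describe(Delegate d)
    {
        if (d.Target != null || !d.Method.IsStatic)
            throw new NotSupportedException("Only plain static method references are handled here.");

        return (d.GetType().AssemblyQualifiedName,
                d.Method.DeclaringType.AssemblyQualifiedName,
                d.Method.Name);
    }

    public static Delegate Rebuild((string DelegateType, string DeclaringType, string MethodName) info)
    {
        Type delegateType = Type.GetType(info.DelegateType, throwOnError: true);
        Type declaringType = Type.GetType(info.DeclaringType, throwOnError: true);

        // Assumption: no overloads. A real implementation would also record parameter types.
        MethodInfo method = declaringType.GetMethod(
            info.MethodName,
            BindingFlags.Static | BindingFlags.Public | BindingFlags.NonPublic);

        return Delegate.CreateDelegate(delegateType, method);
    }

    public static void Main()
    {
        Action original = SomeStaticClass.MyStaticMethod;
        var info = Describe(original);          // three strings, no object graph
        var copy = (Action)Rebuild(info);
        copy();
        Console.WriteLine(SomeStaticClass.Calls); // 1
    }
}
```

Nothing but three strings has to survive the round trip, which is why no object graph gets dragged in.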

What about serializing references to instance methods? Or lambda expressions?

Beyond the simple case things become pretty complicated pretty quickly.
So that's why I'm opening this issue:

  • let people know about the limitations, and general things you should be aware of when trying to serialize delegates
  • track the state of the "delegate serialization" feature in Ceras.

The main issue with delegates is that they drag in other stuff in a completely invisible way.
And not just that, oftentimes the compiler will also add completely unrelated variables to the hidden capturing class!

Example:

You'd think that the only thing that's getting captured is the 6.

  • The 5 is already part of the generated code (because it is a static class and a static field).
  • The 6 is a local variable, so it gets captured into an anonymous field.
  • The 7 is part of the code itself, nothing to do here.
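A minimal sketch of the kind of lambda the three bullets describe. Only the values 5, 6 and 7 come from the text; `CaptureExample`, `staticField`, `local` and `MakeFunc` are hypothetical names. The non-null `Target` is what the current C# compiler (Roslyn) produces, not something the language guarantees.

```csharp
using System;

public static class CaptureExample
{
    // Hypothetical names; only the values 5, 6 and 7 come from the text.
    static int staticField = 5;      // static field: the lambda body reads it directly

    public static Func<int> MakeFunc()
    {
        int local = 6;               // local: hoisted into a field of a hidden capture object
        return () => staticField + local + 7;   // 7 is a constant inside the method body
    }

    public static void Main()
    {
        Func<int> f = MakeFunc();
        Console.WriteLine(f());                     // 18
        Console.WriteLine(f.Target != null);        // the hidden capture object (Roslyn behavior)
        Console.WriteLine(f.Method.DeclaringType);  // a compiler-generated type
    }
}
```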

Now let's take a look at the actual code that's getting executed here:

Ok... so far so good.
But once we check out the compiler generated object (which holds those captured variables) we're in for a pretty big surprise.

Where does DelegateValueHolder suddenly come from?

DelegateValueHolder is a simple test class that is being used further down in my DelegatesTest() method, and it's completely unrelated to what we're doing.

So why is it being included in the generated class then? WTF?

The answer is: optimization. The compiler only generates one capture class for each method that is being used as a capture source.

So that means when serializing delegates of any kind, we can't even be sure that we're not accidentally capturing all sorts of crazy unrelated stuff. The compiler can (and will as we've seen) put whatever it wants into one big capture-object.
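The effect can be reproduced with a small, self-contained sketch. All names here (`SharedCaptureSketch`, `UnrelatedThing`, `CapturedFields`) are hypothetical stand-ins, and the behavior shown is an implementation detail of the current Roslyn compiler, which (as far as I know) creates one capture object per scope that contains captured locals.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class SharedCaptureSketch
{
    // Stand-in for a class that has nothing to do with the first lambda.
    public sealed class UnrelatedThing { public int Value = 123; }

    public static (Func<int> First, Func<int> Second) Make()
    {
        int local = 6;
        var unrelated = new UnrelatedThing();

        Func<int> first = () => local + 7;         // only uses 'local'
        Func<int> second = () => unrelated.Value;  // only uses 'unrelated'
        return (first, second);
    }

    // Lists the fields of the hidden capture object behind a delegate.
    public static string[] CapturedFields(Delegate d) =>
        d.Target.GetType()
         .GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic)
         .Select(f => f.FieldType.Name + " " + f.Name)
         .ToArray();

    public static void Main()
    {
        var pair = Make();
        // Both lambdas hang off the same capture object...
        Console.WriteLine(ReferenceEquals(pair.First.Target, pair.Second.Target));
        // ...so 'first' drags 'unrelated' along even though it never touches it.
        foreach (string field in CapturedFields(pair.First))
            Console.WriteLine(field);
    }
}
```

Serializing `first` by walking its `Target` would therefore also serialize the `UnrelatedThing` instance.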

Now what does that mean for Ceras and delegate serialization?
It means that we cannot assume anything about how a delegate works or is constructed.
There are 4 things to know in order to construct (deserialize) an arbitrary delegate:

  1. The type. That one is obvious, we need to know if our delegate is a Func<>, Action, MyCustomDelegateDefinition, System.EventHandler, or whatever else. All of them inherit from System.MulticastDelegate, which brings us to:
  2. The invocation list. Since pretty much all delegates are MulticastDelegate, invoking one can actually cause multiple method calls (yes, that also includes any normal Action)! Anyone can call Delegate.Combine(a,b); and assign the result back to what looks like a simple direct method reference. So there's no way around that. Luckily myFunc.GetInvocationList(); gives us all targets.
  3. The target object.
  4. The target method.
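The four pieces above map directly onto standard .NET APIs. The following is a rough sketch of taking a delegate apart and putting it back together; `DelegateParts`, `Entry`, `Decompose` and `Recompose` are hypothetical names, and this says nothing about how the target object itself would be serialized.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical helper, not part of Ceras.
public static class DelegateParts
{
    public sealed class Entry
    {
        public object Target;      // 3. the target object (null for static methods)
        public MethodInfo Method;  // 4. the target method
    }

    public static (Type DelegateType, Entry[] Entries) Decompose(Delegate d)
    {
        // 2. the invocation list: one entry per method that would be called
        Entry[] entries = d.GetInvocationList()
            .Select(single => new Entry { Target = single.Target, Method = single.Method })
            .ToArray();

        // 1. the delegate type
        return (d.GetType(), entries);
    }

    public static Delegate Recompose(Type delegateType, Entry[] entries)
    {
        Delegate[] singles = entries
            .Select(e => Delegate.CreateDelegate(delegateType, e.Target, e.Method))
            .ToArray();
        return Delegate.Combine(singles);
    }

    // Small static method used only for the usage example.
    public static int Counter;
    public static void Increment() => Counter++;

    public static void Main()
    {
        Action twice = Increment;
        twice += Increment;                       // looks like a plain Action, calls two methods

        var parts = Decompose(twice);
        Console.WriteLine(parts.Entries.Length);  // 2

        var rebuilt = (Action)Recompose(parts.DelegateType, parts.Entries);
        rebuilt();
        Console.WriteLine(Counter);               // 2
    }
}
```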

The last two points also have their own intricacies since there are 3 different cases:

  • Static: Target method is static. So the method is saved directly. The object is null. This is the simplest case, there are no possible side-effects and everything should always work perfectly fine.
  • Instance: For instance methods, the method is saved directly, while the object becomes the object on which to call the method. This is where the first issues appear. If you save a delegate like this var my
  • Lambda: This is where stuff gets complicated. The method is part of a hidden, compiler-generated class! And as we've seen it might include all sorts of references!
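One way to tell the three cases apart at runtime is sketched below. It is a heuristic under stated assumptions: it relies on the compiler marking generated closure classes and lambda methods with `CompilerGeneratedAttribute`, which is what Roslyn does today. `DelegateKind` and `DelegateClassifier` are hypothetical names, and the method expects a single entry from an invocation list.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

public enum DelegateKind { Static, Instance, Lambda }

// Hypothetical helper, not part of Ceras.
public static class DelegateClassifier
{
    public static DelegateKind Classify(Delegate single)
    {
        MethodInfo method = single.Method;
        Type declaring = method.DeclaringType;

        // Lambdas live in (or are themselves) compiler-generated members.
        bool compilerGenerated =
            method.IsDefined(typeof(CompilerGeneratedAttribute), false) ||
            (declaring != null && declaring.IsDefined(typeof(CompilerGeneratedAttribute), false));

        if (compilerGenerated)
            return DelegateKind.Lambda;

        if (method.IsStatic && single.Target == null)
            return DelegateKind.Static;

        return DelegateKind.Instance;
    }

    public static void Main()
    {
        Action staticRef = Console.WriteLine;
        var text = new System.Text.StringBuilder("hi");
        Func<string> instanceRef = text.ToString;
        int captured = 6;
        Func<int> lambda = () => captured + 7;

        Console.WriteLine(Classify(staticRef));    // Static
        Console.WriteLine(Classify(instanceRef));  // Instance
        Console.WriteLine(Classify(lambda));       // Lambda
    }
}
```

Note that with Roslyn even a lambda that captures nothing is classified as Lambda here, because it is compiled into a method on a generated class.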

In the object-case Ceras is pretty much forced to include a reference to the object.
After all, how else could it construct the delegate when deserializing again?
So what would we do? Simply include the object in full? Most likely that target object has some fields and properties, which in turn reference other objects...
So that'd essentially drag in a huge graph of objects.
Obviously this is just asking for trouble.

In the lambda-case things are even more difficult.
Lambda functions, more often than not, capture multiple objects (not just one as in the object-case!).
And to make it even worse, some of the objects might not even be related to the delegate at all!
But wait! There's more!
Not only does the compiler give us some auto-generated object full of random methods and object references, it is of course also allowed to name all those hidden things however it likes.
Why this is a huge problem (with no clear or easy solution at all) is explained in an answer on StackOverflow, which basically says that any code change can throw off the naming (and thus invalidate previously serialized data). Even adding, removing, or changing some other, completely unrelated thing can (and will) break everything. (Link to the post)

Possible Solutions

So what can we do?

  • Naive way
    Simply allowing every object to be dragged into the serialization graph is a sure-fire way to immediately get all sorts of bugs, and that is generously assuming that the serialization/deserialization itself won't already break in all sorts of places. So that one will obviously get us nowhere.

  • Simply resolve by ID
    If we only consider delegates pointing to instance-methods (so no lambda-expressions!), we could just have a sort of resolver.
    The idea is really very simple and in fact it already works. If we mark the target object with the IExternalRootObject interface, then Ceras would simply use the given ID that the object provides. And at deserialization time Ceras gives us that ID again and we resolve the correct kind of object. I'll post an example of that soon (remind me if I don't 😄).
    Now while that works perfectly fine, it's not exactly obvious. So maybe some new implementation which does exactly the same but is just named differently...

  • White-List?
    Maybe some sort of white-list on each delegate field could be used?
    When serializing and we're dealing with some crazy lambda expression, Ceras would use the white-list to check which of the fields in the hidden compiler-generated object it should include.
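To make the "resolve by ID" option a bit more concrete, here is a minimal sketch of the idea in plain .NET. It deliberately does not use the Ceras API: `IHasStableId` is a hypothetical stand-in for the role IExternalRootObject plays, and `Player`, `SavedDelegate`, `Save` and `Load` are invented for illustration. It only covers delegates that point to instance methods.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-in; NOT the Ceras IExternalRootObject interface.
public interface IHasStableId { int StableId { get; } }

public sealed class Player : IHasStableId
{
    public int StableId { get; }
    public int Hits;
    public Player(int id) { StableId = id; }
    public void OnHit() => Hits++;
}

public static class IdResolverSketch
{
    // What gets written instead of the target object itself.
    public sealed class SavedDelegate
    {
        public Type DelegateType;
        public int TargetId;
        public MethodInfo Method;
    }

    public static SavedDelegate Save(Delegate d)
    {
        if (!(d.Target is IHasStableId target))
            throw new NotSupportedException("Target must expose a stable ID.");

        return new SavedDelegate
        {
            DelegateType = d.GetType(),
            TargetId = target.StableId,   // the ID replaces the whole object graph
            Method = d.Method
        };
    }

    public static Delegate Load(SavedDelegate saved, IReadOnlyDictionary<int, object> liveObjects)
    {
        // The application decides which live object an ID maps to.
        object target = liveObjects[saved.TargetId];
        return Delegate.CreateDelegate(saved.DelegateType, target, saved.Method);
    }

    public static void Main()
    {
        var original = new Player(42);
        Action onHit = original.OnHit;
        SavedDelegate saved = Save(onHit);

        // Later, possibly in another process: a different instance with the same ID.
        var restored = new Player(42);
        var lookup = new Dictionary<int, object> { [42] = restored };

        ((Action)Load(saved, lookup))();
        Console.WriteLine(restored.Hits);  // 1
    }
}
```

The target object never enters the serialized data, so none of its fields or references can be dragged in by accident.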

I'm not sure what's the best way to do this.
Maybe a good start would be to first collect all sorts of use-cases, so feel free to post here and tell me about what delegates you'd like to serialize, why, and how you'd expect Ceras to deal with references to the target object, or objects/variables.
Any ideas welcome :P

@rikimaru0345

Owner Author

commented Jan 18, 2019

As mentioned, support for delegates that point to static methods is trivial.
So from 62f62f9 onwards Ceras supports that case; Delegate and descendant types are no longer banned.
There's a check in the DelegateFormatter to ensure no instance-data is accidentally captured.
Invocation lists are fully supported as well.
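A check of that kind could look roughly like the sketch below. This is not the actual DelegateFormatter code; `StaticOnlyGuard` and `EnsureStaticOnly` are hypothetical names, and the rule it enforces (every invocation list entry must be a static method with no target) is an assumption based on the description above.

```csharp
using System;

// Hypothetical guard, not the actual Ceras DelegateFormatter.
public static class StaticOnlyGuard
{
    public static void EnsureStaticOnly(Delegate d)
    {
        // Every entry of the invocation list has to pass, not just the last one.
        foreach (Delegate single in d.GetInvocationList())
        {
            if (single.Target != null || !single.Method.IsStatic)
                throw new InvalidOperationException(
                    $"Delegate entry '{single.Method.Name}' is bound to instance data and is rejected.");
        }
    }

    public static void Main()
    {
        Action ok = Console.WriteLine;
        EnsureStaticOnly(ok);              // passes silently

        int counter = 0;
        Action bad = () => counter++;
        try { EnsureStaticOnly(bad); }
        catch (InvalidOperationException e) { Console.WriteLine(e.Message); }
    }
}
```

With the current Roslyn compiler this rule also rejects lambdas that capture nothing, since those are compiled as instance methods on a generated class.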

@RubyNova


commented Jan 29, 2019

Based on my experience with my own OSS projects, I would recommend giving each object a Guid you use as part of the metadata identification. I do something similar in a MUD engine I'm working on with a few friends, you can see how that's being done here, along with the persistence layer we've implemented:
https://github.com/RubyNova/SharperUniverse/blob/master/SharperUniverse/Core/SharperEntity.cs
https://github.com/RubyNova/SharperUniverse/blob/master/SharperUniverse/Persistence/LiteDbProvider.cs

While this is being used to reconstruct an ECS state, and not directly for delegates per se, the same idea still applies, and I think it's something you should consider. If you're happy with it then I'd be happy to implement it in a PR.

EDIT: Just realised there's a minor issue I need to address in this particular implementation but the principle remains the same.
