Add Orleans.Serialization library as a high-fidelity, version-tolerant serializer #7070

Merged

@ReubenBond ReubenBond commented May 6, 2021

This PR integrates a new version-tolerant serializer and RPC primitives into Orleans. For more context, and prior discussion, please see #2653 and https://github.com/ReubenBond/Hagar.

The serializer implemented in this PR requires that developers annotate their types with an attribute on the type and an attribute on each serialized field/property. There is an included code analyzer to ease the process, generating the per-data-member attributes. Here is an example of an annotated type:

[GenerateSerializer]
public class ClassWithEnumTestData
{
    [Id(0)]
    public TestEnum EnumValue { get; set; }

    [Id(1)]
    public CampaignEnemyTestType Enemy { get; set; }
}

The wire format takes inspiration from Bond and Protocol Buffers, but adds support for nominal types and references, which would otherwise need to be encoded on top of the base protocol. This is the primary reason for its existence: we believe that supporting both natively greatly improves the experience for .NET developers, giving them substantially more freedom when designing their applications and modelling their data.
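To make the Bond/Protocol Buffers influence concrete, here is a minimal sketch of why id-tagged fields give version tolerance: a reader can skip any field id it does not recognise. This is an illustration only and is not the Orleans.Serialization wire format; `TaggedFieldSketch`, `WriteField`, `ReadKnownFields`, and the fixed 4-byte id and length headers are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical illustration of id-tagged fields; NOT the actual Orleans wire format.
public static class TaggedFieldSketch
{
    // Writes one field as: [4-byte id] [4-byte payload length] [payload bytes].
    public static void WriteField(BinaryWriter writer, uint id, byte[] payload)
    {
        writer.Write(id);
        writer.Write(payload.Length);
        writer.Write(payload);
    }

    // Reads every field in the buffer, keeping only ids the caller knows about.
    // Fields written by a newer version of the type are skipped rather than rejected.
    public static Dictionary<uint, byte[]> ReadKnownFields(byte[] data, ISet<uint> knownIds)
    {
        var result = new Dictionary<uint, byte[]>();
        using var reader = new BinaryReader(new MemoryStream(data));
        while (reader.BaseStream.Position < reader.BaseStream.Length)
        {
            uint id = reader.ReadUInt32();
            int length = reader.ReadInt32();
            byte[] payload = reader.ReadBytes(length);
            if (payload.Length != length)
            {
                throw new EndOfStreamException("Field payload is truncated.");
            }

            if (knownIds.Contains(id))
            {
                result[id] = payload;
            }
        }

        return result;
    }
}
```

In this sketch, a reader built against a type with only `[Id(0)]` and `[Id(1)]` members would pass `{ 0, 1 }` as `knownIds` and ignore a field with id 2 added by a newer writer.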

Developers can configure the system to use Newtonsoft.Json or another package as the serializer for their types instead, similarly to how IExternalSerializer works today. Unlike the existing serializer, even data serialized by external serializers must conform to a basic wire format (typically, by emitting a length-prefixed sequence).

Deep copiers are decoupled from serializers. Their coupling is an issue with the existing system: developers often do not want to replace copiers when replacing serializers.

Previously unserializable/undeserializable Exceptions can be serialized with a decent degree of fidelity using this serializer, so that the receiver can see the exception type, message, stack trace, and other core properties even if the concrete exception type is not available. If the concrete exception type is not available, then the properties are deserialized into a special-purpose UnavailableExceptionFallbackException object.

A new NuGet package, named Orleans.Sdk, takes the place of installing multiple NuGet packages on grain interface/class/

Application Parts are removed entirely: the code generator package generates attributes which serve the purpose of Application Parts in identifying the closure of assemblies which contain types that Orleans must know about. There is still a way to remove grain classes & interfaces programmatically, by configuring GrainTypeOptions, which has a Classes and an Interfaces property. Assemblies can also be manually included in the closure by adding an assembly-level attribute to the main assembly.

While Orleans serialization is not intended for use between untrusted parties, this serializer greatly improves security aspects of serialization in a variety of ways:

  • Only known types are allowed to be serialized - types are disabled by default. Types which are annotated are allowed to be deserialized, and predicates for which types are allowed can be specified at configuration time.
  • Length fields are validated to prevent an attacker from forcing the deserializing party to pre-allocate a buffer which is larger than the payload.
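The length-field check in the second bullet can be sketched roughly as follows. This is a simplified illustration rather than the code in this PR: `LengthPrefixSketch`, `ReadLengthPrefixed`, and the fixed 4-byte little-endian prefix are assumptions made for the example. The point is that the declared length is compared against the bytes actually received before any buffer is allocated.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical illustration of validating a length prefix before allocating.
public static class LengthPrefixSketch
{
    // Reads a 4-byte little-endian length prefix, validates it against the
    // remaining input, and only then copies the payload into a new array.
    public static byte[] ReadLengthPrefixed(ReadOnlySpan<byte> input)
    {
        if (input.Length < sizeof(int))
        {
            throw new InvalidOperationException("Input is too short to contain a length prefix.");
        }

        int declared = BinaryPrimitives.ReadInt32LittleEndian(input);
        int remaining = input.Length - sizeof(int);

        // Reject the payload before allocating: the declared length can never
        // legitimately be negative or exceed the bytes that were received.
        if (declared < 0 || declared > remaining)
        {
            throw new InvalidOperationException(
                $"Declared length {declared} is invalid for the {remaining} bytes remaining.");
        }

        return input.Slice(sizeof(int), declared).ToArray();
    }
}
```

With this check in place, a five-byte message that claims a payload of roughly 2 GB is rejected immediately instead of triggering a large allocation.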

The serializer is also much faster than the existing serializer, making use of hardware intrinsics where available, and avoiding unnecessary allocations.

RPC

Serialization and RPC are closely related. A large component of RPC is the serialization of method calls for later deserialization and invocation. This PR also includes new RPC primitives, which replace both InvokeMethodRequest and generated IGrainMethodInvoker implementations with generated IInvokable classes (two become one). There is one generated IInvokable class for each method on each grain interface. Generated GrainReference classes change slightly to make use of the new generated IInvokable classes.
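As a rough sketch of the "two become one" idea, a single per-method class can carry both the call's arguments (previously the request's job) and the dispatch logic (previously the invoker's job). The shapes below are hypothetical and hand-written: `IInvokableSketch`, `IGreeter`, `Greeter`, and `SayHelloInvokable` are invented for illustration and are not the IInvokable interface or the generated code from this PR.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the generated-invokable idea; NOT Orleans' IInvokable.
public interface IInvokableSketch
{
    Task<object> Invoke(object target);
}

// Hypothetical interface and implementation used only for this example.
public interface IGreeter
{
    Task<string> SayHello(string name);
}

public sealed class Greeter : IGreeter
{
    public Task<string> SayHello(string name) => Task.FromResult($"Hello, {name}!");
}

// One class per interface method. The argument is a strongly typed field,
// so it needs no object[] boxing, and the dispatch is a direct method call.
public sealed class SayHelloInvokable : IInvokableSketch
{
    public string Name = "";

    public async Task<object> Invoke(object target)
    {
        return await ((IGreeter)target).SayHello(Name);
    }
}
```

Because the argument types are known statically per method, a serializer for such a class can, in principle, omit type information for each argument, which fits the efficiency goal described for the RPC layer.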

The RPC supports efficient serialization of method arguments, avoiding most or all unnecessary allocations and avoiding unnecessary type information.

As a part of a concerted effort to reduce unnecessary overheads, the RPC is customizable enough to allow us to implement Orleans.Transactions entirely outside of the core libraries, and this PR also includes that change. This means that non-transactional calls avoid checking for transaction contexts or requirements. It hopefully also means that we can support more experimentation in the future without requiring forking the codebase.


TODOs for this PR:

  • Implement renaming tolerance for RPC invokable types

Areas for future investigation/improvements:

  • Move existing SerializationManager and related types to a new package (Microsoft.Orleans.Serialization.Legacy)
  • Analyzer: allow customization of the serializer attributes through build time properties, just as the code generator does
  • Add in-place copying optimization for RPC types, supporting [Immutable] on parameters and return values (via [return: Immutable])
  • Explore emitting a separate, diffable database of types included in an application, so that annotations can be removed from code. This may end up being too 'magic' to be worthwhile, but we won't know for sure unless we explore it.

Fixes #2653
Fixes #5021

@ReubenBond ReubenBond force-pushed the feature/version-tolerant-serializer branch 5 times, most recently from 43a6533 to 659650c Compare May 10, 2021 20:10
@ReubenBond ReubenBond changed the title [WIP] Add Orleans.Serialization library as a high-fidelity, version-tolerant serializer Add Orleans.Serialization library as a high-fidelity, version-tolerant serializer May 10, 2021
@ReubenBond ReubenBond marked this pull request as ready for review May 10, 2021 20:57
@ReubenBond ReubenBond force-pushed the feature/version-tolerant-serializer branch 3 times, most recently from d6f5e45 to edfe0b9 Compare May 11, 2021 04:56
@ReubenBond ReubenBond added this to the 4.0.0 milestone May 11, 2021
@ReubenBond ReubenBond force-pushed the feature/version-tolerant-serializer branch from 063942f to fb11df4 Compare May 11, 2021 18:24

haefele commented May 11, 2021

Hey, I know I'm not a contributor to this project, I just saw your tweet and thought: Gotta check this thing out!

So I noticed that you still have some Hagar leftovers in your filenames.

@ReubenBond
Member Author

Thanks, @haefele, I'll fix them

@galvesribeiro galvesribeiro left a comment

LGTM! Awesome! Looking forward to using it :shipit:

@@ -85,7 +85,7 @@ public async Task<IList<IBatchContainer>> GetQueueMessagesAsync(int maxCount)
IEnumerable<SQSMessage> messages = await task;

List<IBatchContainer> azureQueueMessages = messages

I think I made a wrong copy and paste here and kept the azureQueueMessages on this field o.O

@@ -31,24 +32,24 @@ public class AzureBlobGrainStorage : IGrainStorage, ILifecycleParticipant<ISiloL
 private ILogger logger;
 private readonly string name;
 private AzureBlobStorageOptions options;
-private SerializationManager serializationManager;
+private Serializer serializationManager;

I guess we should rename those fields to avoid confusion with the old SerializationManager

@ReubenBond ReubenBond May 12, 2021

True, I'll run a find & replace. I also want to migrate to the more common .NET naming scheme and prefix all field names with underscores, but in a later PR. Now that we have version-tolerant serialization, we know that renaming fields won't break anything.

Yeah, that was a major drama recently...

@ReubenBond ReubenBond force-pushed the feature/version-tolerant-serializer branch from feb11e8 to 29128a7 Compare June 16, 2021 20:29
@benjaminpetit benjaminpetit merged commit 5356f64 into dotnet:main Jun 17, 2021