First step of the replacement of Roslyn based ahead of time code generator with Mono.Cecil based post compiled time assembly modifier. #846

Closed

@pCYSl5EDgo pCYSl5EDgo commented Mar 18, 2020

Summary

This pull request consists of three commits.

  • src/UnityClient change
    • Installer for the Mono.Cecil Unity Package.
    • MSPack unitypackage files.
  • LICENSE update
  • README.md update
    • Add description of mspc dotnet global tool.

Related Issues

Background

mpc is a Roslyn-based ahead-of-time code generator.
mpc analyzes C# source code and generates a formatter resolver and formatters.
mpc has some limitations and lacks some features, such as private/internal member access.

mspc is a Mono.Cecil-based post-compile-time assembly modifier.
mspc analyzes compiled managed assemblies.
mspc modifies the formatter resolver and generates formatters as nested types of the formatter resolver.
mspc emits IL and applies many optimizations that cannot be expressed in C#.

What is changed in src/UnityClient

The MSPack Unity editor extension hooks into the build process after C# compilation and modifies the dlls.

The MSPack Unity editor extension is disabled by default. Click Window -> MessagePack -> MSPackCodeModifier to enable it.

While building the player, the MSPack Unity editor extension searches Assembly-CSharp.dll for a formatter resolver and modifies the first matching type it finds.

Pros

Allow private/internal field/property serialization/deserialization.

mspc checks whether the target input dll contains a valid IgnoresAccessChecksToAttribute definition.
If there is no definition, mspc defines it in the target assembly.
AllowPrivateFormatters can then be generated by mspc.
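
For readers unfamiliar with the attribute, here is a minimal sketch of the shape such a definition usually takes when a tool declares it inside the target assembly. The exact definition that mspc emits is an assumption here and is not taken from this pull request; the assembly name in the usage comment is only illustrative.

```csharp
using System;

namespace System.Runtime.CompilerServices
{
    // Sketch only: the runtime matches this attribute by its full name,
    // so the assembly that wants the access has to carry a definition.
    // Hypothetical usage at assembly level:
    //   [assembly: IgnoresAccessChecksTo("Assembly-CSharp")]
    [AttributeUsage(AttributeTargets.Assembly, AllowMultiple = true)]
    public sealed class IgnoresAccessChecksToAttribute : Attribute
    {
        public IgnoresAccessChecksToAttribute(string assemblyName)
        {
            AssemblyName = assemblyName;
        }

        // Name of the assembly whose non-public members may be accessed.
        public string AssemblyName { get; }
    }
}
```

Declaring the attribute is not enough for hand-written C#: the stock C# compiler still rejects access to private members, which is why the formatters that rely on it have to be emitted as IL.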

Decrease memory allocation of String-Key type formatter

Here is the target type.

[MessagePackObject]
public class StringKeyTestType
{
    [Key("KeyA")]
    public int A { get; set; }
    [Key("KeyB")]
    public string B { get; set; }
}
`mpc` currently generates the following code.
namespace MessagePack.Formatters
{
    using System;
    using System.Buffers;
    using MessagePack;

    public sealed class StringKeyTestTypeFormatter : global::MessagePack.Formatters.IMessagePackFormatter<global::StringKeyTestType>
    {


        private readonly global::MessagePack.Internal.AutomataDictionary ____keyMapping;
        private readonly byte[][] ____stringByteKeys;

        public StringKeyTestTypeFormatter()
        {
            this.____keyMapping = new global::MessagePack.Internal.AutomataDictionary()
            {
                { "KeyA", 0 },
                { "KeyB", 1 },
            };

            this.____stringByteKeys = new byte[][]
            {
                global::MessagePack.Internal.CodeGenHelpers.GetEncodedStringBytes("KeyA"),
                global::MessagePack.Internal.CodeGenHelpers.GetEncodedStringBytes("KeyB"),
            };
        }

        public void Serialize(ref MessagePackWriter writer, global::StringKeyTestType value, global::MessagePack.MessagePackSerializerOptions options)
        {
            if (value == null)
            {
                writer.WriteNil();
                return;
            }

            IFormatterResolver formatterResolver = options.Resolver;
            writer.WriteMapHeader(2);
            writer.WriteRaw(this.____stringByteKeys[0]);
            writer.Write(value.A);
            writer.WriteRaw(this.____stringByteKeys[1]);
            formatterResolver.GetFormatterWithVerify<string>().Serialize(ref writer, value.B, options);
        }

        public global::StringKeyTestType Deserialize(ref MessagePackReader reader, global::MessagePack.MessagePackSerializerOptions options)
        {
            if (reader.TryReadNil())
            {
                return null;
            }

            options.Security.DepthStep(ref reader);
            IFormatterResolver formatterResolver = options.Resolver;
            var length = reader.ReadMapHeader();
            var __A__ = default(int);
            var __B__ = default(string);

            for (int i = 0; i < length; i++)
            {
                ReadOnlySpan<byte> stringKey = global::MessagePack.Internal.CodeGenHelpers.ReadStringSpan(ref reader);
                int key;
                if (!this.____keyMapping.TryGetValue(stringKey, out key))
                {
                    reader.Skip();
                    continue;
                }

                switch (key)
                {
                    case 0:
                        __A__ = reader.ReadInt32();
                        break;
                    case 1:
                        __B__ = formatterResolver.GetFormatterWithVerify<string>().Deserialize(ref reader, options);
                        break;
                    default:
                        reader.Skip();
                        break;
                }
            }

            var ____result = new global::StringKeyTestType();
            ____result.A = __A__;
            ____result.B = __B__;
            reader.Depth--;
            return ____result;
        }
    }
}
`mspc` generates the following classes as nested types of the formatter resolver.
public static class AutomataDeserializeHelper
{
    public unsafe static int GetIndex_0000_0002(ReadOnlySpan<byte> span)
    {
        ReadOnlySpan<byte> readOnlySpan = span;
        if (readOnlySpan.Length == 4 && readOnlySpan[0] + (*(ushort*)(&readOnlySpan[1]) << 8) == 7955787)
        {
            if (readOnlySpan[3] == 65)
            {
                return 0;
            }
            if (readOnlySpan[3] == 66)
            {
                return 1;
            }
        }
        return -1;
    }
}

public sealed class SCFormatter0_StringKeyTestType : IMessagePackFormatter<StringKeyTestType>, IMessagePackFormatter
{
    public void Serialize(ref MessagePackWriter writer, StringKeyTestType value, MessagePackSerializerOptions options)
    {
        if (value == null)
        {
            writer.WriteNil();
            return;
        }
        writer.WriteMapHeader(2);
        writer.WriteRaw(new byte[5]
        {
            164,
            75,
            101,
            121,
            65
        });
        writer.Write(value.A);
        writer.WriteRaw(new byte[5]
        {
            164,
            75,
            101,
            121,
            66
        });
        options.Resolver.GetFormatterWithVerify<string>().Serialize(ref writer, value.B, options);
    }

    public StringKeyTestType Deserialize(ref MessagePackReader reader, MessagePackSerializerOptions options)
    {
        if (reader.TryReadNil())
        {
            return null;
        }
        options.Security.DepthStep(ref reader);
        int num = reader.ReadMapHeader();
        IFormatterResolver resolver = options.Resolver;
        StringKeyTestType stringKeyTestType = new StringKeyTestType();
        int i = default(int);
        for (; i < num; i++)
        {
            switch (AutomataDeserializeHelper.GetIndex_0000_0002(CodeGenHelpers.ReadStringSpan(ref reader)))
            {
            default:
                reader.Skip();
                break;
            case 0:
                stringKeyTestType.A = reader.ReadInt32();
                break;
            case 1:
                stringKeyTestType.B = resolver.GetFormatterWithVerify<string>().Deserialize(ref reader, options);
                break;
            }
        }
        reader.Depth--;
        return stringKeyTestType;
    }
}
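
The constant 7955787 in `GetIndex_0000_0002` is the first three key bytes `K`, `e`, `y` (0x4B, 0x65, 0x79) packed little-endian into one integer, 0x79654B, so a single comparison covers the shared prefix "Key" before the fourth byte selects the member. The following safe-code sketch reproduces that lookup without pointers; the class and method names are hypothetical and are not part of mspc.

```csharp
using System;
using System.Text;

public static class KeyLookupSketch
{
    // Packs the first three bytes as b0 | b1 << 8 | b2 << 16.
    // Unlike the unsafe ushort read above, this does not depend on CPU endianness.
    public static int PackPrefix(ReadOnlySpan<byte> span)
    {
        return span[0] | (span[1] << 8) | (span[2] << 16);
    }

    // Returns 0 for "KeyA", 1 for "KeyB", and -1 for any other key.
    public static int GetIndex(ReadOnlySpan<byte> span)
    {
        if (span.Length == 4 && PackPrefix(span) == 7955787) // "Key"
        {
            if (span[3] == (byte)'A') return 0;
            if (span[3] == (byte)'B') return 1;
        }
        return -1;
    }
}
```

For example, `KeyLookupSketch.GetIndex(Encoding.UTF8.GetBytes("KeyB"))` returns 1, and any key whose length or prefix differs falls through to -1, which the formatter handles with `reader.Skip()`.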

The serialize method generated by mspc appears to allocate byte arrays, but it does not actually allocate any new heap memory.

The emitted IL shows the actual behaviour.
.method public final hidebysig newslot virtual 
	instance void Serialize (
		valuetype [MessagePack]MessagePack.MessagePackWriter& writer,
		class StringKeyTestType 'value',
		class [MessagePack]MessagePack.MessagePackSerializerOptions options
	) cil managed 
{
	// Method begins at RVA 0x20f8
	// Code size 90 (0x5a)
	.maxstack 5

	IL_0000: ldarg.2
	IL_0001: brtrue.s IL_000a

	IL_0003: ldarg.1
	IL_0004: call instance void [MessagePack]MessagePack.MessagePackWriter::WriteNil()
	IL_0009: ret

	IL_000a: ldarg.1
	IL_000b: ldc.i4.2
	IL_000c: call instance void [MessagePack]MessagePack.MessagePackWriter::WriteMapHeader(int32)
	IL_0011: ldarg.1
	IL_0012: ldsflda valuetype '<PrivateImplementationDetails>'/'__StaticArrayInitTypeSize=5' '<PrivateImplementationDetails>'::f0
	IL_0017: ldc.i4.5
	IL_0018: newobj instance void valuetype [System.Runtime]System.ReadOnlySpan`1<uint8>::.ctor(void*, int32)
	IL_001d: call instance void [MessagePack]MessagePack.MessagePackWriter::WriteRaw(valuetype [System.Runtime]System.ReadOnlySpan`1<uint8>)
	IL_0022: ldarg.1
	IL_0023: ldarg.2
	IL_0024: ldfld int32 StringKeyTestType::'<A>k__BackingField'
	IL_0029: call instance void [MessagePack]MessagePack.MessagePackWriter::Write(int32)
	IL_002e: ldarg.1
	IL_002f: ldsflda valuetype '<PrivateImplementationDetails>'/'__StaticArrayInitTypeSize=5' '<PrivateImplementationDetails>'::f1
	IL_0034: ldc.i4.5
	IL_0035: newobj instance void valuetype [System.Runtime]System.ReadOnlySpan`1<uint8>::.ctor(void*, int32)
	IL_003a: call instance void [MessagePack]MessagePack.MessagePackWriter::WriteRaw(valuetype [System.Runtime]System.ReadOnlySpan`1<uint8>)
	IL_003f: ldarg.3
	IL_0040: callvirt instance class [MessagePack]MessagePack.IFormatterResolver [MessagePack]MessagePack.MessagePackSerializerOptions::get_Resolver()
	IL_0045: dup
	IL_0046: call class [MessagePack]MessagePack.Formatters.IMessagePackFormatter`1<!!0> [MessagePack]MessagePack.FormatterResolverExtensions::GetFormatterWithVerify<string>(class [MessagePack]MessagePack.IFormatterResolver)
	IL_004b: ldarg.1
	IL_004c: ldarg.2
	IL_004d: ldfld string StringKeyTestType::'<B>k__BackingField'
	IL_0052: ldarg.3
	IL_0053: callvirt instance void class [MessagePack]MessagePack.Formatters.IMessagePackFormatter`1<string>::Serialize(valuetype [MessagePack]MessagePack.MessagePackWriter&, !0, class [MessagePack]MessagePack.MessagePackSerializerOptions)
	IL_0058: pop
	IL_0059: ret
} // end of method SCFormatter0_StringKeyTestType::Serialize
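
The five-byte blobs that the IL loads with `ldsflda` are the pre-encoded MessagePack keys: one fixstr header byte (0xA0 combined with the UTF-8 length) followed by the UTF-8 bytes of the key. The sketch below recomputes them at run time purely to show where the values 164, 75, 101, 121, 65 come from. The helper name is hypothetical; mspc embeds the bytes as static data instead of computing them.

```csharp
using System;
using System.Text;

public static class FixStrSketch
{
    // Encodes a short string key as a MessagePack fixstr:
    // header byte 0xA0 | length, then the UTF-8 payload.
    public static byte[] EncodeFixStr(string key)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(key);
        if (utf8.Length > 31)
        {
            throw new ArgumentException("fixstr holds at most 31 bytes", nameof(key));
        }

        var result = new byte[utf8.Length + 1];
        result[0] = (byte)(0xA0 | utf8.Length);
        utf8.CopyTo(result, 1);
        return result;
    }
}
```

`BitConverter.ToString(FixStrSketch.EncodeFixStr("KeyA"))` gives `A4-4B-65-79-41`, which is 164, 75, 101, 121, 65 in decimal and matches the first blob in the decompiled formatter.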

Serialize/Deserialize backing field of auto-property directly

mspc reads the IL instructions of the getter and setter methods.
If they match the instruction pattern of an auto-property, mspc generates a formatter that serializes/deserializes the backing field directly.
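
As a rough illustration of what "backing field of an auto-property" means, the sketch below locates the compiler-generated field by reflection. This is a swapped-in technique: mspc inspects the accessor IL with Mono.Cecil, whereas this sketch only relies on the `<Name>k__BackingField` naming seen in the IL listing above and on `CompilerGeneratedAttribute`. All type and method names here are hypothetical.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

public class BackingFieldSample
{
    private int manual;

    // Auto-property: the compiler generates a hidden backing field.
    public int Auto { get; set; }

    // Hand-written property: no compiler-generated backing field exists.
    public int Manual { get { return manual; } set { manual = value; } }
}

public static class BackingFieldSketch
{
    // Returns the compiler-generated backing field of an auto-property,
    // or null when the property is not an auto-property.
    public static FieldInfo FindAutoPropertyBackingField(Type type, string propertyName)
    {
        FieldInfo field = type.GetField(
            "<" + propertyName + ">k__BackingField",
            BindingFlags.Instance | BindingFlags.NonPublic);

        if (field == null || !field.IsDefined(typeof(CompilerGeneratedAttribute), false))
        {
            return null;
        }
        return field;
    }
}
```

Reading or writing through the returned `FieldInfo` bypasses the accessor methods, which is the effect the generated formatter achieves with `ldfld` and `stfld`, without the reflection overhead.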

Serialize/Deserialize fixed size buffer of unsafe struct

Here is the sample target type.

using MessagePack;

[MessagePackObject]
public unsafe struct FixedSizeBufferTestType
{
    [Key(0)]
    public fixed char Text[12];
}
Here is the disassembled formatter C# code generated by `mspc`.
// MessagePack.Resolvers.GeneratedResolver.ISFormatter0_FixedSizeBufferTestType
using MessagePack.Formatters;
using System;
using System.Runtime.CompilerServices;

public sealed class ISFormatter0_FixedSizeBufferTestType : IMessagePackFormatter<FixedSizeBufferTestType>, IMessagePackFormatter
{
	public unsafe void Serialize(ref MessagePackWriter writer, FixedSizeBufferTestType value, MessagePackSerializerOptions options)
	{
		writer.WriteArrayHeader(1);
		writer.WriteArrayHeader(12);
		IntPtr intPtr;
		writer.Write((char)(*(ushort*)(long)(intPtr = (IntPtr)(&value.Text))));
		writer.Write((char)(*(ushort*)(long)(intPtr = (IntPtr)(void*)((long)intPtr + 2L))));
		writer.Write((char)(*(ushort*)(long)(intPtr = (IntPtr)(void*)((long)intPtr + 2L))));
		writer.Write((char)(*(ushort*)(long)(intPtr = (IntPtr)(void*)((long)intPtr + 2L))));
		writer.Write((char)(*(ushort*)(long)(intPtr = (IntPtr)(void*)((long)intPtr + 2L))));
		writer.Write((char)(*(ushort*)(long)(intPtr = (IntPtr)(void*)((long)intPtr + 2L))));
		writer.Write((char)(*(ushort*)(long)(intPtr = (IntPtr)(void*)((long)intPtr + 2L))));
		writer.Write((char)(*(ushort*)(long)(intPtr = (IntPtr)(void*)((long)intPtr + 2L))));
		writer.Write((char)(*(ushort*)(long)(intPtr = (IntPtr)(void*)((long)intPtr + 2L))));
		writer.Write((char)(*(ushort*)(long)(intPtr = (IntPtr)(void*)((long)intPtr + 2L))));
		writer.Write((char)(*(ushort*)(long)(intPtr = (IntPtr)(void*)((long)intPtr + 2L))));
		writer.Write((char)(*(ushort*)((long)intPtr + 2L)));
	}

	public unsafe FixedSizeBufferTestType Deserialize(ref MessagePackReader reader, MessagePackSerializerOptions options)
	{
		if (reader.TryReadNil())
		{
			throw new InvalidOperationException("typecode is null, struct not supported");
		}
		options.Security.DepthStep(ref reader);
		int num = reader.ReadArrayHeader();
		IFormatterResolver resolver = options.Resolver;
		int i = default(int);
		FixedSizeBufferTestType result = default(FixedSizeBufferTestType);
		for (; i < num; i++)
		{
			switch (i)
			{
			default:
				reader.Skip();
				break;
			case 0:
			{
				if (reader.ReadArrayHeader() != 12)
				{
					throw new InvalidOperationException("Fixed size buffer field should have 12 element(s). field : FixedSizeBufferTestType/<Text>e__FixedBuffer FixedSizeBufferTestType::Text");
				}
				ref FixedSizeBufferTestType.<Text>e__FixedBuffer text = ref result.Text;
				*(char*)(&text) = reader.ReadChar();
				*(char*)(&Unsafe.AddByteOffset(ref text, 2)) = reader.ReadChar();
				*(char*)(&Unsafe.AddByteOffset(ref text, 2)) = reader.ReadChar();
				*(char*)(&Unsafe.AddByteOffset(ref text, 2)) = reader.ReadChar();
				*(char*)(&Unsafe.AddByteOffset(ref text, 2)) = reader.ReadChar();
				*(char*)(&Unsafe.AddByteOffset(ref text, 2)) = reader.ReadChar();
				*(char*)(&Unsafe.AddByteOffset(ref text, 2)) = reader.ReadChar();
				*(char*)(&Unsafe.AddByteOffset(ref text, 2)) = reader.ReadChar();
				*(char*)(&Unsafe.AddByteOffset(ref text, 2)) = reader.ReadChar();
				*(char*)(&Unsafe.AddByteOffset(ref text, 2)) = reader.ReadChar();
				*(char*)(&Unsafe.AddByteOffset(ref text, 2)) = reader.ReadChar();
				*(char*)(&Unsafe.AddByteOffset(ref text, 2)) = reader.ReadChar();
				break;
			}
			}
		}
		reader.Depth--;
		return result;
	}
}

Generics Support

Here is the sample target type.

using MessagePack;
using System;

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct, Inherited = false, AllowMultiple = true)]
public sealed class MessagePackObjectGenericVariationAttribute : Attribute
{
    public uint[] NotPrimitiveValueTypeIndices { get; set; }

    public MessagePackObjectGenericVariationAttribute(Type serializeTargetType) { }
}

[MessagePackObject]
[MessagePackObjectGenericVariation(typeof(GenericTestType<int, string>))]
[MessagePackObjectGenericVariation(typeof(GenericTestType<char, string>))]
[MessagePackObjectGenericVariation(typeof(GenericTestType<char, System.Delegate>))]
public class GenericTestType<T0, T1>
    where T0 : struct, IEquatable<T0>
    where T1 : class
{
    [Key(0)]
    public T0 Value { get; set; }

    [Key(1)]
    public T1 Another { get; set; }
}

mpc cannot generate appropriate code for this target type because mpc requires closed generic types.

Here is the disassembled formatter C# code generated by `mspc`.
// MessagePack.Resolvers.GeneratedResolver.IGCFormatter0_GenericTestType<T0Emulate,T1Emulate>
using MessagePack.Formatters;
using System;

public sealed class IGCFormatter0_GenericTestType<T0Emulate, T1Emulate> : IMessagePackFormatter<GenericTestType<T0Emulate, T1Emulate>>, IMessagePackFormatter where T0Emulate : struct, IEquatable<T0Emulate> where T1Emulate : class
{
	public void Serialize(ref MessagePackWriter writer, GenericTestType<T0Emulate, T1Emulate> value, MessagePackSerializerOptions options)
	{
		if (value == null)
		{
			writer.WriteNil();
			return;
		}
		writer.WriteArrayHeader(2);
		IFormatterResolver resolver = options.Resolver;
		resolver.GetFormatterWithVerify<T0Emulate>().Serialize(ref writer, value.Value, options);
		resolver.GetFormatterWithVerify<T1Emulate>().Serialize(ref writer, value.Another, options);
	}

	public GenericTestType<T0Emulate, T1Emulate> Deserialize(ref MessagePackReader reader, MessagePackSerializerOptions options)
	{
		if (reader.TryReadNil())
		{
			return null;
		}
		options.Security.DepthStep(ref reader);
		int num = reader.ReadArrayHeader();
		IFormatterResolver resolver = options.Resolver;
		GenericTestType<T0Emulate, T1Emulate> genericTestType = new GenericTestType<T0Emulate, T1Emulate>();
		for (int i = 0; i < num; i++)
		{
			switch (i)
			{
			default:
				reader.Skip();
				break;
			case 0:
				genericTestType.Value = resolver.GetFormatterWithVerify<T0Emulate>().Deserialize(ref reader, options);
				break;
			case 1:
				genericTestType.Another = resolver.GetFormatterWithVerify<T1Emulate>().Deserialize(ref reader, options);
				break;
			}
		}
		reader.Depth--;
		return genericTestType;
	}
}

If you want the dotnet global tool, use MSPack.Processor.CLI.
If you want the core source code, see MSPack.Processor.Core.

AArnott commented Mar 18, 2020

I'm against any solution based on IL-rewriting. Across many years of .NET development, I've never seen an IL-rewriting tool that didn't corrupt the .dll in some way in some circumstances and I don't want to get stuck maintaining use of such a tool.

I haven't reviewed the diff, @pCYSl5EDgo. Your PR description suggests several benefits and I'm wondering if/why switching to an IL-rewriter is necessary to get some of those benefits, or if we can still get them through mpc.


pCYSl5EDgo commented Mar 18, 2020

The most successful IL-rewriter tool I've ever seen is the Burst Compiler. It imposes very strict rules on programmers so that it does not emit invalid assemblies.

A few important benefits are unavailable without an IL-rewriter tool (MSPack).

  • No allocation cost for string-key type serialization.
    • Only IL can use the .data section of the PE file without any managed heap allocation. It directly loads the address of a field of the internal <PrivateImplementationDetails> type.
    • The C# compiler embeds raw byte arrays in the .data section, but C# code cannot refer to the name "<PrivateImplementationDetails>".
    • How C# embeds binary data.
  • Using IgnoresAccessChecksToAttribute and private/internal access.
    • A specialized C# compiler is needed to use IgnoresAccessChecksToAttribute in C# source code.
    • Mono.Cecil can insert IgnoresAccessChecksToAttribute into an assembly, just as AssemblyBuilder can.
    • I added this fact to the previous comment as one of the pros.

The other benefits could be provided by a future mpc through Roslyn C# code generation.

mspc only overwrites the specific GetFormatter<T> placeholder method of the type that implements IFormatterResolver. mspc does not delete any other part of the assembly.
The risk of corrupting the assembly can be lowered with appropriate tests.

@AArnott AArnott left a comment

OK, well it seems interesting and I like how it removes the need to periodically generate source code. I don't like IL rewriting because whenever the compiler comes up with something new, IL rewriters tend to break (or remove the new thing from the rewritten assembly without warning).
But it appears this basically adds documentation and a hook for Unity. It doesn't remove mpc.
@neuecc is the mastermind behind mpc and unity support here, so he needs to approve too.

README.md (outdated)

- Cannot serialize/deserialize generics types.
- Cannot serialize/deserialize fixed size buffers.
- Cannot serialize/deserialize private/internal members.

Does Unity even support IgnoresAccessChecksToAttribute?

@AArnott AArnott requested a review from neuecc March 28, 2020 17:20

neuecc commented Mar 29, 2020

I think your mspc is not meant to replace mpc, but it could be offered as another option.
It doesn't need to be integrated into this repository; it would be nice to offer it as a separate tool.

I'm looking forward to the code generation extensions in C# 9.0(?) ( dotnet/csharplang#107 ).
Our response to that would build on the Roslyn code analysis in the current mpc.
Including IL-based analysis code in this repository would bloat maintenance costs.


pCYSl5EDgo commented Mar 30, 2020

bloated maintenance costs

I have no doubt that the maintenance costs would be bloated. I am closing this pull request.

Thank you for your reviews and comments.

@pCYSl5EDgo pCYSl5EDgo closed this Mar 30, 2020
@pCYSl5EDgo pCYSl5EDgo deleted the aot-mspack-v0.5.1 branch October 27, 2020 11:31