Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mpc: More aggressive ahead of time optimization for string-key type formatter. #861

Merged
merged 32 commits into from
Sep 27, 2020

Conversation

pCYSl5EDgo
Copy link
Contributor

Summary

Current mpc generated code of string-key formatter unnecessarily allocates byte[] array and others.
This pull request reduces memory allocation, improves runtime initialization performance and embeds specialized automaton code for deserialization.

Current problems

Sample target string-key type code.
using MessagePack;

[MessagePackObject(true)]
public class TestType
{
    public int Value { get; set; }
    public int Value0 { get; set; }
    public int Value1 { get; set; }
    public int Value2 { get; set; }

    public int EightSensesFifth0 { get; set; }
    public int EightSensesFifth1 { get; set; }
    public int EightSensesFifth2 { get; set; }
    public int EightSensesFifth3 { get; set; }
    public int EightSensesFifth4 { get; set; }

    public string _0123456_0123456 { get; set; }
}
Current generated formmater code.
namespace MessagePack.Formatters
{
    using System;
    using System.Buffers;
    using MessagePack;

    public sealed class TestTypeFormatter : global::MessagePack.Formatters.IMessagePackFormatter<global::TestType>
    {


        private readonly global::MessagePack.Internal.AutomataDictionary ____keyMapping;
        private readonly byte[][] ____stringByteKeys;

        public TestTypeFormatter()
        {
            this.____keyMapping = new global::MessagePack.Internal.AutomataDictionary()
            {
                { "Value", 0 },
                { "Value0", 1 },
                { "Value1", 2 },
                { "Value2", 3 },
                { "EightSensesFifth0", 4 },
                { "EightSensesFifth1", 5 },
                { "EightSensesFifth2", 6 },
                { "EightSensesFifth3", 7 },
                { "EightSensesFifth4", 8 },
                { "_0123456_0123456", 9 },
            };

            this.____stringByteKeys = new byte[][]
            {
                global::MessagePack.Internal.CodeGenHelpers.GetEncodedStringBytes("Value"),
                global::MessagePack.Internal.CodeGenHelpers.GetEncodedStringBytes("Value0"),
                global::MessagePack.Internal.CodeGenHelpers.GetEncodedStringBytes("Value1"),
                global::MessagePack.Internal.CodeGenHelpers.GetEncodedStringBytes("Value2"),
                global::MessagePack.Internal.CodeGenHelpers.GetEncodedStringBytes("EightSensesFifth0"),
                global::MessagePack.Internal.CodeGenHelpers.GetEncodedStringBytes("EightSensesFifth1"),
                global::MessagePack.Internal.CodeGenHelpers.GetEncodedStringBytes("EightSensesFifth2"),
                global::MessagePack.Internal.CodeGenHelpers.GetEncodedStringBytes("EightSensesFifth3"),
                global::MessagePack.Internal.CodeGenHelpers.GetEncodedStringBytes("EightSensesFifth4"),
                global::MessagePack.Internal.CodeGenHelpers.GetEncodedStringBytes("_0123456_0123456"),
            };
        }

        public void Serialize(ref MessagePackWriter writer, global::TestType value, global::MessagePack.MessagePackSerializerOptions options)
        {
            if (value == null)
            {
                writer.WriteNil();
                return;
            }

            IFormatterResolver formatterResolver = options.Resolver;
            writer.WriteMapHeader(10);
            writer.WriteRaw(this.____stringByteKeys[0]);
            writer.Write(value.Value);
            writer.WriteRaw(this.____stringByteKeys[1]);
            writer.Write(value.Value0);
            writer.WriteRaw(this.____stringByteKeys[2]);
            writer.Write(value.Value1);
            writer.WriteRaw(this.____stringByteKeys[3]);
            writer.Write(value.Value2);
            writer.WriteRaw(this.____stringByteKeys[4]);
            writer.Write(value.EightSensesFifth0);
            writer.WriteRaw(this.____stringByteKeys[5]);
            writer.Write(value.EightSensesFifth1);
            writer.WriteRaw(this.____stringByteKeys[6]);
            writer.Write(value.EightSensesFifth2);
            writer.WriteRaw(this.____stringByteKeys[7]);
            writer.Write(value.EightSensesFifth3);
            writer.WriteRaw(this.____stringByteKeys[8]);
            writer.Write(value.EightSensesFifth4);
            writer.WriteRaw(this.____stringByteKeys[9]);
            formatterResolver.GetFormatterWithVerify<string>().Serialize(ref writer, value._0123456_0123456, options);
        }

        public global::TestType Deserialize(ref MessagePackReader reader, global::MessagePack.MessagePackSerializerOptions options)
        {
            if (reader.TryReadNil())
            {
                return null;
            }

            options.Security.DepthStep(ref reader);
            IFormatterResolver formatterResolver = options.Resolver;
            var length = reader.ReadMapHeader();
            var __Value__ = default(int);
            var __Value0__ = default(int);
            var __Value1__ = default(int);
            var __Value2__ = default(int);
            var __EightSensesFifth0__ = default(int);
            var __EightSensesFifth1__ = default(int);
            var __EightSensesFifth2__ = default(int);
            var __EightSensesFifth3__ = default(int);
            var __EightSensesFifth4__ = default(int);
            var ___0123456_0123456__ = default(string);

            for (int i = 0; i < length; i++)
            {
                ReadOnlySpan<byte> stringKey = global::MessagePack.Internal.CodeGenHelpers.ReadStringSpan(ref reader);
                int key;
                if (!this.____keyMapping.TryGetValue(stringKey, out key))
                {
                    reader.Skip();
                    continue;
                }

                switch (key)
                {
                    case 0:
                        __Value__ = reader.ReadInt32();
                        break;
                    case 1:
                        __Value0__ = reader.ReadInt32();
                        break;
                    case 2:
                        __Value1__ = reader.ReadInt32();
                        break;
                    case 3:
                        __Value2__ = reader.ReadInt32();
                        break;
                    case 4:
                        __EightSensesFifth0__ = reader.ReadInt32();
                        break;
                    case 5:
                        __EightSensesFifth1__ = reader.ReadInt32();
                        break;
                    case 6:
                        __EightSensesFifth2__ = reader.ReadInt32();
                        break;
                    case 7:
                        __EightSensesFifth3__ = reader.ReadInt32();
                        break;
                    case 8:
                        __EightSensesFifth4__ = reader.ReadInt32();
                        break;
                    case 9:
                        ___0123456_0123456__ = formatterResolver.GetFormatterWithVerify<string>().Deserialize(ref reader, options);
                        break;
                    default:
                        reader.Skip();
                        break;
                }
            }

            var ____result = new global::TestType();
            ____result.Value = __Value__;
            ____result.Value0 = __Value0__;
            ____result.Value1 = __Value1__;
            ____result.Value2 = __Value2__;
            ____result.EightSensesFifth0 = __EightSensesFifth0__;
            ____result.EightSensesFifth1 = __EightSensesFifth1__;
            ____result.EightSensesFifth2 = __EightSensesFifth2__;
            ____result.EightSensesFifth3 = __EightSensesFifth3__;
            ____result.EightSensesFifth4 = __EightSensesFifth4__;
            ____result._0123456_0123456 = ___0123456_0123456__;
            reader.Depth--;
            return ____result;
        }
    }
}

____stringByteKeys

Current generated TestTypeFormatter has a byte[][] field ____stringByteKeys. ____stringByteKeys does not need to be byte[][]. Accessing ____stringByteKeys in Serialize method results in a performance penalty because of array boundary check.

No need for runtime call of global::MessagePack.Internal.CodeGenHelpers.GetEncodedStringBytes

Initialization of ____stringByteKeys uses unnecessary cpu power. String keys are known before runtime execution. Encoded string byte keys can be calculated by mpc and can be embedded as array initialization list.

____keyMapping

____keyMapping is a global::MessagePack.Internal.AutomataDictionary type field. global::MessagePack.Internal.AutomataDictionary allocates unnecessary arrays. AutomataDictionary builds a trie tree from given string-keys in constructor and performs an automaton search in Deserialize method.

Achieved Pros

Generated formatter code by this pull request.
namespace MessagePack.Formatters
{
    using System;
    using System.Buffers;
    using System.Runtime.InteropServices;
    using MessagePack;

    public sealed class TestTypeFormatter : global::MessagePack.Formatters.IMessagePackFormatter<global::TestType>
    {
        private static byte[] ____stringByteKeys0;
        private static byte[] ____stringByteKeys1;
        private static byte[] ____stringByteKeys2;
        private static byte[] ____stringByteKeys3;
        private static byte[] ____stringByteKeys4;
        private static byte[] ____stringByteKeys5;
        private static byte[] ____stringByteKeys6;
        private static byte[] ____stringByteKeys7;
        private static byte[] ____stringByteKeys8;
        private static byte[] ____stringByteKeys9;
        public TestTypeFormatter()
        {
            ____stringByteKeys0 = new byte[]{ 0xA5, 0x56, 0x61, 0x6C, 0x75, 0x65, };
            ____stringByteKeys1 = new byte[]{ 0xA6, 0x56, 0x61, 0x6C, 0x75, 0x65, 0x30, };
            ____stringByteKeys2 = new byte[]{ 0xA6, 0x56, 0x61, 0x6C, 0x75, 0x65, 0x31, };
            ____stringByteKeys3 = new byte[]{ 0xA6, 0x56, 0x61, 0x6C, 0x75, 0x65, 0x32, };
            ____stringByteKeys4 = new byte[]{ 0xB1, 0x45, 0x69, 0x67, 0x68, 0x74, 0x53, 0x65, 0x6E, 0x73, 0x65, 0x73, 0x46, 0x69, 0x66, 0x74, 0x68, 0x30, };
            ____stringByteKeys5 = new byte[]{ 0xB1, 0x45, 0x69, 0x67, 0x68, 0x74, 0x53, 0x65, 0x6E, 0x73, 0x65, 0x73, 0x46, 0x69, 0x66, 0x74, 0x68, 0x31, };
            ____stringByteKeys6 = new byte[]{ 0xB1, 0x45, 0x69, 0x67, 0x68, 0x74, 0x53, 0x65, 0x6E, 0x73, 0x65, 0x73, 0x46, 0x69, 0x66, 0x74, 0x68, 0x32, };
            ____stringByteKeys7 = new byte[]{ 0xB1, 0x45, 0x69, 0x67, 0x68, 0x74, 0x53, 0x65, 0x6E, 0x73, 0x65, 0x73, 0x46, 0x69, 0x66, 0x74, 0x68, 0x33, };
            ____stringByteKeys8 = new byte[]{ 0xB1, 0x45, 0x69, 0x67, 0x68, 0x74, 0x53, 0x65, 0x6E, 0x73, 0x65, 0x73, 0x46, 0x69, 0x66, 0x74, 0x68, 0x34, };
            ____stringByteKeys9 = new byte[]{ 0xB0, 0x5F, 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x5F, 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, };
        }

        public void Serialize(ref MessagePackWriter writer, global::TestType value, global::MessagePack.MessagePackSerializerOptions options)
        {
            if (value == null)
            {
                writer.WriteNil();
                return;
            }

            IFormatterResolver formatterResolver = options.Resolver;
            writer.WriteMapHeader(10);
            writer.WriteRaw(____stringByteKeys0);
            writer.Write(value.Value);
            writer.WriteRaw(____stringByteKeys1);
            writer.Write(value.Value0);
            writer.WriteRaw(____stringByteKeys2);
            writer.Write(value.Value1);
            writer.WriteRaw(____stringByteKeys3);
            writer.Write(value.Value2);
            writer.WriteRaw(____stringByteKeys4);
            writer.Write(value.EightSensesFifth0);
            writer.WriteRaw(____stringByteKeys5);
            writer.Write(value.EightSensesFifth1);
            writer.WriteRaw(____stringByteKeys6);
            writer.Write(value.EightSensesFifth2);
            writer.WriteRaw(____stringByteKeys7);
            writer.Write(value.EightSensesFifth3);
            writer.WriteRaw(____stringByteKeys8);
            writer.Write(value.EightSensesFifth4);
            writer.WriteRaw(____stringByteKeys9);
            formatterResolver.GetFormatterWithVerify<string>().Serialize(ref writer, value._0123456_0123456, options);
        }

        public global::TestType Deserialize(ref MessagePackReader reader, global::MessagePack.MessagePackSerializerOptions options)
        {
            if (reader.TryReadNil())
            {
                return null;
            }

            options.Security.DepthStep(ref reader);
            IFormatterResolver formatterResolver = options.Resolver;
            var __Value__ = default(int);
            var __Value0__ = default(int);
            var __Value1__ = default(int);
            var __Value2__ = default(int);
            var __EightSensesFifth0__ = default(int);
            var __EightSensesFifth1__ = default(int);
            var __EightSensesFifth2__ = default(int);
            var __EightSensesFifth3__ = default(int);
            var __EightSensesFifth4__ = default(int);
            var ___0123456_0123456__ = default(string);
            var isBigEndian = !global::System.BitConverter.IsLittleEndian;

            for (int i = 0, length = reader.ReadMapHeader(); i < length; i++)
            {
                var stringKey = global::MessagePack.Internal.CodeGenHelpers.ReadStringSpan(ref reader);
                switch(stringKey.Length)
                {
                    case 5:
                        {
                            ulong last = stringKey[4];
                            last <<= 8;
                            last |= stringKey[3];
                            last <<= 8;
                            last |= stringKey[2];
                            last <<= 8;
                            last |= stringKey[1];
                            last <<= 8;
                            last |= stringKey[0];
                            switch (last)
                            {
                                default: goto FAIL;
                                case 0x65756C6156UL:
                                    __Value__ = reader.ReadInt32();
                                    continue;
                            }
                        }
                    case 6:
                        {
                            ulong last = stringKey[5];
                            last <<= 8;
                            last |= stringKey[4];
                            last <<= 8;
                            last |= stringKey[3];
                            last <<= 8;
                            last |= stringKey[2];
                            last <<= 8;
                            last |= stringKey[1];
                            last <<= 8;
                            last |= stringKey[0];
                            switch (last)
                            {
                                default: goto FAIL;
                                case 0x3065756C6156UL:
                                    __Value0__ = reader.ReadInt32();
                                    continue;
                                case 0x3165756C6156UL:
                                    __Value1__ = reader.ReadInt32();
                                    continue;
                                case 0x3265756C6156UL:
                                    __Value2__ = reader.ReadInt32();
                                    continue;
                            }
                        }
                }

                ReadOnlySpan<ulong> ulongs = isBigEndian ? stackalloc ulong[stringKey.Length >> 3] : MemoryMarshal.Cast<byte, ulong>(stringKey);
                if(isBigEndian)
                {
                    for(var index = 0; index < ulongs.Length; index++)
                    {
                        var index8times = index << 3;
                        ref var number = ref global::System.Runtime.CompilerServices.Unsafe.AsRef(ulongs[index]);
                        number = stringKey[index8times + 7];
                        for (var numberIndex = index8times + 6; numberIndex >= index8times; numberIndex--)
                        {
                            number <<= 8;
                            number |= stringKey[numberIndex];
                        }
                    }
                }

                switch(stringKey.Length)
                {
                    default: goto FAIL;
                    case 16:
                        if (ulongs[0] != 0x363534333231305FUL ||
                            ulongs[1] != 0x363534333231305FUL) goto FAIL;

                        ___0123456_0123456__ = formatterResolver.GetFormatterWithVerify<string>().Deserialize(ref reader, options);
                        continue;
                    case 17:
                        if (ulongs[0] != 0x6E65537468676945UL) goto FAIL;
                        if (ulongs[1] != 0x6874666946736573UL) goto FAIL;
                        {
                            uint last = stringKey[16];
                            switch (last)
                            {
                                default: goto FAIL;
                                case 0x30U:
                                    __EightSensesFifth0__ = reader.ReadInt32();
                                    continue;
                                case 0x31U:
                                    __EightSensesFifth1__ = reader.ReadInt32();
                                    continue;
                                case 0x32U:
                                    __EightSensesFifth2__ = reader.ReadInt32();
                                    continue;
                                case 0x33U:
                                    __EightSensesFifth3__ = reader.ReadInt32();
                                    continue;
                                case 0x34U:
                                    __EightSensesFifth4__ = reader.ReadInt32();
                                    continue;
                            }
                        }
                }

            FAIL:
                reader.Skip();
            }

            var ____result = new global::TestType();
            ____result.Value = __Value__;
            ____result.Value0 = __Value0__;
            ____result.Value1 = __Value1__;
            ____result.Value2 = __Value2__;
            ____result.EightSensesFifth0 = __EightSensesFifth0__;
            ____result.EightSensesFifth1 = __EightSensesFifth1__;
            ____result.EightSensesFifth2 = __EightSensesFifth2__;
            ____result.EightSensesFifth3 = __EightSensesFifth3__;
            ____result.EightSensesFifth4 = __EightSensesFifth4__;
            ____result._0123456_0123456 = ___0123456_0123456__;
            reader.Depth--;
            return ____result;
        }
    }
}

Embedding raw byte array initialization list

____stringByteKeys0 = new byte[]{ 0xA5, 0x56, 0x61, 0x6C, 0x75, 0x65, }; is compiled by C# compiler and IL codes are like this link. System.Runtime.CompilerServices.RuntimeHelpers::InitializeArray is very performant method.

Inlined searching of Deserialize

Embedded inlined searching codes are now generated. Generated code's literal is Little-Endian. The performance penalty will appear when the runtime endian is Big-Endian.

Remove line IFormatterResolver formatterResolver = options.Resolver; when not needed.

IFormatterResolver formatterResolver = options.Resolver; is not needed when there are no field whose type is primitive or has a custom formatter. This pull request removes unnecessary line from generated codes.

FormatterTemplte : generates only int-key type formatter.
IFomatterTemplate : common method/property between FormatterTemplate and StringKeyFormatterTemplate.
CodeGenerator : objectFormatterTemplates is IFormatterTemplate[]
ShouldUseFormatterResolverHelper : Optimization of IFormatterResolver.
@AArnott
Copy link
Collaborator

AArnott commented Mar 28, 2020

Wow. I've no doubt this is an improvement. Thanks for contributing!
But I have two questions:

  1. How much faster is it really? And I mean for the typical use case not some contrived case that does nothing other than exercise the code that has been changed.
  2. Is the speed improvement worth the substantially decreased readability of the generated code? I was just recommending earlier this week that someone who needed customized serialization to use mpc to generate their own formatter/resolver and then change the output to meet their needs. They can do this today, but after this PR, I can't imagine anyone would want to use mpc output as a basis for customizing their formatter. If the perf improvement is substantial, and most people don't use mpc to customize serialization, then perhaps the loss in readability is worth it.

@neuecc, what do you think?

@neuecc
Copy link
Member

neuecc commented Mar 29, 2020

There are a number of improvements in there, but I think the basic point is that mpc also does the inlined constant embedding automata that we do during dynamic code generation.
When I first introduced to automaton, I omit inlining to mpc.
http://neue.cc/2017/08/28_558.html
However, I agree with this PR policy because I think it's better to be close.

I've also made custom formatters based on code generation by MPC and modified them.
That said, the current MPC-generated results are also by no means easy to fix, so you may not have to think about that.
However, replacing GetEncodedStringBytes with a new byte is a complete loss of name, so I think then __stringByteKeysX needs to keep the name information, like __EightSensesFifth0Key__.

Also, I'd like the code to be similar to the current dynamic code generation.
For example, using AutomataKeyGen.GetKey to use the same key, and so on.
This is for maintenance purposes.
I don't have the confidence to do the maintenance if it generates code that is clearly of a different strain.

When ULong value consists purely of ASCII characters, 8 ascii chars are printed as a comment.
@pCYSl5EDgo
Copy link
Contributor Author

pCYSl5EDgo commented Mar 29, 2020

However, replacing GetEncodedStringBytes with a new byte is a complete loss of name, so I think then __stringByteKeysX needs to keep the name information, like EightSensesFifth0Key.

I agree with this comment. I changed the naming rule from ____stringByteKey(Number) to ____stringByteKey_(Member.Name).

decreased readability

The main cause of readability decrease is the little-endian ulong keys embedded in switch case or if statements. I added comment lines for each of the ulong keys which contain only ascii compatible utf8 characters.

Pull Request Generated Formatter Code with 2 additional commits.
namespace MessagePack.Formatters
{
    using System;
    using System.Buffers;
    using System.Runtime.InteropServices;
    using MessagePack;

    public sealed class TestTypeFormatter : global::MessagePack.Formatters.IMessagePackFormatter<global::TestType>
    {
        private static byte[] ____stringByteKeys_Value;
        private static byte[] ____stringByteKeys_Value0;
        private static byte[] ____stringByteKeys_Value1;
        private static byte[] ____stringByteKeys_Value2;
        private static byte[] ____stringByteKeys_EightSensesFifth0;
        private static byte[] ____stringByteKeys_EightSensesFifth1;
        private static byte[] ____stringByteKeys_EightSensesFifth2;
        private static byte[] ____stringByteKeys_EightSensesFifth3;
        private static byte[] ____stringByteKeys_EightSensesFifth4;
        private static byte[] ____stringByteKeys__0123456_0123456;

        public TestTypeFormatter()
        {
            ____stringByteKeys_Value = new byte[]{ 0xA5, 0x56, 0x61, 0x6C, 0x75, 0x65, };
            ____stringByteKeys_Value0 = new byte[]{ 0xA6, 0x56, 0x61, 0x6C, 0x75, 0x65, 0x30, };
            ____stringByteKeys_Value1 = new byte[]{ 0xA6, 0x56, 0x61, 0x6C, 0x75, 0x65, 0x31, };
            ____stringByteKeys_Value2 = new byte[]{ 0xA6, 0x56, 0x61, 0x6C, 0x75, 0x65, 0x32, };
            ____stringByteKeys_EightSensesFifth0 = new byte[]{ 0xB1, 0x45, 0x69, 0x67, 0x68, 0x74, 0x53, 0x65, 0x6E, 0x73, 0x65, 0x73, 0x46, 0x69, 0x66, 0x74, 0x68, 0x30, };
            ____stringByteKeys_EightSensesFifth1 = new byte[]{ 0xB1, 0x45, 0x69, 0x67, 0x68, 0x74, 0x53, 0x65, 0x6E, 0x73, 0x65, 0x73, 0x46, 0x69, 0x66, 0x74, 0x68, 0x31, };
            ____stringByteKeys_EightSensesFifth2 = new byte[]{ 0xB1, 0x45, 0x69, 0x67, 0x68, 0x74, 0x53, 0x65, 0x6E, 0x73, 0x65, 0x73, 0x46, 0x69, 0x66, 0x74, 0x68, 0x32, };
            ____stringByteKeys_EightSensesFifth3 = new byte[]{ 0xB1, 0x45, 0x69, 0x67, 0x68, 0x74, 0x53, 0x65, 0x6E, 0x73, 0x65, 0x73, 0x46, 0x69, 0x66, 0x74, 0x68, 0x33, };
            ____stringByteKeys_EightSensesFifth4 = new byte[]{ 0xB1, 0x45, 0x69, 0x67, 0x68, 0x74, 0x53, 0x65, 0x6E, 0x73, 0x65, 0x73, 0x46, 0x69, 0x66, 0x74, 0x68, 0x34, };
            ____stringByteKeys__0123456_0123456 = new byte[]{ 0xB0, 0x5F, 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x5F, 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, };
        }

        public void Serialize(ref MessagePackWriter writer, global::TestType value, global::MessagePack.MessagePackSerializerOptions options)
        {
            if (value == null)
            {
                writer.WriteNil();
                return;
            }

            IFormatterResolver formatterResolver = options.Resolver;
            writer.WriteMapHeader(10);
            writer.WriteRaw(____stringByteKeys_Value);
            writer.Write(value.Value);
            writer.WriteRaw(____stringByteKeys_Value0);
            writer.Write(value.Value0);
            writer.WriteRaw(____stringByteKeys_Value1);
            writer.Write(value.Value1);
            writer.WriteRaw(____stringByteKeys_Value2);
            writer.Write(value.Value2);
            writer.WriteRaw(____stringByteKeys_EightSensesFifth0);
            writer.Write(value.EightSensesFifth0);
            writer.WriteRaw(____stringByteKeys_EightSensesFifth1);
            writer.Write(value.EightSensesFifth1);
            writer.WriteRaw(____stringByteKeys_EightSensesFifth2);
            writer.Write(value.EightSensesFifth2);
            writer.WriteRaw(____stringByteKeys_EightSensesFifth3);
            writer.Write(value.EightSensesFifth3);
            writer.WriteRaw(____stringByteKeys_EightSensesFifth4);
            writer.Write(value.EightSensesFifth4);
            writer.WriteRaw(____stringByteKeys__0123456_0123456);
            formatterResolver.GetFormatterWithVerify<string>().Serialize(ref writer, value._0123456_0123456, options);
        }

        public global::TestType Deserialize(ref MessagePackReader reader, global::MessagePack.MessagePackSerializerOptions options)
        {
            if (reader.TryReadNil())
            {
                return null;
            }

            options.Security.DepthStep(ref reader);
            IFormatterResolver formatterResolver = options.Resolver;
            var __Value__ = default(int);
            var __Value0__ = default(int);
            var __Value1__ = default(int);
            var __Value2__ = default(int);
            var __EightSensesFifth0__ = default(int);
            var __EightSensesFifth1__ = default(int);
            var __EightSensesFifth2__ = default(int);
            var __EightSensesFifth3__ = default(int);
            var __EightSensesFifth4__ = default(int);
            var ___0123456_0123456__ = default(string);
            var isBigEndian = !global::System.BitConverter.IsLittleEndian;

            for (int i = 0, length = reader.ReadMapHeader(); i < length; i++)
            {
                var stringKey = global::MessagePack.Internal.CodeGenHelpers.ReadStringSpan(ref reader);
                switch(stringKey.Length)
                {
                    case 5:
                        {
                            ulong last = stringKey[4];
                            last <<= 8;
                            last |= stringKey[3];
                            last <<= 8;
                            last |= stringKey[2];
                            last <<= 8;
                            last |= stringKey[1];
                            last <<= 8;
                            last |= stringKey[0];
                            switch (last)
                            {
                                default: goto FAIL;
                                case 0x65756C6156UL:
                                    __Value__ = reader.ReadInt32();
                                    continue;
                            }
                        }
                    case 6:
                        {
                            ulong last = stringKey[5];
                            last <<= 8;
                            last |= stringKey[4];
                            last <<= 8;
                            last |= stringKey[3];
                            last <<= 8;
                            last |= stringKey[2];
                            last <<= 8;
                            last |= stringKey[1];
                            last <<= 8;
                            last |= stringKey[0];
                            switch (last)
                            {
                                default: goto FAIL;
                                case 0x3065756C6156UL:
                                    __Value0__ = reader.ReadInt32();
                                    continue;
                                case 0x3165756C6156UL:
                                    __Value1__ = reader.ReadInt32();
                                    continue;
                                case 0x3265756C6156UL:
                                    __Value2__ = reader.ReadInt32();
                                    continue;
                            }
                        }
                }

                ReadOnlySpan<ulong> ulongs = isBigEndian ? stackalloc ulong[stringKey.Length >> 3] : MemoryMarshal.Cast<byte, ulong>(stringKey);
                if(isBigEndian)
                {
                    for(var index = 0; index < ulongs.Length; index++)
                    {
                        var index8times = index << 3;
                        ref var number = ref global::System.Runtime.CompilerServices.Unsafe.AsRef(ulongs[index]);
                        number = stringKey[index8times + 7];
                        for (var numberIndex = index8times + 6; numberIndex >= index8times; numberIndex--)
                        {
                            number <<= 8;
                            number |= stringKey[numberIndex];
                        }
                    }
                }

                switch(stringKey.Length)
                {
                    default: goto FAIL;
                    case 16:
                        if (
                            ulongs[0] != 0x363534333231305FUL || // _0123456
                            ulongs[1] != 0x363534333231305FUL // _0123456
                        ) goto FAIL;

                        ___0123456_0123456__ = formatterResolver.GetFormatterWithVerify<string>().Deserialize(ref reader, options);
                        continue;
                    case 17:
                        if (ulongs[0] != 0x6E65537468676945UL) goto FAIL; // EightSen
                        if (ulongs[1] != 0x6874666946736573UL) goto FAIL; // sesFifth
                        {
                            uint last = stringKey[16];
                            switch (last)
                            {
                                default: goto FAIL;
                                case 0x30U:
                                    __EightSensesFifth0__ = reader.ReadInt32();
                                    continue;
                                case 0x31U:
                                    __EightSensesFifth1__ = reader.ReadInt32();
                                    continue;
                                case 0x32U:
                                    __EightSensesFifth2__ = reader.ReadInt32();
                                    continue;
                                case 0x33U:
                                    __EightSensesFifth3__ = reader.ReadInt32();
                                    continue;
                                case 0x34U:
                                    __EightSensesFifth4__ = reader.ReadInt32();
                                    continue;
                            }
                        }
                }

            FAIL:
                reader.Skip();
            }

            var ____result = new global::TestType();
            ____result.Value = __Value__;
            ____result.Value0 = __Value0__;
            ____result.Value1 = __Value1__;
            ____result.Value2 = __Value2__;
            ____result.EightSensesFifth0 = __EightSensesFifth0__;
            ____result.EightSensesFifth1 = __EightSensesFifth1__;
            ____result.EightSensesFifth2 = __EightSensesFifth2__;
            ____result.EightSensesFifth3 = __EightSensesFifth3__;
            ____result.EightSensesFifth4 = __EightSensesFifth4__;
            ____result._0123456_0123456 = ___0123456_0123456__;
            reader.Depth--;
            return ____result;
        }
    }
}

@pCYSl5EDgo
Copy link
Contributor Author

pCYSl5EDgo commented Mar 31, 2020

I made a benchmark project to know how much my pull request is faster than the original one.
Gist.

BenchmarkDotNet=v0.12.0, OS=Windows 10.0.18363
Intel Core i7-8750H CPU 2.20GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=3.1.200
  [Host]     : .NET Core 3.1.2 (CoreCLR 4.700.20.6602, CoreFX 4.700.20.6702), X64 RyuJIT
  Job-ITDJDY : .NET Core 3.1.2 (CoreCLR 4.700.20.6602, CoreFX 4.700.20.6702), X64 RyuJIT

Runtime=.NET Core 3.1  
Method ConstantString Mean Error StdDev
DeserializeTest_NewEmbeddedFormatter 508.6 ns 10.20 ns 10.47 ns
DeserializeTest2_OriginalGeneratedFormatter 682.1 ns 13.51 ns 29.66 ns
DeserializeTest_NewEmbeddedFormatter a 489.3 ns 6.36 ns 5.95 ns
DeserializeTest2_OriginalGeneratedFormatter a 694.0 ns 7.68 ns 6.81 ns
DeserializeTest_NewEmbeddedFormatter shorter 537.2 ns 2.44 ns 2.16 ns
DeserializeTest2_OriginalGeneratedFormatter shorter 714.6 ns 10.28 ns 8.58 ns
DeserializeTest_NewEmbeddedFormatter このベンチ(...)[62] 675.5 ns 6.43 ns 6.02 ns
DeserializeTest2_OriginalGeneratedFormatter このベンチ(...)[62] 871.8 ns 9.12 ns 7.12 ns

Copy link
Contributor

@waitxd waitxd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done!
Please consider to change constructor to static or move code into field initialization.
And I proposed a few style fixes to match project code style.

@pCYSl5EDgo
Copy link
Contributor Author

pCYSl5EDgo commented May 13, 2020

Thank you @waitxd !

You are using constructor to initialize static fields. Either use static constructor or move static this code into field initialization.

You are right.
I would like to embed those byte arrays as writer.WriteRaw's ReadOnlySpan<byte> parameter.
It is because C# 7.3 and C# compiler remove manage heap allocation.

@waitxd
Copy link
Contributor

waitxd commented May 13, 2020

Found two more bugs:

  1. If mpc output to directory instead of .cs file - your template is not used at all.
  2. You are checking MessagePackObject attribute value only for the first class from each namespace to decide whether should you apply your formatter.

In this example both will be serialized as an array:

namespace SharedData
{
    [MessagePackObject()]
    public class SerializeMeAsArrayPlease
    {
        [Key(0)]
        public bool Foo;
    }

    [MessagePackObject(true)]
    public class TotallyDoNotSerializeMeAsArrayPlease
    {
        public bool Foo;
    }
}

It seems static fields and static constructor in generated code are redundant now.
Maybe we should add a comment with meaning of inlined byte arrays in Serialize method as it was done in Deserialize?

@pCYSl5EDgo
Copy link
Contributor Author

pCYSl5EDgo commented May 14, 2020

Bug report

Thank you so much!

Maybe we should add a comment with meaning of inlined byte arrays in Serialize method as it was done in Deserialize?

It should be embedded. I am worried that the end-user programmer at Unity environment uses these generated codes in .NET 3.5 API Level and his/her C# version is less than 7.3.

@waitxd
Copy link
Contributor

waitxd commented May 14, 2020

You are still generate constructor and static fields that are not used anymore.

Example
    public sealed class StringKeySerializerTargetFormatter : global::MessagePack.Formatters.IMessagePackFormatter<global::PerfBenchmarkDotNet.StringKeySerializerTarget>
    {
        private static byte[] ____stringByteKeys_MyProperty1;

        static StringKeySerializerTargetFormatter()
        {
            ____stringByteKeys_MyProperty1 = new byte[] { 0xAB, 0x4D, 0x79, 0x50, 0x72, 0x6F, 0x70, 0x65, 0x72, 0x74, 0x79, 0x31, };
        }

        public void Serialize(ref MessagePackWriter writer, global::PerfBenchmarkDotNet.StringKeySerializerTarget value, global::MessagePack.MessagePackSerializerOptions options)
        {
            if (value == null)
            {
                writer.WriteNil();
                return;
            }

            writer.WriteMapHeader(1);
            writer.WriteRaw(new byte[] { 0xAB, 0x4D, 0x79, 0x50, 0x72, 0x6F, 0x70, 0x65, 0x72, 0x74, 0x79, 0x31, });
            writer.Write(value.MyProperty1);
        }

        public global::PerfBenchmarkDotNet.StringKeySerializerTarget Deserialize(ref MessagePackReader reader, global::MessagePack.MessagePackSerializerOptions options)
        {
            if (reader.TryReadNil())
            {
                return null;
            }

            options.Security.DepthStep(ref reader);
            var __MyProperty1__ = default(int);
            var isBigEndian = !global::System.BitConverter.IsLittleEndian;

            for (int i = 0, length = reader.ReadMapHeader(); i < length; i++)
            {
                var stringKey = global::MessagePack.Internal.CodeGenHelpers.ReadStringSpan(ref reader);
                ReadOnlySpan<ulong> ulongs = isBigEndian ? stackalloc ulong[stringKey.Length >> 3] : MemoryMarshal.Cast<byte, ulong>(stringKey);
                if (isBigEndian)
                {
                    for (var index = 0; index < ulongs.Length; index++)
                    {
                        var index8times = index << 3;
                        ref var number = ref global::System.Runtime.CompilerServices.Unsafe.AsRef(ulongs[index]);
                        number = stringKey[index8times + 7];
                        for (var numberIndex = index8times + 6; numberIndex >= index8times; numberIndex--)
                        {
                            number <<= 8;
                            number |= stringKey[numberIndex];
                        }
                    }
                }

                switch (stringKey.Length)
                {
                    default:
                    FAIL:
                        reader.Skip();
                        continue;
                    case 11:
                        if (
                            ulongs[0] != 0x7265706F7250794DUL // MyProper
                        ) goto FAIL;
                        {
                            uint last = stringKey[10];
                            last <<= 8;
                            last |= stringKey[9];
                            last <<= 8;
                            last |= stringKey[9];
                            if(last != 0x317974U) goto FAIL;
                            __MyProperty1__ = reader.ReadInt32();
                            continue;
                        }
                }
            }

            var ____result = new global::PerfBenchmarkDotNet.StringKeySerializerTarget();
            ____result.MyProperty1 = __MyProperty1__;
            reader.Depth--;
            return ____result;
        }
    }

It should be embedded.

Yes, But for those who will read generated code byte array meaning will be unclear.
We can put a comment right after byte array:

writer.WriteMapHeader(9);
writer.WriteRaw(new byte[] { 0xAB, 0x4D, 0x79, 0x50, 0x72, 0x6F, 0x70, 0x65, 0x72, 0x74, 0x79, 0x31, }); //MyProperty1
writer.Write(value.MyProperty1);

Or before:

// MyProperty2
writer.WriteRaw(new byte[] { 0xAB, 0x4D, 0x79, 0x50, 0x72, 0x6F, 0x70, 0x65, 0x72, 0x74, 0x79, 0x32, });
writer.Write(value.MyProperty2);

I am worried that the end-user programmer at Unity environment uses these generated codes in .NET 3.5 API Level and his/her C# version is less than 7.3.

That's should not be a problem. MessagePack-CSharp supports only Unity 2018.3+. It is mentioned in the readme.

@pCYSl5EDgo
Copy link
Contributor Author

Thank you for all of your great advices!

MessagePack-CSharp supports only Unity 2018.3+. It is mentioned in the readme.

Oh! I was releaved to hear that!

@AArnott AArnott requested a review from neuecc May 17, 2020 14:43
@github-actions
Copy link

This pull request is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@pCYSl5EDgo
Copy link
Contributor Author

My proposed automaton first branches by string length. The current string search automaton does not consider the length of the string, and compares it in 8-byte units.
I will write a benchmark code and examine how much this difference affects performance.

@AArnott
Copy link
Collaborator

AArnott commented Sep 2, 2020

@neuecc How are you feeling about this change at this point? Is it close enough to the dynamic resolver that you're comfortable maintaining it?

@neuecc
Copy link
Member

neuecc commented Sep 2, 2020

Serialize is good.
However, as for Deserialize, as I mentioned before, I don't see why you can't use the AutomataKeyGen.GetKey used there.
DynamicObjectResolver's generated code and mpc generated results are so different, that can not acceptable.
Also, in the main scenario, Unity IL2CPP, the behavior on Android (which has been an issue many times, right?) I'm concerned about the problems.

@pCYSl5EDgo
Copy link
Contributor Author

Okay, I start rewriting the Deserialization method.

@pCYSl5EDgo
Copy link
Contributor Author

Seems done.

@AArnott AArnott changed the base branch from master to v2.2 September 14, 2020 02:57
Copy link
Collaborator

@AArnott AArnott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty good. One inefficiency should be corrected as noted. @neuecc are you satisfied?

@neuecc
Copy link
Member

neuecc commented Sep 26, 2020

@pCYSl5EDgo
Thanks, great contribution.
I think it is okay.
A few code formatting issues.

Fix format of StringKeyFormatterTemplate.tt
reader.Skip() optimization
C#9 Record preparation
@pCYSl5EDgo
Copy link
Contributor Author

pCYSl5EDgo commented Sep 27, 2020

I don't checked generated code of zero members but Skip twice? Is okay?

It was okay. (First Skip() was string key skip and second Skip() was value skip.)
But I rewrite it to be more easily understood.

Different Endian

This PR uses binary sequence compare and AutomataKeyGen.GetKey.
It is okay.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants