String Serialization/Deserialization not compatible with google's protobuf #476

Closed
ejoebstl opened this issue Nov 27, 2018 · 5 comments

ejoebstl commented Nov 27, 2018

When I encode my data using protobuf-net and try to decode it using Google's protobuf for C++, strings get decoded incorrectly.

Minimal example to reproduce

C# Program:

using System.IO;
using ProtoBuf;

[ProtoContract]
class StringHolder {
    [ProtoMember(1)]
    public string Text { get; set; }
    [ProtoMember(2)]
    public int A { get; set; }
    [ProtoMember(3)]
    public int B { get; set; }
}

class Program
{
    static void Main(string[] args)
    {

        StringHolder data = new StringHolder{ Text = "Test", A = 120, B = 120 };

        using (var file = File.Create("data.out"))
        {
            Serializer.Serialize(file, data);
        }

        File.WriteAllText("data_format.proto", Serializer.GetProto<StringHolder>());
    }
}

Then, compile data_format.proto using protoc.

protoc --proto_path ./ --cpp_out=proto ./data_format.proto

C++ Program:

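#include "proto/data_format.pb.h"
#include <fstream>
#include <iostream>

int main() {
    // Open in binary mode: the payload is raw protobuf bytes, not text.
    std::ifstream stream("data.out", std::ios::binary);
    ast_loader::StringHolder data;
    // ParseFromIstream returns false if the input is not a valid message.
    if (!data.ParseFromIstream(&stream)) {
        std::cerr << "Failed to parse data.out" << std::endl;
        return 1;
    }
    stream.close();

    std::cout << data.text() << std::endl;
}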

Expected Output

Test

Actual Output

Testoadere�c7�7V1P�7V|�u�7V

The length of the deserialized string is reported as 48, but it should be 4.

mgravell (Member) commented

Hi; that seems ... unlikely. Just so I start looking at the right place - what library version are you using? Also: are you sure that this isn't some kind of over-allocation API nuance in the C++ API, or an incorrect API usage?

If I run your example code, I get 10 bytes in my "data.out" file, which is exactly what I would expect:

  • 1 byte for the field header for field 1
  • 1 byte for the length prefix of the text of field 1
  • 4 bytes for the text of field 1
  • 1 byte for the field header for field 2
  • 1 byte for the value of field 2
  • 1 byte for the field header for field 3
  • 1 byte for the value of field 3

Equally, the second byte (the length prefix of the text) is: 0x04 - which is simply decoded as the literal value 4.

The actual payload I get is: (0x) 0A-04-54-65-73-74-10-78-18-78
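
The byte-by-byte breakdown above can be checked by hand with a small decoder. This is only a rough sketch, assuming the standard protobuf wire format (a varint tag whose low 3 bits are the wire type, followed by either a varint value or a length-prefixed payload). `readVarint` and `describe` are hypothetical helper names, not part of protobuf-net or Google's library, and only wire types 0 and 2 are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Reads one base-128 varint at pos and advances pos past it.
// Returns false if the buffer ends before the varint is complete.
bool readVarint(const std::vector<std::uint8_t>& buf, std::size_t& pos,
                std::uint64_t& out) {
    out = 0;
    for (int shift = 0; shift < 64 && pos < buf.size(); shift += 7) {
        const std::uint8_t b = buf[pos++];
        out |= static_cast<std::uint64_t>(b & 0x7F) << shift;
        if ((b & 0x80) == 0) return true;  // high bit clear: last byte
    }
    return false;
}

// Hypothetical helper (not part of any protobuf library): lists the fields
// found in buf, one per line. Handles wire types 0 and 2 only.
std::string describe(const std::vector<std::uint8_t>& buf) {
    std::ostringstream os;
    std::size_t pos = 0;
    while (pos < buf.size()) {
        std::uint64_t tag = 0;
        if (!readVarint(buf, pos, tag)) { os << "truncated tag\n"; break; }
        const std::uint64_t field = tag >> 3;                    // field number
        const unsigned wire = static_cast<unsigned>(tag & 0x7);  // wire type
        if (wire == 0) {  // varint value
            std::uint64_t value = 0;
            if (!readVarint(buf, pos, value)) {
                os << "field " << field << " truncated\n";
                break;
            }
            os << "field " << field << " varint " << value << "\n";
        } else if (wire == 2) {  // length prefix, then that many bytes
            std::uint64_t len = 0;
            if (!readVarint(buf, pos, len) || len > buf.size() - pos) {
                os << "field " << field << " truncated\n";
                break;
            }
            const std::size_t n = static_cast<std::size_t>(len);
            os << "field " << field << " bytes \""
               << std::string(reinterpret_cast<const char*>(buf.data() + pos), n)
               << "\"\n";
            pos += n;
            assert(pos <= buf.size());  // guaranteed by the length check above
        } else {
            os << "field " << field << " unsupported wire type " << wire << "\n";
            break;
        }
    }
    return os.str();
}
```

Passing the ten bytes 0A-04-54-65-73-74-10-78-18-78 to `describe` should give three lines: field 1 as the bytes "Test", then fields 2 and 3 as the varint 120.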

So: whatever is happening - I don't think it is the encode at fault here.

The file you are processing: is that also 10 bytes? I'm trying to think how it might have been damaged; the most common cause is people using text APIs on binary data, but nothing in the question suggests that.
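
One way to answer the size question without going through a text API is to read the file in binary mode and print it as hex, in the same dash-separated form as the payload quoted above. A minimal sketch, assuming nothing beyond the C++ standard library; `readAllBytes` and `toHex` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: reads a whole file as raw bytes.
// std::ios::binary disables any newline translation on the way in.
// Returns an empty vector if the file cannot be opened.
std::vector<unsigned char> readAllBytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::istreambuf_iterator<char> first(in), last;
    return std::vector<unsigned char>(first, last);
}

// Hypothetical helper: formats bytes as upper-case hex pairs joined by '-'.
std::string toHex(const std::vector<unsigned char>& bytes) {
    std::string out;
    char pair[3];
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i != 0) out += '-';
        std::snprintf(pair, sizeof pair, "%02X", static_cast<unsigned>(bytes[i]));
        out += pair;
    }
    return out;
}
```

Comparing `readAllBytes("data.out").size()` and `toHex(readAllBytes("data.out"))` against the 10-byte payload above shows whether the file on disk matches what the encoder is expected to write.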

ejoebstl commented Nov 27, 2018

Thank you for the insanely fast response.

My payload is 11 bytes: 0A-04-54-65-73-74-10-78-18-78-0A.

Super odd. I don't know where the extra byte comes from, but I would not expect a trailing \n to cause that kind of trouble.
I'll look deeper into the decoding.
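
For reference, a trailing 0x0A is not neutral on the wire: read as a protobuf tag, it has the same value as the header byte that opens field 1. A minimal sketch of that split, assuming the standard tag layout (field number in the upper bits, wire type in the low 3 bits); `Tag` and `splitTag` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Field number and wire type packed into a protobuf tag.
struct Tag {
    std::uint32_t field;
    std::uint32_t wireType;
};

// Hypothetical helper: splits a single-byte tag (field numbers 1..15).
// Larger field numbers need a multi-byte varint, which this does not handle.
inline Tag splitTag(std::uint8_t b) {
    assert((b & 0x80) == 0);  // continuation bit must be clear
    Tag t;
    t.field = static_cast<std::uint32_t>(b >> 3);
    t.wireType = static_cast<std::uint32_t>(b & 0x7);
    return t;
}
```

`splitTag(0x0A)` gives field 1 with wire type 2 (length-delimited), so a lone trailing 0x0A looks like the start of another string field with no length byte after it. A parser would likely treat that as a truncated message, which is one reason to check the return value of the parse call. It does not by itself establish where the extra characters in the output come from.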

I'm using protobuf-net 2.4.0, libprotoc 3.6.1, and .NET Core 2.1.403 on Arch Linux.

ejoebstl (Author) commented

I've created an issue with protobuf - maybe they can help.

Thanks again!

ejoebstl commented Dec 4, 2018

We've tracked the issue down and it's related to exported symbols from a 3rd party library. Thanks for your help!

ejoebstl closed this as completed Dec 4, 2018

mgravell commented Dec 4, 2018 via email
