Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG?] Problem roundtripping null strings #969

Open
jodydonetti opened this issue Oct 25, 2022 · 2 comments
Open

[BUG?] Problem roundtripping null strings #969

jodydonetti opened this issue Oct 25, 2022 · 2 comments

Comments

@jodydonetti
Copy link

jodydonetti commented Oct 25, 2022

Hi, I'm using protobuf-net v3.1.22 on net6.0 (using VS2022 Community v17.3.6, just in case) and I'm observing inconsistencies while roundtripping null strings.

Here's the minimal repro:

using ProtoBuf.Meta;

string? s1 = null;
string? s2;
int? i1 = null;
int? i2;
byte[] data;

// STRING
using (var ms = new MemoryStream())
{
	RuntimeTypeModel.Default.Serialize(ms, s1);
	data = ms.ToArray();
}
using (var ms = new MemoryStream(data))
{
	s2 = RuntimeTypeModel.Default.Deserialize<string?>(ms);
}

Console.WriteLine($"data.Length: {data.Length}");
Console.WriteLine($"s1 ({s1}) is null: {s1 is null}");
Console.WriteLine($"s2 ({s2}) is null: {s2 is null}");

// INT
using (var ms = new MemoryStream())
{
	RuntimeTypeModel.Default.Serialize(ms, i1);
	data = ms.ToArray();
}
using (var ms = new MemoryStream(data))
{
	i2 = RuntimeTypeModel.Default.Deserialize<int?>(ms);
}

Console.WriteLine($"data.Length: {data.Length}");
Console.WriteLine($"i1 ({i1}) is null: {i1 is null}");
Console.WriteLine($"i2 ({i2}) is null: {i2 is null}");

The output of this is:

data.Length: 0
s1 () is null: True
s2 () is null: False
data.Length: 0
i1 () is null: True
i2 () is null: True

When I serialize a null value, in both cases (string and int) it seems to produce - at least based on with understanding of how Protobuf works - the expected payload which is an empty byte[] (byte[0]).

When deserializing the string though, I expected to get back null but instead got an empty string.

With int? instead it seems to work fine, correctly returning null.

All the default settings are unchanged, and as can be seen I'm using the default TypeModel: the one above is the entire repro code.

In case it can be useful, this is the csproj:

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net6.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="protobuf-net" Version="3.1.22" />
  </ItemGroup>

</Project>

I've found the issue #698 but I'm not sure this is the same thing since that involved a wrapping class.

Am I doing something wrong?

Thanks!

@jodydonetti
Copy link
Author

Update: it seems to do the same with a generic class too, not just strings:

// TEST DATA
TestData? td1 = null;
TestData? td2;

using (var ms = new MemoryStream())
{
	RuntimeTypeModel.Default.Serialize(ms, td1);
	data = ms.ToArray();
}
using (var ms = new MemoryStream(data))
{
	td2 = RuntimeTypeModel.Default.Deserialize<TestData?>(ms);
}

Console.WriteLine($"data.Length: {data.Length}");
Console.WriteLine($"td1 ({td1}) is null: {td1 is null}");
Console.WriteLine($"td2 ({td2}) is null: {td2 is null}");


[DataContract]
public class TestData
{
	[DataMember(Order = 1)]
	public int Id { get; set; }
	[DataMember(Order = 2)]
	public string? Name { get; set; }
}

The output is:

data.Length: 0
td1 () is null: True
td2 (TestData) is null: False

And the value of td2 is this:

image

so likely the result of just calling the ctor, I suppose.

Thanks in advance for any help.

@jodydonetti
Copy link
Author

Another update: I think I'm getting to the bottom of it, and sorry if I'm kinda using an issue on your repo here to basically explore Protobuf and learn more about it, but I haven't found a clear description around the web about the general approach to use and the expectations to have.

Exploration

Based on my readings and experiments it seems to me that null is quite special in the Protobuf world (see also here and here for example): it is not 100% officially supported all around, even though it seems to work sometimes based on the specific library implementations and the specific use cases.

For example when serializing a string set to null, there's a difference between serializing the string directly and serializing a data structure (eg: a class) that has a string field in it.

In protobuf-net, both when serializing a string variable set to null and when serializing a variable of my TestData class set to null, the result is always a byte[0].
Then when deserializing those, the first (direct string) returns a "default value" of an empty string, while the second (the class) returns a "default value" that basically corresponds to calling the default TestData ctor (with the inner fields set to, respectively, 0 and null).

I also noticed the same happens if I serialize an instance created by simply calling new TestData(): it generates a byte[0] and deserialize it in the same way (I assume because of the fields being equal to their types' default values, but it's just an hypothesis of mine).

The unexpected thing though is that the inner string field results, as said, in a null value. Probably coming from other serialization formats (eg: JSON) I would have expected a more consistent and uniform behaviour: if I serialize a null string I always get back e null string or, at least, I always get back an empty string. Getting back different things based on the context (direct VS field) was - at least to me - quite surprising and confusing.

Of course I'm not necessarily pointing to protobuf-net here, probably is more of a Protobuf (the protocol itself) thing, maybe part of the so called undefined behaviour world where different implementations may differ, in this case around null handling.

Am I getting it right?

Safe approach?

Because of those premises, in general I'm thinking about what can be the best approach: to always use some sort of a "container structure" like a class, and not work directly with "simple values" (ints, strings, etc) to avoid strange behaviours.

Would you agree on such an approach?

Suggestion

If all of this is correct, may I suggest adding a paragraph in the README file? Even though it would not be specific to protobuf-net but more of a protocol general thing, I assume people would be less confused and feel more welcome when approaching Protobuf in general, and protobuf-net in particular, if such concepts would be highlighted directly up-front.
That would possibly broaden the adoption of the protocol and the library even more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant