Add unknown field support for csharp #3936
We need to think carefully about the mutability of UnknownField and UnknownFieldSet, along with how "empty" can be efficiently implemented in both cases.
There are two options I'd consider:
Have an immutable "empty" instance, and make all "mutation" methods return a value of the same type, documenting that the call will mutate any non-empty values and return this, or create a new instance and return it if it's called on empty
Use null for empty instead
The two options have different pros and cons.
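To make the first option concrete, here is a minimal sketch. `FieldSetSketch` and `AddVarint` are hypothetical names used only for illustration and are not the PR's code. It assumes a single shared empty instance that is never mutated and "mutation" methods whose return value the caller must always use.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for UnknownFieldSet; not the code under review.
public sealed class FieldSetSketch
{
    // Shared "empty" instance. AddVarint never writes to it.
    public static readonly FieldSetSketch Empty = new FieldSetSketch();

    private readonly Dictionary<int, ulong> varints = new Dictionary<int, ulong>();

    public int Count => varints.Count;

    // "Mutation" method: callers must always use the returned reference.
    public FieldSetSketch AddVarint(int fieldNumber, ulong value)
    {
        // Called on Empty: allocate a new instance. Otherwise mutate in place and return this.
        FieldSetSketch target = ReferenceEquals(this, Empty) ? new FieldSetSketch() : this;
        target.varints[fieldNumber] = value;
        return target;
    }
}
```

A caller would write `fields = fields.AddVarint(1, 42);` every time, whether or not `fields` started out as `Empty`. With the second option the same call site would start from `null` instead.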
The global nature of preserving unknown fields worries me as well - is that what's happening in other languages? What's the migration aspect described in CodedInputStream.cs?
Yes, preserving unknown fields is happening in all other languages from 3.4.0. C++ has already changed the default behavior to preserve in 3.5.0. You can see the release log for more info.
For the DefaultInstance issue, I've removed the one in UnknownFieldSet and use null instead. UnknownField may still need a default one. I don't quite understand your option 1. Do you mean adding a check in each mutating method that just returns if it is the DefaultInstance?
I'd suggest leaving the field null by default even if unknown fields are enabled - I'd expect most parse operations not to end up finding any unknown fields, so let's avoid the extra allocation.
Instead, if you create a static method in UnknownFieldSet that accepts an existing field set, and creates a new one if necessary, then you can have:
unknownFields = UnknownFields.MergeFieldFrom(unknownFields, input)
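A minimal sketch of such a helper follows. The type name is hypothetical, and the input stream is simplified to an already-decoded field number and varint value, which is an assumption made to keep the example self-contained.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for UnknownFieldSet; a real helper would read from the input stream.
public sealed class FieldSetSketch
{
    private readonly List<KeyValuePair<int, ulong>> fields =
        new List<KeyValuePair<int, ulong>>();

    public int Count => fields.Count;

    // Accepts a possibly-null set and allocates only when a field actually arrives.
    public static FieldSetSketch MergeFieldFrom(FieldSetSketch existing, int fieldNumber, ulong value)
    {
        FieldSetSketch result = existing ?? new FieldSetSketch();
        result.fields.Add(new KeyValuePair<int, ulong>(fieldNumber, value));
        return result;
    }
}
```

Because the helper returns the set it was given whenever that set is non-null, a message that never sees an unknown field never allocates one.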
In equality, you can just use
... which will call Equals if and only if both values are non-null. Note that this relies on a field set never being empty - null is effectively the representation of an empty unknown field set.
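The expression itself is missing from the text above. One expression with the described null handling is the static `object.Equals(a, b)`, which returns true when both arguments are null, false when exactly one is null, and otherwise defers to the instance `Equals`. The sketch below assumes that is what was meant. `MessageSketch` and its members are hypothetical, with a string standing in for the field set.

```csharp
using System;

// Hypothetical message type; a string stands in for the unknown field set.
public sealed class MessageSketch : IEquatable<MessageSketch>
{
    public int Id;

    // null represents "no unknown fields".
    public string UnknownPayload;

    public bool Equals(MessageSketch other)
    {
        if (other is null) return false;
        // Static object.Equals deals with nulls before calling the instance Equals.
        return Id == other.Id && object.Equals(UnknownPayload, other.UnknownPayload);
    }

    public override bool Equals(object obj) => Equals(obj as MessageSketch);

    public override int GetHashCode() =>
        Id ^ (UnknownPayload == null ? 0 : UnknownPayload.GetHashCode());
}
```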
I'd also suggest renaming the field to _unknownFields so that it won't conflict with a message field called "unknown_fields".
I have removed the DefaultInstance.
As for the unknownFields = UnknownFields.MergeFieldFrom(unknownFields, input) suggestion, I think it is not needed. Unknown fields are mostly used in generated code, which does not need this support.
I'm not sure what you mean. The point is that that would allow you to keep
Definitely getting there - sorry for this taking so many iterations.
Is the intention that users should be able to get at the unknown fields for a message? Currently I can only see them being visible by serializing the message.
Okay, hopefully my last set of comments now :)
If we're not exposing the fields themselves, let's really not expose them:
- Make UnknownField internal
- Make the UnknownFieldSet constructor internal
- Make UnknownFieldSet.AddOrReplaceField internal
Importantly, at that point the only way of getting an UnknownFieldSet instance is to populate one with a field. That's great because it means we never need to worry about comparing null with an empty UnknownFieldSet - whereas otherwise, both would effectively represent "there aren't any unknown fields in this message". If we revisit that decision, we'll need to provide an easy way of saying "I have two UnknownFieldSet references, either of which may be null and either of which may be non-null, but referring to an empty field set - please consider null and empty to be equivalent."
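For reference, the kind of helper that would be needed if that decision were revisited might look like the sketch below. It is purely illustrative: `FieldSetComparison` and `NullOrEmptyEquals` are hypothetical names, and a dictionary stands in for the field set.

```csharp
using System.Collections.Generic;

public static class FieldSetComparison
{
    // Treats null and an empty set as the same "no unknown fields" state.
    public static bool NullOrEmptyEquals(Dictionary<int, ulong> a, Dictionary<int, ulong> b)
    {
        int countA = a?.Count ?? 0;
        int countB = b?.Count ?? 0;
        if (countA != countB) return false;
        if (countA == 0) return true;

        // Both are non-null and the same size here, so a one-way check is enough.
        foreach (KeyValuePair<int, ulong> pair in a)
        {
            if (!b.TryGetValue(pair.Key, out ulong value) || value != pair.Value)
            {
                return false;
            }
        }
        return true;
    }
}
```

Keeping the constructor internal avoids having to call something like this from every generated `Equals`.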
Just two more questions:
- Is it deliberate that Clone doesn't clone unknown fields? (That could be slightly surprising in some cases.)
- Can I confirm that we'll be removing the property in CodedInputStream before the next public release? It's not clear what's meant by "once migration is done".
I have changed the 3 methods to internal.
Clone should also clone the unknown fields. I've added the corresponding code and tests. Sorry I missed it.
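As a rough illustration of the cloning behavior being discussed, and not the generated code itself, a clone that carries unknown fields across could look like this. All names are hypothetical, and a dictionary stands in for the field set.

```csharp
using System.Collections.Generic;

// Hypothetical message type; not the generated code from the PR.
public sealed class MessageSketch
{
    public int Id;

    // null means no unknown fields were seen while parsing.
    private Dictionary<int, ulong> unknownFields;

    public int UnknownCount => unknownFields?.Count ?? 0;

    public void AddUnknown(int fieldNumber, ulong value)
    {
        if (unknownFields == null) unknownFields = new Dictionary<int, ulong>();
        unknownFields[fieldNumber] = value;
    }

    public MessageSketch Clone()
    {
        var copy = new MessageSketch { Id = Id };
        // Copy the set so later changes to one message do not affect the other.
        if (unknownFields != null)
        {
            copy.unknownFields = new Dictionary<int, ulong>(unknownFields);
        }
        return copy;
    }
}
```

Leaving the clone's field null when the source has no unknown fields keeps the "null means empty" convention intact.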
We don't have a clear timeline for the migration, this is the java version:
For other languages, we have released a version that has PreserveUnknownsDefault (default false), so that users can migrate their code. We then release a version with PreserveUnknownsDefault true, and the last step is to remove it (none of them have removed PreserveUnknownsDefault yet).
So we are indeed going to release a version that has PreserveUnknownsDefault.
Then removing the property later will be a breaking change, meaning that the library should be revved to version 4.x to follow semantic versioning. I'm very, very keen on not breaking compatibility like this. Changing the default over time seems reasonable, but I'm nervous over it being part of the public API.
One alternative option would be to make it an internal property that could be controlled via an external mechanism, e.g. an environment variable. That's not great either, but at least it doesn't completely violate compatibility.
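A sketch of what that environment-variable approach might look like, assuming a made-up variable name and a default of "preserve" (neither is taken from the PR):

```csharp
using System;

internal static class UnknownFieldSettings
{
    // Hypothetical variable name, chosen only for illustration.
    private const string VariableName = "PROTOBUF_PRESERVE_UNKNOWN_FIELDS";

    // Read once at startup. Not part of the public API surface.
    internal static readonly bool PreserveUnknowns =
        Parse(Environment.GetEnvironmentVariable(VariableName));

    // Anything other than "false" (case-insensitive), including an unset variable,
    // keeps the assumed default of true.
    internal static bool Parse(string raw) =>
        !string.Equals(raw, "false", StringComparison.OrdinalIgnoreCase);
}
```

Since nothing here is public, the switch could later be dropped without changing the library's API.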
So remove the property, change codegen as if it were always true, and add the method? Makes sense to me.
Unfortunately we can't add the