Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: Non-nullable Message and Oneof parameter generation #242

Open
Dogacel opened this issue Nov 18, 2022 · 5 comments
Open

Comments

@Dogacel
Copy link
Contributor

Dogacel commented Nov 18, 2022

From README,

For proto3, the only nullable fields are other messages and oneof fields.
Note: We can update the README to include optional.

Currently there is no way to mark a message type or oneof type as required in proto3. But this is possible via plugins / options for example: protoc-gen-validate.

It could be super cool if we can generate not nullable message fields if the option exists (And possibly we can make it customizable):

  Location home = 4 [(validate.rules).message.required = true];

and the generated code would have a not-nullable home type as so

 val home: pbandk.Location

I think there can be stuff that we need to consider, because this will definitely break the default constructor implementation for all protos and backwards compatibility. So we might need to add additional checks and throw exceptions like validators do to overcome unexpected errors.

@Dogacel Dogacel changed the title Feature request: Feature request: Non-nullable Message and Oneof parameter generation Nov 18, 2022
@garyp
Copy link
Collaborator

garyp commented Dec 2, 2022

My initial reaction is that this goes against the spirit of the protocol buffer design. proto3 removed the ability to have required fields for a good, though controversial, reason. If you need the functionality of required fields, proto2 is still available and continues to be maintained and improved by the protobuf team (and pbandk supports proto2 as well). With proto2 you can simply mark your fields as required and pbandk will generate non-nullable Kotlin types for those fields. For proto3, I think doing field validation via a separate layer like protoc-gen-validate is the better way to go.

That said, I definitely see the appeal of making these fields non-nullable when you're using pbandk with a validator like protoc-gen-validate. If your code has already validated the message then you know the field will never be null and it's annoying to still have to use ?. or !! in Kotlin when accessing that field. I'm not sure that's a good enough reason to add this feature to pbandk, but I'm not completely against it either 😄

One other thought that just occurred to me: many protobuf libraries (including the official ones) do not expose an optional field as nullable in the generated code. Instead, accessing an optional field that was not present returns the default value of that field rather than a null. There is a separate method provided for checking if the field was present. For example, pbandk generates this code for the Location home = 4; field in your example:

val home: Location? = null

whereas libraries like protobuf-java instead generate the equivalent of:

val home: Location = Location.defaultInstance
val hasHome: Boolean = false

pbandk prefers the first approach because Kotlin has such great support for nullable types that it's natural to use null to represent a protobuf field value that isn't present. You can then easily handle field presence by using the ?. operator. But the advantage of the second approach is that it would easily support your use case (the validator can ensure that hasHome == true and the rest of your code never has to deal with home being possibly null) and it's more in line with the protobuf design: it encourages code to treat default values the same as non-default values, which generally leads to improved backwards compatibility.

@Dogacel
Copy link
Contributor Author

Dogacel commented Dec 6, 2022

If you need the functionality of required fields, proto2 is still available and continues to be maintained and improved by the protobuf team

This was a company-wide decision so I couldn't do much but I agree, I think I feel closer to proto2 rather than proto3 in that sense. I suppose 3 feels newer but that is not the case. More like they are two different things, 3 is just 2-lite? But I still feel like as long as we use those custom libraries such as pbandk and protoc-gen-validate, we should be able to customize our experience. From my personal experience, having all those !! and unnecessary hasXXX and ?. calls is very annoying 😆

So if validation is done via protoc-gen-validate we don't need to throw exceptions conditionally on every little method call while trying to access a field, which is great. But this won't reflect in our code-base, we will keep writing !! calls to fields we know to be required (implicitly) and it is very easy to miss something and incorrectly cast that null thing to not-null in runtime and see an unexpected exception.

One other thought that just occurred to me: many protobuf libraries (including the official ones) do not expose an optional field as nullable in the generated code.

This is one of the reasons why I want to use pbandk over the official library-generated code. Null safety in kotlin is pretty good and the official tools do not benefit from it. Type safety is not really there, I don't think having to call hasXXX is very meaningful when you can still get runtime exceptions for null values. We could have just used the operators kotlin provides such as ?. in the code directly and that method becomes obsolete and very counter-intuitive.

which generally leads to improved backwards compatibility.

I agree, but I also think it depends on the use case. If we want to mark a field as required, it is considered as a breaking change. Because we do not reflect this in source-code doesn't mean it won't break. I feel like we are assuming things are not null or nullable implicitly in code which makes cross-collaboration hard. I am just finding myself reading proto comments all the time.

I think it mostly drills down to the reason why people should switch to proto3 from proto2. If there are valid reasons (more language support, performance optimizations etc.) then I think backwards compatibility should be something left to the user.

@Dogacel
Copy link
Contributor Author

Dogacel commented Dec 13, 2022

@garyp I had some time to think about this, here are my thoughts:

A View on Required Fields & PbandK

Proto is a wire protocol and PbandK is built on top of proto. So, the required fields do not exist in the wire format, but it doesn't mean that they shouldn't exist at the API level. Nevertheless, I think everyone agrees some contracts do not make sense if a field is not set (e.g. business logic makes it required).

Server

So, let's see how individual parts of the system play together:

  • From proto's perspective: You don't have to send this field, I will work regardless.
  • From API's (service)'s perspective: I will check whether you have sent this field or not, if you did not I will return you some kind of error code.

So assuming PbandK sits right between the proto and the API (when we use gen-validate) I don't think having required types in code is against proto3's compatibility standards. (i.e this is still satisfied) It is just an encapsulation of the service level logic to a properly formatted class structure with more precise null support.

Clients

From the client's perspective, after the proto version is upgraded, it means the contract has changed. So the client is just responsible for updating its logic to populate that field which is 100% OK in my opinion.

Also, the case described here is not true for all systems. There can be only a single consumer for protos too. For example, we store data efficiently using a proto structure and some might think backward compatibility is not an issue because of how they roll the required fields. It just depends on the situation and libraries like PbandK should be able to respond to those edge cases better than the generic proto implementation.

For example, a field can be marked as optional but if all the consumers are sending this field 100% of the time (developers updated the code), I think it is perfectly safe to say this field can be changed to be required in the generated code.

Implementation

One thing to take note of is how we will generate entities with default constructors, I think the approach I suggest would work because circularly dependent validated fields do not make sense

Final Opinions

I feel like without changing the proto-compatibility standards PbandK should be able to provide an interface to its consumers that is flexible enough to cover all needs. People have different implementations for PbandK servers and clients, they can have interceptors, proto-generated classes etc. that can be used to bridge the actual proto-wire code and generated libraries.

Sorry for the very long thread I just feel like this is still something very debatable because of the parities I see in real life vs. standards introduced by protocol buffers 🙂

@garyp
Copy link
Collaborator

garyp commented Jan 5, 2023

@Dogacel thanks for taking the time to write up your thoughts. I really appreciate it. On a practical level, I share your perspective with regards to many of the points you brought up. But on a philosophical level, I'm hesitant to take pbandk in a direction that strays too far from the "protobuf way". After all, there are many serialization formats out there. One of the reasons many of us choose protobuf is because of its opinionated stance on how to best support backward and forward compatibility. I'm not ready to say that the feature you're requesting should not be implemented in pbandk, but I need some more time to mull it over.

I do want to clarify one point though. You said:

Type safety is not really there, I don't think having to call hasXXX is very meaningful when you can still get runtime exceptions for null values. We could have just used the operators kotlin provides such as ?. in the code directly and that method becomes obsolete and very counter-intuitive.

What I was suggesting above was that if pbandk generates a val hasXXX: Boolean field, then the val XXX field can actually be made non-null in the generated code. E.g. a proto field like:

message Person {
  Location home = 1;
}

currently generates this code:

val home: Location? = null

With my proposal it would instead generate:

val home: Location = Location.defaultInstance
val hasHome: Boolean = false

I.e. the home field becomes non-null. You can still have a validation layer that returns an exception to the API client when hasHome != true. But the rest of your business logic can access home without needing to use !! or ?.. Does that change your mind at all about this proposal?

@Dogacel
Copy link
Contributor Author

Dogacel commented Jan 5, 2023

I.e. the home field becomes non-null. You can still have a validation layer that returns an exception to the API client when hasHome != true. But the rest of your business logic can access home without needing to use !! or ?.. Does that change your mind at all about this proposal?

I am not clear on one thing, I hope we are not changing how PbandK generates a field for the proto definitionLocation home = 1 to be val home: Location = Location.defaultInstance. I think it should stay as val home: Location? = null unless user enables some option (or marks field as required with an extension as I have mentioned). Because I think direct serializtion to Kotlin nulls is much better than hasXXX() methods which protoc native generator already uses (and number 1 reason I want to use PbandK).

So, in short:

  • Being able to mark a field as required if an option (extension) exists such as [(validate.rules).message.required = true];
  • Code generator setting for generating all fields that are not optional to be not null by default

I think it would work pretty fine with hasXXX() methods. So the default instance approach should be used for only certain fields, not everything. Those methods will help with validation rather than being used in business logic. But as I have said, Location? = null type should stay around, I think it is a great addition for optional fields. So I suggest some kind of hybrid approach.


Finally, in Carbon, we started using proto3 widely recently so the version of the proto3 compiler we use includes the optional keyword. So we decided to use this keyword to mark fields that are not required. E.g.

message Person { Location home = 1; }

here we assume home is a required field (even though it is not in the proto definition). This is kind of semantic and for documentation purposes. It makes defining schemas easier.

So even though optional keyword has no effect on how home field is serialized, we still use it if the field is optional

message Person { optional Location home = 1; }

so from both the client and server's perspective, they usually map those proto messages to some POJO with a not-null home field for example. Trying to use the home, hasHome() and homeOrNull fields at the same time is super confusing and not very developer friendly 😄 So having that option in PbandK would also help a lot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants