Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

minprotobuf does not handle proto3 repeated uint32 fields. #1035

Closed
richard-ramos opened this issue Feb 29, 2024 · 6 comments
Closed

minprotobuf does not handle proto3 repeated uint32 fields. #1035

richard-ramos opened this issue Feb 29, 2024 · 6 comments
Assignees

Comments

@richard-ramos
Copy link
Member

richard-ramos commented Feb 29, 2024

For the following protobuffer:

syntax = "proto3";
message WakuMetadataResponse {
  optional uint32 cluster_id = 1;
  repeated uint32 shards = 2;
}

if we were to encode: {"cluster_id": 16, "shards": [32]}, we'd get the following output:

minprotobuf does handle correctly the encoding/decoding of the field shards (2), as long as we use proto2. If we use proto3, it will not:

let buffer: seq[byte] = hexToSeqByte("0810120120") 
let pb = initProtoBuffer(buffer)

var shards: seq[uint64]
if ?pb.getRepeatedField(2, shards):
  for shard in shards:
    rpc.shards.add(shard.uint32)

echo shards                      # this is empty

cc: @Ivansete-status , @igor-sirotin

@arnetheduck
Copy link
Contributor

You're looking for getPackedRepeatedField.

minprotobuf does not operate at the level of proto2 and proto3 - the differences between proto2 and proto3 are mostly in high-level features - notably, the wire encoding is the same and with minprotobuf you need to understand the details of each version of protobuf - in particular, proto3 defaults to packed repeated fields).

@AlejandroCabeza
Copy link
Collaborator

Tested that and it does indeed work by calling getPackedRepeatedField for proto3.

@kaiserd
Copy link
Collaborator

kaiserd commented Mar 5, 2024

Thank you very much.

@richard-ramos could you resolve the related issue on Waku side with this help?
If yes, we'd close this issue.

@kaiserd kaiserd closed this as completed Mar 5, 2024
@richard-ramos
Copy link
Member Author

Yes, thank you, @kaiserd and @arnetheduck . We should be able to solve the issue with that function.

However, I'd say that having to know the details on how the serialization works for the different proto versions in order to effectively use minprotobuff is just adding extra difficulty to the development workflow, specially since the library itself is not documented at all, and I can imagine a different developer running into a situation like this in the future.

Also, is there any plans in the future to create a higher level library that uses minprotobuf and abstracts you from these details? or even generates code similar to Rust's https://docs.rs/prost-build/latest/prost_build/ or the code generators from https://github.com/golang/protobuf? Ideally you should be able to use protobuffers without having to be familiar with how the encoding for protobuffers work

@arnetheduck
Copy link
Contributor

higher level library

You're looking for https://github.com/status-im/nim-protobuf-serialization - the part with Nim objects works fairly well.

There is a basic macro that parses .proto files and generates Nim objects from there (without a separate code generation step that you'd need with rust/golang), but it probably needs a good weekend of hacking to get it into a usable state.

@arnetheduck
Copy link
Contributor

without having to be familiar with how the encoding for protobuffers work

This is fine when consuming protocols, but when designing them, I highly recommend getting intimately familiar with the encoding as there are many beginner traps, such as thinking that protobuf encodings are deterministic (a mistake that many enthusiastic protobuf users make, including libp2p), that defaults are handled the same way in all languages (they're not) and that required fields are useful (they break forward compatibility). It also opens up knowledge about how to compose them (you can concat byte streams) and what security issues you might run into (you can make an infinately large protobuf stream that decoded looks like a single field, hiding spam and invalid data in the stream etc)

The good news is that all knowledge about proto2 and proto3 can be derived from the wire encoding document I linked, which is fairly simple - proto2/3 are just two slightly different packagings of the same underlying thing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

No branches or pull requests

4 participants