Skip to content

Create Protobuf.Text, create Protobuf.field_presence/2 #398

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

v0idpwn
Copy link
Collaborator

@v0idpwn v0idpwn commented Jan 15, 2025

This PR introduces support for encoding Protobuf messages into the textproto format.

Motivation

The primary motivation for this feature is to improve the ability to inspect and visualize Protobuf messages, while still conforming to some standard. The inspect protocol default implementation can be painful to work with large or complex structures. Finding the relevant, non-default fields in a inspected struct is not intuitive and takes an unnecessary amount of time. A textproto representation provides a human-readable alternative that is much closer to the actual data transmitted over the wire (specially when using transformers).

This implementation

  • Encoding only: Decoding was deliberately left out as it is less valuable compared to encoding, and has many caveats.
  • Emphasis on readability: pretty-printed by default. We do a best effort for breaking lines, and opted for the most readable and elixir-like alternatives in the spec. Max line width and tab size are is configurable.
  • Limitations:
    • No support for extensions
    • No support for Google.Protobuf.Any.
    • No type validation is performed.

Future Considerations

I'm open to extending this implementation in the future to include:

  1. Adding support for extensions
  2. Adding support for Google.Protobuf.Any.
  3. Implementing type validation.

@v0idpwn v0idpwn requested a review from whatyouhide January 16, 2025 20:07
Copy link
Collaborator

@whatyouhide whatyouhide left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we look at using Inspect.Algebra to simplify the line splitting? Not sure what the spec of the format is and I don't have the bandwidth to go read it now, but it seems flexible enough that we could break lines in a few different places?

@v0idpwn
Copy link
Collaborator Author

v0idpwn commented Jan 22, 2025

Thank you, @whatyouhide. I'll explore using Inspect.Algebra, it's probably a good fit here. :)

@v0idpwn v0idpwn force-pushed the feat/text-encoding branch from dcc90cb to 6c73550 Compare January 23, 2025 03:21
@v0idpwn
Copy link
Collaborator Author

v0idpwn commented Jan 23, 2025

Pushed a new version using Inspect.Algebra. It indeed simplified code a lot, thanks for the suggestion.

Some limitations/changes:

  • I wasn't able to have different separators on container_doc for single line and multi-line. I'm sure it's achievable if I use the lower level functions, but I think having a , after every line isn't that bad (it's optional in the spec), and I like the simplicity of container_doc.
  • I wasn't able to define :pad_size as I was in the hand-crafted implementation, so it's a fixed 2.

I'll address the other comments over the week :)

@whatyouhide
Copy link
Collaborator

@v0idpwn OK sounds good, lmk when this is gonna be ready for re-review. 🙃

@v0idpwn
Copy link
Collaborator Author

v0idpwn commented Feb 1, 2025

I think it's okay to be reviewed, @whatyouhide :)

@v0idpwn
Copy link
Collaborator Author

v0idpwn commented Feb 3, 2025

Actually, I found that there are conformance tests for text encoding. I'll work on them.

@whatyouhide
Copy link
Collaborator

@v0idpwn is this ready for review?

v0idpwn and others added 5 commits April 30, 2025 13:12
* Allows encoding to textproto

* Run conformance for text, fixes for conformance, add unknown fields support

Most of the conformance tests in the text suite are for decoding, and
some of them are for unknown fields encoding rules, so we only got value
out of five of them. Still, they led to significant improvements in the
implementation.

For one, I figured that the "outer message" **can't** use brackets ("{",
"}"). I thought they were optional and just not used in the examples. So
we went from:

```elixir
Protobuf.Text.encode(%MyMessage{a: 1, b: 2})
```

To:
```elixir
Protobuf.Text.encode(%MyMessage{a: 1, b: 2})
```

I also added unknown fields support, but wasn't able to pass all the
unknown fields tests, so I skipped them for now. The current version
does the job of printing them reasonably well, so I'm not too bothered
with it.
* Create Protobuf.Presence module

Unifies presence tracking between encoders. Provides a public API for checking field presence

* Add test for messages, fix proto3

* Improve docs
Also fix examples, and add an extra example about repeated
@v0idpwn v0idpwn force-pushed the feat/text-encoding branch from 8c6bd63 to 7560e91 Compare April 30, 2025 16:12
Comment on lines +69 to +71
a: 1
e: {
a: 1,
Copy link
Collaborator Author

@v0idpwn v0idpwn Apr 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not very happy with this inconsistency: notice that the "outside" entries don't have ,, but the nested ones do. Tried to make it match but failed.

This is fine per the spec, I think we can leave it as it is, at least for now.

@v0idpwn v0idpwn changed the title Add Protobuf.Text Create Protobuf.Text, create Protobuf.field_presence/2 Apr 30, 2025
@v0idpwn
Copy link
Collaborator Author

v0idpwn commented Apr 30, 2025

@whatyouhide I think it is. I did a rebase, removed stuff that was already merged (mostly related to CI), and tried to leave commits more optimized for a non-squash merge given the large scope of this PR (sorry about that, won't do it again 💀). Overall, I’m not really happy with the code, but I’d rather do further iterations in separate, smaller PRs (the interface can be considered stable).

I did the google bump here because I wanted to ensure we had the latest conformance tests for text as well. I can move it to a separate PR if you wish.

v0idpwn and others added 3 commits April 30, 2025 13:38
Maps should be considered present when used for messages
@whatyouhide
Copy link
Collaborator

@v0idpwn can we split up Protobuf.field_presence/2 into its own PR?

@v0idpwn
Copy link
Collaborator Author

v0idpwn commented May 6, 2025

Will do. Will split the bump as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants