Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ruby should embed a serialized descriptor into generated code instead of using the DSL #10492

Closed
haberman opened this issue Sep 1, 2022 · 3 comments

Comments

@haberman
Copy link
Member

haberman commented Sep 1, 2022

Currently our generated code for Ruby uses a DSL to specify protos, eg:

Google::Protobuf::DescriptorPool.generated_pool.build do
  add_file("google/protobuf/timestamp.proto", :syntax => :proto3) do
    add_message "google.protobuf.Timestamp" do
      optional :seconds, :int64, 1
      optional :nanos, :int32, 2
    end
  end
end

This provides some readability, but has a high cost: any options that have not been specifically implemented in the DSL are dropped. This includes both built-in options like packed=true (#2294) as well as all custom options(#1198).

To solve these problems, the generated code should embed a serialized descriptor. This will naturally preserve all options, and is easy to implement. eg.

descriptor_data = File.binread(__FILE__).split("\n__END__\n", 2)[1]
Google::Protobuf::DescriptorPool.generated_pool.add_serialized_file(descriptor_data)

__END__

 google/protobuf/descriptor.proto^R^Ogoogle.protobuf"G
^QFileDescriptorSet^R2
^Dfile^X^A ^C(^K2$.google.protobuf.FileDescriptorProto"ì^C
^SFileDescriptorProto^R^L
^Dname^X^A ^A(>-^R^O
^Gpackage^X^B ^A(>------^R^R
<snip>

This does make the generated code less readable though. So we should not do this until we are generating .rbs files: #9495

Then we will have a world where:

  1. _pb.rbs provides a readable description of the protobuf interface
  2. _pb.rb provides a compact, efficient description of the message, optimized for completeness and fast runtime loading.
@haberman haberman added ruby untriaged auto added to all issues by default when created. labels Sep 1, 2022
@haberman
Copy link
Member Author

haberman commented Sep 1, 2022

Blocked by: #9495

@haberman haberman added blocked and removed untriaged auto added to all issues by default when created. labels Sep 1, 2022
@jplaisted
Copy link

jplaisted commented Sep 23, 2022

Is it worth unblocking this by ignoring the rbs piece of this problem?

  • Who is currently reading the generated code and expecting it to be readable?
    • They need to understand how the DSL generates ruby classes
    • Whereas if they just read the proto directly, they need to understand how it maps to ruby
    • In either case, you need to understand some (very simple) mapping.
      • Seems like the proto -> ruby should be the default here. People interacting with the protos need to know proto anyway...

Could RBS be considered orthogonal? Fixing the generated code now, while backsliding on readability (but, per above, is that actually negative?), provides real benefits (like being able to use get proto options, which has been an ask for 6.5 years...).

Readability can then be "fixed" later with RBS.


Edit: Also, fwiw, the DSL also describes / is much closer to the proto itself, not the ruby classes, where as RBS would describe ruby. So the difference between the proto and DSL is super minimal today (imo), meaning I'm not sure there is a readability backslide at all... reading the DSL today is basically reading the proto...

@haberman
Copy link
Member Author

haberman commented May 2, 2023

Fixed in bd52d04

Announced in: https://protobuf.dev/news/2023-04-20/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants