Add protobuf source comment support to generated Rust code #7

Merged
iainmcgin merged 6 commits into anthropics:main from macalinao:feat/proto-source-comments
Mar 24, 2026
@macalinao macalinao commented Mar 22, 2026

Summary

  • Extract comments from protobuf SourceCodeInfo and emit them as Rust doc comments (///) on generated structs, fields, enums, enum variants, oneof enums, and view types
  • Add comments.rs module that translates index-based SourceCodeInfo paths into an FQN-keyed HashMap on CodeGenContext, so codegen call sites look up comments by the proto FQN they already have
  • Add --include_source_info to all protoc invocations (buffa-build, WKT generation, bootstrap generation)
  • Combine proto source comments with existing tag docs (e.g. Field 1: `seconds`) using a blank-line separator
  • Regenerate all checked-in generated code (WKTs + bootstrap descriptor types) with full proto documentation
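
The path-to-FQN translation described in the summary could be sketched roughly as follows. This is a simplified illustration, not the code in comments.rs: `File`, `Message`, `Location`, `resolve`, and `comment_map` are hypothetical stand-ins for the real descriptor types, only messages, nested messages, and fields are handled, and both the `Message.field` key format and the skipping of whitespace-only comments are assumptions. The field numbers (4 for `message_type`, 2 for `field`, 3 for `nested_type`) come from descriptor.proto.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for the real descriptor types.
struct Message {
    name: String,
    fields: Vec<String>,
    nested: Vec<Message>,
}

struct File {
    package: String,
    messages: Vec<Message>,
}

struct Location {
    path: Vec<i32>,
    leading_comments: Option<String>,
}

// Field numbers from descriptor.proto.
const FILE_MESSAGE_TYPE: i32 = 4; // FileDescriptorProto.message_type
const MSG_FIELD: i32 = 2; // DescriptorProto.field
const MSG_NESTED_TYPE: i32 = 3; // DescriptorProto.nested_type

// Index into a slice with a (possibly negative) i32 taken from a path.
fn at<T>(items: &[T], idx: i32) -> Option<&T> {
    if idx < 0 {
        None
    } else {
        items.get(idx as usize)
    }
}

// Walk an index-based path and return the fully-qualified name it points at.
// Paths alternate between a field number and an index into that repeated field.
fn resolve(file: &File, path: &[i32]) -> Option<String> {
    let (&tag, rest) = path.split_first()?;
    if tag != FILE_MESSAGE_TYPE {
        return None; // enums, services, etc. are omitted from this sketch
    }
    let (&idx, mut rest) = rest.split_first()?;
    let mut msg = at(&file.messages, idx)?;
    let mut fqn = if file.package.is_empty() {
        msg.name.clone()
    } else {
        format!("{}.{}", file.package, msg.name)
    };
    loop {
        match rest {
            [] => return Some(fqn),
            [MSG_NESTED_TYPE, i, tail @ ..] => {
                msg = at(&msg.nested, *i)?;
                fqn = format!("{}.{}", fqn, msg.name);
                rest = tail;
            }
            [MSG_FIELD, i] => {
                let field = at(&msg.fields, *i)?;
                // Assumption: fields are keyed as "<message FQN>.<field name>".
                return Some(format!("{}.{}", fqn, field));
            }
            _ => return None,
        }
    }
}

// Build the FQN-keyed comment map; whitespace-only comments are skipped (assumption).
fn comment_map(file: &File, locations: &[Location]) -> HashMap<String, String> {
    let mut map = HashMap::new();
    for loc in locations {
        let text = match &loc.leading_comments {
            Some(t) if !t.trim().is_empty() => t.trim_end(),
            _ => continue,
        };
        if let Some(fqn) = resolve(file, &loc.path) {
            map.insert(fqn, text.to_string());
        }
    }
    map
}
```

With a map like this, a codegen call site that already knows it is emitting `google.protobuf.Timestamp` can do a single `map.get(...)` instead of re-deriving descriptor indices.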

Test plan

  • All 1,244 existing tests pass
  • Clippy clean (zero warnings)
  • 17 new tests: 11 unit tests in comments.rs (edge cases: empty SCI, nested types, whitespace-only, detached comments, empty package) + 6 integration tests in tests/comments.rs (end-to-end through generate())
  • Verified generated WKT types have full proto documentation (e.g. Timestamp struct has 50+ line doc comment from proto source)
  • Verified bootstrap descriptor types have proto comments on fields
  • CI: lint-and-test, check-generated-code, conformance suite

Extract comments from SourceCodeInfo in .proto files and emit them as
Rust doc comments on generated structs, fields, enums, enum variants,
oneof enums, and view types. Comments are keyed by fully-qualified
protobuf name on CodeGenContext for simple lookup at codegen call sites.

- Add comments.rs module with FQN-keyed comment extraction from
  SourceCodeInfo index-based paths
- Add --include_source_info to protoc invocations (buffa-build,
  gen-wkt-types, gen-bootstrap-types)
- Combine proto source comments with existing tag docs using blank line
  separator via doc_attrs_with_tag()
- Regenerate WKT and bootstrap types with full proto documentation
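
As a rough illustration of the blank-line separator, the sketch below joins a proto comment with a tag doc. The names `combine_doc_lines` and `render_doc_comment` are hypothetical and are not the signature of `doc_attrs_with_tag()`; the sketch also assumes comment lines carry the single leading space that protoc leaves after `//`, and it renders plain strings rather than token streams.

```rust
// Hypothetical helper: merge an optional proto comment with the existing tag doc,
// separated by one blank line. Empty or whitespace-only comments are dropped.
fn combine_doc_lines(comment: Option<&str>, tag_doc: &str) -> Vec<String> {
    let mut lines: Vec<String> = Vec::new();
    if let Some(text) = comment {
        let text = text.trim_end();
        if !text.trim().is_empty() {
            for line in text.lines() {
                let line = line.trim_end();
                // Assumption: strip the one leading space left after `//`.
                let line = line.strip_prefix(' ').unwrap_or(line);
                lines.push(line.to_string());
            }
            lines.push(String::new()); // blank-line separator
        }
    }
    lines.push(tag_doc.to_string());
    lines
}

// Render the lines as `///` doc comments, as they would appear in generated code.
fn render_doc_comment(lines: &[String]) -> String {
    lines
        .iter()
        .map(|l| {
            if l.is_empty() {
                "///".to_string()
            } else {
                format!("/// {}", l)
            }
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

When no proto comment is present, only the tag doc is emitted, so output for uncommented fields stays as it was before this change.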
@github-actions

github-actions bot commented Mar 22, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@macalinao
Contributor Author

I have read the CLA Document and I hereby sign the CLA

Proto source comments (especially WKTs like Timestamp, Duration, Any)
contain C++/Java/Python code examples indented with 4 spaces. Rustdoc
interprets these as Rust code blocks and tries to compile them as doc
tests, causing 56 failures.

Detect indented blocks in doc_lines_to_tokens and wrap them in
```text fences so rustdoc renders them as plain text code blocks.
Regenerate all checked-in generated code.
github-actions bot added a commit that referenced this pull request Mar 22, 2026
- Strip the 4-space indent from lines inside ```text fences since the
  fence already denotes a code block
- Close code blocks before trailing blank lines by looking ahead to
  see if the next non-empty line is still indented
- Keep blank lines within multi-part examples (e.g. Win32 example with
  an internal comment block)
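
The fencing behaviour described in these two commits might look something like the sketch below. It is not the actual `doc_lines_to_tokens` implementation: `fence_indented_blocks` and `is_indented` are hypothetical names, the function works on plain strings instead of tokens, and it assumes the input lines have already had the comment's leading space removed.

```rust
// A line counts as example code if it is indented by at least four spaces
// and is not blank.
fn is_indented(line: &str) -> bool {
    line.starts_with("    ") && !line.trim().is_empty()
}

// Wrap runs of 4-space-indented lines in text fences so rustdoc does not
// treat them as Rust doc tests.
fn fence_indented_blocks(lines: &[&str]) -> Vec<String> {
    // Three backticks, built with repeat so this snippet nests safely in Markdown.
    let fence = "`".repeat(3);
    let mut out = Vec::new();
    let mut in_block = false;
    for (i, &line) in lines.iter().enumerate() {
        let blank = line.trim().is_empty();
        if in_block {
            if blank {
                // Look ahead: keep the blank line inside the block only if the
                // next non-empty line is still indented.
                let continues = lines[i + 1..]
                    .iter()
                    .find(|l| !l.trim().is_empty())
                    .map_or(false, |l| is_indented(l));
                if !continues {
                    out.push(fence.clone());
                    in_block = false;
                }
                out.push(String::new());
            } else if is_indented(line) {
                // The fence already marks the code block, so drop the indent.
                out.push(line[4..].to_string());
            } else {
                out.push(fence.clone());
                in_block = false;
                out.push(line.to_string());
            }
        } else if is_indented(line) {
            out.push(format!("{}text", fence));
            in_block = true;
            out.push(line[4..].to_string());
        } else {
            out.push(line.to_string());
        }
    }
    if in_block {
        out.push(fence); // close a block that runs to the end of the comment
    }
    out
}
```

The lookahead is what keeps a multi-part example in one fence while still closing the fence before trailing blank lines, which matches the two fixes listed above.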
@iainmcgin
Collaborator

Similar to #5, I think this just needs the bootstrap/WKT/example code regenerated to pass the CI check: task gen-wkt-types, task gen-logging-example, and task gen-bootstrap-types. I'm partway through reviewing the PR; nice work so far!

@macalinao
Contributor Author

> Similar to #5, I think this just needs the bootstrap/WKT/example code regenerated to pass the CI check: task gen-wkt-types, task gen-logging-example, and task gen-bootstrap-types. I'm partway through reviewing the PR; nice work so far!

Thanks, will do shortly.

macalinao and others added 3 commits March 24, 2026 11:01
# Conflicts:
#	buffa-codegen/src/generated/google.protobuf.compiler.plugin.rs
#	buffa-codegen/src/generated/google.protobuf.descriptor.rs
#	buffa-codegen/src/message.rs
#	buffa-codegen/src/view.rs
#	buffa-types/src/generated/google.protobuf.any.rs
#	buffa-types/src/generated/google.protobuf.duration.rs
#	buffa-types/src/generated/google.protobuf.field_mask.rs
#	buffa-types/src/generated/google.protobuf.struct.rs
#	buffa-types/src/generated/google.protobuf.timestamp.rs
#	buffa-types/src/generated/google.protobuf.wrappers.rs
#	examples/logging/src/gen/context.v1.context.rs
#	examples/logging/src/gen/log.v1.log.rs
The check-generated-code job was regenerating WKT types without proto
source comments because the inline protoc invocation lacked
--include_source_info. The Taskfile and gen-bootstrap-types.sh already
have the flag; this brings CI into alignment.
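
For context, a protoc invocation that keeps source comments might be assembled along these lines. This is only a sketch: `protoc_command` and its parameters are hypothetical, and the use of `--descriptor_set_out` is an assumption about how the descriptor set is produced, not a description of what buffa-build or the CI job actually runs. `--include_source_info` is the protoc flag that stops SourceCodeInfo from being stripped from a descriptor set.

```rust
use std::path::Path;
use std::process::Command;

// Hypothetical helper: build a protoc command that writes a descriptor set
// with SourceCodeInfo (and therefore comments) retained.
fn protoc_command(protoc: &Path, include_dir: &Path, out: &Path, protos: &[&str]) -> Command {
    let mut cmd = Command::new(protoc);
    cmd.arg("--include_source_info") // without this, comments are stripped
        .arg(format!("--descriptor_set_out={}", out.display()))
        .arg(format!("--proto_path={}", include_dir.display()))
        .args(protos);
    cmd
}
```

Building every invocation through one helper like this is one way to avoid the drift described above, where one call site has the flag and another does not.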
@iainmcgin
Collaborator

Thank you so much for this!

@iainmcgin iainmcgin merged commit 950b63a into anthropics:main Mar 24, 2026
7 of 8 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Mar 24, 2026