descriptor: add DescriptorPool runtime descriptor pool #130

Merged
iainmcgin merged 5 commits into main from reflect/03-pool
May 20, 2026

@iainmcgin
Collaborator

What

DescriptorPool — the runtime descriptor pool that builds the linked descriptor types (PR #129) from a FileDescriptorSet. Behind the new reflect feature (default-off).

Construction is three-pass:

  1. Register — walk every file, recording the FQN of every message/enum (including nested) and assigning pool indices. Forward references and cross-file references resolve in pass 2.
  2. Link — walk again, building the linked MessageDescriptor for each: resolving type_name strings to indices, classifying fields as singular/list/map, resolving editions features down the file → message → field chain, validating field numbers and the u16 field-count cap.
  3. Link services — services reference message types by name for input_type/output_type, so they link after the type passes.
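The register-then-link split can be sketched as follows. This is a minimal illustration, not the code in pool.rs: `RawMessage`, `LinkedMessage`, `LinkError`, and `link` are hypothetical stand-ins, and the third pass (services) is omitted because it resolves names the same way pass 2 does.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified raw message: a fully-qualified name plus the
/// fully-qualified type names that its fields reference.
pub struct RawMessage {
    pub full_name: String,
    pub field_type_names: Vec<String>,
}

/// Hypothetical linked form: type references resolved to pool indices.
#[derive(Debug, PartialEq)]
pub struct LinkedMessage {
    pub full_name: String,
    pub field_types: Vec<usize>,
}

#[derive(Debug, PartialEq)]
pub enum LinkError {
    DuplicateName(String),
    UnresolvedType(String),
}

pub fn link(raw: &[RawMessage]) -> Result<Vec<LinkedMessage>, LinkError> {
    // Pass 1 (register): give every name an index before resolving anything,
    // so forward references are already known when pass 2 looks them up.
    let mut index_by_name: HashMap<&str, usize> = HashMap::new();
    for (i, m) in raw.iter().enumerate() {
        if index_by_name.insert(m.full_name.as_str(), i).is_some() {
            return Err(LinkError::DuplicateName(m.full_name.clone()));
        }
    }

    // Pass 2 (link): walk in the same order and turn names into indices.
    // A dangling name is reported as an error instead of panicking.
    let mut linked = Vec::with_capacity(raw.len());
    for m in raw {
        let mut field_types = Vec::with_capacity(m.field_type_names.len());
        for name in &m.field_type_names {
            let idx = *index_by_name
                .get(name.as_str())
                .ok_or_else(|| LinkError::UnresolvedType(name.clone()))?;
            field_types.push(idx);
        }
        linked.push(LinkedMessage {
            full_name: m.full_name.clone(),
            field_types,
        });
    }
    Ok(linked)
}
```

In this sketch a message at index 0 may reference one at index 1, because all names are registered before any reference is resolved.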

The pool retains the original FileDescriptorProtos after linking (file_by_name() accessor) so gRPC server reflection can serve the raw bytes.

Untrusted input

DescriptorPool::decode treats its input as untrusted — it's the entry point for consumers loading descriptors from a schema registry, gRPC server reflection peer, or on-disk policy bundle. Malformed input returns PoolError, never panics: out-of-range field numbers, negative extension ranges, dangling type names, and unparseable wire bytes are all handled. The pass-1/pass-2 walk-order invariant is asserted in release builds because a desync silently corrupts every cross-reference in the pool.
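As an illustration of the "return an error, never panic" rule, here is a sketch of two of the checks listed above. The `ValidationError` type and both function names are hypothetical and do not reflect the actual `PoolError` variants. The limits themselves come from the protobuf spec: field numbers run from 1 to 2^29 - 1, and an extension range's `end` is exclusive.

```rust
/// Largest field number the protobuf spec allows (2^29 - 1).
const MAX_FIELD_NUMBER: i32 = 536_870_911;

/// Hypothetical error type for this sketch; not buffa's PoolError.
#[derive(Debug, PartialEq)]
pub enum ValidationError {
    FieldNumberOutOfRange(i32),
    BadExtensionRange { start: i32, end: i32 },
}

/// Descriptor field numbers arrive as i32, so zero and negative values are
/// representable on the wire and must be rejected explicitly.
pub fn check_field_number(n: i32) -> Result<u32, ValidationError> {
    if (1..=MAX_FIELD_NUMBER).contains(&n) {
        // The range check guarantees the conversion succeeds; map_err keeps
        // the function panic-free even if the constant were changed.
        u32::try_from(n).map_err(|_| ValidationError::FieldNumberOutOfRange(n))
    } else {
        Err(ValidationError::FieldNumberOutOfRange(n))
    }
}

/// Extension ranges are [start, end): start is inclusive, end exclusive.
pub fn check_extension_range(start: i32, end: i32) -> Result<(u32, u32), ValidationError> {
    let bad = || ValidationError::BadExtensionRange { start, end };
    if start < 1 || end <= start || end > MAX_FIELD_NUMBER + 1 {
        return Err(bad());
    }
    let s = u32::try_from(start).map_err(|_| bad())?;
    let e = u32::try_from(end).map_err(|_| bad())?;
    Ok((s, e))
}
```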

Tests

tests/pool_e2e.rs runs against a protoc-compiled FileDescriptorSet, exercising proto3 presence, editions feature resolution, packed encoding, map entries, oneofs (including synthetic), service descriptors, idempotent re-add, and wrong-kind/missing lookups. The editions_feature_resolution test specifically validates that edition 2023 packs repeated fields by default and that field-level overrides work. This is the correctness gap the conformance suite would catch on its first run if the runtime feature resolution diverged from codegen's.
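The property that test checks can be sketched with a toy resolver. All names here (`Encoding`, `FeatureOverrides`, `ResolvedFeatures`, `resolve_field`) are hypothetical and model only one feature. The real shared resolver lives in buffa-descriptor/src/features.rs and covers more than this.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Encoding {
    Packed,
    Expanded,
}

/// Hypothetical explicit overrides at one level (file, message, or field).
/// `None` means "inherit from the enclosing scope".
#[derive(Clone, Copy, Debug, Default)]
pub struct FeatureOverrides {
    pub repeated_field_encoding: Option<Encoding>,
}

/// Hypothetical fully-resolved feature set: every feature has a value.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct ResolvedFeatures {
    pub repeated_field_encoding: Encoding,
}

impl ResolvedFeatures {
    /// Edition 2023 defaults repeated fields to packed encoding.
    pub fn edition_2023_defaults() -> Self {
        ResolvedFeatures { repeated_field_encoding: Encoding::Packed }
    }

    /// Overlay one level of overrides; unset features keep the parent value.
    pub fn merge(self, overrides: &FeatureOverrides) -> Self {
        ResolvedFeatures {
            repeated_field_encoding: overrides
                .repeated_field_encoding
                .unwrap_or(self.repeated_field_encoding),
        }
    }
}

/// Resolve down the file -> message -> field chain; the innermost explicit
/// override wins.
pub fn resolve_field(
    file: &FeatureOverrides,
    message: &FeatureOverrides,
    field: &FeatureOverrides,
) -> ResolvedFeatures {
    ResolvedFeatures::edition_2023_defaults()
        .merge(file)
        .merge(message)
        .merge(field)
}
```

With no overrides at any level the sketch resolves to `Packed`; setting `Expanded` on the field alone flips only that field.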

Net change

+1288/0 (7 files). ~290 lines are the test file and protoc-compiled fixtures. pool.rs is a single cohesive linker; splitting it would create a non-compiling intermediate state.

iainmcgin added 3 commits May 20, 2026 04:15
Adds a #[doc(hidden)] buffa::json_helpers::wkt module with the shared
formatting and parsing primitives for the well-known types' JSON forms:
Timestamp RFC 3339 (fmt_timestamp, parse_timestamp), Duration decimal
seconds (fmt_duration, parse_duration, validate_duration), FieldMask
camelCase (snake_to_camel, camel_to_snake, field_mask_path_round_trips),
and the Howard Hinnant civil-calendar helpers (days_to_date,
date_to_days).
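For reference, the civil-calendar conversion can be sketched as below. This follows Howard Hinnant's published days-from-civil and civil-from-days algorithms for the proleptic Gregorian calendar. The function names match the ones listed above, but the signatures are assumptions and this is not the module's actual code.

```rust
/// Days since 1970-01-01 -> (year, month, day).
/// Signature is an assumption for this sketch.
pub fn days_to_date(days: i64) -> (i64, u32, u32) {
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year eras, floor division
    let doe = z.rem_euclid(146_097); // day of era, [0, 146096]
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // [0, 399]
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // [0, 365], year starts in March
    let mp = (5 * doy + 2) / 153; // March-based month, [0, 11]
    let d = doy - (153 * mp + 2) / 5 + 1; // [1, 31]
    let m = if mp < 10 { mp + 3 } else { mp - 9 }; // [1, 12]
    let y = yoe + era * 400 + if m <= 2 { 1 } else { 0 };
    (y, m as u32, d as u32)
}

/// (year, month, day) -> days since 1970-01-01.
/// Assumes month is in 1..=12 and day is valid for that month.
pub fn date_to_days(year: i64, month: u32, day: u32) -> i64 {
    let (m, d) = (i64::from(month), i64::from(day));
    let y = if m <= 2 { year - 1 } else { year }; // treat Jan/Feb as end of prior year
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400); // [0, 399]
    let mp = if m > 2 { m - 3 } else { m + 9 }; // March-based month
    let doy = (153 * mp + 2) / 5 + d - 1; // [0, 365]
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // [0, 146096]
    era * 146_097 + doe - 719_468
}
```

Starting the internal year in March puts the leap day at the end of the year, which is what lets both directions work without a month-length table.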

buffa-types' typed serde impls (Timestamp, Duration, FieldMask) now call
into this module rather than carrying their own implementations. The
adapters preserve the Option-returning private API the existing test
suite (~50 call sites) targets, so no test churn.

Sharing the implementation is load-bearing: the conformance suite
exercises the typed JSON path today, and a forthcoming reflective JSON
codec on DynamicMessage will exercise the same forms. A divergence
between the two (one accepting a fractional-second precision the other
rejects, or two civil-calendar implementations disagreeing on a
leap-year edge) would be a user-visible inconsistency. With the
implementation shared, drift is impossible.

The module is #[doc(hidden)] because the supported entry points are the
typed serde impls and (forthcoming) DynamicMessage's JSON codec — these
helpers operate on raw scalars and have no semver contract.

Two changes that lay the foundation for runtime reflection.

Linked descriptor types (buffa-descriptor/src/desc.rs):

MessageDescriptor, FieldDescriptor, FieldKind, SingularKind,
OneofDescriptor, EnumDescriptor, EnumValueDescriptor, ServiceDescriptor,
MethodDescriptor — the processed, feature-resolved form of the raw
FileDescriptorProto tree. Where the raw protos use string type_name
references and unresolved FeatureSet options, these types use pool
indices (MessageIndex, EnumIndex, ServiceIndex) and pre-resolved
edition features (presence, packed, delimited, enum openness).

FieldKind flattens protobuf's orthogonal type x label x map-entry axes
into a single Copy discriminant that maps 1:1 to runtime
representation, the same approach protobuf-es takes with its fieldKind
union.
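A rough picture of the flattening, using a deliberately reduced and hypothetical shape. The real FieldKind and SingularKind in desc.rs have their own variants and payloads, which are not reproduced here; `classify` is likewise an illustrative helper.

```rust
/// Hypothetical subset of element kinds; message and enum variants carry a
/// pool index rather than a type-name string.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum SingularKind {
    Int32,
    String,
    Message(u32),
    Enum(u32),
}

/// One Copy discriminant instead of separate type, label, and map-entry axes.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum FieldKind {
    Singular(SingularKind),
    List(SingularKind),
    Map { key: SingularKind, value: SingularKind },
}

/// Classify a field from its raw axes. `map_entry` holds the key and value
/// kinds when the field's message type is a synthesized map entry.
pub fn classify(
    element: SingularKind,
    repeated: bool,
    map_entry: Option<(SingularKind, SingularKind)>,
) -> FieldKind {
    match (repeated, map_entry) {
        // On the wire a map is a repeated field of map-entry messages.
        (true, Some((key, value))) => FieldKind::Map { key, value },
        (true, None) => FieldKind::List(element),
        (false, _) => FieldKind::Singular(element),
    }
}
```

A consumer can then match once on `FieldKind` to pick the runtime representation, rather than checking the label and map-entry flag separately at every use site.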

Fields are private with #[inline] accessor methods, matching the buffa
convention for hand-written API types (SizeCache, UnknownFields, Tag).
Construction is gated to DescriptorPool (forthcoming) — downstream
test fixtures go through DescriptorPool::decode from FDS bytes, so
they don't skip the feature-resolution and validation passes.

Field indices within a message are u16, capping fields-per-message at
65,535. Field numbers stay u32 per the protobuf spec.
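The cap can be enforced with a checked conversion rather than a truncating cast. A minimal sketch, with a hypothetical `TooManyFields` error and `check_field_count` helper standing in for whatever the pool actually reports:

```rust
/// Hypothetical error for this sketch.
#[derive(Debug, PartialEq)]
pub struct TooManyFields {
    pub count: usize,
}

/// Returns the field count as u16, or an error if the message declares more
/// fields than a u16 can represent. A plain `as u16` cast would wrap silently.
pub fn check_field_count(count: usize) -> Result<u16, TooManyFields> {
    u16::try_from(count).map_err(|_| TooManyFields { count })
}
```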

Feature resolution dedup:

The shared core (file/message/enum/oneof feature resolution, edition
defaults, FeatureSet merge) moves from buffa-codegen/src/features.rs to
buffa-descriptor/src/features.rs and is re-exported from buffa-codegen.
buffa-codegen retains the codegen-only resolve_field, which overlays
the referenced enum's enum_type from CodeGenContext::is_enum_closed —
a lookup built during codegen and not available to a runtime pool.

A divergence between codegen and the runtime pool would mean generated
code and reflective code disagree on packed encoding, presence, or
enum openness; sharing the implementation makes that impossible.

DescriptorPool builds the linked descriptor types from a
FileDescriptorSet. Construction is three-pass:

1. Register: walk every file, recording the fully-qualified name of
   every message and enum (including nested ones) and assigning each a
   pool index. Forward references and cross-file references resolve in
   pass 2.
2. Link: walk again, building the linked MessageDescriptor for each
   message — resolving type_name strings to indices, classifying fields
   as singular/list/map, resolving editions features down the
   file -> message -> field chain, validating field numbers and the
   u16 field-count cap.
3. Link services: services reference message types by name for their
   input/output, so they link after the type passes.

The pool retains the original FileDescriptorProtos after linking
(file_by_name() accessor) so gRPC server reflection can serve the raw
bytes.

DescriptorPool::decode treats its input as untrusted — it's the entry
point for consumers loading descriptors from a schema registry, gRPC
server reflection peer, or on-disk policy bundle. Malformed input
returns PoolError, never panics: out-of-range field numbers, negative
extension ranges, dangling type names, and unparseable wire bytes are
all handled. The pass-1/pass-2 walk-order invariant is asserted in
release builds because a desync silently corrupts every cross-reference
in the pool.

Behind the new `reflect` feature (default-off). Tests in
tests/pool_e2e.rs against a protoc-compiled FileDescriptorSet
exercising proto3 presence, editions feature resolution, packed
encoding, map entries, oneofs (including synthetic), service
descriptors, idempotent re-add, and wrong-kind/missing lookups.
@github-actions

github-actions Bot commented May 20, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

Base automatically changed from reflect/02-desc to main May 20, 2026 20:24
iainmcgin added 2 commits May 20, 2026 20:26
# Conflicts:
#	buffa-descriptor/src/lib.rs

CI's rustfmt (current stable) flags the trailing blank line after the
last test; the local 1.77 stable does not.
@iainmcgin iainmcgin marked this pull request as ready for review May 20, 2026 20:54
@iainmcgin iainmcgin merged commit a0f977b into main May 20, 2026
7 checks passed
@iainmcgin iainmcgin deleted the reflect/03-pool branch May 20, 2026 20:55
@github-actions github-actions Bot locked and limited conversation to collaborators May 20, 2026
2 participants