codegen: string_type knob for configurable string field types (#127)#144
Merged
Conversation
|
All contributors have signed the CLA ✍️ ✅ |
d14341d to
aee1adb
Compare
Add a `StringRepr` knob (String default, plus SmolStr / EcoString / CompactString) selectable per proto path through buffa_build's `string_type` / `string_type_in` builder methods, mirroring the existing `use_bytes_type` machinery. Codegen: - StringRepr enum + string_fields config (ordered path-prefix rules, last match wins) + a CodeGenContext::string_repr predicate. - classify_field emits the chosen owned type for singular, optional, and repeated string fields. Map keys/values stay String, matching the bytes path. Encode/size paths are unchanged — &SmolStr/&EcoString/&CompactString deref-coerce to &str. - Decode routes String through the in-place merge_string fast path and other reprs through decode_string_to; clear() resets non-default reprs to default (the small-string types may be immutable, like bytes::Bytes); view→owned builds via From<&str>; EcoString fields get the arbitrary_ecow shim. Default String output is byte-for-byte unchanged (verified: regenerating the checked-in WKT and bootstrap types produces no diff). Tests: codegen integration assertions for each repr + the arbitrary shim, and an end-to-end buffa-test module (string_types.proto compiled with a SmolStr default and CompactString/EcoString overrides) covering binary/JSON/view round-trips, clear, and the field-type pinning across all string shapes.
Three code paths still emitted String for non-default string representations: - Oneof string variants: the variant type (and its JSON-deserialize seed type) ignored string_fields, so a configured SmolStr/EcoString/CompactString was silently dropped back to String. Fixed in oneof.rs and the custom-deserialize path in message.rs; EcoString oneof variants now also get the arbitrary shim. - Text format: read_string()?.into_owned() is String — combined with a non-default repr it failed to compile. The text decoder now converts via From<String> for singular, optional, repeated, and oneof string fields. - proto2 [default = "..."]: the default expression was hard-coded to String::from, breaking the generated Default impl and clear() for bare (required) string fields under a non-default repr. Default String output is unchanged (verified: regenerating the WKT and bootstrap descriptor types produces no diff). Tests: enable generate_text on the string_variant build and add a text round-trip; pin the oneof payload type; add a proto2 fixture exercising [default] + string_type. Also: mark StringRepr #[non_exhaustive], document the map-stays-String contract and the last-match-wins ordering on the builder methods, and note immutability on the SmolStr/EcoString variants.
smol_str 0.3.4 raised its MSRV to 1.89, above buffa's 1.85, so enabling the `smol_str` string-type feature broke `cargo check` on the MSRV toolchain (msrv-check CI). Pin to `>=0.3, <0.3.4` (resolves to 0.3.2, which declares no MSRV) — it keeps the `serde` and `arbitrary` features the codegen relies on. Re-pin Cargo.lock accordingly. Relax the cap when buffa's MSRV reaches 1.89.
aee1adb to
ce14a65
Compare
asacamano
approved these changes
May 22, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Implements #127 — the user-facing
string_typecodegen knob, letting protostringfields map to a small-string-optimized type instead ofString.Stacked on #143 (the runtime support). Review #143 first; this branch contains its commits plus the codegen + build API.
API
StringReprisString(default),SmolStr,EcoString, orCompactString. Rules are ordered, last-match-wins (call the broadstring_typefirst, thenstring_type_inoverrides). The consumer enables the matchingbuffafeature (smol_str/ecow/compact_str). Mirrors the existinguse_bytes_typemachinery.What changes in generated code
Stringkeeps the in-placemerge_stringfast path; other reprs usedecode_string_to.clear()resets non-default reprs toDefault::default()(the SSO types may be immutable). View→owned builds viaFrom<&str>. Text format and proto2[default = "..."]honor the repr.EcoStringfields get anarbitraryshim.&str), andmap<_, string>keys/values (alwaysString). Encode/size paths need no change — the SSO types deref-coerce to&str.Default
Stringoutput is byte-for-byte identical — verified by regenerating the checked-in WKT and bootstrap descriptor types (the latter is proto2 with string fields, oneofs, defaults, and text format): zero diff.Testing
cargo test --workspaceandclippy --workspace --all-targets -D warningspass. An end-to-endbuffa-testmodule compilesstring_types.protowith a SmolStr default + CompactString/EcoString overrides and covers binary / JSON / text / view round-trips,clear(), field-type pinning across all string shapes (incl. oneof payload), and a proto2[default]fixture. Built with and without thearbitraryfeature.