buffa-types: back Any.value with bytes::Bytes for refcount-bump clone#51

Merged
iainmcgin merged 3 commits into anthropics:main from kollektiv:tyen/any-bytes on Apr 17, 2026
@kollektiv
Contributor

google.protobuf.Any is commonly cached and cloned into repeated google.protobuf.Any response fields (RPC servers, fan-out, message buses). With value: Vec<u8> each Any.clone() deep-copies the encoded payload; with bytes::Bytes it is one atomic refcount increment, regardless of payload size.
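
To make the difference concrete, here is a minimal std-only sketch of the fan-out case. It does not use the real buffa or `bytes` APIs: `AnyOwned` and `AnyShared` are hypothetical stand-ins for the message type, and `Arc<[u8]>` plays the role of `bytes::Bytes`, on the assumption that both give a clone that only bumps a reference count.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for `Any` with an owned payload (the old field type).
#[allow(dead_code)] // type_url is carried for shape only
#[derive(Clone)]
struct AnyOwned {
    type_url: String,
    value: Vec<u8>,
}

/// Hypothetical stand-in for `Any` with a shared payload. `Arc<[u8]>` plays
/// the role of `bytes::Bytes` so the sketch compiles with std alone.
#[allow(dead_code)] // type_url is carried for shape only
#[derive(Clone)]
struct AnyShared {
    type_url: String,
    value: Arc<[u8]>,
}

/// Clone one cached message into `n` response slots (the fan-out case).
fn fan_out<T: Clone>(cached: &T, n: usize) -> Vec<T> {
    (0..n).map(|_| cached.clone()).collect()
}

/// Fan out an owned payload; returns how many clones got their own buffer.
fn distinct_buffers_after_owned_fan_out(n: usize) -> usize {
    let cached = AnyOwned {
        type_url: "type.googleapis.com/example.Msg".to_string(),
        value: vec![0u8; 1024],
    };
    let responses = fan_out(&cached, n);
    let distinct = responses
        .iter()
        .filter(|any| any.value.as_ptr() != cached.value.as_ptr())
        .count();
    distinct
}

/// Fan out a shared payload; returns the number of handles to the one buffer.
fn handles_after_shared_fan_out(n: usize) -> usize {
    let cached = AnyShared {
        type_url: "type.googleapis.com/example.Msg".to_string(),
        value: Arc::from(vec![0u8; 1024]),
    };
    let responses = fan_out(&cached, n);
    let handles = Arc::strong_count(&cached.value);
    drop(responses);
    handles
}
```

With the owned payload every clone allocates and copies its own 1 KiB buffer; with the shared payload all `n` clones plus the cached original point at a single allocation.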

The change is driven from gen_wkt_types.rs (bytes_fields = [".google.protobuf.Any.value"]) so task gen-wkt-types and the check-generated-code CI job stay green. any_ext.rs helpers (pack, unpack_*, JSON/textproto paths) are updated; signatures are unchanged.

Benchmark

New buffa-types/benches/any_clone.rs:

| `Any.value.len()` | `Vec<u8>` | `Bytes` | speedup |
| --- | --- | --- | --- |
| 64 B | 32 ns | 34 ns | 1.0x |
| 1 KiB | 43 ns | 34 ns | 1.3x |
| 16 KiB | 189 ns | 35 ns | 5.4x |
| 256 KiB | 5.99 µs | 35 ns | 171x |
| 1000×1KiB into `Vec<Any>` | 143 µs | 54 µs | 2.6x |

Any.clone() is now constant-time in payload size; the ~34 ns floor is the type_url: String clone.

Compatibility

Wire format: fully compatible. Bytes derefs to &[u8]; encode (encode_bytes), decode (Bytes::from(decode_bytes(buf)?)), JSON (base64 of &[u8]), and textproto all produce/accept identical bytes.

Rust API: source-breaking (public field type changed). Common caller patterns:

| Pattern | Result | Fix |
| --- | --- | --- |
| `Any { value: vec![..], .. }` | fails | `vec![..].into()` |
| `Any { value: Vec::new(), .. }` | fails | `Default::default()` |
| `let v: Vec<u8> = any.value` | fails | `any.value.into()` |
| `any.value.as_slice()` | fails | `&any.value[..]` or `.as_ref()` |
| `&any.value` (deref to `&[u8]`) | works | |
| `any.value.len()` / indexing / `is_empty()` | works | |
| `Any::pack` / `unpack_if` / `unpack_unchecked` | works | |
| `any.clone()` | works (and now constant-time) | |

The .into() / Default::default() shims compile against both the current and the changed type, so callers can pre-adopt before this lands: when the field is Vec<u8>, .into() on a Vec<u8> is the identity conversion; when the field is Bytes, the same call resolves to From<Vec<u8>> for Bytes.
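
The pre-adoption claim can be checked without either crate. The sketch below is an illustration under stated assumptions, not the buffa code: `SharedBytes` is a hypothetical std-only stand-in for `bytes::Bytes` (it only mirrors the `From<Vec<u8>>`, `Into<Vec<u8>>`, `Default`, and `Deref<Target = [u8]>` behavior the table relies on), and `AnyBefore` / `AnyAfter` are hypothetical stand-ins for `Any` before and after the change. The point is that the construction expressions are textually identical in both builders.

```rust
use std::ops::Deref;
use std::sync::Arc;

/// Hypothetical std-only stand-in for `bytes::Bytes`.
#[derive(Clone, Debug, PartialEq)]
struct SharedBytes(Arc<[u8]>);

impl Default for SharedBytes {
    fn default() -> Self {
        SharedBytes(Arc::from(Vec::new()))
    }
}

impl From<Vec<u8>> for SharedBytes {
    fn from(v: Vec<u8>) -> Self {
        SharedBytes(Arc::from(v))
    }
}

impl From<SharedBytes> for Vec<u8> {
    fn from(b: SharedBytes) -> Self {
        b.0.to_vec()
    }
}

impl Deref for SharedBytes {
    type Target = [u8];
    fn deref(&self) -> &[u8] {
        &self.0
    }
}

/// Hypothetical `Any` before the change: owned payload.
#[allow(dead_code)] // type_url is carried for shape only
struct AnyBefore {
    type_url: String,
    value: Vec<u8>,
}

/// Hypothetical `Any` after the change: shared payload.
#[allow(dead_code)] // type_url is carried for shape only
struct AnyAfter {
    type_url: String,
    value: SharedBytes,
}

// The field initializers below are identical for both types.
fn build_before() -> AnyBefore {
    AnyBefore { type_url: String::new(), value: vec![1u8, 2, 3].into() }
}

fn build_after() -> AnyAfter {
    AnyAfter { type_url: String::new(), value: vec![1u8, 2, 3].into() }
}

fn empty_before() -> AnyBefore {
    AnyBefore { type_url: String::new(), value: Default::default() }
}

fn empty_after() -> AnyAfter {
    AnyAfter { type_url: String::new(), value: Default::default() }
}

/// Reads through `&[u8]`, so callers pass `&any.value` for either type.
fn checksum(payload: &[u8]) -> u32 {
    payload.iter().map(|&b| u32::from(b)).sum()
}
```

Readers that take `&[u8]`, such as `checksum(&any.value)`, need no change for either field type; only construction and moves out of the field need the shim.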

`Any` is commonly cached and cloned into `repeated google.protobuf.Any`
response fields. With `value: Vec<u8>` each `Any.clone()` is a payload
memcpy; with `bytes::Bytes` it is one atomic refcount bump. Wire encoding
is unchanged (`Bytes` derefs to `&[u8]`).

Change is driven from the regenerator config so `task gen-wkt-types` and
the check-generated-code CI job stay green:

- gen_wkt_types.rs: set `bytes_fields = [".google.protobuf.Any.value"]`.
- buffa-types/Cargo.toml: add direct `bytes` dep (codegen emits
  `::bytes::Bytes` paths).
- generated/google.protobuf.any.rs: regenerated.
- any_ext.rs: `pack()` uses `encode_to_bytes()`; `unpack_*()` use
  `.as_ref()`; JSON-fallback and textproto-merge sites convert via
  `.into()`.
- tests/wkt_roundtrip.rs + any_ext.rs test fixtures: `vec![..].into()`.

Adds `buffa-types/benches/any_clone.rs` (criterion):

| payload | Vec<u8> | Bytes | speedup |
| --- | --- | --- | --- |
| 64 B | 32 ns | 34 ns | 1.0x |
| 1 KiB | 43 ns | 34 ns | 1.3x |
| 16 KiB | 189 ns | 35 ns | 5.4x |
| 256 KiB | 5.99 us | 35 ns | 171x |
| 1000x1KiB into Vec<Any> | 143 us | 54 us | 2.6x |

Clone is now constant-time in payload size (type_url String clone is the
floor at ~34 ns).

This is a source-breaking change to the public field type. Callers that
construct or move `Any.value` directly need a one-line `.into()` /
`Default::default()`; reads via `&[u8]` deref are unchanged.
@github-actions

github-actions bot commented Apr 17, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@kollektiv kollektiv closed this Apr 17, 2026
@github-actions github-actions bot locked and limited conversation to collaborators Apr 17, 2026
@kollektiv kollektiv reopened this Apr 17, 2026
- benches/any_clone.rs: use .to_vec() instead of .iter().cloned().collect()
- tests/wkt_roundtrip.rs: drop useless .into() on BytesValue.value (still Vec<u8>)
@kollektiv
Contributor Author

I have read the CLA Document and I hereby sign the CLA

@kollektiv
Contributor Author

recheck

- CHANGELOG: document the Vec<u8> -> Bytes breaking change.
- Test: assert that cloning Any shares the payload buffer via pointer
  equality, pinning down the headline invariant of this change so a
  future refactor that reintroduces a copy can't silently regress it.
@iainmcgin
Collaborator

[claude code] Reviewed and pushed two follow-ups directly to your branch (maintainerCanModify):

  • CHANGELOG entry noting the Any.value: Vec<u8> -> Bytes break. Since 0.x crates treat any source-break as a minor bump, this likely means the next workspace release cuts 0.4.0. Flagging for release-management when you merge.
  • clone_shares_payload_buffer unit test in any_ext.rs. Asserts as_ptr() equality after Any::clone(), pinning down the headline invariant of this change so a future refactor that accidentally reintroduces a copy (e.g. swapping .into() for Bytes::copy_from_slice) fails loudly instead of silently regressing the perf win. The bench covers speed; this covers "how it's fast".
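
For readers who have not seen this kind of invariant test, the idea can be sketched with std alone. This is not the `clone_shares_payload_buffer` test from `any_ext.rs`: `AnySketch` is a hypothetical stand-in for `Any`, `Arc<[u8]>` stands in for `bytes::Bytes`, and `Arc::from(&[u8])` stands in for a copying constructor such as `Bytes::copy_from_slice`.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for `Any`; `Arc<[u8]>` plays the role of `Bytes`.
#[derive(Clone)]
struct AnySketch {
    type_url: String,
    value: Arc<[u8]>,
}

/// True when a plain `clone()` leaves both messages pointing at one buffer.
fn clone_shares_payload(payload: Vec<u8>) -> bool {
    let original = AnySketch {
        type_url: "type.googleapis.com/example.Msg".to_string(),
        value: Arc::from(payload),
    };
    let copy = original.clone();
    original.value.as_ptr() == copy.value.as_ptr()
}

/// The regression the test guards against: rebuilding the payload from a
/// slice allocates a fresh buffer, so the pointers no longer match.
fn recopy_shares_payload(payload: Vec<u8>) -> bool {
    let original = AnySketch {
        type_url: "type.googleapis.com/example.Msg".to_string(),
        value: Arc::from(payload),
    };
    let copy = AnySketch {
        type_url: original.type_url.clone(),
        value: Arc::from(&original.value[..]),
    };
    original.value.as_ptr() == copy.value.as_ptr()
}
```

Comparing data pointers rather than contents is what makes the check sensitive to an accidental copy: both variants produce byte-for-byte equal payloads, but only the first shares the allocation.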

Two perf follow-ups I spotted but didn't fix in this PR (they're out of scope / larger):

Minor nits not worth blocking on (feel free to address in this PR or skip):

  • bench_clone_into_vec uses a fixed "1024B" group label for both rows — harmless id collision.
  • Consider a brief doc note on Any in any_ext.rs showing Bytes-first construction patterns now that .into() is ubiquitous at call sites.

Otherwise LGTM — clean use of the existing bytes_fields knob, correct clear() reassignment, correct last-wins merge semantics, and encode_to_bytes is the right constructor hook. Nice find.

@kollektiv kollektiv marked this pull request as ready for review April 17, 2026 17:20
@anthropics anthropics unlocked this conversation Apr 17, 2026
@kollektiv
Contributor Author

recheck

1 similar comment
@iainmcgin
Collaborator

recheck

github-actions bot added a commit that referenced this pull request Apr 17, 2026
Collaborator

@iainmcgin iainmcgin left a comment

[claude code] LGTM — clean use of the existing bytes_fields knob, correctness verified, no_std clean on host + bare-metal ARM, all CI green.

@iainmcgin iainmcgin merged commit 78373e3 into anthropics:main Apr 17, 2026
7 of 9 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Apr 17, 2026