feat: introduce MaxJsonSize trait for JSON encode buffer sizing#126
Merged
Conversation
Same pattern as #122 — bump remaining request/response topic pairs from AtMostOnce to AtLeastOnce so a transient broker disconnect between SUBSCRIBE and the broker's accepted/rejected reply doesn't silently drop the response, and so QoS-1-spooled publishes survive a brief reconnect window instead of being dropped. * jobs/stream.rs: bump JobAgent::subscribe (notify-next + describe-accepted), the describe publish, report_progress publish, and publish_and_wait's update/accepted+rejected subscriptions. * commands/stream.rs: bump CommandAgent::subscribe (executions/+/request) and report_in_progress publish. * defender_metrics/mod.rs: bump publish_and_subscribe's accepted+rejected subs and the metric publish. Transfer's data_interface stays at QoS 0 — those are high-volume OTA data blocks with their own retry semantics, same as before #122. Discovered during factbird-edge factory_reset end-to-end testing: job-manager's report_progress publish ~5–10s after a deployment-driven cleanup returned `JobError::Mqtt` because the broker had briefly dropped the connection during the cleanup, and QoS 0 publishes were silently lost.
724feb5 to
83b4700
Compare
`Update::max_size()` (jobs and commands) and the provisioning register
buffer were hardcoded values that overflow on non-trivial payloads —
`serde_json_core::to_slice` returns `PayloadError::BufferSize` and the
publish surfaces as `JobError::Mqtt` / `Error::BufferSize`, so what's
really a buffer-size bug looks like an MQTT/broker problem.
Observed on factbird-edge factory_reset: the post-cleanup `IN_PROGRESS`
update carries a 3-participant report (~700 bytes JSON) and overflows
the hardcoded 512-byte buffer in `jobs::Update::max_size()` every time.
Bumping the constants high penalises resource-constrained mqttrust
users who pay for an embedded buffer they don't need. Make the size
caller-known via the type system instead:
* New `MaxJsonSize: Serialize` super-trait in `mqtt/mod.rs`. Implementer
declares `const MAX_JSON_SIZE: usize` — the worst-case JSON-encoded
size of `Self`. `Serialize` is a super-trait because a max-size
hint only makes sense for a JSON-encodable type; this gives use sites
a single tighter bound (`S: MaxJsonSize`) instead of the compound
`S: Serialize + MaxJsonSize`.
* `jobs::Update<'a, S>` bound changes `S: Serialize` → `S: MaxJsonSize`.
`ToPayload::max_size` returns `S::MAX_JSON_SIZE + framing`. Same for
`report_progress`, `succeed_job`, `fail_job`, `publish_and_wait`.
* `commands::Update<'a, R>` bound changes `R: Serialize` →
`R: MaxJsonSize`. Same for `succeed`, `publish_and_wait`. Replaces
the hardcoded 2048 with `R::MAX_JSON_SIZE + framing`.
* `provisioning::FleetProvisioner::provision*` parameters bound changes
`impl Serialize` → `impl MaxJsonSize`. The DeferredPayload buffer is
sized from `P::MAX_JSON_SIZE + framing` instead of a hardcoded 1024.
* `transfer::status_details::StatusDetailsExt` gains
`const MAX_EXTRA_JSON_SIZE: usize`. `StatusDetails` and
`CombinedStatusDetails<E>` impl `MaxJsonSize` — the latter delegates
to `E::MAX_EXTRA_JSON_SIZE` so users only declare the size of their
own contributed fields.
* Built-in impls: `() => 4` (serializes to "null"). Test types and
`RejectDetails` get explicit impls.
Breaking change for callers who pass a custom `status_details` /
command `result` / provisioning `parameters` type — they need to add
`impl MaxJsonSize for MyType { const MAX_JSON_SIZE: usize = N; }`.
83b4700 to
b14526c
Compare
KennethKnudsen97
approved these changes
May 6, 2026
The bound bump on `Update<S>` / `provision*` ripples through the integration tests, which use their own custom Serialize types. * `tests/common/file_handler.rs`: TestStatusDetails gains MAX_EXTRA_JSON_SIZE (firmware_version is short). * `tests/provisioning.rs`: Parameters<'a> gets MaxJsonSize, and the FleetProvisioner::provision[_cbor] turbofish gains a third inferred generic for the new P type parameter. * commands::ResultMap moves its MaxJsonSize impl from a private test-only block to data_types.rs so integration tests can use it.
`Update<'_, ResultMap>::max_size()` is now `R::MAX_JSON_SIZE + UPDATE_FRAMING_OVERHEAD` = 4096 + 1280 = 5376, which exceeded the test client's 4096-byte TX ring buffer. mqttrust's `grant_async` future stays pending forever when the requested size exceeds buffer capacity, so the publish would silently retry 3× (5+10+15s) before returning `Error::Timeout` — surfaced to the test as `succeed: Mqtt`.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Update::max_size()(jobs and commands) and the provisioning register buffer were hardcoded values that overflow on non-trivial payloads —serde_json_core::to_slicereturnsPayloadError::BufferSizeand the publish surfaces asJobError::Mqtt/Error::BufferSize. What's really a buffer-size bug looks like an MQTT/broker problem.Observed on factbird-edge
factory_reset: the post-cleanupIN_PROGRESSupdate carries a 3-participant report (~700 bytes JSON) and overflows the hardcoded 512-byte buffer every time.Bumping the constants high penalises
mqttrust/no_std users who pay for an embedded buffer they don't need. Make the size caller-known via the type system instead.What's new
Serializeis a super-trait — aMAX_JSON_SIZEonly makes sense for a JSON-encodable type, and use sites get a single tighter bound (S: MaxJsonSize) instead of the compoundS: Serialize + MaxJsonSize.Sites updated
jobs::Update<'a, S>(andreport_progress/succeed_job/fail_job/publish_and_wait)S: Serialize, hardcoded 512S: MaxJsonSize,S::MAX_JSON_SIZE + framingcommands::Update<'a, R>(andsucceed/publish_and_wait)R: Serialize, hardcoded 2048R: MaxJsonSize,R::MAX_JSON_SIZE + framingprovisioning::FleetProvisioner::provision*parametersimpl Serialize, hardcoded 1024impl MaxJsonSize,P::MAX_JSON_SIZE + framingtransfer::status_details::StatusDetailsExtconst MAX_EXTRA_JSON_SIZE: usizeStatusDetails,CombinedStatusDetails<E>MaxJsonSize)MaxJsonSize; combined delegates toE::MAX_EXTRA_JSON_SIZEBreaking change
Callers passing a custom
status_details(jobs),result(commands), or provisioningparameterstype need to add an impl:StatusDetailsExtimpls likewise need to declareMAX_EXTRA_JSON_SIZE.Test plan
cargo check(default features)cargo check --no-default-features --features std,mqtt_greengrasscargo test --no-default-features --features std,mqtt_mqttrust,commands_cbor,metric_cbor --lib— 88/88 still passingMAX_JSON_SIZE)Follow-ups (not in this PR)
JobError::Mqttmapping injobs/stream.rsdiscards the underlying error type. Surfacing the typed error would prevent future buffer-overflow / serialization issues from being misdiagnosed as broker problems.shadows::ShadowRoot::MAX_PAYLOAD_SIZEfollows a similar per-trait const pattern. Could optionally be unified underMaxJsonSizelater, but that's a wider refactor.