refactor(container-runtime): Stop estimating batch size when accumulating a local batch #24309
markfields merged 10 commits into microsoft:main
Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (2)
packages/runtime/container-runtime/src/opLifecycle/outbox.ts:539
- Please clarify the rationale behind not using socketSize in this check or update the logic to consistently rely on socketSize, ensuring the error handling is clear.
//* TODO: Why isn't this socketSize?
packages/runtime/container-runtime/src/test/opLifecycle/outbox.spec.ts:464
- Implement tests for batch size estimation to validate the new behavior, ensuring that any edge cases arising from the removal of the estimation logic are properly covered.
//* TODO: Implement tests like these
/* reentrant */ false,
),
true,
batchManager.push(
Now instead of asserting, it would fail if it somehow hit an error while pushing?
Now push doesn't do anything but put it in an array, so it wouldn't really ever throw. The change is just that there's no return value anymore (nothing to say - it always adds it)
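The behavior described in this reply can be sketched roughly as follows. This is a simplified stand-in, not the actual container-runtime code: `SketchBatchManager` and `SketchMessage` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical, simplified message shape; not the real LocalBatchMessage type.
interface SketchMessage {
	contents: string;
	reentrant: boolean;
}

// Hypothetical stand-in for a batch manager whose push() only appends.
class SketchBatchManager {
	private readonly pending: SketchMessage[] = [];

	// No size estimate and no return value: the message is always accepted.
	public push(contents: string, reentrant: boolean): void {
		this.pending.push({ contents, reentrant });
	}

	public get length(): number {
		return this.pending.length;
	}
}

const sketchManager = new SketchBatchManager();
sketchManager.push('{"type":"op"}', /* reentrant */ false);
sketchManager.push('{"type":"op"}', /* reentrant */ false);
```

Since `push` returns nothing, a test can only observe its effect through state such as `length`, which is why the old assertion on the return value no longer applies.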
assert.equal(batchManager.contentSizeInBytes, smallMessageSize * batchManager.length);
batchManager.push(smallMessage(), /* reentrant */ false);
batchManager.push(smallMessage(), /* reentrant */ false);
batchManager.push(smallMessage(), /* reentrant */ false);
Isn't this test now doing the same thing as `Batch metadata is set correctly [${includeBatchId ? "with" : "without"} batchId]`?
I don't see where you are checking the batch content size in this test.
Good catch. Removed this and added a similar test to cover where this code under test moved to: `localBatchToOutboundBatch`
This reverts PR #24309, which has been causing tests to fail on the CI pipeline ([example run](https://dev.azure.com/fluidframework/internal/_build/results?buildId=332526&view=logs&j=f193e17a-f43e-518b-48dd-0a836d9f111c&t=8313d791-ee78-5f41-3fbb-b9b81d25bc65&l=5701)).
…)" This reverts commit 41c769c.
… accumulating a local batch (#24363)

_Re-do of #24309 with less strict assertion conditions in `messageSize.spec.ts`. See df114eac27269bd2c7b90c1e8125034c68e96c98_

## PR Description (from #24309)

We are preparing to update `LocalBatchMessage` to hold the original runtime op, not the serialized op. So it will no longer be practical to estimate the content size of each op (we don't want to do an extra JSON stringify just for this purpose).

Here's a comparison of the places we would fail before / after this change:

Before:

* If no compression, then if est socket size is too big, throw when submitting. (in `Outbox.addMessageToBatchManager`)
  * "If no compression" because `hardLimit` would be infinity if compression was enabled
* If yes compression (and it happens), then throw if it was insufficient (in `Outbox.virtualizeBatch`)

Now:

* If yes compression (and it happens), then throw if it was insufficient (in `Outbox.virtualizeBatch`) -- _unchanged from before_
* Otherwise (no compression happened, or it did and appeared sufficient), if est socket size is too big, throw before send (in `Outbox.sendBatch`)
  * This case used to be possible even after compression if there were soooo many empty placeholders that it pushed it over the limit. The log I'm deleting from there ("LargeBatch") is hit for some internal apps on old versions, when we did empty placeholders. The log has not been hit on any recent version.
  * But now, if compression is sufficient or unnecessary, this will surely pass as well. So this is just guarding the case where compression was disabled, so it's quite equivalent to the original check during submit.
Description
We are preparing to update `LocalBatchMessage` to hold the original runtime op, not the serialized op. So it will no longer be practical to estimate the content size of each op (we don't want to do an extra JSON stringify just for this purpose).

The downside is that if compression is disabled and someone submits an op that will likely push the batch over the threshold, they won't get that error under that submit callstack like they used to. Instead it will only surface when the container closes during flush. If this is a problem at some point, we may be able to keep a lambda that will create the callstack from the original closure, like this.
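One possible reading of the lambda idea is sketched below. It is an assumption, not the example the description originally linked to: it captures an `Error` eagerly at submit time and appends that stack to the error created later during flush. `PendingOpSketch` and `submitOpSketch` are hypothetical names.

```typescript
// Hypothetical record kept alongside each pending op.
interface PendingOpSketch {
	contents: string;
	// Builds an error later that still points back at the submit site.
	createError: (reason: string) => Error;
}

function submitOpSketch(contents: string): PendingOpSketch {
	// Capture the stack while still under the submitter's callstack.
	const submitSite = new Error("op submitted here");
	return {
		contents,
		createError: (reason: string): Error => {
			const error = new Error(reason);
			// `stack` is non-standard but available in V8 and other major engines.
			error.stack = `${error.stack ?? ""}\nSubmitted at:\n${submitSite.stack ?? ""}`;
			return error;
		},
	};
}

// Later, during flush, the stored lambda produces an error with both stacks.
const pendingOp = submitOpSketch('{"type":"op"}');
const flushError = pendingOp.createError("BatchTooLarge");
```

The tradeoff in this sketch is that a stack is captured for every submitted op, even though almost none of them will ever fail, so it may only be worth doing when compression is disabled.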
Reviewer Guidance
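Here's a comparison of the places we would fail before / after this change:

Before:

* If no compression, then if est socket size is too big, throw when submitting (in `Outbox.addMessageToBatchManager`)
  * "If no compression" because `hardLimit` would be infinity if compression was enabled
* If yes compression (and it happens), then throw if it was insufficient (in `Outbox.virtualizeBatch`)

Now:

* If yes compression (and it happens), then throw if it was insufficient (in `Outbox.virtualizeBatch`) -- unchanged from before
* Otherwise (no compression happened, or it did and appeared sufficient), if est socket size is too big, throw before send (in `Outbox.sendBatch`)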
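The check that now happens before send (the `Outbox.sendBatch` case) can be illustrated with a minimal sketch. Everything here is assumed for illustration: `OutboundBatchSketch`, `sendBatchSketch`, and the 1000-unit limit are hypothetical, and string length stands in for the real socket size calculation.

```typescript
// Hypothetical shape; the real outbound batch type differs.
interface OutboundBatchSketch {
	serializedOps: string[];
}

// Assumed limit, for illustration only.
const maxBatchSizeSketch = 1000;

// Size is measured once, on the already-serialized batch, just before send,
// instead of being estimated op by op at submit time.
function sendBatchSketch(batch: OutboundBatchSketch): number {
	// String length (UTF-16 code units) is a stand-in for the real size in bytes.
	const size = batch.serializedOps.reduce((total, op) => total + op.length, 0);
	if (size > maxBatchSizeSketch) {
		throw new Error(`BatchTooLarge: ${size}`);
	}
	return size;
}

const smallBatchSize = sendBatchSketch({
	serializedOps: ["a".repeat(10), "b".repeat(20)],
});

let largeBatchThrew = false;
try {
	sendBatchSketch({ serializedOps: ["x".repeat(2000)] });
} catch (error) {
	largeBatchThrew = true;
}
```

In this sketch the oversized batch is rejected only at send time, which mirrors the point above that the error no longer surfaces under the submit callstack.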