Add API to probe the logstream batch writer if more bytes can be written without writing them #8798
Conversation
Adds a new public API method to the Dispatcher, `#canClaimFragmentBatch(int, int)`. This method determines whether a batch with the given fragment count and total length can actually be claimed on this dispatcher instance. It is mostly useful to check whether a batch will be claimable before you attempt to claim it, so you can avoid having to abort your claim unnecessarily when the batch is not yet finished.
Introduces a new method to the `LogStreamBatchWriter` API, `#canWriteAdditionalEvent(int)`. It lets you probe whether a batch that includes an additional event of the given size could still be written. While this could be done via `tryWrite`, there are cases, such as job activation, where you want to know whether adding data to a record (not to the writer, but to the record itself) will make the batch unwritable, so you can handle that case differently. Rolling back changes to the record is more complicated than rolling back the writer, so as a compromise you can probe the writer to find out whether it can accept more bytes before writing them, even, as in our example, to your DTO.
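The probe boils down to comparing the batch's framed and aligned length against the maximum fragment length. A minimal sketch of that arithmetic, with assumed constants (`HEADER_LENGTH` and `FRAME_ALIGNMENT` are stand-ins for illustration; the real values live in the Zeebe dispatcher):

```java
// Hypothetical sketch of the claim probe: a batch is claimable only if its
// total length, once each fragment is framed and the whole thing aligned,
// fits within the maximum fragment length. Constants are assumptions.
final class BatchLengthProbe {
  static final int FRAME_ALIGNMENT = 8; // assumed frame alignment in bytes
  static final int HEADER_LENGTH = 8;   // assumed per-fragment header size

  // Length of a single message once its header is added and it is aligned
  // up to the next multiple of FRAME_ALIGNMENT.
  static int framedAndAligned(int messageLength) {
    final int framed = messageLength + HEADER_LENGTH;
    return (framed + (FRAME_ALIGNMENT - 1)) & -FRAME_ALIGNMENT;
  }

  // Can a batch of `fragmentCount` fragments totalling `batchLength` payload
  // bytes be claimed, given the maximum fragment length?
  static boolean canClaimFragmentBatch(
      int fragmentCount, int batchLength, int maxFragmentLength) {
    final int totalFramed = batchLength + fragmentCount * HEADER_LENGTH;
    final int aligned = (totalFramed + (FRAME_ALIGNMENT - 1)) & -FRAME_ALIGNMENT;
    return aligned <= maxFragmentLength;
  }
}
```

Note how a batch can fit the max fragment length when measured raw, yet fail the probe once framing and alignment are accounted for; that is exactly the corner case the dispatcher-level check guards against.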
Looking forward to talking about tests 😄 I'm not very satisfied with the tests I added, especially in the dispatcher, but nothing came to mind that wouldn't essentially reuse the implementation in the tests 😵
@npepinpe looks good so far 👍
I'm not so happy about the writer test but I have no better idea 😅
Please have a look at my comment. But feel free to merge 🚀
dispatcher/src/test/java/io/camunda/zeebe/dispatcher/DispatcherTest.java
Adds a common-case test for `canClaimFragmentBatch`. The idea is that the previous test covered a corner case: a batch length that fits within the max fragment length, but no longer fits once it is framed and aligned.
I also took the opportunity to update the
bors merge
8798: Add API to probe the logstream batch writer if more bytes can be written without writing them r=npepinpe a=npepinpe

## Description

This PR adds a new API method, `LogStreamBatchWriter#canWriteAdditionalEvent(int)`. This allows users of the writer to probe whether adding the given number of bytes to the batch would make it unwritable, without actually writing anything to the batch, or even modifying their DTO (e.g. the `TypedRecord<?>` in the engine).

To avoid having dispatcher details leak into the implementation, an analogous method is added to the dispatcher, `Dispatcher#canClaimFragmentBatch(int, int)`, which compares the given size, framed and aligned, with the max fragment length.

This is the main building block to eventually solve #5525, and enables other use cases (e.g. multi-instance creation) which deal with large batches, until we have a more permanent solution (e.g. chunking follow-up batches).

NOTE: the tests added in the dispatcher are not very good, but I couldn't come up with anything else that wouldn't be too coupled to the implementation (i.e. essentially reusing `LogBufferAppender`). I would like some ideas/suggestions.

NOTE: this PR comes out of a larger one, #8491. You can check that one out to see how the new API would be used, e.g. in the `JobBatchCollector`. As such, this is marked for backporting, since we'll backport the complete fix for #5525.

## Related issues

related to #5525

Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
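The probe-before-mutate pattern the PR enables can be sketched with stand-in types (the `BatchWriter` interface and `FixedCapacityWriter` below are hypothetical; the real API is `LogStreamBatchWriter#canWriteAdditionalEvent(int)`):

```java
import java.util.Queue;

// Hypothetical sketch: drain jobs into a batch, but ask the writer whether
// the next event still fits BEFORE mutating anything, so no rollback of the
// record or the writer is ever needed.
final class ProbeBeforeMutate {
  interface BatchWriter {
    boolean canWriteAdditionalEvent(int length);
    void appendEvent(byte[] event);
  }

  // Collects jobs until the queue is empty or the next job would make the
  // batch unwritable; returns how many jobs were collected.
  static int collect(BatchWriter writer, Queue<byte[]> jobs) {
    int collected = 0;
    while (!jobs.isEmpty() && writer.canWriteAdditionalEvent(jobs.peek().length)) {
      writer.appendEvent(jobs.poll());
      collected++;
    }
    return collected;
  }

  // Simple in-memory writer for illustration: accepts events until a fixed
  // byte budget is exhausted.
  static final class FixedCapacityWriter implements BatchWriter {
    private final int capacity;
    private int used;

    FixedCapacityWriter(int capacity) {
      this.capacity = capacity;
    }

    @Override
    public boolean canWriteAdditionalEvent(int length) {
      return used + length <= capacity;
    }

    @Override
    public void appendEvent(byte[] event) {
      used += event.length;
    }
  }
}
```

The key property is that a rejected probe leaves both the writer and the caller's record untouched, which is exactly what makes truncating the job activation batch cheap.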
Build failed:
bors r+

ES tests were flaky, which incidentally is my next issue to work on 😄
Build succeeded:
Successfully created backport PR #8809 for
Successfully created backport PR #8810 for
8799: Correctly truncate a job activation batch if it will not fit in the dispatcher r=npepinpe a=npepinpe

## Description

This PR fixes a bug where, at times, we would try to activate a job batch which was too big and could not be written into the dispatcher, as its size, once framed and aligned, would exceed the `maxFragmentLength`.

The fix itself is simple: delegate, all the way down to the dispatcher (with layers adding their own framing in between), the responsibility of deciding whether more jobs can be added to the batch or not. To simplify the code and testing, the responsibility of collecting jobs into a batch was extracted into a dedicated class.

The new API, `TypedEventWriter#canWriteEventOfLength(int)`, could potentially be reused in other places where we want to prevent writing batches that are too large. However, this is a bit brittle, since it requires the code in the engine to correctly compute beforehand how the record will grow when you modify it. For example, in the `JobBatchCollector`, to calculate the length of the `TypedRecord<JobBatchRecordValue> record` when adding a given `JobRecordValue` to it, we need the following:

```java
// the expected length is based on the current record's length plus the length of the job
// record we would add to the batch, the number of bytes taken by the additional job key,
// as well as one byte required per job key for its type header. if we ever add more,
// this should be updated accordingly.
final var jobRecordLength = jobRecord.getLength();
final var expectedEventLength = record.getLength() + jobRecordLength + Long.BYTES + 1;
```

Not extremely complicated, but it could be error prone. A better solution, chunking the follow-up events, is not something we can sanely do in between KRs, unfortunately, and would most likely carry a high risk, so it wouldn't be anything we can backport. For now, I think this is an OK compromise.

NOTE: this PR depends on #8797 and #8798. Wait for these to be merged before reviewing!

## Related issues

closes #5525

Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
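The bookkeeping from the snippet above can be isolated as a pure function, which makes the "one job key plus one type-header byte per added job" accounting easy to check in isolation. This is a sketch with plain `int` lengths standing in for real `TypedRecord`/`JobRecordValue` instances:

```java
// Sketch of the expected-length computation described in the PR: the grown
// record length is the current record length, plus the job record's length,
// plus Long.BYTES for the additional job key, plus one byte for the key's
// type header. Stand-in ints replace the real record types.
final class ExpectedLengthCalc {
  static int expectedEventLength(int currentRecordLength, int jobRecordLength) {
    return currentRecordLength + jobRecordLength + Long.BYTES + 1;
  }

  // A batch collector would stop adding jobs once the grown length no
  // longer passes the writer's probe (capacity is a stand-in here).
  static boolean fits(int currentRecordLength, int jobRecordLength, int capacity) {
    return expectedEventLength(currentRecordLength, jobRecordLength) <= capacity;
  }
}
```

Keeping the formula in one place means that if the record format ever gains another per-job field, only this function has to change, which addresses the brittleness the PR calls out.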