
Add API to probe the logstream batch writer if more bytes can be written without writing them #8798

Merged

4 commits · Feb 17, 2022

Conversation

npepinpe
Member

@npepinpe npepinpe commented Feb 16, 2022

Description

This PR adds a new API method, `LogStreamBatchWriter#canWriteAdditionalEvent(int)`. This allows users of the writer to probe whether adding the given number of bytes to the batch would cause it to become unwritable, without actually having to write anything to the batch, or even modify their DTO (e.g. the `TypedRecord<?>` in the engine).

To avoid having dispatcher details leak into the implementation, an analogous method is added to the dispatcher, `Dispatcher#canClaimFragmentBatch(int, int)`, which compares the given size, framed and aligned, with the max fragment length. This is the main building block to eventually solve #5525, and it enables other use cases (e.g. multi-instance creation) which deal with large batches until we have a more permanent solution (e.g. chunking follow-up batches).
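As a rough sketch of the comparison described above: the check frames and aligns the requested batch size before comparing it to the max fragment length. The header and alignment constants below are placeholders for illustration, not the dispatcher's actual values.

```java
final class FragmentBatchCheck {
  static final int FRAME_HEADER_LENGTH = 8; // assumed per-fragment framing overhead
  static final int FRAME_ALIGNMENT = 8;     // assumed alignment boundary

  // Rounds length up to the next multiple of the (power-of-two) alignment.
  static int align(final int length, final int alignment) {
    return (length + alignment - 1) & ~(alignment - 1);
  }

  // Total on-buffer size of a batch: every fragment is framed, then the
  // whole batch is aligned.
  static int framedBatchLength(final int fragmentCount, final int batchLength) {
    return align(batchLength + fragmentCount * FRAME_HEADER_LENGTH, FRAME_ALIGNMENT);
  }

  // Conceptually what canClaimFragmentBatch answers: does the framed,
  // aligned batch still fit within the maximum fragment length?
  static boolean canClaimFragmentBatch(
      final int fragmentCount, final int batchLength, final int maxFragmentLength) {
    return framedBatchLength(fragmentCount, batchLength) <= maxFragmentLength;
  }
}
```

The point of exposing this as a query on the dispatcher is that callers never need to know the framing or alignment details themselves.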

NOTE: the tests added in the dispatcher are not very good, but I couldn't come up with something else that wouldn't be too coupled to the implementation (i.e. essentially reusing `LogBufferAppender`). I would like some ideas/suggestions.

NOTE: this PR comes out of the larger one, #8491. You can check that one out to see how the new API would be used, e.g. in the `JobBatchCollector`. As such, this is marked for backporting, since we'll backport the complete fix for #5525.

Related issues

related to #5525

Definition of Done

Not all items need to be done depending on the issue and the pull request.

Code changes:

  • The changes are backwards compatible with previous versions
  • If it fixes a bug, then PRs are created to backport the fix to the last two minor versions. You can trigger a backport by assigning labels (e.g. `backport stable/1.3`) to the PR; if that fails, you need to create the backports manually.

Testing:

  • There are unit/integration tests that verify all acceptance criteria of the issue
  • New tests are written to ensure backwards compatibility with future versions
  • The behavior is tested manually
  • The change has been verified by a QA run
  • The impact of the changes is verified by a benchmark

Documentation:

  • The documentation is updated (e.g. BPMN reference, configuration, examples, get-started guides, etc.)
  • New content is added to the release announcement
  • If the PR changes how BPMN processes are validated (e.g. support new BPMN element) then the Camunda modeling team should be informed to adjust the BPMN linting.

Adds a new public API method to the Dispatcher,
`#canClaimFragmentBatch(int, int)`. This method determines whether a batch
with the given fragment count and total length can actually be claimed
on this dispatcher instance.

This method is useful to determine whether a batch will be claimable
before you actually claim it, avoiding an unnecessary claim-and-abort
cycle when the batch isn't finished yet at the time of the check.
Introduces a new method to the `LogStreamBatchWriter` API,
`#canWriteAdditionalEvent(int)`. This lets one probe whether a batch
including an additional event of the given size could still be written.

While this could be done via `tryWrite`, there are cases, such as job
activation, where you want to know whether adding data to a record (not
to the writer, but to the record itself) would make the batch
unwritable, at which point you want to handle it differently. Rolling
back changes to the record is more complicated than rolling back the
writer, so as a compromise you can probe the writer to find out whether
it can accept more bytes before writing them anywhere, even, as in our
example, to your DTO.
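The probe-before-mutate pattern described above can be sketched as follows. The `canWriteAdditionalEvent` predicate is a stand-in for the new writer API, and the list of item lengths stands in for the record/DTO being grown; both are hypothetical simplifications for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

final class ProbeBeforeWrite {
  // Collect item lengths into a batch, probing the writer BEFORE mutating
  // the batch, so there is never anything to roll back.
  static List<Integer> collect(
      final IntPredicate canWriteAdditionalEvent, final int[] itemLengths) {
    final List<Integer> batch = new ArrayList<>(); // stands in for the record/DTO
    int batchLength = 0;

    for (final int itemLength : itemLengths) {
      // probe first: would the batch still be writable with this item added?
      if (!canWriteAdditionalEvent.test(batchLength + itemLength)) {
        break; // stop collecting instead of aborting a half-built write later
      }
      batch.add(itemLength);
      batchLength += itemLength;
    }

    return batch;
  }
}
```

With a writer that accepts at most 10 bytes, collecting items of length 4 stops after the second item, since the third would push the total to 12.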
@npepinpe
Member Author

Looking forward to talking about tests 😄 I'm not very satisfied with the tests I added, especially in the dispatcher, but nothing came to mind without essentially reusing the implementation in the tests 😵

Member

@saig0 saig0 left a comment


@npepinpe looks good so far 👍
I'm not so happy about the writer test but I have no better idea 😅

Please have a look at my comment. But feel free to merge 🚀

Adds a common-case test for `canClaimFragmentBatch`. The idea is that
the previous test covered a corner case, i.e. when the given batch
length fits within the max fragment length on its own, but no longer
does once framed and aligned.
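Concrete numbers make the corner case easier to see; the header and alignment values here are placeholders, not the dispatcher's actual constants. A raw length of 30 fits a max fragment length of 32, but once framed (+8) and aligned (to 8) it grows to 40 and no longer fits.

```java
final class CornerCase {
  static final int HEADER = 8;     // assumed per-fragment frame header
  static final int ALIGNMENT = 8;  // assumed alignment boundary

  // Adds the frame header, then rounds up to the alignment boundary.
  static int framedAndAligned(final int length) {
    final int framed = length + HEADER;
    return (framed + ALIGNMENT - 1) & ~(ALIGNMENT - 1);
  }
}
```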
@npepinpe
Member Author

I also took the opportunity to update the `DispatcherTest` to JUnit 5, as it was an extremely easy thing to do and we do want to eventually migrate to JUnit 5 completely. I don't think it needs another review, but I'll ping you on Slack for a quick chance to veto anyway 😉

@npepinpe
Member Author

bors merge

ghost pushed a commit that referenced this pull request Feb 17, 2022
8798: Add API to probe the logstream batch writer if more bytes can be written without writing them r=npepinpe a=npepinpe

## Description

This PR adds a new API method, `LogStreamBatchWriter#canWriteAdditionalEvent(int)`. This allows users of the writer to probe whether adding the given number of bytes to the batch would cause it to become unwritable, without actually having to write anything to the batch, or even modify their DTO (e.g. the `TypedRecord<?>` in the engine).

To avoid having dispatcher details leak into the implementation, an analogous method is added to the dispatcher, `Dispatcher#canClaimFragmentBatch(int, int)`, which compares the given size, framed and aligned, with the max fragment length. This is the main building block to eventually solve #5525, and it enables other use cases (e.g. multi-instance creation) which deal with large batches until we have a more permanent solution (e.g. chunking follow-up batches).

NOTE: the tests added in the dispatcher are not very good, but I couldn't come up with something else that wouldn't be too coupled to the implementation (i.e. essentially reusing `LogBufferAppender`). I would like some ideas/suggestions.

NOTE: this PR comes out of the larger one, #8491. You can check that one out to see how the new API would be used, e.g. in the `JobBatchCollector`. As such, this is marked for backporting, since we'll backport the complete fix for #5525.

## Related issues

related to #5525 



Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
@ghost

ghost commented Feb 17, 2022

Build failed:

@npepinpe
Member Author

bors r+

ES tests were flaky, which incidentally is my next issue to work on 😄

@ghost

ghost commented Feb 17, 2022

Build succeeded:

@github-actions
Contributor

Successfully created backport PR #8809 for stable/1.2.

@github-actions
Contributor

Successfully created backport PR #8810 for stable/1.3.

ghost pushed a commit that referenced this pull request Feb 17, 2022
8809: [Backport stable/1.2] Add API to probe the logstream batch writer if more bytes can be written without writing them r=npepinpe a=github-actions[bot]

# Description
Backport of #8798 to `stable/1.2`.

relates to #5525

Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
ghost pushed a commit that referenced this pull request Feb 17, 2022
8810: [Backport stable/1.3] Add API to probe the logstream batch writer if more bytes can be written without writing them r=npepinpe a=github-actions[bot]

# Description
Backport of #8798 to `stable/1.3`.

relates to #5525

Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
ghost pushed a commit that referenced this pull request Feb 23, 2022
8799: Correctly truncate a job activation batch if it will not fit in the dispatcher r=npepinpe a=npepinpe

## Description

This PR fixes a bug where, at times, we would try to activate a job batch which was too big and could not be written to the dispatcher, as its framed and aligned size would exceed the `maxFragmentLength`. The fix itself is simple: delegate the decision of whether more jobs can be added to the batch all the way down to the dispatcher, with each layer adding its own framing in between. To simplify the code and testing, the responsibility of collecting jobs into a batch was extracted into a dedicated class.

The new API, `TypedEventWriter#canWriteEventOfLength(int)` could potentially be reused in other places where we want to prevent writing batches that are too large. However, this is a bit brittle, since it requires the code in the engine to properly compute beforehand how the record will grow when you modify it. For example, in the `JobBatchCollector`, to calculate the length of the `TypedRecord<JobBatchRecordValue> record` when adding a given `JobRecordValue` to it, we need the following:

```java
// the expected length is based on the current record's length, plus the length of the job
// record we would add to the batch, the number of bytes taken by the additional job key,
// as well as one byte required per job key for its type header; if we ever add more,
// this should be updated accordingly
final var jobRecordLength = jobRecord.getLength();
final var expectedEventLength = record.getLength() + jobRecordLength + Long.BYTES + 1;
```

Not extremely complicated, but it could be error-prone. A better solution, chunking the follow-up events, is not something we can sanely do in between KRs, unfortunately, and would most likely carry a high risk, so it wouldn't be anything we can backport. For now, I think this is an OK compromise.
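A minimal sketch of how the collector might use this length computation together with the writer probe; the `canWriteEventOfLength` predicate stands in for `TypedEventWriter#canWriteEventOfLength(int)`, and the lengths are plain ints for illustration, so this is an assumption-laden simplification rather than the actual `JobBatchCollector` code.

```java
import java.util.function.IntPredicate;

final class JobBatchSizing {
  // Prospective batch-record length if one more job were appended: the
  // current record length, the job record length, the job key (a long),
  // and one byte for the key's type header.
  static int expectedEventLength(final int recordLength, final int jobRecordLength) {
    return recordLength + jobRecordLength + Long.BYTES + 1;
  }

  // Only grow the record when the writer confirms the resulting event
  // would still be writable.
  static boolean tryAppendJob(
      final IntPredicate canWriteEventOfLength,
      final int recordLength,
      final int jobRecordLength) {
    return canWriteEventOfLength.test(expectedEventLength(recordLength, jobRecordLength));
  }
}
```

The brittleness noted above is visible here: the `Long.BYTES + 1` term must track the record's actual wire format by hand.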

NOTE: this PR depends on #8797 and #8798. Wait for these to be merged before reviewing!

## Related issues

closes #5525



Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
@Zelldon added the Release: 1.2.11 and version:1.3.5 labels on Mar 1, 2022
This pull request was closed.