Fix possible incomplete snapshot chunk write #4358

Zelldon · 2020-04-23T15:04:40Z

Description

As described in #4357, we need to loop when writing a snapshot chunk; otherwise a snapshot can end up corrupted because a chunk was only partially written.

Related guide on writing to a FileChannel: http://tutorials.jenkov.com/java-nio/file-channel.html
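For readers unfamiliar with the problem, here is a minimal sketch of the looping write, not the actual diff of this PR. A single `write` call on a channel may write fewer bytes than the buffer holds, so the call is repeated until the buffer is drained. The class `ChunkWriter` and the methods `writeFully` and `writeChunk` are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration; not the class changed in this PR.
final class ChunkWriter {

  // Repeats write() until the buffer has no remaining bytes, because a
  // single call is allowed to write only part of the buffer.
  static void writeFully(final SeekableByteChannel channel, final ByteBuffer buffer)
      throws IOException {
    while (buffer.hasRemaining()) {
      channel.write(buffer);
    }
  }

  // Creates a new file and writes the complete chunk content to it.
  // Fails if the file already exists (CREATE_NEW).
  static void writeChunk(final Path file, final byte[] content) throws IOException {
    try (SeekableByteChannel channel =
        Files.newByteChannel(file, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
      writeFully(channel, ByteBuffer.wrap(content));
    }
  }
}
```

If `write` throws, the loop is aborted by the exception and the caller still has to treat the chunk as incomplete.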

Related issues

closes #4357

Pull Request Checklist

All commit messages match our commit message guidelines
The submitting code follows our code style
If submitting code, please run mvn clean install -DskipTests locally before committing

npepinpe

Dumb question, why can't we use Files.write here as well again?

Zelldon · 2020-04-24T16:51:05Z

Not dumb at all. I just wanted to make the smallest possible change. I was not sure whether there was a reason to use the SeekableByteChannel here, nor whether we have access to the byte array. But we probably have no direct byte buffer here.

Zelldon · 2020-04-27T07:01:15Z

Was this a request now, @npepinpe?

npepinpe · 2020-04-27T12:52:08Z

I see what you mean, I guess I'm not 100% sure if we can guarantee it's always a byte array. We probably do, but yeah not sure.

That said, it makes me realize there might be an issue with our replication. Say I want to write a chunk of X bytes, but I only write Y = X / 2 bytes, then for whatever reason it throws an IO exception. The next time I try the same chunk, we check if it exists, and skip it if it does. Could this happen then? Should we compare checksums? I guess this is a little outside the scope of the issue though.
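To make the checksum idea concrete, here is a rough sketch under assumptions: it supposes the sender transmits a CRC32 checksum alongside each chunk, which is not confirmed anywhere in this thread. `ChunkVerifier`, `checksum`, and `isChunkComplete` are hypothetical names, not existing code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

// Hypothetical helper for illustration; assumes an expected checksum is available per chunk.
final class ChunkVerifier {

  // Computes the CRC32 checksum of the given bytes.
  static long checksum(final byte[] content) {
    final CRC32 crc = new CRC32();
    crc.update(content);
    return crc.getValue();
  }

  // Returns true only if the chunk file exists AND its content matches the
  // expected checksum, so a half-written file is not mistaken for a complete chunk.
  static boolean isChunkComplete(final Path file, final long expectedChecksum)
      throws IOException {
    return Files.exists(file) && checksum(Files.readAllBytes(file)) == expectedChecksum;
  }
}
```

With a check like this, an existing but truncated chunk file would be rewritten instead of skipped.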

I guess the PR, as is, is already an improvement so 👍

Zelldon · 2020-04-27T13:11:39Z

bors r+

zeebe-bors · 2020-04-27T13:36:40Z

Build succeeded

continuous-integration/jenkins/branch

Zelldon · 2020-04-27T13:39:15Z

@npepinpe regarding the IO thing: our replication marks the snapshot as invalid if an exception was thrown, so we should be fine.

npepinpe · 2020-04-27T13:40:13Z

So the only possible case is if it crashed while looping the write?

Zelldon · 2020-04-28T04:02:26Z

Node crash you mean? Yeah maybe.

Zelldon · 2020-04-28T12:01:18Z

@npepinpe I thought about this a bit more. If the node fails, the snapshot will not be finished, which means it will not be marked as valid, so there is no problem.

4372: fix(atomix): consume read buffer correctly r=Zelldon a=Zelldon

## Description

Fix issue where only half of the read buffer was consumed and the rest was thrown away. Refactor `FileChannelJournalSegmentReader#readNext` to improve readability and hopefully maintainability.

I also ran a benchmark with these changes (#4358 was also part of the branch I ran the benchmark with).

![metrics](https://user-images.githubusercontent.com/2758593/80199113-45202a00-8621-11ea-9595-e8c65130ec97.png)

## Related issues

closes #4248

Co-authored-by: Christopher Zell <zelldon91@googlemail.com>

fix(broker): fix possible incomplete snapshot chunk write

92467d1

Zelldon requested a review from npepinpe April 23, 2020 15:55

Zelldon mentioned this pull request Apr 24, 2020

fix(atomix): consume read buffer correctly #4372

Merged

3 tasks

npepinpe reviewed Apr 24, 2020

npepinpe approved these changes Apr 27, 2020

zeebe-bors bot merged commit 3d423d1 into develop Apr 27, 2020

zeebe-bors bot deleted the 4357-loop-write branch April 27, 2020 13:36

npepinpe added the Release: 0.24.0-alpha2 label Jun 2, 2020

npepinpe added the Release: 0.24.0 label Jul 3, 2020

github-merge-queue bot pushed a commit that referenced this pull request Apr 16, 2024

fix: Add useOnlyPositionCheck for tasklist (#4358)

ef1a602
