Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reuse already created checksum collections on persist #14184

Merged
merged 5 commits into from Sep 6, 2023

Conversation

Zelldon
Copy link
Member

@Zelldon Zelldon commented Sep 6, 2023

Description

  • Support to update the checksum collection from bytes, which might be received or read from the file already. This is useful, if we want to avoid re-reading files etc.
  • Support to persist snapshot with given checksum collection.
  • collection of checksum is no longe recreated and directly written to the directory.

Related issues

relate #14045

Definition of Done

Not all items need to be done depending on the issue and the pull request.

Code changes:

  • The changes are backwards compatibility with previous versions
  • If it fixes a bug then PRs are created to backport the fix to the last two minor versions. You can trigger a backport by assigning labels (e.g. backport stable/1.3) to the PR, in case that fails you need to create backports manually.

Testing:

  • There are unit/integration tests that verify all acceptance criterias of the issue
  • New tests are written to ensure backwards compatibility with further versions
  • The behavior is tested manually
  • The change has been verified by a QA run
  • The impact of the changes is verified by a benchmark

Documentation:

  • The documentation is updated (e.g. BPMN reference, configuration, examples, get-started guides, etc.)
  • If the PR changes how BPMN processes are validated (e.g. support new BPMN element) then the Camunda modeling team should be informed to adjust the BPMN linting.

Other teams:
If the change impacts another team an issue has been created for this team, explaining what they need to do to support this change.

Please refer to our review guidelines.

Useful, if we want to avoid re-reading files etc.
Should allow to use pre calculated checksums, instead of always recalculating the value.
New persist method accepts SFV checksum, but needs to be reworked that we not recalculate value.
@Zelldon Zelldon changed the title Ck reuse sfv checksum Reuse already create checksum collections Sep 6, 2023
@Zelldon Zelldon changed the title Reuse already create checksum collections Reuse already created checksum collections on persist Sep 6, 2023
Copy link
Member

@npepinpe npepinpe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀 I think we'll need one where you pass the checksum value itself (either as a long or as a byte array), and not the byte array of contents (since RocksDB will give us the checksum and not the file contents). It might also be useful if we try to replicate snapshot files in smaller chunks 🤷

Anyway, we'll cross that bridge when we get there 😄

@Override
public void updateFromBytes(final String fileName, final byte[] bytes) {
combinedChecksum.update(fileName.getBytes(UTF_8));
final Checksum checksum = new CRC32C();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 You can create it once for the whole class, and just call CRC32C#reset before every usage. Very minor optimization 🙃

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I keep it for now.

The CRC323C#reset :D

    public void reset() {
        this.crc = -1;
    }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at how it works, this is the initial value 🙃 Then when you pass things it gets updated. So it is, in fact, resetting the state to the initial values ;)

@@ -62,7 +62,7 @@ private static boolean isNotMetadataFile(final Path file) {

public static void persist(final Path checksumPath, final ImmutableChecksumsSFV checksum)
throws IOException {
try(final var stream = new FileOutputStream(checksumPath.toFile())) {
try (final var stream = new FileOutputStream(checksumPath.toFile())) {
Copy link
Member

@npepinpe npepinpe Sep 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ups 🙈 my bad

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah :D seems to be from the other PR :D

@Zelldon
Copy link
Member Author

Zelldon commented Sep 6, 2023

I think we'll need one where you pass the checksum value itself (either as a long or as a byte array), and not the byte array of contents (since RocksDB will give us the checksum and not the file contents). It might also be useful if we try to replicate snapshot files in smaller chunks

Yeah, let's do that when we are there, meaning when we are actually migrating.

@Zelldon
Copy link
Member Author

Zelldon commented Sep 6, 2023

bors r+

@Zelldon Zelldon merged commit d74f357 into ck-refactor-checksum Sep 6, 2023
30 of 32 checks passed
@Zelldon Zelldon deleted the ck-reuse-sfv-checksum branch September 6, 2023 12:46
zeebe-bors-camunda bot added a commit that referenced this pull request Sep 6, 2023
14175: Refactor checksum collection writing and usage r=Zelldon a=Zelldon

## Description

The first part of PR:

Instead of constructing a byte array with byte output stream and writers, and then later writing again
to a stream, we now get an OutputStream in and write directly to the stream


Additionally PR contains #14184

- Support to update the checksum collection from bytes, which might be received or read from the file already. This is useful, if we want to avoid re-reading files etc.
- Support to persist snapshot with given checksum collection.
- collection of checksum is no longer recreated and directly written to the directory.



<!-- Please explain the changes you made here. -->

## Related issues

<!-- Which issues are closed by this PR or are related -->

related #14045 



Co-authored-by: Christopher Zell <zelldon91@googlemail.com>
Co-authored-by: Christopher Kujawa (Zell) <zelldon91@googlemail.com>
zeebe-bors-camunda bot added a commit that referenced this pull request Sep 6, 2023
14175: Refactor checksum collection writing and usage r=Zelldon a=Zelldon

## Description

The first part of PR:

Instead of constructing a byte array with byte output stream and writers, and then later writing again
to a stream, we now get an OutputStream in and write directly to the stream


Additionally PR contains #14184

- Support to update the checksum collection from bytes, which might be received or read from the file already. This is useful, if we want to avoid re-reading files etc.
- Support to persist snapshot with given checksum collection.
- collection of checksum is no longer recreated and directly written to the directory.



<!-- Please explain the changes you made here. -->

## Related issues

<!-- Which issues are closed by this PR or are related -->

related #14045 



Co-authored-by: Christopher Zell <zelldon91@googlemail.com>
Co-authored-by: Christopher Kujawa (Zell) <zelldon91@googlemail.com>
@zeebe-bors-camunda
Copy link
Contributor

Merge conflict.

@abbasadel abbasadel added the version:8.3.0-alpha6 Marks an issue as being completely or in parts released in 8.3.0-alpha6 label Sep 8, 2023
@megglos megglos added the version:8.3.0 Marks an issue as being completely or in parts released in 8.3.0 label Oct 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
version:8.3.0-alpha6 Marks an issue as being completely or in parts released in 8.3.0-alpha6 version:8.3.0 Marks an issue as being completely or in parts released in 8.3.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants