fix(file source): Handle legacy fingerprint checksums from < v0.14.0 #8225

Merged: 5 commits, Jul 12, 2021

@jszwedko jszwedko commented Jul 9, 2021

Adds logic to convert checkpoints written by Vector versions before 0.14.0 into 0.14.0 checkpoints: for each checkpoint that doesn't match an existing file, check whether it matches when the old checkpoint strategy is used.
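
A minimal sketch of the matching step described above, not the actual Vector implementation. The names `Fingerprint`, `legacy_checksum`, `current_checksum`, and `upgrade_legacy` are hypothetical, and the two checksum functions are placeholders rather than the real pre-0.14.0 and 0.14.0 algorithms.

```rust
use std::collections::HashMap;

/// Hypothetical fingerprint type: the variant records which strategy produced the value.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum Fingerprint {
    Legacy(u64),
    Current(u64),
}

/// Placeholder for the pre-0.14.0 checksum (NOT the real algorithm).
pub fn legacy_checksum(data: &[u8]) -> u64 {
    data.iter()
        .fold(0u64, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u64))
}

/// Placeholder for the 0.14.0 checksum (NOT the real algorithm).
pub fn current_checksum(data: &[u8]) -> u64 {
    data.iter()
        .fold(7u64, |acc, &b| acc.wrapping_mul(131).wrapping_add(b as u64))
}

/// For one file found at startup: if no checkpoint exists under its current
/// fingerprint but one exists under its legacy fingerprint, move the stored
/// read position to the current key. Returns true when an upgrade happened.
pub fn upgrade_legacy(checkpoints: &mut HashMap<Fingerprint, u64>, file_head: &[u8]) -> bool {
    let current = Fingerprint::Current(current_checksum(file_head));
    if checkpoints.contains_key(&current) {
        return false; // already stored in the new format
    }
    let legacy = Fingerprint::Legacy(legacy_checksum(file_head));
    match checkpoints.remove(&legacy) {
        Some(position) => {
            checkpoints.insert(current, position);
            true
        }
        None => false, // no checkpoint for this file under either strategy
    }
}
```

The upgrade is only attempted when the new-style lookup misses, so checkpoints already written in the 0.14.0 format are left untouched.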

Also adds an alias for the first_line_checksum strategy; without the alias, checkpoints would have broken if this had been released.
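
One way to picture the alias: when a stored fingerprint is read back, both the older tag and the newer one are accepted. This is a standard-library-only sketch, not Vector's code. `FingerprintKind` and the tags `first_lines_checksum` and `dev_inode` are assumptions made for illustration; only `first_line_checksum` comes from the description above. In a serde-based format the same effect is usually expressed with the `#[serde(alias = "...")]` attribute.

```rust
use std::str::FromStr;

/// Hypothetical tag identifying which strategy produced a stored fingerprint.
#[derive(Debug, PartialEq, Eq)]
pub enum FingerprintKind {
    FirstLinesChecksum,
    DevInode,
}

impl FromStr for FingerprintKind {
    type Err = String;

    fn from_str(tag: &str) -> Result<Self, Self::Err> {
        match tag {
            // `first_lines_checksum` is an assumed current name;
            // `first_line_checksum` is the alias named in the PR description.
            "first_lines_checksum" | "first_line_checksum" => {
                Ok(FingerprintKind::FirstLinesChecksum)
            }
            // Assumed tag, included only so the enum has a second variant.
            "dev_inode" => Ok(FingerprintKind::DevInode),
            other => Err(format!("unknown fingerprint kind: {}", other)),
        }
    }
}
```

Accepting the old tag on read keeps previously written checkpoint files parseable, while new files can be written under a single name.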

Ideally I'd like to include this in 0.15.0 to avoid needing to release a 0.15.1 with it.

This will be released as a 0.13.1.

Fixes: #8182

TODO:

  • Figure out why test_checkpointer_fingerprint_upgrades_legacy_checksum is failing. It seems to calculate a different checksum than I get when running 0.13.1 directly despite the upgrade working when I run this branch using 0.13.1 checkpoints.
  • Add tests to assert the literal value we are writing to the checkpoint file to avoid unexpected regressions

Signed-off-by: Jesse Szwedko <jesse@szwedko.me>

@jszwedko jszwedko requested review from a team and bruceg and removed request for a team July 9, 2021 23:52
@bruceg bruceg left a comment

Looks reasonable, and I appreciate moving more logic into the maybe upgrade function, though I do have a question about the buffer parameter.

lib/file-source/src/checkpointer.rs (outdated)
path: &Path,
fng: FileFingerprint,
fingerprinter: &Fingerprinter,
fingerprint_buffer: &mut Vec<u8>,

This buffer does not appear to be used by any of the callers after calling the function. Could it be created within this function to avoid the additional parameter?

In fact, it doesn't look like the contents are used within this function either. I don't think this function is on the hot path (startup only) so reusing the buffer shouldn't be critical to performance. Am I missing something?

Member Author

Agreed. I was following the pattern of the other functions, but I think we can just create the buffer in here since it is only called once at start-up.

@jszwedko jszwedko Jul 12, 2021

Actually, I realized the default buffer size is 1 MB, so I may replace this. I can see why we'd want to reuse the buffers in case there are a lot of files, to avoid allocating 1 MB each time.
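
The trade-off discussed here can be sketched as follows, assuming a caller-supplied scratch buffer that is allocated once and reused for every file. `checksum_with_buffer` and `checksum_all` are hypothetical names, and the checksum itself is a placeholder rather than the one Vector uses.

```rust
/// Hypothetical checksum over the first `max_bytes` of `data`, staged through
/// a caller-supplied scratch buffer so no allocation happens per call.
pub fn checksum_with_buffer(data: &[u8], max_bytes: usize, buffer: &mut Vec<u8>) -> u64 {
    buffer.clear(); // drops old contents but keeps the allocated capacity
    let n = data.len().min(max_bytes);
    buffer.extend_from_slice(&data[..n]);
    // Placeholder checksum, NOT the real algorithm.
    buffer
        .iter()
        .fold(0u64, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u64))
}

/// Fingerprints many files while allocating the scratch buffer only once.
pub fn checksum_all(files: &[&[u8]], max_bytes: usize) -> Vec<u64> {
    let mut buffer = Vec::with_capacity(max_bytes); // single allocation up front
    files
        .iter()
        .map(|&file| checksum_with_buffer(file, max_bytes, &mut buffer))
        .collect()
}
```

With a 1 MB buffer and many files, passing the buffer in means one allocation in total instead of one per file, which is the reason given above for keeping the parameter.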

Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
@jszwedko jszwedko enabled auto-merge (squash) July 12, 2021 21:00
@jszwedko jszwedko merged commit 9b7fabf into master Jul 12, 2021
@jszwedko jszwedko deleted the upgrade-fingerprints branch July 12, 2021 22:00
lucperkins pushed a commit that referenced this pull request Jul 19, 2021
…8225)

Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
Signed-off-by: Luc Perkins <luc.perkins@datadoghq.com>
Development

Successfully merging this pull request may close these issues.

File checkpoints are incompatible between 0.13.1 and 0.14.0