Skip to content

chore(file source): Add Fingerprinter::FirstLineChecksum#2904

Merged
MOZGIII merged 3 commits intomasterfrom
first-line-checksum-fingerprinter
Jun 30, 2020
Merged

chore(file source): Add Fingerprinter::FirstLineChecksum#2904
MOZGIII merged 3 commits intomasterfrom
first-line-checksum-fingerprinter

Conversation

@MOZGIII
Copy link
Copy Markdown
Contributor

@MOZGIII MOZGIII commented Jun 25, 2020

This PR implements first-line-checksum fingerprinter outlined at #2890.

It is a concrete implementation proposal, and also a proof of concept. Tests ensure that we achieve the desired behavior and also serve as the illustration for the properties of the solution.


One of the questionable moments is filling the buffer with 0 before leaving the match arm. We could alternatively pass the shorter slice to the checksum_ecma, but I'm not sure what we'd prefer here.

Signed-off-by: MOZGIII <mike-n@narod.ru>
@MOZGIII MOZGIII force-pushed the first-line-checksum-fingerprinter branch from da77c40 to 108596d Compare June 25, 2020 13:53
Comment thread lib/file-source/src/file_server.rs Outdated
@binarylogic
Copy link
Copy Markdown
Contributor

Thanks, @MOZGIII. From a UX perspective, I'm a little concerned with 3 different options when, at the end of the day, this is just a checksum. If we decide to implement this I'd prefer that we work it in as an option to the already existing checksum options. Maybe adding something like stop_pattern (similar to multiline.start_pattern)?

Regardless, I defer to @lukesteensen on how we should approach this, if at all.

Signed-off-by: MOZGIII <mike-n@narod.ru>
@MOZGIII
Copy link
Copy Markdown
Contributor Author

MOZGIII commented Jun 26, 2020

@binarylogic this is an internal implementation, we can make the user-facing configuration different, taking your concerns into account! From the technical point of view, the config parameters for the two checksum algorithms are disjoint, so it makes sense to express that explicitly in the type system at the internal implementation.

@binarylogic binarylogic changed the title chore: Add Fingerprinter::FirstLineChecksum chore(file source): Add Fingerprinter::FirstLineChecksum Jun 30, 2020
@binarylogic
Copy link
Copy Markdown
Contributor

@lukesteensen given your comment on #2890 (comment) should we approve this?

Copy link
Copy Markdown
Member

@lukesteensen lukesteensen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your patience with this! Glad we were able to come to an understanding and get this done.

Comment thread lib/file-source/src/file_server.rs
Comment thread lib/file-source/src/file_server.rs Outdated
Signed-off-by: MOZGIII <mike-n@narod.ru>
@MOZGIII MOZGIII merged commit 25ee24c into master Jun 30, 2020
@MOZGIII MOZGIII linked an issue Jul 1, 2020 that may be closed by this pull request
@binarylogic binarylogic deleted the first-line-checksum-fingerprinter branch July 23, 2020 17:33
mengesb pushed a commit to jacobbraaten/vector that referenced this pull request Dec 9, 2020
…v#2904)

* Add Fingerprinter::FirstLineChecksum

Signed-off-by: MOZGIII <mike-n@narod.ru>

* Remove the lifetime rebind as it's not required

Signed-off-by: MOZGIII <mike-n@narod.ru>

* Fix a typo

Signed-off-by: MOZGIII <mike-n@narod.ru>
Signed-off-by: Brian Menges <brian.menges@anaplan.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

The ability to checksum by the first line at the file server

4 participants