New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NIFI-5147 Implement CalculateAttributeHash processor #2980
Conversation
- added properties to control behavior when attributes that are configured are partially or completely missing - set charset with a property - added tests
…hContent. Added unit tests.
Cleaned up typos and descriptions. Added unit test demonstrating missing Blake2 algorithm.
…implementation. Updated unit test.
Unit test for all default test vectors passes.
Added unit tests.
…s, character encoding, and defaults.
All unit tests pass.
…stRunner, and MockProcessContext.
…n name matching. Added unit test.
Added unit tests.
This encapsulates the changes @ottobackwards made in PR 2836, but also:
I will open follow-on issues to:
|
* * SHA-512 (SHA2) | ||
* * SHA-512/224 (SHA2) | ||
* * SHA-512/256 (SHA2) | ||
* * SHA3-256 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add SHA3-224 and Blake2b-160.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, just a couple of comments
public boolean isStrongAlgorithm() { | ||
return (!BROKEN_ALGORITHMS.contains(name)); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the isBlake2 check about? Is there a way to make it more general? It seems strange to call out by the name as opposed to the "why"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Blake2 implementations need BouncyCastle and use different API calls than the other MessageDigest
instances.
return traditionalHash(algorithm, value); | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we put this functionality in the enum and simplify this class? Having the specialization there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it makes sense to move the execution logic into the enum. The enum is there to capture metadata about the acceptable values, while the logic is independent from that selection.
Reviewing.. |
Tested out the HashAttribute processor. This all worked fine:
UTF-16 is where I came unstuck:
Test input string in all cases: “hehe” NiFi CalculateAttributeHash: CyberChef (https://gchq.github.io/CyberChef/#recipe=Encode_text('UTF16BE%20(1201)')MD5()&input=aGVoZQ): I found that “UTF-16” is different because when encoding, Java adds a big-endian BOM: “When decoding, the UTF-16 charset interprets the byte-order mark at the beginning of the input stream to indicate the byte-order of the stream but defaults to big-endian if there is no byte-order mark; when encoding, it uses big-endian byte order and writes a big-endian byte-order mark.” As expected, adding the BOM changes the output bytes which are then hashed, resulting in a different hash to “UTF-16BE” encoding. Is this a problem or is this simply expected behaviour - ie. should the user realize that there will be a difference between UTF-16 and UTF-16BE encoding and the resulting hash? |
Thanks for discovering this @thenatog . This is an excellent catch. I've added behavior to catch this, better documentation, and unit tests. However, I added them on the branch that includes PR 2983. Let's mark this PR as closed and just review the other one, as it is more complete and addresses this issue.
|
Thank you for submitting a contribution to Apache NiFi.
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
For all changes:
Is there a JIRA ticket associated with this PR? Is it referenced
in the commit message?
Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
Has your PR been rebased against the latest commit within the target branch (typically master)?
Is your initial contribution a single, squashed commit?
For code changes:
For documentation related changes:
Note:
Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.