NIFI-13779: Missing Some Data Provenance Events from Python #9292
exceptionfactory merged 1 commit into apache:main
Force-pushed from cbbecaa to b8e0c57
exceptionfactory left a comment
Thanks for proposing this change @bobpaulin. This seems to be a better way to handle the lineage of input FlowFiles, reflecting CONTENT_MODIFIED instead of cloning. However, tagging @markap14 for additional review and consideration.
markap14 left a comment
Hey @bobpaulin thanks for updating! I do think this makes a lot of sense. I had 2 thoughts about the PR though. I commented inline about the name of the transformed and originalCloned variables. Normally I try not to quibble over variable names, but given how frequently they're used in the code I think it's helpful to use clear names.
The other thought is that the Provenance is now going to always show a CLONE event followed by a DROP event if original is auto-terminated, which is the default and very common. But we just recently added a new method to ProcessContext: boolean isAutoTerminated(Relationship relationship);
Perhaps we should not even clone the FlowFile at all if the original FlowFile is auto-terminated. That way, there's no CLONE or DROP event, and the lineage is much clearer.
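To illustrate the suggestion above, here is a minimal, self-contained sketch of the "skip the clone when original is auto-terminated" idea. It is not the code from this PR: `Relationship`, `Context`, and `Session` are simplified stand-ins invented for this example (FlowFiles are plain strings), and the toy session just records event names. The only detail taken from the discussion is the `isAutoTerminated(Relationship)` check.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for illustration only; these are NOT the real org.apache.nifi classes.
class AutoTerminateSketch {
    enum Relationship { SUCCESS, ORIGINAL }

    // Mirrors the single ProcessContext method mentioned in the review.
    interface Context {
        boolean isAutoTerminated(Relationship relationship);
    }

    // Toy session that records a provenance-like event name for each operation.
    static class Session {
        final List<String> events = new ArrayList<>();
        private final Context context;

        Session(Context context) {
            this.context = context;
        }

        String clone(String flowFile) {
            events.add("CLONE");
            return flowFile + "-clone";
        }

        void write(String flowFile) {
            events.add("CONTENT_MODIFIED");
        }

        void transfer(String flowFile, Relationship relationship) {
            // In this toy model, transferring to an auto-terminated relationship drops the FlowFile.
            if (context.isAutoTerminated(relationship)) {
                events.add("DROP");
            }
        }
    }

    static List<String> onTrigger(Context context, Session session, String flowFile) {
        // Only clone when something downstream will actually receive the original.
        String clone = context.isAutoTerminated(Relationship.ORIGINAL) ? null : session.clone(flowFile);

        session.write(flowFile); // transform the input FlowFile itself
        session.transfer(flowFile, Relationship.SUCCESS);

        if (clone != null) {
            session.transfer(clone, Relationship.ORIGINAL);
        }
        return session.events;
    }

    // Hypothetical helper: runs one trigger with "original" auto-terminated or not.
    static List<String> run(boolean originalAutoTerminated) {
        Context context = rel -> rel == Relationship.ORIGINAL && originalAutoTerminated;
        return onTrigger(context, new Session(context), "flowFile-1");
    }

    public static void main(String[] args) {
        System.out.println(run(true));
        System.out.println(run(false));
    }
}
```

In this toy model, the auto-terminated case records neither a CLONE nor a DROP, which is the clearer lineage described above. When original is routed somewhere, the clone is still created and transferred.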
public void onTrigger(final ProcessContext context, final ProcessSession session) throws ProcessException {
    FlowFile original = session.get();
    if (original == null) {
    FlowFile transformed = session.get();
It feels weird to me to call the FlowFile that we are pulling from an input queue transformed. Perhaps we should name the variable simply flowFile? And then we can call the clone just simply clone rather than originalCloned? I think that would make the code a little easier to read personally.
I agree with you @markap14, I think just calling this flowFile would be simpler and avoid some confusion.
Thanks for pinging @exceptionfactory. I do agree with the approach but think we should clean it up a bit more, as noted above.
Force-pushed from b8e0c57 to 9e27915
Thanks @markap14 and @exceptionfactory as always for the feedback. This cleaned up nicely with the usage of
* Use cloned flow file as original
* Transform inbound flow file to ensure it gets picked up in Provenance Events
Force-pushed from 9e27915 to c874c6f
exceptionfactory left a comment
Thanks for the input @markap14, and thanks for making the updates @bobpaulin, the latest version looks good! +1 merging
…pache#9292)
- Transform input FlowFile instead of cloned FlowFile
Signed-off-by: David Handermann <exceptionfactory@apache.org>
Summary
NIFI-13779
Tracking
Please complete the following tracking steps prior to pull request creation.
Issue Tracking
Pull Request Tracking
NIFI-00000
NIFI-00000
Pull Request Formatting
main branch
Verification
Please indicate the verification steps performed prior to pull request creation.
Build
mvn clean install -P contrib-check
Licensing
LICENSE and NOTICE files
Documentation