NIFI-543 Added annotation to indicate processor should run only on Primary Node #2509
zenfenan wants to merge 6 commits into apache:master
@markap14 I'd appreciate it if you could take a look :)
7d7b3af to 53152c2
markap14 left a comment
@zenfenan thanks for the update! This mostly looks good, but I left some comments inline. I think a few minor tweaks are needed: as it stands, some bits expect the notion of 'Primary Node Only' to be configurable, but this really is not a configurable thing; it is hardcoded by the Processor developer. But otherwise, it looks good!
if (newConfig.getExecutionNode() != null) {
    values.put(EXECUTION_NODE, processor.getExecutionNode().name());
}
if (newConfig.isExecutionNodeRestricted() != null) {
I don't believe this is needed here - it's not a configurable value. It's hardcoded by the developer.
configDTO.getSchedulingPeriod(),
configDTO.getSchedulingStrategy(),
configDTO.getExecutionNode(),
configDTO.isExecutionNodeRestricted(),
This should not be checked, as it's not a configuration element that the user is able to configure.
private String comments;
private String customUiUrl;
private Boolean lossTolerant;
private Boolean executionNodeRestricted;
This is not something that is configurable, so I think we want to put this in the ProcessorDTO, not the ProcessorConfigDTO. This is where we store other 'flags' about a processor, such as restricted, deprecation, supportsBatching, etc, etc.
Understood. Makes more sense now. I'll move it to ProcessorDTO. One question though: I added the setter in DtoFactory.createProcessorDTO(), and it has to be added to DtoFactory.copy() as well, right? Where and how is the copy() call used?
Yes, it will need to be added to DtoFactory.copy() as well. That method is used when you copy & paste a processor for example.
configDto.setYieldDuration(getString(element, "yieldPeriod"));
configDto.setBulletinLevel(getString(element, "bulletinLevel"));
configDto.setLossTolerant(getBoolean(element, "lossTolerant"));
configDto.setExecutionNodeRestricted(getBoolean(element, "executionNodeRestricted"));
We shouldn't be looking for an "executionNodeRestricted" element in the flow here - it won't be there because this is not a configuration element.
@zenfenan one other note - I did notice a checkstyle violation (unused import) in StandardProcessorNode.
55f16d8 to 72dec83
@markap14 Applied the changes and also rebased against the latest master.
// show the execution node option if we're cluster or we're currently configured to run on the primary node only
if (nfClusterSummary.isClustered() || executionNode === 'PRIMARY') {
// show the execution node option if we're clustered and execution node is not restricted to run only in primary node
if (nfClusterSummary.isClustered() && executionNodeRestricted !== true) {
I think we still need the executionNode === 'PRIMARY' here:
if ((nfClusterSummary.isClustered() && executionNodeRestricted !== true) || executionNode === 'PRIMARY') {
This way, if running in standalone mode, but the processor is marked with an ExecutionNode of Primary Node (which may be the case if instantiating a template from a cluster, or if copying a flow.xml.gz over or something like that), we still have the ability to change it to 'All Nodes'.
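The scenario described above can be checked in isolation. The following is only a sketch: `showWithPrimaryFallback` is a hypothetical helper, and it takes the cluster state as a plain boolean parameter instead of calling `nfClusterSummary.isClustered()`, so that it runs outside the NiFi UI.

```javascript
// Hypothetical pure-function form of the suggested condition.
// Assumption: isClustered stands in for nfClusterSummary.isClustered().
function showWithPrimaryFallback(isClustered, executionNodeRestricted, executionNode) {
    return (isClustered && executionNodeRestricted !== true) || executionNode === 'PRIMARY';
}

// Standalone instance whose processor was saved with PRIMARY
// (for example, instantiated from a template built on a cluster).
const standalonePrimary = showWithPrimaryFallback(false, false, 'PRIMARY');

// Standalone instance with the default ALL value.
const standaloneAll = showWithPrimaryFallback(false, false, 'ALL');

// Clustered instance with a restricted processor already set to PRIMARY.
const clusteredRestrictedPrimary = showWithPrimaryFallback(true, true, 'PRIMARY');

console.log({ standalonePrimary, standaloneAll, clusteredRestrictedPrimary });
```

With this form the option is shown for the standalone PRIMARY case and hidden for the standalone ALL case. It is also shown for a restricted processor on a cluster whenever its value is PRIMARY, because the second clause does not look at the cluster state.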
No. If we add executionNode === 'PRIMARY', the UI will show the dropdown menu for ExecutionNodes. Moreover, why would we need the ability to change to All Nodes in that scenario, given that it is running in standalone mode? Primary Node or All Nodes doesn't make sense in standalone mode. Correct?
We can make it this: if ((nfClusterSummary.isClustered() && executionNodeRestricted !== true) || (!nfClusterSummary.isClustered() && executionNode === 'PRIMARY')) but I don't understand why we are doing the executionNode === 'PRIMARY' check, for the same reason I had mentioned above.
I believe the executionNode === 'PRIMARY' was in place to ensure the currently configured value is shown. If the current value is set to PRIMARY, but this instance is no longer clustered we need to render that fact. Once the user reconfigures this value, they will no longer be able to select this option (since the node isn't clustered and executionNode would be ALL). Hope this makes sense.
Understood. So I think if ((nfClusterSummary.isClustered() && executionNodeRestricted !== true) || (!nfClusterSummary.isClustered() && executionNode === 'PRIMARY')) will do the job. Correct?
@mcgilman With the current state of the commit, if a processor previously didn't have executionNodeRestricted and the executionNode was ALL, it will still be hidden. Now I understand that shouldn't be the case. Thanks for the clarification. I think the following will address this:
var getExecutionNodeOptions = function (processor) {
    return [{
        text: 'All nodes',
        value: 'ALL',
        description: 'Processor will be scheduled to run on all nodes',
        disabled: processor.executionNodeRestricted === true
    }, {
        text: 'Primary node',
        value: 'PRIMARY',
        description: 'Processor will be scheduled to run only on the primary node',
        disabled: !nfClusterSummary.isClustered() && processor.config['executionNode'] === 'PRIMARY'
    }];
};
And changing the code that decides when execution-node-options is shown in the UI to the following:
if ((nfClusterSummary.isClustered() && executionNodeRestricted !== true) ||
        (!nfClusterSummary.isClustered() && executionNode === 'PRIMARY') ||
        (executionNodeRestricted === true && executionNode === 'ALL')) {
    $('#execution-node-options').show();
} else {
    $('#execution-node-options').hide();
}
There are too many checks in the if, but I think they are needed to address this. If that can be optimized or modified, do let me know.
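Since the condition has only three inputs, its behaviour can be listed exhaustively. The sketch below is illustrative only: `showFlat` and `enumerateFlat` are hypothetical names, and the cluster state is passed in as a boolean instead of being read from `nfClusterSummary`.

```javascript
// Hypothetical pure-function form of the three-clause condition above.
function showFlat(isClustered, restricted, executionNode) {
    return (isClustered && restricted !== true) ||
        (!isClustered && executionNode === 'PRIMARY') ||
        (restricted === true && executionNode === 'ALL');
}

// Build one row per combination of the three inputs (2 x 2 x 2 = 8 rows).
function enumerateFlat() {
    const rows = [];
    for (const isClustered of [true, false]) {
        for (const restricted of [true, false]) {
            for (const executionNode of ['ALL', 'PRIMARY']) {
                rows.push({
                    isClustered,
                    restricted,
                    executionNode,
                    visible: showFlat(isClustered, restricted, executionNode)
                });
            }
        }
    }
    return rows;
}

// Keep only the combinations where the option would be hidden.
const hiddenCases = enumerateFlat().filter(row => !row.visible);
console.log(hiddenCases);
```

Of the eight combinations, this condition hides the option in two: a restricted processor already set to PRIMARY on a cluster, and an unrestricted processor set to ALL on a standalone instance.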
@zenfenan Sorry, I missed the mentions here. Always feel free to push subsequent commits as we're working through the PR process. It's much easier to review when I pull down the changes locally. What you're suggesting above is exactly what I was thinking. I agree that the conditional is a bit confusing. I wonder whether it would be clearer if we broke it up a little. Maybe something like
if (nfClusterSummary.isClustered()) {
    if (executionNodeRestricted !== true || executionNode === 'ALL') {
        $('#execution-node-options').show();
    } else {
        $('#execution-node-options').hide();
    }
} else {
    if (executionNode === 'PRIMARY') {
        $('#execution-node-options').show();
    } else {
        $('#execution-node-options').hide();
    }
}
Thoughts?
Yep. It's better. I've added it and pushed the commit. Appreciate if you could test it out and confirm the changes :)
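The nested form and the earlier three-clause condition can be compared input by input. This is a sketch under stated assumptions: both functions are hypothetical pure-function rewrites of the snippets in this thread, with the cluster state passed as a boolean.

```javascript
// Hypothetical rewrite of the three-clause condition proposed earlier in the thread.
function showFlatForm(isClustered, restricted, executionNode) {
    return (isClustered && restricted !== true) ||
        (!isClustered && executionNode === 'PRIMARY') ||
        (restricted === true && executionNode === 'ALL');
}

// Hypothetical rewrite of the nested if/else suggested as the clearer alternative.
function showNestedForm(isClustered, restricted, executionNode) {
    if (isClustered) {
        return restricted !== true || executionNode === 'ALL';
    }
    return executionNode === 'PRIMARY';
}

// Collect every input combination on which the two forms disagree.
function findDifferences() {
    const differences = [];
    for (const isClustered of [true, false]) {
        for (const restricted of [true, false]) {
            for (const executionNode of ['ALL', 'PRIMARY']) {
                const flat = showFlatForm(isClustered, restricted, executionNode);
                const nested = showNestedForm(isClustered, restricted, executionNode);
                if (flat !== nested) {
                    differences.push({ isClustered, restricted, executionNode, flat, nested });
                }
            }
        }
    }
    return differences;
}

const differences = findDifferences();
console.log(differences);
```

The two forms agree on seven of the eight combinations. They differ only for a restricted processor set to ALL on a standalone instance, where the three-clause condition shows the option and the nested form hides it.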
// only show the execution-node when applicable
if (nfClusterSummary.isClustered() || executionNode === 'PRIMARY') {
if (nfClusterSummary.isClustered() && executionNodeRestricted !== true) {
I think we need the executionNode === 'PRIMARY' to still be considered here as well.
- `PrimaryNodeOnly`: Apache NiFi, when clustered, offers two modes of execution for Processors: "Primary Node" and
"All Nodes". Although running in all the nodes offers better parallelism, some Processors are known to cause unintended
behaviors when run in multiple nodes. For instance, some Processors lists or reads files from remote filesystems. If such
Typo in the docs: I think it should read "some Processors list or read files" rather than "lists or reads".
Hey @zenfenan... So I just checked out the updated PR. Things seem to be running as suggested, however, I'm wondering if it makes sense to improve it a little and in the process reduce the complexity of some of that code. Specifically, I'm referring to when we show the Execution drop down. I just stood up a standalone instance. I dropped on two processors. One that had I wanted to get your thoughts on taking a slightly different approach. What if we always showed the Execution drop down and If we opted for this approach, we should probably update the tooltip/info icon for this field to indicate that when clustered, this drives which node(s) the processor will be scheduled on. The other benefit to this approach is that it will allow for users to build a flow on a standalone instance (including the appropriate execution nodes) before saving it to the Registry where the flow may be imported into a cluster.
@zenfenan I think there's also one other detail that I missed. The intent here, I believe, is not just to default to Primary Node execution mode when the @PrimaryNodeOnly annotation is present, but to actually enforce that the processor always use Primary Node execution mode if it has the annotation. Is that correct? If so, then I think we need to update the setExecutionMode() method to ignore the provided value and use ExecutionMode.PRIMARY_NODE if the annotation is present. Otherwise, there is no enforcement guaranteed.
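The difference between defaulting and enforcing can be modelled in a few lines. This is a language-neutral illustration written in JavaScript, not NiFi code: the real processor node is a Java class, and `ProcessorNodeModel`, its `primaryNodeOnly` flag, and the 'ALL' default are all assumptions made for the sketch.

```javascript
// Hypothetical model of a processor node; not the NiFi implementation.
class ProcessorNodeModel {
    constructor(primaryNodeOnly) {
        // Assumption: this flag stands in for the annotation being present on the processor class.
        this.primaryNodeOnly = primaryNodeOnly;
        // Assumption: 'ALL' is used as the default for processors without the annotation.
        this.executionNode = primaryNodeOnly ? 'PRIMARY' : 'ALL';
    }

    // Enforcement: ignore the requested value when the processor is primary-node-only.
    setExecutionNode(requested) {
        this.executionNode = this.primaryNodeOnly ? 'PRIMARY' : requested;
    }
}

const annotated = new ProcessorNodeModel(true);
annotated.setExecutionNode('ALL');      // request is ignored

const unannotated = new ProcessorNodeModel(false);
unannotated.setExecutionNode('PRIMARY'); // request is honoured

console.log(annotated.executionNode, unannotated.executionNode);
```

Without the check inside the setter, the annotated processor would only start out on the primary node and any later update could switch it to all nodes, which is the gap the comment above points out.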
@mcgilman I understand the points you have made. They are valid. In particular, I like the last one which is how this might have benefit in building a flow in a standalone node with appropriate nodes and then save it to the registry. I just have three questions:
@markap14 I thought about it actually but since
@zenfenan Yeah I think we're on the same page here. Sorry for the different suggestions earlier but I think we're ultimately getting it right here...
Thanks
zenfenan left a comment
@markap14 It had merge conflicts. I have resolved them now. If possible, please take a look. @mcgilman had mentioned that the front end is fine, so if everything on the backend is also good, we can close this one, as it seems to have been sitting in the queue for quite some time. Thanks for taking the time to review this!
@zenfenan sorry for the delay and all the back-and-forth on this one, but I've merged to master now! Many thanks for the PR and sticking with it through the end. Definitely nice to have this improvement merged in.