NIFI-543 Added annotation to indicate processor should run only on Primary Node by zenfenan · Pull Request #2509 · apache/nifi

zenfenan · 2018-03-03T16:10:19Z

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

Is there a JIRA ticket associated with this PR? Is it referenced
in the commit message?
Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
Has your PR been rebased against the latest commit within the target branch (typically master)?
Is your initial contribution a single, squashed commit?

For code changes:

Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder?
Have you written or updated unit tests to verify your changes?
If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly?
If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly?
If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties?

For documentation related changes:

Have you ensured that format looks appropriate for the output in which it is rendered?

Note:

Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.

zenfenan · 2018-03-09T15:50:38Z

@markap14 Appreciate if you could take a look :)

markap14

@zenfenan thanks for the update! This mostly looks good, but I left some comments inline. I think a few minor tweaks needed, as it looks like as-is, there are some bits that expect the notion of 'Primary Node Only' to be configurable, but this really is not a configurable thing - it is hardcoded by the Processor developer. But otherwise, it looks good!

markap14 · 2018-03-27T13:29:38Z

...fi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/audit/ProcessorAuditor.java

            if (newConfig.getExecutionNode() != null) {
                values.put(EXECUTION_NODE, processor.getExecutionNode().name());
            }
+            if (newConfig.isExecutionNodeRestricted() != null) {


I don't believe this is needed here - it's not a configurable value. It's hardcoded by the developer.

markap14 · 2018-03-27T13:37:30Z

...k/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/dao/impl/StandardProcessorDAO.java

                    configDTO.getSchedulingPeriod(),
                    configDTO.getSchedulingStrategy(),
                    configDTO.getExecutionNode(),
+                    configDTO.isExecutionNodeRestricted(),


This should not be checked, as it's not a configuration element that the user is able to configure.

markap14 · 2018-03-27T13:39:57Z

...-framework/nifi-client-dto/src/main/java/org/apache/nifi/web/api/dto/ProcessorConfigDTO.java

    private String comments;
    private String customUiUrl;
    private Boolean lossTolerant;
+    private Boolean executionNodeRestricted;


This is not something that is configurable, so I think we want to put this in the ProcessorDTO, not the ProcessorConfigDTO. This is where we store other 'flags' about a processor, such as restricted, deprecation, supportsBatching, etc, etc.

Understood. Makes more sense now. I'll move it to ProcessorDTO. One question though, I added the setter in DTOFactory.createProcessorDTO(), it has to be added to the DTOFactory.copy() as well, right? Where and how the copy() call is used?

Yes, it will need to be added to DtoFactory.copy() as well. That method is used when you copy & paste a processor for example.

markap14 · 2018-03-27T13:46:02Z

...ramework-core/src/main/java/org/apache/nifi/controller/serialization/FlowFromDOMFactory.java

        configDto.setYieldDuration(getString(element, "yieldPeriod"));
        configDto.setBulletinLevel(getString(element, "bulletinLevel"));
        configDto.setLossTolerant(getBoolean(element, "lossTolerant"));
+        configDto.setExecutionNodeRestricted(getBoolean(element, "executionNodeRestricted"));


We shouldn't be looking for an "executionNodeRestricted" element in the flow here - it won't be there because this is not a configuration element.

markap14 · 2018-03-27T13:49:22Z

@zenfenan one other note - I did notice a checkstyle violation (unused import) in StandardProcessorNode.

zenfenan · 2018-03-30T15:00:00Z

@markap14 Applied the changes and also rebased against the latest master.

markap14

@zenfenan thanks, I think we're almost there! Just a couple of minor tweaks that i recommended inline, and then I think we're good to merge. Thanks!

markap14 · 2018-04-09T19:57:21Z

...fi-framework/nifi-web/nifi-web-ui/src/main/webapp/js/nf/canvas/nf-processor-configuration.js

-                    // show the execution node option if we're cluster or we're currently configured to run on the primary node only
-                    if (nfClusterSummary.isClustered() || executionNode === 'PRIMARY') {
+                    // show the execution node option if we're clustered and execution node is not restricted to run only in primary node
+                    if (nfClusterSummary.isClustered() && executionNodeRestricted !== true) {


I think we still need the executionNode === 'PRIMARY' here:

if ((nfClusterSummary.isClustered() && executionNodeRestricted !== true) || executionNode === 'PRIMARY') {

This way, if running in standalone mode, but the processor is marked with an ExecutionNode of Primary Node (which may be the case if instantiating a template from a cluster, or if if copying a flow.xml.gz over or something like that) we still have the ability to change it to 'All Nodes'.

No. If we add executionNode === 'PRIMARY' then the UI will show the dropdown menu for ExecutionNodes. Moreover, why should we need the ability to change to All Nodes in the said scenario because it is running in standalone mode, right? Primary Node or All Nodes doesn't make sense in standalone mode. Correct?

We can make it this: if ((nfClusterSummary.isClustered() && executionNodeRestricted !== true) || (!nfClusterSummary.isClustered() && executionNode === 'PRIMARY')) but I don't understand why we are doing the executionNode === 'PRIMARY' check, for the same reason I had mentioned above.

I believe the executionNode === 'PRIMARY' was in place to ensure the currently configured value is shown. If the current value is set to PRIMARY, but this instance is no longer clustered we need to render that fact. Once the user reconfigures this value, they will no longer be able to select this option (since the node isn't clustered and executionNode would be ALL). Hope this makes sense.

Understood. So I think if ((nfClusterSummary.isClustered() && executionNodeRestricted !== true) || (!nfClusterSummary.isClustered() && executionNode === 'PRIMARY')) will do the job. Correct?

@mcgilman With the current state of the commit, if a processor previously didn't have executionNodeRestricted and the executionNode was ALL, it will still be hidden. Now I understand that shouldn't be the case. Thanks for the clarification so I think the following will address this:

var getExecutionNodeOptions = function (processor) { return [{ text: 'All nodes', value: 'ALL', description: 'Processor will be scheduled to run on all nodes', disabled: processor.executionNodeRestricted === true }, { text: 'Primary node', value: 'PRIMARY', description: 'Processor will be scheduled to run only on the primary node', disabled: !nfClusterSummary.isClustered() && processor.config['executionNode'] === 'PRIMARY' }]; };

And changing the code to show when execution-node-options gets shown in the UI to the following:

if ((nfClusterSummary.isClustered() && executionNodeRestricted !== true) || (!nfClusterSummary.isClustered() && executionNode === 'PRIMARY') || (executionNodeRestricted === true && executionNode === 'ALL')) { $('#execution-node-options').show(); } else { $('#execution-node-options').hide(); }

There are two much checks in the if but I think they are needed to address this. If that can be optimized or modified, do let me know.

@markap14 @mcgilman I've tested the above and verified it to be working as expected. Request you to review this approach and if you don't have any concerns, I'll make the changes and commit it.

@mcgilman @markap14 Bump... Any update? If you're okay, I can make the commit.

@zenfenan Sorry, I missed the mentions here. Always feel free to push subsequent commits as we're working through the PR process. It's much easier to review when I pull down the changes locally. What your suggesting above is exactly what I was thinking. I agree that the conditional is a bit confusing. I wonder if we attempt to break it up a little bit if it would be more clear. Maybe something like

if (nfClusterSummary.isClustered()) { if (executionNodeRestricted !== true || executionNode === 'ALL') { $('#execution-node-options').show(); } else { $('#execution-node-options').hide(); } } else { if (executionNode === 'PRIMARY') { $('#execution-node-options').show(); } else { $('#execution-node-options').hide(); } }

Thoughts?

Yep. It's better. I've added it and pushed the commit. Appreciate if you could test it out and confirm the changes :)

markap14 · 2018-04-09T19:57:39Z

...ork-bundle/nifi-framework/nifi-web/nifi-web-ui/src/main/webapp/js/nf/nf-processor-details.js


                    // only show the execution-node when applicable
-                    if (nfClusterSummary.isClustered() || executionNode === 'PRIMARY') {
+                    if (nfClusterSummary.isClustered() && executionNodeRestricted !== true) {


I think we need the executionNode === 'PRIMARY' to still be considered here as well.

markap14 · 2018-04-09T19:59:11Z

nifi-docs/src/main/asciidoc/developer-guide.adoc


+    - `PrimaryNodeOnly`: Apache NiFi, when clustered, offers two modes of execution for Processors: "Primary Node" and
+        "All Nodes". Although running in all the nodes offers better parallelism, some Processors are known to cause unintended
+        behaviors when run in multiple nodes. For instance, some Processors lists or reads files from remote filesystems. If such


Typo in the docs: think it should read "some Processors list or read files" rather than "lists of reads"

mcgilman · 2018-04-30T18:47:34Z

Hey @zenfenan... So I just checked out the updated PR. Things seem to be running as suggested, however, I'm wondering if it makes sense to improve it a little and in the process reduce the complexity of some of that code.

Specifically, I'm referring to when we show the Execution drop down. I just stood up a standalone instance. I dropped on two processors. One that had @PrimaryNodeOnly and one that did not. For the processor that was marked @PrimaryNodeOnly we showed the Execution drop down because it's configured value was Primary node. This verifies the existing behavior but is a little weird because previously the only way to get into this state was by having this node be part of a cluster. Now with this PR, it is the default value even in a standalone case. The other processor which was not marked with @PrimaryNodeOnly we did not show the Execution drop down. I think this distinction may be confusing to users since the context of this node previously being part of a cluster is no longer the case.

I wanted to get your thoughts on taking a slightly different approach. What if we always showed the Execution drop down and Primary node was always an allowed option. We could either remove or disable the All option conditionally based on the presence of the @PrimaryNodeOnly annotation. This would allow us to remove the really confusing conditionals that drive the visibility of the Execution field.

If we opted for this approach, we should probably update the tooltip/info icon for this field to indicate that when clustered, this drives which node(s) the processor will be scheduled on. The other benefit to this approach is that it will allow for users to build a flow on a standalone instance (including the appropriate execution nodes) before saving it to the Registry where the flow may be imported into a cluster.

markap14 · 2018-04-30T18:51:08Z

@zenfenan I think there's also one other detail that I missed. The intent here, I believe, is not just to default to Primary Node execution mode when the @PrimaryNodeOnly annotation is present, but to actually enforce that the processor always use Primary Node execution mode if it has the annotation. Is that correct? If so, then I think we need to update the setExecutionMode() method to ignore the provided value and use ExecutionMode.PRIMARY_NODE if the annotation is present. Otherwise, there is no enforcement guaranteed.

zenfenan · 2018-05-01T06:56:35Z

@mcgilman I understand the points you have made. They are valid. In particular, I like the last one which is how this might have benefit in building a flow in a standalone node with appropriate nodes and then save it to the registry.

I just have three questions:

All Nodes option should be disabled if the component is marked as PrimaryNodeOnly. That should be the case in both standalone as well as cluster scenario. Correct?
Processors which were previously added to the canvas i.e. those who were not executionNodeRestricted and had executionNode as ALL but after this change, those processors will have the All Nodes option disabled and the user will only have the ability to switch to Primary Node, post which the user will never be able to select All Nodes again. Technically that's correct since the processor is supposed to be run only on the primary node. You feel that is okay? Do you see any inconsistency in UX?
One more thing is, in general, I used to feel that Execution Nodes doesn't make sense in a standalone context so showing the dropdown still has got me thinking whether we should display it or not. But your point on showing the dropdown for all the components and in all cases (standalone or cluster) seems to make it consistent and also the understanding on how this will be well suited in actual enterprise scenarios where most enterprise customers probably use standalone node to get the flow/template ready and save it to registry and will actually import it to a cluster. The question, here, again is with respect to the UX. With this change, we will be showing the dropdown all the time for all the processors in standalone case. Do you see any issues/inconsistency with the way the user would perceive this new change?

zenfenan · 2018-05-01T07:09:24Z

@markap14 I thought about it actually but since setExecutionNode gets called from FlowController/StandardProcessorGroup but regardless of from where it gets called they all pass the value from the ProcessorConfigDTO to setExecutionNode and since this change takes care of setting the value of executionNode to PRIMARY in the DTO, I thought it will be okay. I'll still add it though to enforce it. But just out of curiosity I'm asking, do you see any specific scenarios which might make this non-guaranteed?

mcgilman · 2018-05-02T20:20:19Z

@zenfenan Yeah I think we're on the same page here. Sorry for the different suggestions earlier but I think we're ultimately getting it right here...

All nodes should be disabled if the component is marked PrimaryNodeOnly regardless of clustered or standalone. However, I think we should update the tooltip for this field to indicate that it's applicable when clustered. When not clustered, All nodes and Primary Node are effectively the same thing.
If a processor has been updated to PrimaryNodeOnly and its configuration was previously All nodes it should retain that value initially. However, given the first bullet above, that option will be disabled and would only allow the user to update it to Primary node going forward.
The field should be always visible regardless of clustered or standalone. It will allow folks to build flows which may be deployed in a cluster in the future. It also allows us to remove some of that unnecessarily complex code when determining the appropriate visibility.

Thanks

zenfenan

@mcgilman @markap14 I have made the changes as per the discussion. Please take a look.

…imary node

zenfenan

@mcgilman Ready to be merged?

mcgilman · 2018-05-08T17:07:40Z

@zenfenan I'm a +1 on the front end changes. I believe that @markap14 was reviewing the back end changes. Let's make sure he's good as well before it's merged. Thanks.

zenfenan

@markap14 It had merge conflicts. I resolved them now. If possible, please take a look. @mcgilman had mentioned that front end is fine so if everything on the backend is also good, we can close this one as it seems to be sitting in the queue for quite sometime. Thanks for taking time in reviewing this!

markap14 · 2018-05-25T17:25:57Z

@zenfenan sorry for the delay and all the back-and-forth on this one, but I've merged to master now! Many thanks for the PR and sticking with it through the end. Definitely nice to have this improvement merged in.

zenfenan force-pushed the NIFI543 branch 2 times, most recently from 7d7b3af to 53152c2 Compare March 19, 2018 11:47

markap14 requested changes Mar 27, 2018

View reviewed changes

zenfenan force-pushed the NIFI543 branch 4 times, most recently from 55f16d8 to 72dec83 Compare March 30, 2018 14:56

markap14 requested changes Apr 9, 2018

View reviewed changes

zenfenan force-pushed the NIFI543 branch from 72dec83 to 541f412 Compare April 26, 2018 15:55

zenfenan force-pushed the NIFI543 branch from 51dec5b to 5ad7cdc Compare May 3, 2018 17:37

zenfenan commented May 3, 2018

View reviewed changes

zenfenan added 5 commits May 5, 2018 15:36

NIFI-543 Added annotation to restrict processor to run only on the pr…

099fd9c

…imary node

NIFI-543: PR review changes

bafe903

NIFI-543: PR Review Fixes apache#2

18ffaf6

NIFI-543: PR Fix - 'Execution' dropdown will now be shown in all cases

f58d7ec

NIFI-543: Annotated ListGCSBucket with PrimaryNodeOnly

cc0fd7a

zenfenan force-pushed the NIFI543 branch from 5ad7cdc to cc0fd7a Compare May 5, 2018 10:09

zenfenan commented May 8, 2018

View reviewed changes

Merge branch 'master' into NIFI543

cb1cfff

zenfenan commented May 23, 2018

View reviewed changes

asfgit closed this in 0973c2d May 25, 2018

Conversation

zenfenan commented Mar 3, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

For all changes:

For code changes:

For documentation related changes:

Note:

Uh oh!

zenfenan commented Mar 9, 2018

Uh oh!

markap14 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

markap14 commented Mar 27, 2018

Uh oh!

zenfenan commented Mar 30, 2018

Uh oh!

markap14 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mcgilman Apr 25, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mcgilman commented Apr 30, 2018

Uh oh!

markap14 commented Apr 30, 2018

Uh oh!

zenfenan commented May 1, 2018

Uh oh!

zenfenan commented May 1, 2018

Uh oh!

mcgilman commented May 2, 2018

Uh oh!

zenfenan left a comment

Choose a reason for hiding this comment

Uh oh!

zenfenan left a comment

Choose a reason for hiding this comment

Uh oh!

mcgilman commented May 8, 2018

Uh oh!

zenfenan left a comment

Choose a reason for hiding this comment

zenfenan commented Mar 3, 2018 •

edited

Loading

mcgilman Apr 25, 2018 •

edited

Loading