-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NIFI-8123: Added support for --wait-for-init when NiFi started with t… #4748
Conversation
…he 'start' command
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the improvement @pgyori! I think this is helpful, but there are a few rough corners that I think we should try to round out. Added some comments inline. A couple of the comments are very minor. I do think it's important, though, that if NiFi fails to startup that we stop polling/waiting for the initialization to complete.
@@ -215,6 +220,12 @@ public void run() { | |||
|
|||
writeDiagnostics(socket.getOutputStream(), verbose); | |||
break; | |||
case IS_LOADED: | |||
logger.info("Received IS_LOADED request from Bootstrap"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably make this a DEBUG log. Since this is happening every 2 seconds, we don't really want to see it constantly in the logs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right, thanks! Modified it in the next commit.
echo "Exited the script due to --wait-for-init timeout" | ||
break; | ||
fi | ||
is_nifi_loaded=$( eval "cd ${NIFI_HOME} && ${run_bootstrap_cmd} is_loaded" ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If NiFi takes a while to load, the user experience can leave the use wondering if anything is happening, or if something is 'stuck', etc. Probably best to provide some feedback here. Perhaps just an echo NiFi has not fully initialized yet....
type of thing. Perhaps not every 2 seconds though. Maybe every 5 iterations (10 seconds) or so would make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea. The $( eval "... is_loaded" ) call also takes a little time (besides the 2-second sleep), so I introduced a time-based feedback "mechanism" that will write on the standard output approximately every 10 seconds.
final Logger logger = defaultLogger; | ||
final Integer port = getCurrentPort(logger); | ||
if (port == null) { | ||
logger.info("Apache NiFi is not currently running"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If NiFi fails to start, we see this in the logs. However, the --wait-for-init
results in this constantly running, logging this over and over. If we encounter a case where NiFi is not running, it should not continue waiting for it to initialize, since we know that it will never complete.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I modified the code so that when the isNiFiFullyLoaded() method runs into this if branch, the nifi.sh script will detect that NiFi hasn't been started so there is no point in waiting. This logic is built on the premise that (port == null) means that NiFi hasn't been started. However, I found that when calling 'bin/nifi.sh start --wait-for-init', the first time isNiFiFullyLoaded() is executed happens before NiFi manages to reserve itself a port. So in my current solution (next commit) in the nifi.sh script we only assume that NiFi hasn't even been started if isNiFiFullyLoaded() runs into (port == null) 3 times (2 seconds pass between each call). Of course, it can result in exiting the nifi.sh script if it takes more than 4 seconds for NiFi to reserve a port.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you consider to extract the log message into a constant as it is duplicated at 746. and 815. rows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this particular case, I would prefer to leave it like that. There are many log messages in this class that would need to be extracted as constants, and the part of the class where the constants are, would be 'littered' with the crowd of log messages. In this case, I think extracting the messages would decrease readability.
WAIT_FOR_INIT_DEFAULT_TIMEOUT increased to 15 minutes. IS_LOADED only logged on DEBUG level. The scripts sends feedback to the user about the initialization status (every 10 seconds). If NiFi fails to start, the script exits.
logger.debug("Established connection to NiFi instance."); | ||
socket.setSoTimeout(60000); | ||
|
||
logger.debug("Sending " + request + " Command to port {}", port); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you use built-in formatting instead of concatenation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right. The fix will be in my next commit.
final OutputStream socketOut = socket.getOutputStream(); | ||
|
||
final Properties nifiProps = loadProperties(logger); | ||
final String secretKey = nifiProps.getProperty("secret.key"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
secretKey hides the variable at row 124, can it cause any problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. Each running NiFi instance has 1 secret key. We should not rely on the instance variable, as the bootstrap process can (and in this wait-for-init feature it IS) started over and over again 'independently' from the running NiFi instance, so the value of the secret key needs to be loaded from the property file.
|
||
private static class NiFiNotRunningException extends Exception { | ||
@Override | ||
public Throwable fillInStackTrace() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you make this synchronized to match parent class implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thanks!
@@ -46,6 +46,7 @@ | |||
|
|||
private volatile Listener listener; | |||
private volatile ServerSocket serverSocket; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is not your modification but this one is interesting, so just pointing out: objects shouldn't be marked volatile. "Marking a mutable object field volatile means the object reference is volatile but the object itself is not, and other threads may not see updates to the object state... For mutable objects, the volatile should be removed, and some other method should be used to ensure thread-safety, such as synchronization, or ThreadLocal storage."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I know, in the cases above, it actually is the reference that needs to be volatile since the value of the reference is changed from multiple threads and we need to make sure that no matter which thread reads the object behind this reference, it gets to the current object and not an outdated previous instance.
Built-in formatting used for a log message instead of concatenation. A custom exception class' method made synchronized.
Great work @pgyori, this all looks good to me! Did some verification of corner cases such as causing nifi to die during startup, etc. All appears to work as expected. +1 merged to main. Thanks! |
Thank you, @markap14 ! |
…he 'start' command This closes apache#4748. Signed-off-by: Mark Payne <markap14@hotmail.com>
…he 'start' command This closes apache#4748. Signed-off-by: Mark Payne <markap14@hotmail.com>
…he 'start' command
https://issues.apache.org/jira/browse/NIFI-8123
Description of PR
Enables starting NiFi with "bin/nifi.sh start --wait-for-init N" (where N is a positive integer) in which case the startup script will wait (up to N seconds) for NiFi to finish loading all components before returning.
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
For all changes:
Is there a JIRA ticket associated with this PR? Is it referenced
in the commit message?
Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
Has your PR been rebased against the latest commit within the target branch (typically
main
)?Is your initial contribution a single, squashed commit? Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not
squash
or use--force
when pushing to allow for clean monitoring of changes.For code changes:
mvn -Pcontrib-check clean install
at the rootnifi
folder?LICENSE
file, including the mainLICENSE
file undernifi-assembly
?NOTICE
file, including the mainNOTICE
file found undernifi-assembly
?.displayName
in addition to .name (programmatic access) for each of the new properties?For documentation related changes:
Note:
Please ensure that once the PR is submitted, you check GitHub Actions CI for build issues and submit an update to your PR as soon as possible.