-
Notifications
You must be signed in to change notification settings - Fork 13.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-4208] Support Running Flink processes in foreground mode #2239
Conversation
nohup ignores the HUP terminal signals, so the process is still alive even after the terminal ends
@aljoscha As discussed in FLINK-4118 I am doing this to support the use case of running flink as a docker container and eventually remove the dependency on python and supervisord from the docker image. |
I hope this gets into 1.1.0 so I can push the changes to the docker image once this is available in the official binary distribution. |
Hi @iemejia, is the situation with Docker that if the Flink processes are started as daemons and the script returns that Docker assumes the process has terminated? Skipping the pid file might work fine for a container where one wouldn't start multiple TaskManagers but would cause logfile issues otherwise. We do need to add locking to the pid file. |
Hi, Yes, this is exactly the situation, in a previous pull request I was optimizing the flink docker image, however I found that the image used supervisord to catch and keep alive those daemons, so I wanted to remove this dependency (because it adds around 40MB to the image + python and some extra stuff). Can you give me some hints on the best way to address this ? Or how can I improve my current approach (notice that I took the start-foreground idea from zookeeper). |
echo "Starting $DAEMON daemon on host $HOSTNAME." | ||
$JAVA_RUN $JVM_ARGS ${FLINK_ENV_JAVA_OPTS} "${log_setting[@]}" -classpath "`manglePathList "$FLINK_TM_CLASSPATH:$INTERNAL_HADOOP_CLASSPATHS"`" ${CLASS_TO_RUN} "${ARGS[@]}" > "$out" 2>&1 < /dev/null & | ||
nohup $JAVA_RUN $JVM_ARGS ${FLINK_ENV_JAVA_OPTS} "${log_setting[@]}" -classpath "`manglePathList "$FLINK_TM_CLASSPATH:$INTERNAL_HADOOP_CLASSPATHS"`" ${CLASS_TO_RUN} "${ARGS[@]}" > "$out" 2>&1 < /dev/null & |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Under what circumstances is the nohup
necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this because I was trying to grasp what makes a daemon a daemon and I found a reference that convinced me that nohup was missing:
https://stackoverflow.com/questions/3430330/best-way-to-make-a-shell-script-daemon
Additionally when I looked for inspiration for my changes (the start-foreground name), I look
at how they started the daemon in zookeeper and I noticed they use nohup
too.
https://github.com/apache/zookeeper/blob/trunk/bin/zkServer.sh#L219
This is an extra thing and not the core of the Pull Request, if you don't agree
I can rebase and remove that commit, but I think it is worth the addition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is excellent documentation. It looks like (from a grep daemon
through the code) that Flink is daemonizing its threads internally. Perhaps @StephanEwen can look at this.
What if instead of changing how we start the daemon (so continue to always start as a background process), we instead add a If FLINK-4212 is accepted we would also need to release the file lock before calling |
I tried not to change the current daemon behavior, that's the reason why I took
|
Using |
Thanks Greg, I was probably too tired last night because I put the wait in a weird place, I just tried now and everything is working, it is still not 'real' foreground because Ctrl-C gets captured by the 'wait', but it fixes the issue for the docker image so I am going to close this pull request and the issue. |
Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions please refer to the How To Contribute guide.
In addition to going through the list, please provide a meaningful description of your changes.
mvn clean verify
has been executed successfully locally or a Travis build has passed