New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-5964] Allow spark-daemon.sh to support foreground operation #3881
[SPARK-5964] Allow spark-daemon.sh to support foreground operation #3881
Conversation
@hellertime would you mind filing a JIRA for this PR? |
Hi @hellertime, A really good use-case for this PR has just cropped up at #4720; I think that the |
@nchammas, it might be good to get your eyes on this one given your shell scripting expertise. |
Noticed that indentation of |
echo "failed to launch $command:" | ||
tail -2 "$log" | sed 's/^/ /' | ||
echo "full log in $log" | ||
fi | ||
else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor coding style thing, but could we add explicit comments to the else
branches to help reiterate what case it corresponds to? With the deep nesting of else
statements here, it's a little easy to get lost and not spot that this one refers to the "run in background / daemonize" case. So, maybe just write something like
else # run in background
would help.
Ok. JIRA ticket has been filed, noted ticket in the title of this PR. Happy to add the additional comment. |
echo "full log in $log" | ||
if [ "$RUN_IN_FOREGROUND" = "0" ]; then | ||
spark_rotate_log "$log" | ||
echo starting $command, logging to $log |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason this was unquoted? I think it should be
echo "starting $command, logging to $log"
otherwise there is room for word splitting bugs.
Sorry to spam this PR. Most of the word splitting issues are in code this PR didn't introduce, but since we're touching it here it's good to fix it up. Hope it's not too much @hellertime! |
Wow, very thorough review :) Let me whitelist this PR with Jenkins so that it gets tested when you push changes. |
Jenkins, this is ok to test. |
On a separate, less important note, is there an easy way to add the new command line parameter in a position-independent way? Right now it looks like |
Test build #27881 has finished for PR 3881 at commit
|
@nchammas wow, happy to make those edits. As for the long options parsing order, the approach currently used by the script (and the approach that I coopted) is limited. It would require some rewrite in order to allow for parsing the args out of order. Changes like that might be better in another PR, but it wouldn't be trouble to add them here, just a bit overreaching. |
Agreed. |
**NOTICE** Do NOT merge this, as we're waiting for #3881 to be merged. `HiveThriftServer2Suite` has been notorious for its flakiness for a while. This was mostly due to spawning and communicate with external server processes. This PR revamps this test suite for better robustness: 1. Fixes a racing condition occurred while using `tail -f` to check log file It's possible that the line we are looking for has already been printed into the log file before we start the `tail -f` process. This PR uses `tail -n +0 -f` to ensure all lines are checked. 2. Retries up to 3 times if the server fails to start In most of the cases, the server fails to start because of port conflict. This PR no longer asks the system to choose an available TCP port, but uses a random port first, and retries up to 3 times if the server fails to start. 3. A server instance is reused among all test cases within a single suite The original `HiveThriftServer2Suite` is splitted into two test suites, `HiveThriftBinaryServerSuite` and `HiveThriftHttpServerSuite`. Each suite starts a `HiveThriftServer2` instance and reuses it for all of its test cases. **TODO** - [ ] Starts the Thrift server in foreground once #3881 is merged (adding `--foreground` flag to `spark-daemon.sh`) <!-- Reviewable:start --> [<img src="https://reviewable.io/review_button.png" height=40 alt="Review on Reviewable"/>](https://reviewable.io/reviews/apache/spark/4720) <!-- Reviewable:end --> Author: Cheng Lian <lian@databricks.com> Closes #4720 from liancheng/revamp-thrift-server-tests and squashes the following commits: d6c80eb [Cheng Lian] Relaxes server startup timeout 6f14eb1 [Cheng Lian] Revamped HiveThriftServer2Suite for robustness (cherry picked from commit f816e73) Signed-off-by: Cheng Lian <lian@databricks.com>
**NOTICE** Do NOT merge this, as we're waiting for #3881 to be merged. `HiveThriftServer2Suite` has been notorious for its flakiness for a while. This was mostly due to spawning and communicate with external server processes. This PR revamps this test suite for better robustness: 1. Fixes a racing condition occurred while using `tail -f` to check log file It's possible that the line we are looking for has already been printed into the log file before we start the `tail -f` process. This PR uses `tail -n +0 -f` to ensure all lines are checked. 2. Retries up to 3 times if the server fails to start In most of the cases, the server fails to start because of port conflict. This PR no longer asks the system to choose an available TCP port, but uses a random port first, and retries up to 3 times if the server fails to start. 3. A server instance is reused among all test cases within a single suite The original `HiveThriftServer2Suite` is splitted into two test suites, `HiveThriftBinaryServerSuite` and `HiveThriftHttpServerSuite`. Each suite starts a `HiveThriftServer2` instance and reuses it for all of its test cases. **TODO** - [ ] Starts the Thrift server in foreground once #3881 is merged (adding `--foreground` flag to `spark-daemon.sh`) <!-- Reviewable:start --> [<img src="https://reviewable.io/review_button.png" height=40 alt="Review on Reviewable"/>](https://reviewable.io/reviews/apache/spark/4720) <!-- Reviewable:end --> Author: Cheng Lian <lian@databricks.com> Closes #4720 from liancheng/revamp-thrift-server-tests and squashes the following commits: d6c80eb [Cheng Lian] Relaxes server startup timeout 6f14eb1 [Cheng Lian] Revamped HiveThriftServer2Suite for robustness
Can one of the admins verify this patch? |
@hellertime have you had the chance to address those comments? This patch has mostly gone stale at this point and I would recommend that we close it for now. If you prefer feel free to open a new updated one. |
@andrewor14 I was under the impression all the shell scripts were getting refactored and that this patch had become obsolete. I agree it's best to close this out. Sent from my iPhone
|
@hellertime you will have to close it |
Yes, also most scripts are being refactored into python so there's less of a reason to do this. (Though the spark-daemon.sh ones are still in BASH). Thanks. |
Can anyone explain what happened to this PR? Why people close this PR without adding any support for running in foreground? |
It's right there at #3881 (comment) |
## What changes were proposed in this pull request? Add a SPARK_NO_DAEMONIZE environment variable flag to spark-daemon.sh that causes the process it would run to be run in the foreground. It looks like there has been some prior work in apache#3881, but there was some talk about these being refactored. I'm not sure if that happened or not, but that PR is almost 2 years old at this point so it was worth revisiting. ## How was this patch tested? ./dev/run-tests still seems to work. It doesn't look like these scripts have tests, but if I missed them just let me know. Author: Mike Ihbe <mikejihbe@gmail.com> Closes apache#15338 from mikejihbe/SPARK-11653.
## What changes were proposed in this pull request? Add a SPARK_NO_DAEMONIZE environment variable flag to spark-daemon.sh that causes the process it would run to be run in the foreground. It looks like there has been some prior work in apache#3881, but there was some talk about these being refactored. I'm not sure if that happened or not, but that PR is almost 2 years old at this point so it was worth revisiting. ## How was this patch tested? ./dev/run-tests still seems to work. It doesn't look like these scripts have tests, but if I missed them just let me know. Author: Mike Ihbe <mikejihbe@gmail.com> Closes apache#15338 from mikejihbe/SPARK-11653.
## What changes were proposed in this pull request? Add a SPARK_NO_DAEMONIZE environment variable flag to spark-daemon.sh that causes the process it would run to be run in the foreground. It looks like there has been some prior work in apache#3881, but there was some talk about these being refactored. I'm not sure if that happened or not, but that PR is almost 2 years old at this point so it was worth revisiting. ## How was this patch tested? ./dev/run-tests still seems to work. It doesn't look like these scripts have tests, but if I missed them just let me know. Author: Mike Ihbe <mikejihbe@gmail.com> Closes apache#15338 from mikejihbe/SPARK-11653.
Add
--foreground
option to spark-daemon.sh to prevent the process from daemonizing itself. Useful if running under a watchdog which waits on its child process.