[SPARK-24534][K8S] Bypass non spark-on-k8s commands #21572
Conversation
Thanks @rimolive! Can you please prepend …
@rimolive I think you may need to rebase this onto the latest head of master, to pick up the pyspark updates.
@erikerlandson I swear I did a rebase before pushing. I'll double-check this.
Force-pushed from 787b904 to ec0ae36
@rimolive there are some significant differences between the head of master and your file. I'm wondering if you should take the head of master and re-edit from there, because rebasing didn't seem to sync it.
CMD=(
  "$SPARK_HOME/bin/spark-class"
  "org.apache.spark.deploy.k8s.SparkPodInitContainer"
  "$@"
Any place in this entrypoint where $@ is referenced by a spark-on-k8s command still needs the shift that was taken out on line 44, correct? For the unknown-command case we just want a pass-through, but the main case statement historically expects the spark-on-k8s command to have been stripped. The easiest/clearest way to preserve this might be an extra case statement like this at line 44:
case "$SPARK_K8S_CMD" in
driver | executor | init)
shift 1
;;
esac
I think we are missing the shift for spark-on-k8s commands.
Force-pushed from ec0ae36 to e4acc44
@@ -38,10 +38,10 @@ fi

 SPARK_K8S_CMD="$1"
 if [ -z "$SPARK_K8S_CMD" ]; then
-  echo "No command to execute has been provided." 1>&2
-  exit 1
+  echo "No command to execute has been provided. Ignoring spark-on-k8s workflow..." 1>&2
I'd propose:
"No SPARK_K8S_CMD provided: proceeding in pass-through mode..."
Good idea. This message is better.
@@ -110,8 +110,7 @@ case "$SPARK_K8S_CMD" in
   ;;

   *)
-    echo "Unknown command: $SPARK_K8S_CMD" 1>&2
-    exit 1
+    CMD=("$@")
Should log a message here too, about "executing in pass-through mode", since this is the guaranteed code path for pass-through.
+1, I'll make the change.
@@ -38,10 +38,10 @@ fi

 SPARK_K8S_CMD="$1"
I see two possible pass-through conditions here: one is "empty SPARK_K8S_CMD" and the other is "SPARK_K8S_CMD is non-empty but holds a non-spark command". Is that the convention, or is the pass-through case always expected to be "empty SPARK_K8S_CMD"?
This is handled by the case block, because pass-through mode is used either when SPARK_K8S_CMD is empty or when it holds a non-spark-on-k8s command. I tested both scenarios.
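In sketch form, both conditions land on non-spark arms of the case (illustrative only, not the verbatim patch):

case "$SPARK_K8S_CMD" in
  driver | driver-py | executor)
    shift 1        # recognized spark-on-k8s command: strip it, keep its args
    ;;
  "")              # empty SPARK_K8S_CMD: pass-through
    ;;
  *)               # non-empty but unrecognized: also pass-through
    ;;
esac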
Ack, I concur; this is the way the script has worked historically.
And it does the right thing w.r.t. shift in both cases? If it is non-empty but also not one of the three spark commands, shift removes the first argument, then it falls into the "pass-through" case, but then is "$@" missing the first arg?
@erikerlandson yes, after a shift the leading arg is gone from "$@".
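A quick standalone shell illustration of that point (a demo, not from the patch):

set -- driver --conf foo   # simulate "$1"="driver", "$2"="--conf", "$3"="foo"
echo "$@"                  # prints: driver --conf foo
shift 1
echo "$@"                  # prints: --conf foo  (the leading arg is gone)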
@tmckayus that seems wrong, since the CMD that gets executed after the case is now missing its first element. Am I missing something?
I think the else branch is not sufficient to handle the shift logic.
-  exit 1
+  echo "No command to execute has been provided. Ignoring spark-on-k8s workflow..." 1>&2
+else
+  shift 1
This doesn't quite work; the -z test effectively checks only whether $1 was empty.
If it's non-empty but not a recognized spark-on-k8s command (i.e. driver, driver-py, or executor), it's a pass-through command and therefore we cannot shift anything. As it is, this would consume something like "/usr/libexec/s2i/assembly.sh" and make it disappear.
Personally, I would do something like this and take an early out in the unsupported case, skipping all the other environment processing:
case "$SPARK_K8S_CMD" in
driver | driver-py | executor)
shift 1
;;
*)
echo "No SPARK_K8S_CMD provided: proceeding in pass-through mode..."
exec /sbin/tini -s -- "$@"
;;
esac
"") | ||
;; | ||
*) | ||
echo "No SPARK_K8S_CMD provided: proceeding in pass-through mode..." |
Overall this looks good to me. Looking at this simplified logic, the logging message should probably be more like:
"Non-spark-k8s command provided, proceeding in pass-through mode..."
ok to test
LGTM, pending testing
please test this
Kubernetes integration test starting
Kubernetes integration test status failure
Kubernetes integration test starting
Kubernetes integration test status success
Jenkins, test this please
Kubernetes integration test starting
Test build #91965 has finished for PR 21572 at commit …
Kubernetes integration test status failure
Error from QA is unrelated to this PR. We will need to wait for the PRB to be reconfigured for this to be properly tested. What integration tests have you run to validate this change?
lgtm @rimolive
Jenkins, test this please
Kubernetes integration test starting
Kubernetes integration test status success
The testing environment is synced again and it's passing, so I'm going to merge.
What changes were proposed in this pull request?
This PR changes entrypoint.sh to provide an option to run commands other than the spark-on-k8s ones (init, driver, executor), so users can keep their normal workflow without hacking the image to bypass the entrypoint.
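Reconstructed from the hunks above, the tail of entrypoint.sh ends up looking roughly like this; the per-command CMD construction is simplified and the tini path is an assumption, so treat it as a sketch rather than the merged file:

SPARK_K8S_CMD="$1"
case "$SPARK_K8S_CMD" in
  driver | driver-py | executor | init)
    shift 1                                    # strip the spark-on-k8s command; "$@" keeps its args
    CMD=("$SPARK_HOME/bin/spark-class" "$@")   # simplified: each command builds its own CMD
    ;;
  *)                                           # empty or unrecognized: pass-through
    CMD=("$@")
    ;;
esac
exec /sbin/tini -s -- "${CMD[@]}"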
How was this patch tested?
This patch was built manually on my local machine and I ran some tests with a combination of docker run commands.
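For reference, hedged examples of the kind of docker run checks that exercise both paths (the image name is a placeholder, not from the PR):

# Pass-through mode: the entrypoint should exec the user command verbatim.
docker run --rm my-spark-image /usr/libexec/s2i/assembly.sh

# Normal spark-on-k8s mode: the entrypoint should still dispatch the driver path.
docker run --rm my-spark-image driver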