
[SPARK-24534][K8S] Bypass non spark-on-k8s commands #21572

Closed
wants to merge 3 commits

Conversation

rimolive
Contributor

What changes were proposed in this pull request?

This PR changes entrypoint.sh to provide an option to run commands other than the spark-on-k8s ones (init, driver, executor), so that users can keep their normal workflow without hacking the image to bypass the entrypoint.
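
For illustration, a minimal sketch of the behavior being added (this is not the actual entrypoint.sh; the spark-class invocation is reduced to a placeholder):

#!/bin/bash
# Sketch: recognized spark-on-k8s commands build a Spark command as before,
# while anything else is executed verbatim instead of causing an error exit.
SPARK_K8S_CMD="$1"
case "$SPARK_K8S_CMD" in
  driver | executor | init)
    shift 1
    CMD=("$SPARK_HOME/bin/spark-class" "$@")  # placeholder for the real command construction
    ;;
  *)
    CMD=("$@")  # pass-through: run whatever the user asked for
    ;;
esac
exec "${CMD[@]}"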

How was this patch tested?

This patch was built manually on my local machine, and I ran some tests with a combination of docker run commands.
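
For example, checks along these lines exercise both the pass-through and the empty-command paths (the image name here is hypothetical):

# an arbitrary command should run unchanged through the entrypoint
docker run --rm spark-k8s:test uname -a
# no command at all should also fall through to pass-through mode
docker run --rm spark-k8s:test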

@erikerlandson
Contributor

Thanks @rimolive! Can you please prepend [SPARK-24534][K8S] to the title of this PR?

@erikerlandson
Contributor

@rimolive I think you may need to rebase this onto the latest head of master to pick up the pyspark updates

@rimolive rimolive changed the title Bypass non spark-on-k8s commands [SPARK-24534][K8S] Bypass non spark-on-k8s commands Jun 15, 2018
@rimolive
Contributor Author

@erikerlandson I swear I did a rebase before pushing. I'll double-check this.

@erikerlandson
Contributor

@rimolive there are some significant differences between the head of master and your file. I'm wondering if you should get the head of master and re-edit from there, since rebasing didn't seem to sync it.

CMD=(
  "$SPARK_HOME/bin/spark-class"
  "org.apache.spark.deploy.k8s.SparkPodInitContainer"
  "$@"
)


Any place in this entrypoint where "$@" is referenced by a spark-on-k8s command still needs the shift that was taken out on line 44, correct? For the unknown-command case we just want a pass-through, but the main case statement historically expects the spark-on-k8s command to be stripped. The easiest/clearest way to preserve this might just be an extra case statement like this at line 44:

case "$SPARK_K8S_CMD" in
    driver | executor | init)
       shift 1
       ;;
esac


@tmckayus tmckayus left a comment


I think we are missing the shift for spark-on-k8s commands

@@ -38,10 +38,10 @@ fi

 SPARK_K8S_CMD="$1"
 if [ -z "$SPARK_K8S_CMD" ]; then
-  echo "No command to execute has been provided." 1>&2
-  exit 1
+  echo "No command to execute has been provided. Ignoring spark-on-k8s workflow..." 1>&2
Contributor


I'd propose:
"No SPARK_K8S_CMD provided: proceeding in pass-through mode..."

Contributor Author


Good idea. This message is better.

@@ -110,8 +110,7 @@ case "$SPARK_K8S_CMD" in
     ;;

   *)
-    echo "Unknown command: $SPARK_K8S_CMD" 1>&2
-    exit 1
+    CMD=("$@")
Contributor


Should log a message here too, about "executing in pass-through mode", since this is the guaranteed code path for pass-through.

Contributor Author


+1, I'll make the change.

@@ -38,10 +38,10 @@ fi

SPARK_K8S_CMD="$1"
Contributor


I see two possible pass-through conditions here: one is "empty SPARK_K8S_CMD" and the other is "SPARK_K8S_CMD is non-empty but holds a non-spark command". Is that the convention, or is the pass-through case always expected to be "empty SPARK_K8S_CMD"?

Contributor Author


This is handled by the case block, because pass-through mode is used either when SPARK_K8S_CMD is empty or when it holds a non spark-on-k8s command. I tested both scenarios.
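
A standalone sketch (not the entrypoint itself) showing how a * arm catches both conditions:

for cmd in "" driver /usr/libexec/s2i/assembly.sh; do
  case "$cmd" in
    driver | driver-py | executor)
      echo "spark-on-k8s command: $cmd"
      ;;
    *)
      echo "pass-through: '$cmd'"
      ;;
  esac
done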


ack, I concur, this is the way the script works historically

Contributor

@erikerlandson erikerlandson Jun 15, 2018


And it does the right thing w.r.t. shift in both cases?
If it is non-empty but also not one of the three spark commands, shift removes the first argument and then it falls into the pass-through case, so isn't "$@" then missing the first arg?


@erikerlandson yes, after a shift the leading arg is gone from "$@"
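
For reference, a standalone illustration of that shift behavior (not code from the PR):

set -- driver --conf foo=bar   # simulate the container arguments
echo "$@"                      # prints: driver --conf foo=bar
shift 1
echo "$@"                      # prints: --conf foo=bar ("driver" is gone)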

Contributor


@tmckayus that seems wrong, since the CMD that gets executed after the case is now missing the first element. Am I missing something?


@tmckayus tmckayus left a comment


I think the else case is not sufficient to handle the logic for shift

-  exit 1
+  echo "No command to execute has been provided. Ignoring spark-on-k8s workflow..." 1>&2
+else
+  shift 1

@tmckayus tmckayus Jun 15, 2018


this doesn't quite work; the -z test is effectively checking only whether $1 was empty or not.
If it's non-empty, but it is not a recognized spark-on-k8s command (i.e. driver, driver-py, or executor), it's a pass-through command and therefore we cannot shift anything. As it is, this would consume something like "/usr/libexec/s2i/assembly.sh" and make it disappear.

Personally, I would do something like this and take an early out in the unsupported case, skipping all the other environment processing:

case "$SPARK_K8S_CMD" in
    driver | driver-py | executor)
       shift 1
       ;;
    *)
       echo "No SPARK_K8S_CMD provided: proceeding in pass-through mode..."
       exec /sbin/tini -s -- "$@"
       ;;
esac

"")
;;
*)
echo "No SPARK_K8S_CMD provided: proceeding in pass-through mode..."
Contributor


Overall this looks good to me. Looking at this simplified logic, the logging message should probably be more like:

"Non-spark-k8s command provided, proceeding in pass-through mode..."

@erikerlandson
Contributor

ok to test

Contributor

@erikerlandson erikerlandson left a comment


LGTM, pending testing

@erikerlandson
Contributor

please test this

@SparkQA

SparkQA commented Jun 15, 2018

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/214/

@SparkQA

SparkQA commented Jun 15, 2018

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/215/

@felixcheung
Member

Jenkins, test this please

@SparkQA

SparkQA commented Jun 16, 2018

Test build #91965 has finished for PR 21572 at commit 4867a2d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 16, 2018

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/234/

@ifilonenko
Contributor

The error from QA is unrelated to this PR. We will need to wait for the PRB to be reconfigured for this to be properly tested. What integration tests have you run to exercise this change?


@tmckayus tmckayus left a comment


lgtm @rimolive

@erikerlandson
Contributor

Jenkins, test this please

@SparkQA

SparkQA commented Jun 19, 2018

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/328/

@erikerlandson
Contributor

The testing environment is synced again and it's passing, so I'm going to merge.
