
Support for Kubernetes #1599

Merged
merged 17 commits into master from feature/kubernetes on Jul 17, 2018

Conversation

@javierluraschi
Member

commented on Jul 14, 2018

Add support for Kubernetes:

sc <- spark_connect(config = spark_config_kubernetes("k8s://192.168.99.100:8443"))

[Screenshot: 2018-07-15 at 10:06:59 PM]
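
Once connected, the connection behaves like any other sparklyr connection. A minimal sketch of a sanity check, not part of this PR and assuming the dplyr package is installed:

library(sparklyr)
library(dplyr)

# copy a small local data frame to the cluster and run a trivial aggregation
cars_tbl <- copy_to(sc, mtcars, "mtcars_k8s", overwrite = TRUE)
cars_tbl %>% count(cyl)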

Work Items

  • Use deterministic ports to allow Kubernetes port forwarding.
  • Set up port forwarding automatically.
  • Simplify the required spark_config() options.

Minikube

Use minikube to run Kubernetes locally with sparklyr as follows:

  • Install kubernetes-cli and minikube; on OS X you can run: brew install kubernetes-cli minikube.
  • Run minikube start as described in the quickstart.
  • Ensure enough resources are assigned to minikube; for instance, run minikube stop, configure the VirtualBox VM to use 4 CPUs and 4 GB of memory, then rerun minikube start.
  • Get the master address by running kubectl cluster-info.
  • Create a sparklyr folder in SPARK_HOME and copy the sparklyr jars from system.file("java", package = "sparklyr") into it (see the R sketch after this list).
  • Modify the Spark kubernetes/dockerfiles/spark/Dockerfile to create an /opt/sparklyr path and copy the sparklyr jars into it, for instance by adding mkdir -p /opt/sparklyr && \ and COPY sparklyr /opt/sparklyr.
  • To prevent the Forbidden! Configured service account doesn't have access error, create a service account: kubectl create serviceaccount spark and kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default.
  • Generate the cluster Docker images by running the following from SPARK_HOME: ./bin/docker-image-tool.sh -m -t sparklyr build. You might hit an error; a workaround is available in this post.
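
The jar-copy step can also be scripted from R; a minimal sketch, assuming SPARK_HOME is set in the environment and sparklyr is installed locally:

# locate the jars shipped with the installed sparklyr package
jars <- dir(system.file("java", package = "sparklyr"), pattern = "\\.jar$", full.names = TRUE)

# create the sparklyr folder inside SPARK_HOME and copy the jars into it
target <- file.path(Sys.getenv("SPARK_HOME"), "sparklyr")
dir.create(target, showWarnings = FALSE)
file.copy(jars, target, overwrite = TRUE)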

Notes

Notice that spark_config_kubernetes() is just a convenience function that could be replaced by something like:

sc <- spark_connect(
    master = "k8s://https://192.168.99.100:8443",
    config = list(
        sparklyr.shell.master = "k8s://https://192.168.99.100:8443",
        "sparklyr.shell.deploy-mode" = "cluster",
        sparklyr.gateway.remote = TRUE,
        sparklyr.shell.name = "sparklyr",
        sparklyr.shell.class = "sparklyr.Shell",
        sparklyr.shell.conf = c(
            "spark.kubernetes.container.image=spark:sparklyr",
            "spark.kubernetes.driver.pod.name=spark-driver",
            "spark.kubernetes.authenticate.driver.serviceAccountName=spark"
        ),
        sparklyr.app.jar = "local:///opt/sparklyr/sparklyr-2.3-2.11.jar",
        sparklyr.events.aftersubmit = function() {
            # wait for pods to launch
            Sys.sleep(15)
            # configure port forwarding
            system2(
              "kubectl",
              c("port-forward", "spark-driver", "8880:8880", "8881:8881", "4040:4040"),
              wait = FALSE
            )
        }
    ),
    spark_home = spark_home_dir()
)
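
When you are done, disconnect and clean up the driver pod, which can remain on the cluster in a Completed state; a hedged sketch, assuming the spark-driver pod name from the config above:

# close the connection and remove the driver pod left on the cluster
spark_disconnect(sc)
system2("kubectl", c("delete", "pod", "spark-driver"))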

@javierluraschi changed the title from WIP: Support for Kubernetes to Support for Kubernetes on Jul 16, 2018

@javierluraschi force-pushed the feature/kubernetes branch from 870120f to 91cc45b on Jul 17, 2018

@javierluraschi merged commit 0a7da4b into master on Jul 17, 2018

2 of 4 checks passed:

  • continuous-integration/appveyor/branch: AppVeyor build failed
  • continuous-integration/appveyor/pr: AppVeyor build failed
  • continuous-integration/travis-ci/pr: The Travis CI build passed
  • continuous-integration/travis-ci/push: The Travis CI build passed
@javierluraschi referenced this pull request on Jul 18, 2018