Add tooling to create kernel specs #8

Merged
kevin-bates merged 1 commit into main from add-tooling on Sep 2, 2022
Conversation

kevin-bates (Member)
This pull request adds tooling for creating kernel specifications that utilize the various provisioners. Each provisioner will have its own console script that knows about that provisioner's requirements (parameters, launcher, etc.), while most of the implementation lives in a base class shared by the script classes (a sketch of this layout follows the list below).

The current set of console scripts is:

  • jupyter-k8s-spec (corresponding to KubernetesProvisioner)
  • jupyter-docker-spec (corresponding to either DockerProvisioner or DockerSwarmProvisioner)
  • jupyter-yarn-spec (corresponding to YarnProvisioner)
  • jupyter-ssh-spec (corresponding to DistributedProvisioner)
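
For illustration, here is a minimal sketch of what that shared base-class layout could look like. The class names and helper methods below (BaseSpecInstaller, K8sSpecInstaller, build_kernel_spec, write_kernel_spec) are hypothetical stand-ins, not the actual names used in this PR:

# Hypothetical sketch of the shared base-class pattern; names are illustrative only.
from traitlets import Unicode
from jupyter_core.application import JupyterApp


class BaseSpecInstaller(JupyterApp):
    """Holds the parameter handling and kernel.json assembly common to all scripts."""

    kernel_name = Unicode("my_kernel", config=True, help="Name of the kernel spec directory")
    display_name = Unicode("My Kernel", config=True, help="Display name presented to users")
    language = Unicode("python", config=True, help="Kernel language")

    aliases = {
        "kernel-name": "BaseSpecInstaller.kernel_name",
        "display-name": "BaseSpecInstaller.display_name",
        "language": "BaseSpecInstaller.language",
    }

    def start(self):
        spec = self.build_kernel_spec()  # provisioner-specific pieces come from the subclass
        self.write_kernel_spec(spec)     # common installation into kernels/<kernel_name>/

    def build_kernel_spec(self) -> dict:
        raise NotImplementedError

    def write_kernel_spec(self, spec: dict) -> None:
        ...  # resolve the Jupyter data path and write kernel.json (omitted)


class K8sSpecInstaller(BaseSpecInstaller):
    """Contributes only the Kubernetes-specific argv/env/metadata."""

    def build_kernel_spec(self) -> dict:
        return {
            "display_name": self.display_name,
            "language": self.language,
            "metadata": {
                "kernel_provisioner": {"provisioner_name": "kubernetes-provisioner"}
            },
        }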

Kernel specifications are essentially assembled from the set of provided parameters. We will not provide out-of-the-box (OOTB) kernel specs as part of the release assets; instead, kernel specs can be built using this tooling.
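
Since each script is (presumably) a standard traitlets-based Jupyter application, the full set of parameters a given provisioner accepts should be discoverable through the usual help flags rather than needing separate documentation, e.g.:

$ jupyter-ssh-spec install --help-all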

Here are some example commands (and their outputs):
$ jupyter-k8s-spec install --language=Scala --spark --kernel-name=scala_k8s_spark --display-name='Scala on Kubernetes with Spark'
[I 2022-09-02 10:03:30.985 K8sSpecApp] Installing kernel specification for 'Scala on Kubernetes with Spark'
[I 2022-09-02 10:03:31.191 K8sSpecApp] Installed kernelspec scala_k8s_spark in /usr/local/share/jupyter/kernels/scala_k8s_spark

$ cat /usr/local/share/jupyter/kernels/scala_k8s_spark/kernel.json 
{
  "argv": [
    "/usr/local/share/jupyter/kernels/scala_k8s_spark/bin/run.sh",
    "--kernel-id",
    "{kernel_id}",
    "--port-range",
    "{port_range}",
    "--response-address",
    "{response_address}",
    "--public-key",
    "{public_key}",
    "--spark-context-initialization-mode",
    "lazy"
  ],
  "env": {
    "SPARK_HOME": "/opt/spark",
    "__TOREE_SPARK_OPTS__": "--master k8s://https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT} --deploy-mode cluster --name ${KERNEL_USERNAME}-${KERNEL_ID} --conf spark.kubernetes.namespace=${KERNEL_NAMESPACE} --driver-memory 2G --conf spark.kubernetes.driver.label.app=kubernetes-kernel-provider --conf spark.kubernetes.driver.label.kernel_id=${KERNEL_ID} --conf spark.kubernetes.driver.label.component=kernel --conf spark.kubernetes.executor.label.app=kubernetes-kernel-provider --conf spark.kubernetes.executor.label.kernel_id=${KERNEL_ID} --conf spark.kubernetes.executor.label.component=kernel --conf spark.kubernetes.driver.container.image=${KERNEL_IMAGE} --conf spark.kubernetes.executor.container.image=${KERNEL_EXECUTOR_IMAGE} --conf spark.kubernetes.authenticate.driver.serviceAccountName=${KERNEL_SERVICE_ACCOUNT_NAME} --conf spark.kubernetes.submission.waitAppCompletion=false  ${KERNEL_EXTRA_SPARK_OPTS}",
    "__TOREE_OPTS__": "--alternate-sigint USR2",
    "LAUNCH_OPTS": "",
    "DEFAULT_INTERPRETER": "Scala"
  },
  "display_name": "Scala on Kubernetes with Spark",
  "language": "scala",
  "interrupt_mode": "signal",
  "metadata": {
    "kernel_provisioner": {
      "provisioner_name": "kubernetes-provisioner",
      "config": {
        "image_name": "elyra/kernel-scala:dev",
        "executor_image_name": "elyra/kernel-scala:dev",
        "launch_timeout": 30
      }
    }
  }
}
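
The kernel_provisioner stanza under metadata is what jupyter_client's provisioning layer keys off of when launching the kernel. As a quick sanity check after installing, the stanza can be read back through jupyter_client's kernelspec API; a small sketch (assumes jupyter_client >= 7, where provisioner metadata is recognized):

# Read back the installed spec and its provisioner stanza via jupyter_client.
from jupyter_client.kernelspec import KernelSpecManager

ksm = KernelSpecManager()
spec = ksm.get_kernel_spec("scala_k8s_spark")
print(spec.display_name)                    # -> Scala on Kubernetes with Spark
print(spec.metadata["kernel_provisioner"])  # -> provisioner_name and config shown above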
$ jupyter-ssh-spec install --kernel-name ssh_python --remote-hosts=192.168.2.4 --remote-hosts=192.168.2.5
[I 2022-09-02 10:10:59.456 SshSpecApp] Installing kernel specification for 'Python SSH'
[I 2022-09-02 10:10:59.740 SshSpecApp] Installed kernelspec ssh_python in /usr/local/share/jupyter/kernels/ssh_python

$ cat /usr/local/share/jupyter/kernels/ssh_python/kernel.json 
{
  "argv": [
    "python",
    "/usr/local/share/jupyter/kernels/ssh_python/scripts/launch_ipykernel.py",
    "--kernel-id",
    "{kernel_id}",
    "--port-range",
    "{port_range}",
    "--response-address",
    "{response_address}",
    "--public-key",
    "{public_key}",
    "--kernel_class_name",
    "ipykernel.ipkernel.IPythonKernel"
  ],
  "env": {},
  "display_name": "Python SSH",
  "language": "python",
  "interrupt_mode": "signal",
  "metadata": {
    "debugger": true,
    "kernel_provisioner": {
      "provisioner_name": "distributed-provisioner",
      "config": {
        "launch_timeout": 30,
        "remote_hosts": [
          "192.168.2.4",
          "192.168.2.5"
        ]
      }
    }
  }
}
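
Once installed, these specs appear alongside any other kernel and can be exercised with the standard tooling; for example (standard Jupyter commands, output elided):

$ jupyter kernelspec list
$ jupyter console --kernel ssh_python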

Note that we still need to bring over the SparkOperatorProcessProxy as a provisioner and implement (most likely) a jupyter-crd-spec script.

kevin-bates added the enhancement (An improvement to an existing feature) label on Sep 2, 2022
kevin-bates merged commit 896d83f into main on Sep 2, 2022
kevin-bates deleted the add-tooling branch on September 2, 2022