This release is based off Spark 2.2 and requires Kubernetes 1.6 and up.
Major features and fixes in this release include:
- Support for HDFS locality.
- Added an option to use a secret to mount small files in driver and executors.
- Support for custom Kubernetes service account for the driver pod.
- Support for executor java options.
- Fixed conversion from MB to MiB in driver and executor memory specification.
- Added configuration properties for injecting arbitrary Kubernetes secrets into the driver and executors.
- Use a headless service to give a hostname to the driver (requiring Kubernetes DNS in the cluster).
- Improved docker image build/push flow.
- Added the ability to fail submission if submitter-local files are provided without the resource staging server URI.
- Added reference YAML files for RBAC configs for driver and shuffle service.
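Several of the features above are driven by submission-time configuration. The following is a hedged sketch of a cluster-mode submission exercising the custom service account, secret injection, and executor Java options; the property names follow this project's documentation, while the API server address, secret name, mount path, class, and jar path are placeholders:

```sh
bin/spark-submit \
  --deploy-mode cluster \
  --master k8s://https://<k8s-apiserver-host>:<port> \
  --class org.example.MyApp \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-driver-sa \
  --conf spark.kubernetes.driver.secrets.my-secret=/etc/secrets \
  --conf spark.kubernetes.executor.secrets.my-secret=/etc/secrets \
  --conf spark.executor.extraJavaOptions="-XX:+PrintGCDetails" \
  local:///opt/app/my-app.jar
```

Here `spark.kubernetes.driver.secrets.[SecretName]` (and the executor analogue) mounts the named Kubernetes secret at the given path inside the pod.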
Deprecations and removals:
- Removed support for `spark.kubernetes.executor.annotations`, which was deprecated. It has been superseded by new properties; see the documentation for details.
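As an illustration of the migration, the deprecated comma-separated list becomes one property per annotation key (the per-key property name is assumed from the project documentation; the key and value are placeholders):

```properties
# Deprecated and removed in this release:
spark.kubernetes.executor.annotations=environment=staging

# Superseded by one property per annotation key:
spark.kubernetes.executor.annotation.environment=staging
```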
Spark-on-Kubernetes release rebased from the Apache Spark 2.2 branch
Features available with this release include:
- Cluster-mode submission of Spark jobs to a Kubernetes cluster
- Support for Scala, Java and PySpark
- Static and Dynamic Allocation for Executors
- Automatic staging of local resources onto Driver and Executor pods
- Configurable security and credential management
- HDFS, running on the Kubernetes cluster or externally
- Launch jobs using kubectl proxy
- Support for Kubernetes 1.6 - 1.7
- Pre-built docker images
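A minimal cluster-mode submission against a Kubernetes cluster, sketched from the project documentation, looks like the following; the API server address is a placeholder, and the image tags and jar version are assumptions that should be matched to the actual release artifacts:

```sh
bin/spark-submit \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  --master k8s://https://<k8s-apiserver-host>:<port> \
  --kubernetes-namespace default \
  --conf spark.executor.instances=5 \
  --conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.2.0-kubernetes-0.5.0 \
  --conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.2.0-kubernetes-0.5.0 \
  local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0.jar
```

The `local://` scheme indicates the jar is already present inside the Docker image, so no resource staging is needed.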
This is a bug-fix release and contains the following changes:
- Support specifying CPU cores and memory limits for the driver (#340)
- Generate the application ID label irrespective of app name. (#331)
- Create base-image and minimize layer count (#324)
- Added log4j config for k8s unit tests. (#314)
- Use node affinity to launch executors on preferred nodes, benefiting from data locality (#316)
- New API for custom labels and annotations. (#346)
- Allow the Spark driver to find shuffle pods in a specified namespace (#357)
- Bypass init-containers when possible (#348)
- Config for a hard CPU limit on pods; default unlimited (#356)
- Allow number of executor cores to have fractional values (#361)
- Python Bindings for launching PySpark Jobs from the JVM (#364)
- Submission client redesign to use a step-based builder pattern (#365)
- Add node selectors for driver and executor pods (#355)
- Retry binding server to random port in the resource staging server test. (#378)
- Set RestartPolicy=Never for executors (#367)
- Read classpath entries from SPARK_EXTRA_CLASSPATH on executors. (#383)
- Changes to support executor recovery behavior during static allocation. (#244)
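Several of the changes above (#355, #356, #361) surface as configuration properties. A hedged sketch of how they might be combined in `spark-defaults.conf`; the property names are taken from the project documentation, and the node label and values are placeholders:

```properties
# Fractional executor cores (#361): request half a CPU per executor
spark.executor.cores=0.5

# Hard CPU limit on executor pods (#356); unlimited if unset
spark.kubernetes.executor.limit.cores=1

# Node selector (#355): schedule driver and executor pods onto labeled nodes
spark.kubernetes.node.selector.disktype=ssd
```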
First beta release of Spark with Kubernetes support.
Based off Spark upstream at version 2.1.0
- File staging server for local files
- Dynamic allocation of executors
- Stability and bug fixes
Known limitations:
- Applications can only run in cluster mode.
- Only Scala and Java applications can be run.
- No high availability (HA).