Add github workflow to check keras spark examples are in sync (#1654)

Signed-off-by: Enrico Minack <github@enrico.minack.dev>
EnricoMi authored and tgaddair committed Jan 10, 2020
1 parent bb23be3 commit 4fc321fbee81f8fb3103dc1f5172608f265074ca
Showing with 63 additions and 0 deletions.
  1. +27 −0 .github/workflows/examples-keras-spark3-rossmann.yml
  2. +36 −0 examples/keras_spark3_rossmann.py.patch
@@ -0,0 +1,27 @@
name: Examples Keras Spark3 Sync

on: [pull_request]

jobs:
  build:

    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v1
    - name: Diffing examples/keras_spark_rossmann_run.py with examples/keras_spark3_rossmann.py
      run: |
        patch --quiet -p0 examples/keras_spark_rossmann_run.py examples/keras_spark3_rossmann.py.patch -o examples/keras_spark3_rossmann.py.from-keras_spark_rossmann_run
        if ! diff -q examples/keras_spark3_rossmann.py.from-keras_spark_rossmann_run --label examples/keras_spark_rossmann_run.py examples/keras_spark3_rossmann.py
        then
          echo
          echo "Unexpected differences are:"
          diff examples/keras_spark3_rossmann.py.from-keras_spark_rossmann_run --label examples/keras_spark_rossmann_run.py examples/keras_spark3_rossmann.py || true
          echo
          echo "Use the following as examples/keras_spark3_rossmann.py.patch to accept those changes:"
          diff examples/keras_spark_rossmann_run.py examples/keras_spark3_rossmann.py || true
          false
        fi
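
The check in this workflow amounts to: apply the committed patch to the base example, then compare the result with the Spark 3 example. The following is a minimal pure-Python sketch of that idea, not part of the commit. The names `apply_normal_diff` and `in_sync` are hypothetical, and the sketch only handles the normal diff format (commands like `32a33` or `42c43`) that plain `diff old new` prints; the workflow itself relies on the `patch` and `diff` command-line tools.

```python
import re

# Matches normal-diff commands such as "32a33", "42c43" or "492a495,517".
_COMMAND = re.compile(r"^(\d+)(?:,(\d+))?([acd])\d+(?:,\d+)?$")


def apply_normal_diff(original, patch):
    """Apply a normal-format diff to a list of lines (hypothetical helper)."""
    result = []
    consumed = 0  # number of lines of `original` already copied or dropped
    i = 0
    while i < len(patch):
        match = _COMMAND.match(patch[i])
        if match is None:
            raise ValueError(f"expected a diff command, got {patch[i]!r}")
        first = int(match.group(1))
        last = int(match.group(2) or first)
        action = match.group(3)
        i += 1

        # Collect the "<" (old) and ">" (new) lines that follow the command.
        removed, added = [], []
        while i < len(patch) and not _COMMAND.match(patch[i]):
            line = patch[i]
            if line.startswith("<"):
                removed.append(line[2:])
            elif line.startswith(">"):
                added.append(line[2:])
            elif line != "---":
                raise ValueError(f"unexpected patch line: {line!r}")
            i += 1

        if action == "a":
            # Append: keep everything up to and including line `first`.
            keep_until = first
        else:
            # Change or delete: keep everything before line `first`, and check
            # that the lines being replaced are the ones the patch expects.
            keep_until = first - 1
            if original[first - 1:last] != removed:
                raise ValueError(f"patch does not match original at line {first}")
        result.extend(original[consumed:keep_until])
        result.extend(added)
        consumed = first if action == "a" else last

    result.extend(original[consumed:])
    return result


def in_sync(base, patch, derived):
    """True if applying `patch` to `base` reproduces `derived` exactly."""
    return apply_normal_diff(base, patch) == derived
```

If `in_sync` returns false, either the base example changed in a way the patch does not cover, or the Spark 3 example was edited without updating the patch. That is the situation in which the workflow step prints the unexpected differences and fails.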
@@ -0,0 +1,36 @@
32a33
> DISCOVERY_SCRIPT = 'get_gpu_resources.sh'
42c43
< TRAINING_CLUSTER = None # or 'spark://hostname:7077'
---
> TRAINING_CLUSTER = 'local-cluster[2,1,1024]' # or 'spark://hostname:7077'
391a393
> from horovod.spark.task import get_available_devices
406c408
< config.gpu_options.visible_device_list = str(hvd.local_rank())
---
> config.gpu_options.visible_device_list = get_available_devices()[0]
492a495,517
>
> # This config will change depending on your cluster setup.
> #
> # 1. Standalone Cluster
> # - Must configure spark.worker.* configs as below.
> #
> # 2. YARN
> # - Requires YARN 3.1 or higher to support GPUs
> # - Cluster should be configured to have isolation on so that
> # multiple executors don’t see the same GPU on the same host.
> # - If you don’t have isolation then you would require a different discovery script
> # or other way to make sure that 2 executors don’t try to use same GPU.
> #
> # 3. Kubernetes
> # - Requires GPU support and isolation.
> # - Add conf.set(“spark.executor.resource.gpu.discoveryScript”, DISCOVERY_SCRIPT)
> # - Add conf.set(“spark.executor.resource.gpu.vendor”, “nvidia”)
> conf = conf.set("spark.test.home", os.environ.get('SPARK_HOME'))
> conf = conf.set("spark.worker.resource.gpu.discoveryScript", DISCOVERY_SCRIPT)
> conf = conf.set("spark.worker.resource.gpu.amount", 1)
> conf = conf.set("spark.task.resource.gpu.amount", "1")
> conf = conf.set("spark.executor.resource.gpu.amount", "1")
>
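
The comment block in the patch describes how the GPU settings vary by cluster manager. As a rough sketch, assuming the reading that the `spark.worker.*` keys apply only to standalone clusters and that the two Kubernetes keys are added on top of the task and executor amounts, the settings could be collected as below. The function name `gpu_resource_settings` is hypothetical and not part of the commit; the keys and values are copied from the patch. YARN is left out because the patch lists requirements for it rather than concrete settings.

```python
def gpu_resource_settings(cluster_manager, discovery_script):
    """Return (key, value) pairs for one GPU per task and per executor.

    Hypothetical helper: the keys and values come from the patch above,
    the grouping by cluster manager is an interpretation of its comments.
    """
    settings = [
        ("spark.task.resource.gpu.amount", "1"),
        ("spark.executor.resource.gpu.amount", "1"),
    ]
    if cluster_manager == "standalone":
        # Standalone workers discover and advertise their own GPUs.
        settings += [
            ("spark.worker.resource.gpu.discoveryScript", discovery_script),
            ("spark.worker.resource.gpu.amount", "1"),
        ]
    elif cluster_manager == "kubernetes":
        # On Kubernetes the executor runs the discovery script and names a vendor.
        settings += [
            ("spark.executor.resource.gpu.discoveryScript", discovery_script),
            ("spark.executor.resource.gpu.vendor", "nvidia"),
        ]
    else:
        raise ValueError(f"no settings sketched for {cluster_manager!r}")
    return settings
```

With an existing `SparkConf` object named `conf`, each pair would then be applied in the same way the patch does, for example `conf = conf.set(key, value)` inside a loop over `gpu_resource_settings("standalone", DISCOVERY_SCRIPT)`.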
