diff --git a/ppml/trusted-big-data-ml/scala/docker-occlum/README.md b/ppml/trusted-big-data-ml/scala/docker-occlum/README.md
index 49634ad6b04..af73b979d20 100644
--- a/ppml/trusted-big-data-ml/scala/docker-occlum/README.md
+++ b/ppml/trusted-big-data-ml/scala/docker-occlum/README.md
@@ -1,6 +1,5 @@
 # Trusted Big Data ML with Occlum
-
 ## Prerequisites
 
 Pull image from dockerhub.
@@ -110,7 +109,6 @@ Enlarge these four configurations in [run_spark_on_occlum_glibc.sh](https://gith
 .resource_limits.max_num_of_threads = 4096 |
 .process.default_heap_size = "4096MB" |
 .resource_limits.kernel_space_heap_size="4096MB" |
-.process.default_mmap_size = "81920MB" |
 ```
 
 Then build the docker image:
@@ -188,7 +186,6 @@ Enlarge these four configurations in [run_spark_on_occlum_glibc.sh](https://gith
 .resource_limits.max_num_of_threads = 4096 |
 .process.default_heap_size = "32GB" |
 .resource_limits.kernel_space_heap_size="2GB" |
-.process.default_mmap_size = "24GB" |
 ```
 
 Then build the docker image:
diff --git a/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/README.md b/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/README.md
index 57e96fce729..16ab8f35bbe 100644
--- a/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/README.md
+++ b/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/README.md
@@ -18,31 +18,67 @@ cd ..
 bash build-docker-image.sh
 ```
 
-2. Download [Spark 3.1.2](https://archive.apache.org/dist/spark/spark-3.1.2/spark-3.1.2-bin-hadoop2.7.tgz), and setup `SPARK_HOME`. Or set `SPARK_HOME` in `run_spark_pi.sh`.
-3. Modify `${kubernetes_master_url}` to your k8s master url in the `run_spark_pi.sh `
-4. Modify `executor.yaml` for your need
+2. Download [Spark 3.1.2](https://archive.apache.org/dist/spark/spark-3.1.2/spark-3.1.2-bin-hadoop2.7.tgz), and set up `SPARK_HOME`.
+3. Run `export kubernetes_master_url=your_k8s_master`, or replace `${kubernetes_master_url}` with your Kubernetes master URL in `run_spark_xxx.sh`.
+4. Modify `driver.yaml` and `executor.yaml` for your applications.
 
-## Run Spark executor in Occlum:
+## Examples
 
-### Run SparkPi example
+### SparkPi example
 
 ```bash
 ./run_spark_pi.sh
 ```
 
-### Run Spark ML LogisticRegression example
+```yaml
+#driver.yaml
+  env:
+  - name: DRIVER_MEMORY
+    value: "500m"
+  - name: SGX_MEM_SIZE
+    value: "1GB"
+```
+
+```yaml
+#executor.yaml
+  env:
+  - name: SGX_MEM_SIZE
+    value: "1GB"
+```
+
+### Spark ML LogisticRegression example
 
 ```bash
 ./run_spark_lr.sh
 ```
 
-### Run Spark ML GradientBoostedTreeClassifier example
+```yaml
+#driver.yaml
+  env:
+  - name: DRIVER_MEMORY
+    value: "2g"
+  - name: SGX_MEM_SIZE
+    value: "4GB"
+  - name: SGX_THREAD
+    value: "128"
+```
+
+```yaml
+#executor.yaml
+  env:
+  - name: SGX_MEM_SIZE
+    value: "4GB"
+  - name: SGX_THREAD
+    value: "128"
+```
+
+### Spark ML GradientBoostedTreeClassifier example
 
 ```bash
 ./run_spark_gbt.sh
 ```
 
-### Run Spark SQL SparkSQL example
+### Spark SQL example
 
 ```bash
 ./run_spark_sql.sh
@@ -67,8 +103,7 @@ Parameters:
 * num_round : Int
 * path_to_model_to_be_saved : String.
 
-  After training, you can find xgboost model in folder `/tmp/path_to_model_to_be_saved`.
-
+After training, you can find the XGBoost model in the folder `/tmp/path_to_model_to_be_saved`.
 
 #### Criteo 1TB Click Logs [dataset](https://ailab.criteo.com/download-criteo-1tb-click-logs-dataset/)
 
@@ -77,6 +112,7 @@ Then change the `class` in [script](https://github.com/intel-analytics/BigDL/blo
 `com.intel.analytics.bigdl.dllib.examples.nnframes.xgboost.xgbClassifierTrainingExampleOnCriteoClickLogsDataset`.
 
 Add these configurations to [script](https://github.com/intel-analytics/BigDL/blob/main/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_xgboost.sh):
+
 ```bash
 --conf spark.driver.extraClassPath=local:///opt/spark/jars/* \
 --conf spark.executor.extraClassPath=local:///opt/spark/jars/* \
@@ -94,17 +130,21 @@
 --executor-memory 10g \
 --driver-memory 10g
 ```
+
 Change the `parameters` to:
+
 ```commandline
 /host/data/xgboost_data /host/data/xgboost_criteo_model 32 100 10
 ```
+
 Then:
+
 ```bash
 ./run_spark_xgboost.sh
 ```
 
 Parameters:
 
-* path_to_Criteo_data : String.
+* path_to_Criteo_data : String. For example, if your host path to the Criteo dataset is `/tmp/xgboost_data/criteo`, then this parameter in `run_spark_xgboost.sh` is `/host/data/xgboost_data`.
 
 * path_to_model_to_be_saved : String.
@@ -115,18 +155,16 @@ Parameters:
 * num_round : Int
 * max_depth: Int. Tree max depth.
 
-**note: make sure num_threads is larger than spark.task.cpus.**
+**Note: make sure `num_threads` is larger than `spark.task.cpus`.**
 
 #### Source code
 You can find source code [here](https://github.com/intel-analytics/BigDL/tree/main/scala/dllib/src/main/scala/com/intel/analytics/bigdl/dllib/example/nnframes/xgboost).
 
 ### Run Spark TPC-H example
 
-Modify the following configuration in `executor.yaml`.
+Modify the following configuration in `driver.yaml` and `executor.yaml`.
 
 ```yaml
-imagePullPolicy: Always
-
 env:
 - name: SGX_THREAD
   value: "256"
@@ -134,8 +172,6 @@ env:
   value: "2GB"
 - name: SGX_KERNEL_HEAP
   value: "2GB"
-- name: SGX_MMAP
-  value: "16GB"
 ```
 
 Then run the script.
@@ -145,6 +181,5 @@ Then run the script.
 ```
 
 ## How to debug
-Modify the `--conf spark.kubernetes.sgx.log.level=off \` to one of `off, error, warn, debug, info, and trace`
-in `run_spark_xx.sh`.
+Modify `--conf spark.kubernetes.sgx.log.level=off \` in `run_spark_xx.sh` to one of `off`, `error`, `warn`, `debug`, `info`, or `trace`.
 
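For reference, a minimal sketch of how the debug knob above combines with the new `driver.yaml`/`executor.yaml` templates in one submission. The master URL, deploy mode, app name, and `SparkPi` class here are assumptions based on a typical SparkPi run, not quoted from `run_spark_pi.sh`; the scripts in this patch remain authoritative:

```bash
# Hypothetical debug run of SparkPi; adjust the master URL and jar to your cluster.
export kubernetes_master_url=your_k8s_master   # placeholder, as in the README

${SPARK_HOME}/bin/spark-submit \
    --master k8s://${kubernetes_master_url} \
    --deploy-mode cluster \
    --name spark-pi-debug \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.kubernetes.container.image=intelanalytics/bigdl-ppml-trusted-big-data-ml-scala-occlum:2.1.0-SNAPSHOT \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
    --conf spark.kubernetes.driver.podTemplateFile=./driver.yaml \
    --conf spark.kubernetes.executor.podTemplateFile=./executor.yaml \
    --conf spark.kubernetes.sgx.log.level=debug \
    local:/opt/spark/examples/jars/spark-examples_2.12-3.1.2.jar
```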
diff --git a/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/driver.yaml b/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/driver.yaml
new file mode 100644
index 00000000000..49a3d92fddb
--- /dev/null
+++ b/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/driver.yaml
@@ -0,0 +1,45 @@
+apiVersion: v1
+kind: Pod
+metadata:
+  name: spark-deployment
+  namespace: default
+spec:
+  containers:
+  - name: spark-example
+    image: intelanalytics/bigdl-ppml-trusted-big-data-ml-scala-occlum:2.1.0-SNAPSHOT
+    imagePullPolicy: Never
+    volumeMounts:
+    - name: sgx-enclave
+      mountPath: /dev/sgx/enclave
+    - name: sgx-provision
+      mountPath: /dev/sgx/provision
+    - name: aesm
+      mountPath: /var/run/aesmd
+    - name: data-exchange
+      mountPath: /opt/occlum_spark/data
+    securityContext:
+      privileged: true
+    env:
+    - name: DRIVER_MEMORY
+      value: "5g"
+    - name: SGX_MEM_SIZE
+      value: "12GB"
+    - name: SGX_THREAD
+      value: "128"
+    - name: SGX_HEAP
+      value: "512MB"
+    - name: SGX_KERNEL_HEAP
+      value: "1GB"
+  volumes:
+  - name: sgx-enclave
+    hostPath:
+      path: /dev/sgx_enclave
+  - name: sgx-provision
+    hostPath:
+      path: /dev/sgx_provision
+  - name: aesm
+    hostPath:
+      path: /var/run/aesmd
+  - name: data-exchange
+    hostPath:
+      path: /tmp
diff --git a/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_gbt.sh b/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_gbt.sh
index 11abdc4f048..52039e36909 100644
--- a/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_gbt.sh
+++ b/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_gbt.sh
@@ -11,7 +11,7 @@ ${SPARK_HOME}/bin/spark-submit \
     --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
     --conf spark.kubernetes.executor.podNamePrefix="sparkgbt" \
     --conf spark.kubernetes.executor.deleteOnTermination=false \
-    --conf spark.kubernetes.driver.podTemplateFile=./executor.yaml \
+    --conf spark.kubernetes.driver.podTemplateFile=./driver.yaml \
     --conf spark.kubernetes.executor.podTemplateFile=./executor.yaml \
     --conf spark.kubernetes.sgx.log.level=off \
     --jars local:/opt/spark/examples/jars/scopt_2.12-3.7.1.jar \
diff --git a/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_lr.sh b/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_lr.sh
index c45ee6614bc..617045020ce 100644
--- a/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_lr.sh
+++ b/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_lr.sh
@@ -11,7 +11,7 @@ ${SPARK_HOME}/bin/spark-submit \
     --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
     --conf spark.kubernetes.executor.podNamePrefix="sparklr" \
     --conf spark.kubernetes.executor.deleteOnTermination=false \
-    --conf spark.kubernetes.driver.podTemplateFile=./executor.yaml \
+    --conf spark.kubernetes.driver.podTemplateFile=./driver.yaml \
     --conf spark.kubernetes.executor.podTemplateFile=./executor.yaml \
     --conf spark.kubernetes.sgx.log.level=off \
     --jars local:/opt/spark/examples/jars/scopt_2.12-3.7.1.jar \
diff --git a/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_pi.sh b/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_pi.sh
index e256e3ca73b..eb625f55dfb 100644
--- a/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_pi.sh
+++ b/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_pi.sh
@@ -10,7 +10,7 @@ ${SPARK_HOME}/bin/spark-submit \
     --conf spark.kubernetes.container.image=intelanalytics/bigdl-ppml-trusted-big-data-ml-scala-occlum:2.1.0-SNAPSHOT \
     --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
     --conf spark.kubernetes.executor.deleteOnTermination=false \
-    --conf spark.kubernetes.driver.podTemplateFile=./executor.yaml \
+    --conf spark.kubernetes.driver.podTemplateFile=./driver.yaml \
     --conf spark.kubernetes.executor.podTemplateFile=./executor.yaml \
     --conf spark.kubernetes.sgx.log.level=off \
     local:/opt/spark/examples/jars/spark-examples_2.12-3.1.2.jar
diff --git a/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_sql.sh b/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_sql.sh
index f9f509ae0ae..0c7857c845a 100644
--- a/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_sql.sh
+++ b/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_sql.sh
@@ -11,7 +11,7 @@ ${SPARK_HOME}/bin/spark-submit \
     --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
     --conf spark.kubernetes.executor.podNamePrefix="sparksql" \
     --conf spark.kubernetes.executor.deleteOnTermination=false \
-    --conf spark.kubernetes.driver.podTemplateFile=./executor.yaml \
+    --conf spark.kubernetes.driver.podTemplateFile=./driver.yaml \
     --conf spark.kubernetes.executor.podTemplateFile=./executor.yaml \
     --conf spark.kubernetes.sgx.log.level=off \
     --jars local:/opt/spark/examples/jars/scopt_2.12-3.7.1.jar \
diff --git a/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_tpch.sh b/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_tpch.sh
index 3e44db7c166..bd9bba8efb2 100644
--- a/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_tpch.sh
+++ b/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_tpch.sh
@@ -10,7 +10,7 @@ ${SPARK_HOME}/bin/spark-submit \
     --conf spark.kubernetes.container.image.pullPolicy="IfNotPresent" \
     --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
     --conf spark.kubernetes.executor.deleteOnTermination=false \
-    --conf spark.kubernetes.driver.podTemplateFile=./executor.yaml \
+    --conf spark.kubernetes.driver.podTemplateFile=./driver.yaml \
     --conf spark.kubernetes.executor.podTemplateFile=./executor.yaml \
     --conf spark.kubernetes.file.upload.path=file:///tmp \
     --conf spark.kubernetes.executor.podNamePrefix="sparktpch" \
diff --git a/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_xgboost.sh b/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_xgboost.sh
index 98593b80775..0904f8a8b2b 100644
--- a/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_xgboost.sh
+++ b/ppml/trusted-big-data-ml/scala/docker-occlum/kubernetes/run_spark_xgboost.sh
@@ -10,7 +10,7 @@ ${SPARK_HOME}/bin/spark-submit \
     --conf spark.kubernetes.container.image=intelanalytics/bigdl-ppml-trusted-big-data-ml-scala-occlum:2.1.0-SNAPSHOT \
     --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
     --conf spark.kubernetes.executor.deleteOnTermination=false \
-    --conf spark.kubernetes.driver.podTemplateFile=./executor.yaml \
+    --conf spark.kubernetes.driver.podTemplateFile=./driver.yaml \
     --conf spark.kubernetes.executor.podTemplateFile=./executor.yaml \
     --conf spark.kubernetes.file.upload.path=file:///tmp \
     --conf spark.kubernetes.sgx.log.level=off \
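After any of these scripts is submitted, a quick way to confirm that the driver pod was actually created from `driver.yaml` is to inspect its labels and environment. This is a hedged sketch using the standard `spark-role` labels that Spark on Kubernetes applies to driver and executor pods; nothing below is added by this patch:

```bash
# List the driver pod; Spark on Kubernetes labels driver pods with spark-role=driver.
kubectl get pods -l spark-role=driver

# Check that the SGX-related env vars from driver.yaml landed in the pod spec
# (replace <driver-pod> with the pod name printed above).
kubectl get pod <driver-pod> -o yaml | grep -A1 -E "SGX_MEM_SIZE|SGX_THREAD|DRIVER_MEMORY"

# Tail the driver log; with spark.kubernetes.sgx.log.level raised above off,
# SGX messages appear here.
kubectl logs -f <driver-pod>
```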