[ML-50] Merge #47 and prepare for OAP 1.1 (#51)
* [ML-35] Restrict printNumericTable to first 10 eigenvalues with first 20 dimensions (#36)

* restrict PCA printNumericTable to first 10 eigenvalues with first 20 dimensions

* fix ALS printNumericTable

* [ML-44] [PIP] Update to oneAPI 2021.2 and Rework examples for validation (#47)

* remove hibench examples

* Fix tbb linking

* Add data

* Add env.sh.template

* Revise examples

* Add ALS scala and modify als-pyspark.py

* nit

* Add build-all & run-all

* remove test-cluster/workloads and use examples for validation

* add setup-python3 and setup-cluster, fix paths

* fix java home

* add config ssh

* add config ssh

* add config ssh

* add config ssh

* fix config-ssh

* fix config-ssh

* fix config-ssh

* fix config-ssh

* fix config-ssh

* fix config-ssh

* set strict modes no

* clear out comments

* Update oneCCL and oneDAL to oneAPI 2021.2.0, don't build oneCCL from source

* nit

* Fix install oneapi and source setvars

* nit

* Add spark.driver.host

* Add ci-build

* nit

* Update

* Update

* Add --ccl-configuration=cpu_icc

* Update

* Update

* revert to build oneCCL from source and package related so

* nit

* nit

* Add ci-test-cluster

* update

* update

* update

* update

* update

* Add check: OneCCL doesn't support loopback IP

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update README

* update

* update

* update

* update

* Update README and nit changes
xwu99 committed Apr 15, 2021
1 parent f637bfc commit 0158e96
Showing 54 changed files with 761 additions and 1,092 deletions.
27 changes: 8 additions & 19 deletions .github/workflows/oap-mllib-ci.yml
@@ -11,32 +11,21 @@ jobs:
- name: Set up JDK 1.8
uses: actions/setup-java@v1
with:
java-version: 1.8
java-version: 1.8
- name: Restore cached dependencies
uses: actions/cache@v2
with:
path: |
# /var/cache/apt/archives/*.deb
~/.m2/repository
~/downloads
/opt/intel/inteloneapi
/opt/intel/oneapi
~/opt
key: ${{ runner.os }}-${{ hashFiles('**/pom.xml', '{{github.workspace}}/dev/install-build-deps-ubuntu.sh') }}
restore-keys: |
${{ runner.os }}-
- name: Set up dependencies
run: |
[ -d ~/downloads ] || mkdir ~/downloads
cd ~/downloads
[ -f spark-3.0.0-bin-hadoop2.7.tgz ] || wget http://archive.apache.org/dist/spark/spark-3.0.0/spark-3.0.0-bin-hadoop2.7.tgz
[ -d spark-3.0.0-bin-hadoop2.7 ] || cd ~ && tar -zxf downloads/spark-3.0.0-bin-hadoop2.7.tgz
export SPARK_HOME=~/spark-3.0.0-bin-hadoop2.7
${{github.workspace}}/dev/install-build-deps-ubuntu.sh
- name: Set up environments
run: |
source ${{github.workspace}}/dev/setup-all.sh
- name: Build and Test
run: |
cd ${{github.workspace}}/mllib-dal
export ONEAPI_ROOT=/opt/intel/oneapi
source /opt/intel/oneapi/dal/latest/env/vars.sh
source /opt/intel/oneapi/tbb/latest/env/vars.sh
source /tmp/oneCCL/build/_install/env/setvars.sh
# temp disable and will enable for new release of oneCCL
#./build.sh
run: |
${{github.workspace}}/dev/ci-test.sh
55 changes: 36 additions & 19 deletions README.md
@@ -11,13 +11,13 @@ For those algorithms that are not accelerated by OAP MLlib, the original Spark M

## Online Documentation

You can find the all the OAP MLlib documents on the [project web page](https://oap-project.github.io/oap-mllib/).
You can find all the OAP MLlib documents on the [project web page](https://oap-project.github.io/oap-mllib).

## Getting Started

### Java/Scala Users Preferred

Use a pre-built OAP MLlib JAR to get started. You can firstly download OAP package from [OAP-JARs-Tarball](https://github.com/Intel-bigdata/OAP/releases/download/v1.0.0-spark-3.0.0/oap-1.0.0-bin-spark-3.0.0.tar.gz) and extract this Tarball to get `oap-mllib-x.x.x-with-spark-x.x.x.jar` under `oap-1.0.0-bin-spark-3.0.0/jars`.
Use a pre-built OAP MLlib JAR to get started. You can first download the OAP package from [OAP-JARs-Tarball](https://github.com/Intel-bigdata/OAP/releases/download/v1.1.0-spark-3.0.0/oap-1.1.0-bin-spark-3.0.0.tar.gz) and extract the tarball to get `oap-mllib-x.x.x-with-spark-x.x.x.jar` under `oap-1.1.0-bin-spark-3.0.0/jars`.

Then you can refer to the following [Running](#running) section to try out.

@@ -58,24 +58,31 @@ spark.executor.extraClassPath ./oap-mllib-x.x.x-with-spark-x.x.x.jar

### Sanity Check

To use K-means example for sanity check, you need to upload a data file to your HDFS and change related variables in `run.sh` of kmeans example. Then run the following commands:
#### Setup `env.sh`
```
$ cd oap-mllib/examples/kmeans
$ ./build.sh
$ ./run.sh
$ cd conf
$ cp env.sh.template env.sh
```
Edit the related variables in the "`Minimum Settings`" section of `env.sh`.
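As a sketch, a filled-in "Minimum Settings" block might look like the following; the version number and all paths are illustrative assumptions, not shipped defaults:

```shell
# Hypothetical values for a single-node YARN setup -- adjust to your cluster
OAP_MLLIB_VERSION=1.1.0
SPARK_MASTER=yarn
export HADOOP_HOME=/opt/hadoop
export SPARK_HOME=/opt/spark
export HDFS_ROOT=hdfs://localhost:8020
export OAP_MLLIB_ROOT=$HOME/oap-mllib

# The example scripts derive the JAR name from the version set above
OAP_MLLIB_JAR_NAME=oap-mllib-$OAP_MLLIB_VERSION.jar
echo $OAP_MLLIB_JAR_NAME
```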

### Benchmark with HiBench
Use [Hibench](https://github.com/Intel-bigdata/HiBench) to generate dataset with various profiles, and change related variables in `run-XXX.sh` script when applicable. Then run the following commands:
#### Upload example data files to HDFS
```
$ cd oap-mllib/examples/kmeans-hibench
$ cd examples
$ hadoop fs -mkdir -p /user/$USER
$ hadoop fs -copyFromLocal data
$ hadoop fs -ls data
```
#### Run K-means

```
$ cd examples/kmeans
$ ./build.sh
$ ./run-hibench-oap-mllib.sh
$ ./run.sh
```

### PySpark Support

As PySpark-based applications call their Scala couterparts, they shall be supported out-of-box. An example can be found in the [Examples](#examples) section.
As PySpark-based applications call their Scala counterparts, they are supported out of the box. Examples can be found in the [Examples](#examples) section.
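For instance, a PySpark example can be submitted like any ordinary Spark job, with the OAP MLlib JAR on the driver and executor classpaths as configured earlier. The command below is only a sketch: the JAR path and script name are placeholder assumptions standing in for your actual paths.

```shell
# Assemble a hypothetical spark-submit command; paths are placeholders
OAP_MLLIB_JAR_NAME=oap-mllib-x.x.x-with-spark-x.x.x.jar
OAP_MLLIB_JAR=/path/to/$OAP_MLLIB_JAR_NAME
CMD="spark-submit --master yarn \
  --conf spark.driver.extraClassPath=$OAP_MLLIB_JAR \
  --conf spark.executor.extraClassPath=./$OAP_MLLIB_JAR_NAME \
  --jars $OAP_MLLIB_JAR \
  kmeans-pyspark.py"
echo "$CMD"
```

No changes to the PySpark script itself are needed; the accelerated implementation is picked up through the classpath.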

## Building

@@ -86,7 +93,8 @@ We use [Apache Maven](https://maven.apache.org/) to manage and build source code
* JDK 8.0+
* Apache Maven 3.6.2+
* GNU GCC 4.8.5+
* Intel® oneAPI Toolkits 2021.1.1 Components:
* Intel® oneAPI Toolkits 2021.2+ Components:
- DPC++/C++ Compiler (dpcpp/clang++)
- Data Analytics Library (oneDAL)
- Threading Building Blocks (oneTBB)
* [Open Source Intel® oneAPI Collective Communications Library (oneCCL)](https://github.com/oneapi-src/oneCCL)
@@ -95,7 +103,7 @@ Intel® oneAPI Toolkits and its components can be downloaded and installed from [h

More details about oneAPI can be found [here](https://software.intel.com/content/www/us/en/develop/tools/oneapi.html).

You can also refer to [this script and comments in it](https://github.com/Intel-bigdata/OAP/blob/branch-1.0-spark-3.x/oap-mllib/dev/install-build-deps-centos.sh) to install correct oneAPI version and manually setup the environments.
You can refer to [this script](dev/install-build-deps-centos.sh) to install correct dependencies.

Scala and Java dependency descriptions are already included in Maven POM file.

@@ -107,7 +115,7 @@ To clone and build from open source oneCCL, run the following commands:
```
$ git clone https://github.com/oneapi-src/oneCCL
$ cd oneCCL
$ git checkout beta08
$ git checkout 2021.2
$ mkdir build && cd build
$ cmake ..
$ make -j install
@@ -138,30 +146,39 @@ CCL_ROOT | Path to oneCCL home directory
We suggest sourcing the `setvars.sh` script into your current shell to set up the build environment, as follows:

```
$ source /opt/intel/inteloneapi/setvars.sh
$ source /opt/intel/oneapi/setvars.sh
$ source /your/oneCCL_source_code/build/_install/env/setvars.sh
```

__Note that since we are using our own oneCCL build, we should source oneCCL's `setvars.sh` to override the oneAPI one.__

You can also refer to [this CI script](dev/ci-build.sh) to setup the building environments.

If you prefer to build your own open-source [oneDAL](https://github.com/oneapi-src/oneDAL) and [oneTBB](https://github.com/oneapi-src/oneTBB) versions rather than use the ones included in the oneAPI Toolkits, you can refer to their build instructions and manually source `setvars.sh` accordingly.

To build, run the following commands:
```
$ cd oap-mllib/mllib-dal
$ cd mllib-dal
$ ./build.sh
```

The built JAR package will be placed in `target` directory with the name `oap-mllib-x.x.x-with-spark-x.x.x.jar`.

## Examples

Example | Description
Example | Description
----------------|---------------------------
kmeans | K-means example for Scala
kmeans-pyspark | K-means example for PySpark
kmeans-hibench | Use HiBench-generated input dataset to benchmark K-means performance
pca | PCA example for Scala
pca-pyspark | PCA example for PySpark
als | ALS example for Scala
als-pyspark | ALS example for PySpark

## List of Accelerated Algorithms

* K-Means (CPU, Experimental)
Algorithm | Category | Maturity
----------|----------|-------------
K-Means | CPU | Experimental
PCA | CPU | Experimental
ALS | CPU | Experimental
46 changes: 46 additions & 0 deletions conf/env.sh.template
@@ -0,0 +1,46 @@
# == OAP MLlib users: customize the following environment variables for running examples == #

# ============== Minimum Settings ============= #

# Set OAP MLlib version (e.g. 1.1.0)
OAP_MLLIB_VERSION=x.x.x
# Set Spark master
SPARK_MASTER=yarn
# Set Hadoop home path
export HADOOP_HOME=/path/to/your/hadoop/home
# Set Spark home path
export SPARK_HOME=/path/to/your/spark/home
# Set HDFS Root, should be hdfs://xxx or file://xxx
export HDFS_ROOT=hdfs://localhost:8020
# Set OAP MLlib source code root directory
export OAP_MLLIB_ROOT=/path/to/oap-mllib/home

# ============================================= #

# Set HADOOP_CONF_DIR for Spark
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

# Set JAR name & path
OAP_MLLIB_JAR_NAME=oap-mllib-$OAP_MLLIB_VERSION.jar
OAP_MLLIB_JAR=$OAP_MLLIB_ROOT/mllib-dal/target/$OAP_MLLIB_JAR_NAME
# Set Spark driver & executor classpaths,
# absolute path for driver, relative path for executor
SPARK_DRIVER_CLASSPATH=$OAP_MLLIB_JAR
SPARK_EXECUTOR_CLASSPATH=./$OAP_MLLIB_JAR_NAME

# Set Spark resources, can be overwritten in example
SPARK_DRIVER_MEMORY=1G
SPARK_NUM_EXECUTORS=2
SPARK_EXECUTOR_CORES=1
SPARK_EXECUTOR_MEMORY=1G
SPARK_DEFAULT_PARALLELISM=$(expr $SPARK_NUM_EXECUTORS '*' $SPARK_EXECUTOR_CORES '*' 2)

# Checks

for dir in $SPARK_HOME $HADOOP_HOME $OAP_MLLIB_JAR
do
if [[ ! -e $dir ]]; then
echo $dir does not exist!
exit 1
fi
done
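The template's default-parallelism formula (executors × cores × 2) can be exercised standalone; the resource values below are just the template's own defaults:

```shell
# Reproduce the template's arithmetic with its default resource settings
SPARK_NUM_EXECUTORS=2
SPARK_EXECUTOR_CORES=1
SPARK_DEFAULT_PARALLELISM=$(expr $SPARK_NUM_EXECUTORS '*' $SPARK_EXECUTOR_CORES '*' 2)
echo $SPARK_DEFAULT_PARALLELISM
```

The factor of 2 oversubscribes tasks relative to cores, a common Spark heuristic to keep executors busy.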
43 changes: 43 additions & 0 deletions dev/ci-build.sh
@@ -0,0 +1,43 @@
#!/usr/bin/env bash

# Setup building envs
source /opt/intel/oneapi/setvars.sh
source /tmp/oneCCL/build/_install/env/setvars.sh

# Check envs for building
if [[ -z $JAVA_HOME ]]; then
echo JAVA_HOME not defined!
exit 1
fi

if [[ -z $(which mvn) ]]; then
echo Maven not found!
exit 1
fi

if [[ -z $DAALROOT ]]; then
echo DAALROOT not defined!
exit 1
fi

if [[ -z $TBBROOT ]]; then
echo TBBROOT not defined!
exit 1
fi

if [[ -z $CCL_ROOT ]]; then
echo CCL_ROOT not defined!
exit 1
fi

echo === Building Environments ===
echo JAVA_HOME=$JAVA_HOME
echo DAALROOT=$DAALROOT
echo TBBROOT=$TBBROOT
echo CCL_ROOT=$CCL_ROOT
echo Maven Version: $(mvn -v | head -n 1 | cut -f3 -d" ")
echo Clang Version: $(clang -dumpversion)
echo =============================

cd $GITHUB_WORKSPACE/mllib-dal
mvn --no-transfer-progress -DskipTests clean package
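The Maven version line in this script relies on `cut` pulling the third space-separated field from the first line of `mvn -v`. A quick standalone check with a mocked version string (the sample text below is an assumption, not real `mvn` output):

```shell
# Mock the first line of `mvn -v` output and extract the version field
sample='Apache Maven 3.6.3 (mocked-build-hash)'
version=$(printf '%s\n' "$sample" | head -n 1 | cut -f3 -d' ')
echo "$version"
```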
61 changes: 61 additions & 0 deletions dev/ci-test.sh
@@ -0,0 +1,61 @@
#!/usr/bin/env bash

# Setup building envs
source /opt/intel/oneapi/setvars.sh
source /tmp/oneCCL/build/_install/env/setvars.sh

# Check envs for building
if [[ -z $JAVA_HOME ]]; then
echo JAVA_HOME not defined!
exit 1
fi

if [[ -z $(which mvn) ]]; then
echo Maven not found!
exit 1
fi

if [[ -z $DAALROOT ]]; then
echo DAALROOT not defined!
exit 1
fi

if [[ -z $TBBROOT ]]; then
echo TBBROOT not defined!
exit 1
fi

if [[ -z $CCL_ROOT ]]; then
echo CCL_ROOT not defined!
exit 1
fi

echo === Testing Environments ===
echo JAVA_HOME=$JAVA_HOME
echo DAALROOT=$DAALROOT
echo TBBROOT=$TBBROOT
echo CCL_ROOT=$CCL_ROOT
echo Maven Version: $(mvn -v | head -n 1 | cut -f3 -d" ")
echo Clang Version: $(clang -dumpversion)
echo =============================

cd $GITHUB_WORKSPACE/mllib-dal

# Build test
$GITHUB_WORKSPACE/dev/ci-build.sh

# Enable signal chaining support for JNI
# export LD_PRELOAD=$JAVA_HOME/jre/lib/amd64/libjsig.so

# -Dtest=none to turn off the Java tests

# Test all
# mvn -Dtest=none -Dmaven.test.skip=false test

# Individual test
mvn --no-transfer-progress -Dtest=none -DwildcardSuites=org.apache.spark.ml.clustering.IntelKMeansSuite test
mvn --no-transfer-progress -Dtest=none -DwildcardSuites=org.apache.spark.ml.feature.IntelPCASuite test
# mvn -Dtest=none -DwildcardSuites=org.apache.spark.ml.recommendation.IntelALSSuite test

# Yarn cluster test
$GITHUB_WORKSPACE/dev/test-cluster/ci-test-cluster.sh
21 changes: 7 additions & 14 deletions dev/install-build-deps-centos.sh
@@ -12,8 +12,10 @@ gpgcheck=1
repo_gpgcheck=1
gpgkey=https://yum.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS-2023.PUB
EOF
sudo mv /tmp/oneAPI.repo /etc/yum.repos.d
sudo yum install -y intel-oneapi-dal-devel-2021.1.1 intel-oneapi-tbb-devel-2021.1.1
sudo mv /tmp/oneAPI.repo /etc/yum.repos.d
# sudo yum groupinstall -y "Development Tools"
# sudo yum install -y cmake
sudo yum install -y intel-oneapi-dpcpp-cpp-2021.2.0 intel-oneapi-dal-devel-2021.2.0 intel-oneapi-tbb-devel-2021.2.0
else
echo "oneAPI components already installed!"
fi
@@ -23,16 +25,7 @@ cd /tmp
rm -rf oneCCL
git clone https://github.com/oneapi-src/oneCCL
cd oneCCL
git checkout 2021.1
mkdir -p build && cd build
git checkout 2021.2
mkdir build && cd build
cmake ..
make -j 2 install

#
# Setup building environments manually:
#
# export ONEAPI_ROOT=/opt/intel/oneapi
# source /opt/intel/oneapi/dal/latest/env/vars.sh
# source /opt/intel/oneapi/tbb/latest/env/vars.sh
# source /tmp/oneCCL/build/_install/env/setvars.sh
#
make -j 2 install
19 changes: 6 additions & 13 deletions dev/install-build-deps-ubuntu.sh
@@ -1,32 +1,25 @@
#!/usr/bin/env bash

if [ ! -f /opt/intel/oneapi ]; then
if [ ! -d /opt/intel/oneapi ]; then
echo "Installing oneAPI components ..."
cd /tmp
wget https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS-2023.PUB
sudo apt-key add GPG-PUB-KEY-INTEL-SW-PRODUCTS-2023.PUB
rm GPG-PUB-KEY-INTEL-SW-PRODUCTS-2023.PUB
echo "deb https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
sudo apt-get update
sudo apt-get install intel-oneapi-dal-devel-2021.1.1 intel-oneapi-tbb-devel-2021.1.1
# sudo apt-get install -y build-essential cmake
sudo apt-get install -y intel-oneapi-dpcpp-cpp-2021.2.0 intel-oneapi-dal-devel-2021.2.0 intel-oneapi-tbb-devel-2021.2.0
else
echo "oneAPI components already installed!"
fi
fi

echo "Building oneCCL ..."
cd /tmp
rm -rf oneCCL
git clone https://github.com/oneapi-src/oneCCL
cd oneCCL
git checkout 2021.1
git checkout 2021.2
mkdir build && cd build
cmake ..
make -j 2 install

#
# Setup building environments manually:
#
# export ONEAPI_ROOT=/opt/intel/oneapi
# source /opt/intel/oneapi/dal/latest/env/vars.sh
# source /opt/intel/oneapi/tbb/latest/env/vars.sh
# source /tmp/oneCCL/build/_install/env/setvars.sh
#
