What is it?
SYCL-ML is a framework providing simple classical machine learning algorithms using SYCL. It is meant to be accelerated on any OpenCL device supporting SPIR or SPIR-V (experimental). The following links give more details on what SYCL is:
What can it do?
Some linear algebra operations had to be written from scratch such as:
- Matrix inversion
- Singular value decomposition (SVD)
- QR decomposition
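To illustrate one of the operations listed above, here is a minimal host-side sketch of a QR decomposition using the modified Gram-Schmidt process. It is plain C++ with no SYCL and is not taken from SYCL-ML: the `Matrix` alias, the `QRResult` struct and the `qr_gram_schmidt` function are hypothetical names chosen for this example, and the library may well use a different algorithm.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Dense row-major matrix. Hypothetical helper type, not part of SYCL-ML.
using Matrix = std::vector<std::vector<double>>;

struct QRResult {
  Matrix q;  // m x n, orthonormal columns
  Matrix r;  // n x n, upper triangular
};

// Modified Gram-Schmidt QR of an m x n matrix (m >= n, full column rank).
QRResult qr_gram_schmidt(const Matrix& a) {
  const std::size_t m = a.size();
  assert(m > 0);
  const std::size_t n = a[0].size();
  assert(m >= n);
  Matrix q = a;  // columns are orthonormalised in place
  Matrix r(n, std::vector<double>(n, 0.0));
  for (std::size_t j = 0; j < n; ++j) {
    // Normalise column j.
    double norm = 0.0;
    for (std::size_t i = 0; i < m; ++i) norm += q[i][j] * q[i][j];
    norm = std::sqrt(norm);
    assert(norm > 0.0);  // rank-deficient input is not handled in this sketch
    r[j][j] = norm;
    for (std::size_t i = 0; i < m; ++i) q[i][j] /= norm;
    // Remove the component along column j from the remaining columns.
    for (std::size_t k = j + 1; k < n; ++k) {
      double dot = 0.0;
      for (std::size_t i = 0; i < m; ++i) dot += q[i][j] * q[i][k];
      r[j][k] = dot;
      for (std::size_t i = 0; i < m; ++i) q[i][k] -= dot * q[i][j];
    }
  }
  return {q, r};
}
```

Calling `qr_gram_schmidt` on a small matrix and multiplying `q` by `r` should reproduce the input up to rounding error, which is a convenient sanity check when porting such a routine to a device.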
In terms of operations more closely related to machine learning, it includes:
- Principal Component Analysis: used to reduce the dimensionality of a problem.
- Linear Classifier (see naive Bayes classifier): classify assuming all variables are equally important.
- Gaussian Classifier: classify using the Gaussian distribution.
- Gaussian Mixture Model: based on the EM algorithm, uses multiple Gaussian distributions for each label.
- Support Vector Machine: C-SVM with any possible kernel function.
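As a rough illustration of the Gaussian Classifier idea listed above, the sketch below fits one Gaussian per class and predicts the class with the highest log-likelihood plus log-prior. It is plain host C++ and does not use the SYCL-ML API. To stay short it assumes a diagonal covariance matrix, which is a simplification made for this example only, and all names (`Sample`, `GaussianClass`, `fit_gaussian_classifier`, `predict_class`) are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Sample = std::vector<double>;

// Per-class parameters of a diagonal-covariance Gaussian (hypothetical type).
struct GaussianClass {
  Sample mean;
  Sample variance;
  double log_prior;
};

// Estimate mean, variance and prior of each class from labelled samples.
std::vector<GaussianClass> fit_gaussian_classifier(
    const std::vector<Sample>& data, const std::vector<std::size_t>& labels,
    std::size_t num_classes) {
  assert(!data.empty() && data.size() == labels.size());
  const std::size_t dim = data[0].size();
  std::vector<GaussianClass> model(
      num_classes, GaussianClass{Sample(dim, 0.0), Sample(dim, 0.0), 0.0});
  std::vector<std::size_t> counts(num_classes, 0);
  for (std::size_t i = 0; i < data.size(); ++i) {
    assert(labels[i] < num_classes);
    ++counts[labels[i]];
    for (std::size_t j = 0; j < dim; ++j) model[labels[i]].mean[j] += data[i][j];
  }
  for (std::size_t c = 0; c < num_classes; ++c) {
    assert(counts[c] > 0);  // every class needs at least one sample
    for (std::size_t j = 0; j < dim; ++j) model[c].mean[j] /= counts[c];
    model[c].log_prior = std::log(static_cast<double>(counts[c]) / data.size());
  }
  for (std::size_t i = 0; i < data.size(); ++i) {
    for (std::size_t j = 0; j < dim; ++j) {
      const double diff = data[i][j] - model[labels[i]].mean[j];
      model[labels[i]].variance[j] += diff * diff;
    }
  }
  const double variance_floor = 1e-9;  // avoids division by zero
  for (std::size_t c = 0; c < num_classes; ++c) {
    for (std::size_t j = 0; j < dim; ++j) {
      model[c].variance[j] = model[c].variance[j] / counts[c] + variance_floor;
    }
  }
  return model;
}

// Return the class maximising log-prior plus Gaussian log-likelihood.
std::size_t predict_class(const std::vector<GaussianClass>& model, const Sample& x) {
  const double two_pi = 6.283185307179586;
  std::size_t best = 0;
  double best_score = 0.0;
  for (std::size_t c = 0; c < model.size(); ++c) {
    double score = model[c].log_prior;
    for (std::size_t j = 0; j < x.size(); ++j) {
      const double diff = x[j] - model[c].mean[j];
      const double var = model[c].variance[j];
      score -= 0.5 * (std::log(two_pi * var) + diff * diff / var);
    }
    if (c == 0 || score > best_score) {
      best = c;
      best_score = score;
    }
  }
  return best;
}
```

Working in log space keeps the scores numerically stable when the number of features grows, since the product of many small densities would otherwise underflow.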
SYCL-ML is a header-only library, which makes it easy to integrate.
More details on what the project implements and how it works can be found on our website. Make sure to use the blogpost branch if you want to observe the same results as shown there.
- Optimize SVD decomposition for faster PCA. The algorithm probably needs to be changed to compute eigenpairs differently.
- Optimize SVM for GPU. Approaches from more recent papers on SVM for GPU should be tried.
- Implement LDA (or other dimensionality reduction algorithms), which would be used as a preprocessing step similarly to PCA.
- Implement K-means (or other clustering algorithms), which could be used to improve the initialization of the EM algorithm.
- Add a proper way to select a SYCL device.
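For the K-means item in the list above, the following is a minimal host-side sketch of Lloyd's algorithm, whose resulting centroids could serve as initial means for EM. It is not part of SYCL-ML and uses no SYCL. The names `Point`, `squared_distance` and `kmeans` are hypothetical, and the initial centroids are assumed to be supplied by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Point = std::vector<double>;

// Squared Euclidean distance between two points of equal dimension.
double squared_distance(const Point& a, const Point& b) {
  double sum = 0.0;
  for (std::size_t j = 0; j < a.size(); ++j) {
    const double diff = a[j] - b[j];
    sum += diff * diff;
  }
  return sum;
}

// Lloyd's algorithm: alternate assignment and centroid update steps until
// the assignments stop changing or max_iters is reached.
std::vector<Point> kmeans(const std::vector<Point>& data,
                          std::vector<Point> centroids, std::size_t max_iters) {
  assert(!data.empty() && !centroids.empty());
  const std::size_t k = centroids.size();
  const std::size_t dim = data[0].size();
  std::vector<std::size_t> assignment(data.size(), k);  // k means "unassigned"
  for (std::size_t iter = 0; iter < max_iters; ++iter) {
    // Assignment step: attach each point to its nearest centroid.
    bool changed = false;
    for (std::size_t i = 0; i < data.size(); ++i) {
      std::size_t best = 0;
      double best_dist = squared_distance(data[i], centroids[0]);
      for (std::size_t c = 1; c < k; ++c) {
        const double dist = squared_distance(data[i], centroids[c]);
        if (dist < best_dist) {
          best = c;
          best_dist = dist;
        }
      }
      if (assignment[i] != best) {
        assignment[i] = best;
        changed = true;
      }
    }
    if (!changed) break;
    // Update step: move each centroid to the mean of its assigned points.
    std::vector<Point> sums(k, Point(dim, 0.0));
    std::vector<std::size_t> counts(k, 0);
    for (std::size_t i = 0; i < data.size(); ++i) {
      ++counts[assignment[i]];
      for (std::size_t j = 0; j < dim; ++j) sums[assignment[i]][j] += data[i][j];
    }
    for (std::size_t c = 0; c < k; ++c) {
      if (counts[c] == 0) continue;  // an empty cluster keeps its old centroid
      for (std::size_t j = 0; j < dim; ++j) centroids[c][j] = sums[c][j] / counts[c];
    }
  }
  return centroids;
}
```

The result depends on the initial centroids, so a real implementation would likely pair this with a seeding strategy and several restarts.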
SYCL-ML has been tested with:
- Ubuntu 16.04.3, kernel 4.13.0-26, amdgpu pro driver 17.40 OR Ubuntu 14.04.5, kernel 3.19.0-79, fglrx driver 2:15.302
- CMake 3.0
- g++ 5.4
- ComputeCpp 1.0.1
ComputeCpp can be downloaded from the Codeplay website.
Once extracted, the ComputeCpp path should be set in the environment variable
COMPUTECPP_PACKAGE_ROOT_DIR (usually /usr/local/computecpp).
Alternatively, it can be given as an argument to cmake with
SYCL-ML depends on SYCLParallelSTL.
The path to SYCLParallelSTL must be set in
SYCL_PARALLEL_STL_ROOT, either as an environment variable or as an argument to cmake.
git clone https://github.com/KhronosGroup/SyclParallelSTL.git
The last requirement is the Eigen-Optimised-Tensor-Vector-Contraction branch of Eigen.
The path to Eigen must be set in
EIGEN_INCLUDE_DIRS, either as an environment variable or as an argument to cmake.
The version of Eigen needed is slightly different from upstream.
The changes are packaged in the
eigen.patch file; the next section shows how to apply it.
hg clone https://bitbucket.org/codeplaysoftware/eigen
The Eigen patch file must be applied first, followed by cmake and make:
patch -p1 -d <Eigen_root> < eigen.patch
mkdir build && cd build
cmake -DSYCL_PARALLEL_STL_ROOT=<SYCLParallelSTL_root> -DEIGEN_INCLUDE_DIRS=<Eigen_root> ..
make
Note that on Unix, CMake takes care of downloading the MNIST dataset using wget and gunzip.
It is recommended to run the tests before running the examples:
cd build/tests
ctest --output-on-failure
The documentation can be built with doxygen. It requires dot from the graphviz package. Simply run:
The project is under the Apache 2.0 license. Any contribution is welcome! Also feel free to raise an issue for any questions or suggestions.