First steps

Grigori Fursin edited this page Jul 17, 2018 · 1 revision


Demonstrating portable and customizable CK benchmarking workflows

We originally developed CK to help our colleagues and partners (researchers, developers) implement portable, customizable and reusable workflows for collaborative and reproducible experimentation.

We use such CK workflows to automate systems, ML and AI research which often includes tedious and repetitive benchmarking, optimization and software/hardware co-design (see ACM ReQuEST tournaments).

Here is a simple example of how to compile and run programs with multiple datasets on different platforms via CK workflows.

First, you need to install CK with a few minimal dependencies: Python 2.7+ or 3.4+, pip, git and wget (see the minimal CK installation guide for how to install these dependencies on Linux, MacOS and Windows).

You can install CK via pip with sudo (skip sudo on Windows):

$ sudo pip install ck

However, if you don't have root access, you can install CK in your local user space as follows:

$ git clone https://github.com/ctuning/ck ck-master
$ export PATH=$PWD/ck-master/bin:$PATH
$ export PYTHONPATH=$PWD/ck-master:$PYTHONPATH
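
After either installation route, it can help to confirm that both the `ck` launcher and the `ck` Python package are visible. The following is a minimal sketch, assuming Python 3.4+; `ck_availability` is a hypothetical helper name and is not part of CK:

```python
import importlib.util
import shutil


def ck_availability():
    """Report whether the `ck` launcher and the `ck` Python package are visible."""
    return {
        # shutil.which searches the directories listed in PATH
        "ck_on_path": shutil.which("ck") is not None,
        # find_spec searches sys.path, which includes PYTHONPATH entries
        "ck_importable": importlib.util.find_spec("ck") is not None,
    }


if __name__ == "__main__":
    print(ck_availability())
```

If either value is False after a local install, the PATH or PYTHONPATH export above probably did not take effect in the current shell.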

On MacOS, we also suggest installing the complete LLVM alongside the native one as follows:

$ brew install llvm

We also recommend setting up CK to install new packages inside CK virtual environment entries (more details are provided further below):

$ ck set kernel var.install_to_env=yes

Now you can pull a CK repository with multiple benchmarks in the CK format:

$ ck pull repo:ctuning-programs

CK will also automatically obtain other CK repos with related workflows and artifacts. You can see them as follows:

$ ck ls repo

You can now see all shared programs in the CK format:

$ ck ls program

You can find and investigate the CK format for a given program (say cbench-automotive-susan) as follows:

$ ck find program:cbench-automotive-susan

It's probably better to see it online with all the sources: https://github.com/ctuning/ctuning-programs/tree/master/program/cbench-automotive-susan .

You can also see a "scary" CK JSON meta description of this entry: https://github.com/ctuning/ctuning-programs/blob/master/program/cbench-automotive-susan/.cm/meta.json .

Now you can try to compile this program:

$ ck compile program:cbench-automotive-susan --speed

CK will invoke the function "compile" of the module "program", which will read the meta information above and perform the corresponding actions. You can locate the source code of this module with "ck find module:program".
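
To make the relation between a command line and a module function more concrete, here is a simplified sketch of how a CK command could be turned into a request dictionary with the keys `action`, `module_uoa` and `data_uoa`. `cli_to_access_dict` is a hypothetical helper, not part of CK. It assumes that a bare flag such as `--speed` is stored as 'yes', and it deliberately does not handle extra positional arguments; the real CK parser covers more cases:

```python
def cli_to_access_dict(argv):
    """Convert e.g. ['compile', 'program:x', '--speed'] into a request dict."""
    if not argv:
        raise ValueError("expected at least an action")
    request = {"action": argv[0]}
    for arg in argv[1:]:
        if arg.startswith("--"):
            key, sep, value = arg[2:].partition("=")
            # Assumption: a bare flag such as --speed is treated as 'yes'
            request[key] = value if sep else "yes"
        elif "module_uoa" not in request:
            # First positional argument: module or module:data
            module, sep, data = arg.partition(":")
            request["module_uoa"] = module
            if sep:
                request["data_uoa"] = data
        else:
            # Extra positional arguments are outside the scope of this sketch
            raise ValueError("unsupported argument: " + arg)
    return request


if __name__ == "__main__":
    print(cli_to_access_dict(
        ["compile", "program:cbench-automotive-susan", "--speed"]))
```

Under these assumptions, the compile command above corresponds to the action "compile", the module "program" and the entry "cbench-automotive-susan".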

For example, CK will attempt to automatically detect all installed software dependencies such as compilers and libraries. CK uses multiple plugins that describe how to detect different software: https://github.com/ctuning/ck-env/tree/master/soft . You can find a list of supported software here.

Users can also add extra plugins in their own CK repositories.

You can also perform software detection manually, for example to detect all installed GCC versions:

$ ck detect soft --tags=compiler,gcc

All detected software is registered in CK with an associated virtual environment, similar to Python virtual environments and Conda but with support for any binary installation:

$ ck show env
$ ck show env --tags=compiler,gcc
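
The `--tags` option narrows a listing to entries that carry the requested tags. The sketch below illustrates this idea under the assumption that an entry matches only when it has all requested tags; the entries and the `filter_by_tags` helper are hypothetical and do not reflect CK's internal implementation:

```python
def filter_by_tags(entries, tags):
    """Return entries whose tag set contains every comma-separated tag."""
    wanted = {t.strip() for t in tags.split(",") if t.strip()}
    return [e for e in entries if wanted.issubset(e["tags"])]


# Hypothetical environment entries, for illustration only
ENTRIES = [
    {"name": "gcc-7", "tags": {"compiler", "gcc", "v7"}},
    {"name": "llvm-6", "tags": {"compiler", "llvm", "v6.0.0"}},
    {"name": "tensorflow-cpu", "tags": {"lib", "tensorflow", "vcpu"}},
]

if __name__ == "__main__":
    for entry in filter_by_tags(ENTRIES, "compiler,gcc"):
        print(entry["name"])
```

Adding more tags can only shrink the result, which is why a version tag is enough to pick one specific installation.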

Now you can run this program as follows:

$ ck run program:cbench-automotive-susan

CK will collect and unify various characteristics (execution time, code size, etc.) via a JSON API.

This allows one to perform unified benchmarking with multiple executions, monitoring CPU/GPU frequency, performing statistical analysis of empirical results, etc:

$ ck benchmark program:cbench-automotive-susan
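
To illustrate what a statistical summary of repeated executions might look like, here is a minimal sketch that uses only the Python standard library. The `summarize` helper and the sample timings are hypothetical; this is not the analysis that CK itself performs:

```python
import statistics


def summarize(times):
    """Summarize execution times (in seconds) from repeated runs."""
    if not times:
        raise ValueError("need at least one measurement")
    return {
        "repetitions": len(times),
        "min": min(times),
        "mean": statistics.mean(times),
        # The sample standard deviation needs at least two measurements
        "stdev": statistics.stdev(times) if len(times) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Made-up timings, for illustration only
    print(summarize([1.02, 0.98, 1.05, 1.01]))
```

A large standard deviation relative to the mean usually suggests that the measurements are noisy, for example because of changes in CPU frequency.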

Note that CK programs can also take multiple datasets, which users can share in different repositories (for example, when publishing a new paper):

$ ck search dataset
or
$ ck search dataset --tags=jpeg

Now users can assemble their own experiments just by reusing such workflows (rather than preparing all this infrastructure).

Note that if a software dependency is not resolved, CK invokes its internal package manager to install the required software automatically. You can see available CK packages here.

You can see them from the command line as follows:

$ ck search package --all

For example, you can install LLVM 6.0.0 as follows:

$ ck install package --tags=llvm,v6.0.0

Note that an associated CK environment will also be created:

$ ck show env --tags=llvm,v6.0.0

Since all packages are installed in user space ($HOME/CK-TOOLS), we also implemented a virtual environment, based on user feedback, that is similar to Conda but also works for binary installations:

$ ck virtual env --tags=llvm,v6.0.0

In this case, multiple versions of the same tool can easily co-exist in CK, since we automatically set up PATH, LD_LIBRARY_PATH, etc. CK workflows can then easily use specific versions of the required tools.
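
The general idea of letting several versions co-exist by adjusting search paths can be sketched as follows. This is only an illustration: the `activate` helper, the `bin` and `lib` subdirectory layout and the installation path are assumptions, and CK's own environment scripts may set more variables:

```python
import os


def activate(base_env, install_dir):
    """Return a copy of base_env with install_dir placed first in search paths."""
    env = dict(base_env)
    # Assumption: binaries live in <install_dir>/bin and libraries in <install_dir>/lib
    for var, subdir in (("PATH", "bin"), ("LD_LIBRARY_PATH", "lib")):
        new_entry = os.path.join(install_dir, subdir)
        old = env.get(var, "")
        # Prepending means this version is found before any other one
        env[var] = new_entry + os.pathsep + old if old else new_entry
    return env


if __name__ == "__main__":
    # Hypothetical installation directory, for illustration only
    env = activate(dict(os.environ), "/opt/tools/llvm-6.0.0")
    print(env["PATH"].split(os.pathsep)[0])
```

Because the function returns a modified copy, each version gets its own environment and the original one stays untouched.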

You can also set several virtual environments at once:

$ ck show env
$ ck virtual env {UID1 from above list} {UID2 from above list} ...

Another important CK feature is that all these steps work in the same way across Windows, Linux, MacOS and even Android (you just need to add --target_os=android23-arm64 when installing packages or compiling and running your programs). They also automatically support both Python 2 and 3+.

Now you can try a more complex example with TensorFlow. You should pull the related repository and install the CPU version of TensorFlow via CK:

$ ck pull repo:ck-tensorflow
$ ck install package --tags=lib,tensorflow,vcpu,vprebuilt

Check that it is installed correctly:

$ ck show env --tags=lib,tensorflow

You can find a path to a given entry (with TF installation) as follows:

$ ck find env:{env UID from above list}

Start the CK virtual environment and test TF:

$ ck virtual env --tags=lib,tensorflow
$ ipython
> import tensorflow as tf

Run the CK classification workflow example using the installed TF:

$ ck run program:tensorflow --cmd_key=classify

You can even try to rebuild TensorFlow via CK:

$ ck install package:lib-tensorflow-1.7.0-cuda

CK will attempt to detect your CUDA compiler and related libraries, Java and Bazel, and will then try to rebuild TF. Note that you may still need to install some extra dependencies yourself: https://github.com/ctuning/ck-tensorflow#prerequisites-for-ubuntu

You can now try to build another AI framework such as Caffe with CUDA support and run classification in a similar way! Note that CK should reuse detected CUDA compilers, libraries and other deps from the previous step, or will attempt to install missing packages:

$ ck pull repo --url=https://github.com/dividiti/ck-caffe
$ ck install package:lib-caffe-bvlc-master-cuda-universal
$ ck run program:caffe --cmd_key=classify

You can see how to install Caffe for Linux, MacOS, Windows and Android via CK here.

You can even participate in crowd-tuning of some C program (see shared optimization cases in http://cKnowledge.org/repo):

$ ck pull repo:ck-crowdtuning

$ ck crowdtune program:cbench-automotive-susan

You can also invoke CK from your own Python scripts using one unified access function. For example, you can run the above program:caffe from a Python script as follows:

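import ck.kernel as ck

# Equivalent of: ck run program:caffe --cmd_key=classify
r=ck.access({'action':'run',
             'module_uoa':'program',
             'data_uoa':'caffe',
             'cmd_key':'classify',
             'out':'con'})  # 'con' sends output to the console
# A positive 'return' value signals an error; ck.err reports it and exits
if r['return']>0: ck.err(r)

print(r)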
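
If you prefer exceptions to exiting on error, the return-code check can be wrapped in a small helper. This is a minimal sketch: `CKError` and `check` are hypothetical names that are not part of CK, and the code assumes that a failed call carries its message under an 'error' key:

```python
class CKError(RuntimeError):
    """Raised when a CK-style result dictionary reports a failure."""


def check(result):
    """Return the result unchanged, or raise CKError if 'return' is positive."""
    code = result.get("return", 0)
    if code > 0:
        # Assumption: the error message is stored under the 'error' key
        message = result.get("error", "no error message")
        raise CKError("CK call failed with code {}: {}".format(code, message))
    return result


if __name__ == "__main__":
    # Stand-in dictionaries so that the sketch runs without CK installed
    print(check({"return": 0, "data": "ok"}))
    try:
        check({"return": 1, "error": "entry not found"})
    except CKError as exc:
        print(exc)
```

With such a helper, a call could be written as `r = check(ck.access({...}))` and the failure handled by the caller.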

You can also reuse CK kernel productivity functions which we made portable across Python 2 and 3, and different OS and platforms!

Finally, you can check a complex SW/HW co-design workflow implemented and unified using CK for image classification using deep learning on ARM GPU platforms: https://github.com/dividiti/ck-request-asplos18-mobilenets-armcl-opencl

As you may notice, CK is simply a local repository and workflow manager that lets you share code and data in a customizable, portable and reusable way, with a unified CMD/JSON API and meta information. It promotes artifact reuse while gradually replacing and unifying the numerous ad-hoc scripts and data structures that tend to be abandoned once developers leave a project.

Find and reuse other shared CK workflows and components:

You can check how above CK workflows and components are used in ACM ReQuEST tournaments to collaboratively co-design SW/HW stack for emerging workloads such as deep learning:

You can also check two other alternative Getting Started Guides 1 and 2.

Questions and comments

You are welcome to get in touch with the CK community if you have questions or comments!
