
Adding new workflows



This document is continuously updated based on your feedback.


Introduction

Here we describe how to add new CK program workflows, software detection plugins and packages, either in a new CK repository or in existing ones. These are the most commonly used CK workflows to compile, run and validate different programs (benchmarks) with different compilers, libraries, data sets and models.

We strongly suggest that you check the CK getting started guide, CK basics and the available reusable CK components (programs, workflows, packages, software detection plugins, etc.) before moving further.

You can also see the portable and customizable CK workflows from the 1st ACM ReQuEST-ASPLOS'18 tournament, the distinguished artifact/workflow from ACM CGO'17, and the interactive article sponsored by the Raspberry Pi Foundation and automatically generated by CK.

Note that we regularly help our partners and CK end-users reuse existing CK components and add new ones - do not hesitate to get in touch with the CK community via the Google group or Slack channel if you have questions or comments!

Creating new CK repository

If you plan to contribute to existing CK repositories, you can skip this subsection. Otherwise, you need to manually create a new dummy CK repository (we are gradually automating this process and plan to add a GUI-based front-end too).

If you do not plan to share your new repository, you can choose a user-friendly name such as "my-new-repo" and create it from the command line (Linux, Windows and MacOS) as follows:

 $ ck add repo:my-new-repo --quiet

You can then find where CK created your dummy repo using the following command:

 $ ck where repo:my-new-repo

However, if you plan to share your repository with the community or with your private workgroup to reuse new components, you must first create a dummy repository on GitHub, GitLab, BitBucket or any other Git service. Let's say you have created "my-new-repo" at "https://github.com/my_name".

Next you need to pull this repository using CK as follows:

 $ ck pull repo --url=https://github.com/my_name/my-new-repo

CK will create a local my-new-repo repository and mark it as shared; you can find its location as follows:

 $ ck where repo:my-new-repo

You can later commit and push updates for this repository back to Git as follows:

 $ ck push repo:my-new-repo

We suggest making the first commit immediately after pulling your dummy repository into CK, in order to push internal, automatically created CK files such as the repository descriptor (".ckr.json").

Note that you can also commit and push all updates using Git commands from the CK repository directory! Just do not forget to always commit the hidden CK directories ".cm/*" at all levels (we plan to further automate this process later).

Now you can use the newly created repository to add, share and cross-link new components. At the same time, your colleagues or an Artifact Evaluation Committee can obtain your repository and validate your changes or fix bugs in the same way:

 $ ck pull repo --url=https://github.com/my_name/my-new-repo

This methodology helps us automate and speed up artifact evaluation at open ACM ReQuEST tournaments.

Adding new CK workflows (programs, benchmarks)

You are now ready to add a new CK workflow, for example to compile and run some algorithm or benchmark.

Since the CK concept is about reusing and extending existing components with a common API, we first suggest looking at this index of shared CK programs in case someone has already shared the same or a related workflow!

If you find a similar program, for example "cbench-automotive-susan", you can create a working copy of it in your new CK repository for further editing as follows:

$ ck pull repo:ctuning-programs 
$ ck cp ctuning-programs:program:cbench-automotive-susan my-new-repo:program:my-copy-of-cbench-automotive-susan

You now have a working copy of the CK "cbench-automotive-susan" program entry in your new repository, which contains the sources and the CK meta information about how to compile and run this program as described in the CK getting started guide:

$ ck compile program:my-copy-of-cbench-automotive-susan --speed
$ ck run program:my-copy-of-cbench-automotive-susan
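
The same actions can be scripted via the CK kernel API from Python. Here is a minimal sketch, assuming the program entry created above and the usual CK convention that every CLI flag maps to a key in the input dictionary of ck.access:

import ck.kernel as ck

# Compile the program (equivalent to "ck compile program:... --speed")
r = ck.access({'action': 'compile',
               'module_uoa': 'program',
               'data_uoa': 'my-copy-of-cbench-automotive-susan',
               'speed': 'yes',
               'out': 'con'})
if r['return'] > 0:
    ck.err(r)  # print the error and exit

# Run the compiled program (equivalent to "ck run program:...")
r = ck.access({'action': 'run',
               'module_uoa': 'program',
               'data_uoa': 'my-copy-of-cbench-automotive-susan',
               'out': 'con'})
if r['return'] > 0:
    ck.err(r)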

You can find and explore the new CK entry from the command line as follows:

$ ck find program:my-copy-of-cbench-automotive-susan

In this directory you will see the following files:

  • .cm/desc.json - custom API description (usually empty - can skip for now)
  • .cm/info.json - internal CK information about this entry (date of creation/update, author, license, etc)
  • .cm/meta.json - CK meta information about how to compile and run this program
  • susan.c - C source code of this program

When committing such files to Git, do not forget to add the .cm directories since they are usually hidden on Linux!

If you did not find a similar program, you can still create a new one from shared templates as follows:

$ ck add my-new-repo:program:my-new-program

CK will then ask you to select a template:

0) C program "Hello world" (--template=template-hello-world-c)
1) C program "Hello world" with compile and run scripts (--template=template-hello-world-c-compile-run-via-scripts)
2) C program "Hello world" with jpeg dataset (--template=template-hello-world-c-jpeg-dataset)
3) C program "Hello world" with output validation (--template=template-hello-world-c-output-validation)
4) C program "Hello world" with xOpenME interface and pre/post processing (--template=template-hello-world-c-openme)
5) C++ TensorFlow classification example (--template=image-classification-tf-cpp)
6) C++ program "Hello world" (--template=template-hello-world-cxx)
7) Fortran program "Hello world" (--template=template-hello-world-fortran)
8) Java program "Hello world" (--template=template-hello-world-java)
9) Python MXNet image classification example (--template=mxnet)
10) Python TensorFlow classification example (--template=image-classification-tf-py)
11) Python program "Hello world" (--template=template-hello-world-python)
12) image-classification-tflite (--template=image-classification-tflite)
13) Empty entry

If you select "Python TensorFlow classification example", CK will create a working image classification program in your new repository with software dependencies on TensorFlow AI framework and related models (since it's a Python program, you do not need to compile it):

$ ck run program:my-new-program

Note that you can later turn your own program into a template by adding the following key to its meta.json file:

"template": "yes"

Updating program sources

If you found a similar program with all the necessary software dependencies, you can simply update or replace its sources with your own.

In that case, you may need to update the following keys in the meta.json of this program entry:

  • source files:
"source_files": [
  "susan.c"
], 
  • command line(s) to run program (see "run_cmd_main"):
"run_cmds": {
  "corners": {
    "dataset_tags": [
      "image", 
      "pgm", 
      "dataset"
    ], 
    "ignore_return_code": "no", 
    "run_time": {
      "run_cmd_main": "$#BIN_FILE#$ $#dataset_path#$$#dataset_filename#$ tmp-output.tmp -c" 
    }
  }, 
  "edges": {
    "dataset_tags": [
      "image", 
      "pgm", 
      "dataset"
    ], 
    "ignore_return_code": "no", 
    "run_time": {
      "run_cmd_main": "$#BIN_FILE#$ $#dataset_path#$$#dataset_filename#$ tmp-output.tmp -e" 
    }
  }, 
  ...
}, 

Note that you can define more than one command line for a program. In that case, CK will ask you which one to use when you run the program. For example, this is used to perform model training ("train"), validation ("test") and image classification ("classify") in different CK AI programs such as tensorflow-classification or caffe-classification.
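
When scripting, you can select a specific command line non-interactively. A minimal sketch, assuming that the "run" action of the program module accepts the "cmd_key" parameter just like the CLI flag of the same name:

import ck.kernel as ck

# Run the "corners" command line instead of being asked interactively
r = ck.access({'action': 'run',
               'module_uoa': 'program',
               'data_uoa': 'my-copy-of-cbench-automotive-susan',
               'cmd_key': 'corners',
               'out': 'con'})
if r['return'] > 0:
    ck.err(r)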

You can also update meta.json keys related to program compilation and execution:

"build_compiler_vars": {
  "XOPENME": ""
}, 

"compiler_env": "CK_CC", 

"extra_ld_vars": "$<<CK_EXTRA_LIB_M>>$", 

"run_vars": {
  "CT_REPEAT_MAIN": "1",
  "NEW_VAR":"123"
}, 

Note that you can override environment variables in a unified way from the command line when running a given program, to customize your execution as follows:

$ ck run program:my-new-program --env.NEW_VAR=321
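
The same override works from Python; a sketch assuming that the CLI's dotted "--env.VAR" flags map to an "env" dictionary in the input of ck.access:

import ck.kernel as ck

# Equivalent to "ck run program:my-new-program --env.NEW_VAR=321"
r = ck.access({'action': 'run',
               'module_uoa': 'program',
               'data_uoa': 'my-new-program',
               'env': {'NEW_VAR': '321'},
               'out': 'con'})
if r['return'] > 0:
    ck.err(r)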

You can later expose different program parameters via environment variables, for example to apply the customizable CK autotuner, as used in this CK ReQuEST workflow to automatically explore (co-design) different MobileNets configurations in terms of speed, accuracy and costs.

Here is a brief description of other important keys in the program meta.json:

"run_cmds": {                

  "corners": {               # User key describing a given execution command line

    "dataset_tags": [        # Data set tags - will be used to query CK
      "image",               # and automatically find related entries such as images
      "pgm", 
      "dataset"
    ], 

    "run_time": {            # Next is the execution command line format
                             # $#BIN_FILE#$ will be automatically substituted with the compiled binary
                             # $#dataset_path#$$#dataset_filename#$ will be substituted with
                             # the first file from the CK data set entry (see above example
                             # of adding new data sets to CK).
                             # tmp-output.tmp is and output file of a processed image.
                             # Basically, you can shuffle below words to set your own CMD

      "run_cmd_main": "$#BIN_FILE#$ $#dataset_path#$$#dataset_filename#$ tmp-output.tmp -c", 

      "run_cmd_out1": "tmp-output1.tmp",  # If !='', add redirection of the stdout to this file
      "run_cmd_out2": "tmp-output2.tmp",  # If !='', add redirection of the stderr to this file

      "run_output_files": [               # Lists files that are produced during
                                          # benchmark execution. Useful when program
                                          # is executed on remote device (such as
                                          # Android mobile) to pull necessary
                                          # files to host after execution
        "tmp-output.tmp", 
        "tmp-ck-timer.json"
      ],


      "run_correctness_output_files": [   # List files that should be used to check
                                          # that program executed correctly.
                                          # For example, useful to check benchmark correctness
                                          # during automatic compiler/hardware bug detection
        "tmp-output.tmp", 
        "tmp-output2.tmp"
      ], 

      "fine_grain_timer_file": "tmp-ck-timer.json"  # If XOpenME library is used, it dumps run-time state
                                                    # and various run-time parameters (features) to tmp-ck-timer.json.
                                                    # This key lists JSON files to be added to unified 
                                                    # JSON program workflow output
    },

    "hot_functions": [                 # Specify hot functions of this program
      {                                # to analyze only these functions during profiling
        "name": "susan_corners",       # or during standalone kernel extraction
        "percent": "95"                # with run-time memory state (see "codelets"
                                       #  shared in CK repository from the MILEPOST project
                                       #  and our recent papers for more info)
      }
    ] 

    "ignore_return_code": "no"         # Some programs have return code >0 even during
                                       # successful program execution. We use this return code
                                       # to check if benchmark failed particularly during
                                       # auto-tuning or compiler/hardware bug detection
                                       #  (when randomly or semi-randomly altering code,
                                       #   for example, see Grigori Fursin's PhD thesis with a technique
                                       #   to break assembler instructions to detect 
                                       #   memory performance bugs) 
  }, 
  ...
}, 

You can also check how to use pre- and post-processing (Python) scripts before and after running your program in these examples: a classification program using the Caffe framework and a classification program using ArmCL with MobileNets.

Updating software dependencies

If your new program relies on extra software dependencies (libraries, models, data sets), you must first find the ones you need in this index of shared CK software detection plugins and then add them to either the "compile_deps" or "run_deps" dictionary key in the meta.json of your new program as follows:

"compile_deps": {
  "compiler": {
    "local": "yes", 
    "name": "C compiler", 
    "sort": 10, 
    "tags": "compiler,lang-c"
  }, 
  "xopenme": {
    "local": "yes", 
    "name": "xOpenME library", 
    "sort": 20, 
    "tags": "lib,xopenme"
  }
}, 
"run_deps": {
  "lib-tensorflow": {
    "local": "yes",
    "name": "TensorFlow library",
    "sort": 10,
    "tags": "lib,tensorflow",
    "no_tags":"vsrc"
  },
  "tensorflow-model": {
    "local": "yes",
    "name": "TensorFlow model (net and weights)",
    "sort": 20,
    "tags": "tensorflowmodel,native"
  }
},

As a minimum, you just need to add a new sub-key such as "lib-tensorflow", a user-friendly name such as "TensorFlow library", several tags for a given software detection plugin from the above index (CK will use these tags to find related plugins), and the order in which dependencies are resolved using the "sort" key.
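
For example, here is a minimal sketch that adds such a run-time dependency by editing the program's meta.json directly; the entry path is hypothetical and would come from "ck find program:my-new-program":

import json
import os

# Hypothetical path as returned by "ck find program:my-new-program"
entry_path = '/home/user/CK/my-new-repo/program/my-new-program'
meta_file = os.path.join(entry_path, '.cm', 'meta.json')

with open(meta_file) as f:
    meta = json.load(f)

# Add a run-time dependency on the TensorFlow library
# (the tags must match a shared software detection plugin)
meta.setdefault('run_deps', {})['lib-tensorflow'] = {
    'local': 'yes',
    'name': 'TensorFlow library',
    'sort': 10,
    'tags': 'lib,tensorflow'
}

with open(meta_file, 'w') as f:
    json.dump(meta, f, indent=2, sort_keys=True)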

You can also select version ranges with the following keys:

    "version_from": [1,64,0], # inclusive
    "version_to": [1,66,0]    # exclusive

You can look at more complex examples in the meta information of the Caffe CUDA package: meta.json

Adding dependency on other repositories when reusing CK components

When you reuse existing CK components from other repositories, you need to add a dependency on all these repositories in the ".ckr.json" description of your new repository as follows:

...
"dict": {
  ...
  "repo_deps": [
    {
      "repo_uoa": "ck-env"
    },
    {
      "repo_uoa": "ck-autotuning"
    },
    {
      "repo_uoa": "ck-caffe",
      "repo_url": "https://github.com/dividiti/ck-caffe"
    }
  ] 
}, 

In this case, if another user pulls your repository, CK will automatically pull all the other required CK repositories!

When CK compiles or runs programs, it first automatically resolves all software dependencies. Each piece of detected software is registered in the CK virtual environment (see this getting started guide), and an env.sh or env.bat script is generated for it with multiple environment variables. These environment scripts are then loaded one after another, ordered by the "sort" key, to aggregate all required environment variables and pass them either to a compile script or to the final program. The CK program can then use all these variables to customize its execution and adapt to the user's environment and software selection.
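
Conceptually, the aggregation works as in the following illustrative sketch (this is not CK's actual implementation, and the paths are made up):

# Illustrative only: chain env scripts of resolved dependencies by their "sort" order
deps = {
    'compiler': {'sort': 10, 'env_script': '/path/to/env/compiler/env.sh'},
    'xopenme':  {'sort': 20, 'env_script': '/path/to/env/xopenme/env.sh'},
}

# Source each script in ascending "sort" order so that later scripts
# can rely on the environment variables set by earlier ones
ordered = sorted(deps.values(), key=lambda d: int(d['sort']))
script = '\n'.join('. ' + d['env_script'] for d in ordered)
print(script)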

Reusing or adding simple data sets

CK provides a mechanism to use simple shared data sets, such as individual images, in program workflows.

Some data sets are already shared in the ctuning-datasets-min repository.

If you want to use them in your program workflow, you can find a related one here, check its tags (see the meta.json of the image-jpeg-0001 entry), and add them to your program meta as follows:

"run_cmds": {
  "corners": {
    "dataset_tags": [
      "image", 
      "pgm", 
      "dataset"
    ], 
    "ignore_return_code": "no", 
    "run_time": {
      "run_cmd_main": "$#BIN_FILE#$ $#dataset_path#$$#dataset_filename#$ tmp-output.tmp -c" 
    }
  }
}, 

In this case, CK will search for all data set entries with these tags, ask the user which one to use if there are multiple entries, and substitute $#dataset_path#$$#dataset_filename#$ with the path and file from the selected entry.

For example, you can list all pgm images available in your local CK installation as follows:

$ ck pull repo:ctuning-datasets-min
$ ck search dataset --tags=dataset,image,pgm
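
The same query can be issued from Python; a minimal sketch assuming the standard "search" action, which returns the matching entries in the "lst" key:

import ck.kernel as ck

# Equivalent to "ck search dataset --tags=dataset,image,pgm"
r = ck.access({'action': 'search',
               'module_uoa': 'dataset',
               'tags': 'dataset,image,pgm'})
if r['return'] > 0:
    ck.err(r)

for entry in r['lst']:
    print(entry['data_uoa'])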

This approach avoids hardwired paths in ad-hoc scripts while making it easy to share and reuse related data sets.

For example, you can add a new data set to your new repository as follows:

$ ck add my-new-repo:dataset:my-new-dataset

You will be asked to enter tags and to select a file, which will be copied into the new entry.

However, if more complex data sets are required, such as multiple images for training AI frameworks, we suggest using CK packages as described below.

Adding new CK software detection plugins

If a CK plugin doesn't exist for the software you need, you can add your own CK software detection plugin either in your own repository or in the public ck-env repository.

First find an existing software detection plugin of a similar type using this index of shared CK plugins, and make a copy of it in your repository as follows:

$ ck copy soft:lib.armcl my-new-repo:soft:lib.my-new-lib
 or
$ ck copy soft:dataset.imagenet.train my-new-repo:soft:my-new-data-set

Alternatively, you can add a new soft entry and CK will ask you to select the closest template:

$ ck add my-new-repo:soft:my-new-data-set

You must then update the related keys in the ".cm/meta.json" of the new entry, which you can find as follows:

$ ck find soft:lib.my-new-lib
{
  "auto_detect": "yes",
  "customize": {
    "check_that_exists": "yes",
    "ck_version": 10,
    "env_prefix": "CK_ENV_LIB_ARMCL",
    "limit_recursion_dir_search": {
      "linux": 4,
      "win": 4
    },
    "soft_file": {
      "linux": "libarm_compute.a",
      "win": "arm_compute.lib"
    },
    "soft_path_example": {
    }
  },
  "soft_name": "ARM Compute Library",
  "tags": [
    "lib",
    "arm",
    "armcl",
    "arm-compute-library"
  ]
}

You must update "tags" for your new software, "soft_name" to provide user-friendly name of your software, "env_prefix" which is used to expose different environment variables for the detected software in automatically generated "env.sh" or "env.bat", and "soft_file" keys to tell CK which unique file to search when detecting this software.

Instead of "soft_file" dictionary, you can sometimes use a universal (portable) string as follows:

  "soft_file_universal": "libGL$#file_ext_dll#$",

where CK will substitute "file_ext_dll" with the "dll" key from the "file_extensions" dictionary of the target OS (see the examples for 64-bit Linux and 64-bit Windows).

You can tell CK to detect given software for a different target, such as Android, as follows:

$ ck detect soft:compiler.gcc.android.ndk --target_os=android21-arm64
$ ck detect soft --tags=compiler,android,ndk,llvm --target_os=android21-arm64
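
Programmatically, the same detection can be triggered via ck.access; a sketch assuming the usual CLI-to-dictionary mapping of flags:

import ck.kernel as ck

# Equivalent to "ck detect soft:compiler.gcc.android.ndk --target_os=android21-arm64"
r = ck.access({'action': 'detect',
               'module_uoa': 'soft',
               'data_uoa': 'compiler.gcc.android.ndk',
               'target_os': 'android21-arm64',
               'out': 'con'})
if r['return'] > 0:
    ck.err(r)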

Next, you may want to update "customize.py" in the new entry. It can have multiple functions to customize the detection of a given software package and the automatic generation of "env.sh" or "env.bat" when registering a virtual CK environment.

There are many options and nuances, so we suggest looking at existing examples or contacting the CK community for further details - we regularly explain different features to users and help them add new software plugins.

Briefly, "setup" function receives a full path to a found software file specified using above "soft_name" keys:

  cus=i.get('customize',{})
  fp=cus.get('full_path','')

This path is then used to prepare different environment variables (see the "env" dictionary), as well as to embed commands directly into "env.sh" or "env.bat" via the "s" string in the returned dictionary:

  return {'return':0, 'bat':s}
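
Putting these pieces together, a minimal "setup" function in customize.py might look as follows. This is only an illustrative sketch: the derived paths are assumptions, and real plugins often do much more (version detection, multiple files, target OS handling):

import os

def setup(i):

    # Full path to the software file found via the "soft_file" key,
    # e.g. .../install/lib/libarm_compute.a
    cus = i.get('customize', {})
    fp = cus.get('full_path', '')

    # Derive the library and installation directories from the found file
    path_lib = os.path.dirname(fp)
    path_install = os.path.dirname(path_lib)

    # The "env" dictionary is updated in place; CK turns it into env.sh / env.bat
    env = i.get('env', {})
    ep = cus.get('env_prefix', '')   # e.g. "CK_ENV_LIB_MY_NEW_LIB" from meta.json
    env[ep] = path_install
    env[ep + '_LIB'] = path_lib
    env[ep + '_INCLUDE'] = os.path.join(path_install, 'include')

    # Extra commands can be embedded verbatim into env.sh / env.bat via this string
    s = ''

    return {'return': 0, 'bat': s}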

Here is a sample "env.sh" generated on a user machine:

#! /bin/bash
# CK generated script

if [ "$1" != "1" ]; then if [ "$CK_ENV_LIB_ARMCL_SET" == "1" ]; then return; fi; fi

# Soft UOA           = lib.armcl (fc544df6941a5491)  (lib,arm,armcl,arm-compute-library,compiled-by-gcc,compiled-by-gcc-8.1.0,vopencl,vdefault,v18.05,v18,channel-stable,host-os-linux-64,target-os-linux-64,64bits,v18.5,v18.5.0)
# Host OS UOA        = linux-64 (4258b5fe54828a50)
# Target OS UOA      = linux-64 (4258b5fe54828a50)
# Target OS bits     = 64
# Tool version       = 18.05-b3a371b
# Tool split version = [18, 5, 0]

# Dependencies:
. /home/fursin/CK/local/env/fd0d1d044f44c09b/env.sh
. /home/fursin/CK/local/env/72fa25bd445a993f/env.sh

export CK_ENV_LIB_ARMCL_LIB=/home/fursin/CK-TOOLS/lib-armcl-opencl-18.05-gcc-8.1.0-linux-64/install/lib
export CK_ENV_LIB_ARMCL_INCLUDE=/home/fursin/CK-TOOLS/lib-armcl-opencl-18.05-gcc-8.1.0-linux-64/install/include


export LD_LIBRARY_PATH="/home/fursin/CK-TOOLS/lib-armcl-opencl-18.05-gcc-8.1.0-linux-64/install/lib":$LD_LIBRARY_PATH
export LIBRARY_PATH="/home/fursin/CK-TOOLS/lib-armcl-opencl-18.05-gcc-8.1.0-linux-64/install/lib":$LIBRARY_PATH

export CK_ENV_LIB_ARMCL=/home/fursin/CK-TOOLS/lib-armcl-opencl-18.05-gcc-8.1.0-linux-64/install
export CK_ENV_LIB_ARMCL_CL_KERNELS=/home/fursin/CK-TOOLS/lib-armcl-opencl-18.05-gcc-8.1.0-linux-64/src/src/core/CL/cl_kernels/
export CK_ENV_LIB_ARMCL_DYNAMIC_CORE_NAME=libarm_compute_core.so
export CK_ENV_LIB_ARMCL_DYNAMIC_NAME=libarm_compute.so
export CK_ENV_LIB_ARMCL_LFLAG=-larm_compute
export CK_ENV_LIB_ARMCL_LFLAG_CORE=-larm_compute_core
export CK_ENV_LIB_ARMCL_SRC=/home/fursin/CK-TOOLS/lib-armcl-opencl-18.05-gcc-8.1.0-linux-64/src
export CK_ENV_LIB_ARMCL_SRC_INCLUDE=/home/fursin/CK-TOOLS/lib-armcl-opencl-18.05-gcc-8.1.0-linux-64/src/include
export CK_ENV_LIB_ARMCL_STATIC_CORE_NAME=libarm_compute_core.a
export CK_ENV_LIB_ARMCL_STATIC_NAME=libarm_compute.a
export CK_ENV_LIB_ARMCL_TESTS=/home/fursin/CK-TOOLS/lib-armcl-opencl-18.05-gcc-8.1.0-linux-64/src/tests
export CK_ENV_LIB_ARMCL_UTILS=/home/fursin/CK-TOOLS/lib-armcl-opencl-18.05-gcc-8.1.0-linux-64/src/utils

export CK_ENV_LIB_ARMCL_SET=1

All these environment variables will be exposed to the CK program compilation and execution flow if this software dependency is selected in the program meta description.



Note that we plan to standardize and document this process with our CK consortium and partners.

You can also look at how this functionality is implemented in the soft module.

Adding new CK packages

Whenever required software is not found, CK will automatically search for existing CK packages with the same tags.

The CK package module provides a unified JSON API to automatically install, rebuild and mix a given software package with already installed software in a portable way across Linux, Windows, MacOS and Android. Basically, it is a unified front-end for other download and build tools such as wget, make, cmake, scons, EasyBuild, Spack, etc.

If no CK packages are found for a given software package, CK will print notes from the "install.txt" file of the software detection plugin about how to install it manually, as shown in this example for the CUDA compiler.

However, you may want to provide a related CK package for such software to automate its installation and let the community reuse it (unless you make a private CK package for your own workgroup).

Similar to CK software detection plugins, you should first find the closest package in this index of shared CK packages and make a copy in your repository (unless you want to share it immediately with the community in already existing CK repositories).

For example, let's copy the package for the Protobuf library, which is downloaded as a tgz archive and configured for building with cmake:

$ ck cp package:lib-protobuf-3.5.1-host my-new-repo:package:my-new-lib

Most importantly, you first need to connect this package with a related software detection plugin, for example the "soft:lib.my-new-lib" entry created in the previous section. Find its unique CK ID as follows:

$ ck info soft:lib.my-new-lib
and add this UID to the "soft_uoa" key of the package meta.

Next, copy *the same* tags from the meta information of the soft plugin to the package meta, and add extra tags specifying the version. See examples of tags in existing packages such as lib-armcl-opencl-18.05 and compiler-llvm-6.0.0-universal.

Alternatively, you can add a new package using existing templates while specifying a related software plugin in the command line as follows:

$ ck add my-new-repo:package:my-new-lib --soft=lib.my-new-lib

In this case, CK will automatically fill in the correct values for "soft_uoa" and "tags"!

Then you need to update the ".cm/meta.json" of the new package, which you can find as follows:

$ ck find package:my-new-lib

You may also need to update other keys in the package meta to customize downloading and building (not required when you simply download binary packages such as the LLVM compiler or various AI models):

 "install_env": {
    "CMAKE_CONFIG": "Release",
    "PACKAGE_AUTOGEN": "NO",
    "PACKAGE_BUILD_TYPE": "cmake",
    "PACKAGE_CONFIGURE_FLAGS": "-Dprotobuf_BUILD_TESTS=OFF",
    "PACKAGE_CONFIGURE_FLAGS_LINUX": "-DCMAKE_INSTALL_LIBDIR=lib",
    "PACKAGE_CONFIGURE_FLAGS_WINDOWS": "-DBUILD_SHARED_LIBS=OFF -Dprotobuf_MSVC_STATIC_RUNTIME=OFF",
    "PACKAGE_FLAGS_LINUX": "-fPIC",
    "PACKAGE_NAME": "v3.5.1.tar.gz",
    "PACKAGE_NAME1": "v3.5.1.tar",
    "PACKAGE_NAME2": "v3.5.1",
    "PACKAGE_RENAME": "YES",
    "PACKAGE_SUB_DIR": "protobuf-3.5.1",
    "PACKAGE_SUB_DIR1": "protobuf-3.5.1/cmake",
    "PACKAGE_UNGZIP": "YES",
    "PACKAGE_UNTAR": "YES",
    "PACKAGE_UNTAR_SKIP_ERROR_WIN": "YES",
    "PACKAGE_URL": "https://github.com/google/protobuf/archive",
    "PACKAGE_WGET": "YES"
  },
  "version": "3.5.1"

You also need to describe other software dependencies, if needed, using the "deps" dictionary.

You must also describe the file which will be downloaded or created at the end of package installation process using "end_full_path" key:

"end_full_path": {
  "linux": "install$#sep#$lib$#sep#$libprotobuf.a",
  "win": "install\\lib\\libprotobuf.lib"

If your package uses its own install script to download and possibly build a given package, you need to update it too. See the examples of install.sh and install.bat which download the ImageNet 2012 aux data set (used in the 1st ACM ReQuEST-ASPLOS'18 tournament) and register it in the CK virtual environment.

Note that CK will pass at least two environment variables to this script (see the sketch after this list):

  • PACKAGE_DIR - the path to the CK package entry. This is useful if your script needs additional files or subscripts from the CK package entry.
  • INSTALL_DIR - the path where this package will be installed. Note that the "end_full_path" key will be appended to this path!
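
For instance, if part of your installation logic is written in Python (or install.sh calls a Python helper), these variables can be read in the standard way. A trivial sketch:

import os

# Variables exported by CK before calling the install script
package_dir = os.environ.get('PACKAGE_DIR', '')  # CK package entry (extra files, subscripts)
install_dir = os.environ.get('INSTALL_DIR', '')  # where the package must be installed

print('Package entry:   ' + package_dir)
print('Installing into: ' + install_dir)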

If you need to know the extra CK variables passed to this script, you can simply export all environment variables to a file and check the ones starting with CK_.

For example, if your package has software dependencies, say on a specific Python version, all environment variables from the resolved software dependencies will be available in your installation script. This allows you, for example, to use "${CK_ENV_COMPILER_PYTHON_FILE}" instead of python so that the correct Python version is used in all packages.

At the end of the package installation, CK will check that this file was created and will pass it to the software detection plugin to register a CK virtual environment - this fully automates rebuilding the required environment for a given workflow using CK!

Yet again, we have only described a tiny part of the available functionality of the CK package manager. Over the past 4 years we have added many practical features based on feedback from our partners and end-users. Feel free to look at existing packages and reuse them:

If you want to create a package which simply fetches a source archive, configures it, and builds it with a makefile, use the lib-openmpi-1.10.3-universal CK package as a template:

  • "PACKAGE_URL": "https://www.open-mpi.org/software/ompi/v1.10/downloads"
  • "PACKAGE_NAME": "openmpi-1.10.3.tar.gz"
  • "PACKAGE_NAME1": "openmpi-1.10.3.tar"
  • "PACKAGE_NAME2": "openmpi-1.10.3"
  • "PACKAGE_SUB_DIR": "openmpi-1.10.3"
  • "PACKAGE_SUB_DIR1": "openmpi-1.10.3"
  • "linux": "install/lib/libmpi.so"

Note that the community plans to improve and simplify this package manager in 2019! Don't hesitate to contact the CK community if you have problems or questions - we always help CK users add new packages!

Creating experimental workflows

Users can assemble experimental workflows in CK in two ways:

Traditional way using systems scripts

We added module "script" which allows you to add CK entry with different system scripts to prepare, run and validate experiments. Such scripts can call different ck modules to install packages, build and run programs, prepare interactive graphs and articles, etc.

You can see examples of such scripts from this CGO'17 paper, which won a distinguished artifact award. The Unified Artifact Appendix in this article describes how to run those scripts.

You can add your own CK script entry as follows:

 $ ck add my-new-repo:script:my-scripts-to-run-experiments
 $ ck add my-new-repo:script:my-scripts-to-generate-articles

Instead of system scripts, you can also write Python scripts. For example, this ReQuEST-ASPLOS'18 submission uses benchmark.py to prepare, run and customize experiments. It imports the CK kernel as a Python module and thus calls CK functions directly from Python:

#! /usr/bin/python

import ck.kernel as ck
import os

...

def do(i, arg):

    # Process arguments.
    if (arg.accuracy):
        experiment_type = 'accuracy'
        num_repetitions = 1
    else:
        experiment_type = 'performance'
        num_repetitions = arg.repetitions

    random_name = arg.random_name
    share_platform = arg.share_platform

    # Detect basic platform info.
    ii={'action':'detect',
        'module_uoa':'platform',
        'out':'con'}
    if share_platform: ii['exchange']='yes'
    r=ck.access(ii)
    if r['return']>0: return r
...

CK way using CK modules

You can also add a new module "workflow.my-new-experiments" with different functions to prepare, run and validate experiments. This is the preferred method, since it allows you to use unified CK APIs and to reuse this module in other projects:

 $ ck add my-new-repo:module:workflow.my-new-experiments

You can then add a "run" function to run your workflow (or any other function you need):

 $ ck add_action my-new-repo:module:workflow.my-new-experiments --func=run

CK will create a working dummy function which you can test as follows:

 $ ck run workflow.my-new-experiments

You can then find "module.py" and change "run" function to implement your workflow:

 $ ck find module:workflow.my-new-experiments
 $ ls *.py
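
For reference, the generated "run" function follows the standard CK module convention: it receives a dictionary and returns a dictionary with a "return" code. A minimal sketch of how it might be filled in (the global "ck" object is initialized by the CK kernel when the module is loaded):

# module.py (fragment)

ck = None  # will be initialized by the CK kernel

def run(i):
    """Run experimental workflow.

    Input:  i - dictionary with the action parameters
    Output: dictionary with 'return':0 on success,
            or 'return'>0 and 'error' on failure
    """

    # Reuse other CK modules via the unified API,
    # e.g. detect basic information about the current platform
    r = ck.access({'action': 'detect',
                   'module_uoa': 'platform',
                   'out': 'con'})
    if r['return'] > 0:
        return r

    return {'return': 0}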

Don't hesitate to get in touch with the CK community if you have questions or comments.

Using unified predictive analytics

We had a very tedious and time-consuming experience creating a machine-learning based compiler: we had to prepare numerous scripts and workflows to train and use different models across different frameworks with incompatible APIs and data formats, and then continuously update them whenever those APIs or data formats changed. This motivated us to create a common CK API for training, validating and using models. You can now reuse this functionality in your experimental workflows as described on this page. You can see a real usage example of such unified predictive analytics via CK in this report describing "a Collective Knowledge workflow for collaborative research into multi-objective autotuning and machine learning techniques", sponsored by the Raspberry Pi Foundation.

Generating (interactive) articles

Unified CK APIs and experimental workflows allow us to automate the generation of tables and graphs for articles, and even the generation of interactive papers. For example, this interactive report and its ArXiv PDF were automatically generated by CK from this CK repository.

We recently started documenting this functionality here and will continue improving it. In the meantime, if you are interested in automating the generation of your paper, don't hesitate to get in touch with the CK community.

Archiving a given repository

You can archive a given repository (for example with experimental results) as follows:

$ ck zip repo:my-new-repo

This command will create ckr-my-new-repo.zip, which you can send to your colleagues or archive at Zenodo, etc. Other colleagues can then add it to their local user space as follows:

$ ck add repo --zip=ckr-my-new-repo.zip

They can also unzip the entries into an existing repository (local by default) as follows:

$ ck unzip repo --zip=ckr-my-new-repo.zip


Preparing CK artifact pack for Digital Libraries

During the 1st ACM ReQuEST-ASPLOS'18 tournament (http://cKnowledge.org/request), the authors shared snapshots of their implementations of efficient deep learning algorithms with the CK workflows in the ACM Digital Library (https://doi.org/10.1145/3229762).

You can see the ACM DL links with artifacts for all accepted workflows at https://github.com/ctuning/ck-request-asplos18-results.

We added new functionality to CK to automatically prepare such a snapshot of a given repository, with all its dependencies on other CK repositories, together with the latest CK framework, in one zip file:

 $ ck snapshot artifact --repo=my-new-repo

It will create ck-artifacts-{date}.zip containing all the CK repositories, the CK framework and two scripts:

  • prepare_virtual_ck.bat
  • run_virtual_ck.bat

The first script will unzip all CK repositories and the CK framework inside your current directory.

The second script will set environment variables to point to the above CK repositories in such a way that it does not influence your existing CK installation! Basically, it creates a virtual CK environment for a given CK snapshot. At the end, this script runs bash (or cmd on Windows), allowing you to run CK commands to install and run a given CK workflow.

Preparing Docker image

Note that CK is orthogonal to Docker images. For example, the final Intel submission of optimized Caffe to the ReQuEST tournament included a Docker image with CK workflows, which you can test here.

You can find instructions on how to build and share a Docker image for your CK repository in the README of the ck-docker repository.

Preparing more sophisticated workflows

Researchers can create even more complex CK workflows which automatically compile, run and validate multiple applications with different compilers, data sets and models across different platforms, while sharing, visualizing and comparing experimental results via a customizable dashboard and automatically generating interactive and reproducible articles, as shown in various CK-powered projects.

Questions, comments and next steps

We continue to standardize, improve and automate the CK framework, its individual components and APIs (see the ACM ReQuEST-ASPLOS'18 report), but we also hope that you can help us improve this documentation or write blog articles similar to this one about your CK experience and hints.

We also regularly help our partners and end-users reuse or improve existing CK components and implement new ones via the CK Google group or the CK Slack channel - feel free to get in touch with the CK community if you have questions or comments, or if you would like to participate in these community activities!
