The Collective Knowledge framework helps to organize local code, data and scripts; convert them into portable, customizable and reusable components with a Python JSON API and an integrated package manager; quickly prototype research workflows on Linux, Windows, MacOS and Android; automate and crowdsource complex experiments; generate interactive papers, etc.

Introduction

Collective Knowledge (CK) is an open-source framework to speed up collaborative and reproducible R&D with reusable, customizable and portable components. Trusted by a growing number of academic and industrial partners, CK helps to automate artifact evaluation and accelerate complex experiments such as benchmarking, co-design and optimization of the whole SW/HW stack for AI/ML. Give it a try!

CK framework is based on agile, DevOps and Wikipedia principles helping users to:

Please check out the latest ACM ReQuEST-ASPLOS'18 report on the results of the first CK-powered competition on co-designing a Pareto-efficient SW/HW stack for deep learning, the CK motivation slides, and CK use cases from our partners, including ACM tournaments on reproducible SW/HW co-design of emerging workloads and artifact sharing via the ACM Digital Library.

Join the CK consortium to influence CK long-term developments and standardization of APIs and meta descriptions of all shared CK workflows and components!

CK resources

Minimal installation

The minimal installation requires:

  • Python 2.7 or 3.3+ (limitation is mainly due to unit tests)
  • Git command line client
  • wget (Linux/MacOS)
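
Before installing, it can help to confirm that these prerequisites are present. The following is a minimal sketch and not part of CK: the helper names `python_ok` and `missing_tools` are hypothetical, and the script only checks that the interpreter version falls in the range listed above and that `git` and `wget` can be found on the `PATH` (`wget` is needed on Linux/MacOS only).

```python
import shutil  # shutil.which requires Python 3.3+, so run this sketch with Python 3
import sys


def python_ok(version=None):
    """Return True for Python 2.7 or 3.3+, the range listed above."""
    major, minor = tuple(version or sys.version_info)[:2]
    return (major, minor) == (2, 7) or (major, minor) >= (3, 3)


def missing_tools(tools=("git", "wget"), which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    print("Python version supported:", python_ok())
    print("Missing tools:", missing_tools() or "none")
```

On Windows, pass `tools=("git",)` since `wget` is not listed as a requirement there.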

Linux/MacOS

You can install CK in your local user space as follows:

$ git clone http://github.com/ctuning/ck
$ export PATH=$PWD/ck/bin:$PATH
$ export PYTHONPATH=$PWD/ck:$PYTHONPATH

You can also install CK via pip with sudo to avoid setting up environment variables yourself:

$ sudo pip install ck

Finally, starting from Ubuntu 18.10, you can install CK via apt:

$ sudo apt install python-ck
 or
$ sudo apt install python3-ck

Windows

First you need to download and install a few dependencies from the following sites:

You can then install CK as follows:

 $ pip install ck

or

 $ git clone https://github.com/ctuning/ck.git ck-master
 $ set PATH={CURRENT PATH}\ck-master\bin;%PATH%
 $ set PYTHONPATH={CURRENT PATH}\ck-master;%PYTHONPATH%

Customization and troubleshooting

You can find troubleshooting notes or other ways to install CK such as via pip here. You can find how to customize your CK installation here.

Getting a first feel for portable and customizable workflows for collaborative benchmarking

Test ck:

$ ck version

Get the shared ck-tensorflow repo with all its dependencies:

$ ck pull repo:ck-tensorflow

List CK repos:

$ ck ls repo | sort

Find where CK repos are installed on your machine:

$ ck where repo:ck-tensorflow

Detect your platform properties via extensible CK plugins as follows (needed to unify benchmarking across diverse platforms with Linux, Windows, MacOS and Android):

$ ck detect platform

Now detect the compilers available on your machine and register virtual environments for them in CK:

$ ck detect soft --tags=compiler,gcc
$ ck detect soft --tags=compiler,llvm
$ ck detect soft --tags=compiler,icc

List the virtual environments registered in CK:

$ ck show env

We recommend setting up CK to install new packages inside CK virtual env entries:

$ ck set kernel var.install_to_env=yes

Now install the CPU version of TensorFlow via CK packages:

$ ck install package --tags=lib,tensorflow,vcpu,vprebuilt

Check that it was installed correctly:

$ ck show env --tags=lib,tensorflow

You can find a path to a given entry (with TF installation) as follows:

$ ck find env:{env UID from above list}

Start a CK virtual environment and test TF:

$ ck virtual env --tags=lib,tensorflow
$ ipython
> import tensorflow as tf

Run the CK classification workflow example using the installed TF:

$ ck run program:tensorflow --cmd_key=classify

Now you can try a more complex example to build Caffe with CUDA support and run classification. Note that CK should automatically detect your CUDA compilers, libraries and other deps or install missing packages:

$ ck pull repo --url=https://github.com/dividiti/ck-caffe
$ ck install package:lib-caffe-bvlc-master-cuda-universal
$ ck run program:caffe --cmd_key=classify

You can see how to install Caffe for Linux, MacOS, Windows and Android via CK here.

Finally, compile, run, benchmark and crowd-tune a C program (see shared optimization cases at http://cKnowledge.org/repo):

$ ck pull repo:ck-crowdtuning

$ ck ls program
$ ck ls dataset

$ ck compile program:cbench-automotive-susan --speed
$ ck run program:cbench-automotive-susan

$ ck benchmark program:cbench-automotive-susan

$ ck crowdtune program:cbench-automotive-susan
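
The compile, run and benchmark commands above can also be driven from a script. The sketch below is an illustration rather than part of CK: the helpers `ck_command`, `benchmark_steps` and `run_steps` are hypothetical names, and it assumes that a `ck` executable is on the `PATH` and returns a non-zero exit code on failure. It only reuses the command-line forms shown above.

```python
import shutil
import subprocess


def ck_command(action, target, *flags):
    """Build an argument list such as ['ck', 'compile', 'program:x', '--speed']."""
    return ["ck", action, target] + list(flags)


def benchmark_steps(program):
    """Return the compile, run and benchmark commands shown above for one program."""
    target = "program:" + program
    return [
        ck_command("compile", target, "--speed"),
        ck_command("run", target),
        ck_command("benchmark", target),
    ]


def run_steps(steps, runner=subprocess.call):
    """Run each command in order; stop at and return the first non-zero exit code."""
    for step in steps:
        code = runner(step)
        if code != 0:
            return code
    return 0


if __name__ == "__main__":
    steps = benchmark_steps("cbench-automotive-susan")
    if shutil.which("ck") is None:
        # CK is not installed here, so only show what would be run.
        for step in steps:
            print(" ".join(step))
    else:
        print("exit code:", run_steps(steps))
```

Passing a different `runner` makes it possible to log or dry-run the commands without invoking CK.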

You can also quickly add your own program/workflow using the provided templates as follows:

$ ck add program:my-new-program

When CK asks you to select a template, please choose the C program "Hello world" template. You can then immediately compile and run your C program as follows:

$ ck compile program:my-new-program --speed
$ ck run program:my-new-program
$ ck run program:my-new-program --env.CK_VAR1=222
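
The same actions can also be invoked through CK's Python JSON API instead of the command line. The sketch below assumes the `ck.kernel.access` entry point, which takes a dictionary describing the action and returns a dictionary whose `return` key is zero on success. The helpers `ck_request` and `call_ck` are hypothetical, and mapping a bare `--speed` flag to `'speed': 'yes'` is an assumption about how CK translates command-line flags.

```python
def ck_request(action, module, data, **options):
    """Build the input dictionary for a call like `ck <action> <module>:<data>`."""
    request = {"action": action, "module_uoa": module, "data_uoa": data}
    request.update(options)
    return request


def call_ck(request):
    """Send the request to CK if it is importable; otherwise return an error dict."""
    try:
        import ck.kernel as ck  # assumes CK is installed or on PYTHONPATH
    except ImportError:
        return {"return": 1, "error": "CK is not installed"}
    return ck.access(request)


if __name__ == "__main__":
    # Assumed counterpart of: ck compile program:my-new-program --speed
    result = call_ck(ck_request("compile", "program", "my-new-program", speed="yes"))
    if result["return"] > 0:
        print("CK error:", result.get("error", "unknown"))
```

Keeping the request a plain dictionary means it can be saved as JSON and replayed later.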

Find and reuse other shared CK workflows and artifacts:

Further details:

Trying CK using Docker image

You can try CK using the following Docker image:

 $ (sudo) docker run -it ctuning/ck

Note that we added Docker automation to CK to help evaluate artifacts at conferences, share interactive and reproducible articles, crowdsource experiments and so on.

For example, you can participate in GCC or LLVM crowd-tuning on your machine simply as follows:

 $ (sudo) docker run ck-crowdtune-gcc
 $ (sudo) docker run ck-crowdtune-llvm

You can then browse top shared optimization results on the live CK scoreboard: http://cKnowledge.org/repo

Open ACM ReQuEST tournaments are now using our approach and technology to co-design efficient SW/HW stack for deep learning and other emerging workloads: http://cKnowledge.org/request

You can also download and view one of our CK-based interactive and reproducible articles as follows:

 $ ck pull repo:ck-docker
 $ ck run docker:ck-interactive-article --browser (--sudo)

See the list of other CK-related Docker images here.

Note, however, that the main idea behind CK is to let the community collaboratively improve common experimental workflows while making them adaptable to the latest environments and hardware, and gradually fixing reproducibility issues as described here!

Citing CK (BibTeX)

@inproceedings{ck-date16,
    title = {{Collective Knowledge}: towards {R\&D} sustainability},
    author = {Fursin, Grigori and Lokhmotov, Anton and Plowman, Ed},
    booktitle = {Proceedings of the Conference on Design, Automation and Test in Europe (DATE'16)},
    year = {2016},
    month = {March},
    url = {https://www.researchgate.net/publication/304010295_Collective_Knowledge_Towards_RD_Sustainability}
}
@inproceedings{cm:29db2248aba45e59:c4b24bff57f4ad07,
    author = {{Fursin}, Grigori and {Lokhmotov}, Anton and {Savenko}, Dmitry and {Upton}, Eben},
    title = "{A Collective Knowledge workflow for collaborative research into multi-objective autotuning and machine learning techniques}",
    journal = {ArXiv e-prints},
    archivePrefix = "arXiv",
    eprint = {1801.08024},
    primaryClass = "cs.CY",
    keywords = {Computer Science - Computers and Society, Computer Science - Software Engineering},
    year = 2018,
    month = jan,
    url = {https://arxiv.org/abs/1801.08024},
    adsurl = {http://adsabs.harvard.edu/abs/2018arXiv180108024F}
}

Some ideas were also originally presented in this 2009 paper.

Discussions/questions/comments

CK authors

License

  • Permissive 3-clause BSD license (see LICENSE.txt for more details).

Acknowledgments

CK development is coordinated by the cTuning foundation (a non-profit research organization) and dividiti. We would like to thank the TETRACOM 609491 Coordination Action for initial funding and all our partners for their continuing support. We are also extremely grateful to all volunteers for their valuable feedback and contributions.