updated docs (portable workflows)
gfursin committed Sep 2, 2020
1 parent 9ebd21e commit ed8eaba
Showing 5 changed files with 296 additions and 6 deletions.
16 changes: 16 additions & 0 deletions README.md
@@ -70,6 +70,22 @@ and ACM at [cKnowledge.org/partners](https://cKnowledge.org/partners.html).
* [experiment replay]( https://cKnowledge.io/c/module/experiment )
* [live scoreboards]( https://cKnowledge.io/reproduced-results )
* [Real-world use-cases](https://cKnowledge.org/partners)
* [MLPerf CK solution (GUI)](https://cKnowledge.io/test)
* [MLPerf CK workflows and components (development version)](https://github.com/ctuning/ck-mlperf)
* [ck-mlperf:soft:lib.mlperf.loadgen.static](https://github.com/ctuning/ck-mlperf/tree/master/soft/lib.mlperf.loadgen.static)
* [ck-mlperf:package:lib-mlperf-loadgen-static](https://github.com/ctuning/ck-mlperf/tree/master/package/lib-mlperf-loadgen-static)
* [ck-mlperf:package:model-onnx-mlperf-mobilenet](https://github.com/ctuning/ck-mlperf/tree/master/package/model-onnx-mlperf-mobilenet/.cm)
* [ck-tensorflow:package:lib-tflite](https://github.com/ctuning/ck-tensorflow/tree/master/package/lib-tflite)
* [ck-mlperf:program:image-classification-tflite-loadgen](https://github.com/ctuning/ck-mlperf/tree/master/program/image-classification-tflite-loadgen)
* [ck-tensorflow:program:image-classification-tflite](https://github.com/ctuning/ck-tensorflow/tree/master/program/image-classification-tflite)
* [ck-mlperf:docker:*](https://github.com/ctuning/ck-mlperf/tree/master/docker)
* [ck-mlperf:docker:speech-recognition.rnnt](https://github.com/ctuning/ck-mlperf/tree/master/docker/speech-recognition.rnnt)
* [ck-object-detection:docker:object-detection-tf-py.tensorrt.ubuntu-18.04](https://github.com/ctuning/ck-object-detection/blob/master/docker/object-detection-tf-py.tensorrt.ubuntu-18.04)
* [ck-object-detection:package:model-tf-*](https://github.com/ctuning/ck-object-detection/tree/master/package)
* [ck-mlperf:script:mlperf-inference-v0.7.image-classification](https://github.com/ctuning/ck-mlperf/tree/master/script/mlperf-inference-v0.7.image-classification)
* [ck-object-detection:jnotebook:object-detection](https://nbviewer.jupyter.org/urls/dl.dropbox.com/s/5yqb6fy1nbywi7x/medium-object-detection.20190923.ipynb)
* [MLPerf stable crowd-benchmarking demo with the live scoreboard](https://cKnowledge.io/test)
* [CK-based live research paper (collaboration with the Raspberry Pi foundation)](https://cKnowledge.io/report/rpi3-crowd-tuning-2017-interactive).
* [Publications](https://github.com/ctuning/ck/wiki/Publications)

## Installation
1 change: 1 addition & 0 deletions docs/index.rst
@@ -45,6 +45,7 @@ and ACM: https://cKnowledge.org/partners.

src/commands
src/specs
src/portable-workflows
src/how-to-contribute

.. toctree::
59 changes: 55 additions & 4 deletions docs/src/commands.md
@@ -1,4 +1,4 @@
# CK commands and APIs
# CK CLI and API

Most of the CK functionality is implemented using [CK modules](https://cKnowledge.io/modules)
with [automation actions]( https://cKnowledge.io/actions ) and associated
@@ -24,7 +24,7 @@ You can also use a JSON file as the input to a given action:
ck {action} ... @input.json
```

## Managing CK repositories
## CLI to manage CK repositories

* Automation actions are implemented using the internal CK module [*repo*]( https://cknowledge.io/c/module/repo ).
* See the list of all automation actions and their API at [cKnowledge.io platform]( https://cknowledge.io/c/module/repo/#api ).
@@ -142,7 +142,7 @@ ck unzip repo:{CK repo name} --zip={path to a zip file with the CK repo}



## Managing CK entries
## CLI to manage CK entries

A CK repository is essentially a database of CK modules and entries.
You can see the internal CK commands for managing CK entries as follows:
@@ -422,7 +422,7 @@ ck cp ctuning-datasets-min:dataset:image-jpeg-dnn-computer-mouse local::new-imag



## Managing CK actions
## CLI to manage CK actions

All the functionality in CK is implemented as automation actions in CK modules.

@@ -557,8 +557,59 @@ Finally, a given CK module has access to the 3 dictionaries:

## CK Python API

One of the goals of the CK framework was to make it very simple for any user to invoke any automation action.
That is why we developed a single [unified Python "access" function](https://ck.readthedocs.io/en/latest/src/ck.html#ck.kernel.access)
that exposes all automation actions through a simple I/O convention: a dictionary as input and a dictionary as output.

You can call this function from any Python script or from CK modules as follows:

```Python
import ck.kernel as ck

# Prepare the input dictionary.
# Check the other input keys supported by a given automation action
# of a given CK module with: ck action module --help
i={'action':'...',        # specify action
   'module_uoa':'...'     # specify CK module UID or alias
  }

r=ck.access(i)

if r['return']>0: return r    # inside CK modules: propagate the error to all CK callers
#if r['return']>0: ck.err(r)  # inside Python scripts: print the error and exit

# The output dictionary r contains keys produced by the given automation action.
# See the API of this action (ck action module --help).
```

This approach lets users keep extending automation actions by adding new keys
while preserving backward compatibility. That is how we managed to develop 50+ modules with the community
without breaking portable CK workflows for our ML&systems R&D.
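The backward compatibility enabled by this convention can be illustrated with a small self-contained sketch. The `detect_platform` action below is hypothetical and written purely for illustration; it is not part of the real CK kernel:

```python
# A minimal sketch of CK's dict-in/dict-out convention.
# 'return'==0 means success; 'return'>0 carries an error message.
# detect_platform is a hypothetical action, NOT the real ck.kernel API.

def detect_platform(i):
    if 'action' not in i:
        return {'return': 1, 'error': "'action' key is missing"}

    # A key added later ('verbose') is optional: old callers that
    # never pass it keep working, because absent keys get defaults.
    verbose = i.get('verbose', 'no')

    out = {'return': 0, 'os': 'linux'}
    if verbose == 'yes':
        out['details'] = 'extra diagnostics'
    return out

# An "old" caller written before 'verbose' existed still works:
r = detect_platform({'action': 'detect'})
assert r['return'] == 0 and r['os'] == 'linux'

# A "new" caller opts into the extended behavior:
r = detect_platform({'action': 'detect', 'verbose': 'yes'})
assert r['details'] == 'extra diagnostics'
```

Because every extension is just a new optional dictionary key, no function signature ever needs to change between module versions.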

We have also implemented a number of "productivity" functions
in the CK kernel that are commonly used by many researchers and engineers.
For example, you can load JSON files, list files in directories, and copy strings to the clipboard.
We made sure that these functions work in the same way across
different Python versions (2.7+ and 3+) and different operating systems,
thus removing this burden from developers.

You can see the list of such productivity functions [here](https://ck.readthedocs.io/en/latest/src/ck.html).
For example, you can [load a JSON file](https://ck.readthedocs.io/en/latest/src/ck.html#ck.kernel.load_json_file)
from your script or CK module in a unified way as follows:

```Python
import ck.kernel as ck

# Load and parse a JSON file:
r=ck.load_json_file({'json_file':'some_file.json'})
if r['return']>0: ck.err(r)

d=r['dict']

d['modify_some_key']='new value'

# Save the updated dictionary to a new JSON file with sorted keys:
r=ck.save_json_to_file({'json_file':'new_file.json', 'dict':d, 'sort_keys':'yes'})
if r['return']>0: ck.err(r)
```
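To show how this convention looks from the caller's side without installing CK, here is a self-contained sketch that mimics these two kernel functions using only the Python standard library. The re-implementations below are illustrative stand-ins, not the real `ck.kernel` code:

```python
import json
import os
import tempfile

# Illustrative stand-ins mimicking the CK kernel I/O convention
# (NOT the real ck.kernel implementations).

def load_json_file(i):
    # {'return':0, 'dict':...} on success; {'return':1, 'error':...} on failure.
    try:
        with open(i['json_file']) as f:
            return {'return': 0, 'dict': json.load(f)}
    except OSError as e:
        return {'return': 1, 'error': str(e)}

def save_json_to_file(i):
    with open(i['json_file'], 'w') as f:
        json.dump(i['dict'], f, indent=2,
                  sort_keys=(i.get('sort_keys') == 'yes'))
    return {'return': 0}

# Round trip: save a dictionary, load it back, check the error code.
path = os.path.join(tempfile.mkdtemp(), 'some_file.json')
save_json_to_file({'json_file': path, 'dict': {'b': 2, 'a': 1}, 'sort_keys': 'yes'})

r = load_json_file({'json_file': path})
assert r['return'] == 0 and r['dict'] == {'a': 1, 'b': 2}
```

Note how errors travel in the output dictionary rather than as exceptions, which is what makes the convention uniform across kernel functions and automation actions.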



16 changes: 14 additions & 2 deletions docs/src/introduction.md
@@ -197,8 +197,8 @@ make it more collaborative, reproducible, and reusable,
enable portable MLOps, and make it possible to understand [what happens]( https://cknowledge.io/solution/demo-obj-detection-coco-tf-cpu-benchmark-linux-portable-workflows/#dependencies )
inside complex and "black box" computational systems.

Our dream is to see portable workflows shared along with [all published research techniques](https://cKnowledge.io/events)
to be able to reuse and compare them across different data sets, models, software, and hardware!
Our dream is to see portable workflows shared along with new systems, algorithms, and [published research techniques](https://cKnowledge.io/events)
to be able to quickly test, reuse and compare them across different data sets, models, software, and hardware!
That is why we support related reproducibility and benchmarking initiatives
including [artifact evaluation](https://cTuning.org/ae),
[MLPerf](https://mlperf.org), [PapersWithCode](https://paperswithcode.com),
@@ -238,7 +238,19 @@ interested to know more!*
## CK use cases

* [Real-world use cases from our industrial and academic partners](https://cKnowledge.org/partners.html)
* [MLPerf CK solution (GUI)](https://cKnowledge.io/test)
* [MLPerf CK workflow (development version)](https://github.com/ctuning/ck-mlperf)
* [ck-mlperf:soft:lib.mlperf.loadgen.static](https://github.com/ctuning/ck-mlperf/tree/master/soft/lib.mlperf.loadgen.static)
* [ck-mlperf:package:lib-mlperf-loadgen-static](https://github.com/ctuning/ck-mlperf/tree/master/package/lib-mlperf-loadgen-static)
* [ck-mlperf:package:model-onnx-mlperf-mobilenet](https://github.com/ctuning/ck-mlperf/tree/master/package/model-onnx-mlperf-mobilenet/.cm)
* [ck-tensorflow:package:lib-tflite](https://github.com/ctuning/ck-tensorflow/tree/master/package/lib-tflite)
* [ck-mlperf:program:image-classification-tflite-loadgen](https://github.com/ctuning/ck-mlperf/tree/master/program/image-classification-tflite-loadgen)
* [ck-tensorflow:program:image-classification-tflite](https://github.com/ctuning/ck-tensorflow/tree/master/program/image-classification-tflite)
* [ck-mlperf:docker:*](https://github.com/ctuning/ck-mlperf/tree/master/docker)
* [ck-mlperf:docker:speech-recognition.rnnt](https://github.com/ctuning/ck-mlperf/tree/master/docker/speech-recognition.rnnt)
* [ck-object-detection:package:model-tf-*](https://github.com/ctuning/ck-object-detection/tree/master/package)
* [ck-mlperf:script:mlperf-inference-v0.7.image-classification](https://github.com/ctuning/ck-mlperf/tree/master/script/mlperf-inference-v0.7.image-classification)
* [ck-object-detection:jnotebook:object-detection](https://nbviewer.jupyter.org/urls/dl.dropbox.com/s/5yqb6fy1nbywi7x/medium-object-detection.20190923.ipynb)
* [MLPerf stable crowd-benchmarking demo with the live scoreboard](https://cKnowledge.io/test)
* [CK-based live research paper (collaboration with the Raspberry Pi foundation)](https://cKnowledge.io/report/rpi3-crowd-tuning-2017-interactive).

210 changes: 210 additions & 0 deletions docs/src/portable-workflows.md
@@ -0,0 +1,210 @@
# Automating ML&systems R&D

After releasing CK, we started working with the community to [gradually automate](introduction.md#how-ck-supports-collaborative-and-reproducible-mlsystems-research)
the most common and repetitive tasks in ML&systems R&D (see the [FastPath'20 presentation](https://doi.org/10.5281/zenodo.4005773)).

We started adding the following CK modules and actions with a unified API and I/O.

## Platform and environment detection

These CK modules automate and unify the detection of different properties of user platforms and environments.

* *module:os* [[API](https://cknowledge.io/c/module/os/#api)] [[components](https://cKnowledge.io/c/os)]
* *module:platform* [[API](https://cknowledge.io/c/module/platform/#api)]
* *module:platform.os* [[API](https://cknowledge.io/c/module/platform.os/#api)]
* *module:platform.cpu* [[API](https://cknowledge.io/c/module/platform.cpu/#api)]
* *module:platform.gpu* [[API](https://cknowledge.io/c/module/platform.gpu/#api)]
* *module:platform.gpgpu* [[API](https://cknowledge.io/c/module/platform.gpgpu/#api)]
* *module:platform.nn* [[API](https://cknowledge.io/c/module/platform.nn/#api)]

Examples:
```bash
ck detect platform
ck detect platform.gpgpu --cuda
```

## Software detection

This CK module automates the detection of given software and files (datasets, models, libraries, compilers, frameworks, tools, scripts)
on a given platform using CK names, UIDs, and tags:

* *module:soft* [[API](https://cknowledge.io/c/module/soft/#api)] [[components](https://cKnowledge.io/c/soft)]

It helps CK understand the user's platform and environment when preparing portable workflows.

Examples:
```bash
ck detect soft:compiler.python
ck detect soft --tags=compiler,python
ck detect soft:compiler.llvm
ck detect soft:compiler.llvm --target_os=android23-arm64
```


## Virtual environment

* *module:env* [[API](https://cknowledge.io/c/module/env/#api)]

Whenever given software or files are found by the software detection plugins,
CK creates a new "env" component in the local CK repository
with an env.sh (Linux/MacOS) or env.bat (Windows) file.

This environment file sets multiple environment variables
with unique names (usually starting with *CK_*) that hold automatically
detected information about the given software, such as its version and the paths
to its sources, binaries, include files, libraries, etc.

This allows you to detect and use multiple versions of different software
that co-exist on your system in parallel.

Examples:
```bash
ck detect soft:compiler.python
ck detect soft --tags=compiler,python
ck detect soft:compiler.llvm

ck show env
ck show env --tags=compiler
ck show env --tags=compiler,llvm
ck show env --tags=compiler,llvm --target_os=android23-arm64

ck virtual env --tags=compiler,python
```
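As a rough illustration, a generated env.sh could look like the sketch below. Every variable name, version, and path here is hypothetical; real files are produced by the corresponding soft detection plugin:

```shell
#!/bin/bash
# Hypothetical sketch of a CK-generated env.sh for a detected LLVM compiler.
# Variable names, versions, and paths are illustrative only.

export CK_ENV_COMPILER_LLVM_SET=1                    # marks this environment as prepared
export CK_ENV_COMPILER_LLVM_VERSION=10.0.0           # automatically detected version
export CK_ENV_COMPILER_LLVM_BIN=/usr/lib/llvm-10/bin # path to the detected binaries
export CK_CC="${CK_ENV_COMPILER_LLVM_BIN}/clang"     # unified alias used by workflows

echo "Using compiler: ${CK_CC}"
```

Because each detected version gets its own env entry, several such files can co-exist and be selected by tags, as in the `ck show env` examples above.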



## Meta packages

When given software is not detected on the system, we usually want to install the related packages with specific versions.

That is why we developed the following CK module to automate the installation of missing packages (models, datasets, tools, frameworks, compilers, etc.):

* *module:package* [[API](https://cknowledge.io/c/module/package/#api)] [[components](https://cKnowledge.io/c/package)]

This is a meta package manager that provides a unified API to automatically download, build, and install
packages for a given target (including mobile and edge devices)
using existing build tools and package managers.

Together, the above modules support portable workflows that automatically adapt to a given environment
based on [soft dependencies](https://cknowledge.io/solution/demo-obj-detection-coco-tf-cpu-benchmark-linux-portable-workflows/#dependencies).


Examples:

```bash
ck pull repo:ck-mlperf
ck install package --tags=lib,tflite,v2.1.1
ck install package --tags=tensorflowmodel,tflite,edgetpu
```

See an example of variations to customize a given package: [lib-tflite](https://github.com/ctuning/ck-tensorflow/tree/master/package/lib-tflite).


## Scripts

We also provided an abstraction for ad-hoc scripts:

* *module:script* [[API](https://cknowledge.io/c/module/script/#api)] [[components](https://cKnowledge.io/c/script)]

See an example of the CK component with a script used for MLPerf benchmark submissions: [GitHub](https://github.com/ctuning/ck-mlperf/tree/master/script/mlperf-inference-v0.7.image-classification)



## Portable program pipeline (workflow)

Next, we implemented a CK module that provides a common API to compile, run, and validate programs while automatically adapting to any platform and environment:

* *module:program* [[API](https://cknowledge.io/c/module/program/#api)] [[components](https://cKnowledge.io/c/program)]

Users describe dependencies on CK packages in the CK program meta, as well as the commands to build, pre-process, run, post-process, and validate a given program.

Examples:
```bash
ck pull repo:ck-crowdtuning

ck compile program:cbench-automotive-susan --speed
ck run program:cbench-automotive-susan --repeat=1 --env.OMP_NUM_THREADS=4
```

## Reproducible experiments

We have developed an abstraction to record and replay experiments using the following CK module:

* *module:experiment* [[API](https://cknowledge.io/c/module/experiment/#api)] [[components](https://cKnowledge.io/c/experiment)]

This module records all resolved dependencies, inputs, and outputs when running the above CK programs,
preserving experiments with full provenance so that they can be replayed later on the same or a different machine:

```bash
ck benchmark program:cbench-automotive-susan --record --record_uoa=my_experiment

ck find experiment:my_experiment

ck replay experiment:my_experiment

ck zip experiment:my_experiment
```

## Dashboards

Since we record all experiments in a unified way, we can also visualize them in a unified way.
That is why we developed a simple web server that helps create customizable dashboards:

* *module:web* [[API](https://cknowledge.io/c/module/web/#api)]

See examples of such dashboards:
* [view online at cKnowledge.io platform](https://cKnowledge.io/reproduced-results)
* [view locally (with or without Docker)](https://github.com/ctuning/ck-mlperf/tree/master/docker/image-classification-tflite.dashboard.ubuntu-18.04)




## Interactive articles

One of our goals for CK was to automate the (re-)generation of reproducible articles.
We have validated this possibility in [this proof-of-concept project](https://cKnowledge.org/rpi-crowd-tuning)
with the Raspberry Pi foundation.

We plan to develop a GUI to make the process of generating such papers more user-friendly!




## Jupyter notebooks

It is possible to use CK from Jupyter and Colab notebooks. We provided an abstraction to share Jupyter notebooks in CK repositories:

* *module:jnotebook* [[API](https://cknowledge.io/c/module/jnotebook/#api)] [[components](https://cKnowledge.io/c/jnotebook)]

You can see an example of a Jupyter notebook with CK commands to process MLPerf benchmark results
[here](https://nbviewer.jupyter.org/urls/dl.dropbox.com/s/5yqb6fy1nbywi7x/medium-object-detection.20190923.ipynb).



## Docker

We provided an abstraction to build, pull, and run Docker images:

* *module:docker* [[API](https://cknowledge.io/c/module/docker/#api)] [[components](https://cKnowledge.io/c/docker)]

You can see examples of Docker images with unified CK commands to automate MLPerf benchmarking
[here](https://github.com/ctuning/ck-mlperf/tree/master/docker).



# Further info

During the past few years, we converted all the workflows and components from our past ML&systems R&D,
including the [MILEPOST and cTuning.org projects](https://github.com/ctuning/reproduce-milepost-project), to the CK format.

There are now [150+ CK modules](https://cKnowledge.io/modules) with actions that automate and abstract
many tedious and repetitive tasks in ML&systems R&D, including model training and prediction,
universal autotuning, ML/SW/HW co-design, model testing and deployment, paper generation, and so on:

* [A high level overview of portable CK workflows](https://cknowledge.org/high-level-overview.pdf)
* [A Collective Knowledge workflow for collaborative research into multi-objective autotuning and machine learning techniques (collaboration with the Raspberry Pi foundation)]( https://cKnowledge.org/report/rpi3-crowd-tuning-2017-interactive )
* [A summary of main CK-based projects with academic and industrial partners]( https://cKnowledge.org/partners.html )

Don't hesitate to [contact us](https://cKnowledge.org/contacts.html) if you have feedback or want to learn more about our plans!
