improving docs
gfursin committed Aug 30, 2020
1 parent dccaa40 commit 275683a
Showing 8 changed files with 398 additions and 138 deletions.
2 changes: 2 additions & 0 deletions README.md
@@ -64,6 +64,8 @@ and ACM at [cKnowledge.org/partners](https://cKnowledge.org/partners.html).
* [Real-world use-cases](https://cKnowledge.org/partners)
* [Publications](https://github.com/ctuning/ck/wiki/Publications)

* [News](https://github.com/ctuning/ck/wiki/News-archive)

## Installation

Follow [this guide](https://ck.readthedocs.io/en/latest/src/installation.html)
5 changes: 3 additions & 2 deletions docs/index.rst
@@ -10,7 +10,8 @@ CK framework documentation
.. epigraph::

Collective Knowledge framework (CK) helps to organize any software project
as a database of reusable components with common automation actions
as a database of reusable components (algorithms, datasets, models, frameworks, scripts,
experimental results, papers, etc) with common automation actions
and extensible meta descriptions based on FAIR principles
(findability, accessibility, interoperability, and reusability).

@@ -35,7 +36,7 @@ and ACM: https://cKnowledge.org/partners.
:caption: Getting Started

src/installation
src/getting-started
src/first-steps

.. toctree::
:maxdepth: 2
13 changes: 12 additions & 1 deletion docs/src/commands.md
@@ -1,4 +1,15 @@
# CK commands
# CK commands and APIs

## Managing CK repositories

## Managing CK entries

## Managing CK actions

## CK Python API




We plan to rewrite this documentation when we have more resources.
In the meantime, have a look at this older wiki-based documentation:
315 changes: 315 additions & 0 deletions docs/src/first-steps.md
@@ -0,0 +1,315 @@
# First steps

## How CK enables portable and customizable workflows

We originally developed CK to help our [partners and collaborators](https://cKnowledge.org/partners.html)
implement modular, portable, customizable, and reusable workflows.
We needed such workflows to enable collaborative and reproducible ML&systems R&D
while focusing on [deep learning benchmarking and ML/SW/HW co-design](https://cKnowledge.org/request).
We also wanted to automate and reuse tedious tasks that are repeated across nearly all ML&systems projects
as described in our [FOSDEM presentation](https://zenodo.org/record/2556147#.XMViWKRS9PY).

In this section, we demonstrate how to use CK with portable and non-virtualized program workflows
that can automatically adapt to any platform and user environment, i.e. automatically detect
target platform properties and software dependencies and then compile and run a given program
with any compatible dataset and model in a unified way.

Note that this approach also supports our [reproducibility initiatives at ML&systems conferences](https://cTuning.org/ae)
to share portable workflows along with [published papers](https://cKnowledge.io/reproduced-papers).
Our goal is to make it easier for the community to reproduce research techniques, compare them,
build upon them, and adopt them in production.

## CK installation

Follow this [guide](installation.md) to install CK on Linux, MacOS, or Windows.
Don't hesitate to [contact us](https://cKnowledge.org/contacts.html)
if you encounter any problem or have questions.


## Pull CK repositories with the universal program workflow


Now you can pull the CK repository with the universal program workflow:

```bash
ck pull repo --url=https://github.com/ctuning/ck-crowdtuning
```

CK will automatically pull all required CK repositories with different automation actions, benchmarks, and datasets in the CK format.
You can see them as follows:

```bash
ck ls repo
```

By default, CK stores all CK repositories in the user space in *$HOME/CK-REPOS*. However, you can change it using the environment variable *CK_REPOS*.
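
As a minimal sketch of the lookup rule described above, the following Python snippet resolves the directory where CK repositories would be stored: it uses *CK_REPOS* when that variable is set and falls back to *$HOME/CK-REPOS* otherwise. The helper name `ck_repos_dir` is hypothetical and is not part of the CK API.

```python
import os


def ck_repos_dir(environ=None):
    """Return the directory for CK repositories (hypothetical helper, not a CK API)."""
    environ = os.environ if environ is None else environ
    # CK_REPOS overrides the default location when it is set and non-empty
    custom = environ.get('CK_REPOS')
    if custom:
        return custom
    # Default described in the text: $HOME/CK-REPOS
    return os.path.join(os.path.expanduser('~'), 'CK-REPOS')


print(ck_repos_dir())
```

Passing a dictionary instead of relying on `os.environ` makes it easy to check the rule without changing your real environment.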

## Manage CK entries


You can now see all shared program workflows in the CK format:

```bash
ck ls program
```

You can find and investigate the CK format for a given program (such as *cbench-automotive-susan*) as follows:

```bash
ck find program:cbench-automotive-susan
```

You can see the CK meta description of this program from the command line as follows:
```bash
ck load program:cbench-automotive-susan
ck load program:cbench-automotive-susan --min
```

It may be more convenient to check the structure of this entry at [GitHub](https://github.com/ctuning/ctuning-programs/tree/master/program/cbench-automotive-susan) with all the sources and meta-descriptions.

You can also see the CK JSON meta description for this CK program entry [here](https://github.com/ctuning/ctuning-programs/blob/master/program/cbench-automotive-susan/.cm/meta.json).
When you invoke automation actions in the CK module *program*, the automation code will read this meta description and perform actions for different programs accordingly.
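
To inspect such a meta description yourself, you can read the JSON file directly. The sketch below assumes only what is shown above, namely that the meta description of an entry lives in `.cm/meta.json` inside the entry directory; the helper name `load_meta` is hypothetical and is not part of the CK API.

```python
import io
import json
import os


def load_meta(entry_path):
    """Read the JSON meta description stored in <entry_path>/.cm/meta.json."""
    meta_file = os.path.join(entry_path, '.cm', 'meta.json')
    # io.open accepts an explicit encoding on both Python 2.7 and Python 3
    with io.open(meta_file, encoding='utf-8') as f:
        return json.load(f)
```

Here `entry_path` would be the directory printed by `ck find program:cbench-automotive-susan`; for example, `sorted(load_meta(entry_path))` lists the top-level keys of the meta description.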

## Invoke CK automation actions

You can now try to compile this program on your platform:

```bash
ck compile program:cbench-automotive-susan --speed
```

CK will invoke the function "compile" in the module "program" (you can see it at [GitHub](https://github.com/ctuning/ck-autotuning/blob/master/module/program/module.py#L3551)
or you can find the source code of this CK module locally using "ck find module:program"),
read the JSON meta of *cbench-automotive-susan*, and perform a given action.

In our case, CK will first attempt to automatically detect the properties of the platform
and all required software dependencies such as compilers and libraries that are already installed on this platform.
CK uses [multiple plugins](https://cKnowledge.io/soft) describing how to detect different software, models, and datasets.

Users can add their own plugins either in their own CK repositories or in already existing ones.

You can also perform software detection manually from the command line. For example, you can detect all installed GCC or LLVM versions:
```bash
ck detect soft:compiler.gcc
ck detect soft:compiler.llvm
```

Detected software is registered in the local CK repository together
with the automatically generated environment script (*env.sh* or *env.bat*)
specifying different environment variables for this software
(paths, versions, etc).
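
The idea of an environment script can be illustrated with a small Python sketch that renders a set of variables either as a shell script or as a Windows batch file. This is an illustration only: the function name, the variable names, and the exact line format are assumptions and do not reproduce the actual *env.sh* or *env.bat* files generated by CK.

```python
def render_env_script(variables, windows=False):
    """Render a script that sets the given environment variables.

    Illustrative only: this is not the exact format of the env.sh / env.bat
    files generated by CK.
    """
    lines = []
    for name in sorted(variables):
        value = variables[name]
        if windows:
            lines.append('set {}={}'.format(name, value))
        else:
            lines.append('export {}="{}"'.format(name, value))
    return '\n'.join(lines) + '\n'


# Hypothetical variable names and values, for illustration only
print(render_env_script({'EXAMPLE_GCC_BIN': '/usr/bin',
                         'EXAMPLE_GCC_VERSION': '9.3.0'}))
```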

You can list registered software as follows:

```bash
ck show env
ck show env --tags=compiler
```

You can use CK as a virtual environment similar to venv and Conda:
```bash
ck virtual env --tags=compiler,gcc
```

This approach allows us to separate CK workflows from hardwired dependencies and automatically plug in the required ones.

You can now run this program as follows:
```bash
ck run program:cbench-automotive-susan
```

While running the program, CK will collect and unify various characteristics (execution time, code size, etc).
This enables unified benchmarking that can be reused across different programs, datasets, models, and platforms.
Furthermore, we can continue improving this universal program workflow to monitor CPU/GPU frequencies,
perform statistical analysis of collected characteristics, validate outputs, etc:

```bash
ck benchmark program:cbench-automotive-susan --repetitions=4 --record --record_uoa=ck_entry_to_record_my_experiment
ck replay experiment:ck_entry_to_record_my_experiment
```
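
To give an idea of what statistical analysis over repeated runs can look like, here is a minimal Python 3 sketch that summarizes several measurements of one characteristic such as execution time. It is not the analysis performed by CK itself; the function name `summarize` and the sample numbers are made up for illustration.

```python
import statistics


def summarize(times):
    """Summarize repeated measurements of one characteristic (e.g. execution time)."""
    return {
        'repetitions': len(times),
        'min': min(times),
        'max': max(times),
        'mean': statistics.mean(times),
        # Sample standard deviation needs at least two measurements
        'stdev': statistics.stdev(times) if len(times) > 1 else 0.0,
    }


# Made-up execution times (seconds) for 4 repetitions, for illustration only
print(summarize([1.0, 1.5, 1.25, 1.25]))
```

Reporting the minimum and the spread alongside the mean helps to spot noisy runs before comparing results across platforms.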

Note that CK programs can automatically plug different datasets from CK entries
that can be shared by different users in different repos (for example, when publishing a new paper):

```bash
ck search dataset
ck search dataset --tags=jpeg
```
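
Tag-based search can be sketched in a few lines of Python. The snippet assumes that an entry matches when it carries every requested tag; the entry names and tags below are hypothetical, and the helper `match_tags` is not part of the CK API.

```python
def match_tags(entries, wanted):
    """Return names of entries whose tag list contains every tag in `wanted`.

    `entries` maps an entry name to its list of tags; `wanted` is a
    comma-separated string such as "jpeg" or "image,jpeg".
    """
    wanted_tags = {t.strip() for t in wanted.split(',') if t.strip()}
    return sorted(name for name, tags in entries.items()
                  if wanted_tags.issubset(tags))


# Hypothetical dataset entries, for illustration only
datasets = {
    'example-image-1': ['dataset', 'image', 'jpeg'],
    'example-image-2': ['dataset', 'image', 'png'],
    'example-audio-1': ['dataset', 'audio', 'wav'],
}
print(match_tags(datasets, 'jpeg'))
```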

Our goal is to help researchers reuse this universal CK program workflow
instead of rewriting complex infrastructure from scratch in each research project.

## Install missing packages

Note that if a given software dependency is not resolved,
CK will attempt to automatically install it using CK meta packages
(see the list of shared CK packages at [cKnowledge.io](https://cKnowledge.io/packages)).
Such meta packages contain JSON meta information and scripts
to install and potentially rebuild a given package
for a given target platform while reusing existing
build tools and native package managers if possible
(make, cmake, [scons](https://scons.org),
[spack](https://spack.io),
[python-poetry](https://python-poetry.org), etc).
Furthermore, the CK package manager can also install
non-software packages including ML models and datasets
while ensuring compatibility between all components
for portable workflows!

You can list CK packages available on your system (CK will search for them in all CK repositories installed on your system):
```bash
ck search package --all
```

You can then try to install a given LLVM on your system as follows:
```bash
ck install package --tags=llvm,v10.0.0
```

If this package is successfully installed, CK will also create an associated CK environment:
```bash
ck show env --tags=llvm,v10.0.0
```

By default, all packages are installed in the user space (*$HOME/CK-TOOLS*).
You can change this path using the CK environment variable *CK_TOOLS*.

Note that you can now detect or install multiple versions of the same tool on your system
that can be picked up and used by portable CK workflows!

You can run a CK virtual environment to use a given version as follows:

```bash
ck virtual env --tags=llvm,v10.0.0
```

You can also run multiple virtual environments at once to combine different versions of different tools together:
```bash
ck show env
ck virtual env {UID1 from above list} {UID2 from above list} ...
```

Another important goal of CK is to invoke all automation actions and portable workflows
across all operating systems and environments
including Linux, Windows, MacOS, and Android (you can retarget your workflow for Android by adding the *--target_os=android23-arm64* flag
to all of the above commands when installing packages or compiling and running your programs).
The idea is to have a unified interface for all research techniques and artifacts
shared along with research papers to make the onboarding easier for the community!
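
If you script several CK commands, retargeting can be automated by appending the flag to each of them. The following Python sketch only builds the command lines and does not run them; the helper name `retarget` is hypothetical.

```python
def retarget(command, target_os='android23-arm64'):
    """Append --target_os=<target_os> to a CK command given as a list of arguments."""
    # Do not add the flag twice if the command already selects a target OS
    if any(arg.startswith('--target_os=') for arg in command):
        return list(command)
    return list(command) + ['--target_os=' + target_os]


commands = [
    ['ck', 'compile', 'program:cbench-automotive-susan', '--speed'],
    ['ck', 'run', 'program:cbench-automotive-susan'],
]
for cmd in commands:
    print(' '.join(retarget(cmd)))
```

Each resulting list can be passed to `subprocess.call` if you want to execute the command from Python.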


## Participate in crowd-tuning

You can even participate in [crowd-tuning](https://cKnowledge.org/rpi-crowd-tuning)
of multiple programs and data sets across diverse platforms:

```bash
ck crowdtune program:cbench-automotive-susan
ck crowdtune program
```

You can see the live scoreboard with optimizations [here](https://cKnowledge.org/repo-beta).

## Use CK Python API

You can also run CK automation actions directly from Python (2.7+ or 3.3+) using the single *ck.access* function:


```python
import ck.kernel as ck

# Equivalent of "ck compile program:cbench-automotive-susan --speed"
r = ck.access({'action': 'compile',
               'module_uoa': 'program',
               'data_uoa': 'cbench-automotive-susan',
               'speed': 'yes'})
# Unified error handling (inside a function, you can "return r" instead)
if r['return'] > 0:
    ck.err(r)

print(r)

# Equivalent of "ck run program:cbench-automotive-susan --env.OMP_NUM_THREADS=4"
r = ck.access({'action': 'run',
               'module_uoa': 'program',
               'data_uoa': 'cbench-automotive-susan',
               'env': {'OMP_NUM_THREADS': 4}})
if r['return'] > 0:
    ck.err(r)

print(r)
```
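
The correspondence between the command line and the input dictionary can be sketched as follows. This is a simplified illustration and not CK's actual command-line parser: the helper name `cli_to_dict` is hypothetical, only `--key`, `--key=value`, and dotted `--a.b=value` forms are handled, and all values stay strings (so `4` becomes `'4'`).

```python
def cli_to_dict(argv):
    """Convert e.g. ['compile', 'program:name', '--speed'] to an input dictionary."""
    action, target = argv[0], argv[1]
    module_uoa, _, data_uoa = target.partition(':')
    d = {'action': action, 'module_uoa': module_uoa}
    if data_uoa:
        d['data_uoa'] = data_uoa
    for arg in argv[2:]:
        if not arg.startswith('--'):
            raise ValueError('unsupported argument in this sketch: ' + arg)
        key, sep, value = arg[2:].partition('=')
        if not sep:
            value = 'yes'  # a bare flag such as --speed
        # Dotted keys such as env.OMP_NUM_THREADS become nested dictionaries
        node = d
        parts = key.split('.')
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return d


print(cli_to_dict(['compile', 'program:cbench-automotive-susan', '--speed']))
```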


## Try the CK ML workflow

You can now try a more complex example with TensorFlow.
You should pull the related CK repository and install the prebuilt version of TensorFlow CPU via CK:

```bash
ck pull repo:ck-tensorflow
ck install package --tags=lib,tensorflow,vcpu,vprebuilt
```

Check that it was successfully installed:

```bash
ck show env --tags=lib,tensorflow
```

You can find a path to a given entry describing this TF installation as follows:
```bash
ck find env:{env UID from above list}
```

Run the CK virtual environment and test TF:
```bash
ck virtual env --tags=lib,tensorflow
ipython
> import tensorflow as tf
>
```

You can try to run the CK image classification workflow example using the installed TF:

```bash
ck run program:tensorflow --cmd_key=classify
```

You can even try to rebuild TensorFlow via CK for your platform with CUDA:

```bash
ck install package:lib-tensorflow-1.7.0-cuda
```

CK will attempt to detect your CUDA compiler and related libraries and tools,
including Java and Bazel, and will then try to rebuild TF.
Note that you may still need to install some extra dependencies yourself
as described in this [readme](https://github.com/ctuning/ck-tensorflow#prerequisites-for-ubuntu).


You can also try to run ML workflows from the [MLPerf benchmarking initiative](https://mlperf.org)
using this [CK MLPerf repository](https://github.com/ctuning/ck-mlperf).

Finally, you can try our recent [MLPerf automation demo](https://cKnowledge.io/test)
to automate submissions and validations of MLPerf results.


## Further information

As you may notice, CK helps to convert ad-hoc research projects into a unified database
of reusable components with common automation actions and unified meta descriptions.
The goal is to promote artifact sharing and reuse while gradually substituting and unifying
all tedious and repetitive research tasks!

You can find shared CK repositories, components, automation actions, and live scoreboards
at the open [cKnowledge.io platform](https://cKnowledge.io).

You can also check how the universal CK program workflow was successfully reused in
[different projects](https://doi.org/10.5281/zenodo.4005588)
including the [ACM REQUEST tournaments](http://cKnowledge.org/request) to collaboratively co-design SW/HW stack for deep learning
([Report about results of the 1st ReQuEST-ASPLOS'18 tournament and next steps](https://portalparts.acm.org/3230000/3229762/fm/frontmatter.pdf)
and [ACM ReQuEST-ASPLOS'18 proceedings with artifact descriptions](https://doi.org/10.1145/3229762))
and [reproducible quantum tournaments](https://cKnowledge.org/quantum).

Finally, check this [guide](how-to-contribute.md) to learn how to add your own repositories, workflows, and components!


## Contact the CK community

If you encounter problems or have suggestions, do not hesitate to [contact us](https://cKnowledge.org/contacts.html)!
6 changes: 0 additions & 6 deletions docs/src/getting-started.md

This file was deleted.
