Development conventions

Grigori Fursin edited this page Jun 5, 2018 · 2 revisions

CK development conventions

CK directory structure

  • /bin - Unix and Windows scripts to invoke CK python kernel
  • /ck/kernel.py - CK main kernel in python
  • /ck/repo - CK repository with main CK modules: index, kernel, module, repo, test, web

CK kernel development

Python source code indentation

Whenever you write new Python code, we strongly suggest following the PEP 8 style guide.

When editing existing Python code, we suggest keeping the original style (unless you agree with the author to convert it to PEP 8). Below is an example of the original indentation we have been using in CK:

def test(i):
    """

    Input:  {
            }

    Output: {
              return       - return code =  0, if successful
                                         >  0, if error
              (error)      - error text if return > 0
            }

    """

    j={'a':'b',
       'c':'d'}

    if a==b:
       for x in y:
           try:
              print ('abc')
           except Exception as e: 
              print ('error')
              pass


       while p.poll() == None and t<xto:
          time.sleep(0.1)
          t=time.time()-t0
    elif c==d:
       print ('qqq')
    else:
       print ('xyz')

    return {'return':0}

Function API

You can obtain the API of any function (action) of any module as follows (if provided):

 $ ck [action] [module UOA] --help

If the module UOA is omitted, an internal CK function is used instead. For example, the following command shows the API for adding a CK entry:

 $ ck add --help

We've made a considerable effort to document the API of all functions. However, if something is missing, please feel free to improve the documentation in the given Python module or contact the community via the CK mailing list!

Also note that we always try to keep modules backward compatible (by using different keys in the input/output if needed). You can therefore always dump the output of a given function and check its structure before using it further (in fact, we often do the same during quick and agile development of research scenarios). For example, you can dump the output in your code as follows:

    r=ck.access({'action':'list',
                 'module_uoa':'repo'})
    if r['return']>0: return r

    import json
    print (json.dumps(r, indent=2))

Unifying API keys

One of the CK ideas is to chain various CK modules to implement experimental pipelines. We also use schema-free dictionaries as input and output so that we can easily extend them, keep backward compatibility, and gradually unify keys with the help of the community (such as 'characteristics', 'choices', 'features', 'state', etc.) as described in these publications: 1,2.

Note that if the type of a key is not specified, it is a string by default. Passing Python objects other than bool, long, float, str, list, dict is not recommended (though necessary in rare cases) in functions that may be accessed via the CK web service.
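As a rough illustration of the recommendation above, the following sketch checks whether a dictionary can be serialized to JSON before it is passed to a function that may be reached via a web service. The helper name is_json_friendly is hypothetical and is not part of CK; note also that json.dumps accepts a few more types (such as None, int and tuple) than the list above.

```python
import json


def is_json_friendly(obj):
    """Return True if obj can be serialized to JSON as-is.

    Hypothetical helper (not a CK function): it simply tries
    json.dumps and reports whether serialization succeeded.
    """
    try:
        json.dumps(obj)
        return True
    except (TypeError, ValueError):
        # TypeError: unsupported object type; ValueError: circular reference
        return False


# Plain strings, lists and dicts are fine
ok = is_json_friendly({'action': 'list', 'tags': ['a', 'b']})

# An arbitrary Python object cannot be serialized
bad = is_json_friendly({'handle': object()})
```

A check like this can be run on the input dictionary during development to catch values that would not survive a round trip through JSON.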

Below is the list of unified keys in the CK JSON API:

API input

  • action - module function (action)
  • module_uoa - module UOA
  • data_uoa - data UOA
  • repo_uoa - repository UOA
  • cid - (repo_uoa:)module_uoa:data_uoa
  • out - how to output information (con,json,json_with_sep,json_file)
  • out_file - path to output file if out==json_file
  • unparsed_cmd - anything after '--' in CK command line
  • cids - internal
  • xcids - internal
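To make the cid format (repo_uoa:)module_uoa:data_uoa more concrete, here is a simplified sketch that splits such a string into its parts. The function below is an illustrative assumption rather than the kernel's own parser, which may accept more forms than the two handled here.

```python
def parse_cid(cid):
    """Split '(repo_uoa:)module_uoa:data_uoa' into its components.

    Simplified, hypothetical sketch: only the two-part and
    three-part forms are handled. Returns a CK-style dictionary
    with 'return' == 0 on success, or 'return' > 0 and 'error'.
    """
    parts = cid.split(':')

    if len(parts) == 2:
        # repo_uoa is optional and was omitted
        repo_uoa = ''
        module_uoa, data_uoa = parts
    elif len(parts) == 3:
        repo_uoa, module_uoa, data_uoa = parts
    else:
        return {'return': 1,
                'error': "unexpected CID format: '" + cid + "'"}

    return {'return': 0,
            'repo_uoa': repo_uoa,
            'module_uoa': module_uoa,
            'data_uoa': data_uoa}


r = parse_cid('ck-autotuning:module:program')
```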

API output

  • return - (int) return code
  • error - error text if return>0
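The return/error convention can be sketched with a small function that takes a dictionary and returns a dictionary. The function divide and its keys a, b and result are hypothetical and only serve to show the pattern.

```python
def divide(i):
    """
    Input:  {
              a - numerator
              b - denominator
            }

    Output: {
              return  - return code =  0, if successful
                                    >  0, if error
              (error) - error text if return > 0
              result  - a / b
            }
    """
    a = i.get('a')
    b = i.get('b')

    if a is None or b is None:
        return {'return': 1, 'error': "keys 'a' and 'b' must be provided"}

    if b == 0:
        return {'return': 1, 'error': 'division by zero'}

    return {'return': 0, 'result': a / b}


# The caller checks 'return' before touching any other key
r = divide({'a': 6, 'b': 3})
if r['return'] > 0:
    print(r['error'])
else:
    print(r['result'])
```

Because errors are reported through the dictionary rather than through exceptions, the same structure can be passed unchanged through the command line, Python code, or JSON.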

API input and output

  • host_os - UOA of host OS (module os). If omitted, CK will try to detect host OS.
  • target_os - UOA of target OS (module os). If omitted, CK will use host OS UOA.
  • target_device_id - identifier of a remote device (such as one listed by adb devices).

Next are keys used for our long-term initiative to enable faster, cheaper, more energy efficient and more reliable computer systems by continuously and collaboratively learning their behavior (see 1,2):

  • characteristics - dictionary with characteristics of any object
  • choices - dictionary with exposed choices of a given object
  • features - features/properties of a given object
  • state - run-time state of a given object
  • tmp - temporary vars of an object
  • deps - software, hardware and other dependencies

Note that keys are case sensitive. We suggest using only lowercase alphanumeric characters, '_', '-' and '.'.
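The naming suggestion for keys can be expressed as a regular expression. The helper below is a hypothetical sketch, not something CK enforces.

```python
import re

# Lowercase letters, digits, '_', '-' and '.' only
_SUGGESTED_KEY = re.compile(r'[a-z0-9_.-]+')


def is_suggested_key(key):
    """Return True if key follows the suggested naming convention."""
    return _SUGGESTED_KEY.fullmatch(key) is not None


checks = {k: is_suggested_key(k)
          for k in ['target_os', 'platform.cpu', 'Target_OS', 'my key']}
```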

CK entry information

Information about any CK entry is stored in {entry_path}/.cm/info.json with the following format:

 {
  "control": {
    "engine": "CK", 
    "version": [...] # engine version as a list of values from high to low number
    "copyright": # brief copyright text
    "license": # brief license text
    "iso_datetime": # creation date and time in ISO format
    "author": # author (if defined)
    "author_email": # author email, if defined using "ck setup kernel"
    "author_webpage": # author webpage, if defined using "ck setup kernel"
  }, 
  "data_name": # user friendly name of the entry (most of the time copy of alias)
  "backup_module_uid": # uid of the module that created this entry
  "backup_module_uoa": # uoa of the module that created this entry
 }

Any further updates are recorded in the file {entry_path}/.cm/updates.json.

This file is not loaded during standard search thus speeding up queries ...
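As an illustration of the layout described above, the following sketch reads {entry_path}/.cm/info.json directly with the standard library. The function name load_entry_info is hypothetical; in practice entries would normally be accessed through the CK API rather than by opening these files by hand.

```python
import json
import os


def load_entry_info(entry_path):
    """Load .cm/info.json of an entry and return a CK-style dictionary.

    Hypothetical helper for illustration only.
    """
    info_file = os.path.join(entry_path, '.cm', 'info.json')

    if not os.path.isfile(info_file):
        return {'return': 1, 'error': 'file not found: ' + info_file}

    try:
        with open(info_file) as f:
            info = json.load(f)
    except ValueError as e:
        # json.JSONDecodeError is a subclass of ValueError
        return {'return': 1,
                'error': 'cannot parse ' + info_file + ' (' + str(e) + ')'}

    return {'return': 0, 'info': info}
```

A caller would then read, for example, r['info']['data_name'] after checking that r['return'] is 0.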

Flat format to reference keys in dictionaries

We deliberately use an easily extensible, schema-free data representation in CK so that researchers can quickly prototype ideas rather than spend time and effort preparing strictly typed data that may evolve over time. Only if a prototype is successful and is going to be shared with the community do users need to provide a description of the data.

In some cases, we also use our own flat key format to reference any key in a dictionary. Such a flat key always starts with #, followed by #key for each dictionary key or @number for each index in a list.

For example, <FLAT_KEY> for key c in dictionary {"a":[{"c":"d"}]} is ##a@0#c
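The flat key format can be sketched in a few lines of Python. The two functions below are illustrative assumptions written for this page, not the kernel's own implementation, and they assume that dictionary keys contain neither '#' nor '@'.

```python
import re


def flatten(obj, prefix='#'):
    """Convert nested dicts/lists into {flat_key: leaf_value}.

    Illustrative sketch; empty dicts and lists produce no keys.
    """
    flat = {}
    if isinstance(obj, dict):
        for k, v in obj.items():
            flat.update(flatten(v, prefix + '#' + str(k)))
    elif isinstance(obj, list):
        for n, v in enumerate(obj):
            flat.update(flatten(v, prefix + '@' + str(n)))
    else:
        flat[prefix] = obj
    return flat


def get_by_flat_key(obj, flat_key):
    """Follow a flat key such as '##a@0#c' through nested dicts/lists."""
    if not flat_key.startswith('#'):
        raise ValueError("flat key must start with '#'")

    node = obj
    # After the leading '#', the key is a sequence of '#key' / '@number'
    for kind, name in re.findall(r'([#@])([^#@]*)', flat_key[1:]):
        node = node[name] if kind == '#' else node[int(name)]
    return node


d = {"a": [{"c": "d"}]}
flat = flatten(d)                      # {'##a@0#c': 'd'}
value = get_by_flat_key(d, '##a@0#c')  # 'd'
```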

Referencing other modules and entries

When referencing other modules inside a given module, we strongly suggest referencing them by their UID instead of their alias. We envision that a module's API may change dramatically in the future. In such a case, we may keep the same alias but change the UID, so shared experimental pipelines will still work correctly by finding the modules with the correct UID.

We use the key module_deps in the module meta to match aliases with UIDs. For example, the program module from the ck-autotuning repo has the following dependencies in its meta:

 $ ck pull repo:ck-autotuning
 $ ck load module:program --min

 ...
    "module_deps": {
      "compiler": "36ebc331048475bb",
      "env": "9b9b3208ac44b891",
      "script": "84e27ad9dd12e734",
      "dataset": "8a7141c59cd335f5",
      "soft": "5e1100048ab875d7",
      "platform": "707ccdfe444cafac",
      "choice": "e4564d6f984400d7",
      "platform.cpu": "aa6b542a420b8db9",
      "platform.os": "41e31cc4496b8a8e",
      "platform.gpu": "55ec7775f4afaabd",
      "dataset.features": "87b55c4f4a2482da"

These sub-modules can be referenced in the program source code via compiler_uid=cfg["module_deps"]["compiler"].

CK data entries can be referenced by UID in a similar way.
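The lookup via module_deps can be sketched as follows. Here cfg is a plain dictionary that mimics part of the meta shown above (inside a real module it is provided by CK), and get_module_uid is a hypothetical helper.

```python
# Part of the module meta shown above, reproduced as a plain dictionary
cfg = {
    "module_deps": {
        "compiler": "36ebc331048475bb",
        "env": "9b9b3208ac44b891",
        "dataset": "8a7141c59cd335f5"
    }
}


def get_module_uid(cfg, alias):
    """Resolve a module alias to its UID via cfg['module_deps'].

    Hypothetical helper returning a CK-style dictionary.
    """
    uid = cfg.get('module_deps', {}).get(alias)
    if uid is None:
        return {'return': 1,
                'error': "module alias '" + alias +
                         "' is not listed in module_deps"}
    return {'return': 0, 'module_uid': uid}


r = get_module_uid(cfg, 'compiler')
if r['return'] == 0:
    compiler_uid = r['module_uid']  # '36ebc331048475bb'
```

The resolved UID can then be passed as module_uoa to other CK calls instead of the alias.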

Continuous Integration

CK uses the following frameworks to test its internal functionality:

CK uses Coveralls to measure test coverage.

User perspective

You can run tests with this command:

 $ ck run test

In this form, the command runs every test in every repository it finds (including the default one).

To specify which tests to run, you may use one or more of the following arguments:

  1. --repo_uoa - if given, only tests for this repo are run. Example: ck run test --repo_uoa=default to run only tests for the default repo.
  2. --test_module_uoa - if given, only tests for modules with this UOA are run. Example: ck run test --repo_uoa=default --test_module_uoa=kernel to run only tests for the kernel module in the default repo (the kernel itself).
  3. --test_file_pattern - pattern for test file names to run (using shell style pattern matching). If not given, the assumed pattern is test*.py. Example: ck run test --repo_uoa=default --test_module_uoa=kernel --test_file_pattern=test_original_tests.py to run only the original tests for the kernel.
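Shell style pattern matching of file names, as used by --test_file_pattern, behaves like Python's fnmatch module. The sketch below only illustrates how such a pattern selects files; select_test_files is a hypothetical helper and not the code used by the test module.

```python
import fnmatch


def select_test_files(file_names, pattern='test*.py'):
    """Return the file names matching a shell style pattern, sorted."""
    return sorted(fnmatch.filter(file_names, pattern))


files = ['test_kernel.py', 'helper.py', 'test_mgmt.py',
         'test_original_tests.py']

all_tests = select_test_files(files)
one_test = select_test_files(files, 'test_original_tests.py')
```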

Module/tests developer perspective

Tests for a module must reside in the test/ directory of the module's CK entry. It is highly recommended to name the files test_SOMETHING.py (replace SOMETHING with something meaningful), as this matches the default filename pattern.

Tests are Python modules. The ck, work and cfg variables are injected by CK (just as they are for modules). In addition, a test_util variable is injected, which contains functions that help with writing tests. More on this below.

Test modules must contain classes that extend unittest.TestCase. The actual tests are the methods of these classes whose names start with test_. More information on the standard tools for writing tests can be found here: https://docs.python.org/2/library/unittest.html
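A minimal test module might look like the sketch below. To keep it self-contained, it tests a hypothetical function add_numbers defined in the same file and does not use the injected ck, work, cfg or test_util variables; a real test would typically call CK functions instead.

```python
import unittest


def add_numbers(i):
    """Hypothetical function under test: dictionary in, dictionary out."""
    if 'a' not in i or 'b' not in i:
        return {'return': 1, 'error': "keys 'a' and 'b' must be provided"}
    return {'return': 0, 'sum': i['a'] + i['b']}


class TestAddNumbers(unittest.TestCase):

    def test_success(self):
        r = add_numbers({'a': 2, 'b': 3})
        self.assertEqual(0, r['return'])
        self.assertEqual(5, r['sum'])

    def test_missing_key(self):
        r = add_numbers({'a': 2})
        self.assertGreater(r['return'], 0)
        self.assertIn('error', r)
```

Saved as, for example, test_add_numbers.py, such a file matches the default test*.py pattern.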

One can look at the kernel's tests for some examples:

  • ck/repo/module/kernel/test/test_kernel.py
  • ck/repo/module/kernel/test/test_original_tests.py
  • ck/repo/module/kernel/test/test_mgmt.py

Test utility functions

test_util is a module that is automatically injected into test modules as the test_util variable. It provides functions that make writing tests easier. They are implemented as context managers for the with statement, which means they are meant to be used as follows:

with test_util.tmp_file(content='abcd') as fname:
    # fname is a variable 'yielded' from the tmp_file context manager.
    # It is the path to a temporary file with the given content,
    # which was created on entering this 'with' block.
    # The file will be automatically deleted before leaving this 'with' block.
    print (fname)

The test_util module is located here: ck/repo/module/test/test_util.py

Please take a look at the code for the information about functions it provides. Examples of their usage can be found in the kernel tests.
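For readers unfamiliar with context managers, the following sketch shows how a helper with similar behaviour to tmp_file could be written using contextlib. It is an assumption for illustration only (the name tmp_file_sketch is made up) and is not the code from test_util.py.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def tmp_file_sketch(content=''):
    """Create a temporary file with the given content, yield its path,
    and delete the file when the 'with' block is left."""
    fd, fname = tempfile.mkstemp(suffix='.tmp', text=True)
    try:
        # Write the content and close the file before handing it out
        with os.fdopen(fd, 'w') as f:
            f.write(content)
        yield fname
    finally:
        # Clean up even if the body of the 'with' block raised
        if os.path.exists(fname):
            os.remove(fname)


with tmp_file_sketch(content='abcd') as fname:
    with open(fname) as f:
        data = f.read()   # 'abcd'
```

The try/finally around the yield is what guarantees that the file is removed even when a test fails inside the block.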

Running tests manually

Go to the CK folder and run the tests:

  • Python2: python -m tests.test
  • Python3: python3 -m tests.test

Check coverage:

  • coverage run -m tests.test && coverage html

You can then see the coverage in the htmlcov/index.html file.




Questions and comments

You are welcome to get in touch with the CK community if you have questions or comments!
