Autotuning example clsmith

Please do not forget to check Getting Started Guides to understand CK concepts!

Example of converting the CLsmith tool (PLDI'15 artifact) to the CK format

CLsmith is a tool designed to address the compiler correctness problem for many-core systems through novel applications of fuzz testing to OpenCL compilers. It was shared as an artifact along with the PLDI'15 paper "Many-Core Compiler Fuzzing" (Christopher Lidbury, Andrei Lascu, Nathan Chong, Alastair F. Donaldson). You can find the authors' guide to reproduce results (and rebuild proprietary tools) here.

We are currently converting CLsmith, its related artifacts and the experiment workflow to the CK format (finer-grain, reusable components) to simplify installation, enable communication with pre-installed (and possibly proprietary) OpenCL compilers, reuse existing data sets and share new ones, validate and improve the technique via experiment crowdsourcing, and apply CK-based predictive analytics to automatically detect unexpected behavior.

We have shared the CK-powered CLsmith repository via GitHub and validated it via experiment crowdsourcing on several machines (see live results). We are planning to add the actual CLsmith generator soon.

CLsmith converted to CK can also serve as a template to describe and share other artifacts in CK format (for example, for PPoPP, ADAPT, CGO, and other conferences and workshops).

Next, we will describe how to obtain and use CLsmith via CK.

Obtaining reusable artifacts

We expect that you have read the previous "Motivation" and "Getting Started Guide" sections and are familiar with high-level CK concepts and basic usage.

You can obtain the CLsmith artifact in CK format simply via

 $ ck pull repo:ck-clsmith

Extra repositories including ck-autotuning, ck-analytics and ck-env will also be automatically pulled.

Repository structure

This repository has the following entries:

  • program:tool-cl-launcher-1.0 - universal OpenCL launcher in CK format (as program)
  • dataset:clsmith-* - 100 basic and vector OpenCL kernels as CK data sets for above program (old)
  • dataset:clsmith1-* - 100 new OpenCL kernels (all, atomic reduction, atomic section, barrier, basic, vector) as CK data sets
  • script:explore-datasets - scripts to run and check all above kernels for correctness and share results (crowdsourcing experiments)
  • experiment.view:explore-clsmith-datasets - description of experiments to build HTML tables with results

Prerequisites

To run experiments with CLsmith, you need the following additional artifacts:

  • Any standard compiler (for example, GCC, ICC, LLVM or CL)
  • OpenCL - vendor OpenCL library
  • xOpenME - our run-time library to expose various parameters to the outside world via JSON files

You can check how to install/register the above tools in CK in one of the previous examples here.

Note that if you have already set up the above environment, you can skip this section and start running experiments via CK immediately!

Running local experiments

CLsmith runs many slightly modified OpenCL kernels on multiple machines to detect when a compiler crashes, collect statistics, and decide statistically whether there is a compiler or application bug.

We prepared several scripts to perform such experiments. They are available in the following directory:

 $ ck find script:explore-datasets

Please change your current directory to the above one.

The CK format allows us to implement a universal OpenCL kernel launcher as a standard CK program which takes a dataset entry with kernels and OpenCL platform and device IDs as parameters. This, in turn, allows us to reuse the universal autotuning/exploration functionality from CK via the program pipeline, and run it on any supported OS (Windows, Linux, Android, OS X, etc.).

First, you need to prepare the OpenCL launcher pipeline:

$ ./_clean_tool_clsmith_pipeline.bat
$ ./_setup_tool_clsmith_pipeline.bat

You can pre-select any default values (some of them will be changed during data set exploration).

Now you can test that the pipeline runs on your system (and thus that all libraries and compilers are properly installed) via

$ ./_start_tool_clsmith_pipeline.bat

In case of success, you can automatically explore all kernels from a selected data set (compiling and running all OpenCL kernels) via a pre-defined script:

$ ./explore_datasets_any.bat

This script records all experiments (failed or passed) in a reproducible way in a CK experiment entry:

$ ck find experiment:explore-clsmith-datasets-any
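
For programmatic access, the same entry can also be queried from Python through the CK kernel API. Below is a minimal sketch, assuming the ck module is installed and that the experiment module exposes a list_points action returning a points list (please verify both against your CK version):

# Minimal sketch: list recorded experiment points via the CK Python API.
# Assumes the 'ck' module is installed and that module 'experiment'
# exposes a 'list_points' action returning a 'points' list
# (verify against your CK version).
import ck.kernel as ck

r = ck.access({'action': 'list_points',
               'module_uoa': 'experiment',
               'data_uoa': 'explore-clsmith-datasets-any'})
if r['return'] > 0:
    raise SystemExit(r.get('error', 'CK error'))

for point_uid in r.get('points', []):
    print(point_uid)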

Visualizing, replaying and analyzing results

It is possible to visualize results via the user-friendly CK web front-end. To do so, you will need the ck-web repository pulled and the CK web service started:

 $ ck pull repo:ck-web
 $ ck start web

Now you can open your favorite web browser with the following link:

 http://localhost:3344

Select Experiments in the top menu, then enter clsmith in Tags under Prune entries by and click Search. If only one entry is found, CK will automatically select it and you will see your experiment table, which you can sort by clicking on a column header (for example, Dataset file).

Note that in the rightmost column there are buttons to replay experiments. You can click on any button, copy the contents to the clipboard, and then simply rerun this command from a command line! Alternatively, you can use the following predefined script to replay experiments:

 $ ./replay_experiment.sh
   or
 $ replay_experiment.bat

It will ask you to select a given point inside an entry (an individual experiment). You can also pass the point directly as follows:

 $ ./replay_experiment.sh --point={UID}

You can also obtain all results in raw JSON format from the command line via

 $ ./get_results_all.bat

This script will produce a get_results_tmp.json file with all points and meta info.

Finally, we prepared a Python script that shows you how to obtain the above table, print it and convert it to CSV:

$ ./start_analysis.py

This script will create start_analysis_tmp.csv and can be used as a template to write your own analysis tools.
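
If you prefer to write your own converter from scratch, the following minimal sketch shows the general idea. It assumes that get_results_tmp.json holds a dictionary mapping point UIDs to flat dictionaries of values; check the actual layout on your machine and adjust the key handling accordingly (the output file name my_analysis_tmp.csv is just an example):

# Minimal sketch: convert the raw JSON produced by get_results_all to CSV.
# The layout of get_results_tmp.json is an assumption here (a dictionary
# mapping point UIDs to flat dictionaries of values).
import csv
import json

with open('get_results_tmp.json', 'r') as f:
    data = json.load(f)

rows = []
for point_uid, point in data.items():
    if isinstance(point, dict):
        row = {'point_uid': point_uid}
        # keep only simple scalar values for the CSV table
        row.update({k: v for k, v in point.items()
                    if isinstance(v, (str, int, float, bool))})
        rows.append(row)

# union of all column names across points
columns = sorted({k for row in rows for k in row})

with open('my_analysis_tmp.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=columns)
    writer.writeheader()
    writer.writerows(rows)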

Crowdsourcing experiments

Briefly, CLsmith needs to run the same experiments on many machines to statistically detect whether there is a bug in an OpenCL kernel or a given compiler. We provided an additional script to crowdsource such experiments among volunteers and aggregate results on a CK server (for a demo, we used our public cknowledge.org/repo server).

You can run the same experiments as above and also share info about bugs via

 $ ./explore_datasets_any_crowd.bat

You can view live results for the basic kernels on the web here.

You can also see all results at cknowledge.org/repo - just select Experiments and clsmith as tags.

Explaining CK-based CLsmith format

Here we would like to explain how we converted CLsmith to CK format. We used the following guide to add new repositories and artifacts to CK.

Repository

We prepared a dummy repository on GitHub (ck-clsmith) and pulled it via CK:

 $ ck pull repo:ck-clsmith

Dependencies

We noticed that the original CLsmith has three groups of files:

  • OpenCL launcher program
  • Datasets (generated OpenCL kernels)
  • Scripts to run experiments, record results and analyze bugs

At the same time, we need a standard compiler and an OpenCL library to compile and run experiments. All this functionality already exists in the CK repository ck-autotuning. Hence we just added this dependency to ck-clsmith via:

 $ ck update repo:ck-clsmith

We just need to follow the questions, answer Yes to add extra dependencies, and then add ck-autotuning.

We should also pull ck-autotuning once (afterwards this will be done automatically whenever you pull/update all repositories) to be able to use its modules as containers for CLsmith components:

 $ ck pull repo:ck-autotuning

Adding data sets

Next, we noticed that CLsmith has two groups of data sets: 100 basic OpenCL kernels and 100 vector OpenCL kernels. Hence we added two CK entries:

 $ ck add ck-clsmith:dataset:clsmith-basic-100 @@dict

Then we interactively added the meta description of this data set (this can also be done later by directly editing .cm/meta.json in the newly created entry):

{
  "dataset_files": [
    "00001.cl",
    "00002.cl",
    ...
  ],
  "tags": [
    "dataset", 
    "clsmith",
    "opencl kernel",
    "basic"
  ]
}

In dataset_files we listed all OpenCL kernels (they will later be added to the command line of the OpenCL launcher). In tags we added a list of tags uniquely describing this data set (needed by the OpenCL launcher to automatically find all compatible data sets in CK).

The above command returns the path to the newly created entry. We then copied all related OpenCL kernels to this path.

In a similar way, we also added clsmith-vector-100:

 $ ck add ck-clsmith:dataset:clsmith-vector-100 @@dict

We just added slightly different tags to the meta description:

  "tags": [
    "dataset", 
    "clsmith",
    "opencl kernel",
    "vector
  ]
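
Since each data set entry now carries such tags, compatible kernels can be located automatically. As an illustration, here is a minimal sketch using the CK Python kernel API, equivalent to running ck search dataset --tags=... from the command line (the exact tag string should match the meta descriptions above):

# Minimal sketch: find all CLsmith data sets registered in CK by their tags.
# Equivalent to 'ck search dataset --tags=...' on the command line.
import ck.kernel as ck

r = ck.access({'action': 'search',
               'module_uoa': 'dataset',
               'tags': 'dataset,clsmith,opencl kernel'})
if r['return'] > 0:
    raise SystemExit(r.get('error', 'CK error'))

for entry in r.get('lst', []):
    print(entry['data_uoa'])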

Adding OpenCL launcher as a standard CK program

We decided to add the OpenCL launcher using the CK program container, since this allows us to reuse all the shared, unified and portable compilation, execution and choice-exploration infrastructure from CK.

In addition, we added our xOpenME library to the source to expose run-time info such as the OpenCL platform name, device name, number of compute units and options passed to the OpenCL compiler. However, this is not strictly necessary.

We also added a flag to pass an include path to the OpenCL compiler (to be able to provide the path to the CK entry).

First, we added the program as a dummy CK entry (with an alias and unique ID):

 $ ck add ck-clsmith:program:tool-cl-launcher-1.0 @@dict

with the following meta description, explained in detail below, which you can simply copy/paste (it may look a bit complex, but in fact it just describes all compilation and execution steps in CK with an automatic search for required tools, libraries and data sets):

 {
  "backup_data_uid": "4eca01a57df4dfb2",
  "build_compiler_vars": {
    "XOPENME": "1"
  }, 
  "compile_deps": {
    "compiler": {
      "local": "yes", 
      "sort": 10, 
      "tags": "compiler,lang-c"
    }, 
    "lib.xopenme": {
      "local": "yes", 
      "sort": 30, 
      "tags": "lib,xopenme"
    }, 
    "lib_opencl": {
      "local": "yes", 
      "sort": 15, 
      "tags": "lib,opencl"
    }
  }, 
  "compiler_env": "CK_CC",
  "data_name": "cl-launcher-1.0",
  "extra_ld_vars": "$<<CK_EXTRA_LIB_M>>$",
  "main_language": "c",
  "program":"yes",
  "process_in_tmp": "yes",
  "run_cmds": {
    "default": {
      "dataset_tags": [
        "dataset", 
        "clsmith",
        "opencl kernel" 
      ], 
      "ignore_return_code": "no",
      "run_time": {
        "fine_grain_timer_file": "tmp-ck-timer.json",
        "post_process_cmds": [
          "python $#src_path_local#$ck_postprocess.py"
        ],
        "run_cmd_main": "$#BIN_FILE#$ -f $#dataset_path#$$#dataset_filename#$ -p $<<CK_COMPUTE_PLATFORM_ID>>$ -d $<<CK_COMPUTE_DEVICE_ID>>$ -i $#src_path#$ -o tmp-output.txt ---debug", 
        "run_cmd_out1": "",
        "run_cmd_out2": "",
        "run_correctness_output_files": [],
        "run_input_files": [
          "CLSmith.h"
        ], 
        "run_output_files": [
          "tmp-output.tmp",
          "tmp-ck-timer.json"
        ]
      }
    } 
  }, 
  "source_files": [
    "cl_launcher.c"
  ], 
  "species": [
  ], 
  "tags": [
    "opencl", 
    "program", 
    "clsmith", 
    "v1.0", 
    "v1", 
    "lang-c"
  ], 
  "target_file": "cl_launcher"
 }

Next you can find the same JSON with explanations (do not copy/paste it):

 {
  "backup_data_uid": "4eca01a57df4dfb2",   # not strictly necessary (should be UID of this entry)
  "build_compiler_vars": {
    "XOPENME": "1"                         # tells compiler to use XOPENME
  }, 
  "compile_deps": {
    "compiler": {                          # dependency on any standard C compiler
      "local": "yes", 
      "sort": 10, 
      "tags": "compiler,lang-c"
    }, 
    "lib.xopenme": {                       # dependency on our XOPENME library
      "local": "yes", 
      "sort": 30, 
      "tags": "lib,xopenme"
    }, 
    "lib_opencl": {                        # dependency on OpenCL library
      "local": "yes", 
      "sort": 15, 
      "tags": "lib,opencl"
    }
  }, 
  "compiler_env": "CK_CC",                 # environment key to pass compiler name
  "data_name": "cl-launcher-1.0",          # user-friendly name of the program
  "extra_ld_vars": "$<<CK_EXTRA_LIB_M>>$", # extra libs, if needed (will be taken from compiler environment
  "main_language": "c",                    # informative: program language
  "program":"yes",                         # tells CK that it has a source code that can be compiled and executed via CK
  "process_in_tmp": "yes",                 # tells CK to compile and run program in a temp directory 
                                           #   (not to pollute original repository)
  "run_cmds": {
    "default": {                           # here we describe one and default CMD to run this tool
      "dataset_tags": [                    # tags to automatically find all available data sets in CK
        "dataset", 
        "clsmith",
        "opencl kernel" 
      ], 
      "ignore_return_code": "no",         # fail program pipeline, if return code !=0
      "run_time": {
        "fine_grain_timer_file": "tmp-ck-timer.json",     # tells CK that XOPENME returns this file
                                                          # to be appended to JSON characteristics
        "post_process_cmds": [
          "python $#src_path_local#$ck_postprocess.py"    # just a demo to show that we can run 
                                                          # some scripts just after program finishes execution
                                                          # and before CK pipeline resumes
        ],
        # next is the command line to run program ($#dataset_path#$ and $#dataset_filename#$ 
        # will be automatically substituted with the CK entry info
        # CK_COMPUTE_PLATFORM_ID and CK_COMPUTE_DEVICE_ID will be selected by 'ck detect platform.gpgpu --opencl' and substituted via env variables
        # ($<< and >>$ will be automatically substituted with the OS specific characters
        #   such as ${ and } on Linux and % on Windows)
        # $#src_path#$ will point to CK program entry thus allowing us to provide include path
        "run_cmd_main": "$#BIN_FILE#$ -f $#dataset_path#$$#dataset_filename#$ -p $<<PLATFORM_ID>>$ -d $<<DEVICE_ID>>$ -i $#src_path#$ ---debug", 
        "run_cmd_out1": "", # we do not save stdout to file (can do it in the future)
        "run_cmd_out2": "", # we do not save stderr to file (can do it in the future)
        "run_correctness_output_files": [], # check output files for correctness (TBD)
        "run_input_files": [
          "CLSmith.h"          # described which files are required for tool to run
                               # (when running code on remote machine such as Android,
                               #  these files will be automatically copied there)
        ], 
        "run_output_files": [
          "tmp-output.tmp",    # when running on remote machine, these files
          "tmp-ck-timer.json"  # will be automatically pulled back
        ]
      }
    } 
  }, 
  "run_vars": {           # specifies default environment variables
   "PLATFORM_ID":0,
   "DEVICE_ID":0
  }, 
  "source_files": [
    "cl_launcher.c"       # provide sources files of the tool
  ], 
  "species": [
  ], 
  "tags": [               # describe unique tags for this tool 
    "opencl", 
    "program", 
    "clsmith", 
    "v1.0", 
    "v1", 
    "lang-c"
  ], 
  "target_file": "cl_launcher"  # describes compiler output name for this tool
 }

Then, we copied cl_launcher.c, cl_safe_math_macros.h, CLSmith.h and safe_math_macros.h to the newly created entry. We also created a small ck_postprocess.py Python script that demonstrates how to post-process the xOpenME JSON run-time output and possibly embed our own keys/values there:

import json

d={}

print ('  (processing OpenME output ...)')

# Preload tmp-ck-timer.json produced by xOpenME, if it exists;
# otherwise start from an empty dictionary
try:
    with open('tmp-ck-timer.json', 'r') as f:
        d=json.loads(f.read())
except Exception as e:
    d={}

# Add our own key to show that post-processing took place
d['post_processed']='yes'

# Write the updated CK JSON file back
with open('tmp-ck-timer.json', 'wt') as f:
    f.write(json.dumps(d, indent=2, sort_keys=True)+'\n')

This allows us to immediately reuse the CK compile and run functions (provided that OpenCL is already registered in the CK environment):

 $ ck compile program:tool-cl-launcher-1.0
 $ ck run program:tool-cl-launcher-1.0
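
The same two steps can also be scripted through the CK Python kernel API; here is a minimal sketch mirroring the commands above (additional keys, such as env or dataset_uoa, may be needed on your setup):

# Minimal sketch: compile and run the launcher through the CK Python API,
# mirroring 'ck compile program:...' and 'ck run program:...' above.
import ck.kernel as ck

for action in ['compile', 'run']:
    r = ck.access({'action': action,
                   'module_uoa': 'program',
                   'data_uoa': 'tool-cl-launcher-1.0'})
    if r['return'] > 0:
        raise SystemExit(r.get('error', 'CK error'))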

Organizing scripts

We also created an entry to organize all scripts that perform experiments:

 $ ck add ck-clsmith:script:explore-datasets

There we created various customizable OS-level and Python scripts to prepare the program pipeline, run experiments, analyze and replay them, and visualize tables (as described at the beginning of this page).

We also provided a Python script start_analysis.py that demonstrates how to query the repository to obtain results, convert them to CSV, perform further analysis, etc.

Adding experiment view

CK makes it possible to view recorded experiments in a user-friendly web browser with customized views (i.e. showing only selected fields). Hence, we added such an entry via:

 $ ck add ck-clsmith:experiment.view:explore-clsmith-datasets @@dict

where we described which flat keys to show and what they mean:

{
  "flat_keys": [
    "##choices#data_uoa", 
    "##choices#dataset_uoa", 
    "##choices#dataset_file", 
    "##choices#host_os", 
    "##choices#target_os", 
    "##characteristics#run#run_time_state#opencl_platform", 
    "##characteristics#run#run_time_state#opencl_device", 
    "##characteristics#run#run_time_state#opencl_device_units", 
    "##features#compiler_version#raw@0", 
    "##choices#cmd_key",
    "##characteristics#run#return_code",
    "##characteristics#run#run_time_state#positive_results",
    "##characteristics#run#run_time_state#negative_results"
  ], 
  "flat_keys_desc": [
    {
      "desc": "Program", 
      "module_uoa": "b0ac08fe1d3c2615", 
      "type": "uoa"
    }, 
    {
      "desc": "Dataset UOA", 
      "module_uoa": "8a7141c59cd335f5", 
      "type": "uoa"
    }, 
    {
      "desc": "Dataset file", 
      "type": "text"
    }, 
    {
      "desc": "Host OS", 
      "module_uoa": "0440cb72c2bc5cc6", 
      "type": "uoa"
    }, 
    {
      "desc": "Target OS", 
      "module_uoa": "0440cb72c2bc5cc6", 
      "type": "uoa"
    }, 
    {
      "desc": "OpenCL platform"
    }, 
    {
      "desc": "OpenCL device"
    }, 
    {
      "desc": "OpenCL cores"
    }, 
    {
      "desc": "Raw compiler name"
    }, 
    {
      "desc": "CMD key"
    }, 
    {
      "desc": "Run return code"
    },
    {
      "desc": "TBD: Positive results (crowdsourcing)"
    }, 
    {
      "desc": "TBD: Negative results (crowdsourcing)"
    } 
  ]
}

You can see a sample of this output here.
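
Note that each flat key addresses one nested value in the experiment JSON: in our understanding of the CK convention, '#' separates dictionary levels and '@N' selects a list element (verify against your CK version). A small self-contained sketch with hypothetical values:

# Minimal sketch: resolve a CK-style flat key against a nested dictionary.
# Our reading of the convention: '#' separates dictionary levels and
# '@N' selects a list element; the sample point below is hypothetical.
def get_by_flat_key(d, flat_key):
    value = d
    for part in flat_key.strip('#').split('#'):
        if '@' in part:
            key, index = part.split('@', 1)
            value = value[key][int(index)]
        else:
            value = value[part]
    return value

point = {'characteristics': {'run': {'return_code': 0}},
         'features': {'compiler_version': {'raw': ['gcc 5.4.0']}}}

print(get_by_flat_key(point, '##characteristics#run#return_code'))  # 0
print(get_by_flat_key(point, '##features#compiler_version#raw@0'))  # gcc 5.4.0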

Packing CK repo for Artifact Evaluation

To share the above CK repository with an Artifact Evaluation committee, the authors can create ckr-ck-clsmith.zip, a standard zip archive containing the artifact, using:

 $ ck zip repo:ck-clsmith

On the receiving end, this archive can be trivially installed using:

 $ ck add repo:ck-clsmith --zip=ckr-ck-clsmith.zip --quiet

That's all! Comments are welcome!

Questions and comments

You are welcome to get in touch with the CK community if you have questions or comments!
