# Artifact Evaluation: Synthesizing Benchmarks for Predictive Modeling

[Chris Cummins](http://chriscummins.cc/),
[Pavlos Petoumenos](http://homepages.inf.ed.ac.uk/ppetoume/),
[Zheng Wang](http://www.lancaster.ac.uk/staff/wangz3/),
[Hugh Leather](http://homepages.inf.ed.ac.uk/hleather/).

<span style="color:#f00;">**IMPORTANT!**</span> Changes to this document are persistent. Before doing anything else, select from the menu "File" > "Make a Copy". This will prevent your changes from affecting other users. Thank you.

High system load may lead to inconsistent performance results; this may occur if multiple reviewers are accessing the server simultaneously.

### How to use this document

1. Click on the first code block.
1. Press `Ctrl+Enter` to run the code.
1. Once completed, the code will self-test. If the test passes it will display:
<div style="background-color:#5cb85c; color:#fff; text-align:center; border-radius:10px;">
  <h1 style="padding:.5em; font-weight:400;">☑ Complete</h1>
</div>
If the test fails it will display:
<div style="background-color:#d9534f; color:#fff; text-align:center; border-radius:10px;">
  <h1 style="padding:.5em; font-weight:400;">☒ Failed</h1>
</div>
1. Evaluate the output and proceed to the next code block.

Alternatively, run all of the code blocks automatically in sequence by selecting from the menu "Kernel" > "Restart and Run All".

For further information on using Jupyter notebooks, see the [official documentation](https://jupyter-notebook.readthedocs.io/en/latest/examples/Notebook/Notebook%20Basics.html).
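
The pass/fail banners shown above can be approximated with a small helper. The sketch below is illustrative only: `render_banner` is a hypothetical name, and the real `complete()` function lives in `lib/preamble.py`, which is not reproduced in this document.

```python
from html import escape

# Colours and titles copied from the banners shown in the instructions above.
_PASS = ("#5cb85c", "☑ Complete")
_FAIL = ("#d9534f", "☒ Failed")


def render_banner(ok=True, msg=""):
    """Return banner HTML for a passing or failing self-test (hypothetical helper)."""
    color, title = _PASS if ok else _FAIL
    # Escape the message so it cannot inject markup into the banner.
    detail = "  <p>{}</p>\n".format(escape(msg)) if msg else ""
    return (
        '<div style="background-color:{color}; color:#fff; '
        'text-align:center; border-radius:10px;">\n'
        '  <h1 style="padding:.5em; font-weight:400;">{title}</h1>\n'
        "{detail}</div>"
    ).format(color=color, title=title, detail=detail)


print(render_banner(True, "Initial setup complete"))
```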

### Resources
* ["Interactive Paper"](/notebooks/Paper.ipynb) is a comprehensive version of this evaluation for those wishing to evaluate every aspect of the paper.
* Install this artifact on your own hardware: http://chriscummins.cc/cgo17/
* Online version of the OpenCL Turing Test: http://humanorrobot.uk/game/?g=opencl&m=nitt
* CLgen source code: https://github.com/ChrisCummins/clgen/
* CLgen API documentation: http://chriscummins.cc/clgen/api/

Here is the first code block:

In [None]:
# preamble
%load_ext autoreload
%autoreload 2
%matplotlib inline
%run lib/preamble.py

complete(msg="Initial setup complete")

# Experimental Setup

This artifact must be evaluated on a CPU-GPU heterogeneous system. In the paper, we used:
* **Intel Core i7-3820**
* **AMD Tahiti 7970**
* **NVIDIA GTX 970**

Details about this system:

In [None]:
import clgen.clutil
clgen.clutil.platform_info()

import random
uid = random.randint(0, 100000)
fs.rm("../data/usr/{uid}".format(uid=uid))
fs.mkdir("../data/usr/{uid}/clgen".format(uid=uid))
fs.mkdir("../data/usr/{uid}/benchmarks".format(uid=uid))
print("\nUnique test ID:", uid)

complete(can_reproduce_experiments(), "Artifact is running on suitable hardware, please check system load")
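
Because high system load can skew the timing results, it may help to check the load before running the performance experiments. The following is a minimal sketch using only the Python standard library on a Unix-like system. The function name `system_is_quiet` and the threshold of 0.5 per core are assumptions chosen for illustration, not part of the artifact.

```python
import os


def system_is_quiet(max_load_per_core=0.5):
    """Return True if the 1-minute load average per core is below the threshold.

    Hypothetical helper; os.getloadavg() is only available on Unix-like systems.
    """
    load_1min, _, _ = os.getloadavg()
    cores = os.cpu_count() or 1  # cpu_count() may return None
    return (load_1min / cores) < max_load_per_core


print("System quiet enough for timing runs:", system_is_quiet())
```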


# Synthesizing Programs with CLgen
Load our pre-trained neural network, generate new programs, and validate the samples.

In [None]:
print("The model used in the paper (pre-trained):")
model = clgen.model.from_tar("../data/clgen-github-model-2016-nov-2048x3.tar.bz2")
print(model)
complete(model.hash == "f2fb3ad753896d54fe284c138eaa703db3518bbb",
         "Load pre-trained neural network")

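The cell above checks the loaded model against a 40-hex-digit hash, which is the length of a SHA-1 digest. How CLgen derives `model.hash` is not shown in this document, so the following is only a generic sketch of verifying a file against a SHA-1 checksum. `sha1_of_file` is a hypothetical helper, and the demo hashes a throwaway temporary file rather than the model archive.

```python
import hashlib
import os
import tempfile


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, read in chunks (hypothetical helper)."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a throwaway file, NOT on the model archive.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
demo_path = tmp.name
demo_digest = sha1_of_file(demo_path)
os.remove(demo_path)

print(demo_digest, len(demo_digest))
```
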
In [None]:
# sample model
import clgen.sampler
import clgen.dbutil
import clgen.explore
argspec = ['__global float*', '__global float*', '__global float*', 'const int']
sampler = clgen.sampler.from_json({
        "kernels": { 
            "args": argspec,
            "max_length": 5000,
        },
        "sampler": {
            "batch_size": 25,
            "max_kernels": 10,
        }
    })

print("Sample from the model used in the paper:\n")
print("Seed text:", clgen.sampler.serialize_argspec(argspec))
sampler.cache(model).empty()
sampler.sample(model)

db = sampler.cache(model)["kernels.db"]
num_good_kernels = clgen.dbutil.num_good_kernels(db)
clgen.explore.explore(db)
complete(num_good_kernels >= 5,
         "Generated {} OpenCL kernels".format(num_good_kernels))

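The seed text printed by the sampling cell is derived from `argspec`. As a rough sketch, and assuming the seed is an OpenCL kernel prototype whose arguments are named alphabetically, the serialization could look like the code below. `sketch_seed_text` and the kernel name `A` are illustrative assumptions; the actual behaviour is defined by `clgen.sampler.serialize_argspec`.

```python
def sketch_seed_text(argspec, kernel_name="A"):
    """Build a kernel prototype from a list of argument types (hypothetical helper)."""
    # Name the arguments a, b, c, ... in order.
    names = (chr(ord("a") + i) for i in range(len(argspec)))
    params = ", ".join("{} {}".format(t, n) for t, n in zip(argspec, names))
    return "__kernel void {}({}) {{".format(kernel_name, params)


argspec = ['__global float*', '__global float*', '__global float*', 'const int']
print(sketch_seed_text(argspec))
```
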
In [None]:
print("Generated kernels\n")
try:
    db = clgen.dbutil.connect(sampler.cache(model)["kernels.db"])
    c = db.cursor()

    c.execute("""SELECT Contents FROM PreprocessedFiles WHERE status=0""")
    for i, row in enumerate(c.fetchall()):
        kernel = row[0]
        print("\nKernel ", i+1, ":\n", sep="")
        print(kernel)

    c.close(); db.close()
    complete(msg="Display generated OpenCL kernels")
except:
    complete(False, "Failed to display generated OpenCL kernels")
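
The query in the cell above can be tried in isolation with an in-memory SQLite database. This sketch assumes a simplified `PreprocessedFiles` table with only the two columns the query touches, and assumes that `status=0` marks kernels that passed preprocessing. The real `kernels.db` schema is not shown here and may contain more columns; the sample rows are made up.

```python
import sqlite3


def good_kernels(db):
    """Return the contents of all rows with status=0 (hypothetical helper)."""
    c = db.cursor()
    c.execute("SELECT Contents FROM PreprocessedFiles WHERE status=0")
    rows = [row[0] for row in c.fetchall()]
    c.close()
    return rows


# Assumed, simplified schema with made-up sample rows.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE PreprocessedFiles (Contents TEXT, status INTEGER)")
db.executemany(
    "INSERT INTO PreprocessedFiles VALUES (?, ?)",
    [
        ("__kernel void A(__global float* a) { a[0] = 1.0f; }", 0),
        ("__kernel void A(__global float* a) {", 1),  # rejected sample
        ("__kernel void A(const int d) { }", 0),
    ],
)

kernels = good_kernels(db)
db.close()

for i, kernel in enumerate(kernels, start=1):
    print("\nKernel ", i, ":\n", sep="")
    print(kernel)
```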

# Benchmark suite performance results
Generate new runtimes using 1 of the 7 benchmark suites used in the paper:

In [None]:
print("running ...  (this will take a few minutes)")
try:
    !rm -f ../data/benchmarks/*.csv ../data/benchmarks/timestamp.txt
    !cd benchmarks && ./mkdata
    data = pd.read_csv("../data/benchmarks/training.csv")
    benchmarks_timestamp = readfile("../data/benchmarks/timestamp.txt")
    move("../data/benchmarks/training.csv", "../data/usr/{uid}/benchmarks/".format(uid=uid))
    move("../data/benchmarks/timestamp.txt", "../data/usr/{uid}/benchmarks/".format(uid=uid))
    complete(len(data) == 17, "Produced new performance results for benchmarks")
except:
    complete(False, "Did not produce new performance results for benchmarks")

In [None]:
try:
    if benchmarks_timestamp != readfile("../data/usr/{uid}/benchmarks/timestamp.txt".format(uid=uid)):
        print("warning: data timestamp has changed, please re-run experiments", file=sys.stderr)
    data = pd.read_csv("../data/usr/{uid}/benchmarks/training.csv".format(uid=uid))
    ax = sns.barplot(x="benchmark", y="speedup", data=data)
    plt.title("Runtimes generated " + benchmarks_timestamp)
    plt.ylabel("Max speedup")
    plt.xlabel("AMD SDK Benchmark kernels")
    plt.axhline(y=1, color="k", lw=1)  # speedup line
    plt.setp(ax.get_xticklabels(), rotation=90)  # rotate x ticks
    ax.set_xticklabels([shortbenchmark(x.get_text()) for x in ax.get_xticklabels()])
    viz.finalise(figsize=(9,4))
    complete(len(set(data["benchmark"])) == 17, "New performance numbers from 17 AMD kernels")
except:
    complete(False, "Failed to analyze benchmark results")

# CLgen kernel performance results
Generate new runtimes using 1% of CLgen kernels used in the paper:

In [None]:
print("running ...  (this will take a few minutes)")
try:
    !rm -f ../data/clgen-10/*.csv ../data/clgen-10/timestamp.txt
    !cd bin && ./mkdata
    data = pd.read_csv("../data/clgen-10/training.csv")
    clgen_timestamp = readfile("../data/clgen-10/timestamp.txt")
    move("../data/clgen-10/training.csv", "../data/usr/{uid}/clgen/".format(uid=uid))
    move("../data/clgen-10/timestamp.txt", "../data/usr/{uid}/clgen/".format(uid=uid))
    complete(len(set(data["benchmark"])) == 17, "Produced new performance results for CLgen benchmarks")
except:
    complete(False, "Did not produce new performance results for CLgen benchmarks")

In [None]:
try:
    if clgen_timestamp != readfile("../data/usr/{uid}/clgen/timestamp.txt".format(uid=uid)):
        print("warning: data timestamp has changed, please re-run experiments", file=sys.stderr)

    data = pd.read_csv("../data/usr/{uid}/clgen/training.csv".format(uid=uid))   
    ax = sns.barplot(x="benchmark", y="speedup", ci=95, data=data)
    plt.title("Runtimes generated " + clgen_timestamp)
    plt.ylabel("Max speedups (95% CI across datasets)")
    plt.xlabel("CLgen kernels")
    plt.axhline(y=1, color="k", lw=1)  # speedup line
    ax.set_xticklabels(range(1, len(ax.get_xticklabels()) + 1))  # one label per kernel, not per row
    viz.finalise(figsize=(9,4))
    complete(len(set(data["benchmark"])) == 17, "New performance numbers from 17 CLgen kernels")
except:
    complete(False, "Failed to analyze CLgen benchmark results")
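
The `ci=95` argument in the plot above asks seaborn for 95% confidence intervals, which it estimates by bootstrapping. The following is a simplified percentile-bootstrap sketch using only the standard library. It is not seaborn's exact implementation; `bootstrap_ci` is a hypothetical helper, and the speedup values are made up for illustration.

```python
import random
import statistics


def bootstrap_ci(values, ci=95, n_boot=1000, seed=0):
    """Percentile bootstrap confidence interval for the mean (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.mean(rng.choices(values, k=len(values)))
        for _ in range(n_boot)
    )
    tail = (100 - ci) / 2 / 100
    lo = means[int(tail * n_boot)]
    hi = means[min(n_boot - 1, int((1 - tail) * n_boot))]
    return lo, hi


# Made-up speedups of one kernel across five datasets.
speedups = [1.0, 1.5, 2.0, 2.5, 3.0]
ci_lo, ci_hi = bootstrap_ci(speedups)
print("mean:", statistics.mean(speedups), "95% CI:", (ci_lo, ci_hi))
```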

# Predictive Model performance using CLgen
Test predictive model performance with and without additional CLgen kernels.

In [None]:
try:
    header("Results from the paper on AMD")
    plot_speedups_with_clgen("../data/amd-benchmarks.csv", "../data/amd-clgen.csv", suite="npb")

    header("Results from the paper on NVIDIA")
    plot_speedups_with_clgen("../data/nvidia-benchmarks.csv", "../data/nvidia-clgen.csv", suite="npb")

    header("Results using runtimes generated: Benchmarks",
           readfile("../data/usr/{uid}/benchmarks/timestamp.txt".format(uid=uid)), "- CLgen",
           readfile("../data/usr/{uid}/clgen/timestamp.txt".format(uid=uid)))
    a, b = plot_speedups_with_clgen("../data/usr/{uid}/benchmarks/training.csv".format(uid=uid),
                                    "../data/usr/{uid}/clgen/training.csv".format(uid=uid), suite="amd")
    complete(b > a, "Predictive model performance improves with CLgen kernels by {:.0f}%".format((b / a) * 100 - 100))
except:
    complete(False, "Failed to generate data for predictive model")
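
The percentage reported by the cell above comes from the expression `(b / a) * 100 - 100`, where `a` and `b` are the speedups without and with CLgen kernels. A short worked example, using made-up values and a hypothetical helper name, shows the arithmetic:

```python
def percent_improvement(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return (improved / baseline) * 100 - 100


# Made-up speedups for illustration only, not results from the paper.
a, b = 2.0, 2.5
print("Improvement: {:.0f}%".format(percent_improvement(a, b)))
```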

# Extended Predictive Model
Compare the performance of the extended predictive model against the model of *Grewe et al.*

In [None]:
try:
    header("Results from the paper")
    plot_speedups_extended_model_2platform(("../data/amd-benchmarks.csv", "../data/amd-clgen.csv"),
                                           ("../data/nvidia-benchmarks.csv", "../data/nvidia-clgen.csv"))

    header("Results using new data")
    speedup = plot_speedups_extended_model("../data/usr/{uid}/benchmarks/training.csv".format(uid=uid),
                                           "../data/usr/{uid}/clgen/training.csv".format(uid=uid))
    complete(speedup >= 1.0, "Extended predictive model improves performance by {:.0f}%".format(speedup * 100 - 100))
except:
    complete(False, "Failed to generate data for extended predictive model")

This is the end of the minimal Artifact Evaluation experiments. For a much more comprehensive evaluation of our work, including analysis of OpenCL rewriting, training neural networks, and validating kernel behaviour, see:
# [Interactive Paper](/notebooks/Paper.ipynb)

