# Sampler Saturation

*RQ: Does the rate of good samples decrease over time?*

For a sample to be "good", it must be unique, i.e. not identical to any sample produced before it. If the rate of good samples decreases over time, that implies saturation of the sample space.
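The notion of a decreasing rate can be made concrete with a small sketch. It assumes each sampler batch is available as a list of kernel source strings. The helper name `unique_rate_per_batch` and the toy data are hypothetical and not part of CLgen.

```python
def unique_rate_per_batch(batches):
    """Return, for each batch, the fraction of samples not seen before.

    A sample counts as "good" only the first time it appears, whether the
    earlier occurrence was in a previous batch or in the same one.
    """
    seen = set()
    rates = []
    for batch in batches:
        new = 0
        for sample in batch:
            if sample not in seen:
                seen.add(sample)
                new += 1
        # An empty batch contributes no good samples.
        rates.append(new / len(batch) if batch else 0.0)
    return rates


# Toy data standing in for sampled kernels (hypothetical).
batches = [["a", "b", "c", "d"], ["a", "b", "e", "f"], ["a", "b", "c", "e"]]
rates = unique_rate_per_batch(batches)
print(rates)  # [1.0, 0.5, 0.0]
```

A sequence of rates that trends toward zero, as in the toy data, is what saturation would look like under this definition.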

## Experimental Setup

In [1]:
import clgen

clgen.platform_info()

CLgen:      0.2.12 (with CUDA)
Platform:   Linux
Memory:     32057 MB

Device:     GPU GeForce GTX 1080
Compute #.: 20
Frequency:  1733 HZ
Memory:     8114 MB
Driver:     375.39

Device:     GPU GeForce GTX 1080
Compute #.: 20
Frequency:  1733 HZ
Memory:     8114 MB
Driver:     375.39

Device:     CPU Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
Compute #.: 16
Frequency:  2100 HZ
Memory:     32057 MB
Driver:     1.2.0.25


In [3]:
from clgen.corpus import Corpus

corpus = Corpus.from_json({
    "path": "~/data/kernels/github",
    "vocabulary": "greedy"
})
corpus

UnableToAcquireLockError: Unable to acquire file lock owned by a different process.
Lock acquired by process 185820 on 2017-03-28.
Lock path: /home/cec/.cache/clgen/0.2.12/corpus/fc8a8aea21d03dfad54e95cb77d38649cc5d0b08/LOCK

In [None]:
from clgen.model import Model

model = Model(corpus, **{
    "architecture": {
        "rnn_size": 512,
        "num_layers": 2
    },
    "train_opts": {
        "epochs": 50
    }
}).train()
model

In [None]:
from clgen.sampler import Sampler

sampler = Sampler.from_json({
    "kernels": {
        "args": [
            "__global float*",
            "__global float*",
            "__global float*",
            "const int"
        ],
        "max_length": 10000
    },
    "sampler": {
        "batch_size": 500,
        "static_checker": True,
        "dynamic_checker": False
    }
})
sampler

## Experimental Methodology

In [None]:
import time

from labm8 import fs

# Start from an empty sample cache so that earlier runs do not affect the results.
sampler.cache(model).empty()

datadir = "data/saturation"
fs.mkdir(datadir)

for i in range(100):
    print("sampler batch", i)

    # Time a single sampling iteration.
    start = time.time()
    sampler.sample_iteration(model, quiet=True)
    elapsed = time.time() - start

    # Snapshot the sampler's kernel database after this iteration.
    src = fs.path(sampler.cache(model).path, "kernels.db")
    dst = fs.path(datadir, "iteration-{}.db".format(i))
    fs.cp(src, dst)

    # Record the wall-clock time of this iteration, in seconds.
    with open(fs.path(datadir, "iteration-{}.txt".format(i)), "w") as outfile:
        print(elapsed, file=outfile)
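The timing files written by the loop above can be read back for analysis. The following is a minimal sketch that assumes only the file layout used above (one `iteration-<i>.txt` per iteration, each holding a single float). The helper name `load_elapsed_times` is hypothetical, and the sketch uses only the standard library rather than `labm8`.

```python
import os


def load_elapsed_times(datadir, iterations):
    """Read the per-iteration wall-clock times, in seconds, in iteration order."""
    times = []
    for i in range(iterations):
        path = os.path.join(datadir, "iteration-{}.txt".format(i))
        with open(path) as infile:
            # Each file holds one float, written with print(elapsed, file=...).
            times.append(float(infile.read().strip()))
    return times
```

Once all 100 iterations have completed, `load_elapsed_times("data/saturation", 100)` would return the list of elapsed times.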