# Trace Generation and Modeling

To test our DeepValidate approach, we generate a dataset of test traces from a chain of relatively simple arithmetic functions operating on a series of randomized inputs. Given the generated program traces, we train an LSTM classifier to predict whether the output will be valid or result in an error.

The trace generation is performed by `output_trace.jl`, which reproduces much of the functionality of `varextract.jl` with some important differences. Rather than sending trace information to `stdout`, it appends the traces to a file, `traces.dat`. This raw output is then processed into a CSV of traces (minus the error dumps we want to predict) and a CSV of binary (0, 1) labels indicating whether the run resulted in an error.

(Note that this is not possible within an IJulia notebook due to restrictions on [task switching in staged functions](https://github.com/JuliaLang/julia/issues/18568), which prevent the trace outputs from being written to a file recursively. However, it works just fine from the command line.)

In [None]:
function Cassette.overdub(ctx::TraceCtx,
                          f,
                          args...)
    open("traces.dat", "a") do file
        write(file, string(f))
        write(file, string(args))
    end
    
    # if we are supposed to descend, we call Cassette.recurse
    if Cassette.canrecurse(ctx, f, args...)
        subtrace = (Any[],Any[])
        push!(ctx.metadata[1], (f, args) => subtrace)
        newctx = Cassette.similarcontext(ctx, metadata = subtrace)
        retval = Cassette.recurse(newctx, f, args...)
        # push!(ctx.metadata[2], subtrace[2])
    else
        retval = Cassette.fallback(ctx, f, args...)
        push!(ctx.metadata[1], :t)
        push!(ctx.metadata[2], retval)
    end
    @info "returning"
    @show retval
    return retval
end

We then modify our `@testset` so that it creates the `traces.dat` file and then loops through a large number of randomized runs of our arithmetic tests. Error conditions happen most often when our inputs are sufficiently close to zero, so drawing inputs from a Normal(0, 2) distribution gives a good range of values and a reasonable percentage of "bad" traces on which to train. Empirically, the share of "bad" traces generated is about 15-17%.

In [None]:
@testset "TraceExtract" begin
    g(x) = begin
        y = add(x.*x, -x)
        z = 1
        v = y .- z
        s = sum(v)
        return s
    end
    h(x) = begin
        z = g(x)
        zed = sqrt(z)
        return zed
    end

    open("traces.dat", "w") do f
        write(f, "")
    end

    seeds = rand(Normal(0,2),30000,3)
    
    for i=1:size(seeds,1)
        ctx = TraceCtx(pass=ExtractPass, metadata = (Any[], Any[]))
        try
            result = Cassette.overdub(ctx, h, seeds[i,:])
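        catch err
            # `catch DomainError` would bind *any* exception to a variable
            # named DomainError, so check the exception type explicitly.
            err isa DomainError || rethrow()
            dump(ctx.metadata)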
        finally
            open("traces.dat", "a") do f
                write(f, "\n")
            end
        end
        if i%1000 == 0
            @info string(i)
        end
    end
end
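
The error rate quoted above can be sanity-checked without Cassette. The sketch below re-implements `g` and `h` in plain Julia and counts how often `sqrt` raises a `DomainError` on randomized inputs. It assumes that `add` is element-wise addition (its definition is not shown in this section) and uses `2 .* randn(rng, 3)` in place of `rand(Normal(0,2), ...)` so that no extra packages are needed. `error_share` is a hypothetical helper name, not part of `output_trace.jl`.

```julia
using Random

# Assumption: `add` is element-wise addition; its definition is not shown here.
add(a, b) = a .+ b

g(x) = sum(add(x .* x, -x) .- 1)   # sum of x_i^2 - x_i - 1
h(x) = sqrt(g(x))                  # throws a DomainError when g(x) < 0

# Hypothetical helper: fraction of random inputs for which h throws a DomainError.
function error_share(rng::AbstractRNG, n::Integer; σ = 2.0)
    bad = 0
    for _ in 1:n
        x = σ .* randn(rng, 3)     # three draws from a zero-mean normal with std σ
        try
            h(x)
        catch err
            err isa DomainError || rethrow()
            bad += 1
        end
    end
    return bad / n
end

share = error_share(MersenneTwister(1), 30_000)
println("share of failing runs: ", share)
```

Because each term `x_i^2 - x_i - 1` is negative only for inputs in a band around zero, the sum goes negative mainly when all three inputs are small, which is consistent with the observation that errors cluster near zero.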


After generating our raw traces, a small amount of pre-processing is required before attempting to model around them. First, we classify our "good" and "bad" traces based on whether they have resulted in an error. 

We then need to strip the actual error dump information out of our "bad" traces, as this would too easily give away the prediction game. All traces end just before they would error, allowing the validation model to predict that next outcome.

In [None]:
text = split(String(read("traces.dat")), "\n");
Ys = Int.(occursin.(Ref(r"(Base[\S(?!\))]+error)"i), text));

text = split.(text, Ref(r"(Base[\S(?!\))]+error)"i));
text = [t[1] for t in text];

sum(Ys)
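
To make the labeling and stripping steps concrete, here is a minimal sketch that applies the same regular expression to two made-up trace lines. The line contents are hypothetical and only mimic the shape of real traces; the names `err_pat`, `lines`, `labels`, and `stripped` are introduced for illustration.

```julia
# Same pattern as in the cell above: "Base", then a run of non-whitespace
# characters, then "error" (case-insensitive).
err_pat = r"(Base[\S(?!\))]+error)"i

# Hypothetical trace lines: one clean run and one that hits a domain error.
lines = [
    "sum([1.0, 2.0],)sqrt(3.0,)",
    "sum([-1.0],)sqrt(-1.0,)Base.Math.throw_complex_domainerror(:sqrt, -1.0)",
]

# 1 if the line contains an error call, 0 otherwise.
labels = Int.(occursin.(Ref(err_pat), lines))

# Keep only the text that precedes the first error call.
stripped = [first(split(l, err_pat)) for l in lines]

println(labels)
println(stripped)
```

The second line is labeled `1` and truncated right before the error call, so the classifier never sees the text it is supposed to predict.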

Finally, we save our traces and our labels as CSV files for easy ingestion by our model.

In [None]:
writedlm( "traces.csv",  text[1:end-1], ',')
writedlm( "y_results.csv",  Ys[1:end-1], ',')

## Validation Classifier Model
For our modeling, we use [Flux.jl](https://github.com/FluxML/Flux.jl) and train an LSTM encoder/decoder classifier on our traces.

In [3]:
using Pkg

In [4]:
#Pkg.add(["Flux", "MLDataPattern", "DelimitedFiles"])
Pkg.activate("../../")
Pkg.instantiate()

  Updating registry at `~/.julia/registries/General`
  Updating git-repo `https://github.com/JuliaRegistries/General.git`

In [5]:
using DelimitedFiles
using Flux
using Flux: onehot, throttle, crossentropy, onehotbatch, params, shuffle
using MLDataPattern: stratifiedobs
using Base.Iterators: partition

include("../../src/validation/utils.jl")


┌ Info: Recompiling stale cache file /home/jfairbanks6/.julia/compiled/v1.0/Flux/QdkVy.ji for Flux [587475ba-b771-5e3f-ad9e-33799f191a9c]
└ @ Base loading.jl:1190
┌ Info: Recompiling stale cache file /home/jfairbanks6/.julia/compiled/v1.0/MLDataPattern/qdIQj.ji for MLDataPattern [9920b226-0b2a-5f5f-9153-9aa70a013f8b]
└ @ Base loading.jl:1190


get_data (generic function with 1 method)

In [6]:
#
# Set up inputs for model
#

# Read lines of text from traces.dat into arrays of characters
# Convert to onehot matrices

cd(@__DIR__)

text, alphabet, N = get_data("traces.dat")
stop = onehot('\n', alphabet);
prod(alphabet)

"}(*, [-0.34912857])ad6+Bseroctbmilznpy{DfuASqvF<_Mhwx:g\"#\n"

In [7]:
# Partition into subsequences to input to our model

seq_len = 50

Xs = [collect(partition(t,seq_len)) for t in text];
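
As a small illustration of what `partition` does to a single trace, the sketch below splits a made-up trace into fixed-length chunks. The trace string is hypothetical, and a chunk length of 8 is used instead of 50 purely to keep the output readable.

```julia
using Base.Iterators: partition

trace = collect("sqrt(sum([1.0, 2.0]))")   # hypothetical trace, as a Vector{Char}
seq_len = 8                                 # the notebook uses 50

# Each chunk holds at most seq_len characters; the last one may be shorter.
chunks = collect(partition(trace, seq_len))

for c in chunks
    println(repr(join(c)))
end
```

Note that the final chunk is shorter than `seq_len` whenever the trace length is not a multiple of it, which matters for any later step that assumes equal chunk widths.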

In [8]:
prod.(Xs[7])

31-element Array{String,1}:
 "getfield(Main, Symbol(\"#h#9\")){getfield(Main, Symb"  
 "ol(\"#g#8\"))}(getfield(Main, Symbol(\"#g#8\"))())([1."
 "2772, -3.93049, -1.6268],)getfield(getfield(Main, "    
 "Symbol(\"#h#9\")){getfield(Main, Symbol(\"#g#8\"))}(ge"
 "tfield(Main, Symbol(\"#g#8\"))()), :g)getfield(Main,"  
 " Symbol(\"#g#8\"))()([1.2772, -3.93049, -1.6268],)Ba"  
 "se.Broadcast.broadcasted(*, [1.2772, -3.93049, -1."    
 "6268], [1.2772, -3.93049, -1.6268])Base.Broadcast."    
 "materialize(Base.Broadcast.Broadcasted(*, ([1.2772"    
 ", -3.93049, -1.6268], [1.2772, -3.93049, -1.6268])"    
 "),)Base.Broadcast.instantiate(Base.Broadcast.Broad"    
 "casted(*, ([1.2772, -3.93049, -1.6268], [1.2772, -"    
 "3.93049, -1.6268])),)copy(Base.Broadcast.Broadcast"    
 ⋮                                                       
 "2, 4.27327], 1)Base.Broadcast.materialize(Base.Bro"    
 "adcast.Broadcasted(-, ([0.354036, 19.3792, 4.27327"    
 "], 1)),)Base.Broadcast.instantiate(Base.Br

In [9]:
Ys = (map(t->occursin("sqrt_llvm", prod(t)), text));

In [10]:
sum(Ys), length(Ys)

(297, 348)

In [11]:
Xs_vec = [[onehotbatch(x, alphabet) for x in Xs[i]] for i in 1:length(Xs)];
Xs_vec[1]

19-element Array{Flux.OneHotMatrix{Array{Flux.OneHotVector,1}},1}:
 [true true … false false; false false … false false; … ; false false … false false; false false … false false]  
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false
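
The matrices printed above are one-hot encodings: one row per alphabet symbol, one column per character in the chunk. The following plain-Julia sketch shows the idea on a toy alphabet. It is not Flux's implementation of `onehotbatch`; `onehot_matrix` and the four-symbol alphabet are hypothetical stand-ins.

```julia
# Toy alphabet; the notebook's alphabet is derived from traces.dat.
alphabet = ['a', 'b', 'c', '\n']

# Hypothetical helper: Bool matrix with one row per symbol and one column per
# character, true exactly where the character equals the row's symbol.
onehot_matrix(seq, alphabet) = [a == c for a in alphabet, c in seq]

X = onehot_matrix(collect("cab"), alphabet)

# Decoding picks, for each column, the symbol whose row is true.
decoded = join(alphabet[findfirst(X[:, j])] for j in 1:size(X, 2))

println(size(X))
println(decoded)
```

Each column contains exactly one `true`, so the encoding is lossless as long as every character of the trace appears in the alphabet.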

In [12]:
#Ys = readdlm("y_results.csv");
labelset = unique(Ys)
#dataset = [(onehotbatch(x, alphabet, '\n'), onehot(Ys[i],labelset))
#           for i in 1:length(Ys) for x in Xs[i]] |> shuffle
dataset = shuffle([(Xs_vec[i], onehot(Ys[i], labelset)) for i in 1:length(Xs_vec)])
Ys = last.(dataset)
first.(dataset)[1][1:25]

25-element Array{Flux.OneHotMatrix{Array{Flux.OneHotVector,1}},1}:
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [true false … false false; false true … true false; … ; false false … false false; false false … false false]   
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false

In [15]:
# Pad sequences to equal lengths
#Xs_padded = [hcat(x,repeat(stop,1,seq_len-size(x)[1])) for x in first.(dataset)]
Xs_padded = [hcat(x[1:10]...) for x in first.(dataset)]
#map(length, Xs_padded)

348-element Array{Array{Bool,2},1}:
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false false; false false … false false; … ; false false … false false; false false … false false]
 [false false … false false; false false … false fal
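
The cell above keeps only the first 10 chunks of every trace, so indexing with `x[1:10]` would raise a `BoundsError` for any trace with fewer than 10 chunks. The commented-out line hints at padding with the stop symbol instead. A minimal sketch of a pad-or-truncate step on a single one-hot matrix might look like this; `pad_or_truncate` and the toy matrices are hypothetical, not part of `utils.jl`.

```julia
# Hypothetical helper: force a one-hot matrix to exactly `width` columns by
# truncating long inputs and padding short ones with a stop-symbol column.
function pad_or_truncate(X::AbstractMatrix{Bool}, width::Int,
                         stopcol::AbstractVector{Bool})
    n = size(X, 2)
    n >= width && return X[:, 1:width]
    return hcat(X, repeat(stopcol, 1, width - n))
end

stop  = Bool[false, false, true]                  # toy one-hot column for the stop symbol
short = Bool[true false; false true; false false] # 3×2 toy encoding
long  = repeat(short, 1, 4)                       # 3×8 toy encoding
width = 5

a = pad_or_truncate(short, width, stop)   # padded with three stop columns
b = pad_or_truncate(long, width, stop)    # truncated to the first five columns

println(size(a), " ", size(b))
```

Padding keeps short traces in the dataset at the cost of adding uninformative columns; whether that trade-off helps the classifier is an empirical question.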

In [16]:
# There are 972,290 items in our data. We use a train:test split of 90:10, stratified to ensure we have 
# the same share of "bad" and "good" traces in our train and test sets.

(Xtrain, Ytrain), (Xtest, Ytest) = stratifiedobs((Xs_padded, Ys), p=0.9)

train = [(Xtrain[i], Ytrain[i]) for i in 1:length(Ytrain)];
test = [(Xtest[i], Ytest[i]) for i in 1:length(Ytest)];
length(train), length(test)
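
For readers unfamiliar with stratified splitting, the sketch below shows the idea in plain Julia: shuffle the indices of each class separately and send the same fraction of every class to the training set. It is an illustration of the concept only, not how `stratifiedobs` is implemented; `stratified_split` and the toy label vector are hypothetical.

```julia
using Random

# Hypothetical helper: returns (train_idx, test_idx) with roughly a fraction p
# of each class assigned to the training set.
function stratified_split(rng::AbstractRNG, labels::AbstractVector; p = 0.9)
    train_idx, test_idx = Int[], Int[]
    for class in unique(labels)
        idx = shuffle(rng, findall(isequal(class), labels))
        ntrain = round(Int, p * length(idx))
        append!(train_idx, idx[1:ntrain])
        append!(test_idx, idx[ntrain+1:end])
    end
    return train_idx, test_idx
end

# Toy, imbalanced labels: 50 examples of class 1 and 300 of class 2.
labels = vcat(fill(1, 50), fill(2, 300))
train_idx, test_idx = stratified_split(MersenneTwister(0), labels)

println(length(train_idx), " ", length(test_idx))
println(count(==(1), labels[train_idx]), " ", count(==(1), labels[test_idx]))
```

With an imbalanced dataset, a plain random split can leave the test set with very few examples of the rare class; splitting per class avoids that.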

In [17]:
train[1][1]

58×500 Array{Bool,2}:
  true   true  false  false  false  …  false  false  false  false  false
 false  false   true  false  false     false  false  false  false  false
 false  false  false   true  false     false  false  false  false  false
 false  false  false  false   true     false   true  false  false  false
 false  false  false  false  false     false  false   true  false  false
 false  false  false  false  false  …  false  false  false  false  false
 false  false  false  false  false     false  false  false  false  false
 false  false  false  false  false     false  false  false  false  false
 false  false  false  false  false     false  false  false  false  false
 false  false  false  false  false     false  false  false  false  false
 false  false  false  false  false  …  false  false  false  false  false
 false  false  false  false  false     false  false  false  false  false
 false  false  false  false  false     false  false  false   true  false
     ⋮                       

In [18]:
# We set up our model architecture

scanner = Chain(Dense(length(alphabet), seq_len, σ), LSTM(seq_len, seq_len))
encoder = Dense(seq_len, 2)

function model(x)
  state = scanner.([x])[end]
  Flux.reset!(scanner)
  softmax(encoder(state))
end

loss(tup...) = begin
    #@show typeof.(tup)
    #@show size.(tup)
    crossentropy(model(tup[1]), tup[2])
end
accuracy(tup...) = mean(argmax(model(tup[1])) .== argmax(tup[2]))

opt = ADAM(0.000001)
ps = params(scanner, encoder)
#ps = params(model)

Params([Float32[-0.129354 -0.0241872 … -0.0101191 -0.217834; -0.205283 0.105929 … 0.151116 -0.221793; … ; -0.102469 -0.192997 … -0.0877722 -0.0654175; -0.22055 0.0625328 … -0.110478 0.115831] (tracked), Float32[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] (tracked), Float32[-0.069226 0.0263257 … 0.12184 -0.0979717; 0.0420466 0.117881 … -0.144269 -0.0572768; … ; -0.0101115 -0.0503838 … 0.0683065 0.0389834; 0.0482987 -0.145225 … -0.0969547 0.139767] (tracked), Float32[-0.0433572 0.0325791 … -0.0879745 0.145104; -0.0882896 0.115542 … -0.0810297 0.125361; … ; -0.057541 0.0998544 … 0.139006 -0.0808067; 0.144499 -0.0519149 … 0.0960697 -0.142463] (tracked), Float32[0.0798743, 0.0759112, 0.151637, -0.0658528, -0.00590102, 0.144042, 0.0726334, -0.0810471, -0.13389, 0.126116  …  -0.0381014, -0.00151839, 0.0764952, -0.0388899, 0.0603149, -0.0646586, 0.0466692, -0.131691, 0.0934145, -0.0481369] (tracked), Float32[0.0, 0.0, 0.0, 0.0, 0.0, 0.

In [19]:
# Finally, we set up our callbacks for reporting on training progress.
mean(x) = sum(x)/length(x)
#testacc() = mean(accuracy(t) for t in test)
testloss() = mean(loss(t...) for t in test)

evalcb = () -> @show testloss()#, testacc()


#24 (generic function with 1 method)

In [30]:
# Now, train!
@show length(train), length(test)
epochs = 50
for e in 1:epochs
    Flux.train!(loss, ps, train, opt, cb = throttle(evalcb, 10))
end

(length(train), length(test)) = (313, 35)
testloss() = 244.06038f0 (tracked)
testloss() = 242.83792f0 (tracked)
testloss() = 241.65282f0 (tracked)
testloss() = 240.50385f0 (tracked)
testloss() = 239.39015f0 (tracked)
testloss() = 238.31078f0 (tracked)
testloss() = 237.26488f0 (tracked)
testloss() = 236.25148f0 (tracked)
testloss() = 235.26956f0 (tracked)
testloss() = 234.31836f0 (tracked)
testloss() = 233.39685f0 (tracked)
testloss() = 232.50427f0 (tracked)
testloss() = 231.63972f0 (tracked)
testloss() = 230.80249f0 (tracked)
testloss() = 229.99174f0 (tracked)
testloss() = 229.20673f0 (tracked)
testloss() = 228.44658f0 (tracked)
testloss() = 227.71059f0 (tracked)
testloss() = 226.998f0 (tracked)
testloss() = 226.3081f0 (tracked)
testloss() = 225.64024f0 (tracked)
testloss() = 224.99364f0 (tracked)
testloss() = 224.36768f0 (tracked)
testloss() = 223.7616f0 (tracked)
testloss() = 223.1748f0 (tracked)
testloss() = 222.60672f0 (tracked)
testloss() = 222.0568f0 (tracked)
testloss() = 221.52

In [31]:
ps

Params([Float32[-0.139871 -0.0347597 … -0.0213785 -0.217834; -0.190121 0.121064 … 0.16649 -0.221793; … ; -0.107366 -0.197782 … -0.0929333 -0.0654175; -0.222207 0.0593458 … -0.111654 0.115831] (tracked), Float32[-0.0104232, 0.0150795, 0.0155962, -0.00930544, 0.00410825, -0.0139495, 0.0148873, 0.014865, 0.000389495, 0.00345307  …  0.00894054, 0.0150887, -0.00572559, -0.0140178, 0.0148194, 0.0153684, -0.0142857, -0.00572933, -0.00536107, -0.00245235] (tracked), Float32[-0.0690327 0.0264817 … 0.121986 -0.0978056; 0.0498987 0.125716 … -0.136427 -0.0494317; … ; 0.00559051 -0.0346823 … 0.0839965 0.0546742; 0.0640606 -0.129461 … -0.0811916 0.15553] (tracked), Float32[-0.0417015 0.0342454 … -0.0896309 0.143439; -0.0951674 0.108681 … -0.0741613 0.132219; … ; -0.0751735 0.08222 … 0.156634 -0.0631734; 0.126793 -0.0696225 … 0.113772 -0.124756] (tracked), Float32[0.0800481, 0.0837521, 0.148407, -0.0790332, 0.0105961, 0.130021, 0.0630973, -0.0658107, -0.135941, 0.141698  …  -0.0224647, 0.0133192, 0.0

In [32]:
#model.(Xs_padded)[:,end]

In [33]:
sumη = [0.0,0.0]
df = []
for i in 1:300
    yhat = model(train[i][1])[:,end]
    η = crossentropy(yhat, train[i][2])
    sumη += η.*train[i][2]
    υ = Flux.onecold(train[i][2])
    # Despite the name, this is the odds ratio yhat[1]/yhat[2], not its logarithm.
    logods = round(Flux.data(yhat[1]/yhat[2]);digits=3)
    push!(df, (υ, logods, η))
    #println("$υ\t$logods\t$η")
end
ηbar = sumη./sum(train[i][2] for i in 1:length(train))


1	0.269	1.5504481f0 (tracked)
2	0.272	0.24033804f0 (tracked)
2	0.272	0.24062566f0 (tracked)
2	0.269	0.23845132f0 (tracked)
2	0.271	0.23978798f0 (tracked)
2	0.269	0.23845132f0 (tracked)
2	0.265	0.2350539f0 (tracked)
2	0.266	0.23609789f0 (tracked)
2	0.272	0.24033804f0 (tracked)
2	0.274	0.2420508f0 (tracked)
2	0.274	0.2420508f0 (tracked)
2	0.265	0.2350539f0 (tracked)
2	0.273	0.24143757f0 (tracked)
2	0.272	0.24062566f0 (tracked)
2	0.271	0.23971389f0 (tracked)
2	0.269	0.23845132f0 (tracked)
2	0.271	0.23971389f0 (tracked)
2	0.274	0.2420508f0 (tracked)
1	0.269	1.5504481f0 (tracked)
2	0.272	0.24081151f0 (tracked)
2	0.272	0.24062566f0 (tracked)
2	0.266	0.23609789f0 (tracked)
1	0.274	1.537193f0 (tracked)
2	0.264	0.23447493f0 (tracked)
2	0.272	0.24081151f0 (tracked)
2	0.272	0.24033804f0 (tracked)
2	0.271	0.23971389f0 (tracked)
2	0.272	0.24033804f0 (tracked)
2	0.269	0.23845132f0 (tracked)
2	0.275	0.24262035f0 (tracked)
2	0.269	0.23845132f0 (tracked)
2	0.264	0.23408234f0 (tracked)
2	0.276	0.2436583

300-element Array{Any,1}:
 (1, 0.264f0, 1.5653542f0 (tracked)) 
 (1, 0.264f0, 1.5653542f0 (tracked)) 
 (1, 0.264f0, 1.5668414f0 (tracked)) 
 (1, 0.264f0, 1.5668414f0 (tracked)) 
 (1, 0.265f0, 1.5631661f0 (tracked)) 
 (1, 0.265f0, 1.5631661f0 (tracked)) 
 (1, 0.266f0, 1.5592362f0 (tracked)) 
 (1, 0.269f0, 1.5504481f0 (tracked)) 
 (1, 0.269f0, 1.5504481f0 (tracked)) 
 (1, 0.269f0, 1.5504481f0 (tracked)) 
 (1, 0.269f0, 1.5504481f0 (tracked)) 
 (1, 0.269f0, 1.5504481f0 (tracked)) 
 (1, 0.271f0, 1.545773f0 (tracked))  
 ⋮                                   
 (2, 0.276f0, 0.24365835f0 (tracked))
 (2, 0.276f0, 0.24365835f0 (tracked))
 (2, 0.276f0, 0.24365835f0 (tracked))
 (2, 0.276f0, 0.24365835f0 (tracked))
 (2, 0.276f0, 0.24365835f0 (tracked))
 (2, 0.276f0, 0.24365835f0 (tracked))
 (2, 0.276f0, 0.24365835f0 (tracked))
 (2, 0.276f0, 0.24365835f0 (tracked))
 (2, 0.276f0, 0.24365835f0 (tracked))
 (2, 0.276f0, 0.24365835f0 (tracked))
 (2, 0.276f0, 0.24365835f0 (tracked))
 (2, 0.276f0, 0.24365835
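
The second and third columns of this output are tied together by simple arithmetic. Assuming the model output is a two-element probability vector and the loss is the usual cross-entropy against a one-hot target, the printed odds `yhat[1]/yhat[2]` determine both class probabilities and therefore the loss. The sketch below redoes that calculation for the printed odds value 0.269; `xent` is a hypothetical stand-in for the cross-entropy, not Flux's own function.

```julia
# Cross-entropy of a predicted distribution ŷ against a one-hot target y.
xent(ŷ, y) = -sum(y .* log.(ŷ))

odds = 0.269                        # yhat[1] / yhat[2], as printed in the output
ŷ = [odds / (1 + odds), 1 / (1 + odds)]

loss_class1 = xent(ŷ, [1.0, 0.0])   # true label is class 1
loss_class2 = xent(ŷ, [0.0, 1.0])   # true label is class 2

println(round.(ŷ, digits = 3))
println(round(loss_class1, digits = 3), " ", round(loss_class2, digits = 3))
```

The results agree with the printed losses (about 1.55 for label 1 and about 0.24 for label 2) up to the rounding of the odds. In the output shown, the odds stay between 0.264 and 0.276 for both labels, so the loss is almost entirely determined by which label an example carries.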

In [37]:
df[1:50]

50-element Array{Any,1}:
 (1, 0.264f0, 1.5653542f0 (tracked)) 
 (1, 0.264f0, 1.5653542f0 (tracked)) 
 (1, 0.264f0, 1.5668414f0 (tracked)) 
 (1, 0.264f0, 1.5668414f0 (tracked)) 
 (1, 0.265f0, 1.5631661f0 (tracked)) 
 (1, 0.265f0, 1.5631661f0 (tracked)) 
 (1, 0.266f0, 1.5592362f0 (tracked)) 
 (1, 0.269f0, 1.5504481f0 (tracked)) 
 (1, 0.269f0, 1.5504481f0 (tracked)) 
 (1, 0.269f0, 1.5504481f0 (tracked)) 
 (1, 0.269f0, 1.5504481f0 (tracked)) 
 (1, 0.269f0, 1.5504481f0 (tracked)) 
 (1, 0.271f0, 1.545773f0 (tracked))  
 ⋮                                   
 (1, 0.275f0, 1.535116f0 (tracked))  
 (1, 0.275f0, 1.535116f0 (tracked))  
 (1, 0.275f0, 1.535116f0 (tracked))  
 (1, 0.275f0, 1.535116f0 (tracked))  
 (1, 0.275f0, 1.535116f0 (tracked))  
 (1, 0.276f0, 1.5313448f0 (tracked)) 
 (2, 0.264f0, 0.23408234f0 (tracked))
 (2, 0.264f0, 0.23408234f0 (tracked))
 (2, 0.264f0, 0.23408234f0 (tracked))
 (2, 0.264f0, 0.23408234f0 (tracked))
 (2, 0.264f0, 0.23408234f0 (tracked))
 (2, 0.264f0, 0.23408234f

In [33]:
map(x->size(x[end]), Xs_padded)

3000-element Array{Tuple{},1}:
 ()
 ()
 ()
 ()
 ()
 ()
 ()
 ()
 ()
 ()
 ()
 ()
 ()
 ⋮ 
 ()
 ()
 ()
 ()
 ()
 ()
 ()
 ()
 ()
 ()
 ()
 ()

In [40]:
Pkg.add(["CuArrays","CUDAnative"])

 Resolving package versions...
  Updating `~/.julia/dev/SemanticModels/Project.toml`
  [be33ccc6] + CUDAnative v1.0.1
  Updating `~/.julia/dev/SemanticModels/Manifest.toml`
 [no changes]


In [41]:
Pkg.build("CUDAnative")

  Building LLVM ──────→ `~/.julia/packages/LLVM/tPWXv/deps/build.log`
  Building CUDAdrv ───→ `~/.julia/packages/CUDAdrv/JWljj/deps/build.log`
  Building CUDAnative → `~/.julia/packages/CUDAnative/Mdd3w/deps/build.log`


┌ Error: Error building `CUDAnative`: 
│ ERROR: LoadError: CUDA toolkit at  doesn't contain nvcc
│ Stacktrace:
│  [1] error(::String) at ./error.jl:33
│  [2] find_toolkit_version(::Array{String,1}) at /home/jfairbanks6/.julia/packages/CUDAapi/AVyQs/src/discovery.jl:277
│  [3] main() at /home/jfairbanks6/.julia/packages/CUDAnative/Mdd3w/deps/build.jl:125
│  [4] top-level scope at none:0
│  [5] include at ./boot.jl:317 [inlined]
│  [6] include_relative(::Module, ::String) at ./loading.jl:1044
│  [7] include(::Module, ::String) at ./sysimg.jl:29
│  [8] include(::String) at ./client.jl:392
│  [9] top-level scope at none:0
│ in expression starting at /home/jfairbanks6/.julia/packages/CUDAnative/Mdd3w/deps/build.jl:165
└ @ Pkg.Operations /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.0/Pkg/src/Operations.jl:1097


In [48]:
train[1][1][:,3]

58-element Array{Bool,1}:
 false
  true
 false
 false
 false
 false
 false
 false
 false
 false
 false
 false
 false
     ⋮
 false
 false
 false
 false
 false
 false
 false
 false
 false
 false
 false
 false