# Trying a Neural Network to predict the rest-eigen part from the eigenvalues and the eigen-eigen part

If you have an M-series Apple chip with a performant GPU, you can use Metal to improve performance. Flux also supports CUDA if you want to run on NVIDIA GPUs. See [here](http://fluxml.ai/Flux.jl/stable/gpu/).

In [None]:
using LmaPredict, Flux, Metal, Statistics, ProgressMeter, Plots

Check which GPU backend is available

In [335]:
Flux.GPU_BACKEND

"Metal"

## Reading the data

In [None]:
const path_config = "/Users/lukasgeyer/Studium/Computational Sciences/Masterarbeit/Daten Simon/dat"
const path_plot = "../plots"

In [None]:
fname = readdir(path_config)[2:5001]
idx_cut = findall(x->x<=2500 , parse.(Int64, fname))
fname = fname[idx_cut]
idx = sortperm( parse.(Int64, fname))
fname = fname[idx]

cnfgarr = Vector{LMAConfig}(undef, 0)
for f in fname
    push!(cnfgarr, get_LMAConfig(joinpath(path_config, f), "g5-g5", em="PA", bc=false))
end

## Splitting data into training and test sets

In [None]:
# Select a specific Tsource and divide data into training and test set for rr re and ee components
TSRC="12"
NCNFG = length(cnfgarr)
TVALS = length(cnfgarr[1].data["rr"][TSRC]) -1
EIGVALS = minimum(length.([cnfgarr[i].data["eigvals"] for i in 1:NCNFG]))

eigvals_data = Array{Float64}(undef, EIGVALS, 2000)
rr_data = Array{Float64}(undef, TVALS, 2000)
ee_data = Array{Float64}(undef, TVALS, 2000)
re_data = Array{Float64}(undef, TVALS, 2000)

eigvals_data_test = Array{Float64}(undef, EIGVALS, 500)
rr_data_test = Array{Float64}(undef, TVALS, 500)
ee_data_test = Array{Float64}(undef, TVALS, 500)
re_data_test = Array{Float64}(undef, TVALS, 500)


for (k, dd) in enumerate(getfield.(cnfgarr, :data)[1:2000])
    eigvals_data[:,k] = copy(cnfgarr[k].data["eigvals"][1:EIGVALS])
    rr_data[:,k] = getindex(getindex(dd, "rr"), TSRC)[2:end]
    ee_data[:,k] = getindex(getindex(dd, "ee"), TSRC)[2:end]
    re_data[:,k] = getindex(getindex(dd, "re"), TSRC)[2:end]
end
for (k, dd) in enumerate(getfield.(cnfgarr, :data)[2001:2500])
    # k restarts at 1 here, so index via dd (cnfgarr[k] would pick training configurations)
    eigvals_data_test[:,k] = copy(dd["eigvals"][1:EIGVALS])
    rr_data_test[:,k] = getindex(getindex(dd, "rr"), TSRC)[2:end]
    ee_data_test[:,k] = getindex(getindex(dd, "ee"), TSRC)[2:end]
    re_data_test[:,k] = getindex(getindex(dd, "re"), TSRC)[2:end]
end
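
The split above assigns the first 2000 configurations to the training set and the remaining 500 to the test set. As a minimal sketch of the same idea on toy data (the names `all_data`, `train_idx` and `test_idx` and all sizes are hypothetical, not part of the notebook), indexing with the ranges directly keeps the loop counter and the global configuration index from getting mixed up:

```julia
# Toy stand-in for the per-configuration data: 10 "configurations", 4 values each.
# The real notebook uses 2500 configurations; sizes here are illustrative only.
n_cnfg, n_vals = 10, 4
all_data = [fill(Float64(i), n_vals) for i in 1:n_cnfg]

train_idx = 1:8      # analogous to 1:2000
test_idx  = 9:10     # analogous to 2001:2500

# Build one column per configuration by selecting with the index range itself.
train = reduce(hcat, all_data[train_idx])
test  = reduce(hcat, all_data[test_idx])

size(train), size(test)
```

Here the first test column comes from configuration 9, not from configuration 1, which is the property the test loop has to preserve.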

## Describing the Neural Network

As input we choose a vector containing the inverted eigenvalues $\lambda_i$ as the first $n_{\lambda}$ entries, followed by the eigen-eigen contributions $ee_i$:

$$
v_{input} = \begin{bmatrix} \frac{1}{\lambda_1} \\[6pt] \frac{1}{\lambda_2} \\[6pt] \frac{1}{\lambda_3} \\[6pt] \vdots \\[6pt] \frac{1}{\lambda_{n_{\lambda}}} \\[6pt] ee_1 \\[6pt] ee_2 \\[6pt] ee_3 \\[6pt] \vdots \\[6pt] ee_{n_{ee}} \end{bmatrix}
$$

where $n_{ee}$ is the number of time samples, 47 in our case.

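Our Neural Network therefore has an input layer of size $n_{\lambda} + n_{ee}$. We then use one fully connected hidden layer of size $2(n_{\lambda} + n_{ee})$ and a fully connected output layer of size $n_{ee}$. In the implementation below, the hidden layer uses a `tanh` activation followed by batch normalisation, and the output passes through a `softmax`.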

**Note:** The number of available eigenvalues differs between configurations. We therefore take the minimum number of eigenvalues over all configurations, which is the largest count that every configuration can provide.
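
To make the layout of $v_{input}$ concrete, here is a small sketch with toy numbers. The arrays `eigvals_toy` and `ee_toy` and the sizes (3 eigenvalues, 2 time samples, 4 configurations) are assumptions chosen for illustration only; the notebook's real arrays are built in the next cell.

```julia
# Hypothetical toy sizes: 3 eigenvalues and 2 time samples for 4 configurations.
n_lambda, n_ee, n_cnfg = 3, 2, 4
eigvals_toy = [0.5  1.0 2.0 4.0;
               0.25 0.5 1.0 2.0;
               0.1  0.2 0.4 0.8]         # n_lambda × n_cnfg
ee_toy = ones(n_ee, n_cnfg)              # n_ee × n_cnfg

# Stack the inverted eigenvalues on top of the ee block, one column per configuration.
v_input = vcat(1 ./ eigvals_toy, ee_toy)

size(v_input)        # (n_lambda + n_ee, n_cnfg)
```

Each column of `v_input` is one input vector, so the number of rows must equal the input layer size $n_{\lambda} + n_{ee}$.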

In [40]:
input_length = TVALS + EIGVALS
output_length = TVALS
hidden_length = 2 * input_length

input_data = vcat(1 ./ eigvals_data, ee_data)
target = re_data

test_input_data = vcat(1 ./ eigvals_data_test,ee_data_test)
test_target = re_data_test;

## Defining the Network

In [None]:
model = Chain(
    Dense(input_length => hidden_length, tanh),
    BatchNorm(hidden_length),
    Dense(hidden_length => output_length),
    softmax) |> gpu
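
The last layer of the chain above is a softmax, which constrains every output vector to be positive and to sum to one. The following plain-Julia sketch shows what this implies for a 47-component output. `softmax_vec` is a hypothetical helper written for this illustration, not Flux's implementation, and the input `z` is random.

```julia
using Statistics

# Plain-Julia softmax over a vector, stabilised by subtracting the maximum.
softmax_vec(z) = (e = exp.(z .- maximum(z)); e ./ sum(e))

n_ee = 47                      # number of time samples, as in the text
z = randn(n_ee)                # arbitrary pre-activation values
y = softmax_vec(z)

sum(y)                         # ≈ 1 regardless of z
mean(y)                        # ≈ 1 / 47 ≈ 0.0213
```

Whatever the network learns, the mean of each predicted vector is therefore pinned to $1/47$, which is worth keeping in mind when comparing predictions with targets that are not normalised in this way.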

## Defining training input and target data

We use a batch size of 64 configurations to introduce stochasticity.

In [None]:
loader = Flux.DataLoader((input_data, target) |> gpu, batchsize=64, shuffle=true)

optim = Flux.setup(Flux.Adam(0.01), model);

## Training the Network

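With 2000 training configurations and a batch size of 64, each epoch consists of 32 batches (the last one holds only 16 configurations). Over 1000 epochs the network therefore performs $1000 \cdot 32$ parameter updates and sees every training configuration 1000 times. As loss function we use the standard MSE.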
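
As a quick sanity check on the amount of training, the number of batches and parameter updates can be worked out from the sizes used in this notebook. This sketch assumes that the data loader keeps the final, smaller batch instead of dropping it; the variable names are illustrative.

```julia
n_train   = 2000   # training configurations, from the split above
batchsize = 64
epochs    = 1_000

batches_per_epoch = cld(n_train, batchsize)                        # ceiling division
last_batch_size   = n_train - (batches_per_epoch - 1) * batchsize  # size of the partial batch
total_updates     = epochs * batches_per_epoch                     # one update per batch
```

The length of `losses` after training should match `total_updates`, since one loss value is pushed per batch.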

In [None]:
epochs = 1_000

losses = []
@showprogress for epoch in 1:epochs
    for (x, y) in loader
        loss, grads = Flux.withgradient(model) do m
            y_hat = m(x)
            Flux.mse(y_hat, y)
        end
        Flux.update!(optim, model, grads[1])
        push!(losses, loss) 
    end
end

## Checking out-of-sample results

In [322]:
out_of_sample_predictions = model(test_input_data |> gpu) |> cpu;

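**Note:** The predictions in the plots below are shifted by a constant offset of $-0.021$, which is arbitrary for now and needs further investigation. One candidate explanation is the final `softmax` layer: it forces the 47 outputs to sum to 1, so their mean is fixed at $1/47 \approx 0.0213$.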
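
One way to start investigating the offset is to measure it instead of reading it off the plots. The sketch below uses toy arrays (`target_toy` and `prediction_toy` are hypothetical stand-ins with the same layout as `test_target` and `out_of_sample_predictions`: rows are time samples, columns are configurations) and computes the overall MSE together with the mean residual per time sample.

```julia
using Statistics

# Toy stand-ins with hypothetical values; predictions are shifted by a constant.
target_toy     = [0.10 0.20; 0.30 0.40; 0.50 0.60]
prediction_toy = target_toy .+ 0.02

residual = prediction_toy .- target_toy

mse_toy          = mean(abs2, residual)            # mean squared error over all entries
offset_per_slice = vec(mean(residual, dims=2))     # mean residual for each time sample
offset_overall   = mean(residual)                  # single constant offset estimate
```

If the offset is really constant, `offset_per_slice` should be roughly flat across time samples; a clear trend would point to a different kind of mismatch.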

In [334]:
l = @layout [a b c; d e f; g h i]
offset = 2.1e-2  # constant offset, see the note above

# Draw nine randomly chosen test configurations, one per panel of the 3×3 grid
panels = map(1:9) do _
    c = rand(1:500)
    p = scatter(test_target[:,c], label="Actual")
    scatter!(p, out_of_sample_predictions[:,c] .- offset, label="Prediction", legend=:top)
    p
end

plot(panels..., layout = l, size=(1200,1000), dpi=1000, markerstrokewidth = 0)
savefig(joinpath(path_plot, "neural_network_test.pdf"))

"/Users/lukasgeyer/Studium/Computational Sciences/Masterarbeit/Tool Allesandro/repo/LmaPredict/plots/neural_network_test.pdf"