# Regression Experiments

In this notebook, we perform the regression experiments shown in Section 7.5 of the paper "Signatures, Lipschitz-free spaces, and paths of persistence diagrams" (C. Giusti and D. Lee).  

In this experiment, we generate simulated data from a model of collective motion (the 3D D'Orsogna model), which relies on two parameters $C$ and $\ell$. For each simulation, we compute persistent homology at each time step, and use the resulting path of persistence diagrams to perform parameter estimation. This is done by computing the Gram matrices and using kernel support vector regression.  

The precise steps are as follows.

0. **Setup and Parameters**: Create relevant directories and set parameters for the experiment.  

1. **Generate Swarm Data**: Solve the differential equations which govern the 3D D'Orsogna model to generate swarms as time-varying point clouds. This only needs to be computed once for all experiments.  

2. **Compute (Subsampled) Persistent Homology**: Compute persistent homology for each point cloud to obtain time-varying persistence diagrams. Furthermore, we consider experiments with missing data, where we subsample a collection of agents at each time step. We also compute the time-varying persistence diagrams for these cases.  

3. **Compute Features**: We must first apply feature maps to the persistence diagrams. Here, we compute persistence paths and persistence moments. Persistence landscapes and images are computed in the Python notebook.  

4. **Compute Kernels**: We compute signature kernels for all of the features, applying between 0 and 2 lags in the sliding window embedding. We also compute the Euclidean kernel for the CROCKER plots.  

5. **Perform Regression**: Perform regression while optimizing the hyperparameters via cross-validation.  



## 0. Setup and Parameters


In [1]:
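# Import libraries
using MAT
using Statistics
# The packages below are used later in this notebook. They are loaded explicitly
# here in case the included files do not already load them.
using LinearAlgebra          # dot
using Random                 # randperm, MersenneTwister
using Distributions          # Uniform
using DifferentialEquations  # ODEProblem, solve
using Eirene                 # eirene, barcode
include("compute_features.jl")
include("PathSignatures.jl")
include("regression_utils.jl")
include("utils.jl")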

SW_MMD_kernel (generic function with 6 methods)

In [None]:
## Only need to change these parameters

# Agent subsample number (in the paper, we use 200 and 50)
ss = 200

## Heterogeneous experiments
# Only run ss=50/mixed=true after both ss=200/mixed=false and ss=50/mixed=false have been computed
mixed = false

In [2]:
# Directory names
swarm_fpath = "data/swarm/" # Directory for the swarm data
base_fpath = string("data/ss", ss, "/") # Base directory for this experiment
temp_subsample = ["init100/", "init50/", "init20/", "random50/", "random20/"]

# Simulation parameters
alpha = 1.0 # This is alpha - gamma in the Nguyen paper
gamma = 1.0
beta = 0.5
Ca = 1
la = 1
m = 1
N = 200 # Number of agents in one simulation
temp = 0.0

numT = 100 # Number of time points in simulation
numRun = 500 # Number of simulations
endT = 200.0 # End time of simulation
tspan = (0,endT)
trange = range(0, stop=endT, length=numT) # Discrete time points of simulation

boundedlimit = 40.0

# Log-discretization for scale parameter (for betti curves)
tp = 10 .^(range(-4, stop=0, length=100))

# Number of temporal subsamples
numT_subsample = [100, 50, 20, 50, 20]

# Feature and Kernel Parameters
moment_level = 2
perspath_level = 2
signature_level = 3
lags = [0,1,2]

# Regression parameters
num_iterations = 100
tr_split = 0.8 # training split
hyp_cv = 4 # number of folds in cross-validation to do hyperparameter optimization

SVR_C = 10. .^(-3:1:1) # SVR C values to optimize over
SVR_eps = 10. .^(-4:1:0) # SVR epsilon values to optimize over

5-element Vector{Int64}:
 100
  50
  20
  50
  20
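As a quick sanity check on the grids defined in this cell, the sketch below rebuilds them in isolation and enumerates the (C, epsilon) pairs that a full grid search would visit. The name `grid` is illustrative and does not appear in the notebook. Note the float base `10.` in the hyperparameter grids: an integer base raised to a negative integer power is an error in Julia.

```julia
# Log-spaced scale grid for the Betti curves: 100 points from 1e-4 to 1
tp = 10 .^ (range(-4, stop=0, length=100))

# Hyperparameter grids. A Float64 base is needed here because an integer base
# with a negative integer exponent throws a DomainError.
SVR_C = 10. .^ (-3:1:1)     # 0.001, 0.01, 0.1, 1, 10
SVR_eps = 10. .^ (-4:1:0)   # 0.0001, 0.001, 0.01, 0.1, 1

# Every (C, epsilon) pair in the search space (illustrative name)
grid = [(C, ep) for C in SVR_C for ep in SVR_eps]
println(length(grid))  # 25
```

With five values per hyperparameter, each cross-validation round evaluates 25 candidate pairs.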

In [3]:
# Create directories (mkpath also creates missing parent directories
# and does nothing if the directory already exists)
mkpath(swarm_fpath)
for tdir in temp_subsample
    for sub in ["PD/", "FT/", "KE/", "RG/"]
        mkpath(string(base_fpath, tdir, sub))
    end
end


## 1. Generate Swarm Data

**NOTE**: This only needs to be computed once. All subsequent experiments (with agent or temporal subsampling) are done with this dataset.

In [None]:
# Store all parameters in a dictionary
paramDict = Dict{String, Float64}()
paramDict["alpha"] = alpha
paramDict["beta"] = beta
paramDict["Ca"] = Ca
paramDict["la"] = la
paramDict["m"] = m
paramDict["N"] = N
paramDict["temp"] = temp
paramDict["numT"] = numT
paramDict["numRun"] = numRun
paramDict["endT"] = endT
paramDict["boundedlimit"] = boundedlimit

# Compute everything ######################################
PP = Array{Array{Float64, 3},1}(undef, numRun)

CL = zeros(numRun,2)

for j = 1:numRun
    u0 = rand(Uniform(-1,1),6*N)

    for i = 1:N
        u0[6*(i-1)+4:6*(i-1)+6] = randn(3).+1
    end
    
    cur_maxpos = 100
    curP = zeros(3,N,numT)
    curV = zeros(3,N,numT)
    C=0
    l=0
    
    # Run new simulations until we get a bounded phenotype
    while cur_maxpos > boundedlimit
        C = rand(Uniform(0.1,2))
        l = rand(Uniform(0.1,2))
        cur_params = [alpha; beta; C; Ca; l; la; m; N; temp]
        prob = ODEProblem(dorsogna3d, u0, tspan, cur_params)
        sol = solve(prob)
        sol_interp = hcat(sol(trange).u...)

        for n = 1:N
            curP[1,n,:] = sol_interp[6*(n-1)+1,:]
            curP[2,n,:] = sol_interp[6*(n-1)+2,:]
            curP[3,n,:] = sol_interp[6*(n-1)+3,:]
            curV[1,n,:] = sol_interp[6*(n-1)+4,:]
            curV[2,n,:] = sol_interp[6*(n-1)+5,:]
            curV[3,n,:] = sol_interp[6*(n-1)+6,:]
        end
        
        cur_maxpos = maximum(abs.(curP[:,:,1:100]))
    end
    
    CL[j,1] = C
    CL[j,2] = l
    PP[j] = curP
    println(string("Completed simulation ", j, "/", numRun))
    sleep(0.1)
end

# Save swarm simulation data
fname = string(swarm_fpath,"swarm3d_data.mat")
ofile = matopen(fname, "w")
write(ofile, "PP", PP)
write(ofile, "paramDict", paramDict)
close(ofile)

fname = string(swarm_fpath,"CL_data.mat")
ofile = matopen(fname, "w")
write(ofile, "CL", CL)
close(ofile)
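The simulation cell stores each agent's state in a flat vector of length `6*N`: entries `6*(n-1)+1` to `6*(n-1)+3` hold the position of agent `n`, and the next three hold its velocity. The sketch below uses toy sizes and a fake solution matrix (both are assumptions for illustration) to show that the loop-based unpacking agrees with a single `reshape`.

```julia
N = 4      # toy number of agents (the notebook uses N = 200)
numT = 3   # toy number of time points

# Fake interpolated solution: one column per time point, 6N rows
sol_interp = reshape(collect(1.0:6*N*numT), 6*N, numT)

# Loop-based unpacking, following the layout used in the simulation cell
curP = zeros(3, N, numT)
curV = zeros(3, N, numT)
for n = 1:N
    curP[:, n, :] = sol_interp[6*(n-1)+1:6*(n-1)+3, :]
    curV[:, n, :] = sol_interp[6*(n-1)+4:6*(n-1)+6, :]
end

# Equivalent reshape: rows 1:3 are positions, rows 4:6 are velocities
S = reshape(sol_interp, 6, N, numT)
println(S[1:3, :, :] == curP && S[4:6, :, :] == curV)  # true
```

The reshape works because Julia arrays are column-major, so the six state entries of each agent are contiguous within a column.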

## 2. Compute (Subsampled) Persistent Homology

In [None]:
## Compute base persistence diagrams (unnormalized)

# Load swarm data
fname = string(swarm_fpath,"swarm3d_data.mat")
file = matopen(fname,"r")
PP = read(file, "PP")
close(file)

# Initialize arrays for (unnormalized) persistence data
# B0, B1, B2: persistence diagrams (birth, death)
# BE: Betti curve
B0 = Array{Array{Array{Float64, 2}, 1},1}(undef, numRun) 
B1 = Array{Array{Array{Float64, 2}, 1},1}(undef, numRun)
B2 = Array{Array{Array{Float64, 2}, 1},1}(undef, numRun)
BE = Array{Array{Array{Float64, 2}, 1},1}(undef, numRun)

# Compute unnormalized persistence
for j = 1:numRun

    # Get current swarm simulation run
    curP = PP[j]

    # Initialize current arrays for persistence data
    curB0 = Array{Array{Float64, 2},1}(undef, numT)
    curB1 = Array{Array{Float64, 2},1}(undef, numT)
    curB2 = Array{Array{Float64, 2},1}(undef, numT)
    curBE = Array{Array{Float64, 2},1}(undef, numT)
    
    for t = 1:numT

        curP_ss = curP[:,:,t]

        # Compute persistence
        C = eirene(curP_ss, model="pc", maxdim=2)
        curB0[t] = barcode(C, dim=0)
        curB0[t] = curB0[t][1:end-1,:]
        curB1[t] = barcode(C, dim=1)
        curB2[t] = barcode(C, dim=2)
        
        # Compute Betti curves
        curB = [curB0[t], curB1[t], curB2[t]]
        curBE[t] = betti_embedding(curB, tp)
    end

    B0[j] = curB0
    B1[j] = curB1
    B2[j] = curB2
    BE[j] = curBE
end

# Write data to the location that the next cell reads from
base_PD_fpath = "data/ss200/PD/"
mkpath(base_PD_fpath)
ofname = string(base_PD_fpath, "PD.mat")
ofile = matopen(ofname, "w")
write(ofile, "B0", B0)
write(ofile, "B1", B1)
write(ofile, "B2", B2)
write(ofile, "BE", BE)
close(ofile)
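A Betti curve records, for each scale, how many bars of a barcode are alive. The helper below is a hypothetical stand-in written for illustration; it is not the notebook's `betti_embedding`, whose implementation lives in the included files and may differ (for example in how it treats interval endpoints or stacks dimensions).

```julia
# Hypothetical helper: Betti curve of one barcode on a grid of scales.
# `bars` is an n×2 matrix of (birth, death) rows; a bar counts as alive
# at scale s when birth <= s < death.
function betti_curve(bars::AbstractMatrix{<:Real}, scales::AbstractVector{<:Real})
    return [count(i -> bars[i, 1] <= s < bars[i, 2], 1:size(bars, 1)) for s in scales]
end

bars = [0.0 0.5;
        0.0 1.0;
        0.2 0.3]
scales = [0.1, 0.25, 0.6, 1.5]
println(betti_curve(bars, scales))  # [2, 3, 1, 0]
```

Evaluating such a curve on the log-spaced grid `tp` for dimensions 0, 1, and 2 would give one fixed-length vector per dimension at each time step.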

In [4]:
# Read base persistence diagram
file = matopen("data/ss200/PD/PD.mat","r")
B0 = read(file, "B0")
B1 = read(file, "B1")
B2 = read(file, "B2")
BE = read(file, "BE")
close(file)

In [5]:
# Generate persistence diagrams for all temporal subsamples
# Also compute the normalized persistence diagrams for the mixed case

# Scaling factor (relative to total = 200 points)
sf = ss/float(N)

# Each temporal subsample keeps numT_cur of the 100 simulated time points:
# "init*" keeps the first numT_cur, "random*" keeps a sorted random subset
# (seeded by the run index so that it is reproducible).
for (tdir, numT_cur) in zip(temp_subsample, numT_subsample)
    nB0 = Array{Array{Array{Float64, 2}, 1},1}(undef, numRun)
    nB1 = Array{Array{Array{Float64, 2}, 1},1}(undef, numRun)
    nB2 = Array{Array{Array{Float64, 2}, 1},1}(undef, numRun)
    nBE = Array{Array{Array{Float64, 2}, 1},1}(undef, numRun)
    nNBE = Array{Array{Array{Float64, 2}, 1},1}(undef, numRun)

    use_random = startswith(tdir, "random")

    for i = 1:numRun
        subTP = use_random ? sort(randperm(MersenneTwister(i),100)[1:numT_cur]) : (1:numT_cur)
        nB0[i] = B0[i][subTP]
        nB1[i] = B1[i][subTP]
        nB2[i] = B2[i][subTP]
        nBE[i] = BE[i][subTP]
    end

    # Generate normalized BE
    for i = 1:numRun
        curNBE = Array{Array{Float64, 2},1}(undef, numT_cur)
        for t = 1:numT_cur
            curB = [nB0[i][t], nB1[i][t], nB2[i][t]]
            curNBE[t] = betti_embedding(curB.*(sf^(1/3)), tp)/sf
        end
        nNBE[i] = curNBE
    end

    PD_fname = string(base_fpath, tdir, "PD/PD.mat")
    file = matopen(PD_fname, "w")
    write(file, "B0", nB0)
    write(file, "B1", nB1)
    write(file, "B2", nB2)
    write(file, "BE", nBE)
    write(file, "NBE", nNBE)
    close(file)

    PDG_fname = string(base_fpath, tdir, "PD/PDG.mat")
    pd_to_giotto_mat(PD_fname, PDG_fname, 1, numT_cur)
    PDGN_fname = string(base_fpath, tdir, "PD/PDGN.mat")
    pd_to_giotto_mat(PD_fname, PDGN_fname, 1, numT_cur, sf)
end

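The `random50/` and `random20/` subsamples draw a random subset of the 100 time indices for each run and sort it so that time order is preserved. The sketch below isolates that step; the function name `random_times` is illustrative and not part of the notebook.

```julia
using Random

# Keep `numT` of `total` time indices at random, sorted to preserve time order.
# Seeding with the run index makes the choice reproducible for each run.
random_times(run, numT; total=100) = sort(randperm(MersenneTwister(run), total)[1:numT])

subTP = random_times(1, 20)
println(length(subTP), " ", issorted(subTP), " ", subTP == random_times(1, 20))  # 20 true true
```

Because the seed depends only on the run index, rerunning the cell reproduces the same missing time points for every run.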


## 3. Compute Julia Features (Moments and PersPath)

Compute the Python features (persistence landscapes and images) in the other notebook before moving on.

In [6]:
for tdir in temp_subsample
    PD_fpath = string(base_fpath, tdir, "PD/")
    FT_fpath = string(base_fpath, tdir, "FT/")

    file = matopen(string(PD_fpath, "PD.mat"), "r")
    B0 = read(file, "B0")
    B1 = read(file, "B1")
    B2 = read(file, "B2")
    BE = read(file, "BE")
    close(file)

    compute_PD_moments(B0, B1, B2, moment_level, FT_fpath)
    compute_PD_perspath(BE, perspath_level, FT_fpath)
end

## 4. Compute Kernels

In [7]:
for tdir in temp_subsample

    FT_fpath = string(base_fpath, tdir, "FT/")
    KE_fpath = string(base_fpath, tdir, "KE/")
    all_feat = readdir(FT_fpath)
    numfeat = length(all_feat)

    for i = 1:numfeat
        cur_feat = all_feat[i]
        feat_name = split(cur_feat,".")[1]
        feat_type = split(cur_feat,"_")[1]

        file = matopen(string(FT_fpath, cur_feat),"r")
        FT = read(file, "FT")
        close(file)

        if feat_type == "PL" || feat_type == "LPL" || feat_type == "PI" || feat_type == "LPI"
            numRun = size(FT)[1]
            
            FTrs = Array{Array{Float64, 2}, 1}(undef, numRun)
            for j = 1:numRun
                FTrs[j] = FT[j,:,:]
            end
            
            FT = FTrs
        end

        numT = size(FT[1])[1]
        numC = size(FT[1])[2]

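        for l in lags
            lagp = l+1
            # Sliding window embedding: block k holds the features delayed by k-1
            # time steps, zero-padded at the start. The inner loop variables are
            # named r and k so that they do not shadow the feature index i and the lag l.
            FT_lag = Array{Array{Float64, 2}, 1}(undef, numRun)
            for r = 1:numRun
                curFT = zeros(numT, numC*lagp)

                for k = 1:lagp
                    curFT[k:end, (k-1)*numC+1:k*numC] = FT[r][1:end-(k-1),:]
                end

                FT_lag[r] = curFT
            end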

            K = dsignature_kernel_matrix(FT_lag, [], signature_level, "R")

            fname = string(KE_fpath, feat_name, "_S", signature_level, "_L", l, ".mat")
            file = matopen(fname, "w")
            write(file, "K", K)
            close(file)
        end
    end
end
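The lag step in the kernel cell can be read in isolation as a zero-padded sliding window embedding. The sketch below mirrors that inner loop as a standalone function; the name `add_lags` is illustrative and not part of the notebook.

```julia
# Zero-padded lag embedding of a path X (rows = time, columns = channels).
# Block k holds X delayed by k-1 steps; the first k-1 rows of that block stay zero.
function add_lags(X::AbstractMatrix{<:Real}, lagp::Integer)
    numT, numC = size(X)
    Y = zeros(numT, numC * lagp)
    for k = 1:lagp
        Y[k:end, (k-1)*numC+1:k*numC] = X[1:end-(k-1), :]
    end
    return Y
end

X = reshape(1.0:4.0, 4, 1)   # one-channel path 1, 2, 3, 4
println(add_lags(X, 3))      # rows: [1 0 0], [2 1 0], [3 2 1], [4 3 2]
```

With `lagp = 1` (zero lags) the embedding returns the path unchanged, so the `l = 0` kernels are the plain signature kernels of the features.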

In [8]:
## CROCKER

numT_CRKR = 20 # number of time points in crocker plot
numE = 100 # number of epsilon points in betti curve

numdim_CRKR = numT_CRKR*numE*3
ndim_block = numE*3

for (t_ind, tdir) in enumerate(temp_subsample)
    numT = numT_subsample[t_ind]
    PD_fpath = string(base_fpath, tdir, "PD/PD.mat")
    KE_fpath = string(base_fpath, tdir, "KE/")

    file = matopen(PD_fpath, "r")
    BE = read(file, "BE")
    close(file)

    # Further subsample the time axis to reduce dimensionality
    tp_CRKR = Int.(round.(collect(range(numT/numT_CRKR, stop=numT ,length=numT_CRKR))))

    CRKR = zeros(numRun, numdim_CRKR)

    for i = 1:numRun
        for t = 1:numT_CRKR
            CRKR[i,(t-1)*ndim_block+1:t*ndim_block] = BE[i][tp_CRKR[t]][:]
        end
    end

    K = zeros(numRun, numRun)

    for i = 1:numRun
        for j = i:numRun
            K[i,j] = dot(CRKR[i,:], CRKR[j,:])
            
            if i!=j
                K[j,i] = K[i,j]
            end
        end
    end

    fname = string(KE_fpath, "CRKR.mat")
    file = matopen(fname, "w")
    write(file, "K", K)
    close(file)
end
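The Euclidean kernel for the CROCKER plots is the Gram matrix of inner products between flattened plots, so the double loop can also be written as a single matrix product. The sketch below checks this on toy random data (the sizes are assumptions for illustration) and shows the time indices selected when a run has 100 time points.

```julia
using LinearAlgebra

numRun, numdim = 5, 12          # toy sizes (the notebook uses 500 runs)
CRKR = rand(numRun, numdim)     # one flattened CROCKER plot per row

# Loop version, filling the upper triangle and mirroring it
K_loop = zeros(numRun, numRun)
for i = 1:numRun, j = i:numRun
    K_loop[i, j] = dot(CRKR[i, :], CRKR[j, :])
    K_loop[j, i] = K_loop[i, j]
end

# Equivalent matrix product
K = CRKR * CRKR'
println(isapprox(K, K_loop))  # true

# Time indices kept for the CROCKER plot when numT = 100 and numT_CRKR = 20
tp_CRKR = Int.(round.(range(100/20, stop=100, length=20)))
println(tp_CRKR == collect(5:5:100))  # true
```

For 500 runs the matrix product avoids 125,250 separate `dot` calls and the row copies they make.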


## 5. Perform Regression

Regression output is stored in `RG_fpath`.

In [9]:
# Hyperparameter search over lags too

num_iterations = 100
tr_split = 0.8
hyp_cv = 4

SVR_C = 10. .^(-3:1:1)
SVR_eps = 10. .^(-4:1:0)

file = matopen(string(swarm_fpath, "CL_data.mat"), "r") # same path used when saving CL in Section 1
CL = read(file, "CL")
close(file)

all_reg_mean = []
all_reg_std = []

for tdir in temp_subsample
    FT_fpath = string(base_fpath, tdir, "FT/")
    KE_fpath = string(base_fpath, tdir, "KE/")
    RG_fpath = string(base_fpath, tdir, "RG/")

    all_K = readdir(FT_fpath)
    push!(all_K, "CRKR.mat")
    numK = length(all_K)

    reg_mean = zeros(numK,2)
    reg_std = zeros(numK, 2)

    for i = 1:numK
        cur_K = all_K[i]
        K_name = split(cur_K,".")[1]
        RG_fname = string(RG_fpath, cur_K)

        if isfile(RG_fname)
            file = matopen(RG_fname, "r")
            reg_error = read(file, "reg_error")
            close(file)

            reg_mean[i,:] = mean(reg_error, dims=1)
            reg_std[i,:] = std(reg_error, dims=1)
            println(string("Loading ", RG_fname))
        else
            println(string("Computing ", RG_fname))
            if cur_K == "CRKR.mat"
                file = matopen(string(KE_fpath, cur_K),"r")
                K = read(file, "K")
                close(file)

                reg_error, SVR_params = run_regression(K, K, CL, num_iterations, SVR_C, SVR_eps, hyp_cv, tr_split)
            else
                K_all = []
                for l in lags
                    file = matopen(string(KE_fpath, K_name, "_S", signature_level, "_L", l, ".mat"),"r")
                    K = read(file, "K")
                    close(file)
                    push!(K_all, K)
                end

                reg_error, SVR_params = run_regression_multikernel(K_all, K_all, CL, num_iterations, SVR_C, SVR_eps, hyp_cv, tr_split)
            end
            
            fname = string(RG_fpath, cur_K)
            file = matopen(fname, "w")
            write(file, "reg_error", reg_error)
            write(file, "SVR_params", SVR_params)
            close(file)

            reg_mean[i,:] = mean(reg_error, dims=1)
            reg_std[i,:] = std(reg_error, dims=1)
        end
    end
    push!(all_reg_mean, reg_mean)
    push!(all_reg_std, reg_std)
end
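`run_regression` and `run_regression_multikernel` are defined in the included files and are not shown here. To illustrate how a precomputed Gram matrix is sliced under a train/test split, the sketch below swaps in kernel ridge regression. This is not the support vector regression used in the experiments, but it consumes the same train-train and test-train kernel blocks. The toy data, the linear kernel, and all variable names are assumptions for illustration.

```julia
using LinearAlgebra, Random

rng = MersenneTwister(0)
n, tr_split = 40, 0.8

# Toy data with a linear kernel; the targets are an exact linear function of X
X = randn(rng, n, 3)
y = X * [1.0, -2.0, 0.5]
K = X * X'

# Random train/test split of the sample indices
perm = randperm(rng, n)
ntr = round(Int, tr_split * n)
tr, te = perm[1:ntr], perm[ntr+1:end]

# Fit on the train-train block, predict with the test-train block
lambda = 1e-6
coef = (K[tr, tr] + lambda * I) \ y[tr]
pred = K[te, tr] * coef

println(maximum(abs.(pred - y[te])) < 1e-4)  # true
```

The key point is that test predictions need only the kernel values between test and training samples, which is why the full Gram matrix can be computed once and reused across all splits and cross-validation folds.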


Computing data/ss200/init100/RG/LPI_03_10.mat
Computing data/ss200/init100/RG/LPL_5_10.mat
Computing data/ss200/init100/RG/MO_2.mat
Computing data/ss200/init100/RG/PA_2.mat
Computing data/ss200/init100/RG/CRKR.mat
Computing data/ss200/init50/RG/LPI_03_10.mat
Computing data/ss200/init50/RG/LPL_5_10.mat
Computing data/ss200/init50/RG/MO_2.mat
Computing data/ss200/init50/RG/PA_2.mat
Computing data/ss200/init50/RG/CRKR.mat
Computing data/ss200/init20/RG/LPI_03_10.mat
Computing data/ss200/init20/RG/LPL_5_10.mat
Computing data/ss200/init20/RG/MO_2.mat
Computing data/ss200/init20/RG/PA_2.mat
Computing data/ss200/init20/RG/CRKR.mat
Computing data/ss200/random50/RG/LPI_03_10.mat
Computing data/ss200/random50/RG/LPL_5_10.mat
Computing data/ss200/random50/RG/MO_2.mat
Computing data/ss200/random50/RG/PA_2.mat
Computing data/ss200/random50/RG/CRKR.mat
Computing data/ss200/random20/RG/LPI_03_10.mat
Computing data/ss200/random20/RG/LPL_5_10.mat
Computing data/ss200/random20/RG/MO_2.mat
Computing data

In [13]:
all_reg_mean[5]

5×2 Matrix{Float64}:
 0.0225656  0.124158
 0.0333622  0.123827
 0.0282311  0.130941
 0.0607225  0.215561
 0.0324322  0.219933

In [5]:
a = zeros(10)
Threads.@threads for i = 1:10
    a[i] = Threads.threadid()
end

In [6]:
a

10-element Vector{Float64}:
 1.0
 1.0
 2.0
 2.0
 3.0
 4.0
 5.0
 6.0
 7.0
 8.0

In [50]:
for i = 1:numK
    cur_K = all_K[i]
    K_name = split(cur_K,".")[1]

    println(string(K_name, ",   C_error:", reg_mean[i,1], ", L_error:", reg_mean[i,2]))
end

LPL_5_10,   C_error:0.02458420551062546, L_error:0.08149808239342182
MO_2,   C_error:0.024838049571270827, L_error:0.08085823014395054
PA_2,   C_error:0.09452111284562253, L_error:0.29750533532264667
PI_03_10,   C_error:0.04370785941538185, L_error:0.181028121194321
PL_5_10,   C_error:0.04607732944341657, L_error:0.23588418408982967


In [46]:
B = 123

if B == 123
    testingzzzz = 1
end
print(testingzzzz)

1

In [37]:
file = matopen("newKE_data/MMDK_din_2_40.mat","r")
K = read(file, "K")
close(file)

ErrorException: File "newKE_data/MMDK_din_2_40.mat" does not exist and create was not specified

In [48]:
Mreg_error, MSVR_params = run_regression(K, K, CL, 100, SVR_C, SVR_eps, 4, 0.8)

([0.039116456762955595 0.21608558333665737; 0.04215830541442393 0.22462702848163926; … ; 0.04403836321256742 0.19304064623443418; 0.04987211994637922 0.229374195557993], [10.0 10.0; 10.0 10.0; … ; 10.0 10.0; 10.0 10.0;;; 0.0001 0.0001; 0.0001 0.0001; … ; 0.0001 0.0001; 0.0001 0.0001])

In [50]:
mean(Mreg_error, dims=1)

1×2 Matrix{Float64}:
 0.0379933  0.210309

In [51]:
std(Mreg_error,dims=1)

1×2 Matrix{Float64}:
 0.00435456  0.0217293

In [59]:
file = matopen("data/orig/SWD/D_2_40.mat","r")
D = read(file, "D")
close(file)

In [72]:
sigma1 = [0.1, 0.5, 1.0]
sigma2 = [0.1, 0.5, 1.0]

Mreg_mean = zeros(3,3,2)
Mreg_std = zeros(3,3,2)

for i = 1:3
    for j = 1:3
        MMDK = compute_MMD(D,sigma1[i], sigma2[j])
        Mreg_error, MSVR_params = run_regression(MMDK, MMDK, CL, 100, SVR_C, SVR_eps, 4, 0.8)
        Mreg_mean[i,j,:] = mean(Mreg_error, dims=1)
        Mreg_std[i,j,:] = std(Mreg_error, dims=1)
    end
end
    

In [77]:
Mreg_mean[3,:,2]

3-element Vector{Float64}:
 0.11596407263147249
 0.11442928263127414
 0.1151486201256615

In [87]:
MMDK = compute_MMD(D,1.5, 1.0)

500×500 Matrix{Float64}:
 1.0       0.92683   0.620336  0.696988  …  0.643218  0.936368  0.750345
 0.92683   1.0       0.6278    0.705374     0.650957  0.935185  0.759373
 0.620336  0.6278    1.0       0.888658     0.874823  0.62544   0.592214
 0.696988  0.705374  0.888658  1.0          0.898292  0.702722  0.690527
 0.926922  0.952024  0.627777  0.705348     0.650933  0.935884  0.759345
 0.873491  0.883685  0.591529  0.664622  …  0.61335   0.880793  0.715528
 0.949004  0.931392  0.623293  0.70031      0.646284  0.947051  0.753922
 0.851547  0.861792  0.681971  0.778172     0.720929  0.858553  0.803131
 0.625173  0.632695  0.992716  0.888859     0.880392  0.630317  0.596576
 0.693809  0.702157  0.592486  0.697218     0.667823  0.699517  0.795403
 ⋮                                       ⋱                      
 0.767846  0.777084  0.531655  0.600524     0.555678  0.774164  0.727262
 0.694099  0.702449  0.517279  0.591863     0.554393  0.699809  0.810719
 0.654023  0.661892  0.646486  0.7

In [89]:
Mreg_error, MSVR_params = run_regression(MMDK, MMDK, CL, 100, SVR_C, SVR_eps, 4, 0.5)

([0.0276294665407228 0.11350724821960761; 0.025874362953882776 0.12339223208949657; … ; 0.0258865851184854 0.11814825917361532; 0.03010687683282638 0.12291254346866592], [10.0 10.0; 10.0 10.0; … ; 10.0 10.0; 10.0 10.0;;; 0.1 0.01; 0.01 0.0001; … ; 0.001 0.01; 0.0001 0.01])

In [90]:
mean(Mreg_error, dims=1)

1×2 Matrix{Float64}:
 0.0269334  0.118519