Here we try two methods for computing the K matrix.

<br><br>
In the first, everything is fixed in advance and there are only a few parameters.
<br>
The second handles much more complex models.
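
For reference, the quantity both methods compute can be written down explicitly. This is read off from the code below and assumes that K is the Gram matrix of parameter gradients (often called the empirical neural tangent kernel). The symbols $f$, $\theta$ and $x_i$ are notation introduced here, with $f$ the first output component of the model and $\theta$ all weights and biases collected into one vector:

```latex
$$
K_{ij}
  = \left\langle \nabla_\theta f(x_i;\theta),\ \nabla_\theta f(x_j;\theta) \right\rangle
  = \sum_{p} \frac{\partial f(x_i;\theta)}{\partial \theta_p}\,
             \frac{\partial f(x_j;\theta)}{\partial \theta_p}
$$
```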

In [1]:
# Setup
using Flux
using Zygote
using MLDatasets
using LinearAlgebra

In [2]:
### Method 1 - Everything fixed in advance

model = Chain(Dense(2 => 2), Dense(2 => 1)) # f(x) = W_2 * (W_1 * x + b_1) + b_2, with W_1: 2x2, b_1: 2x1, W_2: 1x2, b_2: 1

x1 = Float32[0.5852378, 0.62436277] # random datapoint
x2 = Float32[0.0976659, 0.55464536] # random datapoint

# Define the parameters
W1 = Flux.params(model)[1]  # W_1
b1 = Flux.params(model)[2]  # b_1
W2 = Flux.params(model)[3]  # W_2
b2 = Flux.params(model)[4]  # b_2

# Overwrite the parameter values in place
W1 .= ones(2,2)  # any fixed matrix can be used here
b1 .= [1,1]

W2 .= ones(1,2)
b2 .= 1

# Partial derivatives, computed separately for each datapoint (by automatic differentiation)
gs_x1 = Flux.gradient(() -> model(x1)[1], Flux.params(model))   # all partial derivatives for x1
gs_x2 = Flux.gradient(() -> model(x2)[1], Flux.params(model))   # all partial derivatives for x2

grads_x1 = []
grads_x2 = []

# Collect all partial derivatives (*)
for i = 1:length(Flux.params(model))
    push!(grads_x1, gs_x1[Flux.params(model)[i]])
    push!(grads_x2, gs_x2[Flux.params(model)[i]])
end

K1_1 = dot(grads_x1, grads_x1)
K1_2 = dot(grads_x1, grads_x2)

K2_1 = dot(grads_x2, grads_x1)
K2_2 = dot(grads_x2, grads_x2)

K = [K1_1 K1_2 ; K2_1 K2_2]

2×2 Matrix{Float32}:
 14.2293  11.1088
 11.1088   9.09461

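(*) Note that we have to fetch them this way, by indexing with each parameter, because gs_x1.grads also contains a ":(Main.x1)" entry.
<br><br>
:(Main.x1) appears to be the entry Zygote records for the global variable x1 used inside the closure. It is not a model parameter and would distort the calculations that follow.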
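
Because the model in Method 1 is linear and every parameter is set to one, the gradients can also be written out by hand, which gives a check that does not depend on Flux. The sketch below assumes f(x) = W2*(W1*x + b1) + b2 as in the cell above. `manual_grad` is a hypothetical helper introduced here and is not part of NN.jl.

```julia
using LinearAlgebra

# Datapoints and parameter values copied from the Method 1 cell
x1 = Float32[0.5852378, 0.62436277]
x2 = Float32[0.0976659, 0.55464536]
W1 = ones(Float32, 2, 2); b1 = ones(Float32, 2)
W2 = ones(Float32, 1, 2); b2 = ones(Float32, 1)

# Hand-derived gradient of f(x) = W2*(W1*x + b1) + b2 with respect to
# all parameters, flattened into a single vector (hypothetical helper).
function manual_grad(x)
    h   = W1 * x .+ b1          # hidden activations; equals ∂f/∂W2
    dW1 = vec(W2) * x'          # ∂f/∂W1[i,j] = W2[1,i] * x[j]
    db1 = vec(W2)               # ∂f/∂b1[i]   = W2[1,i]
    db2 = ones(Float32, 1)      # ∂f/∂b2      = 1
    return vcat(vec(dW1), db1, h, db2)
end

g1 = manual_grad(x1)
g2 = manual_grad(x2)
K_manual = [dot(g1, g1) dot(g1, g2);
            dot(g2, g1) dot(g2, g2)]
```

Under these assumptions `K_manual` should agree with the matrix K printed above to the displayed precision.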

In [5]:
# Check whether there are any negative eigenvalues
eigval = eigen(K).values
findall(x-> x<0, eigval)

Int64[]
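
An empty result is what one would expect here. Each K[i,j] is a dot product of two gradient vectors, so K = J*J' where row i of J is the flattened gradient for datapoint i, and therefore v'*K*v = norm(J'*v)^2 ≥ 0 for every v. In floating point, however, an eigenvalue that is exactly zero can come out as a tiny negative number, so a check with a tolerance may be more robust. The following is a minimal sketch. `negative_eigenvalues`, its `rtol` default and the matrix `J` are made up for illustration.

```julia
using LinearAlgebra

# Hypothetical helper: indices of eigenvalues that are negative beyond rounding error.
function negative_eigenvalues(K; rtol = sqrt(eps(float(eltype(K)))))
    vals = eigvals(Symmetric(K))        # real eigenvalues, sorted in ascending order
    tol  = rtol * maximum(abs, vals)    # tolerance scaled by the largest eigenvalue
    return findall(v -> v < -tol, vals)
end

# Made-up "gradient" rows; the third row is the sum of the first two,
# so the resulting Gram matrix is singular (one eigenvalue is zero up to rounding).
J = [1.0 2.0 0.0 1.0;
     0.5 1.0 1.0 0.0;
     1.5 3.0 1.0 1.0]
K_demo = J * J'

neg = negative_eigenvalues(K_demo)
```

Wrapping the matrix in `Symmetric` tells `eigvals` that the input is symmetric, so the returned eigenvalues are real.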

Now we try this with our NN.jl struct. <br>
Only the relevant part of the struct is included here; the full version is on GitHub along with everything else.

In [6]:
### Method 2 - with NN.jl

struct NN
    model::Any
    opt         # optimization method, so far only GD and ADAM
    lr          # learning rate
end

"""
Loading the MNIST dataset.
10 classes of digits from 0 to 9,
each with 28x28 pixel dimensions.
X: Grayscale vector, Y: Correct label.
"""
function load_MNIST()

    X_training, Y_training = MNIST(split = :train)[:]
    X_testing, Y_testing = MNIST(split = :test)[:]
    X_training = Flux.flatten(X_training)
    X_testing = Flux.flatten(X_testing)
    Y_training = Flux.onehotbatch(Y_training, 0:9)
    Y_testing = Flux.onehotbatch(Y_testing, 0:9)
    return X_training, Y_training, X_testing, Y_testing
end

"""
This function computes the "Kernel" of a given NN
for the first n MNIST training datapoints.
"""
function kernel(nn::NN, n=60000)
    x = load_MNIST()[1]         # training data, one datapoint per column
    K = zeros(n, n)             # initialize the kernel matrix
    model = nn.model
    ps = Flux.params(model)     # collect the trainable parameters once

    # Calculate all gradients of the first output component
    gs_raw = []
    for i = 1:n
        xi = x[:, i]            # current datapoint
        push!(gs_raw, Flux.gradient(() -> model(xi)[1], ps))
    end

    # Collect the numerical values, one array per parameter
    gs = []
    for i = 1:n
        gs_i = []
        for p in ps
            push!(gs_i, gs_raw[i][p])
        end
        push!(gs, gs_i)
    end

    # Evaluate each K[i,j]
    for i = 1:n
        for j = 1:n
            K[i, j] = dot(gs[i], gs[j])
        end
    end

    return K
end

"""
A 3-layer model using 60 nodes in the inner layers.
Using the sigmoid activation function.
"""
function model_3LS()

    m_3LS = Chain(
        Dense(28*28, 60, sigmoid), # Input Layer -> Hidden Layer 1
        Dense(60, 60, sigmoid), # Hidden Layer 1 -> Hidden Layer 2
        Dense(60, 10, sigmoid), # Hidden Layer 2 -> Output Layer
        softmax      
    )
    return m_3LS
end

model_3LS (generic function with 1 method)
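
The double loop in `kernel` takes a dot product for every pair of gradient lists. An alternative, sketched here under the assumption that all flattened gradients fit in memory, is to stack them as columns of one matrix J and form the whole kernel as J'*J. `flatten_grads`, `gram_from_grads` and the sample gradients are hypothetical and only stand in for the `gs` list built inside `kernel`.

```julia
using LinearAlgebra

# Hypothetical helper: flatten one datapoint's per-parameter gradients into a vector.
flatten_grads(gs_i) = reduce(vcat, [vec(g) for g in gs_i])

# Hypothetical helper: Gram matrix from a list of per-datapoint gradient lists.
function gram_from_grads(gs)
    J = reduce(hcat, [flatten_grads(gs_i) for gs_i in gs])  # column i = gradient for datapoint i
    return J' * J                                           # entry (i,j) is the dot product of columns i and j
end

# Made-up gradients for two datapoints of a tiny model
# with one 2x2 weight matrix and one length-2 bias.
gs_demo = [
    [[1.0 2.0; 3.0 4.0], [0.5, 0.5]],
    [[0.0 1.0; 1.0 0.0], [1.0, -1.0]],
]
K_gram = gram_from_grads(gs_demo)
```

Either way, the default n = 60000 would require a 60000×60000 Float64 matrix, which is roughly 28.8 GB, so in practice K is computed for a much smaller n.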

Note that the model is not trained at all here.

In [7]:
MODEL  = model_3LS()
OPT    = "ADAM"
LR     = 0.001

myNN = NN(MODEL, OPT, LR)

n = 10
K = kernel(myNN, n) # computes K for n datapoints

10×10 Matrix{Float64}:
 0.00947103  0.00940689  0.00915153  …  0.00942464  0.00905087  0.00911172
 0.00940689  0.0096843   0.00924237     0.00946756  0.00910301  0.00918444
 0.00915153  0.00924237  0.00931792     0.00927798  0.00896479  0.00908634
 0.00922145  0.00929199  0.00913178     0.00930656  0.00902537  0.00908154
 0.00895091  0.00905947  0.0088799      0.00904118  0.00880793  0.00884354
 0.00902889  0.00911931  0.00894956  …  0.00914539  0.00880236  0.00891403
 0.00914494  0.00918067  0.00903991     0.00919183  0.00904376  0.00898523
 0.00942464  0.00946756  0.00927798     0.00967041  0.00908114  0.00920417
 0.00905087  0.00910301  0.00896479     0.00908114  0.00898571  0.00891116
 0.00911172  0.00918444  0.00908634     0.00920417  0.00891116  0.0091529

In [8]:
# Check whether there are any negative eigenvalues
eigval = eigen(K).values
findall(x-> x<0, eigval)

Int64[]