# Subspace Inference

Bayesian inference methods are used to generate uncertainty information in deep neural networks (DNNs). However, using Bayesian inference in a deep neural network is challenging because of the large dimension of the parameter space. The subspace inference method addresses this by generating uncertainty information from a low-dimensional subspace of the DNN parameter space.

The SubspaceInference package is implemented based on:

Izmailov, P., Maddox, W. J., Kirichenko, P., Garipov, T., Vetrov, D., & Wilson, A. G. (2020, August). Subspace inference for Bayesian deep learning. In Uncertainty in Artificial Intelligence (pp. 1169-1179). PMLR.

### This notebook contains uncertainty generation for a simple multilayer perceptron
The DNN has 2 inputs and 1 output. The hidden layer sizes are [200, 50, 50, 50], matching the model defined below.

Start using subspace inference in Julia:

In [None]:
# load the required packages
using IJulia
IJulia.installkernel("Julia nodeps", "--depwarn=no")
using NPZ
using Plots
using Flux
using Flux: Data.DataLoader
using Flux: @epochs
using BSON: @save
using BSON: @load
using Zygote
using Statistics
using SubspaceInference

### Set root of project folder
This folder contains the data and the trained networks.

In [None]:
root = pwd();
cd(root);

### Plot Data
The loaded data contains two columns: the first is taken as <em>x</em> and the second as <em>y</em>. <em>x</em> is converted to features using the <em>features</em> function. The features and targets are then batched using the <em>DataLoader</em> available in <em>Flux</em>.

In [None]:
# load data
data_ld = npzread("data.npy");
x, y = (data_ld[:, 1]', data_ld[:, 2]');
function features(x)
    return vcat(x./2, (x./2).^2)
end

f = features(x);
data =  DataLoader(f,y, batchsize=50, shuffle=true);

#plot data
scatter(data_ld[:,1],data_ld[:,2],color=["red"], title="Dataset", legend=false)
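
As a small illustration of what <em>features</em> produces, the sketch below applies the same function to a hand-made 1×3 row instead of the dataset (the input values are made up for the example). Each input column becomes a 2-element feature column holding `x/2` and `(x/2)^2`, which is why the network has 2 inputs.

```julia
# Same feature map as in the notebook, written as a one-liner
features(x) = vcat(x ./ 2, (x ./ 2) .^ 2)

x_demo = [2.0 4.0 6.0]      # hypothetical 1×3 row of inputs
f_demo = features(x_demo)   # 2×3 matrix: first row x/2, second row (x/2)^2

println(size(f_demo))       # number of features × number of samples
```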

### DNN Model setup
A simple multilayer perceptron is created using <em>Dense</em> layers. This DNN has 2 inputs, 1 output and hidden layers of size [200, 50, 50, 50]. All layers other than the output layer use the ***ReLU*** activation function.

In [None]:
m = Chain(
		Dense(2,200,Flux.relu), 
		Dense(200,50,Flux.relu),
		Dense(50,50,Flux.relu),
		Dense(50,50,Flux.relu),
		Dense(50,1),
	)

The model is destructured to extract a flat weight vector and a function that rebuilds the model from such a vector, as below:

In [None]:
θ, re = Flux.destructure(m);
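
To make the roles of the two return values concrete, here is a minimal sketch on a much smaller model than the one above (the tiny architecture and the names `m_demo`, `θ_demo`, `re_demo` are chosen only for illustration). It uses the same positional `Dense(in, out, activation)` syntax as this notebook, which newer Flux versions may flag as deprecated.

```julia
using Flux

# Hypothetical tiny model: 2 inputs -> 3 hidden units -> 1 output
m_demo = Chain(Dense(2, 3, Flux.relu), Dense(3, 1))

# θ_demo is a flat vector of all weights and biases,
# re_demo rebuilds a model with the same architecture from such a vector
θ_demo, re_demo = Flux.destructure(m_demo)
println(length(θ_demo))     # (2*3 + 3) + (3*1 + 1) parameters

# Rebuilding from the unchanged vector reproduces the model's predictions
x_demo = rand(Float32, 2, 4)
m_rebuilt = re_demo(θ_demo)
println(m_rebuilt(x_demo) ≈ m_demo(x_demo))
```

This is the mechanism used later in the notebook, where each column of sampled weights is passed to `re` to obtain a network.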

### Cost function
A Gaussian likelihood cost function is implemented for training as:

In [None]:
L(x, y) = Flux.Losses.mse(m(x), y)/2;
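
The link between this cost and a Gaussian likelihood can be sketched as follows. It assumes (the notebook does not state this) that each target equals the network output $f_\theta(x_i)$ plus Gaussian noise with unit variance. Under that assumption the negative log-likelihood of a batch of $N$ points is

$$
-\log p(y \mid x, \theta) = \frac{1}{2} \sum_{i=1}^{N} \left( y_i - f_\theta(x_i) \right)^2 + \frac{N}{2} \log(2\pi).
$$

Dropping the constant term and dividing by $N$ gives

$$
L(x, y) = \frac{1}{2N} \sum_{i=1}^{N} \left( y_i - f_\theta(x_i) \right)^2,
$$

which is half the mean squared error, the quantity computed by `Flux.Losses.mse(m(x), y)/2`.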

The parameters are loaded as:

In [None]:
ps = Flux.params(m);

### Optimizer
The optimizer used in this project is stochastic gradient descent with momentum (learning rate 0.01, momentum 0.95):

In [None]:
opt = Momentum(0.01, 0.95);
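
For intuition about what the two numbers control, the sketch below runs the classical momentum update by hand on a toy one-dimensional problem, $f(\theta) = \theta^2/2$. It is a plain-Julia illustration under that assumption, not Flux's internal code; the function name `momentum_descent` and the step count are made up for the example.

```julia
# Classical momentum update on f(θ) = θ^2 / 2, whose gradient is θ itself.
# η is the learning rate and ρ the momentum coefficient.
function momentum_descent(θ0; η = 0.01, ρ = 0.95, steps = 2000)
    θ, v = θ0, 0.0
    for _ in 1:steps
        g = θ                # gradient of θ^2 / 2 at the current point
        v = ρ * v - η * g    # velocity: decayed history minus a scaled gradient
        θ += v               # move along the velocity
    end
    return θ
end

θ_final = momentum_descent(5.0)
println(abs(θ_final) < 1e-6)   # the iterate ends up very close to the minimum at 0
```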

### Pretrain weights and save
The DNN needs to be pretrained and saved for subspace inference as below. NB: training takes a little time. The package examples contain some trained weights.

In [None]:
# epochs = 3000
# for j in 1:5
#    m = Chain(
#            Dense(2,200,Flux.relu),
#            Dense(200,50,Flux.relu),
#            Dense(50,50,Flux.relu),
#            Dense(50,50,Flux.relu),
#            Dense(50,1),
#    )
#    ps = Flux.params(m)
#    SubspaceInference.pretrain(epochs, L, ps, data, opt, lr_init =0.01, print_freq= 100)
#    @save "model_weights_$(j).bson" ps
# end

### Plot different SGD solutions

In [None]:
z = collect(range(-10.0, 10.0,length = 100))
inp = features(z')
trajectories = Array{Float64}(undef,100,5)
for i in 1:5
	@load "model_weights_$(i).bson" ps
	Flux.loadparams!(m, ps)
	out = m(inp)
	trajectories[:, i] = out'
end
SubspaceInference.plot_predictive(data_ld, trajectories, z, title="SGD Solutions")


### Load pretrained weights
The pretrained weights can be found in the examples folder.

In [None]:
i = 1;
@load "model_weights_$(i).bson" ps;
Flux.loadparams!(m, ps);

### Generate uncertainty of weights
The <em>weight_uncertainty</em> function from the ***SubspaceInference*** package is used to generate the uncertainty of the parameter space.

In [None]:
M = 10 # rank of PCA, i.e. the maximum number of columns in the deviation matrix
T = 10 # steps
itr = 100
all_chain = SubspaceInference.weight_uncertainty(m, L, data, opt, itr = itr, T = T, M = M)
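
The idea of a PCA subspace built from a deviation matrix can be sketched in plain Julia as below. This is a hedged illustration of the general construction described in the cited paper, not the package's actual implementation: the random `snapshots` matrix stands in for weight vectors collected during training, and the sizes `d`, `n_snap`, `k` are hypothetical.

```julia
using LinearAlgebra, Statistics, Random

Random.seed!(1)

d, n_snap = 20, 5                   # hypothetical: 20 weights, 5 collected snapshots
snapshots = randn(d, n_snap)        # stand-in for weight vectors along the SGD trajectory

θ_mean = vec(mean(snapshots, dims = 2))   # average of the collected weights
A = snapshots .- θ_mean                   # deviation matrix (d × n_snap)

# PCA via SVD: the leading left singular vectors, scaled by their
# singular values, span the low-dimensional subspace
F = svd(A)
k = 3                                     # hypothetical subspace dimension
P = F.U[:, 1:k] .* F.S[1:k]'              # projection matrix (d × k)

# A point z in the subspace maps back to a full weight vector
z = randn(k)
θ_sample = θ_mean + P * z
println(size(P), " ", length(θ_sample))
```

Inference then only has to explore the `k`-dimensional vector `z` rather than all `d` weights, and each sampled `z` yields a full weight vector that can be turned back into a network.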


### Plot uncertainty using different trajectories generated using PCA with NUTS

In [None]:
z = collect(range(-10.0, 10.0,length = 100))
inp = features(z')
trajectories = Array{Float64}(undef,100,itr)
for i in 1:itr
	m1 = re(all_chain[:,i])
	out = m1(inp)
	trajectories[:, i] = out'
end
SubspaceInference.plot_predictive(data_ld, trajectories, z, title="With PCA and NUTS")
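
A matrix of trajectories like the one above can also be summarized numerically. The sketch below computes a pointwise predictive mean and a two-standard-deviation band; it uses a small synthetic matrix in place of the real `trajectories` (the values and the names `traj_demo`, `μ`, `σ` are made up), with rows as input locations and columns as sampled networks.

```julia
using Statistics

# Synthetic stand-in: 4 input locations (rows) × 3 sampled networks (columns).
# Every column is the same curve shifted by 0.1, 0.2 and 0.3.
traj_demo = [sin(zi) + 0.1 * j for zi in range(-1, 1, length = 4), j in 1:3]

μ = vec(mean(traj_demo, dims = 2))   # predictive mean at each input location
σ = vec(std(traj_demo, dims = 2))    # spread across the sampled networks

lower = μ .- 2 .* σ                  # approximate uncertainty band
upper = μ .+ 2 .* σ
println(round.(σ, digits = 3))
```

With the notebook's data, replacing `traj_demo` by `trajectories` would give one mean and one band value per entry of `z`.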