# Deep Universal Regular Conditional Expectations

---
This notebook implements the universal deep neural model $\mathcal{NN}_{1_{\mathbb{R}^n},\mathcal{D}}^{\sigma:\star}$, by [Anastasis Kratsios](https://people.math.ethz.ch/~kratsioa/) (2021).

---

## What does this code do?
1. Learn a heteroskedastic non-linear regression problem:
     - $Y\sim f_{\text{unknown}}(x) + \epsilon$, where $f_{\text{unknown}}$ is an unknown function and $\epsilon\sim \operatorname{Laplace}(0,\|x\|)$.
2. Learn the law of a random Bayesian network:
    - $Y = W_J Y^{J-1}, \qquad Y^{j}\triangleq \sigma\bullet A^{j}Y^{j-1} + b^{j}, \qquad Y^0\triangleq x$

3. In the example above, if $A^{j} = M^{j}\odot \tilde{A}^{j}$, where $\tilde{A}^{j}$ is a deterministic matrix, $M^{j}$ is a "mask" (a random matrix with binary entries), and $\odot$ is the Hadamard product, then we recover the dropout framework.
4. Learn the probability that the unique strong solution to the rough SDE with uniformly Lipschitz coefficients $\alpha$ and $\beta$, driven by a fractional Brownian motion with Hurst exponent $H \in [\frac1{2},1)$:
$$
X_t^x = x + \int_0^t \alpha(s,X_s^x)ds + \int_0^t \beta(s,X_s^x)dB_s^H
$$
belongs, at time $t=1$, to a ball about the initial point $x$ whose random radius is an independent exponential random variable with shape parameter $\lambda=2$.
5. Train a DNN to predict the returns of Bitcoin with gradient descent (GD). Because the initialization is random, each prediction at a given $x$ is stochastic. We learn the distribution of this random variable, conditioned on $x$ in the input space:
$$
Y_x \triangleq \hat{f}_{\theta_{T}}(x), \qquad \theta_{t+1}\triangleq \theta_{t} - \lambda \sum_{x \in \mathbb{X}} \nabla_{\theta}\|\hat{f}_{\theta_t}(x) - f(x)\|, \qquad \theta_0 \sim N_d(0,1);
$$
where $T\in \mathbb{N}$ is a fixed number of "SGD" iterations (typically identified by cross-validation on a single SGD trajectory for a single initialization), $\theta \in \mathbb{R}^{(d_{J}+1)+\sum_{j=0}^{J-1} (d_{j+1}d_j + 1)}$, and $d_j$ is the dimension of the "bias" vector $b^{j}$ of the $j$-th layer of the DNN, whose layers are given by:
$$
\hat{f}_{\theta}(x)\triangleq A^{J}x^{(J)} + b^{J},\qquad x^{(j+1)}\triangleq \sigma\bullet A^{j}x^{(j)} + b^{j},\qquad x^{(0)}\triangleq x.
$$
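
As a minimal sketch of the first setting in the list above, the snippet below draws Monte-Carlo samples of $Y$ given a fixed input $x$. The function `f_hypothetical` and every other name here are placeholders chosen for illustration; the notebook's own simulator lives in `Data_Simulator_and_Parser.py` and may differ.

```python
import numpy as np

def f_hypothetical(x):
    # Stand-in for the unknown regression function (an assumption, not the notebook's f)
    return np.sin(x).sum()

def sample_heteroskedastic(x, n_samples, rng):
    """Draw n_samples of Y = f(x) + eps, with eps ~ Laplace(0, ||x||)."""
    scale = np.linalg.norm(x)  # the noise scale grows with the norm of the input
    return f_hypothetical(x) + rng.laplace(loc=0.0, scale=scale, size=n_samples)

rng = np.random.default_rng(0)
x = np.ones(4)  # here ||x|| = 2
samples = sample_heteroskedastic(x, 10_000, rng)
print(samples.shape, np.median(samples), f_hypothetical(x))
```

Because the Laplace noise is centered at zero, the sample median should sit close to `f_hypothetical(x)`, while the spread of the samples depends on $\|x\|$.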

#### Mode:
Software/Hardware Testing or Real-Deal?

In [1]:
trial_run = True

### Simulation Method:

In [2]:
# Heteroskedastic non-linear regression
# f_unknown_mode = "Heteroskedastic_NonLinear_Regression"

# Random DNN internal noise
#f_unknown_mode = "DNN_with_Random_Weights"
Depth_Bayesian_DNN = 1
width = 5

# Random Dropout applied to trained DNN
f_unknown_mode = "DNN_with_Bayesian_Dropout"
Dropout_rate = 0.1

# GD with Randomized Input
# f_unknown_mode = "GD_with_randomized_input"
GD_epochs = 50

# SDE with fractional Driver
# f_unknown_mode = "Rough_SDE"
N_Euler_Steps = 10**2
Hurst_Exponent = 0.75

#f_unknown_mode = "Rough_SDE_Vanilla"
## Define Process' dynamics in (2) cell(s) below.

#### Vanilla fractional SDE:
If `f_unknown_mode == "Rough_SDE_Vanilla"` is selected, we can specify the process's dynamics.

In [3]:
#--------------------------#
# Define Process' Dynamics #
#--------------------------#
drift_constant = 0.1
volatility_constant = 0.01

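import numpy as np  # needed by f_unknown_vol_vanilla below

# Drift: linear in the state, alpha(x) = drift_constant * x
def f_unknown_drift_vanilla(x):
    return drift_constant*x

# Volatility: constant diagonal matrix, beta(x) = volatility_constant * I
# (problem_dim is set in the "Problem Dimension" cell below and is read at call time)
def f_unknown_vol_vanilla(x):
    return volatility_constant*np.eye(problem_dim)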
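
To make the role of the drift and volatility functions concrete, here is a hedged sketch of how a terminal value $X_1^x$ could be simulated with an explicit Euler scheme. The fractional Brownian increments are sampled by a Cholesky factorization of the fractional Gaussian noise covariance. The function names `fbm_increments` and `euler_terminal_value` are hypothetical, and the notebook's actual simulator may use a different discretization.

```python
import numpy as np

def fbm_increments(n_steps, hurst, dim, rng, horizon=1.0):
    """Increments of `dim` independent fBm paths on a uniform grid of [0, horizon]."""
    dt = horizon / n_steps
    lags = np.arange(n_steps)
    # Autocovariance of unit-step fractional Gaussian noise at each lag
    gamma = 0.5 * ((lags + 1.0) ** (2 * hurst)
                   - 2.0 * lags ** (2 * hurst)
                   + np.abs(lags - 1.0) ** (2 * hurst))
    cov = gamma[np.abs(lags[:, None] - lags[None, :])]  # Toeplitz covariance matrix
    chol = np.linalg.cholesky(cov)
    noise = rng.standard_normal((n_steps, dim))
    return dt ** hurst * (chol @ noise)  # rescale unit-step noise to step size dt

def euler_terminal_value(x0, drift, vol, n_steps, hurst, rng):
    """Explicit Euler approximation of X_1 for the SDE started at x0."""
    x = np.asarray(x0, dtype=float)
    d_fbm = fbm_increments(n_steps, hurst, x.shape[0], rng)
    dt = 1.0 / n_steps
    for i in range(n_steps):
        x = x + drift(x) * dt + vol(x) @ d_fbm[i]
    return x

# Same constants as the cell above, on a small 3-dimensional example
dim = 3
drift = lambda x: 0.1 * x
vol = lambda x: 0.01 * np.eye(dim)
x1 = euler_terminal_value(np.ones(dim), drift, vol, n_steps=100, hurst=0.75,
                          rng=np.random.default_rng(0))
print(x1)
```

The Cholesky approach costs $O(n^3)$ in the number of steps, which is acceptable for `N_Euler_Steps = 10**2` but would need replacing for much finer grids.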

## Problem Dimension

In [4]:
problem_dim = 100

## Note: *Why is the procedure so computationally efficient?*
---
 - The sample barycenters do not require us to solve for any new Wasserstein-1 barycenters, which would be much more computationally costly.
 - Our training procedure never back-propagates through $\mathcal{W}_1$, since steps 2 and 3 are fully decoupled. Training our deep classifier is therefore (comparatively) cheap, since it takes values in the standard $N$-simplex.
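
One way to read the second point, sketched under assumptions: if the classifier outputs weights in the simplex over a fixed collection of empirical measures, then a prediction is just the corresponding mixture, and no optimal-transport problem has to be solved to produce it. The names `atoms`, `softmax` and `sample_mixture` below are hypothetical placeholders, not objects from the notebook's backend.

```python
import numpy as np

def softmax(logits):
    """Map arbitrary real logits to a point of the probability simplex."""
    shifted = logits - logits.max()  # subtract the max for numerical stability
    w = np.exp(shifted)
    return w / w.sum()

def sample_mixture(weights, atoms, n_samples, rng):
    """Sample from sum_n weights[n] * (empirical measure of atoms[n])."""
    which = rng.choice(len(atoms), size=n_samples, p=weights)  # pick a component
    return np.array([rng.choice(atoms[n]) for n in which])     # then one of its samples

rng = np.random.default_rng(0)
# Three fixed empirical measures, each stored as a 1-D array of samples
atoms = [rng.normal(loc=m, scale=0.1, size=50) for m in (-1.0, 0.0, 1.0)]
weights = softmax(np.array([2.0, 0.0, -2.0]))  # stand-in for a classifier's output
draws = sample_mixture(weights, atoms, 1000, rng)
print(weights, draws.mean())
```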

---

#### Grid Hyperparameter(s)
- Ratio $\frac{\text{Testing Datasize}}{\text{Training Datasize}}$.
- Number of Training Points to Generate

In [5]:
train_test_ratio = .2
N_train_size = 10**2

Monte-Carlo Parameters

In [6]:
## Monte-Carlo
N_Monte_Carlo_Samples = 10**2

Initial radius of the $\delta$-bounded random partition of $\mathcal{X}$.

In [7]:
# Hyper-parameters of Cover
delta = 0.1
Proportion_per_cluster = .75
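
For intuition about what a $\delta$-bounded random partition looks like, here is a generic construction: visit the points in random order and let each still-unassigned point claim every unassigned point within distance `delta`. This is an illustrative sketch only; the function name `random_ball_partition` is hypothetical, and how the notebook uses `Proportion_per_cluster` is not modeled here.

```python
import numpy as np

def random_ball_partition(X, delta, rng):
    """Assign each row of X to a part; every part lies in a ball of radius delta."""
    n = X.shape[0]
    labels = np.full(n, -1)
    centers = []
    for i in rng.permutation(n):  # visit candidate centers in random order
        if labels[i] != -1:
            continue              # already covered by an earlier ball
        dist = np.linalg.norm(X - X[i], axis=1)
        members = (dist <= delta) & (labels == -1)
        labels[members] = len(centers)
        centers.append(i)
    return labels, np.array(centers)

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
labels, centers = random_ball_partition(X, delta=0.1, rng=rng)
print(len(centers), "parts")
```

Every point ends up in exactly one part, and each part has diameter at most $2\delta$ because all of its members lie within `delta` of the part's center.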

## Dependencies and Auxiliary Script(s)

In [8]:
# %run Loader.ipynb
exec(open('Loader.py').read())
# Load Packages/Modules
exec(open('Init_Dump.py').read())
import time  # <- Not sure why, but it always seems to need its own special loading

Using TensorFlow backend.


Deep Feature Builder - Ready
Deep Classifier - Ready
Deep Feature Builder - Ready


# Simulate or Parse Data

In [9]:
# %run Data_Simulator_and_Parser.ipynb
exec(open('Data_Simulator_and_Parser.py').read())

 16%|█▌        | 16/100 [00:00<00:00, 155.32it/s]

---------------------------------------
Beginning Data-Parsing/Simulation Phase
---------------------------------------
Deciding on Which Simulator/Parser To Load
Setting/Defining: Internal Parameters
Deciding on Which Type of Data to Get/Simulate
Simulating Output Data for given input data


100%|██████████| 100/100 [00:00<00:00, 141.56it/s]
100%|██████████| 20/20 [00:00<00:00, 139.60it/s]

----------------------------------
Done Data-Parsing/Simulation Phase
----------------------------------





#### Scale Data
This is especially important to avoid exploding gradient problems when training the ML-models.

In [10]:
scaler = StandardScaler()
scaler.fit(X_train)

X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
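
The cell above fits the scaler on the training inputs only and reuses those statistics for the test inputs. The NumPy sketch below spells out the same arithmetic on synthetic data; the arrays `X_train_raw` and `X_test_raw` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X_train_raw = rng.normal(loc=5.0, scale=3.0, size=(100, 4))
X_test_raw = rng.normal(loc=5.0, scale=3.0, size=(20, 4))

mean = X_train_raw.mean(axis=0)
std = X_train_raw.std(axis=0)  # population std (ddof=0), as StandardScaler uses
std[std == 0.0] = 1.0          # leave constant columns unscaled

X_train_std = (X_train_raw - mean) / std
X_test_std = (X_test_raw - mean) / std  # reuse training statistics, never refit on test data

print(X_train_std.mean(axis=0).round(6), X_train_std.std(axis=0).round(6))
```

Fitting on the training split alone keeps information about the test inputs from leaking into the preprocessing step.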

# Run Main:

In [11]:
print("------------------------------")
print("Running script for main model!")
print("------------------------------")
# %run Universal_Measure_Valued_Networks_Backend.ipynb
exec(open('Universal_Measure_Valued_Networks_Backend.py').read())

print("------------------------------------")
print("Done: Running script for main model!")
print("------------------------------------")

------------------------------
Running script for main model!
------------------------------


100%|██████████| 75/75 [00:00<00:00, 10636.44it/s]

Deep Feature Builder - Ready
Deep Classifier - Ready
Training Classifer Portion of Type-A Model
Fitting 2 folds for each of 1 candidates, totalling 2 fits



[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    8.2s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    8.2s finished


Epoch 1/200
...
Epoch 200/200


  0%|          | 0/100 [00:00<?, ?it/s]

Training Classifer Portion of Type Model: Done!
#--------------------#
 Get Training Error(s)
#--------------------#


100%|██████████| 100/100 [00:00<00:00, 270.61it/s]
100%|██████████| 20/20 [00:00<00:00, 379.69it/s]

#-------------------------#
 Get Training Error(s): END
#-------------------------#
#----------------#
 Get Test Error(s)
#----------------#
#------------------------#
 Get Testing Error(s): END
#------------------------#
                                        DNM  MC-Oracle
W1-95L                             0.000003   0.000000
W1                                 0.000003   0.000000
W1-95R                             0.000003   0.000000
M-95L                              0.000006   0.000006
M                                  0.000006   0.000006
M-95R                              0.000006   0.000006
N_Par                          75475.000000   0.000000
Train_Time                        15.494996   0.856403
Test_Time/MC-Oracle_Test_Time      0.693893   1.000000
------------------------------------
Done: Running script for main model!
------------------------------------





---
# Run: All Benchmarks

## 1) *Pointmass Benchmark(s)*
These benchmarks consist of subsets of $C(\mathbb{R}^d,\mathbb{R})$ which we lift to models in $C(\mathbb{R}^d,\cap_{1\leq q<\infty}\mathcal{P}_{q}(\mathbb{R}))$ via:
$$
\mathbb{R}^d \ni x \to f(x) \to \delta_{f(x)}\in \cap_{1\leq q<\infty}\mathcal{P}_{q}(\mathbb{R}).
$$
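
For a lifted point-mass prediction, the Wasserstein-1 distance to an empirical measure has a simple closed form, since the only coupling with a point mass sends every sample to that point: $\mathcal{W}_1\big(\delta_{a}, \tfrac1N\sum_{i=1}^N \delta_{y_i}\big) = \tfrac1N\sum_{i=1}^N |y_i - a|$. The helper below is a small sketch of that formula; its name is hypothetical and it is not taken from `Evaluation.py`.

```python
import numpy as np

def w1_pointmass_vs_samples(prediction, samples):
    """Wasserstein-1 distance between delta_{prediction} and the empirical measure of samples."""
    return np.mean(np.abs(np.asarray(samples, dtype=float) - prediction))

samples = np.array([0.0, 1.0, 2.0, 5.0])
result = w1_pointmass_vs_samples(2.0, samples)
print(result)  # (2 + 1 + 0 + 3) / 4 = 1.5
```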

In [12]:
exec(open('CV_Grid.py').read())
# Notebook Mode:
# %run Evaluation.ipynb
# %run Benchmarks_Model_Builder_Pointmass_Based.ipynb
# Terminal Mode (Default):
exec(open('Evaluation.py').read())
exec(open('Benchmarks_Model_Builder_Pointmass_Based.py').read())

Deep Feature Builder - Ready
--------------
Training: ENET
--------------


100%|██████████| 100/100 [00:00<00:00, 1088.90it/s]
100%|██████████| 20/20 [00:00<00:00, 1241.18it/s]

---------------------
Training: ENET - Done
---------------------
#------------#
 Get Error(s) 
#------------#
#-----------------#
 Get Error(s): END 
#-----------------#
#------------#
 Get Error(s) 
#------------#
#-----------------#
 Get Error(s): END 
#-----------------#
Updated DataFrame
                                        DNM  MC-Oracle          ENET
W1-95L                             0.000003   0.000000  1.256779e-06
W1                                 0.000003   0.000000  1.256779e-06
W1-95R                             0.000003   0.000000  1.256779e-06
M-95L                              0.000006   0.000006  1.121061e-03
M                                  0.000006   0.000006  1.121061e-03
M-95R                              0.000006   0.000006  1.121061e-03
N_Par                          75475.000000   0.000000  2.000000e+02
Train_Time                        15.494996   0.856403  1.620057e+09
Test_Time/MC-Oracle_Test_Time      0.693893   1.000000  4.080329e-03
----------------


[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=4)]: Done   1 tasks      | elapsed:    0.1s
[Parallel(n_jobs=4)]: Batch computation too fast (0.0731s.) Setting batch_size=2.
[Parallel(n_jobs=4)]: Done   2 out of   4 | elapsed:    0.1s remaining:    0.1s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:    0.5s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:    0.5s finished
100%|██████████| 100/100 [00:00<00:00, 1364.72it/s]
100%|██████████| 20/20 [00:00<00:00, 2144.11it/s]
[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.


#------------#
 Get Error(s) 
#------------#
#-----------------#
 Get Error(s): END 
#-----------------#
#------------#
 Get Error(s) 
#------------#
#-----------------#
 Get Error(s): END 
#-----------------#
Updated DataFrame
                                        DNM  MC-Oracle          ENET    KRidge
W1-95L                             0.000003   0.000000  1.256779e-06  0.000067
W1                                 0.000003   0.000000  1.256779e-06  0.000108
W1-95R                             0.000003   0.000000  1.256779e-06  0.000159
M-95L                              0.000006   0.000006  1.121061e-03  0.005942
M                                  0.000006   0.000006  1.121061e-03  0.008630
M-95R                              0.000006   0.000006  1.121061e-03  0.012027
N_Par                          75475.000000   0.000000  2.000000e+02  0.000000
Train_Time                        15.494996   0.856403  1.620057e+09  0.619309
Test_Time/MC-Oracle_Test_Time      0.693893   1.000000  4.080

[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    2.6s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    2.6s finished
100%|██████████| 100/100 [00:00<00:00, 3296.40it/s]
100%|██████████| 20/20 [00:00<00:00, 1807.23it/s]

#------------#
 Get Error(s) 
#------------#
#-----------------#
 Get Error(s): END 
#-----------------#
#------------#
 Get Error(s) 
#------------#
#-----------------#
 Get Error(s): END 
#-----------------#
Updated DataFrame
                                        DNM  MC-Oracle          ENET  \
W1-95L                             0.000003   0.000000  1.256779e-06   
W1                                 0.000003   0.000000  1.256779e-06   
W1-95R                             0.000003   0.000000  1.256779e-06   
M-95L                              0.000006   0.000006  1.121061e-03   
M                                  0.000006   0.000006  1.121061e-03   
M-95R                              0.000006   0.000006  1.121061e-03   
N_Par                          75475.000000   0.000000  2.000000e+02   
Train_Time                        15.494996   0.856403  1.620057e+09   
Test_Time/MC-Oracle_Test_Time      0.693893   1.000000  4.080329e-03   

                                 KRidge          GB


[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    7.6s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    7.6s finished


Epoch 1/200
...
Epoch 200/200


100%|██████████| 100/100 [00:00<00:00, 3019.35it/s]
100%|██████████| 20/20 [00:00<00:00, 2011.56it/s]

#------------#
 Get Error(s) 
#------------#
#-----------------#
 Get Error(s): END 
#-----------------#
#------------#
 Get Error(s) 
#------------#
#-----------------#
 Get Error(s): END 
#-----------------#
Updated DataFrame
                                        DNM  MC-Oracle          ENET  \
W1-95L                             0.000003   0.000000  1.256779e-06   
W1                                 0.000003   0.000000  1.256779e-06   
W1-95R                             0.000003   0.000000  1.256779e-06   
M-95L                              0.000006   0.000006  1.121061e-03   
M                                  0.000006   0.000006  1.121061e-03   
M-95R                              0.000006   0.000006  1.121061e-03   
N_Par                          75475.000000   0.000000  2.000000e+02   
Train_Time                        15.494996   0.856403  1.620057e+09   
Test_Time/MC-Oracle_Test_Time      0.693893   1.000000  4.080329e-03   

                                 KRidge          GB




# Summary of Point-Mass Regression Models

#### Training Model Facts

In [13]:
print(Summary_pred_Qual_models)
Summary_pred_Qual_models

                                        DNM  MC-Oracle          ENET  \
W1-95L                             0.000003   0.000000  1.256779e-06   
W1                                 0.000468   0.000000  4.656555e-04   
W1-95R                             0.001399   0.000000  1.858852e-03   
M-95L                              0.000006   0.000006  1.121061e-03   
M                                  0.001124   0.001124  2.219702e-03   
M-95R                              0.004478   0.003365  4.416982e-03   
N_Par                          75475.000000   0.000000  2.000000e+02   
Train_Time                        15.494996   0.856403  1.620057e+09   
Test_Time/MC-Oracle_Test_Time      0.693893   1.000000  4.080329e-03   

                                 KRidge          GBRF           DNN  
W1-95L                         0.000067  1.091707e-05      0.000003  
W1                             0.000387  3.457320e-04      0.000366  
W1-95R                         0.001073  1.382058e-03      0.001078  

| Metric | DNM | MC-Oracle | ENET | KRidge | GBRF | DNN |
|---|---|---|---|---|---|---|
| W1-95L | 3e-06 | 0.0 | 1.256779e-06 | 6.7e-05 | 1.091707e-05 | 3e-06 |
| W1 | 0.000468 | 0.0 | 0.0004656555 | 0.000387 | 0.000345732 | 0.000366 |
| W1-95R | 0.001399 | 0.0 | 0.001858852 | 0.001073 | 0.001382058 | 0.001078 |
| M-95L | 6e-06 | 6e-06 | 0.001121061 | 0.005942 | 0.001155688 | 0.000767 |
| M | 0.001124 | 0.001124 | 0.002219702 | 0.00863 | 0.002923135 | 0.001463 |
| M-95R | 0.004478 | 0.003365 | 0.004416982 | 0.012027 | 0.00415504 | 0.00235 |
| N_Par | 75475.0 | 0.0 | 200.0 | 0.0 | 1914800.0 | 60601.0 |
| Train_Time | 15.494996 | 0.856403 | 1620057000.0 | 0.619309 | 8.803717 | 15.56482 |
| Test_Time/MC-Oracle_Test_Time | 0.693893 | 1.0 | 0.004080329 | 0.005195 | 0.01120074 | 0.525559 |


#### Testing Model Facts

In [14]:
print(Summary_pred_Qual_models_test)
Summary_pred_Qual_models_test

                                        DNM  MC-Oracle          ENET  \
W1-95L                             0.000003   0.000000  1.256779e-06   
W1                                 0.000003   0.000000  1.256779e-06   
W1-95R                             0.000003   0.000000  1.256779e-06   
M-95L                              0.000006   0.000006  1.121061e-03   
M                                  0.000006   0.000006  1.121061e-03   
M-95R                              0.000006   0.000006  1.121061e-03   
N_Par                          75475.000000   0.000000  2.000000e+02   
Train_Time                        15.494996   0.856403  1.620057e+09   
Test_Time/MC-Oracle_Test_Time      0.693893   1.000000  4.080329e-03   

                                 KRidge          GBRF           DNN  
W1-95L                         0.000067  1.091707e-05  4.836634e-07  
W1                             0.000108  2.286405e-05  8.677407e-07  
W1-95R                         0.000159  4.213852e-05  1.214113e-06  

| Metric | DNM | MC-Oracle | ENET | KRidge | GBRF | DNN |
|---|---|---|---|---|---|---|
| W1-95L | 3e-06 | 0.0 | 1.256779e-06 | 6.7e-05 | 1.091707e-05 | 4.836634e-07 |
| W1 | 3e-06 | 0.0 | 1.256779e-06 | 0.000108 | 2.286405e-05 | 8.677407e-07 |
| W1-95R | 3e-06 | 0.0 | 1.256779e-06 | 0.000159 | 4.213852e-05 | 1.214113e-06 |
| M-95L | 6e-06 | 6e-06 | 0.001121061 | 0.005942 | 0.001155688 | 0.0005343878 |
| M | 6e-06 | 6e-06 | 0.001121061 | 0.00863 | 0.002923135 | 0.0007681484 |
| M-95R | 6e-06 | 6e-06 | 0.001121061 | 0.012027 | 0.00415504 | 0.0009839376 |
| N_Par | 75475.0 | 0.0 | 200.0 | 0.0 | 1914800.0 | 60601.0 |
| Train_Time | 15.494996 | 0.856403 | 1620057000.0 | 0.619309 | 8.803717 | 15.56482 |
| Test_Time/MC-Oracle_Test_Time | 0.693893 | 1.0 | 0.004080329 | 0.005195 | 0.01120074 | 0.5255593 |


## 2) *Gaussian Benchmarks*

- Benchmark 1: [Gaussian Process Regressor](https://scikit-learn.org/stable/modules/gaussian_process.html)
- Benchmark 2: Deep Gaussian Networks.

These benchmarks train models that assume Gaussianity.  We may view them as models in $\mathcal{P}_2(\mathbb{R})$ via:
$$
\mathbb{R}^d \ni x \to (\hat{\mu}(x),\hat{\Sigma}(x)\hat{\Sigma}(x)^{\top})\triangleq f(x) \in \mathbb{R}\times [0,\infty) \to 
(2\pi)^{-\frac{d}{2}}\det(\hat{\Sigma}(x))^{-\frac{1}{2}} \, e^{ -\frac{1}{2}(\cdot - \hat{\mu}(x))^{\top} \hat{\Sigma}(x)^{-1}(\cdot - \hat{\mu}(x)) } \mu \in \mathcal{G}_d\subset \mathcal{P}_2(\mathbb{R});
$$
where $\mathcal{G}_1$ is the set of Gaussian measures on $\mathbb{R}$ equipped with the relative Wasserstein-1 topology.

Examples of this type of architecture are especially prevalent in uncertainty quantification; see [Deep Ensembles](https://arxiv.org/abs/1612.01474) or [NOMU: Neural Optimization-based Model Uncertainty](https://arxiv.org/abs/2102.13640).  Moreover, their universality in $C(\mathbb{R}^d,\mathcal{G}_2)$ is known and has been shown in [Corollary 4.7](https://arxiv.org/abs/2101.05390).
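
To compare a Gaussian prediction against Monte-Carlo samples in Wasserstein-1 distance, one simple option in one dimension is to draw the same number of samples from the Gaussian and match sorted values, since the optimal coupling between two equal-size empirical measures on $\mathbb{R}$ pairs order statistics. The sketch below assumes that approach; the helper name `w1_empirical_1d` is hypothetical, and the notebook's own evaluation code may compute the metric differently.

```python
import numpy as np

def w1_empirical_1d(a, b):
    """Wasserstein-1 distance between two equal-size empirical measures on the real line."""
    a, b = np.sort(np.asarray(a, dtype=float)), np.sort(np.asarray(b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("both sample sets must have the same size")
    return np.mean(np.abs(a - b))  # the optimal coupling matches sorted samples

rng = np.random.default_rng(0)
mc_samples = rng.laplace(loc=0.0, scale=1.0, size=500)  # stand-in for oracle samples
# A Gaussian "prediction" with moments matched to the samples
gauss_samples = rng.normal(mc_samples.mean(), mc_samples.std(), size=500)
print(w1_empirical_1d(gauss_samples, mc_samples))
```

Even with matched mean and variance, the distance stays positive here because a Gaussian cannot reproduce the heavier Laplace tails.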

In [15]:
# %run Benchmarks_Model_Builder_Mean_Var.ipynb
exec(open('Benchmarks_Model_Builder_Mean_Var.py').read())

Deep Feature Builder - Ready
Fitting 2 folds for each of 2 candidates, totalling 4 fits


[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=4)]: Done   1 tasks      | elapsed:    0.3s
[Parallel(n_jobs=4)]: Done   2 out of   4 | elapsed:    0.3s remaining:    0.3s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:    0.6s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:    0.6s finished
100%|██████████| 100/100 [00:00<00:00, 1166.39it/s]

Infering Parameters for Deep Gaussian Network to train on!
Done Getting Parameters for Deep Gaussian Network!
Training Deep Gaussian Network!
Fitting 2 folds for each of 1 candidates, totalling 2 fits



[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    9.6s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    9.6s finished


Epoch 1/200
...
Epoch 200/200


  0%|          | 0/100 [00:00<?, ?it/s]

Training Deep Gaussian Network!: END
#---------------------------------------#
 Get Training Errors for: Gaussian Models
#---------------------------------------#


100%|██████████| 100/100 [00:00<00:00, 1740.75it/s]
100%|██████████| 20/20 [00:00<00:00, 1242.52it/s]

#-------------------------#
 Get Training Error(s): END
#-------------------------#
#--------------------------------------#
 Get Testing Errors for: Gaussian Models
#--------------------------------------#
#-------------------------#
 Get Training Error(s): END
#-------------------------#
-------------------------------------------------
Updating Performance Metrics Dataframe and Saved!
-------------------------------------------------
                                        DNM  MC-Oracle          ENET  \
W1-95L                             0.000003   0.000000  1.256779e-06   
W1                                 0.000468   0.000000  4.656555e-04   
W1-95R                             0.001399   0.000000  1.858852e-03   
M-95L                              0.000006   0.000006  1.121061e-03   
M                                  0.001124   0.001124  2.219702e-03   
M-95R                              0.004478   0.003365  4.416982e-03   
N_Par                          75475.000000   0.000000 




In [16]:
print("Prediction Quality (Updated): Test")
print(Summary_pred_Qual_models_test)
Summary_pred_Qual_models_test

Prediction Quality (Updated): Test
                                        DNM  MC-Oracle          ENET  \
W1-95L                             0.000003   0.000000  1.256779e-06   
W1                                 0.000003   0.000000  1.256779e-06   
W1-95R                             0.000003   0.000000  1.256779e-06   
M-95L                              0.000006   0.000006  1.121061e-03   
M                                  0.000006   0.000006  1.121061e-03   
M-95R                              0.000006   0.000006  1.121061e-03   
N_Par                          75475.000000   0.000000  2.000000e+02   
Train_Time                        15.494996   0.856403  1.620057e+09   
Test_Time/MC-Oracle_Test_Time      0.693893   1.000000  4.080329e-03   

                                 KRidge          GBRF           DNN       GPR  \
W1-95L                         0.000067  1.091707e-05  4.836634e-07  0.000113   
W1                             0.000108  2.286405e-05  8.677407e-07  0.000121   
W

| Metric | DNM | MC-Oracle | ENET | KRidge | GBRF | DNN | GPR | DGN |
|---|---|---|---|---|---|---|---|---|
| W1-95L | 3e-06 | 0.0 | 1.256779e-06 | 6.7e-05 | 1.091707e-05 | 4.836634e-07 | 0.000113 | 0.926788 |
| W1 | 3e-06 | 0.0 | 1.256779e-06 | 0.000108 | 2.286405e-05 | 8.677407e-07 | 0.000121 | 0.973821 |
| W1-95R | 3e-06 | 0.0 | 1.256779e-06 | 0.000159 | 4.213852e-05 | 1.214113e-06 | 0.000129 | 1.041654 |
| M-95L | 6e-06 | 6e-06 | 0.001121061 | 0.005942 | 0.001155688 | 0.0005343878 | 0.0 | 0.015234 |
| M | 6e-06 | 6e-06 | 0.001121061 | 0.00863 | 0.002923135 | 0.0007681484 | 0.0 | 0.025191 |
| M-95R | 6e-06 | 6e-06 | 0.001121061 | 0.012027 | 0.00415504 | 0.0009839376 | 0.0 | 0.03718 |
| N_Par | 75475.0 | 0.0 | 200.0 | 0.0 | 1914800.0 | 60601.0 | 0.0 | 60601.0 |
| Train_Time | 15.494996 | 0.856403 | 1620057000.0 | 0.619309 | 8.803717 | 15.56482 | 1.09054 | 17.454785 |
| Test_Time/MC-Oracle_Test_Time | 0.693893 | 1.0 | 0.004080329 | 0.005195 | 0.01120074 | 0.5255593 | 0.072526 | 0.361645 |


In [17]:
print("Prediction Quality (Updated): Train")
print(Summary_pred_Qual_models)
Summary_pred_Qual_models

# Remove W1 estimates from x \mapsto \delta_{f(x)}
Summary_pred_Qual_models.loc[['W1-95L','W1','W1-95R'],['ENET','KRidge','GBRF','DNN']] = "-"

Prediction Quality (Updated): Train
                                        DNM  MC-Oracle          ENET  \
W1-95L                             0.000003   0.000000  1.256779e-06   
W1                                 0.000468   0.000000  4.656555e-04   
W1-95R                             0.001399   0.000000  1.858852e-03   
M-95L                              0.000006   0.000006  1.121061e-03   
M                                  0.001124   0.001124  2.219702e-03   
M-95R                              0.004478   0.003365  4.416982e-03   
N_Par                          75475.000000   0.000000  2.000000e+02   
Train_Time                        15.494996   0.856403  1.620057e+09   
Test_Time/MC-Oracle_Test_Time      0.693893   1.000000  4.080329e-03   

                                 KRidge          GBRF           DNN       GPR  \
W1-95L                         0.000067  1.091707e-05      0.000003  0.000124   
W1                             0.000387  3.457320e-04      0.000366  0.000553   


| Metric | DNM | MC-Oracle | ENET | KRidge | GBRF | DNN | GPR | DGN |
|---|---|---|---|---|---|---|---|---|
| W1-95L | 3e-06 | 0.0 | 1.256779e-06 | 6.7e-05 | 1.091707e-05 | 3e-06 | 0.000124 | 0.946601 |
| W1 | 0.000468 | 0.0 | 0.0004656555 | 0.000387 | 0.000345732 | 0.000366 | 0.000553 | 0.973821 |
| W1-95R | 0.001399 | 0.0 | 0.001858852 | 0.001073 | 0.001382058 | 0.001078 | 0.001408 | 1.041654 |
| M-95L | 6e-06 | 6e-06 | 0.001121061 | 0.005942 | 0.001155688 | 0.000767 | 0.0 | 0.02206 |
| M | 0.001124 | 0.001124 | 0.002219702 | 0.00863 | 0.002923135 | 0.001463 | 0.001121 | 0.025506 |
| M-95R | 0.004478 | 0.003365 | 0.004416982 | 0.012027 | 0.00415504 | 0.00235 | 0.003363 | 0.03718 |
| N_Par | 75475.0 | 0.0 | 200.0 | 0.0 | 1914800.0 | 60601.0 | 0.0 | 60601.0 |
| Train_Time | 15.494996 | 0.856403 | 1620057000.0 | 0.619309 | 8.803717 | 15.56482 | 1.09054 | 17.454785 |
| Test_Time/MC-Oracle_Test_Time | 0.693893 | 1.0 | 0.004080329 | 0.005195 | 0.01120074 | 0.525559 | 0.072526 | 0.361645 |


# 3) The natural Universal Benchmark: [Bishop's Mixture Density Network](https://publications.aston.ac.uk/id/eprint/373/1/NCRG_94_004.pdf)

The implementation proceeds as follows:
- For every $x$ in the training dataset we fit a GMM $\hat{\nu}_x$, using the [Expectation-Maximization (EM) algorithm](https://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm), with the same number of centers as the deep neural model in $\mathcal{NN}_{1_{\mathbb{R}^d},\mathcal{D}}^{\sigma:\star}$ which we are evaluating.  
- A mixture density network is then trained to predict the inferred GMM parameters, given any $x \in \mathbb{R}^d$.
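
The first step above can be sketched as follows for a single input $x$ with univariate outputs. This is a minimal illustration only, not the code in `Mixture_Density_Network.py`: the helper `fit_gmm_1d`, the quantile-based initialization, the synthetic samples, and the name `mdn_target` are all assumptions made for the example.

```python
import math
import random

def fit_gmm_1d(samples, n_centers, n_iter=100):
    """Fit a 1-D Gaussian mixture to `samples` with plain EM (illustrative sketch)."""
    n = len(samples)
    ordered = sorted(samples)
    mean_all = sum(samples) / n
    var_all = sum((y - mean_all) ** 2 for y in samples) / n
    # Initialization (an assumption of this sketch): means at evenly spaced
    # quantiles, pooled variance, uniform weights.
    means = [ordered[int((k + 0.5) * n / n_centers)] for k in range(n_centers)]
    variances = [var_all] * n_centers
    weights = [1.0 / n_centers] * n_centers
    for _ in range(n_iter):
        # E-step: responsibility of each center for each sample.
        resp = []
        for y in samples:
            dens = [w * math.exp(-0.5 * (y - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for k in range(n_centers):
            nk = sum(r[k] for r in resp) + 1e-12
            weights[k] = nk / n
            means[k] = sum(r[k] * y for r, y in zip(resp, samples)) / nk
            variances[k] = sum(r[k] * (y - means[k]) ** 2
                               for r, y in zip(resp, samples)) / nk + 1e-6
    return weights, means, variances

# Synthetic stand-in for the Monte-Carlo samples of Y given one fixed x.
rng = random.Random(1)
samples = ([rng.gauss(-2.0, 0.5) for _ in range(300)]
           + [rng.gauss(3.0, 0.5) for _ in range(300)])

weights, means, variances = fit_gmm_1d(samples, n_centers=2)
# The fitted parameters form one regression target for the MDN at this x.
mdn_target = weights + means + variances
print(mdn_target)
```

Repeating this for every training input yields one parameter vector per $x$, which is what the second step regresses on.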

In [None]:
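if output_dim == 1:
    # %run Mixture_Density_Network.ipynb
    # Run the MDN benchmark script in the current namespace (univariate outputs only).
    with open('Mixture_Density_Network.py') as mdn_script:
        exec(mdn_script.read())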

  0%|          | 0/100 [00:00<?, ?it/s]

Preparing Training Outputs for MDNs using EM-Algorithm


100%|██████████| 100/100 [00:31<00:00,  3.20it/s]
[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.


Prepared Training Outputs for MDNs using EM-Algorithm!
Deep Feature Builder - Ready
(0)
Training Mixture Density Network (MDN): Means: Start!
Fitting 2 folds for each of 1 candidates, totalling 2 fits


[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    9.1s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    9.1s finished


Epoch 1/200
...
Epoch 147/200


## Get Final Outputs
Now we piece together all the numerical experiments and report a nice summary.

---
# Final Results
---

## Parsing Quality Metric Results

#### Saving Final Results

In [None]:
## Write Performance Metrics
# Export the summary table to LaTeX; floats are written with 3 significant digits.
Summary_pred_Qual_models.to_latex(
    f"{results_tables_path}/Final_Results/Performance_metrics_Problem_Type_{f_unknown_mode}Problemdimension{problem_dim}__SUMMARY_METRICS.tex",
    caption=f"Quality Metrics; d:{problem_dim}, D:{output_dim}, Depth:{Depth_Bayesian_DNN}, Width:{width}, Dropout rate:{Dropout_rate}.",
    float_format="{:0.3g}".format)
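
To see what the `float_format="{:0.3g}".format` argument does to the exported numbers, the short sketch below applies the same formatter to a few values taken from the summary table above. The names `fmt`, `examples`, and `rendered` are introduced only for this illustration.

```python
# The formatter passed to to_latex: general format with 3 significant digits.
fmt = "{:0.3g}".format

# Values copied from the summary table (N_Par, Train_Time, W1, W1-95L, ...).
examples = [75475.0, 1620057000.0, 0.000468, 1.256779e-06, 15.494996, 0.0]
rendered = [fmt(v) for v in examples]
print(rendered)
# → ['7.55e+04', '1.62e+09', '0.000468', '1.26e-06', '15.5', '0']
```

Note that large counts such as `N_Par` are rounded and switched to scientific notation (75475 becomes `7.55e+04`), so the exported table is lossy for those rows.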

# For Terminal Runner(s):

In [None]:
# For Terminal Running
print("===================")
print("Predictive Quality:")
print("===================")
print(Summary_pred_Qual_models)
print("===================")
print(" ")
print(" ")
print(" ")
print("Kernel_Used_in_GPR: "+str(GPR_trash.kernel))
print("🙃🙃 Have a wonderful day! 🙃🙃")
Summary_pred_Qual_models

---
# Fin
---

---