# Deep Universal Regular Conditional Expectations:

---
This notebook implements the universal deep neural model $\mathcal{NN}_{1_{\mathbb{R}^n},\mathcal{D}}^{\sigma:\star}$, by [Anastasis Kratsios](https://people.math.ethz.ch/~kratsioa/), 2021.

---

## What does this code do?
1. Learn a heteroskedastic non-linear regression problem:
     - $Y\sim f_{\text{unknown}}(x) + \epsilon$, where $f_{\text{unknown}}$ is an unknown function and $\epsilon\sim \text{Laplace}(0,\|x\|)$.
2. Learn the law of a random Bayesian network:
    - $Y = W_J Y^{J-1}, \qquad Y^{j}\triangleq \sigma\bullet A^{j}Y^{j-1} + b^{j}, \qquad Y^0\triangleq x$

3. In the above example, if $A^{j} = M_j\odot \tilde{A}_j$, where $\tilde{A}_j$ is a deterministic matrix, $M_j$ is a "mask" (that is, a random matrix with binary entries), and $\odot$ is the Hadamard product, then we recover the dropout framework.
4. Learn the probability that the unique strong solution to the rough SDE with uniformly Lipschitz drivers, driven by a fractional Brownian motion with Hurst exponent $H \in [\frac1{2},1)$:
$$
X_t^x = x + \int_0^t \alpha(s,X_s^x)ds + \int_0^t \beta(s,X_s^x)dB_s^H
$$
belongs, at time $t=1$, to a ball about the initial point $x$ of random radius given by an independent exponential random variable with parameter $\lambda=2$.
5. Train a DNN to predict the returns of bitcoin with gradient descent (GD). Since the initialization is random, each prediction at a given $x$ is stochastic. We learn the distribution of this conditional random variable (conditioned on $x$ in the input space):
$$
Y_x \triangleq \hat{f}_{\theta_{T}}(x), \qquad \theta_{(t+1)}\triangleq \theta_{(t)} - \lambda \sum_{x \in \mathbb{X}} \nabla_{\theta}\|\hat{f}_{\theta_t}(x) - f(x)\|, \qquad \theta_0 \sim N_d(0,1);
$$
where $T\in \mathbb{N}$ is a fixed number of "SGD" iterations (typically identified by cross-validation on a single SGD trajectory for a single initialization), $\theta \in \mathbb{R}^{(d_{J}+1)+\sum_{j=0}^{J-1} (d_{j+1}d_j + 1)}$, and $d_j$ is the dimension of the "bias" vector $b^{(j)}$ of the $j$-th layer of the DNN, which is defined by:
$$
\hat{f}_{\theta}(x)\triangleq A^{(J)}x^{(J)} + b^{(J)},\qquad x^{(j+1)}\triangleq \sigma\bullet A^{(j)}x^{(j)} + b^{(j)},\qquad x^{(0)}\triangleq x.
$$
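
The randomness in item 5 comes only from the initialization $\theta_0$. The sketch below illustrates this on a toy problem: it runs a fixed number of full-batch gradient-descent steps from several Gaussian initializations and collects the resulting predictions at one input. The one-parameter model `theta * x`, the squared-error loss, the learning rate, and all function names are illustrative assumptions, not the notebook's actual bitcoin setup.

```python
import random


def gd_prediction(x_query, data, theta0, lr=0.1, n_steps=2):
    """Run n_steps of full-batch GD on squared error for f_theta(x) = theta * x."""
    theta = theta0
    for _ in range(n_steps):
        # Gradient of the mean squared error with respect to theta.
        grad = sum(2.0 * (theta * x - y) * x for x, y in data) / len(data)
        theta -= lr * grad
    return theta * x_query


def sample_predictions(x_query, data, n_samples=10, seed=0):
    """Draw theta_0 ~ N(0, 1) repeatedly and record the prediction after training."""
    rng = random.Random(seed)
    return [gd_prediction(x_query, data, rng.gauss(0.0, 1.0)) for _ in range(n_samples)]


# Toy data from the (hypothetical) target f(x) = 3x on a small grid.
data = [(k / 4.0, 3.0 * k / 4.0) for k in range(-4, 5)]
samples = sample_predictions(1.0, data)
print(min(samples), max(samples))
```

Because training stops after a fixed, small number of steps, different initializations give different predictions at the same input, and it is the distribution of these values that the model is asked to learn.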

#### Mode:
Software/Hardware Testing or Real-Deal?

In [1]:
trial_run = True

### Simulation Method:

In [2]:
# Heteroskedastic non-linear regression
# f_unknown_mode = "Heteroskedastic_NonLinear_Regression"

# Random DNN (internal noise)
# f_unknown_mode = "DNN_with_Random_Weights"
Depth_Bayesian_DNN = 1
width = 5

# Random dropout applied to trained DNN
# f_unknown_mode = "DNN_with_Bayesian_Dropout"
Dropout_rate = 0.1

# GD with Randomized Input
# f_unknown_mode = "GD_with_randomized_input"
GD_epochs = 2

# SDE with fractional Driver
f_unknown_mode = "Rough_SDE"
N_Euler_Steps = 10**1
Hurst_Exponent = 0.5
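
For the `"Rough_SDE"` mode, the two settings above control the time grid and the roughness of the driver; with `Hurst_Exponent = 0.5` the fractional Brownian motion is a standard Brownian motion. The following is a minimal sketch of one way to simulate the terminal value $X_1^x$: sample the driver on the grid through a Cholesky factor of its covariance, then apply an Euler scheme. The function names and the coefficient callables `alpha` and `beta` are hypothetical, and the repository's simulator may use a different scheme.

```python
import math
import random


def fbm_covariance(times, hurst):
    """Covariance of fractional Brownian motion: 0.5 * (s^2H + t^2H - |t - s|^2H)."""
    two_h = 2.0 * hurst
    return [[0.5 * (s ** two_h + t ** two_h - abs(t - s) ** two_h) for s in times]
            for t in times]


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def sample_fbm_path(n_steps, hurst, rng):
    """Sample the driver at times 0, 1/n, ..., 1 (the value at time 0 is 0)."""
    times = [(k + 1) / n_steps for k in range(n_steps)]
    lower = cholesky(fbm_covariance(times, hurst))
    normals = [rng.gauss(0.0, 1.0) for _ in range(n_steps)]
    values = [sum(lower[i][k] * normals[k] for k in range(i + 1)) for i in range(n_steps)]
    return [0.0] + values


def euler_terminal_value(x0, alpha, beta, n_steps, hurst, rng):
    """Euler scheme for dX = alpha(t, X) dt + beta(t, X) dB^H on [0, 1]."""
    path = sample_fbm_path(n_steps, hurst, rng)
    dt = 1.0 / n_steps
    x = x0
    for k in range(n_steps):
        t = k * dt
        x += alpha(t, x) * dt + beta(t, x) * (path[k + 1] - path[k])
    return x


# One draw of X_1 for illustrative coefficients (assumptions, not the notebook's).
x_terminal = euler_terminal_value(0.0, lambda t, x: -x, lambda t, x: 1.0,
                                  n_steps=10, hurst=0.5, rng=random.Random(0))
```

Repeating the last call with fresh random draws gives Monte-Carlo samples of $X_1^x$, from which the probability of landing in the random ball can be estimated.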

## Problem Dimension

In [3]:
problem_dim = 5

## Note: *Why is the procedure so computationally efficient?*
---
 - The sample barycenters do not require us to solve for any new Wasserstein-1 barycenters, which would be much more computationally costly.
 - Our training procedure never back-propagates through $\mathcal{W}_1$ since steps 2 and 3 are fully decoupled.  Therefore, training our deep classifier is (comparatively) cheap since it takes values in the standard $N$-simplex.
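
The decoupling can be pictured with a small sketch. It assumes, as a simplification, that the model output at $x$ is a mixture of $N$ fixed empirical measures whose weights come from a softmax classifier; the measures are computed once and never differentiated, so only the classifier logits would be trained. The names `softmax` and `mixture_atoms` and the toy numbers are hypothetical.

```python
import math


def softmax(logits):
    """Map classifier logits to a point of the probability simplex."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_atoms(weights, fixed_measures):
    """Return (atom, mass) pairs of sum_n w_n * mu_n for empirical measures mu_n."""
    atoms = []
    for w, samples in zip(weights, fixed_measures):
        atoms.extend((y, w / len(samples)) for y in samples)
    return atoms


# Three fixed empirical measures (toy values) and the classifier logits at some x.
fixed_measures = [[0.0, 0.1], [1.0, 1.2, 0.9], [5.0]]
weights = softmax([2.0, 0.5, -1.0])
prediction = mixture_atoms(weights, fixed_measures)
predicted_mean = sum(y * mass for y, mass in prediction)
```

Under this assumption the training signal for the classifier is an ordinary classification loss on the simplex, which is why no gradient has to pass through a Wasserstein distance.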

---

#### Grid Hyperparameter(s)
- Ratio $\frac{\text{Testing Datasize}}{\text{Training Datasize}}$.
- Number of Training Points to Generate

In [4]:
train_test_ratio = .2
N_train_size = 20

Monte-Carlo Parameters

In [5]:
## Monte-Carlo
N_Monte_Carlo_Samples = 10**1
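
A small worked example of how the grid and Monte-Carlo settings above determine the amount of simulated data. That the test-set size equals `train_test_ratio * N_train_size`, and that every input receives `N_Monte_Carlo_Samples` draws, is an assumption; it is consistent with the progress-bar totals (20, 4, 200, and 40) printed later in this run. The derived names are hypothetical.

```python
train_test_ratio = 0.2
N_train_size = 20
N_Monte_Carlo_Samples = 10**1

# Number of test inputs implied by the ratio (rounded to the nearest integer).
N_test_size = int(round(train_test_ratio * N_train_size))

# Total number of simulated outputs if each input gets one Monte-Carlo batch.
n_train_outputs = N_train_size * N_Monte_Carlo_Samples
n_test_outputs = N_test_size * N_Monte_Carlo_Samples

print(N_test_size, n_train_outputs, n_test_outputs)
```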

Initial radius of the $\delta$-bounded random partition of $\mathcal{X}$.

In [6]:
# Hyper-parameters of Cover
delta = 0.01
Proportion_per_cluster = .75

## Dependencies and Auxiliary Script(s)

In [7]:
# %run Loader.ipynb
exec(open('Loader.py').read())
# Load Packages/Modules
exec(open('Init_Dump.py').read())
import time  # Note: not sure why, but this always seems to need its own separate import.
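
The `exec(open(...).read())` pattern used throughout this notebook leaves the file handle to be closed by the garbage collector and reports errors without the script's file name. A minimal alternative sketch is shown below; `run_script` is a hypothetical helper, not part of the repository, and the demonstration writes a throwaway script rather than touching `Loader.py`.

```python
import tempfile
from pathlib import Path


def run_script(path, namespace):
    """Execute a Python script inside `namespace`, closing the file afterwards."""
    source = Path(path).read_text(encoding="utf-8")
    # Compiling with the file name makes tracebacks point at the script.
    exec(compile(source, str(path), "exec"), namespace)
    return namespace


# Demonstration on a temporary script that defines a single variable.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "demo_script.py"
    demo_path.write_text("width = 5\n", encoding="utf-8")
    demo_namespace = run_script(demo_path, {})

print(demo_namespace["width"])
```

In the notebook one would call, for example, `run_script('Loader.py', globals())` to keep the current behavior of sharing one global namespace across scripts.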

Using TensorFlow backend.


Deep Feature Builder - Ready
Deep Classifier - Ready
Deep Feature Builder - Ready


# Simulate or Parse Data

In [8]:
# %run Data_Simulator_and_Parser.ipynb
exec(open('Data_Simulator_and_Parser.py').read())

100%|██████████| 20/20 [00:00<00:00, 149.57it/s]
100%|██████████| 4/4 [00:00<00:00, 123.10it/s]

---------------------------------------
Beginning Data-Parsing/Simulation Phase
---------------------------------------
Deciding on Which Simulator/Parser To Load
Setting/Defining: Internal Parameters
Deciding on Which Type of Data to Get/Simulate
Simulating Output Data for given input data
----------------------------------
Done Data-Parsing/Simulation Phase
----------------------------------





# Run Main:

In [9]:
print("------------------------------")
print("Running script for main model!")
print("------------------------------")
# %run Universal_Measure_Valued_Networks_Backend.ipynb
exec(open('Universal_Measure_Valued_Networks_Backend.py').read())

print("------------------------------------")
print("Done: Running script for main model!")
print("------------------------------------")

------------------------------
Running script for main model!
------------------------------


100%|██████████| 150/150 [00:00<00:00, 17376.83it/s]

Deep Feature Builder - Ready
Deep Classifier - Ready
Training Classifer Portion of Type-A Model
Fitting 2 folds for each of 1 candidates, totalling 2 fits



[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    3.5s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    3.5s finished


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


  0%|          | 0/200 [00:00<?, ?it/s]

Training Classifer Portion of Type Model: Done!
#--------------------#
 Get Training Error(s)
#--------------------#


100%|██████████| 200/200 [00:06<00:00, 29.57it/s]
  8%|▊         | 3/40 [00:00<00:01, 24.60it/s]

#-------------------------#
 Get Training Error(s): END
#-------------------------#
#----------------#
 Get Test Error(s)
#----------------#


100%|██████████| 40/40 [00:01<00:00, 34.09it/s]

#------------------------#
 Get Testing Error(s): END
#------------------------#
                                       DNM  MC-Oracle
W1-95L                            0.405253   0.000000
W1                                0.417215   0.000000
W1-95R                            0.424174   0.000000
M-95L                             1.252847   1.212462
M                                 1.327205   1.327205
M-95R                             1.453645   1.467073
N_Par                          1720.000000   0.000000
Train_Time                       12.701303   0.176438
Test_Time/MC-Oracle_Test_Time     1.582523   1.000000
------------------------------------
Done: Running script for main model!
------------------------------------





---
# Run: All Benchmarks

## 1) *Pointmass Benchmark(s)*
These benchmarks consist of subsets of $C(\mathbb{R}^d,\mathbb{R})$ which we lift to models in $C(\mathbb{R}^d,\cap_{1\leq q<\infty}\mathscr{P}_{q}(\mathbb{R}))$ via:
$$
\mathbb{R}^d \ni x \to f(x) \to \delta_{f(x)}\in \cap_{1\leq q<\infty}\mathcal{P}_{q}(\mathbb{R}).
$$
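
For a point-mass model, the Wasserstein-1 distance to an empirical target has a closed form: the only coupling between $\delta_{f(x)}$ and an empirical measure is the product coupling, so the distance is the mean absolute deviation of the samples from $f(x)$. The sketch below computes it; the function name is hypothetical, and whether the `W1` rows in the tables are computed exactly this way is an assumption.

```python
def w1_pointmass_to_empirical(prediction, samples):
    """W1(delta_prediction, empirical(samples)) = mean of |y_i - prediction|."""
    return sum(abs(y - prediction) for y in samples) / len(samples)


# Monte-Carlo samples of Y given x (toy values) and a point prediction f(x).
mc_samples = [-1.0, 1.0, 3.0]
distance = w1_pointmass_to_empirical(0.0, mc_samples)
```

This also shows why point-mass benchmarks cannot reach zero error on a non-degenerate target: the distance is bounded below by the mean absolute deviation of the samples around their median.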

In [10]:
exec(open('CV_Grid.py').read())
# Notebook Mode:
# %run Evaluation.ipynb
# %run Benchmarks_Model_Builder_Pointmass_Based.ipynb
# Terminal Mode (Default):
exec(open('Evaluation.py').read())
exec(open('Benchmarks_Model_Builder_Pointmass_Based.py').read())

  0%|          | 0/5 [00:00<?, ?it/s]

Deep Feature Builder - Ready
--------------
Training: ENET
--------------


100%|██████████| 5/5 [00:22<00:00,  4.53s/it]
 11%|█         | 22/200 [00:00<00:00, 216.43it/s]

---------------------
Training: ENET - Done
---------------------
#------------#
 Get Error(s) 
#------------#


100%|██████████| 200/200 [00:00<00:00, 210.29it/s]
100%|██████████| 40/40 [00:00<00:00, 208.44it/s]


#-----------------#
 Get Error(s): END 
#-----------------#
#------------#
 Get Error(s) 
#------------#
#-----------------#
 Get Error(s): END 
#-----------------#
Updated DataFrame

[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=4)]: Done   1 tasks      | elapsed:    0.0s
[Parallel(n_jobs=4)]: Batch computation too fast (0.0466s.) Setting batch_size=2.
[Parallel(n_jobs=4)]: Done   2 out of   4 | elapsed:    0.1s remaining:    0.1s



                                       DNM  MC-Oracle          ENET
W1-95L                            0.405253   0.000000  9.345086e+00
W1                                0.417215   0.000000  9.785819e+00
W1-95R                            0.424174   0.000000  1.026902e+01
M-95L                             1.252847   1.212462  4.503208e+01
M                                 1.327205   1.327205  4.784373e+01
M-95R                             1.453645   1.467073  4.991505e+01
N_Par                          1720.000000   0.000000  2.000000e+03
Train_Time                       12.701303   0.176438  1.619795e+09
Test_Time/MC-Oracle_Test_Time     1.582523   1.000000  9.366185e-03
-----------------
Training: K-Ridge
-----------------
Fitting 2 folds for each of 2 candidates, totalling 4 fits


[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:    0.3s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:    0.3s finished
  6%|▌         | 12/200 [00:00<00:01, 119.96it/s]

#------------#
 Get Error(s) 
#------------#


100%|██████████| 200/200 [00:00<00:00, 216.94it/s]
100%|██████████| 40/40 [00:00<00:00, 224.85it/s]
[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.


#-----------------#
 Get Error(s): END 
#-----------------#
#------------#
 Get Error(s) 
#------------#
#-----------------#
 Get Error(s): END 
#-----------------#
Updated DataFrame
                                       DNM  MC-Oracle          ENET     KRidge
W1-95L                            0.405253   0.000000  9.345086e+00   9.577519
W1                                0.417215   0.000000  9.785819e+00   9.942966
W1-95R                            0.424174   0.000000  1.026902e+01  10.257549
M-95L                             1.252847   1.212462  4.503208e+01  45.517167
M                                 1.327205   1.327205  4.784373e+01  47.991556
M-95R                             1.453645   1.467073  4.991505e+01  49.836028
N_Par                          1720.000000   0.000000  2.000000e+03   0.000000
Train_Time                       12.701303   0.176438  1.619795e+09   1.497953
Test_Time/MC-Oracle_Test_Time     1.582523   1.000000  9.366185e-03   0.025855
--------------
Training: GB

[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    0.2s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    0.2s finished


Fitting 2 folds for each of 1 candidates, totalling 2 fits


[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    0.2s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    0.2s finished


Fitting 2 folds for each of 1 candidates, totalling 2 fits


[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    0.2s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    0.2s finished


Fitting 2 folds for each of 1 candidates, totalling 2 fits


[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    0.2s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    0.2s finished


Fitting 2 folds for each of 1 candidates, totalling 2 fits


[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    0.2s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    0.2s finished
 10%|█         | 20/200 [00:00<00:00, 195.29it/s]

#------------#
 Get Error(s) 
#------------#


100%|██████████| 200/200 [00:01<00:00, 196.37it/s]
100%|██████████| 40/40 [00:00<00:00, 219.77it/s]
[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.


#-----------------#
 Get Error(s): END 
#-----------------#
#------------#
 Get Error(s) 
#------------#
#-----------------#
 Get Error(s): END 
#-----------------#
Updated DataFrame
                                       DNM  MC-Oracle          ENET  \
W1-95L                            0.405253   0.000000  9.345086e+00   
W1                                0.417215   0.000000  9.785819e+00   
W1-95R                            0.424174   0.000000  1.026902e+01   
M-95L                             1.252847   1.212462  4.503208e+01   
M                                 1.327205   1.327205  4.784373e+01   
M-95R                             1.453645   1.467073  4.991505e+01   
N_Par                          1720.000000   0.000000  2.000000e+03   
Train_Time                       12.701303   0.176438  1.619795e+09   
Test_Time/MC-Oracle_Test_Time     1.582523   1.000000  9.366185e-03   

                                  KRidge          GBRF  
W1-95L                          9.577519  9.48316

[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    3.1s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    3.1s finished


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


  0%|          | 0/200 [00:00<?, ?it/s]

#------------#
 Get Error(s) 
#------------#


100%|██████████| 200/200 [00:00<00:00, 222.65it/s]
 55%|█████▌    | 22/40 [00:00<00:00, 212.93it/s]

#-----------------#
 Get Error(s): END 
#-----------------#
#------------#
 Get Error(s) 
#------------#


100%|██████████| 40/40 [00:00<00:00, 199.76it/s]

#-----------------#
 Get Error(s): END 
#-----------------#
Updated DataFrame
                                       DNM  MC-Oracle          ENET  \
W1-95L                            0.405253   0.000000  9.345086e+00   
W1                                0.417215   0.000000  9.785819e+00   
W1-95R                            0.424174   0.000000  1.026902e+01   
M-95L                             1.252847   1.212462  4.503208e+01   
M                                 1.327205   1.327205  4.784373e+01   
M-95R                             1.453645   1.467073  4.991505e+01   
N_Par                          1720.000000   0.000000  2.000000e+03   
Train_Time                       12.701303   0.176438  1.619795e+09   
Test_Time/MC-Oracle_Test_Time     1.582523   1.000000  9.366185e-03   

                                  KRidge          GBRF         DNN  
W1-95L                          9.577519  9.483166e+00    1.070440  
W1                              9.942966  9.826340e+00    1.121222  
W1-9




# Summary of Point-Mass Regression Models

#### Training Model Facts

In [11]:
print(Summary_pred_Qual_models)
Summary_pred_Qual_models

                                       DNM  MC-Oracle          ENET  \
W1-95L                            0.389920   0.000000  9.021460e+00   
W1                                0.395162   0.000000  9.223720e+00   
W1-95R                            0.400397   0.000000  9.396189e+00   
M-95L                             1.152075   1.177709  4.366266e+01   
M                                 1.204449   1.204449  4.447347e+01   
M-95R                             1.234901   1.236398  4.535899e+01   
N_Par                          1720.000000   0.000000  2.000000e+03   
Train_Time                       12.701303   0.176438  1.619795e+09   
Test_Time/MC-Oracle_Test_Time     1.582523   1.000000  9.366185e-03   

                                  KRidge          GBRF         DNN  
W1-95L                          9.092377  9.048603e+00    0.989376  
W1                              9.298649  9.180425e+00    1.016542  
W1-95R                          9.462457  9.406257e+00    1.037143  
M-95L        

| Metric | DNM | MC-Oracle | ENET | KRidge | GBRF | DNN |
|---|---|---|---|---|---|---|
| W1-95L | 0.38992 | 0.0 | 9.02146 | 9.092377 | 9.048603 | 0.989376 |
| W1 | 0.395162 | 0.0 | 9.22372 | 9.298649 | 9.180425 | 1.016542 |
| W1-95R | 0.400397 | 0.0 | 9.396189 | 9.462457 | 9.406257 | 1.037143 |
| M-95L | 1.152075 | 1.177709 | 43.66266 | 43.84938 | 43.71747 | 4.831776 |
| M | 1.204449 | 1.204449 | 44.47347 | 44.465894 | 44.47347 | 4.926057 |
| M-95R | 1.234901 | 1.236398 | 45.35899 | 44.829841 | 45.16915 | 4.987833 |
| N_Par | 1720.0 | 0.0 | 2000.0 | 0.0 | 1655076.0 | 125.0 |
| Train_Time | 12.701303 | 0.176438 | 1619795000.0 | 1.497953 | 3.726702 | 5.373271 |
| Test_Time/MC-Oracle_Test_Time | 1.582523 | 1.0 | 0.009366185 | 0.025855 | 0.1598643 | 1.029629 |


#### Testing Model Facts

In [12]:
print(Summary_pred_Qual_models_test)
Summary_pred_Qual_models_test

                                       DNM  MC-Oracle          ENET  \
W1-95L                            0.405253   0.000000  9.345086e+00   
W1                                0.417215   0.000000  9.785819e+00   
W1-95R                            0.424174   0.000000  1.026902e+01   
M-95L                             1.252847   1.212462  4.503208e+01   
M                                 1.327205   1.327205  4.784373e+01   
M-95R                             1.453645   1.467073  4.991505e+01   
N_Par                          1720.000000   0.000000  2.000000e+03   
Train_Time                       12.701303   0.176438  1.619795e+09   
Test_Time/MC-Oracle_Test_Time     1.582523   1.000000  9.366185e-03   

                                  KRidge          GBRF         DNN  
W1-95L                          9.577519  9.483166e+00    1.070440  
W1                              9.942966  9.826340e+00    1.121222  
W1-95R                         10.257549  1.034169e+01    1.173450  
M-95L        

| Metric | DNM | MC-Oracle | ENET | KRidge | GBRF | DNN |
|---|---|---|---|---|---|---|
| W1-95L | 0.405253 | 0.0 | 9.345086 | 9.577519 | 9.483166 | 1.07044 |
| W1 | 0.417215 | 0.0 | 9.785819 | 9.942966 | 9.82634 | 1.121222 |
| W1-95R | 0.424174 | 0.0 | 10.26902 | 10.257549 | 10.34169 | 1.17345 |
| M-95L | 1.252847 | 1.212462 | 45.03208 | 45.517167 | 45.93144 | 5.016318 |
| M | 1.327205 | 1.327205 | 47.84373 | 47.991556 | 48.24811 | 5.277003 |
| M-95R | 1.453645 | 1.467073 | 49.91505 | 49.836028 | 49.72675 | 5.576289 |
| N_Par | 1720.0 | 0.0 | 2000.0 | 0.0 | 1655076.0 | 125.0 |
| Train_Time | 12.701303 | 0.176438 | 1619795000.0 | 1.497953 | 3.726702 | 5.373271 |
| Test_Time/MC-Oracle_Test_Time | 1.582523 | 1.0 | 0.009366185 | 0.025855 | 0.1598643 | 1.029629 |
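
The ENET `Train_Time` entry in the tables above (about 1.6e9) is of the order of a Unix epoch timestamp in seconds rather than a plausible duration, which suggests that a start time was not subtracted when it was recorded. This is an inference from the numbers only, since the benchmark script is not shown here. A minimal sketch of measuring an elapsed training time, with the hypothetical helper `timed_fit`:

```python
import time


def timed_fit(fit_fn):
    """Call fit_fn() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fit_fn()
    elapsed = time.perf_counter() - start  # a duration, not a wall-clock timestamp
    return result, elapsed


# Stand-in for a model's training routine (an assumption for illustration).
result, train_time = timed_fit(lambda: sum(range(1000)))
```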


## 2) *Gaussian Benchmarks*

- Benchmark 1: [Gaussian Process Regressor](https://scikit-learn.org/stable/modules/gaussian_process.html)
- Benchmark 2: Deep Gaussian Networks:
These benchmarks train models that assume Gaussianity.  We may view them as models in $\mathcal{P}_2(\mathbb{R})$ via:
$$
\mathbb{R}^d \ni x \to (\hat{\mu}(x),\hat{\Sigma}(x)\hat{\Sigma}^{\top})\triangleq f(x) \in \mathbb{R}\times [0,\infty) \to 
(2\pi)^{-\frac{d}{2}}\det(\hat{\Sigma}(x))^{-\frac{1}{2}} \, e^{ -\frac{1}{2}(\cdot - \hat{\mu}(x))^{{{\!\mathsf{T}}}} \hat{\Sigma}(x)^{-1}(\cdot - \hat{\mu}(x)) } \mu \in \mathcal{G}_d\subset \mathcal{P}_2(\mathbb{R});
$$
where $\mathcal{G}_1$ is the set of Gaussian measures on $\mathbb{R}$ equipped with the relative Wasserstein-1 topology.
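
To compare a predicted Gaussian with Monte-Carlo samples in Wasserstein-1, one can use the fact that, on $\mathbb{R}$, $\mathcal{W}_1$ is the integral of the absolute difference of the two quantile functions. The sketch below approximates that integral by replacing the Gaussian with $n$ equally weighted quantile atoms and matching them to the sorted samples. The function name is hypothetical, it requires `sigma > 0`, and it is an approximation rather than the repository's evaluation code.

```python
from statistics import NormalDist


def w1_gaussian_to_empirical(mu, sigma, samples):
    """Approximate W1 between N(mu, sigma^2) and the empirical law of `samples`."""
    n = len(samples)
    gaussian = NormalDist(mu, sigma)
    ordered = sorted(samples)
    # Match the i-th order statistic with the Gaussian quantile at level (i + 0.5) / n.
    return sum(abs(y - gaussian.inv_cdf((i + 0.5) / n))
               for i, y in enumerate(ordered)) / n


# Samples placed exactly at the quantiles of N(1, 2^2), then shifted by 3.
quantiles = [NormalDist(1.0, 2.0).inv_cdf((i + 0.5) / 5) for i in range(5)]
zero_distance = w1_gaussian_to_empirical(1.0, 2.0, quantiles)
shifted_distance = w1_gaussian_to_empirical(1.0, 2.0, [q + 3.0 for q in quantiles])
```

A pure translation of the samples moves every quantile by the same amount, so the second value equals the size of the shift.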

Examples of this type of architecture are especially prevalent in uncertainty quantification; see [Deep Ensembles](https://arxiv.org/abs/1612.01474) or [NOMU: Neural Optimization-based Model Uncertainty](https://arxiv.org/abs/2102.13640).  Moreover, their universality in $C(\mathbb{R}^d,\mathcal{G}_2)$ is known, and has been shown in [Corollary 4.7](https://arxiv.org/abs/2101.05390).

In [13]:
# %run Benchmarks_Model_Builder_Mean_Var.ipynb
exec(open('Benchmarks_Model_Builder_Mean_Var.py').read())

DNN Builder - Ready
Fitting 2 folds for each of 2 candidates, totalling 4 fits


[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=4)]: Done   1 tasks      | elapsed:    0.5s
[Parallel(n_jobs=4)]: Done   2 out of   4 | elapsed:    0.6s remaining:    0.6s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:    0.9s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:    0.9s finished
100%|██████████| 200/200 [00:00<00:00, 1961.16it/s]

Infering Parameters for Deep Gaussian Network to train on!
Done Getting Parameters for Deep Gaussian Network!
Training Deep Gaussian Network!
Fitting 2 folds for each of 1 candidates, totalling 2 fits



[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    2.9s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    2.9s finished


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


  0%|          | 0/200 [00:00<?, ?it/s]

Training Deep Gaussian Network!: END
#---------------------------------------#
 Get Training Errors for: Gaussian Models
#---------------------------------------#


100%|██████████| 200/200 [00:02<00:00, 98.51it/s] 
 22%|██▎       | 9/40 [00:00<00:00, 81.63it/s]

#-------------------------#
 Get Training Error(s): END
#-------------------------#
#--------------------------------------#
 Get Testing Errors for: Gaussian Models
#--------------------------------------#


100%|██████████| 40/40 [00:00<00:00, 90.11it/s]

#-------------------------#
 Get Training Error(s): END
#-------------------------#
-------------------------------------------------
Updating Performance Metrics Dataframe and Saved!
-------------------------------------------------
Training Results to date:
                                       DNM  MC-Oracle          ENET  \
W1-95L                            0.405253   0.000000  9.345086e+00   
W1                                0.417215   0.000000  9.785819e+00   
W1-95R                            0.424174   0.000000  1.026902e+01   
M-95L                             1.252847   1.212462  4.503208e+01   
M                                 1.327205   1.327205  4.784373e+01   
M-95R                             1.453645   1.467073  4.991505e+01   
N_Par                          1720.000000   0.000000  2.000000e+03   
Train_Time                       12.701303   0.176438  1.619795e+09   
Test_Time/MC-Oracle_Test_Time     1.582523   1.000000  9.366185e-03   

                             




In [14]:
print("Prediction Quality (Updated): Test")
print(Summary_pred_Qual_models_test)
Summary_pred_Qual_models_test

Prediction Quality (Updated): Test
                                       DNM  MC-Oracle          ENET  \
W1-95L                            0.405253   0.000000  9.345086e+00   
W1                                0.417215   0.000000  9.785819e+00   
W1-95R                            0.424174   0.000000  1.026902e+01   
M-95L                             1.252847   1.212462  4.503208e+01   
M                                 1.327205   1.327205  4.784373e+01   
M-95R                             1.453645   1.467073  4.991505e+01   
N_Par                          1720.000000   0.000000  2.000000e+03   
Train_Time                       12.701303   0.176438  1.619795e+09   
Test_Time/MC-Oracle_Test_Time     1.582523   1.000000  9.366185e-03   

                                  KRidge          GBRF         DNN        GPR  \
W1-95L                          9.577519  9.483166e+00    1.070440  10.775148   
W1                              9.942966  9.826340e+00    1.121222  11.139447   
W1-95R     

| Metric | DNM | MC-Oracle | ENET | KRidge | GBRF | DNN | GPR | DGN |
|---|---|---|---|---|---|---|---|---|
| W1-95L | 0.405253 | 0.0 | 9.345086 | 9.577519 | 9.483166 | 1.07044 | 10.775148 | 1.144341 |
| W1 | 0.417215 | 0.0 | 9.785819 | 9.942966 | 9.82634 | 1.121222 | 11.139447 | 1.192211 |
| W1-95R | 0.424174 | 0.0 | 10.26902 | 10.257549 | 10.34169 | 1.17345 | 11.489751 | 1.226343 |
| M-95L | 1.252847 | 1.212462 | 45.03208 | 45.517167 | 45.93144 | 5.016318 | 45.92124 | 24.889407 |
| M | 1.327205 | 1.327205 | 47.84373 | 47.991556 | 48.24811 | 5.277003 | 48.429675 | 26.448704 |
| M-95R | 1.453645 | 1.467073 | 49.91505 | 49.836028 | 49.72675 | 5.576289 | 50.94202 | 27.926624 |
| N_Par | 1720.0 | 0.0 | 2000.0 | 0.0 | 1655076.0 | 125.0 | 0.0 | 400.0 |
| Train_Time | 12.701303 | 0.176438 | 1619795000.0 | 1.497953 | 3.726702 | 5.373271 | 2.054518 | 3.771988 |
| Test_Time/MC-Oracle_Test_Time | 1.582523 | 1.0 | 0.009366185 | 0.025855 | 0.1598643 | 1.029629 | 0.048564 | 1.07024 |


In [15]:
print("Prediction Quality (Updated): Train")
print(Summary_pred_Qual_models)
Summary_pred_Qual_models

Prediction Quality (Updated): Train
                                       DNM  MC-Oracle          ENET  \
W1-95L                            0.389920   0.000000  9.021460e+00   
W1                                0.395162   0.000000  9.223720e+00   
W1-95R                            0.400397   0.000000  9.396189e+00   
M-95L                             1.152075   1.177709  4.366266e+01   
M                                 1.204449   1.204449  4.447347e+01   
M-95R                             1.234901   1.236398  4.535899e+01   
N_Par                          1720.000000   0.000000  2.000000e+03   
Train_Time                       12.701303   0.176438  1.619795e+09   
Test_Time/MC-Oracle_Test_Time     1.582523   1.000000  9.366185e-03   

                                  KRidge          GBRF         DNN        GPR  \
W1-95L                          9.092377  9.048603e+00    0.989376   9.840945   
W1                              9.298649  9.180425e+00    1.016542  10.002771   
W1-95R    

| Metric | DNM | MC-Oracle | ENET | KRidge | GBRF | DNN | GPR | DGN |
|---|---|---|---|---|---|---|---|---|
| W1-95L | 0.38992 | 0.0 | 9.02146 | 9.092377 | 9.048603 | 0.989376 | 9.840945 | 1.049405 |
| W1 | 0.395162 | 0.0 | 9.22372 | 9.298649 | 9.180425 | 1.016542 | 10.002771 | 1.06731 |
| W1-95R | 0.400397 | 0.0 | 9.396189 | 9.462457 | 9.406257 | 1.037143 | 10.200879 | 1.086586 |
| M-95L | 1.152075 | 1.177709 | 43.66266 | 43.84938 | 43.71747 | 4.831776 | 43.825818 | 24.272519 |
| M | 1.204449 | 1.204449 | 44.47347 | 44.465894 | 44.47347 | 4.926057 | 44.473469 | 24.69329 |
| M-95R | 1.234901 | 1.236398 | 45.35899 | 44.829841 | 45.16915 | 4.987833 | 45.222007 | 25.061308 |
| N_Par | 1720.0 | 0.0 | 2000.0 | 0.0 | 1655076.0 | 125.0 | 0.0 | 400.0 |
| Train_Time | 12.701303 | 0.176438 | 1619795000.0 | 1.497953 | 3.726702 | 5.373271 | 2.054518 | 3.771988 |
| Test_Time/MC-Oracle_Test_Time | 1.582523 | 1.0 | 0.009366185 | 0.025855 | 0.1598643 | 1.029629 | 0.048564 | 1.07024 |


## 3) *The Natural Universal Benchmark*: [Bishop's Mixture Density Network](https://publications.aston.ac.uk/id/eprint/373/1/NCRG_94_004.pdf)

The implementation is as follows:
- For every $x$ in the training data-set we fit a GMM $\hat{\nu}_x$, using the [Expectation-Maximization (EM) algorithm](https://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm), with the same number of centers as the deep neural model in $\mathcal{NN}_{1_{\mathbb{R}^d},\mathcal{D}}^{\sigma:\star}$ which we are evaluating.
- A mixture density network is then trained to predict the inferred parameters, given any $x \in \mathbb{R}^d$.
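
The first step above can be sketched in plain Python for one-dimensional outputs. The function below fits a Gaussian mixture to the Monte-Carlo samples attached to a single input $x$ using EM with a deterministic quantile-based initialization. It is an illustrative stand-in under these assumptions: `fit_gmm_1d`, its initialization, and its variance floor are hypothetical, and the repository's `Mixture_Density_Network.py` may rely on a library implementation instead.

```python
import math
import random


def fit_gmm_1d(samples, n_centers, n_iter=50, min_var=1e-6):
    """Fit a 1-D Gaussian mixture by EM; returns (weights, means, variances)."""
    n = len(samples)
    ordered = sorted(samples)
    # Initialize the means at evenly spaced order statistics of the data.
    means = [ordered[(2 * k + 1) * n // (2 * n_centers)] for k in range(n_centers)]
    mean_all = sum(samples) / n
    var_all = max(sum((y - mean_all) ** 2 for y in samples) / n, min_var)
    variances = [var_all] * n_centers
    weights = [1.0 / n_centers] * n_centers
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each sample.
        resp = []
        for y in samples:
            dens = [w * math.exp(-0.5 * (y - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component.
        for k in range(n_centers):
            mass = sum(r[k] for r in resp)
            if mass < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = mass / n
            means[k] = sum(r[k] * y for r, y in zip(resp, samples)) / mass
            variances[k] = max(
                sum(r[k] * (y - means[k]) ** 2 for r, y in zip(resp, samples)) / mass,
                min_var,
            )
    return weights, means, variances


# Toy Monte-Carlo samples for one input x: two well-separated clusters.
rng = random.Random(0)
toy_samples = [rng.gauss(-5.0, 0.5) for _ in range(100)] + \
              [rng.gauss(5.0, 0.5) for _ in range(100)]
gmm_weights, gmm_means, gmm_variances = fit_gmm_1d(toy_samples, n_centers=2)
```

The fitted `(weights, means, variances)` triple for each training input would then serve as the regression target of the mixture density network in the second step.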

In [16]:
if output_dim == 1:
    # %run Mixture_Density_Network.ipynb
    exec(open('Mixture_Density_Network.py').read())

## Get Final Outputs
Now we piece together all the numerical experiments and report a nice summary.

# Result(s)

## Prediction Quality

#### Training

In [17]:
print("Final Training-Set Result(s)")
Summary_pred_Qual_models

Final Training-Set Result(s)


| Metric | DNM | MC-Oracle | ENET | KRidge | GBRF | DNN | GPR | DGN |
|---|---|---|---|---|---|---|---|---|
| W1-95L | 0.38992 | 0.0 | 9.02146 | 9.092377 | 9.048603 | 0.989376 | 9.840945 | 1.049405 |
| W1 | 0.395162 | 0.0 | 9.22372 | 9.298649 | 9.180425 | 1.016542 | 10.002771 | 1.06731 |
| W1-95R | 0.400397 | 0.0 | 9.396189 | 9.462457 | 9.406257 | 1.037143 | 10.200879 | 1.086586 |
| M-95L | 1.152075 | 1.177709 | 43.66266 | 43.84938 | 43.71747 | 4.831776 | 43.825818 | 24.272519 |
| M | 1.204449 | 1.204449 | 44.47347 | 44.465894 | 44.47347 | 4.926057 | 44.473469 | 24.69329 |
| M-95R | 1.234901 | 1.236398 | 45.35899 | 44.829841 | 45.16915 | 4.987833 | 45.222007 | 25.061308 |
| N_Par | 1720.0 | 0.0 | 2000.0 | 0.0 | 1655076.0 | 125.0 | 0.0 | 400.0 |
| Train_Time | 12.701303 | 0.176438 | 1619795000.0 | 1.497953 | 3.726702 | 5.373271 | 2.054518 | 3.771988 |
| Test_Time/MC-Oracle_Test_Time | 1.582523 | 1.0 | 0.009366185 | 0.025855 | 0.1598643 | 1.029629 | 0.048564 | 1.07024 |


#### Test

In [18]:
print("Final Test-Set Result(s)")
Summary_pred_Qual_models_test

Final Test-Set Result(s)


| Metric | DNM | MC-Oracle | ENET | KRidge | GBRF | DNN | GPR | DGN |
|---|---|---|---|---|---|---|---|---|
| W1-95L | 0.405253 | 0.0 | 9.345086 | 9.577519 | 9.483166 | 1.07044 | 10.775148 | 1.144341 |
| W1 | 0.417215 | 0.0 | 9.785819 | 9.942966 | 9.82634 | 1.121222 | 11.139447 | 1.192211 |
| W1-95R | 0.424174 | 0.0 | 10.26902 | 10.257549 | 10.34169 | 1.17345 | 11.489751 | 1.226343 |
| M-95L | 1.252847 | 1.212462 | 45.03208 | 45.517167 | 45.93144 | 5.016318 | 45.92124 | 24.889407 |
| M | 1.327205 | 1.327205 | 47.84373 | 47.991556 | 48.24811 | 5.277003 | 48.429675 | 26.448704 |
| M-95R | 1.453645 | 1.467073 | 49.91505 | 49.836028 | 49.72675 | 5.576289 | 50.94202 | 27.926624 |
| N_Par | 1720.0 | 0.0 | 2000.0 | 0.0 | 1655076.0 | 125.0 | 0.0 | 400.0 |
| Train_Time | 12.701303 | 0.176438 | 1619795000.0 | 1.497953 | 3.726702 | 5.373271 | 2.054518 | 3.771988 |
| Test_Time/MC-Oracle_Test_Time | 1.582523 | 1.0 | 0.009366185 | 0.025855 | 0.1598643 | 1.029629 | 0.048564 | 1.07024 |


# For Terminal Runner(s):

In [19]:
# For Terminal Running
print("============================")
print("Training Predictive Quality:")
print("============================")
print(Summary_pred_Qual_models)
print(" ")
print(" ")
print(" ")
print("===========================")
print("Testing Predictive Quality:")
print("===========================")
print(Summary_pred_Qual_models_test)
print("================================")
print(" ")
print(" ")
print(" ")
print("Kernel_Used_in_GPR: "+str(GPR_trash.kernel))
print("🙃🙃 Have a wonderful day! 🙃🙃")

Training Predictive Quality:
                                       DNM  MC-Oracle          ENET  \
W1-95L                            0.389920   0.000000  9.021460e+00   
W1                                0.395162   0.000000  9.223720e+00   
W1-95R                            0.400397   0.000000  9.396189e+00   
M-95L                             1.152075   1.177709  4.366266e+01   
M                                 1.204449   1.204449  4.447347e+01   
M-95R                             1.234901   1.236398  4.535899e+01   
N_Par                          1720.000000   0.000000  2.000000e+03   
Train_Time                       12.701303   0.176438  1.619795e+09   
Test_Time/MC-Oracle_Test_Time     1.582523   1.000000  9.366185e-03   

                                  KRidge          GBRF         DNN        GPR  \
W1-95L                          9.092377  9.048603e+00    0.989376   9.840945   
W1                              9.298649  9.180425e+00    1.016542  10.002771   
W1-95R           

---
# Fin
---

---