# Deep Weak Stochastic Processes
---

## Meta-Parameters

### Simulation

In [36]:
## Monte-Carlo
N_Euler_Maruyama_Steps = 2
N_Monte_Carlo_Samples = 10**4

## Grid
N_Grid_Finess = 10**3
Max_Grid = 2

### Quantization
*This hyperparameter describes the proportion of the data used as sample-barycenters.*

In [37]:
Quantization_Proportion = 0.5

**Note**: Setting *N_Quantizers_to_parameterize* prevents any barycenters and sub-sampling.

# Training Algorithm
---
Given a set of training inputs $\mathbb{X}$ and a stochastic process $(X_t)_{t\geq 0}$ which we can sample from:
1. **For:** x in $\mathbb{X}$:
    - *Simulate:* $\{x\mapsto X_T(\omega_n)\}_{n=1}^N$
    - *Set*: $\hat{\nu}_{x,T}\triangleq \frac1{N}\sum_{n=1}^N \delta_{X_T(\omega_n)}$
2. **Learn:** Wasserstein Barycenters $\hat{\mu}_1,\dots,\hat{\mu}_N
    \in \underset{{\hat{\mu}_n\in\mathscr{P}_{N}(\mathbb{R}^d)}}{\operatorname{argmin}}
    \, \sum_{n=1}^N W_1(\hat{\mu_n},\hat{\nu}_{x,T})$
3. **Train Classifier:** $\hat{f}:x\mapsto \operatorname{n\leq N}\, W_1(\hat{\mu_n},\hat{\nu}_{x,T})$

#### Mode: Code-Testin Parameter(s)

In [38]:
trial_run = True

### Meta-parameters

In [39]:
# Test-size Ratio
test_size_ratio = .3

### Hyperparameters

Only turn of if running code directly here, typically this script should be run be called by other notebooks.  

In [40]:
# load dataset
results_path = "./outputs/models/"
results_tables_path = "./outputs/results/"
raw_data_path_folder = "./inputs/raw/"
data_path_folder = "./inputs/data/"

### Import

In [41]:
# Load Packages/Modules
exec(open('Init_Dump.py').read())
# Load Hyper-parameter Grid
exec(open('Grid_Enhanced_Network.py').read())
# Load Helper Function(s)
exec(open('Helper_Functions.py').read())
# Import time separately
import time

Deep Feature Builder - Ready
Deep Classifier - Ready


## Get Internal (Hyper)-Parameter(s)
*Initialize the hyperparameters which are fully-specified by the user-provided hyperparameter(s).*

In [74]:
# Get Internal (Counting) Parameters
N_Quantizers_to_parameterize = round(Quantization_Proportion*N_Grid_Finess)
N_Elements_Per_Cluster = int(round(N_Grid_Instances/N_Quantizers_to_parameterize))

# Update User
print("\u2022",N_Quantizers_to_parameterize," Centers will be produced; from a total datasize of: ",N_Grid_Finess,
      "!  (That's ",Quantization_Proportion,
      " percent).")
print("\u2022 Each Wasserstein-1 Ball should contain: ",
      N_Elements_Per_Cluster, 
      "elements from the training set.")

• 500  Centers will be produced; from a total datasize of:  1000 !  (That's  0.5  percent).
• Each Wasserstein-1 Ball should contain:  2 elements from the training set.


### Set Seed

In [75]:
random.seed(2021)
np.random.seed(2021)
tf.random.set_seed(2021)

---

### Simulate Path
$d X_t = \alpha(t,x)dt + \beta(t,x)dW_t ;\qquad X_0 =x$

### Drift

In [76]:
def alpha(t,x):
    return np.sin(math.pi*t)

### Volatility

In [77]:
def beta(t,x):
    return (t+1)**.5

## Initialize Grid
This is $\mathbb{X}$ and it represents the grid of initial states.

In [78]:
# Get Input Data
x_Grid = np.arange(start=-Max_Grid,
                   stop=Max_Grid,
                   step=(2*Max_Grid/N_Grid_Finess))

# Get Number of Instances in Grid
N_Grid_Instances = len(x_Grid)

# Updater User
print("Grid Instances: ", N_Grid_Instances)

Grid Instances:  1000


---

In [79]:
# Initialize List of Barycenters
Wasserstein_Barycenters = []
# Initialize Terminal-Time Empirical Measures
measures_locations_list = []
measures_weights_list = []
# Initialize (Empirical) Weight(s)
measure_weights = np.ones(N_Monte_Carlo_Samples)/N_Monte_Carlo_Samples
# Initialize Quantizer
Init_Quantizer_generic = np.ones(N_Monte_Carlo_Samples)/N_Monte_Carlo_Samples

## Generate $\{\hat{\nu}^{N}_{T,x}\}_{x \in \mathbb{X}}$ Build Wasserstein Cover

#### Get Data

In [80]:
for i in tqdm(range(N_Grid_Instances)):
    # Get Terminal Distribution Shape
    ###
    # DIRECT SAMPLING
    measures_locations_loop = np.random.normal(x_Grid[i],np.abs(x_Grid[i]), N_Monte_Carlo_Samples).reshape(-1,)
    
    # Append to List
    measures_locations_list.append(measures_locations_loop.reshape(-1,1))
    measures_weights_list.append(measure_weights)
    
    # Print Update User #
    #-------------------#
    if (i/N_Grid_Instances)*100 % 10 ==0:
        print("Current Monte-Carlo Step:",i/N_Grid_Instances)
    
# Update User
print("Done Simulation Step")

100%|██████████| 1000/1000 [00:00<00:00, 51281.38it/s]

Current Monte-Carlo Step: 0.0
Current Monte-Carlo Step: 0.1
Current Monte-Carlo Step: 0.2
Current Monte-Carlo Step: 0.3
Current Monte-Carlo Step: 0.4
Current Monte-Carlo Step: 0.5
Current Monte-Carlo Step: 0.6
Current Monte-Carlo Step: 0.7
Current Monte-Carlo Step: 0.8
Current Monte-Carlo Step: 0.9
Done Simulation Step





#### Get Cover

## Get "Sample Barycenters":
Let $\{\mu_n\}_{n=1}^N\subset\mathcal{P}_1(\mathbb{R}^d)$.  Then, the *sample barycenter* is defined by:
1. $\mathcal{M}^{(0)}\triangleq \left\{\hat{\mu}_n\right\}_{n=1}^N$,
2. For $1\leq n\leq \mbox{N sample barycenters}$: 
    - $
\mu^{\star}\in \underset{\tilde{\mu}\in \mathcal{M}^{(n)}}{\operatorname{argmin}}\, \sum_{n=1}^N \mathcal{W}_1\left(\mu^{\star},\mu_n\right),
$
    - $\mathcal{M}^{(n)}\triangleq \mathcal{M}^{(n-1)} - \{\mu^{\star}\},$
*i.e., the closest generated measure form the random sample to all other elements of the random sample.*

---
**Note:** *We simplify the computational burden of getting the correct classes by putting this right into this next loop.*

---

## Build Dissimilarity (Distance) Matrix

In [81]:
# Initialize Disimilarity Matrix
Dissimilarity_matrix_ot = np.zeros([N_Grid_Instances,N_Grid_Instances])


# Update User
print("\U0001F61A"," Begin Building Distance Matrix"," \U0001F61A")
# Build Disimilarity Matrix
for i in tqdm(range(N_Grid_Instances)):
    for j in range(N_Grid_Instances):
        Dissimilarity_matrix_ot[i,j] = ot.emd2_1d(measures_locations_list[j],
                                                  measures_locations_list[i])
# Update User
print("\U0001F600"," Done Building Distance Matrix","\U0001F600","!")

  0%|          | 1/1000 [00:00<02:00,  8.28it/s]

😚  Begin Building Distance Matrix  😚


100%|██████████| 1000/1000 [01:33<00:00, 10.67it/s]

😀  Done Building Distance Matrix 😀 !





## Initialize Quantities to Loop Over

In [82]:
# Initialize Locations Matrix (Internal to Loop)
measures_locations_list_current = copy.copy(measures_locations_list)
Dissimilarity_matrix_ot_current = Dissimilarity_matrix_ot
Barycenters_sample = []

# Initialize Classes (In-Sample)
Classifer_Wasserstein_Centers = np.zeros([N_Grid_Instances,N_Quantizers_to_parameterize])

## Get "Sample Barycenters" and Generate Classes

In [83]:
# Update User
print("\U0001F61A"," Begin Identifying Sample Barycenters"," \U0001F61A")

# Identify Sample Barycenters
for i in tqdm(range(N_Quantizers_to_parameterize)):    
    # Distance-Based Sorting
    ## Get Distances from training data
    Distances_Loop = Dissimilarity_matrix_ot_current.sum(axis=1)

    ## Get Barycenter
    Barycenter_index = Distances_Loop.argsort()[:1][0]
    measures_locations_list_current[Barycenter_index]
    ## Identify Cluster for this barycenter (which elements are closest to it)
    Cluster_indices = Distances_Loop.argsort()[:N_Elements_Per_Cluster]

    # Updates
    ## Get Barycenter 
    Barycenter_loop = measures_locations_list_current[Barycenter_index]
    ## Update Barycenters List
    Barycenters_sample.append(Barycenter_loop)
    ## Update Barycenters Array
    if i == 0:
        # Initialize Barycenters Array
        Barycenters_Array = Barycenter_loop
    else:
        # Populate Barycenters Array
        Barycenters_Array = np.append(Barycenters_Array,Barycenter_loop,axis=-1)

    ## Update Samples List
    ### Remove from Pairwise Distance Matrix
    Dissimilarity_matrix_ot_current = np.delete(Dissimilarity_matrix_ot_current,Cluster_indices, axis=1)
    Dissimilarity_matrix_ot_current = np.delete(Dissimilarity_matrix_ot_current,Cluster_indices, axis=0)
    ### Remove from Sample Measures
    for index in sorted(Cluster_indices, reverse=True):
        del measures_locations_list_current[index]

    # Update Classes
    Classifer_Wasserstein_Centers[Cluster_indices,i] = 1


# Update User
print("\U0001F600"," Done Identifying Sample Barycenters","\U0001F600","!")

  3%|▎         | 13/500 [00:00<00:03, 124.85it/s]

😚  Begin Identifying Sample Barycenters  😚


100%|██████████| 500/500 [00:01<00:00, 452.52it/s]

😀  Done Identifying Sample Barycenters 😀 !





---

### Train Classifier

In this step, we train a deep (feed-forward) classifier:
$$
\hat{f}\triangleq \operatorname{Softmax}_N\circ W_J\circ \sigma \bullet \dots \sigma \bullet W_1,
$$
to identify which barycenter we are closest to.

#### Deep Classifier
Prepare Labels/Classes

In [84]:
# Time-Elapsed Training Deep Classifier
Type_A_timer_Begin = time.time()

Re-Load Grid and Redefine Relevant Input/Output dimensions in dictionary.

In [85]:
# Re-Load Hyper-parameter Grid
exec(open('Grid_Enhanced_Network.py').read())
# Re-Load Helper Function(s)
exec(open('Helper_Functions.py').read())

# Redefine (Dimension-related) Elements of Grid
param_grid_Deep_Classifier['input_dim'] = [1]
param_grid_Deep_Classifier['output_dim'] = [N_Quantizers_to_parameterize]

Deep Feature Builder - Ready
Deep Classifier - Ready


#### Train Deep Classifier

In [86]:
# Train simple deep classifier
predicted_classes_train, predicted_classes_test, N_params_deep_classifier = build_simple_deep_classifier(n_folds = CV_folds, 
                                                                                                        n_jobs = n_jobs, 
                                                                                                        n_iter =n_iter, 
                                                                                                        param_grid_in=param_grid_Deep_Classifier, 
                                                                                                        X_train = x_Grid, 
                                                                                                        y_train = Classifer_Wasserstein_Centers,
                                                                                                        X_test = x_Grid)

Fitting 2 folds for each of 1 candidates, totalling 2 fits


[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    9.9s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    9.9s finished


Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
Epoch 75/100
Epoch 76/100
Epoch 77/100
Epoch 78

Epoch 82/100
Epoch 83/100
Epoch 84/100
Epoch 85/100
Epoch 86/100
Epoch 87/100
Epoch 88/100
Epoch 89/100
Epoch 90/100
Epoch 91/100
Epoch 92/100
Epoch 93/100
Epoch 94/100
Epoch 95/100
Epoch 96/100
Epoch 97/100
Epoch 98/100
Epoch 99/100
Epoch 100/100


In [87]:
# Time-Elapsed Training Deep Classifier
Type_A_timer_End = time.time() - Type_A_timer_Begin

#### Get Predicted Quantized Distributions

In [88]:
Predictions_Train = np.matmul(predicted_classes_train,Barycenters_Array.T)
Predictions_Test = np.matmul(predicted_classes_test,Barycenters_Array.T)

#### Write Predictions

Compute Performance

In [89]:
# Initialize Wasserstein-1 Error Distribution
W1_errors = np.array([])
Mean_errors = np.array([])

# Get Predicted Means
predicted_means = Predictions_Train.mean(axis=1)
#---------------------------------------------------------------------------------------------#

# Populate Error Distribution
for x_i in range(len(measures_locations_list)):
    # Get Laws
    W1_errors = np.append(W1_errors,ot.emd2_1d(Predictions_Train[x_i,].reshape(-1,1),
                                               measures_locations_list[x_i]))
    # Get Means
    Mean_errors = np.array(predicted_means[x_i]-np.mean(measures_locations_list[x_i]))
    
#---------------------------------------------------------------------------------------------#
# Compute Error Statistics/Descriptors
W1_Performance = np.array([np.mean(np.abs(W1_errors)),np.mean(W1_errors**2)])
Mean_prediction_Performance = np.array([np.mean(np.abs(Mean_errors)),np.mean(Mean_errors**2)])

Type_A_Prediction = pd.DataFrame({"W1":W1_Performance,"EX":Mean_prediction_Performance},index=["MAE","MSE"])

# Write Performance
Type_A_Prediction.to_latex((results_tables_path+"Type_A_Prediction.tex"))


#---------------------------------------------------------------------------------------------#
# Update User
print(Type_A_Prediction)

           W1        EX
MAE  1.786451  1.574779
MSE  6.695309  2.479928


---

---
# Fin
---

---