# Deep Weak Stochastic Processes
---

## Meta-Parameters

### Simulation

In [1]:
## Monte-Carlo
N_Euler_Maruyama_Steps = 5
N_Monte_Carlo_Samples = 10000

## Grid
N_Grid_Finess = 1000
Max_Grid = 2

### Quantization

In [2]:
N_Quantizer_support = 5
N_Quantizers_to_parameterize = 0

**Note**: Setting *N_Quantizers_to_parameterize* to 0 disables barycenter computation and sub-sampling; each sampled measure is then used directly as a center.

# Training Algorithm
---
Given a set of training inputs $\mathbb{X}$ and a stochastic process $(X_t)_{t\geq 0}$ which we can sample from:
1. **For:** x in $\mathbb{X}$:
    - *Simulate:* $\{x\mapsto X_T(\omega_n)\}_{n=1}^N$
    - *Set*: $\hat{\nu}_{x,T}\triangleq \frac1{N}\sum_{n=1}^N \delta_{X_T(\omega_n)}$
2. **Learn:** Wasserstein Barycenters $\hat{\mu}_1,\dots,\hat{\mu}_N
    \in \underset{\hat{\mu}_n\in\mathscr{P}_{N}(\mathbb{R}^d)}{\operatorname{argmin}}
    \, \sum_{n=1}^N W_1(\hat{\mu}_n,\hat{\nu}_{x,T})$
3. **Train Classifier:** $\hat{f}:x\mapsto \underset{n\leq N}{\operatorname{argmin}}\, W_1(\hat{\mu}_n,\hat{\nu}_{x,T})$
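
As a rough illustration of step 3 in one dimension, the sketch below labels an empirical measure with the index of its nearest center. It assumes equal-size, uniformly weighted samples, for which $W_1$ reduces to the mean absolute difference of the sorted samples. The helper names `w1_empirical` and `nearest_center` and the Gaussian stand-in data are illustrative and are not part of the notebook.

```python
import numpy as np

def w1_empirical(u, v):
    """W_1 between two equal-size, uniformly weighted 1-D samples."""
    u, v = np.sort(np.ravel(u)), np.sort(np.ravel(v))
    if u.shape != v.shape:
        raise ValueError("this shortcut needs samples of equal size")
    # In 1-D the optimal coupling matches sorted samples in order
    return float(np.mean(np.abs(u - v)))

def nearest_center(sample, centers):
    """Index n minimizing W_1(centers[n], sample)."""
    distances = [w1_empirical(c, sample) for c in centers]
    return int(np.argmin(distances))

# Stand-in data: three candidate centers and one sample to classify
rng = np.random.default_rng(0)
centers = [rng.normal(loc=m, scale=1.0, size=500) for m in (-2.0, 0.0, 2.0)]
sample = rng.normal(loc=1.8, scale=1.0, size=500)
label = nearest_center(sample, centers)
```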

#### Mode: Code-Testing Parameter(s)

In [3]:
trial_run = True

### Meta-parameters

In [4]:
# Test-size Ratio
test_size_ratio = .3

### Hyperparameters

Only turn this on if running the code directly here; typically this script is called by other notebooks.

In [5]:
# Set input/output paths
results_path = "./outputs/models/"
results_tables_path = "./outputs/results/"
raw_data_path_folder = "./inputs/raw/"
data_path_folder = "./inputs/data/"

### Import

In [6]:
# Load Packages/Modules
exec(open('Init_Dump.py').read())
# Load Hyper-parameter Grid
exec(open('Grid_Enhanced_Network.py').read())
# Load Helper Function(s)
exec(open('Helper_Functions.py').read())
# Import time separately
import time

Using TensorFlow backend.


Deep Feature Builder - Ready
Deep Classifier - Ready


### Set Seed

In [7]:
random.seed(2021)
np.random.seed(2021)
tf.random.set_seed(2021)

---

### Simulate Path
$d X_t = \alpha(t,x)dt + \beta(t,x)dW_t ;\qquad X_0 =x$

### Drift

In [8]:
def alpha(t,x):
    return np.sin(math.pi*t)

### Volatility

In [9]:
def beta(t,x):
    return (t+1)**.5

## Initialize Grid
This is $\mathbb{X}$ and it represents the grid of initial states.

In [10]:
# Get Input Data
x_Grid = np.arange(start=-Max_Grid,
                   stop=Max_Grid,
                   step=(2*Max_Grid/N_Grid_Finess))

# Get Number of Instances in Grid
N_Grid_Instances = len(x_Grid)

# Update User
print("Grid Instances: ", N_Grid_Instances)

Grid Instances:  1000
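
`np.arange` with a non-integer step can return one element more or fewer than expected because of floating-point rounding. A minimal alternative sketch, assuming the same half-open interval from `-Max_Grid` to `Max_Grid` is intended, fixes the number of points explicitly with `np.linspace`. The name `x_grid_alt` is hypothetical.

```python
import numpy as np

Max_Grid = 2
N_Grid_Finess = 1000

# Same spacing 2*Max_Grid/N_Grid_Finess, right endpoint excluded,
# but the number of grid points is fixed by construction
x_grid_alt = np.linspace(-Max_Grid, Max_Grid, num=N_Grid_Finess, endpoint=False)
print("Grid Instances: ", len(x_grid_alt))
```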


### Path Generator

Generates the empirical measure $\frac1{N}\sum_{n=1}^N \delta_{X_T(\omega_n)}$ of $X_T$ conditional on $X_0=x_0\in \mathbb{R}$ *($x_0$ and $T>0$ are user-provided)*.

In [11]:
def Euler_Maruyama_Generator(x_0,
                             N_Euler_Maruyama_Steps = N_Euler_Maruyama_Steps,
                             N_Monte_Carlo_Samples = N_Monte_Carlo_Samples,
                             T = 1): 
    
    #----------------------------#    
    # DEFINE INTERNAL PARAMETERS #
    #----------------------------#
    # Initialize Empirical Measure
    X_T_Empirical = np.zeros(N_Monte_Carlo_Samples)


    # Internal Initialization(s)
    ## Initialize current state
    n_sample = 0
    ## Initialize Increments
    dt = T/N_Euler_Maruyama_Steps
    sqrt_dt = np.sqrt(dt)

    #-----------------------------#    
    # Generate Monte-Carlo Sample #
    #-----------------------------#
    while n_sample < N_Monte_Carlo_Samples:
        # Reset Step Counter (start at 0 so that all N_Euler_Maruyama_Steps increments are taken)
        t = 0
        # Initialize Current State 
        X_current = x_0
        # Perform Euler-Maruyama Simulation
        while t<N_Euler_Maruyama_Steps:
            # Update Internal Parameters
            ## Get Current Time
            t_current = t*(T/N_Euler_Maruyama_Steps)

            # Update Generated Path
            X_current = X_current + alpha(t_current,X_current)*dt + beta(t_current,X_current)*np.random.normal(0,sqrt_dt)

            # Update Counter (EM)
            t = t+1

        # Update Empirical Measure
        X_T_Empirical[n_sample] = X_current

        # Update Counter (MC)
        n_sample = n_sample + 1

    return X_T_Empirical#.reshape(1,-1)
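
The nested `while` loops can be slow for large sample counts. A vectorized sketch of the same Euler-Maruyama scheme, which advances all Monte-Carlo samples at once, is given below. It assumes the drift and volatility defined earlier, takes `n_steps` increments on the time grid $t_k = k\,T/n_{\text{steps}}$ for $k = 0,\dots,n_{\text{steps}}-1$, and uses a local random generator. The function name `euler_maruyama_vectorized` and its arguments are illustrative, not part of the notebook.

```python
import math
import numpy as np

def alpha(t, x):
    # Drift, as defined earlier in the notebook
    return np.sin(math.pi * t)

def beta(t, x):
    # Volatility, as defined earlier in the notebook
    return (t + 1) ** 0.5

def euler_maruyama_vectorized(x_0, n_steps=5, n_samples=10000, T=1.0, rng=None):
    """Return n_samples draws approximating X_T given X_0 = x_0."""
    rng = np.random.default_rng() if rng is None else rng
    dt = T / n_steps
    sqrt_dt = np.sqrt(dt)
    # All sample paths start from the same initial state
    X = np.full(n_samples, float(x_0))
    for k in range(n_steps):
        t_k = k * dt  # left endpoint of the k-th time step
        dW = rng.normal(0.0, sqrt_dt, size=n_samples)  # Brownian increments
        X = X + alpha(t_k, X) * dt + beta(t_k, X) * dW
    return X

samples = euler_maruyama_vectorized(0.5, rng=np.random.default_rng(2021))
```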

---

In [12]:
# Initialize List of Barycenters
Wasserstein_Barycenters = []
# Initialize Terminal-Time Empirical Measures
measures_locations_list = []
measures_weights_list = []
# Initialize (Empirical) Weight(s)
measure_weights = np.ones(N_Monte_Carlo_Samples)/N_Monte_Carlo_Samples
# Initialize Quantizer
Init_Quantizer_generic = np.ones(N_Quantizer_support)/N_Quantizer_support

## Generate $\{\hat{\nu}^{N}_{T,x}\}_{x \in \mathbb{X}}$ and Build Wasserstein Cover

#### Get Data

In [16]:
for i in range(N_Grid_Instances):
    # Get Terminal Distribution Shape
    # EM METHOD
#     measures_locations_loop = Euler_Maruyama_Generator(x_0=x_Grid[i])
    # DIRECT SAMPLING
    measures_locations_loop = np.random.lognormal(np.exp(x_Grid[i]), 0.01, N_Monte_Carlo_Samples).reshape(-1,)
    
    # Append to List
    measures_locations_list.append(measures_locations_loop.reshape(-1,1))
    measures_weights_list.append(measure_weights)
    
    # Print Update User #
    #-------------------#
    print("Current Monte-Carlo Step:",i/N_Grid_Instances)
    
# Update User
print("Done Simulation Step")

Current Monte-Carlo Step: 0.0
Current Monte-Carlo Step: 0.001
Current Monte-Carlo Step: 0.002
...
Current Monte-Carlo Step: 0.922
...
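
As a sanity check on the direct-sampling branch: a log-normal variable with log-scale parameters $m$ and $s$ has mean $\exp(m + s^2/2)$, so for an initial state $x$ the sampled measure should have mean close to $\exp(e^{x} + 0.01^2/2)$. The sketch below compares this value with a sample mean. It uses a local `np.random.default_rng` generator rather than the notebook's global seed, and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2021)
x = 0.0            # one initial state from the grid
sigma = 0.01       # log-scale standard deviation used in the loop above
n_samples = 10000

# Same parametrization as the direct-sampling line: mean of the log is exp(x)
samples = rng.lognormal(mean=np.exp(x), sigma=sigma, size=n_samples)
theoretical_mean = np.exp(np.exp(x) + sigma**2 / 2)
print(samples.mean(), theoretical_mean)
```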

#### Get Cover

In [None]:
# Initialization(s)
## Initialize remaining part of f(X) to cover
measures_locations_list_covering = measures_locations_list
## Initialize Centers of Open Cover
Centers_Wasserstein_Open_balls = np.array([])
## Initialize counter
current_counter_measures = 0

# Update User
print(len(measures_locations_list_covering))

if N_Quantizers_to_parameterize  > 0:
    # Build Cover
    while len(measures_locations_list_covering)>N_Quantizers_to_parameterize:
        # 1) Get Barycenter
        #----------------------------------------------------------------------------------------------------#
        # Get Barycenter
        Wasserstein_barycenter_current = ot.lp.free_support_barycenter(measures_locations_list_covering, 
                                                                       measures_weights_list, 
                                                                       Init_Quantizer_generic.reshape(-1,1), 
                                                                       Init_Quantizer_generic)

        # 2) Parse Data (Determine which data is closest to current barycenter)
        #----------------------------------------------------------------------------------------------------#
        # Initialize Dissimilarity Matrix
        Dissimilarity_matrix_ot = np.zeros(N_Grid_Instances)

        # Compute Dissimilarity Matrix
        for i in range(N_Grid_Instances):
            Dissimilarity_matrix_ot[i] = ot.emd2_1d(Wasserstein_barycenter_current,
                                                    measures_locations_list_covering[i])

    #         # Update User (Periodically)
    #         if i % 50 == 0:
    #             print("Disimilarity Matrix",i,"From Step:",measures_locations_list_covering/N_Quantizers_to_parameterize)


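        # Decide which data remain: drop the N_Quantizers_to_parameterize measures closest to the current barycenter
        # (argsort().argsort() gives each measure's rank; argsort() alone gives sorted indices)
        Boolean_List_Filter = Dissimilarity_matrix_ot.argsort().argsort()>=N_Quantizers_to_parameterize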


        #---------------------------------------------------------------------------------------------------------------#
        # Update Remaining Samples
        measures_locations_list_covering = list(compress(measures_locations_list_covering, Boolean_List_Filter))

        # Update Size of Grid
        N_Grid_Instances = len(measures_locations_list_covering)

        # Update Collection of Barycenters
        if current_counter_measures == 0:
            Centers_Wasserstein_Open_balls = Wasserstein_barycenter_current
        else:
            Centers_Wasserstein_Open_balls = np.append(Centers_Wasserstein_Open_balls,Wasserstein_barycenter_current,1)


        # 3) Update Counters
        #---------------------------------------------------------------------------------------------------------------#
        current_counter_measures = current_counter_measures + 1


        # Update User #
        #-------------#
        print("Current Step:", current_counter_measures/N_Quantizers_to_parameterize)

        
else:
    for i in range(len(measures_locations_list_covering)):
        # Update Collection of Barycenters
        if current_counter_measures == 0:
            Centers_Wasserstein_Open_balls = measures_locations_list_covering[i]
        else:
            Centers_Wasserstein_Open_balls = np.append(Centers_Wasserstein_Open_balls,measures_locations_list_covering[i],1)
        # 3) Update Counters
        #---------------------------------------------------------------------------------------------------------------#
        current_counter_measures = current_counter_measures + 1

        
#---------------------------------------------------------------------------------------------------------------#
## Get number of centers produced
N_centers_produced = Centers_Wasserstein_Open_balls.shape[1]
# Update User
print(N_centers_produced,"Centers were produced to cover sampled grid's image!🙃🙃")

2000
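
When `N_Quantizers_to_parameterize` is 0, the loop above stacks each sampled measure as one column of `Centers_Wasserstein_Open_balls`. A sketch of the same stacking in a single call, shown on small stand-in data (the sizes and the name `centers` are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_measures = 6, 4
# Each measure is a column vector of atoms, like the entries of measures_locations_list
measures = [rng.normal(size=(n_samples, 1)) for _ in range(n_measures)]

# One column per measure: shape (n_samples, n_measures)
centers = np.hstack(measures)
print(centers.shape)
```

Repeated `np.append` copies the growing array at every iteration, so a single `np.hstack` is usually cheaper for large grids.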


#### Build Classes
Next we identify the index of the $\{\hat{\mu}_n\}$ *(built in the last step)* which is closest to each input datum $x \in \mathbb{X}$.

In [None]:
# Initialize Classes (In-Sample)
Classifer_Wasserstein_Centers = np.zeros([N_Grid_Finess,N_centers_produced])
# Classifer_Wasserstein_Centers = np.zeros(N_Grid_Finess)

# Build Classes
for x_index_current in range(N_Grid_Finess):
    # (RE-)Initialize current distance vector
    Distance_Vector_loop = np.zeros(N_centers_produced)
    # Get Distances
    for i in range(N_centers_produced):
        Distance_Vector_loop[i] = ot.emd2_1d(measures_locations_list[x_index_current],
                                             Centers_Wasserstein_Open_balls[:,i].reshape(-1,1))

    # Get Classes (Boolean Values corresponding to which barycenter is closest)
    Classifer_Wasserstein_Centers[x_index_current,] = np.min(Distance_Vector_loop)==Distance_Vector_loop
#     Classifer_Wasserstein_Centers[x_index_current,] = np.argmin(Distance_Vector_loop)
    
# Convert to Integer Type
Classifer_Wasserstein_Centers = Classifer_Wasserstein_Centers.astype(int)
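
The labels built here are one-hot rows marking the closest center. A compact sketch of the same encoding from a precomputed distance matrix (rows are inputs, columns are centers) is shown below. The matrix values and the names `distances`, `closest` and `one_hot` are made up for illustration. Note that `np.argmin` keeps only the first minimizer on ties, whereas comparing against the minimum flags every tied entry.

```python
import numpy as np

# Hypothetical distances: distances[i, n] = distance from input i to center n
distances = np.array([[0.3, 0.1, 0.7],
                      [0.2, 0.9, 0.4],
                      [0.5, 0.6, 0.05]])

# Index of the closest center for each input
closest = np.argmin(distances, axis=1)
# One-hot encode by selecting rows of the identity matrix
one_hot = np.eye(distances.shape[1], dtype=int)[closest]
```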

---

### Train Classifier

#### Deep Classifier
Prepare Labels/Classes

In [22]:
# Time-Elapsed Training Deep Classifier
Type_A_timer_Begin = time.time()

Re-Load Grid and Redefine Relevant Input/Output dimensions in dictionary.

In [23]:
# Re-Load Hyper-parameter Grid
exec(open('Grid_Enhanced_Network.py').read())
# Re-Load Helper Function(s)
exec(open('Helper_Functions.py').read())

# Redefine (Dimension-related) Elements of Grid
param_grid_Deep_Classifier['input_dim'] = [1]
param_grid_Deep_Classifier['output_dim'] = [N_centers_produced]

Deep Feature Builder - Ready
Deep Classifier - Ready


#### Train Deep Classifier

In [24]:
# Train simple deep classifier
predicted_classes_train, predicted_classes_test, N_params_deep_classifier = build_simple_deep_classifier(
    n_folds = CV_folds,
    n_jobs = n_jobs,
    n_iter = n_iter,
    param_grid_in = param_grid_Deep_Classifier,
    X_train = x_Grid,
    y_train = Classifer_Wasserstein_Centers,
    X_test = x_Grid)

Fitting 2 folds for each of 1 candidates, totalling 2 fits


[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    5.9s remaining:    0.0s
[Parallel(n_jobs=4)]: Done   2 out of   2 | elapsed:    5.9s finished


Epoch 1/100
Epoch 2/100
...
Epoch 100/100


In [25]:
# Time-Elapsed Training Deep Classifier
Type_A_timer_End = time.time() - Type_A_timer_Begin

#### Get Predicted Quantized Distributions

In [28]:
Centers_Wasserstein_Open_balls.shape

(10000, 1)

In [26]:
Predictions_Train = np.matmul(predicted_classes_train,Centers_Wasserstein_Open_balls.T)
Predictions_Test = np.matmul(predicted_classes_test,Centers_Wasserstein_Open_balls.T)

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 1000)
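
The product in this cell needs the class axis of the predictions to match the number of columns of `Centers_Wasserstein_Open_balls`. A small sketch with stand-in shapes (all sizes and names below are assumptions for illustration): one-hot predictions of shape `(n_inputs, n_centers)` multiplied by the transposed centers of shape `(n_centers, n_samples)` return, for each input, the atoms of its predicted center.

```python
import numpy as np

rng = np.random.default_rng(0)
n_inputs, n_centers, n_samples = 5, 3, 8

# Columns are centers, rows are atoms (same layout as the centers matrix)
centers = rng.normal(size=(n_samples, n_centers))
# Hypothetical predicted class index for each input, one-hot encoded
predicted = np.array([0, 2, 1, 1, 0])
predicted_one_hot = np.eye(n_centers)[predicted]

# Check the shared dimension before multiplying
assert predicted_one_hot.shape[1] == centers.shape[1]
predictions = np.matmul(predicted_one_hot, centers.T)
print(predictions.shape)
```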

#### Write Predictions

Compute Performance

In [None]:
# Initialize Wasserstein-1 Error Distribution
W1_errors = np.array([])
Mean_errors = np.array([])

# Get Predicted Means
predicted_means = Predictions_Train.mean(axis=1)
#---------------------------------------------------------------------------------------------#

# Populate Error Distribution
for x_i in range(len(measures_locations_list)):
    # Get Laws
    W1_errors = np.append(W1_errors,ot.emd2_1d(Predictions_Train[x_i,].reshape(-1,1),
                                               measures_locations_list[x_i]))
    # Get Means
    Mean_errors = np.append(Mean_errors,predicted_means[x_i]-np.mean(measures_locations_list[x_i]))
    
#---------------------------------------------------------------------------------------------#
# Compute Error Statistics/Descriptors
W1_Performance = np.array([np.mean(np.abs(W1_errors)),np.mean(W1_errors**2)])
Mean_prediction_Performance = np.array([np.mean(np.abs(Mean_errors)),np.mean(Mean_errors**2)])

Type_A_Prediction = pd.DataFrame({"W1":W1_Performance,"EX":Mean_prediction_Performance},index=["MAE","MSE"])

# Write Performance
Type_A_Prediction.to_latex((results_tables_path+"Type_A_Prediction.tex"))


#---------------------------------------------------------------------------------------------#
# Update User
print(Type_A_Prediction)
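
For reference, the two summary statistics in the table are the mean absolute error and the mean squared error of the per-input errors. A minimal sketch on made-up error values (the numbers and the helper name `summarize_errors` are illustrative only), collecting errors in a list before converting to an array:

```python
import numpy as np

def summarize_errors(errors):
    """Return [MAE, MSE] for a sequence of per-input errors."""
    errors = np.asarray(errors, dtype=float)
    return np.array([np.mean(np.abs(errors)), np.mean(errors**2)])

# Accumulate one error per input, then summarize once at the end
errors = []
for value in (1.0, -2.0, 3.0):
    errors.append(value)

mae, mse = summarize_errors(errors)
```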

In [None]:
avg = 0
for i in range(len(measures_locations_list)):
    avg = avg + np.mean(measures_locations_list[i])
avg = avg/len(measures_locations_list)
# Update User
avg


---
# Fin
---