<a href="https://colab.research.google.com/github/maudl3116/higherOrderKME/blob/main/examples/Higher_order_DR.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Optimal stopping time
This notebook accompanies Sec. 4.2 of the [paper](https://arxiv.org/pdf/2109.03582.pdf).
***

First, we install the `higherOrderKME` package.

In [None]:
!pip install git+https://github.com/maudl3116/higherOrderKME.git

We also clone the repository to access the data simulator for this experiment.

In [None]:
!git clone https://github.com/maudl3116/higherOrderKME.git

In [1]:
%cd higherOrderKME

/content/higherOrderKME


Finally, we install the Python libraries required for this experiment.

In [None]:
!pip install -r examples/requirements.txt

Now we generate some data

In [2]:
import pandas as pd
import pickle

In [3]:
%cd data/options_utils

/content/higherOrderKME/data/options_utils


**Generate new input-output pairs** (optional: the cells below are commented out; uncomment them to regenerate the data)

In [None]:
# !pip install --no-cache-dir -e .

We start by generating the prices (output)

In [None]:
# !python optimal_stopping/run/run_algo.py --configs=config1_prices --nb_jobs=2;

Then we generate the sample paths (input)

In [None]:
# !python optimal_stopping/run/generate_paths.py --configs=config1_paths --nb_jobs=2;

In [4]:
# df_ = pd.read_csv('output/metrics_draft/config1_prices.csv')
# df = df_[['hurst','price']]
# prices = df.groupby('hurst', as_index=False)['price'].mean()
# paths  = pickle.load(open('output/metrics_draft/config1_paths.obj','rb'))
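The commented cell above reduces the price table to one mean price per Hurst exponent. As a minimal sketch of that `groupby` step, here is the same call on a toy frame; the numbers are hypothetical and are not taken from `config1_prices.csv`.

```python
import pandas as pd

# Hypothetical values, for illustration only (not the experiment's data).
df = pd.DataFrame({
    "hurst": [0.1, 0.1, 0.5, 0.5],
    "price": [1.0, 3.0, 2.0, 4.0],
})

# One row per distinct Hurst exponent, holding the mean price over repeated runs.
# as_index=False keeps 'hurst' as a regular column instead of the index.
prices = df.groupby("hurst", as_index=False)["price"].mean()
print(prices)
```

Keeping `hurst` as a column means `prices['price'].to_numpy()` yields the regression targets in sorted Hurst order.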

**Alternatively load existing data**

In [4]:
# Use a context manager so the file handle is closed after loading
with open('../data_optimal_stopping.obj', 'rb') as f:
    data = pickle.load(f)
prices, paths = data['prices'], data['paths']

We move back to the repository root and import the models.

In [5]:
%cd ../../

/content/higherOrderKME


In [6]:
from higherOrderKME.KES import model
from higherOrderKME.DR_RBF import model as model_RBF
from higherOrderKME.DR_matern import model as model_mat

**Run RBF baseline**

In [7]:
scores, stdv, results = model_RBF(paths, prices['price'].to_numpy(), at=True, ll=None, 
                                 cv=3, mode='krr', NUM_TRIALS=3)

100%|██████████| 3/3 [03:54<00:00, 78.03s/it]


In [8]:
print('RBF', scores,stdv)

RBF 0.0010070803628937787 0.0007484485197370775
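The baseline above is fitted with `mode='krr'`, which we read as kernel ridge regression. The sketch below shows the generic technique on synthetic data with a plain RBF kernel between vectors. It is not the package's implementation: `rbf_gram`, `krr_fit` and `krr_predict` are hypothetical helper names, and the values of `gamma` and `ridge` are arbitrary.

```python
import numpy as np

def rbf_gram(X, Y, gamma=1.0):
    """Gram matrix k(x, y) = exp(-gamma * ||x - y||^2) between rows of X and Y."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def krr_fit(K_train, y_train, ridge=1e-3):
    """Solve (K + ridge * I) alpha = y for the dual coefficients alpha."""
    n = K_train.shape[0]
    return np.linalg.solve(K_train + ridge * np.eye(n), y_train)

def krr_predict(K_test_train, dual_coef):
    """Predict with the test-versus-train Gram matrix."""
    return K_test_train @ dual_coef

# Synthetic regression problem (assumption: stands in for paths and prices).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(60, 1))
y = np.sin(3.0 * X[:, 0])
X_train, X_test = X[:40], X[40:]
y_train, y_test = y[:40], y[40:]

K_train = rbf_gram(X_train, X_train, gamma=5.0)
alpha = krr_fit(K_train, y_train)
y_pred = krr_predict(rbf_gram(X_test, X_train, gamma=5.0), alpha)
mse = float(np.mean((y_pred - y_test) ** 2))
print(f"test MSE: {mse:.2e}")
```

Because the regressor only needs Gram matrices, swapping the RBF kernel for another kernel leaves `krr_fit` and `krr_predict` unchanged.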


**Run Matern baseline**

In [9]:
scores, stdv, results = model_mat(paths, prices['price'].to_numpy(), at=True, ll=None, 
                                 cv=3, mode='krr', NUM_TRIALS=3)

100%|██████████| 3/3 [04:42<00:00, 94.21s/it]


In [10]:
print('Matern', scores, stdv)

Matern 0.0027480169133565096 0.0030575108330338154


**Run K_S^1 baseline** 

In [11]:
alphas1 = [0.1, 1, 10]  # candidate values searched for MMD1
alphas2 = [1]           # placeholder: there is no alphas2 for MMD1
lambdas = [1]           # placeholder: there is no lambdas for MMD1

scores, stdv, results, _, _, _ = model(paths, prices['price'].to_numpy(),
                                      order=1, alphas1=alphas1, alphas2=alphas2, 
                                      lambdas=lambdas, at=True, ll=None, cv=3, 
                                      mode='krr', num_trials=3)

100%|██████████| 24/24 [01:18<00:00,  3.29s/it]
100%|██████████| 24/24 [01:09<00:00,  2.88s/it]
100%|██████████| 24/24 [01:09<00:00,  2.88s/it]
 33%|███▎      | 1/3 [00:01<00:02,  1.03s/it]

best scaling parameter (cv on the train set):  (1, 1, 1)
best mse score (cv on the train set):  0.002824595069719927


 67%|██████▋   | 2/3 [00:01<00:00,  1.51it/s]

best scaling parameter (cv on the train set):  (1, 1, 1)
best mse score (cv on the train set):  0.0012808826562819933


100%|██████████| 3/3 [00:01<00:00,  1.61it/s]

best scaling parameter (cv on the train set):  (1, 1, 1)
best mse score (cv on the train set):  0.0010517265074348801
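The "best scaling parameter" tuples in the log appear to be ordered as (`alphas1`, `alphas2`, `lambdas`); this ordering is inferred from the output, not confirmed by the package documentation. Under that assumption, the candidate grid for this cell can be enumerated as below. `toy_cv_mse` is a hypothetical stand-in for the cross-validated error, used only to show how a best tuple is selected.

```python
from itertools import product

alphas1 = [0.1, 1, 10]
alphas2 = [1]
lambdas = [1]

# Every combination of the three lists; singleton lists add no extra candidates.
grid = list(product(alphas1, alphas2, lambdas))
print(grid)

def toy_cv_mse(params):
    """Hypothetical score, smallest at (1, 1, 1); NOT the model's real CV error."""
    a1, a2, lam = params
    return (a1 - 1) ** 2 + (a2 - 1) ** 2 + (lam - 1) ** 2

# Keep the candidate with the lowest score.
best = min(grid, key=toy_cv_mse)
print("best scaling parameter:", best)
```

With singleton lists for `alphas2` and `lambdas`, the grid has only three candidates, one per value of `alphas1`.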





In [12]:
print('MMD1', scores, stdv)

MMD1 0.0010055297396127782 0.0004128193724958566


**Run K_S^2**

In [13]:
alphas1 = [1]
alphas2 = [1]
lambdas = [0.1,1,10]
scores, stdv, results, _, _, _ = model(paths, prices['price'].to_numpy(),
                                      order=2, alphas1=alphas1, alphas2=alphas2, 
                                      lambdas=lambdas, at=True, ll=None, cv=3, 
                                      mode='krr', num_trials=3)

100%|██████████| 24/24 [05:12<00:00, 13.04s/it]
100%|██████████| 24/24 [05:13<00:00, 13.04s/it]
100%|██████████| 24/24 [05:12<00:00, 13.04s/it]
 33%|███▎      | 1/3 [00:02<00:05,  2.66s/it]

best scaling parameter (cv on the train set):  (1, 1, 10)
best mse score (cv on the train set):  0.001870714650013027


 67%|██████▋   | 2/3 [00:03<00:01,  1.33s/it]

best scaling parameter (cv on the train set):  (1, 1, 10)
best mse score (cv on the train set):  0.0008579580018942065


100%|██████████| 3/3 [00:03<00:00,  1.17s/it]

best scaling parameter (cv on the train set):  (1, 1, 0.1)
best mse score (cv on the train set):  0.0022730477944135417





In [14]:
print('MMD2', scores, stdv)

MMD2 0.000531886298712851 0.00022121269197428498
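For a side-by-side comparison, the four printed results can be collected and ranked. The numbers below are copied from the cell outputs above; reading `scores` as a mean error over trials and `stdv` as its standard deviation is an assumption based on the variable names.

```python
# (scores, stdv) pairs copied verbatim from the printed outputs of this notebook.
results = {
    "RBF": (0.0010070803628937787, 0.0007484485197370775),
    "Matern": (0.0027480169133565096, 0.0030575108330338154),
    "MMD1": (0.0010055297396127782, 0.0004128193724958566),
    "MMD2": (0.000531886298712851, 0.00022121269197428498),
}

# Rank the models by score, lowest first.
ranked = sorted(results.items(), key=lambda item: item[1][0])
for name, (score, stdv) in ranked:
    print(f"{name:<8}{score:.2e} +/- {stdv:.2e}")

best_name = ranked[0][0]
```

In this run the second-order model `MMD2` has the lowest score, while `MMD1` and `RBF` are nearly tied.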
