<img src="http://hilpisch.com/tpq_logo.png" alt="The Python Quants" width="35%" align="right" border="0"><br>

# Machine Learning for Finance

## Applications

Dr Yves J Hilpisch | The Python Quants GmbH

<a href="http://tpq.io" target="_blank">http://tpq.io</a> | <a href="http://twitter.com/dyjh" target="_blank">@dyjh</a> | <a href="mailto:ai@tpq.io">ai@tpq.io</a>

## Regime Detection

Regime detection here is based on **unsupervised learning**: the algorithm sees only features, no labels.

Regimes could be, for example:

* low volatility, positive trend
* high volatility, positive trend
* low volatility, negative trend
* high volatility, negative trend

In [None]:
import numpy as np
import pandas as pd
from pylab import plt
plt.style.use('seaborn-v0_8')  # the 'seaborn' style was renamed in Matplotlib 3.6
%config InlineBackend.figure_format = 'svg'

In [None]:
url = 'https://certificate.tpq.io/mlfin.csv'

In [None]:
raw = pd.read_csv(url, index_col=0, parse_dates=True)

In [None]:
data = pd.DataFrame(raw['.SPX']).dropna()

In [None]:
data.info()

In [None]:
data.plot();

In [None]:
data['r'] = np.log(data['.SPX'] / data['.SPX'].shift(1))  # daily log returns

In [None]:
window = 40

In [None]:
data['m'] = data['r'].rolling(window).mean()  # rolling momentum/trend

In [None]:
data['v'] = data['r'].rolling(window).std()  # rolling volatility

In [None]:
data.head()

In [None]:
data.dropna(inplace=True)

In [None]:
data[['m', 'v']].plot();

In [None]:
data['m'] = data['m'] * 252
data['v'] = data['v'] * 252 ** 0.5
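
The two factors in the cell above follow the usual annualization convention: the mean log return scales linearly with the horizon, while the standard deviation scales with its square root if returns are assumed to be serially uncorrelated. The following is a minimal sketch of that arithmetic, assuming 252 trading days per year as in the notebook; the helper name `annualize` and the sample numbers are illustrative and not part of the original material.

```python
import math

TRADING_DAYS = 252  # assumed number of trading days per year


def annualize(daily_mean, daily_std, periods=TRADING_DAYS):
    """Scale a daily mean log return and a daily standard deviation to annual terms."""
    annual_mean = daily_mean * periods           # linear in time
    annual_vol = daily_std * math.sqrt(periods)  # square-root-of-time rule
    return annual_mean, annual_vol


# Illustrative inputs: 0.04% average daily log return, 1% daily volatility
ann_mean, ann_vol = annualize(0.0004, 0.01)
print(round(ann_mean, 4), round(ann_vol, 4))
```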

In [None]:
f = ['m', 'v']

In [None]:
data[f].plot.scatter(x='v', y='m');

In [None]:
data[f] = (data[f] - data[f].mean()) / data[f].std()
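
The cell above applies a z-score standardization so that trend and volatility enter the distance-based clustering on a comparable scale. A small sketch of the same transformation on a toy array (the numbers are made up for illustration) shows the result: mean zero and unit sample standard deviation. Note that pandas uses `ddof=1` by default for `.std()`, which the sketch mirrors explicitly in NumPy.

```python
import numpy as np

# Toy feature values, purely illustrative
x = np.array([0.02, 0.05, -0.01, 0.10, 0.04])

# ddof=1 mirrors the pandas default for .std()
z = (x - x.mean()) / x.std(ddof=1)

print(z.round(3))
print(abs(z.mean()) < 1e-12, abs(z.std(ddof=1) - 1) < 1e-12)
```

After this step a positive value means "above the sample average", not "above zero", which matters when the clusters are later read as regimes.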

In [None]:
data[f].plot.scatter(x='v', y='m');

In [None]:
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

In [None]:
model = KMeans(n_clusters=4)
# model = GaussianMixture(n_components=4)

In [None]:
model.fit(data[f])
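
To give an idea of what fitting a k-means model involves, here is a teaching sketch of Lloyd's algorithm in plain NumPy. It is not scikit-learn's implementation, which by default adds k-means++ initialization and several restarts; the function name `lloyd_kmeans` and the toy data are assumptions made for illustration.

```python
import numpy as np


def lloyd_kmeans(X, k, n_iter=100, seed=0):
    """Alternate between assigning points to the nearest center and re-averaging."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # New center = mean of assigned points (keep the old one if a cluster is empty)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Two tight, well-separated toy blobs
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=0.0, scale=0.1, size=(50, 2))
blob_b = rng.normal(loc=10.0, scale=0.1, size=(50, 2))
X = np.vstack([blob_a, blob_b])

labels, centers = lloyd_kmeans(X, k=2)
print(np.bincount(labels))
```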

In [None]:
model.predict(data[f])

In [None]:
data['p'] = model.predict(data[f])

In [None]:
data[f].plot.scatter(x='v', y='m', c=data['p'], cmap='coolwarm');

In [None]:
plt.scatter(x=data.index, y=data['.SPX'], c=data['p'],
            marker='.', cmap='coolwarm');
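
The cluster numbers in `p` are arbitrary, so mapping them to regimes such as "low volatility, positive trend" requires a look at the average feature values per cluster. The sketch below does this on a small, hand-made stand-in for the notebook's standardized features; the toy numbers and the helper `describe` are assumptions for illustration. Because the features are standardized, "positive" and "high" here mean above the sample average.

```python
import pandas as pd

# Toy stand-in for standardized features 'm', 'v' and cluster labels 'p'
toy = pd.DataFrame({
    'm': [1.2, 0.8, -1.1, -0.9, 0.9, -1.0],
    'v': [-0.7, -0.9, 1.3, 1.1, 1.0, -0.8],
    'p': [0, 0, 1, 1, 2, 3],
})

# Average trend and volatility per cluster
centers = toy.groupby('p')[['m', 'v']].mean()


def describe(row):
    """Turn the signs of a cluster's average features into a regime label."""
    trend = 'positive trend' if row['m'] > 0 else 'negative trend'
    vol = 'high volatility' if row['v'] > 0 else 'low volatility'
    return f'{vol}, {trend}'


regimes = centers.apply(describe, axis=1)
print(regimes)
```

With the notebook's own objects, the equivalent first step would be `data.groupby('p')[f].mean()`.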

## Black-Scholes-Merton

In [None]:
from itertools import product

In [None]:
from bsm import bsm_call_value

In [None]:
bsm_call_value(S0=100, K=105, T=1, r=0.05, sigma=0.2)
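
The `bsm` module itself is not shown in this notebook. As a rough stand-in, the following sketch implements the standard Black-Scholes-Merton formula for a European call on a non-dividend-paying asset, using only the standard library. The name `bsm_call_sketch` is hypothetical, and the actual `bsm_call_value` may differ in details.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bsm_call_sketch(S0, K, T, r, sigma):
    """European call value under Black-Scholes-Merton (no dividends)."""
    d1 = (log(S0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S0 * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)


value = bsm_call_sketch(S0=100, K=105, T=1, r=0.05, sigma=0.2)
print(round(value, 4))
```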

In [None]:
S0_ = np.linspace(80, 120, 3)
S0_

In [None]:
K_ = np.linspace(80, 120, 3)
K_

In [None]:
T_ = np.linspace(0.5, 1.5, 3)
T_

In [None]:
r_ = np.linspace(0.01, 0.05, 3)
r_

In [None]:
sigma_ = np.linspace(0.1, 0.3, 3)
sigma_

In [None]:
list(product(S0_, K_))[:8]
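
`product` returns the Cartesian product of its arguments, so five grids with three points each yield 3 ** 5 = 243 parameter combinations. A short sketch with the same grid definitions as above (the dictionary `grids` is only an illustrative container):

```python
from itertools import product

import numpy as np

# Same grids as in the cells above, collected in one place
grids = {
    'S0': np.linspace(80, 120, 3),
    'K': np.linspace(80, 120, 3),
    'T': np.linspace(0.5, 1.5, 3),
    'r': np.linspace(0.01, 0.05, 3),
    'sigma': np.linspace(0.1, 0.3, 3),
}

combos = list(product(*grids.values()))
print(len(combos))  # one tuple per parameter combination
```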

In [None]:
data = pd.DataFrame()

In [None]:
%%time
rows = []  # collect one dict per parameter combination
for S0, K, T, r, sigma in product(S0_, K_, T_, r_, sigma_):
    value = bsm_call_value(S0, K, T, r, sigma)
    rows.append({'S0': S0, 'K': K, 'T': T, 'r': r,
                 'sigma': sigma, 'value': value})
data = pd.DataFrame(rows)  # build the DataFrame once instead of concatenating per row

In [None]:
data.info()

In [None]:
data.head()

In [None]:
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

In [None]:
# MLPRegressor?

In [None]:
model = MLPRegressor(hidden_layer_sizes=[512, 512, 512],
                     max_iter=2000, learning_rate_init=0.0001,
                     activation='tanh')

In [None]:
f = list(data.columns[:5])
f

In [None]:
%time model.fit(data[f], data['value'])

In [None]:
data['estimate'] = model.predict(data[f])

In [None]:
data.head()

In [None]:
mean_squared_error(data['value'], data['estimate'])

In [None]:
data[['value', 'estimate']].plot();
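
The error above is measured on the same rows the model was fitted on, so it describes the in-sample fit rather than how well the network generalizes to unseen parameter combinations. One possible way to check the latter is a simple hold-out split. The sketch below uses plain NumPy; the helper names `holdout_indices` and `mse` as well as the 20% test share are assumptions for illustration.

```python
import numpy as np


def holdout_indices(n_rows, test_share=0.2, seed=0):
    """Shuffle row positions and split them into train and test parts."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_rows)
    n_test = int(round(n_rows * test_share))
    return shuffled[n_test:], shuffled[:n_test]


def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


# 243 rows, as in the parameter grid of this section
train_idx, test_idx = holdout_indices(243)
print(len(train_idx), len(test_idx))
```

With the notebook's objects, one would fit on `data.iloc[train_idx]` and compute the error on `data.iloc[test_idx]` only.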

In [None]:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'

In [None]:
from tensorflow import keras
from tensorflow.keras.layers import Dense  # import consistently from tensorflow.keras
from tensorflow.keras.models import Sequential

In [None]:
# MLPRegressor?

In [None]:
# MLPRegressor(hidden_layer_sizes=[64, 64], solver='adam',
#             activation='relu')

In [None]:
model = Sequential()
model.add(Dense(64, activation='relu', input_dim=len(f)))
model.add(Dense(64, activation='relu'))
model.add(Dense(1, activation='linear'))
model.compile(loss='mse', optimizer='adam')
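
To put the two network architectures of this section into perspective, it helps to count their trainable parameters: a dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. The helper below is a hypothetical convenience function, not part of scikit-learn or Keras.

```python
def dense_param_count(n_inputs, hidden_layer_sizes, n_outputs=1):
    """Number of weights and biases in a fully connected feed-forward network."""
    sizes = [n_inputs, *hidden_layer_sizes, n_outputs]
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(sizes[:-1], sizes[1:]))


small = dense_param_count(5, [64, 64])        # Keras model above
large = dense_param_count(5, [512, 512, 512])  # MLPRegressor configuration
print(small, large)
```

Both counts are far larger than the 243 rows of the parameter grid, so a close in-sample fit is to be expected and says little about accuracy between the grid points.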

In [None]:
%time model.fit(data[f], data['value'], epochs=1000, verbose=False)

In [None]:
data['estimate'] = model.predict(data[f]).flatten()  # Keras returns shape (n, 1)

In [None]:
data.head()

In [None]:
mean_squared_error(data['value'], data['estimate'])

In [None]:
data[['value', 'estimate']].plot();

In [None]:
d = np.array(((102.5, 107.5, 0.8, 0.015, 0.2),
              (102.5, 107.5, 0.8, 0.0175, 0.15)))

In [None]:
np.set_printoptions(suppress=True)

In [None]:
model.predict(d)

In [None]:
bsm_call_value(102.5, 107.5, 0.8, 0.015, 0.2)

In [None]:
bsm_call_value(102.5, 107.5, 0.8, 0.0175, 0.15)

## Efficient Markets

Timmermann and Granger (2004):

> A market is efficient with respect to the information set, $S_t$, search technologies, $T_t$, and forecasting models, $M_t$, if it is impossible to make economic profits by trading on the basis of signals produced from a forecasting model in $M_t$ defined over predictor variables in the information set $S_t$ and selected using a search technology in $T_t$.
