
# Bayesian deep learning

---



Bayesian deep learning combines Bayesian approaches with deep learning so that models can express uncertainty about their own predictions. The main idea is that a model's output carries inherent uncertainty, which should be reported alongside the prediction itself. A simple trick to turn a regular deep network into a Bayesian one is to keep dropout active at prediction time and make multiple predictions for the same input.

To demonstrate this, 20 random values between -5 and 5 are used as X values, and the sine of each value is the corresponding Y value.

In [None]:
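# Draw 20 random X values uniformly from [-5, 5]
X <- runif(20) * 10 - 5
# The targets are the sine of each X value
Y <- sin(X)
# Inspect inputs and targets side by side
Z <- rbind(X, Y)
Z
length(X)
length(Y)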

The neural network is relatively straightforward. However, Keras does not allow a dropout layer to be the first layer of a model, so a Dense layer that simply passes the input value through is added in front of it.

In [None]:
#install.packages('keras', repos='http://cran.rstudio.com/')
library(keras)
#install.packages('tensorflow', repos='http://cran.rstudio.com/')
library(tensorflow)
#install.packages('tidyverse', repos='http://cran.rstudio.com/')
library(tidyverse)
model <- keras_model_sequential()
model %>%
  # Pass-through Dense layer so that dropout is not the first layer
  layer_dense(units = 1, input_shape = c(1)) %>%
  layer_dropout(rate = 0.1) %>%
  layer_dense(units = 20, activation = 'relu') %>%
  layer_dropout(rate = 0.1) %>%
  layer_dense(units = 20, activation = 'relu') %>%
  layer_dropout(rate = 0.1) %>%
  layer_dense(units = 20, activation = 'sigmoid') %>%
  # Single linear output unit for the regression target
  layer_dense(units = 1)

Only a relatively low learning rate is needed to fit this function, so the vanilla Keras stochastic gradient descent optimizer is used, which allows the learning rate to be set explicitly. The model is trained for 10,000 epochs.

In [None]:
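model %>% compile(
  # Plain SGD with an explicit learning rate
  # (newer keras releases name this argument learning_rate instead of lr)
  optimizer = optimizer_sgd(lr = 0.1),
  loss = 'mean_squared_error'
)
model %>% fit(
  x = X,
  y = Y,
  epochs = 10000,
  batch_size = 10,
  # Silent training unless the keras.fit_verbose option says otherwise
  verbose = getOption("keras.fit_verbose", default = 0)
)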

To test the model over a larger range of values, a test data set with 201 values ranging from -10 to 10 in steps of 0.1 is created.

In [None]:
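#install.packages('listarrays', repos='http://cran.rstudio.com/')
library(listarrays)
# Evenly spaced test inputs from -10 to 10
X_Test <- seq(-10, 10, 0.1)
# Add a trailing axis so each test value is a one-feature sample
X_Test <- expand_dims(X_Test, -1)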
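As a quick check of the grid size, the following base-R sketch counts the points produced by this kind of sequence and shows the shape after adding a trailing axis. The names `x_grid` and `x_grid_2d` are hypothetical, and `matrix(..., ncol = 1)` is only a base-R stand-in for the `expand_dims` call used in the notebook.

```r
# Hypothetical names; base R only, no keras or listarrays required
x_grid <- seq(-10, 10, by = 0.1)
length(x_grid)      # both endpoints are included, so there are 201 values

# One row per sample, one column per feature
x_grid_2d <- matrix(x_grid, ncol = 1)
dim(x_grid_2d)
```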

The Keras backend passes settings to TensorFlow, which runs the operations in the background. Here the backend is used to set the learning phase to 1, so that TensorFlow behaves as if it were training and keeps dropout active. Then, 100 predictions for the test data are made. The result is a distribution of predicted y values for every value of x.
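To see why repeated predictions with active dropout yield a distribution rather than a single value, the idea can be sketched in base R on a toy network. This is a minimal illustration under stated assumptions, not the Keras model from this notebook: the weights `W1`, `b1`, `W2` are random and untrained, and the helper `forward_with_dropout` is hypothetical.

```r
# Toy illustration only: a hand-written one-hidden-layer network with fixed,
# untrained random weights (assumption), not the Keras model above.
set.seed(42)

n_hidden <- 20
W1 <- matrix(rnorm(n_hidden), nrow = 1)   # 1 x 20 input-to-hidden weights
b1 <- rnorm(n_hidden)                     # hidden biases
W2 <- matrix(rnorm(n_hidden), ncol = 1)   # 20 x 1 hidden-to-output weights

# One stochastic forward pass: each hidden unit is dropped with probability
# `rate`, and the surviving units are rescaled by 1 / (1 - rate).
forward_with_dropout <- function(x, rate = 0.1) {
  h <- pmax(sweep(x %*% W1, 2, b1, "+"), 0)   # ReLU activations, n x 20
  mask <- matrix(rbinom(length(h), 1, 1 - rate), nrow = nrow(h))
  as.vector((h * mask / (1 - rate)) %*% W2)
}

x_grid <- matrix(seq(-10, 10, by = 0.1), ncol = 1)

# 100 stochastic passes: one row per x value, one column per pass
passes <- replicate(100, forward_with_dropout(x_grid))

# Per-x summary of the resulting distribution
pred_mean <- rowMeans(passes)
pred_sd <- apply(passes, 1, stats::sd)

dim(passes)
head(cbind(x = x_grid[, 1], mean = pred_mean, sd = pred_sd))
```

Because a different random mask is drawn on every pass, each x value ends up with 100 slightly different outputs, and their mean and standard deviation summarize the prediction and its spread.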

The process consists of four steps:

1. Set the learning phase to training mode:

In [None]:
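#install.packages('tensorflow', repos='http://cran.rstudio.com/')
library(tensorflow)
k_clear_session()
# Learning phase 1 = training mode, which keeps dropout active.
# Note: this backend function is deprecated in recent TensorFlow 2.x releases;
# if predict() shows no variation there, calling the model directly with
# model(X_Test, training = TRUE) is the usual alternative.
k_set_learning_phase(1)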

2. Obtain the distributions with the following code:

In [None]:
# Collect 100 stochastic predictions for the test grid in one flat vector
probs <- c()
for (i in 1:100){
  out <- predict(model, X_Test)
  probs <- append(probs, out)
}

3. Calculate the mean and the standard deviation for the distributions:


In [None]:
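# Reshape the flat vector: one row per x value, one column per prediction pass
p <- matrix(probs, nrow = length(X_Test), ncol = 100)
# Per-x mean and standard deviation across the 100 passes
mean <- rowMeans(p)
sd <- apply(p, 1, stats::sd)
head(mean)
head(sd)
dim(p)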

4. Plot the model's mean prediction with shaded bands of ±0.5, ±1, and ±2 standard deviations, that is, bands that are one, two, and four standard deviations wide:

In [None]:
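#install.packages('ggplot2', repos='http://cran.rstudio.com/')
library(ggplot2)
# Test inputs together with the prediction mean and standard deviation
plot_df <- data.frame(x = as.vector(X_Test), mean = mean, sd = sd)
ggplot(plot_df, aes(x = x, y = mean)) +
  geom_line(color = 'blue') +
  geom_ribbon(aes(ymin = mean - sd * 0.5, ymax = mean + sd * 0.5), alpha = 0.25, fill = 'blue') +
  geom_ribbon(aes(ymin = mean - sd, ymax = mean + sd), alpha = 0.25, fill = 'blue') +
  geom_ribbon(aes(ymin = mean - sd * 2, ymax = mean + sd * 2), alpha = 0.25, fill = 'blue') +
  # Overlay the 20 training points from their own data frame
  geom_point(data = data.frame(X, Y), aes(x = X, y = Y), color = 'black', inherit.aes = FALSE)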



The graph shows that the model is confident around the areas where data exists and less confident the further it gets from the data points. 
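The per-x mean and standard deviation from step 3 depend on how the flat vector of predictions is reshaped, so it can help to check the layout on a small synthetic example. The sketch below uses made-up numbers and hypothetical names (`flat`, `p_demo`) in place of real model output: three passes over four x values are appended one after another, exactly as in the prediction loop.

```r
# Synthetic stand-in for the flat prediction vector: 3 passes over 4 x values
n_x <- 4
n_passes <- 3
flat <- c()
for (i in 1:n_passes) {
  out <- (1:n_x) * 10 + i   # made-up "prediction" for pass i
  flat <- append(flat, out)
}

# R fills matrices column by column, so each pass lands in its own column
# and each x value in its own row
p_demo <- matrix(flat, nrow = n_x, ncol = n_passes)
p_demo

rowMeans(p_demo)              # per-x mean across passes: 12 22 32 42
apply(p_demo, 1, stats::sd)   # per-x standard deviation across passes: 1 1 1 1
```

Summarizing along rows gives one mean and one standard deviation per x value, which is what the ribbon plot needs; a single overall mean would hide how the uncertainty changes along the x axis.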

# Summary

This chapter provided a broad range of tools for dealing with time series data, insights into how one-dimensional convolutions and recurrent architectures work, and a simple way to get a model to express uncertainty.

Recap of the topics covered in this chapter:

*   Basic data exploration
*   Fourier transformation and autocorrelation
*   Median forecasting as baseline and sanity check
*   Classic prediction models, ARIMA and Kalman filters
*   Feature design including data loading mechanisms
*   One-dimensional convolutions and variants
*   RNNs and LSTMs
*   Modeling uncertainty through Bayesian deep learning 




