<b> 
    <font size="7">
        Computational Finance and FinTech <br><br>
        M.Sc. International Finance
    </font>
</b>
<br><br>
<img src="pics/HWR.png" width=400px>
<br><br>
<b>
    <font size="5"> 
        Prof. Dr. Natalie Packham <br>
        Berlin School of Economics and Law <br>
        Summer Term 2025
    </font>
</b>

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Financial-Time-Series" data-toc-modified-id="Financial-Time-Series-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Financial Time Series</a></span><ul class="toc-item"><li><span><a href="#Financial-Data" data-toc-modified-id="Financial-Data-4.1"><span class="toc-item-num">4.1&nbsp;&nbsp;</span>Financial Data</a></span></li><li><span><a href="#Correlation-analysis-and-linear-regression" data-toc-modified-id="Correlation-analysis-and-linear-regression-4.2"><span class="toc-item-num">4.2&nbsp;&nbsp;</span>Correlation analysis and linear regression</a></span></li><li><span><a href="#Time-series-models:-Empirical-stylised-facts" data-toc-modified-id="Time-series-models:-Empirical-stylised-facts-4.3"><span class="toc-item-num">4.3&nbsp;&nbsp;</span>Time series models: Empirical stylised facts</a></span></li><li><span><a href="#Time-Series-Models:-GARCH" data-toc-modified-id="Time-Series-Models:-GARCH-4.4"><span class="toc-item-num">4.4&nbsp;&nbsp;</span>Time Series Models: GARCH</a></span></li></ul></li></ul></div>

# Financial Time Series

* Further reading: __Py4Fi, Chapter 8__
* This session also covers material not in __Py4Fi__.
* Time series are ubiquitous in finance. 
* `pandas` is the main library in Python to deal with time series. 

## Financial Data

### Financial data

* For the time being we work with locally stored data files.
* These are in `.csv`-files (comma-separated values), where the data entries in each row are separated by commas. 
* Some initialisation:

In [None]:
import numpy as np
import pandas as pd
import seaborn as sns
sns.set()
import matplotlib.pyplot as plt

### Data import
* `pandas` provides a number of different functions and `DataFrame` methods for importing and exporting data.
* Here we use `pd.read_csv()`.
* The file that we load contains end-of-day data for different financial instruments retrieved from Thomson Reuters. 

In [None]:
filename = './data/tr_eikon_eod_data.csv' # path and filename
with open(filename, 'r') as f:  # the context manager closes the file afterwards
    lines = f.readlines()[:5]
lines  # show first five lines

### Data import

In [None]:
data = pd.read_csv(filename,  # import csv-data into DataFrame
                   index_col=0, # take first column as index
                   parse_dates=True)  # index values are datetime

In [None]:
data.info()  # information about the DataFrame object

### Data import

In [None]:
data.head()  

### Data import

In [None]:
data.tail()  

### Data import

In [None]:
data.plot(figsize=(10, 10), subplots=True);  

### Data import

* The identifiers used by Thomson Reuters are so-called RICs (Reuters Instrument Codes). 
* The financial instruments in the data set are:

In [None]:
instruments = ['Apple Stock', 'Microsoft Stock',
               'Intel Stock', 'Amazon Stock', 'Goldman Sachs Stock',
               'SPDR S&P 500 ETF Trust', 'S&P 500 Index',
               'VIX Volatility Index', 'EUR/USD Exchange Rate',
               'Gold Price', 'VanEck Vectors Gold Miners ETF',
               'SPDR Gold Trust']

### Data import

In [None]:
for ric, name in zip(data.columns, instruments):
    print('{:8s} | {}'.format(ric, name))

### Summary statistics

In [None]:
data.describe().round(2)  

### Summary statistics
* The `aggregate()` method makes it possible to customise the statistics shown:

In [None]:
data.aggregate(['min',  # statistics are given by name, one output row each
                'mean',  
                'std',  
                'median',  
                'max']  
).round(2)

### Returns
* When working with financial data we almost always work with performance data, i.e., __returns__; you need good reasons to deviate from this. 
* Reasoning: 
     * Historical data are mainly used to make forecasts one or several time periods forward. 
     * The average daily stock price over the last eight years says little about tomorrow's stock price. 
     * The historical daily returns, however, can serve as possible scenarios for the next time period(s). 
* The function `pct_change()` calculates discrete returns: 
$$r_t^{\rm d}=\frac{S_{t}-S_{t-1}}{S_{t-1}},$$
     where $S_t$ denotes the stock price at time $t$. 
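
* As a minimal sketch of the definition above, the following cell compares `pct_change()` with the formula written out by hand. The price series is hypothetical and only serves as an illustration.

```python
import pandas as pd

# Hypothetical closing prices, for illustration only
prices = pd.Series([100.0, 102.0, 99.96, 104.958])

# Discrete returns via pandas
r_pandas = prices.pct_change()

# The same returns from the definition (S_t - S_{t-1}) / S_{t-1}
r_manual = (prices - prices.shift(1)) / prices.shift(1)

# The first entry is NaN in both cases: there is no previous price
print(pd.DataFrame({'pct_change': r_pandas, 'manual': r_manual}))
```
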

### Returns

In [None]:
data.pct_change().round(3).head()  

### Returns

In [None]:
data.pct_change().mean().plot(kind='bar', figsize=(10, 6));  

### Returns
* In finance, __log-returns__, also called __continuous returns__, are often preferred over discrete returns: 
$$r_t^{\rm c} = \ln\left(\frac{S_t}{S_{t-1}}\right).$$
* The main reason is that log-returns are additive over time. 
* For example, the log-return from $t-1$ to $t+1$ is the sum of the single-period log-returns: 
$$r_{t-1,t+1}^{\rm c} = \ln \left(\frac{S_{t+1}}{S_t}\right) + \ln \left(\frac{S_t}{S_{t-1}}\right) = \ln\left(\frac{S_{t+1}}{S_t}\cdot \frac{S_t}{S_{t-1}}\right) = \ln\left(\frac{S_{t+1}}{S_{t-1}}\right).$$
* Note: If the sampling (time) interval is small (e.g. one day or one week), then the difference between discrete returns and log-returns is negligible. 
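
* The two properties above (additivity over time, and closeness to discrete returns for small returns) can be checked numerically. This is a small sketch on hypothetical prices, not part of the data set used in this session.

```python
import numpy as np

prices = np.array([100.0, 101.0, 99.5, 102.0])  # hypothetical prices

log_rets = np.diff(np.log(prices))          # single-period log-returns
disc_rets = prices[1:] / prices[:-1] - 1.0  # single-period discrete returns

# Additivity: single-period log-returns sum to the log-return over the whole period
total_log_ret = np.log(prices[-1] / prices[0])
print(np.isclose(log_rets.sum(), total_log_ret))

# Discrete returns compound multiplicatively instead
total_disc_ret = np.prod(1.0 + disc_rets) - 1.0

# For small returns both definitions nearly coincide, since ln(1 + r) is close to r
max_gap = np.max(np.abs(log_rets - disc_rets))
print(max_gap)
```
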

### Returns

In [None]:
rets = np.log(data / data.shift(1))  # calculates log-returns in a vectorised way

In [None]:
rets.head().round(3)  

### Returns

In [None]:
rets.cumsum().apply(np.exp).plot(figsize=(10, 6));  # recover price paths from log-returns

### Resampling
* Down-sampling is achieved by `resample()`:

In [None]:
data.resample('1W', label='right').last().head()  # down-sample to weekly time intervals

## Correlation analysis and linear regression
* To further illustrate how to work with financial time series we consider the S&P 500 stock index and the VIX volatility index. 
* Empirical stylised fact: As the S&P 500 rises, the VIX falls, and vice versa. 
* Note: This is about __correlation__, not __causation__. 

### Correlation analysis

In [None]:
# EOD data from Thomson Reuters Eikon Data API
raw = pd.read_csv('./data/tr_eikon_eod_data.csv', index_col=0, parse_dates=True)
data = raw[['.SPX', '.VIX']].dropna()
data.tail()

### Correlation analysis

In [None]:
data.plot(subplots=True, figsize=(10, 6));

### Correlation analysis
* Transform both data series into log-returns:

In [None]:
rets = np.log(data / data.shift(1)) 
rets.head()

In [None]:
rets.dropna(inplace=True) # drop NaN (not-a-number) entries

### Correlation analysis

In [None]:
rets.plot(subplots=True, figsize=(10, 6));

### Correlation analysis

In [None]:
pd.plotting.scatter_matrix(rets,  
                           alpha=0.2,  
                           diagonal='hist',  
                           hist_kwds={'bins': 35},  
                           figsize=(10, 6));

### Correlation analysis

In [None]:
rets.corr()

### OLS regression
* __Linear regression__ captures the linear relationship between two variables. 
* For two variables $x,y$, we postulate a linear relationship: 
$$ y = \alpha + \beta x + \varepsilon, \quad \alpha, \beta\in \mathbb{R}.$$
* Here, $\alpha$ is the __intercept__, $\beta$ is the __slope (coefficient)__ and $\varepsilon$ is the __error term__. 
* Given a data sample of joint observations $(x_1,y_1), \ldots, (x_n,y_n)$, we set 
$$ y_i = \hat\alpha + \hat\beta x_i + \hat\varepsilon_i,$$
where $\hat\alpha$ and $\hat\beta$ are estimates of $\alpha,\beta$ and $\hat\varepsilon_1,
\ldots, \hat\varepsilon_n$ are the so-called __residuals__. 
* The __ordinary least squares (OLS)__ estimator $\hat\alpha,\hat\beta$ corresponds to those values of $\alpha,\beta$ that minimise the sum of squared residuals: 
$$\min_{\alpha,\beta} \sum_{i=1}^n \varepsilon_i^2 = \min_{\alpha,\beta} \sum_{i=1}^n (y_i-\alpha-\beta x_i)^2.$$
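
* For a single explanatory variable the minimisation above has the well-known closed-form solution $\hat\beta = \sum_i (x_i-\bar x)(y_i-\bar y) / \sum_i (x_i-\bar x)^2$ and $\hat\alpha = \bar y - \hat\beta \bar x$. The sketch below checks this against `np.polyfit` on synthetic data; the data-generating parameters are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
x = rng.normal(size=500)
y = 1.0 - 2.0 * x + rng.normal(scale=0.5, size=500)  # synthetic data: alpha = 1, beta = -2

# Closed-form OLS estimates for a single regressor
beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()

# np.polyfit minimises the same sum of squared residuals;
# it returns the coefficients with the highest degree first
slope, intercept = np.polyfit(x, y, deg=1)

print(alpha_hat, beta_hat)
print(intercept, slope)
```
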

### OLS regression
* Simplest form of OLS regression:

In [None]:
reg = np.polyfit(rets['.SPX'], rets['.VIX'], deg=1)  # fit a linear equation (a polynomial of degree 1)
reg  # the fitted parameters: [slope, intercept]

In [None]:
ax = rets.plot(kind='scatter', x='.SPX', y='.VIX', figsize=(8, 5)) 
ax.plot(rets['.SPX'], np.polyval(reg, rets['.SPX']), 'r', lw=2);

### OLS regression
* To do a more refined OLS regression with a proper analysis, use the package `statsmodels`. 

In [None]:
import statsmodels.api as sm

Y=rets['.VIX']
X=rets['.SPX']
X = sm.add_constant(X)

In [None]:
model = sm.OLS(Y,X)
results = model.fit()

In [None]:
results.params

In [None]:
results.predict()[0:10]

### OLS regression

In [None]:
print(results.summary())

### OLS regression: Interpretation of output and forecasting
* The column `coef` lists the coefficients of the regression: the coefficient in the row labelled `const` corresponds to $\hat\alpha$ ($=0.0026$) and the coefficient in the row `.SPX` denotes $\hat\beta$ ($=-6.6516$). 
* The estimated model in the example is thus: 
$$
\text{.VIX} = 0.0026 - 6.6516 \text{.SPX}. 
$$
* The best forecast of the VIX return when observing an S&P return of 2% is therefore $0.0026 - 6.6516\cdot 0.02 = -0.130432 = -13.0432\%$. 
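
* The forecast is a plain evaluation of the fitted line. The helper below is a hypothetical convenience function (its name is not part of any library) that uses the coefficients quoted in the text:

```python
alpha_hat = 0.0026   # intercept quoted in the text
beta_hat = -6.6516   # slope quoted in the text

def forecast_vix_return(spx_return):
    """Point forecast of the VIX log-return for a given S&P 500 log-return."""
    return alpha_hat + beta_hat * spx_return

# Forecasts for a few hypothetical S&P 500 returns
for spx_return in (-0.02, 0.0, 0.02):
    print(f"{spx_return:+.2%} -> {forecast_vix_return(spx_return):+.4%}")
```
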

### OLS regression: Validation ($R^2$)
* To __validate__ the model, i.e., to determine whether the model itself and the explanatory variable(s) make sense, we look at $R^2$ and various $p$-values (or confidence intervals or $t$-statistics). 
* $R^2$ measures the fraction of variance in the dependent variable $Y$ that is captured by the regression line; $1-R^2$ is the fraction of $Y$-variance that remains in the residuals $\hat\varepsilon_i$, $i=1,\ldots, n$. 
* In the output above $R^2$ is given as $0.647$. In other words, $64.7\%$ of the variance in VIX returns is "explained" by SPX returns. 
* A high $R^2$ (and this one is high) is necessary for making forecasts. 
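
* To make the definition concrete, $R^2$ can be computed directly from the residuals. The sketch below uses synthetic data (an assumption for illustration) and also checks that, in a regression with one explanatory variable, $R^2$ equals the squared correlation of $x$ and $y$.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
x = rng.normal(size=1000)
y = 0.5 * x + rng.normal(size=1000)  # synthetic data

slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (intercept + slope * x)

ss_res = np.sum(residuals ** 2)        # variation left in the residuals
ss_tot = np.sum((y - y.mean()) ** 2)   # total variation of y
r_squared = 1.0 - ss_res / ss_tot

corr = np.corrcoef(x, y)[0, 1]
print(r_squared, corr ** 2)
```
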

### OLS regression: Validation (confidence interval)
* An important hypothesis to test in any regression model is whether the explanatory variable(s) have an effect on the dependent variable. 
* This can be translated into testing whether $\beta\not=0$. ($\beta=0$ is the same as saying that the $X$ variable can be removed from the model.)
* Formally, we test the null hypothesis $H_0: \beta=0$ against the alternative hypothesis $H_1: \beta\not=0$. 
* There are several statistics to come to the same conclusion: confidence intervals, $t$-statistics and $p$-values. 
* The __confidence interval__ is an interval around the estimate $\hat\beta$ that we are confident contains the true parameter $\beta$. A typical __confidence level__ is 95%. 
* If the 95% confidence interval does __not__ contain 0, then we say $\beta$ is __statistically significant__ at the 5% (=1-95%) level, and we conclude that $\beta\not=0$. 

### OLS regression: Validation ($t$-statistic)
* The $t$-statistic corresponds to the __number of standard deviations__ that the estimated coefficient $\hat\beta$ is away from $0$ (the mean under $H_0$). 
* For a normal distribution, we have the following rules of thumb: 
    * $68\%$ of observations lie within one standard deviation of the mean
    * $95\%$ of observations lie within two standard deviations of the mean
    * $99.7\%$ of observations lie within three standard deviations of the mean  
<center>
<img src="pics/normal6.png" width=400px>
</center>
* If the sample size is large enough, then the $t$-statistic is approximately normally distributed, and if it is large (in absolute terms), then this is an indication against $\beta=0$. 
* In the example above, the $t$-statistic is -62.559, i.e., $\hat\beta$ is approx. 63 standard deviations away from zero, which is practically impossible if $\beta=0$. 
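
* The $t$-statistic and an approximate 95% confidence interval for the slope can be reproduced by hand. The sketch below uses synthetic data and the standard textbook formula for the standard error of $\hat\beta$ under homoskedastic errors; the factor 1.96 is the normal approximation for large samples. These choices are assumptions for illustration and not necessarily what a regression package reports in every configuration.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
n = 1000
x = rng.normal(size=n)
y = 0.3 * x + rng.normal(size=n)  # synthetic data, true beta = 0.3

slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (intercept + slope * x)

s2 = np.sum(residuals ** 2) / (n - 2)                  # residual variance estimate
se_slope = np.sqrt(s2 / np.sum((x - x.mean()) ** 2))   # standard error of the slope

t_stat = slope / se_slope            # number of standard errors away from zero
ci_low = slope - 1.96 * se_slope     # approximate 95% confidence interval
ci_high = slope + 1.96 * se_slope

print(t_stat, (ci_low, ci_high))
```
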
    

### OLS regression: Validation ($p$-value)
* The $p$-value expresses the probability of observing a coefficient estimate as extreme (away from zero) as $\hat\beta$ under $H_0$, i.e., when $\beta=0$. 
* In other words, it measures the probability of observing a $t$-statistic as extreme as the one observed if $\beta=0$. 
* If the $p$-value (column ``P>|t|``) is smaller than the desired level of significance (typically 5%), then the $H_0$ can be rejected and we conclude that $\beta\not=0$. 
* In the example above, the $p$-value is given as $0.000$, i.e., the estimated coefficient $\hat\beta$ is so extreme (= far away from zero) that it is virtually impossible to obtain such an estimate if $\beta=0$. 

* Finally, the $F$-test tests the hypotheses $H_0:R^2=0$ versus $H_1:R^2\not=0$. In a multiple regression with $k$ independent variables, this is equivalent to $H_0: \beta_1=\cdots=\beta_k=0$. 
* In the example above, the $p$-value of the $F$-test is $0$, so we conclude that the model overall has explanatory power. 
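
* Under the large-sample normal approximation, the two-sided $p$-value of a $t$-statistic is $P(|Z|>|t|)$ for a standard normal $Z$. The function below is a hypothetical helper that evaluates this with the complementary error function from the standard library:

```python
import math

def two_sided_p_value(t_stat):
    """Two-sided p-value under the large-sample normal approximation."""
    # P(|Z| > |t|) = erfc(|t| / sqrt(2)) for a standard normal Z
    return math.erfc(abs(t_stat) / math.sqrt(2.0))

print(two_sided_p_value(1.96))     # close to the 5% significance level
print(two_sided_p_value(-62.559))  # the t-statistic from the example above
```
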
    

## Time series models: Empirical stylised facts
* We discuss empirical stylised facts of financial time series. 
* The GARCH model is the standard workhorse in financial time series analysis.

### Time series models
* Load a data set of daily DAX closing prices (1990-2019):

In [None]:
dax = pd.read_csv('./data/yahoo_GDAXI.csv', index_col=0, parse_dates=True, na_values='null')  # parse the index as dates
dax.head()

### Time series models
* Transform closing prices to log returns:

In [None]:
data=dax['Close']
returns = 100*np.log(data / data.shift(1))
returns.dropna(inplace=True)
returns.head()

### Time series models

In [None]:
returns.plot(figsize=(15,6));

### Time series models
* When working with a data sample, we often assume that the data are independent and identically distributed ("iid"). 
* The previous plot indicates that the "iid" assumption is violated: calm and turbulent periods alternate, so the variability of returns is not constant over time.
* The "iid" assumption is in general not justified for financial data, and more sophisticated models for time series are more appropriate for capturing phenomena such as volatility clustering.

### Time series models
* An __empirical stylised fact__ of financial time series is an empirical observation that applies to the majority of (daily) series of asset returns, such as log-returns of equities, indexes, FX rates and commodity prices (see McNeil et al., 2005, and Cont, 2001).
* Generally accepted stylised facts of asset returns are: 
     1. Return series are not iid although they show little serial correlation.
     2. Series of absolute or squared returns show profound serial correlation.
     3. Conditional expected returns are close to zero.
     4. Volatility appears to vary over time.
     5. Return series are leptokurtic or heavy-tailed.
     6. Extreme returns appear in clusters.
<font size="2">  
  A.J. McNeil, R. Frey, and P. Embrechts. Quantitative Risk
  Management. Princeton University Press, Princeton, NJ, 2005.
    <br>
  R. Cont. Empirical properties of asset returns: stylized
  facts and statistical issues. Quantitative Finance,
  1(2):223–236, 2001. 
</font>


### Time series models
* Compute the autocorrelation of the returns and of the absolute returns for lags 0 to 29 (serial correlation = autocorrelation):

In [None]:
ac = [];
acabs=[];
for i in range(0,30):
    ac.append(returns.autocorr(i))
    acabs.append(abs(returns).autocorr(i))

### Time series models
* The figure below illustrates the first three stylised facts (serial correlation = autocorrelation):

In [None]:
fig = plt.figure(figsize=(12,3))
fig.suptitle("DAX autocorrelation. Left: returns; right: absolute returns")
plt.subplot(121)
plt.bar(range(0,30), ac);
plt.xlabel('days');
plt.ylabel('autocorrelation');
plt.subplot(122)
plt.bar(range(0,30), acabs);
plt.xlabel('days');
plt.ylabel('autocorrelation');

### Time series models
* The excess kurtosis of the DAX returns suggests that extreme events occur more often than under a normal distribution. 

In [None]:
returns.kurtosis()

### Time series models
* The following figure shows the DAX volatility based on a rolling time window of 252 trading days (approx. one year). 
* This illustrates that volatility varies over time. 

In [None]:
vol=returns.rolling(window=252).std()
vol.dropna(inplace=True)

In [None]:
vol.plot(figsize=(10,4));

### Time series models
* The following figure illustrates the 100 most extreme DAX returns over the time period 1990-2019.
* These are not evenly spaced, but appear in clusters.

In [None]:
m = abs(returns).sort_values().iloc[-100] # the 100th-largest absolute return (positional access via .iloc)
m

In [None]:
mreturns = returns.loc[abs(returns) >= m]  # >= keeps the threshold observation, giving the 100 most extreme returns

In [None]:
ret = pd.DataFrame(returns, index=returns.index)
mret = pd.DataFrame(mreturns, index=mreturns.index)
all = ret.join(mret, lsuffix='_caller', rsuffix='_other') # merge the data into one DataFrame

### Time series models

In [None]:
all.plot(figsize=(15,6), style=['', 'ro'], legend=None);

### Time series models
* These phenomena typically become less pronounced as the time period between successive returns is increased. 
* For daily or weekly data, however, it is clear that a model needs to capture the time series variations, most importantly the time-varying volatility. 


## Time Series Models: GARCH

### The GARCH model

* The class of **GARCH (generalised autoregressive conditional heteroskedastic) models** incorporates time-varying volatility, autocorrelation in absolute / squared returns and fatter tails than suggested by the normal distribution (see Bollerslev, 1986).
* The GARCH(1,1) is the simplest and most widely used of the family of GARCH-type models.
* A process $X=(X_t)_{t\in \mathbb{Z}}$ is a __GARCH(1,1) process__ if it satisfies
  $$
  \begin{array}{rcl}
  X_t &=& \sigma_t Z_t\\
  \sigma_t^2 &=& \alpha_0 + \alpha_1 X_{t-1}^2 + \beta \sigma_{t-1}^2,
  \end{array}
  $$
  where the **innovations** $Z_t$, $t\in\mathbb{Z}$, are iid standard normally distributed, and $\alpha_0>0$, $\alpha_1\geq 0$ and $\beta\geq 0$. 
* In this model periods of high volatility tend to be __persistent__: if either $|X_{t-1}|$ or $\sigma_{t-1}$ is large, then $\sigma_t$ is large, so $|X_t|$ tends to be large as well, which in turn leads to a high $\sigma_{t+1}$. 
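
* A short simulation can make the recursion above tangible. The function below is a minimal sketch with hypothetical parameter values; starting the recursion at $\alpha_0/(1-\alpha_1-\beta)$ is an assumption that requires $\alpha_1+\beta<1$.

```python
import numpy as np

def simulate_garch11(n, alpha0, alpha1, beta, seed=0):
    """Simulate n observations of a GARCH(1,1) process with normal innovations."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)   # iid standard normal innovations Z_t
    x = np.empty(n)
    sigma2 = np.empty(n)
    sigma2[0] = alpha0 / (1.0 - alpha1 - beta)  # assumed starting value
    x[0] = np.sqrt(sigma2[0]) * z[0]
    for t in range(1, n):
        # variance recursion, then X_t = sigma_t * Z_t
        sigma2[t] = alpha0 + alpha1 * x[t - 1] ** 2 + beta * sigma2[t - 1]
        x[t] = np.sqrt(sigma2[t]) * z[t]
    return x, sigma2

# Hypothetical parameters, for illustration only
x, sigma2 = simulate_garch11(n=5000, alpha0=0.05, alpha1=0.10, beta=0.85)

# Lag-1 autocorrelation of the series and of its absolute values
acf_raw = np.corrcoef(x[:-1], x[1:])[0, 1]
acf_abs = np.corrcoef(np.abs(x[:-1]), np.abs(x[1:]))[0, 1]
print(acf_raw, acf_abs)
```
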

<font size="2">
    T. Bollerslev. Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 31(3):307–327, 1986.
</font>


### Properties of the GARCH model

__Proposition__

Let $X$ be a GARCH(1,1) process satisfying $\alpha_1+\beta<1$. Then, for all $s,t\in\mathbb{Z}$,
1. $\mathbb{E}(X_t)=0$;
2. $\text{Var}(X_t)=\displaystyle\frac{\alpha_0}{1-\alpha_1-\beta}$;
3. the **autocorrelation** $\mathbb E(X_t\, X_s)/\sqrt{\text{Var}(X_t)\, \text{Var}(X_s)}$ is $0$ whenever $s\not=t$;
4. the variance of $X_t$ conditional on the information up to time $t-1$ is $\sigma_t^2$;
5. if $(\alpha_1+\beta)^2+2\,\alpha_1^2<1$ (so that the fourth moment is finite), the kurtosis of $X_t$ is
    $$    \frac{\mathbb E(X_t^4)}{\mathbb E(X_t^2)^2} = \frac{3\, (1-(\alpha_1+\beta)^2)}{1-(\alpha_1+\beta)^2- 2\, \alpha_1^2}. $$
    In particular, $X_t$ has a positive excess kurtosis whenever $\alpha_1>0$.
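
The variance and kurtosis formulas of the Proposition are easy to evaluate. The function below is a hypothetical helper with illustrative parameter values; the check on the denominator reflects the assumption that the fourth moment must be finite for the kurtosis formula to apply.

```python
def garch11_moments(alpha0, alpha1, beta):
    """Unconditional variance and kurtosis of a GARCH(1,1) process with normal innovations."""
    persistence = alpha1 + beta
    if persistence >= 1.0:
        raise ValueError("requires alpha1 + beta < 1")
    variance = alpha0 / (1.0 - persistence)
    denom = 1.0 - persistence ** 2 - 2.0 * alpha1 ** 2
    if denom <= 0.0:
        raise ValueError("fourth moment is not finite for these parameters")
    kurtosis = 3.0 * (1.0 - persistence ** 2) / denom
    return variance, kurtosis

# Hypothetical parameters, for illustration only
variance, kurtosis = garch11_moments(alpha0=0.05, alpha1=0.10, beta=0.85)
print(variance, kurtosis)  # the kurtosis exceeds 3, the value for a normal distribution
```
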

### Variants of the GARCH process
* The more general GARCH($p,q$) model is defined by setting the variance to
$$
\sigma_t^2 = \alpha_0 + \sum_{i=1}^p \alpha_i X_{t-i}^2 + \sum_{j=1}^q \beta_j
  \sigma_{t-j}^2. 
$$
* There are many extensions of GARCH processes (Integrated GARCH, GARCH with leverage, Threshold GARCH, ...). 


### Fitting a GARCH model

* Given a time series, such as the DAX returns, and postulating a GARCH model, we find the parameters that provide the "best" fit for the data. 
* The best fit is generally obtained via the method of __maximum likelihood__.
* The `arch` library in Python will do this for us. 
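
* To illustrate what "maximum likelihood" means here, the sketch below writes down the negative Gaussian log-likelihood of a GARCH(1,1) model and evaluates it on simulated data. The parameter values, the simulated series and the choice to start the variance recursion at the sample variance are assumptions for illustration; the initialisation and optimiser used internally by a library may differ.

```python
import numpy as np

def garch11_neg_loglik(params, x):
    """Negative Gaussian log-likelihood of a GARCH(1,1) model for the series x."""
    alpha0, alpha1, beta = params
    sigma2 = np.empty(len(x))
    sigma2[0] = np.var(x)  # assumption: start the recursion at the sample variance
    for t in range(1, len(x)):
        sigma2[t] = alpha0 + alpha1 * x[t - 1] ** 2 + beta * sigma2[t - 1]
    # each observation is N(0, sigma2[t]) conditional on the past
    return 0.5 * np.sum(np.log(2.0 * np.pi * sigma2) + x ** 2 / sigma2)

# Synthetic GARCH(1,1) data with hypothetical parameters
rng = np.random.default_rng(seed=3)
true_params = (0.05, 0.10, 0.85)
n = 3000
x = np.empty(n)
s2 = true_params[0] / (1.0 - true_params[1] - true_params[2])
for t in range(n):
    x[t] = np.sqrt(s2) * rng.standard_normal()
    s2 = true_params[0] + true_params[1] * x[t] ** 2 + true_params[2] * s2

# Maximum likelihood looks for the parameters with the smallest negative log-likelihood
nll_true = garch11_neg_loglik(true_params, x)
nll_const = garch11_neg_loglik((np.var(x), 0.0, 0.0), x)  # constant-variance benchmark
print(nll_true, nll_const)
```

* In practice a numerical optimiser searches over $(\alpha_0, \alpha_1, \beta)$ subject to the parameter constraints; this sketch only evaluates the objective at two candidate parameter sets.
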

### Fitting a GARCH model

In [None]:
from arch import arch_model
ret_demeaned=returns-returns.mean(); # de-mean process, i.e., adjust so that mean is zero
am = arch_model(ret_demeaned, mean = 'Zero')
res = am.fit()

### Fitting a GARCH model

In [None]:
res

### Fitting a GARCH model
* The following parameters are obtained: $\alpha_0=0.0283$, $\alpha_1=0.0823$ and $\beta=0.9019$. 
* All estimates are statistically significant ($p$-values<0.01). 

In [None]:
ret = returns.to_frame()  # DAX returns as a DataFrame (column 'Close')
vol = res.conditional_volatility.to_frame()  # fitted GARCH volatility (column 'cond_vol'), same dates as the returns
all = ret.join(vol) # merge the data into one DataFrame

### Fitting a GARCH model
* The plot shows the DAX returns together with the fitted GARCH volatility. 
* The initial volatility is typically chosen as the time series' unconditional volatility.

In [None]:
all.plot(figsize=(15,6), style=['', 'r'],legend=None);

### Validating a GARCH model

* To check the quality of the fit, one can compare the "residuals" $Z_t=X_t/\sigma_t$ with a standard normal
    distribution via a QQ-plot, see figure below. 

In [None]:
import scipy.stats as stats
residuals = (ret_demeaned / res.conditional_volatility).dropna() # the residuals of the de-meaned returns used in the fit
stats.probplot(residuals, dist="norm", plot=plt)
plt.title("Normal Q-Q plot")
plt.show()

### Validating a GARCH model
* This is what the residuals look like: 

In [None]:
residuals.plot(figsize=(15,6));

### Validating a GARCH model

* In case the residuals do not fit the normal distribution, one may - in a second step - fit the residuals to a more appropriate distribution, such as the more heavy-tailed Student $t$. 

### Volatility forecasting
* One use of GARCH models is to forecast future volatility. 
* Given asset returns $x_0, \ldots, x_t$, assume that a GARCH model has been fitted and that the condition $\alpha_1+\beta<1$ (see Proposition above) is fulfilled. 
* A prediction of $\sigma_{t+1}^2$ is given by
$$
\hat\sigma_{t+1}^2 = \mathbb E(X_{t+1}^2|X_t, \sigma_t) = \alpha_0 +
  \alpha_1 X_t^2 + \beta\sigma_t^2,  
$$
and, more generally, for the time period $h$ periods ahead,
$$
\hat\sigma_{t+h}^2=\mathbb E(X_{t+h}^2|X_t, \sigma_t) = \alpha_0\,
  \sum_{i=0}^{h-1} (\alpha_1+\beta)^i + 
  (\alpha_1+\beta)^{h-1}\, (\alpha_1 X_t^2 + \beta\sigma_t^2). 
$$
* A derivation of this formula is beyond the scope of the course. 
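
* Without deriving the formula, it can at least be checked numerically. The sketch below compares the closed form with the step-by-step recursion $\hat\sigma_{t+h}^2 = \alpha_0 + (\alpha_1+\beta)\,\hat\sigma_{t+h-1}^2$ for $h\geq 2$, which holds because the expected squared return equals the expected variance. All numerical inputs are hypothetical and the function name is not part of any library.

```python
def garch11_variance_forecast(alpha0, alpha1, beta, x_t, sigma2_t, horizon):
    """Closed-form h-step-ahead variance forecast of a GARCH(1,1) model."""
    phi = alpha1 + beta
    geometric_sum = sum(phi ** i for i in range(horizon))  # i = 0, ..., h-1
    return alpha0 * geometric_sum + phi ** (horizon - 1) * (alpha1 * x_t ** 2 + beta * sigma2_t)

# Hypothetical parameters and last observations, for illustration only
alpha0, alpha1, beta = 0.05, 0.10, 0.85
x_t, sigma2_t = 2.5, 1.8

# Step-by-step recursion for horizons 1, ..., 10
forecast = alpha0 + alpha1 * x_t ** 2 + beta * sigma2_t  # one step ahead
recursive = [forecast]
for _ in range(2, 11):
    forecast = alpha0 + (alpha1 + beta) * forecast
    recursive.append(forecast)

closed_form = [garch11_variance_forecast(alpha0, alpha1, beta, x_t, sigma2_t, h)
               for h in range(1, 11)]

# For long horizons the forecast approaches the unconditional variance
long_run = alpha0 / (1.0 - alpha1 - beta)
print(closed_form[0], closed_form[-1], long_run)
```
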


### Volatility forecasting

In [None]:
res.params

In [None]:
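sigmasq_f = []  # variance forecasts for the horizons h = 1, ..., 250
# the fitted parameters are labelled by name, so use .iloc for positional access
alpha0 = res.params.iloc[0]
alpha1 = res.params.iloc[1]
beta = res.params.iloc[2]

# powers (alpha1 + beta)**i for i = 0, ..., 250
tmp = [(alpha1 + beta)**i for i in range(0, 251)]

x_last = ret_demeaned.iloc[-1]  # last observation of the de-meaned returns used in the fit
sigma_last = res.conditional_volatility.iloc[-1]  # last fitted volatility

for h in range(1, 251):
    sigmasq_f.append(alpha0 * np.sum(tmp[0:h])
                     + tmp[h-1] * (alpha1 * x_last**2 + beta * sigma_last**2))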
    

### Volatility forecasting
* The figure below shows the volatility forecast for the DAX return data.
* The red line shows the unconditional standard deviation $\displaystyle
\sqrt{\frac{\alpha_0}{1-\alpha_1-\beta}}$.

In [None]:
unconditional_vol = np.sqrt(res.params.iloc[0] / (1 - res.params.iloc[1] - res.params.iloc[2])) * np.ones(len(sigmasq_f))  # positional access via .iloc
plt.figure(figsize=(10, 4))
plt.plot(np.sqrt(sigmasq_f))
plt.plot(unconditional_vol, 'r')
plt.title('volatility forecast [%]')
plt.xlabel('days');