# Applied Economic Analysis 1: Python Assignment


| Name         |   ANR  | SNR |
|--------------|--------|-----|
|Anneke Funk   | 601229 | 1250081 |
|Ineke Stoop   | 246830 | 1252895 |
|Ruben Uijting | 216301 | 1254243 |

January 31, 2017

The repository can be found [here] [identification tag for link]

[identification tag for link]: http://github.com/RubenU/assignments

## Research question

<p style="text-align: center;"> *Is it possible to formulate a model that can invest using market-movement indicators and real-time AEX-index data that generates a positive profit?* </p>

## Introduction

<p style="text-align: justify;"> The financial literature shows that markets react extremely quickly to new information. Rapid advances in computing, technology, and the internet have profoundly changed the dynamics of financial markets ([Fan, Stallaert and Whinston, 2000](http://dl.acm.org/citation.cfm?id=353368)). Information has become widely available and individuals are able to process it faster. More and more people trade online instead of using full-service brokerages. Because so many players are active in the markets, stock prices tend to be temporarily over- or undervalued when new information arrives: everyone wants to profit from it. People overreact to unexpected news, in violation of Bayes' rule ([De Bondt and Thaler, 1985](http://onlinelibrary.wiley.com/doi/10.1111/j.1540-6261.1985.tb05004.x/full)), which causes fast changes in prices. Another phenomenon that causes over- or undervalued security prices is the irrationality of individuals. Although humans can act rationally and find patterns in stock prices, emotion is a disadvantage when it comes to trading. Because we are often too eager for gains or too afraid of losses ([Tversky and Kahneman, 1991](http://www.jstor.org/stable/pdf/2937956.pdf)), we can choose suboptimal options. When not under pressure, we make better decisions. It therefore makes sense to code our beliefs and strategy, so that part of this human irrationality can be avoided. </p>

<p style="text-align: justify;"> This notebook tries to find out whether these inefficiencies can be exploited. It uses indicators that only observe the data and process it without any human emotion involved. Like professional stock traders, we use real-time data for our research. We analyze the AEX-index because its price movements are smoother than those of individual securities: the AEX-index is a market-capitalization-weighted index of the 25 largest listed firms in the Netherlands. </p>

<p style="text-align: justify;"> The outline of this notebook is as follows. We start with the assumptions of this research and then continue with the data import, followed by the methodology and model building. The results are then presented, followed by the conclusions. Lastly, we discuss the shortcomings of this research and give recommendations. </p>

## Assumptions 

- **Markets overreact when news arrives**
> <p style="text-align: justify;"> As already described in the introduction, markets tend to overreact. It takes time for the information in the market to correctly reflect the beliefs of the market about the security. This implies that the higher the frequency of data per time unit, the more reliable the model should be. </p>

- **The correlations between indicators remain constant**
> <p style="text-align: justify;"> We assume that the relationship between indicators remains constant. By assuming this, we can approximate the optimal indicator values and use them for other datasets. </p>

- **Weak mean reversion assumption**
> <p style="text-align: justify;"> We assume that, since markets tend to overreact to news, the price of a security will converge back to a 'true' value. This means that there are inefficiencies in the market that can be exploited. </p>

## Data import

<p style="text-align: justify;"> It is extremely difficult to find real-time data on the internet that can be trusted to be accurate. For this reason, we created our own dataset. The website [one.iex.nl](https://one.iex.nl) shows real-time data, but this data cannot be downloaded or imported. By using the PIL library and an Optical Character Recognition (OCR) engine called [Tesseract](https://github.com/tesseract-ocr/tesseract), we were able to read the prices from the screen into a pandas DataFrame. Tesseract is not a standard Python package: it has to be installed separately, and the pytesseract wrapper has to be pointed to the location of the Tesseract executable. We first import the necessary libraries and set up a DataFrame, as can be seen in the code block below. </p>

In [1]:
from PIL import Image, ImageGrab
from PIL import ImageFilter
import os.path
import win32api, win32con
import pytesseract
import time
import pandas as pd

# Raw string so the backslashes in the Windows path are not read as escapes
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR/tesseract'

# Frame for the readings, seeded with one placeholder row at index 1
AEX = pd.DataFrame([], columns=["TIME", "AEX"])
AEX.loc[1, 'AEX'] = 0
AEX

|   | TIME | AEX |
|---|------|-----|
| 1 | NaN  | 0   |


<p style="text-align: justify;"> The block of code below sets up a small function that prints the coordinates of the mouse position relative to the screen. With these coordinates, we can pinpoint the area of the screen where the real-time value of the AEX-index is shown. To minimize the chance of faulty readings, we zoom the web browser in as far as possible so that the number is displayed large. When the function is executed, it prints the coordinates of the upper-left corner, waits three seconds so that we can move the mouse to the lower-right corner of the area, and then prints those coordinates as well. Together, these coordinates define the screenshot that we will later turn into a number. </p>

In [12]:
x_pad = 0
y_pad = 0

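def get_cords():
    # First reading: upper-left corner of the area
    x, y = win32api.GetCursorPos()
    left_x_cord = x
    upper_y_cord = y
    print("{0} {1}".format(x, y))
    # Three seconds to move the mouse to the lower-right corner
    time.sleep(3)
    x, y = win32api.GetCursorPos()
    right_x_cord = x
    lower_y_cord = y
    print("{0} {1}".format(x, y))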

<p style="text-align: justify;"> The coordinates of the screenshot that we are going to use are printed below. </p>

In [19]:
get_cords()

69 196
238 261
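The two printed points have to be combined into the four-number box that the screenshot function expects, in the order (left, upper, right, lower). A minimal sketch of that step is shown below. The helper name `make_box` is hypothetical and not part of the notebook; it also accepts the corners in either order, which is an addition of this sketch.

```python
def make_box(first_corner, second_corner):
    """Return (left, upper, right, lower) from two (x, y) screen points."""
    (x1, y1), (x2, y2) = first_corner, second_corner
    left, right = min(x1, x2), max(x1, x2)
    upper, lower = min(y1, y2), max(y1, y2)
    if left == right or upper == lower:
        raise ValueError("the two corners must span a non-empty area")
    return (left, upper, right, lower)


# The two points printed above
box = make_box((69, 196), (238, 261))
print(box)  # (69, 196, 238, 261)
```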


<p style="text-align: justify;"> The block of code below is a function that retrieves the value of the AEX-index shown on [one.iex.nl](https://one.iex.nl), converts it to a float and stores it in the DataFrame. The function takes a screenshot of the area where the value is shown, the OCR engine converts the image to a string (the library cannot return numbers directly), and we then convert the string to a float. The values on [one.iex.nl](https://one.iex.nl) use a decimal comma instead of a dot, so the comma has to be replaced first. Finally, we save the result, together with the time at which it was read, in the DataFrame specified earlier. </p>

In [30]:
def ONE():
    box = (69, 196, 238, 261)             # (left, upper, right, lower) from get_cords()
    im = ImageGrab.grab(box)              # screenshot of the quote
    t = pytesseract.image_to_string(im)   # OCR result as a string with a decimal comma
    aex_input = float(t.replace(',', '.'))
    row = len(AEX) + 1                    # next free row label
    AEX.loc[row, 'AEX'] = aex_input
    AEX.loc[row, 'TIME'] = time.strftime('%H:%M:%S', time.localtime(time.time()))
    AEX.index.names = ['TIME']
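OCR output is not always a clean number, and a single misread character makes the float conversion fail. The sketch below shows one way the string-to-float step could be guarded. The helper `parse_quote` and its plausibility bounds are assumptions made for illustration and are not part of the notebook.

```python
def parse_quote(text, low=100.0, high=1000.0):
    """Convert OCR text such as '485,88' to a float, or return None.

    `low` and `high` are illustrative plausibility bounds (an assumption),
    used to reject readings where, for example, the comma was missed.
    """
    cleaned = text.strip().replace(" ", "").replace(",", ".")
    try:
        value = float(cleaned)
    except ValueError:
        return None          # OCR returned something that is not a number
    if not (low <= value <= high):
        return None          # number is outside the plausible range
    return value


print(parse_quote("485,88"))   # 485.88
print(parse_quote("4B5,88"))   # None
```

Readings that come back as None can simply be skipped instead of stopping the data collection.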

<p style="text-align: justify;"> A nice property of collecting data this way is that we could run our model **LINK** while we are collecting data. However, this is not the purpose of this research, so we do not pursue it here. The following block of code completes the data import: it runs the import function for a period that has to be specified (in minutes) and automatically saves the DataFrame to an Excel file when the time has passed. </p>

In [31]:
date = str(time.strftime("%Y-%m-%d")) + ".xlsx"

def new_day():
    # Refresh the output file name when the calendar day has changed
    global date
    if date != str(time.strftime("%Y-%m-%d")) + ".xlsx":
        date = str(time.strftime("%Y-%m-%d")) + ".xlsx"

def start():
    new_day()
    time_to_run = int(input()) * 60   # minutes entered by the user, in seconds
    stop_time = int(time.time()) + time_to_run
    main()
    while int(time.time()) < stop_time:
        main()
    # Time is up: write everything collected so far to today's Excel file
    new_day()
    AEX.to_excel(date, sheet_name='Sheet1')

def main():
    ONE()
    
if __name__ == '__main__':
    main()
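The timed loop can also be written so that it can be tried out without a screen or an OCR engine. The sketch below is a generic polling loop, not the notebook's implementation: the function `collect`, the `interval` pause between readings and the injectable `clock` and `sleep` arguments are all assumptions introduced for illustration.

```python
import time


def collect(sample, minutes, interval=0.5, clock=time.time, sleep=time.sleep):
    """Call sample() repeatedly until `minutes` have elapsed.

    Returns a list of (timestamp, value) pairs. Readings for which
    sample() returns None are skipped.
    """
    stop_time = clock() + minutes * 60
    readings = []
    while clock() < stop_time:
        value = sample()
        if value is not None:
            readings.append((clock(), value))
        sleep(interval)   # pause between readings instead of looping flat out
    return readings


if __name__ == "__main__":
    # Dry run with a constant reading for about one second
    demo = collect(lambda: 485.88, minutes=1 / 60.0, interval=0.25)
    print(len(demo))
```

Passing a fake clock and sleep function makes it possible to check the stopping rule instantly, without waiting for real time to pass.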

<p style="text-align: justify;"> Since the Tesseract OCR engine is hard to install and collecting data takes time, we use an Excel file with AEX-index data that has already been collected. The Excel file is imported below and the first five observations are shown. It contains 35599 observations over a period of 322 minutes, which corresponds to approximately 1.84 observations per second. </p>

$$\dfrac{35599}{322 \cdot 60} \approx 1.84$$

In [32]:
AEX = pd.read_excel("AEX_test.xlsx")
AEX = AEX.set_index('TIME')
AEX.head()

| TIME                | AEX    |
|---------------------|--------|
| 2017-01-31 12:47:20 | 485.88 |
| 2017-01-31 12:48:37 | 485.82 |
| 2017-01-31 12:48:59 | 485.84 |
| 2017-01-31 12:49:00 | 485.84 |
| 2017-01-31 12:49:00 | 485.84 |


<p style="text-align: justify;"> To get an idea of what the data looks like, we plot it in the interactive graph below. </p>

In [33]:
import plotly.plotly as py
import plotly.graph_objs as go
data = [
    go.Scatter(
        y=AEX['AEX'],
        x=AEX.index
    )
]

layout = go.Layout(
    title='AEX-INDEX',
    yaxis=dict(title='Price AEX'),
    xaxis=dict(title='Time')
)

fig = go.Figure(data=data, layout=layout)
py.iplot(fig)




<p style="text-align: justify;"> The graph above shows the level of the AEX-index over a few hours on a regular trading day. It illustrates what happens when news enters the market around 14:30, and how the market reacts. There is a massive drop an hour later, after which the market reverts to the earlier maximum that was reached at 15:00. Now that we have the data, we proceed to the methodology and the construction of the model. </p>

## Methodology & Model

<p style="text-align: justify;"> In this part of the notebook, we develop the methodology that results in either a buy or a sell signal. The method is straightforward: we define a set of conditions that identify a price movement. Applying these conditions to the data yields a binary classifier, with a "1" for a buy/sell signal and a "0" when no movement is expected. The first step is to add the relevant measures to the DataFrame, namely a moving average (MA in the remainder of this notebook) and a moving standard deviation (SD in the remainder of this notebook). Both take a parameter that specifies the lag, that is, the number of most recent observations in the window. The formulas for the MA and the moving SD at time *T* are presented below. </p>

$$ MA_{lag} = \dfrac{1}{lag} \sum_{t=T-lag+1}^{T} AEX_t $$ 

$$ SD_{lag} = \sqrt{\dfrac{1}{lag-1} \sum_{t=T-lag+1}^{T} (AEX_t - MA_{lag})^2} $$ 

<p style="text-align: justify;"> We construct the values for the MA and the moving SD in the code below. In addition, we construct a 95% interval around the MA, which runs from 1.96 SD below the MA to 1.96 SD above it. If the observations in the window are approximately normally distributed, about 95% of them lie within this interval. The graph below includes these newly constructed lines and gives context to our data. We start with a simple 50-lag model to illustrate the idea. </p>

In [34]:
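def measurements(name, lag1):
    # Rolling statistics over the last `lag1` observations
    name["SD"] = name.AEX.rolling(window=lag1).std()
    name["MA"] = name.AEX.rolling(window=lag1).mean()
    # 95% band around the moving average (1.959964 is the 97.5% normal quantile)
    name["-95%_interval"] = name['MA'] - 1.959964 * name["SD"]
    name["+95%_interval"] = name['MA'] + 1.959964 * name["SD"]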
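To make the rolling calculation concrete, the sketch below reproduces the same quantities on a short list using only the standard library. It is an illustration rather than the notebook's code: the function `rolling_bands` is hypothetical, and it uses the sample standard deviation (dividing by lag minus one), which is also the default of the pandas rolling standard deviation.

```python
from statistics import mean, stdev

Z_95 = 1.959964  # 97.5% quantile of the standard normal distribution


def rolling_bands(prices, lag):
    """Return one entry per price: None until `lag` prices are available,
    then a tuple (ma, sd, lower, upper) over the last `lag` prices."""
    bands = []
    for t in range(len(prices)):
        if t + 1 < lag:
            bands.append(None)                 # window not yet full
            continue
        window = prices[t + 1 - lag:t + 1]     # the last `lag` observations
        ma = mean(window)
        sd = stdev(window)                     # sample SD, needs lag >= 2
        bands.append((ma, sd, ma - Z_95 * sd, ma + Z_95 * sd))
    return bands


prices = [485.88, 485.82, 485.84, 485.84, 485.90]
for band in rolling_bands(prices, lag=3):
    print(band)
```

The first lag minus one entries are empty, which mirrors the missing values at the start of the "MA" and "SD" columns.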

In [35]:
measurements(AEX, 50)

data = [
    go.Scatter(
        y=AEX['AEX'], 
        x=AEX.index,
        name='AEX'
    ),
    go.Scatter(
        y=AEX["-95%_interval"], 
        x=AEX.index,
        name='-95%_interval'
    ),
    go.Scatter(
        y=AEX["+95%_interval"],
        x=AEX.index,
        name='+95%_interval'
    )
]

layout = go.Layout(
    title='AEX-INDEX & 95% intervals',
    yaxis=dict(title='Price AEX'),
    xaxis=dict(title='Time')
)

fig = go.Figure(data=data, layout=layout)
py.iplot(fig)





<p style="text-align: justify;"> By zooming in, it can be seen that the AEX-index mostly stays within the 95% interval. At some points, the distance between one of the interval bounds and the AEX-index is fairly large. This is our indication of a price movement that is happening or about to happen. As can be seen in the graph, when the distance between the +95% bound and the AEX-index is large, a downward price movement is happening or continuing. The opposite holds for an upward price movement. The code below constructs those signals and converts them to a binary number. The last two lines of the code create a so-called 'confidence level', which lets us separate real price movements from anomalies. </p>

In [36]:
def calibration(name, diff, lag2, critical_point):
    # Distance between the index and each bound of the 95% interval
    name["diff_buy"] = name['AEX'] - name["-95%_interval"] 
    name["diff_sell"] = name["+95%_interval"] - name['AEX']
    # Smooth the distances over the last `lag2` observations
    name["buy_signal"] = name.diff_buy.rolling(window=lag2).mean()
    name["sell_signal"] = name.diff_sell.rolling(window=lag2).mean()

    # Binary signal: 1 when the smoothed distance reaches `diff`, else 0
    # (missing values compare as False and therefore give 0)
    name["buy"] = (name["buy_signal"] >= diff).astype(int)
    name["sell"] = (name["sell_signal"] >= diff).astype(int)

    # Share of 1s over the last `critical_point` rows; equals 1 only if all are 1
    name["BUY"] = name.buy.rolling(window=critical_point).mean()
    name["SELL"] = name.sell.rolling(window=critical_point).mean()
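The idea behind the 'confidence level' is that a rolling mean of 0/1 flags equals 1 only when every flag in the window is 1, so a signal counts only after it has persisted. The standard-library sketch below illustrates this on a short list. The function `confirmed_signal` is hypothetical, and the sketch skips the smoothing over lag2 that the notebook applies first.

```python
def confirmed_signal(gaps, diff, critical_point):
    """Return 1 where the last `critical_point` gaps all reach `diff`, else 0."""
    flags = [1 if gap >= diff else 0 for gap in gaps]
    confirmed = []
    for t in range(len(flags)):
        window = flags[max(0, t + 1 - critical_point):t + 1]
        window_is_full = len(window) == critical_point
        all_ones = sum(window) == critical_point
        confirmed.append(1 if window_is_full and all_ones else 0)
    return confirmed


gaps = [0.01, 0.03, 0.03, 0.03, 0.01, 0.03]
print(confirmed_signal(gaps, diff=0.02, critical_point=3))  # [0, 0, 0, 1, 0, 0]
```

A single large gap is therefore ignored, while three in a row produce a confirmed signal.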

<p style="text-align: justify;"> The last piece of the puzzle is to put the signals to the test. The code below runs over the complete dataset and 'buys' and 'sells' when the criteria above are met. There are two types of positions, a long position and a short position. A long position means buying the security with the expectation that its price is going to increase. A short position means that we expect the price of the security to drop, so we sell the security at *t = 0* and buy it back at a lower price at *t = 1*. From the previous graph we observe that we should close our position when the index crosses a bound of the interval. In the code below, we have at most one open position at a time, which can be a long or a short position. Lastly, we construct a column in our DataFrame that captures the potential profit. </p>

In [37]:
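def profit(name):
    bought = 0   # entry price of the open long position (0 means none)
    sold = 0     # entry price of the open short position (0 means none)
    name['profit'] = 0.0
    profit_col = name.columns.get_loc('profit')

    for x in range(len(name)):
        price = name['AEX'].iloc[x]

        # Open a long position on a confirmed buy signal if nothing is open
        if name['BUY'].iloc[x] == 1 and (bought + sold) == 0:
            bought = price
        else:
            name.iloc[x, profit_col] = 0

        # Close the long position once the index is at or below the lower bound
        if bought > 0 and name['-95%_interval'].iloc[x] >= price:
            name.iloc[x, profit_col] = price - bought
            bought = 0

        # Open a short position on a confirmed sell signal if nothing is open
        if name['SELL'].iloc[x] == 1 and (sold + bought) == 0:
            sold = price
        else:
            name.iloc[x, profit_col] = 0

        # Close the short position once the index is at or above the upper bound;
        # the recorded value is the closing price minus the entry price
        if sold > 0 and name['+95%_interval'].iloc[x] <= price:
            name.iloc[x, profit_col] = price - sold
            sold = 0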
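As a worked example of the two position types described above, the sketch below computes the profit of a single closed position. It is a hypothetical helper, not the notebook's code. Note that it uses the conventional sign for a short position (entry price minus exit price), whereas the cell above records the closing price minus the entry price for both position types, so totals computed this way need not match the figures reported in this notebook.

```python
def close_position(side, entry_price, exit_price):
    """Profit in index points for one closed position."""
    if side == "long":
        return exit_price - entry_price    # buy first, sell later
    if side == "short":
        return entry_price - exit_price    # sell first, buy back later
    raise ValueError("side must be 'long' or 'short'")


# Long: bought at 485.80, sold at 486.10
print(round(close_position("long", 485.80, 486.10), 2))   # 0.3
# Short: sold at 486.10, bought back at 485.80
print(round(close_position("short", 486.10, 485.80), 2))  # 0.3
```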

<p style="text-align: justify;"> Note that each of the previous functions takes the dataset as its first argument, "name". We have done this in order to be able to use the code on other datasets: we pass another dataset as input and the code will run. Now that we have our complete dataset, we have to optimize the parameters. We could not use the standard Python optimization routines here, because every evaluation recomputes the columns of the DataFrame. So, in order to find a local optimum, we iterate over a series of values for each parameter that we think contains an optimum. An important assumption here is that the order in which we optimize the parameters is irrelevant (while in reality this is likely not true). The range of each series is kept fairly narrow, since the code takes a lot of computation time. Each parameter value that we try is stored in a dictionary together with its outcome (profit). </p>

In [38]:
list_lag1 = [50, 75, 100, 125, 150, 175, 200]
list_diff = [0.005, 0.01, 0.015, 0.02, 0.025, 0.03, 0.035]
list_lag2 = [50, 75, 100, 125, 150, 175, 200]
list_critical_point = [50, 60, 70, 80, 90, 100]

dlag1 = {}
dlag2 = {}
ddiff = {}
dcritical_point = {}

def full(name, lag1, diff, lag2, critical_point): 
    measurements(name, lag1)
    calibration(name, diff, lag2, critical_point)
    profit(name)

<p style="text-align: justify;"> The function below first finds the value in the initial series that produces the highest profit for a parameter. It then creates a new, finer series around this value to check whether nearby parameter values produce a higher profit. All results are saved in the corresponding dictionary. When the code for one parameter has run, we use the "operator" library to find the best value. That value is saved and used in the optimization of the next parameter. When this process has run for every parameter, we print the results. The results of the local optimization are shown in four subplots. (Note that changing the initial values in line 6 might lead to different best values, which is why we call the result a local optimum.) </p>

In [63]:
import operator
import numpy as np

def local_optimizer(name):
    for lag1 in list_lag1:
        full(name, lag1, 0.02, 125, 75)
        result = name["profit"].sum()
        dlag1[lag1] = result
    max_lag1 = max(dlag1.items(), key=operator.itemgetter(1))[0]
    # Refine: every integer within roughly 5% of the coarse optimum
    opt_list_lag1 = np.arange(int(max_lag1 * 0.95), int(max_lag1 * 1.05), 1)
    for opt_lag1 in opt_list_lag1:
        full(name, int(opt_lag1), 0.02, 125, 75)
        result = name["profit"].sum()
        dlag1[int(opt_lag1)] = result
    max_lag1 = max(dlag1.items(), key=operator.itemgetter(1))[0]
    
    for lag2 in list_lag2:
        full(name, max_lag1, 0.02, lag2, 75)
        result = name["profit"].sum()
        dlag2[lag2] = result
    max_lag2 = max(dlag2.items(), key=operator.itemgetter(1))[0]
    opt_list_lag2 = np.arange(int(max_lag2 * 0.95), int(max_lag2 * 1.05), 1)
    for opt_lag2 in opt_list_lag2:
        full(name, max_lag1, 0.02, int(opt_lag2), 75)
        result = name["profit"].sum()
        dlag2[int(opt_lag2)] = result
    max_lag2 = max(dlag2.items(), key=operator.itemgetter(1))[0]
    
    for diff in list_diff:
        full(name, max_lag1, diff, max_lag2, 75)
        result = name["profit"].sum()
        ddiff[diff] = result
    max_diff = max(ddiff.items(), key=operator.itemgetter(1))[0]
    # Refine: steps of 0.0001 within 3% of the coarse optimum
    opt_list_diff = np.arange((max_diff * 0.97), (max_diff * 1.03), 0.0001)
    for opt_diff in opt_list_diff:
        full(name, max_lag1, float(opt_diff), max_lag2, 75)
        result = name["profit"].sum()
        ddiff[float(opt_diff)] = result
    max_diff = max(ddiff.items(), key=operator.itemgetter(1))[0]
    
    for critical_point in list_critical_point:
        full(name, max_lag1, max_diff, max_lag2, critical_point)
        result = name["profit"].sum()
        dcritical_point[critical_point] = result
    max_critical_point = max(dcritical_point.items(), key=operator.itemgetter(1))[0]
    opt_list_critical_point = np.arange(int(max_critical_point * 0.95), int(max_critical_point * 1.05), 1)
    for opt_critical_point in opt_list_critical_point:
        full(name, max_lag1, max_diff, max_lag2, int(opt_critical_point))
        result = name["profit"].sum()
        dcritical_point[int(opt_critical_point)] = result
    max_critical_point = max(dcritical_point.items(), key=operator.itemgetter(1))[0]

    # Return the best values so that the following cells can use them
    return max_lag1, max_lag2, max_diff, max_critical_point

In [None]:
max_lag1, max_lag2, max_diff, max_critical_point = local_optimizer(AEX)

In [61]:
print("{0} {1} {2} {3}".format(max_lag1, max_lag2, max_diff, max_critical_point))

167 118 0.0195 57
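The search procedure described above (a coarse series per parameter, a finer series around the best coarse value, one parameter at a time) can be illustrated on a toy objective. The sketch below is a simplified stand-in under stated assumptions: `coordinate_search`, `best_key` and `toy_profit` are hypothetical names, the objective is an artificial function with a known peak rather than the trading profit, and the finer series includes both end points.

```python
def best_key(scores):
    """Key with the highest score."""
    return max(scores, key=scores.get)


def coordinate_search(objective, coarse_grids, start, width=0.05):
    """Optimize one parameter at a time while holding the others fixed."""
    current = dict(start)
    for name, grid in coarse_grids.items():
        scores = {}
        # Pass 1: coarse series
        for value in grid:
            scores[value] = objective(**dict(current, **{name: value}))
        centre = best_key(scores)
        # Pass 2: every integer within roughly +/- `width` of the coarse optimum
        low, high = int(centre * (1 - width)), int(centre * (1 + width))
        for value in range(low, high + 1):
            scores[value] = objective(**dict(current, **{name: value}))
        current[name] = best_key(scores)
    return current


def toy_profit(lag1, lag2):
    # Artificial objective with its peak at lag1 = 103 and lag2 = 148
    return -(lag1 - 103) ** 2 - (lag2 - 148) ** 2


grids = {"lag1": [50, 75, 100, 125, 150], "lag2": [50, 75, 100, 125, 150]}
print(coordinate_search(toy_profit, grids, start={"lag1": 50, "lag2": 50}))
# {'lag1': 103, 'lag2': 148}
```

On this toy objective the two parameters do not interact, so the order of optimization does not matter. For the trading profit this need not hold, which is why the result is only a local optimum.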


In [58]:
df_lag1 = pd.DataFrame(list(dlag1.items()), columns=['Parameter value', 'Profit'])
df_lag1 = df_lag1.sort_values(['Parameter value'], ascending=True)
df_lag2 = pd.DataFrame(list(dlag2.items()), columns=['Parameter value', 'Profit'])
df_lag2 = df_lag2.sort_values(['Parameter value'], ascending=True)
df_diff = pd.DataFrame(list(ddiff.items()), columns=['Parameter value', 'Profit'])
df_diff = df_diff.sort_values(['Parameter value'], ascending=True)
df_critical_point = pd.DataFrame(list(dcritical_point.items()), columns=['Parameter value', 'Profit'])
df_critical_point = df_critical_point.sort_values(['Parameter value'], ascending=True)

In [59]:
from plotly import tools
import plotly.plotly as py
import plotly.graph_objs as go

lag1_plot = go.Scatter(x=df_lag1["Parameter value"], y=df_lag1["Profit"])
lag2_plot = go.Scatter(x=df_lag2["Parameter value"], y=df_lag2["Profit"])
diff_plot = go.Scatter(x=df_diff["Parameter value"], y=df_diff["Profit"])
critical_point_plot = go.Scatter(x=df_critical_point["Parameter value"], y=df_critical_point["Profit"])

fig = tools.make_subplots(rows=2, cols=2, subplot_titles=('lag1 optimal value', 'lag2 optimal value',
                                                          'diff optimal value', 'critical_point optimal value'))

fig.append_trace(lag1_plot, 1, 1)
fig.append_trace(lag2_plot, 1, 2)
fig.append_trace(diff_plot, 2, 1)
fig.append_trace(critical_point_plot, 2, 2)

fig['layout']['xaxis1'].update(title='Parameter value')
fig['layout']['xaxis2'].update(title='Parameter value')
fig['layout']['xaxis3'].update(title='Parameter value')
fig['layout']['xaxis4'].update(title='Parameter value')

fig['layout']['yaxis1'].update(title='Profit')
fig['layout']['yaxis2'].update(title='Profit')
fig['layout']['yaxis3'].update(title='Profit')
fig['layout']['yaxis4'].update(title='Profit')

fig['layout'].update(title='Approximated optimal parameter values')
py.iplot(fig)

This is the format of your plot grid:
[ (1,1) x1,y1 ]  [ (1,2) x2,y2 ]
[ (2,1) x3,y3 ]  [ (2,2) x4,y4 ]



<p style="text-align: justify;"> We find a profit of 2.41 index points with the parameter values *lag1 = 167*, *lag2 = 118*, *diff = 0.0195* and *critical_point = 57*. This is the result we hoped for, given that our indicators take time to produce a buy or sell signal. However, the parameters are fitted to one dataset only. In the following section, we test whether the model also works on other datasets that were collected in the same way. </p>

###  Other datasets

<p style="text-align: justify;"> The code below opens three new datasets. AEX_test1 has 4294 observations, AEX_test2 has 4274 observations and AEX_test3 has 28476 observations. We run the same code as on our training set, with the parameter values found earlier, to check whether they also produce a profit. The profits are printed after the block of code. </p>

In [62]:
AEX_test1 = pd.read_excel("AEX_test1.xlsx").set_index('TIME')
AEX_test2 = pd.read_excel("AEX_test2.xlsx").set_index('TIME')
AEX_test3 = pd.read_excel("AEX_test3.xlsx").set_index('TIME')

# Apply the parameter values found on the training set, without re-fitting
full(AEX_test1, max_lag1, max_diff, max_lag2, max_critical_point)
full(AEX_test2, max_lag1, max_diff, max_lag2, max_critical_point)
full(AEX_test3, max_lag1, max_diff, max_lag2, max_critical_point)

print("Profit for AEX_test1 is {0}".format(AEX_test1["profit"].sum()))
print("Profit for AEX_test2 is {0}".format(AEX_test2["profit"].sum()))
print("Profit for AEX_test3 is {0}".format(AEX_test3["profit"].sum()))





Profit for AEX_test1 is -0.46
Profit for AEX_test2 is 0.36
Profit for AEX_test3 is 0.19


<p style="text-align: justify;"> The results are mixed: two datasets show a small profit while one shows a loss. Our approach does produce a profit on the majority of the datasets we have used, but we do not have enough data to conclude that it works. </p>

## Conclusion 

Is it possible to formulate a model that can invest using market-movement indicators and real-time AEX-index data that generates a positive profit?

<p style="text-align: justify;"> Since human emotion leads to suboptimal outcomes in investing, we have tried to build a model that trades on indicators alone and have optimized its return on the training dataset. Using moving averages and moving 95% intervals, we have managed to produce a positive return in three of the four datasets. However, the result is not convincing enough for us to conclude that we have built a model that generates a consistent profit. </p>

## Discussion

<p style="text-align: justify;"> To be able to draw a more solid conclusion concerning the research question, future research should use more data to tune the parameters and test the results. Also, in our notebook we assume that there are no transaction costs. In real life, however, brokers do charge a transaction fee. Another limitation is that our model is slow to compute. If it were used to actually trade securities, this delay would erode profits. </p>
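To give a feeling for how transaction costs would affect the results, the sketch below subtracts a fixed fee for the opening and the closing order of every trade. Everything in it is hypothetical: the function `net_profit`, the per-trade profits and the fee (expressed in index points for simplicity) are made-up illustrations, not values from our data or from any broker.

```python
def net_profit(trade_profits, fee_per_order):
    """Total profit after a fee for the opening and closing order of each trade."""
    gross = sum(trade_profits)
    costs = 2 * fee_per_order * len(trade_profits)
    return gross - costs


trades = [0.30, -0.10, 0.25]   # hypothetical per-trade profits in index points
print(round(net_profit(trades, fee_per_order=0.05), 2))  # 0.15
```

Because the fee is paid twice per trade, a strategy with many small trades is affected much more than one with a few large trades.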

## Reference list

<p style="text-align: justify;"> [De Bondt, W. F. M., & Thaler, R. (1985). Does the stock market overreact? The Journal of Finance, 40(3), 793-805.](http://onlinelibrary.wiley.com/doi/10.1111/j.1540-6261.1985.tb05004.x/full)

[Fan, M., Stallaert, J., & Whinston, A. B. (2000). The Internet and the future of financial markets. Communications of the ACM, 43(11), 82-88.](http://dl.acm.org/citation.cfm?id=353368)

[Tversky, A., & Kahneman, D. (1991). Loss aversion in riskless choice: A reference-dependent model. The Quarterly Journal of Economics, 106(4), 1039-1061.](http://www.jstor.org/stable/pdf/2937956.pdf)

</p>
