# TABLE OF CONTENTS

* [1. INTRODUCTION](#section-one)
* [2. SETUP](#section-two)
    - [2.1 Install Yahoo Finance Package](#subsection-two-one)
    - [2.2 Draw Packages](#subsection-two-two)
    - [2.3 Import/Wrangle Data](#subsection-two-three)
* [3. EVENT STUDY METHOD](#section-three)
    - [3.1 Question 1: How do we run an event study (1 event)?](#subsection-three-one)
        - [3.1.1 Event Date](#subsection-three-one-one)
        - [3.1.2 Collect returns for sample firms and market index](#subsection-three-one-two)
        - [3.1.3 Estimate the expected returns during the estimation period](#subsection-three-one-three)
        - [3.1.4 Predict the expected returns during the event period](#subsection-three-one-four)
        - [3.1.5 Measure abnormal stock returns](#subsection-three-one-five)
    - [3.2 Question 2: How do we run an event study (N events)?](#subsection-three-two)
        - [3.2.1 Generate a function to iteratively run an event study for all events (N events)](#subsection-three-two-one)
        - [3.2.2 Import earnings surprises from Zacks](#subsection-three-two-two)
        - [3.2.3 Run event studies on positive earnings surprises](#subsection-three-two-three)
    - [3.3 Question 3: How do stock prices behave around positive earnings surprises?](#subsection-three-three)
    - [3.4 How do stock prices behave around negative earnings surprises?](#subsection-three-four)
    - [3.5 Repeat the analyses but include 3 more companies, NFLX, AIG, and DIS, in the earnings surprises data](#subsection-three-five)
    
* [4. CONCLUSION](#section-four)
* [5. REFERENCES](#section-five)

<a id="section-one"></a>
# 1. INTRODUCTION

## Team members
### Lucas Sebastian A0112080B
### Sekson Ounsaengchan (Beer) A0227885M
### Zhao Mengyu (Jessica) A0227914B

“Security prices accurately reflect all available information, and respond rapidly to new information as soon as it becomes available” Richard Brealey, Stewart Myers, & Franklin Allen, Principles of Corporate Finance, 2016.

## Ball and Brown (1968)

In their seminal paper, "An Empirical Evaluation of Accounting Income Numbers," Ball and Brown (1968) design an empirical test (the event study methodology) to assess whether security prices adjust rapidly to relevant and important news (here, earnings announcements and their information content).

## Objective

- Implement event study methodology to study the behavior of security prices around events
  [Ball and Brown (1968), An Empirical Evaluation of Accounting Income Numbers](https://www.jstor.org/stable/2490232?seq=1).
- An event study attempts to measure the valuation effects of a corporate event, such as an earnings announcement, by examining the response of the stock price around the announcement of the event.  
- Assumption: the market is efficient.


## Motivation

- Do security prices adjust rapidly to value-relevant news?
  

## Key Steps

1. Event Dates: To obtain precise announcement dates for a sample of firms.  
2. Collect returns for sample firms and market index.  
3. Estimate the expected returns during the estimation period.
4. Predict the expected returns during the event period.
5. Measure abnormal stock returns.
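
Steps 3 to 5 can be written compactly under the market model. The block below is a sketch of that standard formulation. The symbols ($R_{i,t}$ for the return of stock $i$ on day $t$, $R_{m,t}$ for the market return, $\tau_1$ and $\tau_2$ for the window bounds in days relative to the event) are notation introduced here for illustration. The code in Section 3 ignores the intercept for simplicity, which corresponds to setting $\hat{\alpha}_i = 0$.

```latex
\begin{aligned}
\text{Estimation period:}\quad & R_{i,t} = \alpha_i + \beta_i R_{m,t} + \varepsilon_{i,t} \\
\text{Expected return:}\quad & \widehat{R}_{i,t} = \hat{\alpha}_i + \hat{\beta}_i R_{m,t} \\
\text{Abnormal return:}\quad & AR_{i,t} = R_{i,t} - \widehat{R}_{i,t} \\
\text{Cumulative abnormal return:}\quad & CAR_i[\tau_1, \tau_2] = \sum_{t=\tau_1}^{\tau_2} AR_{i,t}
\end{aligned}
```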

<a id="section-two"></a>
# 2. SETUP

<a id="subsection-two-one"></a>
## 2.1 Install Yahoo Finance Package

In [1]:
# Anaconda Prompt > "$ pip install yfinance --upgrade --no-cache-dir"
# https://pypi.org/project/fix-yahoo-finance/

In [2]:
## Optional: Install packages
!pip install yfinance --upgrade --no-cache-dir



<a id="subsection-two-two"></a>
## 2.2 Draw Packages

In [3]:
import pandas as pd
import numpy as np
import datetime as dt
from sklearn import linear_model
import scipy.stats as st
import yfinance as yf

<a id="subsection-two-three"></a>
## 2.3 Import/Wrangle Data

In [4]:
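# Note: "DIS" is listed twice. yfinance downloads each unique ticker once (8 series in total),
# but the duplicate entry is carried through any later loop over symbols_list.
symbols_list = ["^GSPC", "TSLA", "GOOGL", "WMT", "DIS", "XOM", "NFLX", "DIS", "AIG"]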
start = dt.datetime(2015, 1, 1)
end = dt.datetime(2020, 12, 31)

data = yf.download(symbols_list, start=start, end=end)

[*********************100%***********************]  8 of 8 completed


In [5]:
data

Unnamed: 0_level_0,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Close,Close,...,Open,Open,Volume,Volume,Volume,Volume,Volume,Volume,Volume,Volume
Unnamed: 0_level_1,AIG,DIS,GOOGL,NFLX,TSLA,WMT,XOM,^GSPC,AIG,DIS,...,XOM,^GSPC,AIG,DIS,GOOGL,NFLX,TSLA,WMT,XOM,^GSPC
Date,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2,Unnamed: 16_level_2,Unnamed: 17_level_2,Unnamed: 18_level_2,Unnamed: 19_level_2,Unnamed: 20_level_2,Unnamed: 21_level_2
2014-12-31,47.036343,87.786339,530.659973,48.801430,44.481998,73.371559,65.755577,2058.899902,56.009998,94.190002,...,92.419998,2082.110107,5054100,4797000,1232400,8627500,11487500,4151400,11337200,2606070000
2015-01-02,47.120323,87.376251,529.549988,49.848572,43.862000,73.388687,66.025871,2058.199951,56.110001,93.750000,...,92.250000,2058.899902,6608300,5865400,1324000,13475000,23822000,4501800,10220400,2708700000
2015-01-05,46.188160,86.099396,519.460022,47.311428,42.018002,73.175072,64.219276,2020.579956,55.000000,92.379997,...,92.099998,2054.439941,10103500,7789400,2059100,18165000,26842500,6979000,18502400,3799120000
2015-01-06,45.533138,85.642715,506.640015,46.501431,42.256001,73.738968,63.877876,2002.609985,54.220001,91.889999,...,90.239998,2022.150024,15406400,6793100,2722800,16037700,31309500,8205100,16670700,4460110000
2015-01-07,45.751476,86.518806,505.149994,46.742859,42.189999,75.695404,64.525108,2025.900024,54.480000,92.830002,...,90.650002,2005.550049,8762000,6589500,2345900,9849700,14842000,8498400,13590700,3805480000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2020-12-23,36.531433,173.550003,1728.229980,514.479980,645.979980,140.978104,38.918858,3690.010010,37.439999,173.550003,...,41.509998,3693.419922,3890600,9994000,1148700,2393200,33173000,6810200,19085900,3772630000
2020-12-24,36.463135,173.729996,1734.160034,513.969971,661.770020,141.253693,38.760456,3703.060059,37.369999,173.729996,...,41.650002,3694.030029,1613800,2721000,465600,1144000,22865600,3018200,8039000,1885090000
2020-12-28,36.170414,178.860001,1773.959961,519.119995,663.690002,142.946777,38.890903,3735.360107,37.070000,178.860001,...,41.689999,3723.030029,3837300,13145400,1382500,2891900,32278600,6448300,23877500,3527460000
2020-12-29,36.258224,177.300003,1757.760010,530.869995,665.989990,142.041199,38.452980,3727.040039,37.160000,177.300003,...,42.040001,3750.010010,4047300,6875400,986300,4022400,22910800,5979400,20287700,3387030000


<a id="section-three"></a>
# 3. EVENT STUDY METHOD

<a id="subsection-three-one"></a>
## 3.1 How do we run an event study (1 event)?

<a id="subsection-three-one-one"></a>
### 3.1.1 Event Date

In [39]:
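# Specify the event: ticker and event date
# The event studied in this cell is AIG on 2020-08-03 (see eventdate and ticker below).
# Illustration of the surprise measure: Tesla's earnings announcement on 10/24/2018 had an earnings surprise of +627.27% (i.e., roughly 6X the estimated EPS).
# EPS Surprise % = ((Actual EPS - Estimated EPS) / absolute Estimated EPS) * 100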

eventdate = dt.datetime(2020, 8, 3)
ticker = "AIG"

estimation_period = 252
before_event = 20
event_window_start = -20
event_window_end = 20

** Is identifying the event date always clear-cut?  
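
The EPS surprise formula quoted in the cell above can be sketched as a small helper. The function name `eps_surprise_pct` and the input numbers are hypothetical and chosen only to show the sign convention; dividing by the absolute value of the estimate keeps the sign meaningful when the estimate is negative.

```python
def eps_surprise_pct(actual_eps: float, estimated_eps: float) -> float:
    """Return the EPS surprise in percent: (actual - estimate) / |estimate| * 100."""
    if estimated_eps == 0:
        raise ValueError("EPS surprise is undefined when the estimate is zero")
    return (actual_eps - estimated_eps) / abs(estimated_eps) * 100.0


# Hypothetical numbers, not taken from the earnings data
beat = eps_surprise_pct(1.50, 1.00)    # actual above a positive estimate: positive surprise
miss = eps_surprise_pct(-0.30, -0.20)  # larger loss than the estimated loss: negative surprise
print(beat, miss)
```

Without the absolute value in the denominator, the second case would be reported as a positive surprise even though the firm lost more than expected.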

<a id="subsection-three-one-two"></a>
### 3.1.2 Collect returns for sample firms and market index

In [40]:
# Calculate returns
main_data = data["Adj Close"] / data["Adj Close"].shift(1) - 1
main_data = main_data.dropna()
main_data = main_data.reset_index()
data_ret = main_data.copy()

In [41]:
data_ret.head()

Unnamed: 0,Date,AIG,DIS,GOOGL,NFLX,TSLA,WMT,XOM,^GSPC
0,2015-01-02,0.001785,-0.004671,-0.002092,0.021457,-0.013938,0.000233,0.004111,-0.00034
1,2015-01-05,-0.019783,-0.014613,-0.019054,-0.050897,-0.042041,-0.002911,-0.027362,-0.018278
2,2015-01-06,-0.014182,-0.005304,-0.024679,-0.017121,0.005664,0.007706,-0.005316,-0.008893
3,2015-01-07,0.004795,0.01023,-0.002941,0.005192,-0.001562,0.026532,0.010132,0.01163
4,2015-01-08,-0.004405,0.010342,0.003484,0.022188,-0.001564,0.021106,0.016645,0.017888
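
As a cross-check on the return calculation, the manual formula `P_t / P_{t-1} - 1` should match pandas' built-in `pct_change()`. The sketch below uses hypothetical prices for two made-up tickers (`AAA`, `BBB`), not the downloaded data.

```python
import pandas as pd

# Hypothetical adjusted close prices for two made-up tickers
prices = pd.DataFrame(
    {"AAA": [100.0, 102.0, 99.96], "BBB": [50.0, 49.0, 49.98]},
    index=pd.to_datetime(["2015-01-02", "2015-01-05", "2015-01-06"]),
)

# Manual simple return, same formula as in the notebook
manual = (prices / prices.shift(1) - 1).dropna()

# Built-in equivalent
builtin = prices.pct_change().dropna()

print(manual)
print((manual - builtin).abs().max().max())  # largest difference between the two
```

In both versions the first row has no previous price, so it becomes NaN and is removed by `dropna()`.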


<a id="subsection-three-one-three"></a>
### 3.1.3 Estimate the expected returns during the estimation period.

In [42]:
# Identify post-event dates
data_ret['post_event'] = (data_ret['Date'] >= eventdate).astype(int)

# We will use the index to identify trading days relative to the event
data_ret = data_ret.reset_index()

In [43]:
data_ret

Unnamed: 0,index,Date,AIG,DIS,GOOGL,NFLX,TSLA,WMT,XOM,^GSPC,post_event
0,0,2015-01-02,0.001785,-0.004671,-0.002092,0.021457,-0.013938,0.000233,0.004111,-0.000340,0
1,1,2015-01-05,-0.019783,-0.014613,-0.019054,-0.050897,-0.042041,-0.002911,-0.027362,-0.018278,0
2,2,2015-01-06,-0.014182,-0.005304,-0.024679,-0.017121,0.005664,0.007706,-0.005316,-0.008893,0
3,3,2015-01-07,0.004795,0.010230,-0.002941,0.005192,-0.001562,0.026532,0.010132,0.011630,0
4,4,2015-01-08,-0.004405,0.010342,0.003484,0.022188,-0.001564,0.021106,0.016645,0.017888,0
...,...,...,...,...,...,...,...,...,...,...,...
1505,1505,2020-12-23,0.017391,0.018187,0.004656,-0.024368,0.008808,-0.006796,0.012852,0.000746,1
1506,1506,2020-12-24,-0.001870,0.001037,0.003431,-0.000991,0.024444,0.001955,-0.004070,0.003537,1
1507,1507,2020-12-28,-0.008028,0.029529,0.022951,0.010020,0.002901,0.011986,0.003365,0.008723,1
1508,1508,2020-12-29,0.002428,-0.008722,-0.009132,0.022634,0.003465,-0.006335,-0.011260,-0.002227,1


** Why do we need the variable "post_event"?   
** What are the alternative windows for the estimation period?  

In [44]:
# Identify the index for the event date
event_date_index = data_ret.groupby(['post_event'])['index'].transform('min').max()
data_ret['event_date_index'] = event_date_index
# Create the variable day relative to event
data_ret['rel_day'] = data_ret['index'] - data_ret['event_date_index']

# Check whether relative day 0 corresponds to the event date (2020-08-03)
# The event date has index 1405, i.e., it is 1,405 trading days after the first return date (2015-01-02)
data_ret[data_ret['rel_day'] == 0]

Unnamed: 0,index,Date,AIG,DIS,GOOGL,NFLX,TSLA,WMT,XOM,^GSPC,post_event,event_date_index,rel_day
1405,1405,2020-08-03,0.0,-0.005045,-0.003488,0.019923,0.03791,-0.000773,0.00404,0.007181,1,1405,0
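
To see what the `post_event` / `groupby` step does, the sketch below reruns the same logic on a hypothetical five-day sample in which the event date falls on a Saturday. Relative day 0 then lands on the first trading day on or after the event date. The `searchsorted` line is an alternative added here for comparison, and it assumes the `Date` column is sorted in ascending order.

```python
import pandas as pd

# Hypothetical mini-sample: five consecutive business days
df = pd.DataFrame({"Date": pd.bdate_range("2020-07-30", periods=5)})
eventdate = pd.Timestamp("2020-08-01")  # a Saturday, i.e., not a trading day

# Same logic as in the notebook
df["post_event"] = (df["Date"] >= eventdate).astype(int)
df = df.reset_index()
event_date_index = df.groupby(["post_event"])["index"].transform("min").max()
df["rel_day"] = df["index"] - event_date_index

# Alternative on a sorted Date column: position of the first date >= eventdate
alt_index = int(df["Date"].searchsorted(eventdate))

day_zero = df.loc[df["rel_day"] == 0, "Date"].iloc[0]
print(df[["Date", "post_event", "rel_day"]])
print(event_date_index, alt_index, day_zero.date())
```

One caveat of the `groupby` approach: if the event date lies after the last date in the sample, every row has `post_event == 0` and the computed index falls back to the first row, so it is worth checking that the event date is inside the sample.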


In [45]:
# Identify estimation period
estimation = data_ret[(data_ret['rel_day'] < -before_event) & (data_ret['rel_day'] >= -estimation_period-before_event)]

# Check the last (relative) day of the estimation period
estimation['rel_day'].max()
# print(estimation)

-21
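
The estimation window defined above runs from relative day -272 to relative day -21, which is 252 trading days ending just before the [-20, +20] event window. A minimal check on a hypothetical relative-day axis (no return data involved):

```python
import numpy as np

estimation_period = 252
before_event = 20

# Hypothetical relative-day axis: 300 days before the event through day +20
rel_day = np.arange(-300, 21)

# Same filter as in the notebook
in_estimation = (rel_day < -before_event) & (rel_day >= -estimation_period - before_event)
window = rel_day[in_estimation]

print(window.min(), window.max(), len(window))
```

Keeping the estimation window strictly before the event window prevents the event itself from influencing the estimated beta.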

In [46]:
# convert from row to column vector
# We call .reshape() on x because this array has to be two-dimensional (i.e., one column and as many rows as necessary). 
x_df = estimation['^GSPC'].values.reshape(-1, 1)
# print(x_df)

In [47]:
# Create empty lists to store the betas (slopes) and alphas (intercepts)
# Note: the intercepts are stored here but, for the sake of simplicity, ignored when computing expected returns
betas = []
alphas = []

for y in symbols_list:
    y_df = estimation[y].values
    # print(y, y_df)
    # Fit the market model once per ticker: R_i = alpha_i + beta_i * R_m
    reg = linear_model.LinearRegression().fit(x_df, y_df)
    betas.append(reg.coef_)
    alphas.append(reg.intercept_)

In [48]:
# Convert the list to a Numpy Array
# beta coefficients
beta_np = np.array(betas)
print("beta", beta_np)

# intercepts
# for brevity, we are ignoring the alphas in calculating the expected returns - the values are very small
alpha_np = np.array(alphas)
print("alpha", alpha_np)

beta [[1.        ]
 [1.11603422]
 [0.94069269]
 [0.52075559]
 [1.05033772]
 [1.10108516]
 [0.64709391]
 [1.05033772]
 [1.53801232]]
alpha [ 1.08420217e-19  7.27124472e-03  9.58818880e-04  2.66926428e-04
 -9.50367177e-04 -1.96761303e-03  9.80306303e-04 -9.50367177e-04
 -2.00246311e-03]
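
For a single regressor, the OLS slope equals the covariance of the stock with the market divided by the variance of the market. The sketch below checks this on synthetic returns (the "true" beta of 1.5 and the noise levels are arbitrary assumptions), using `np.polyfit` as a stand-in for the regression so that the example depends only on NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic daily returns over a 252-day estimation period
market = rng.normal(0.0, 0.01, size=252)
stock = 0.0005 + 1.5 * market + rng.normal(0.0, 0.02, size=252)

# Closed-form market-model estimates
beta_cov = np.cov(stock, market, ddof=1)[0, 1] / np.var(market, ddof=1)
alpha_cov = stock.mean() - beta_cov * market.mean()

# Least-squares fit of a degree-1 polynomial: returns [slope, intercept]
slope, intercept = np.polyfit(market, stock, 1)

print(beta_cov, slope)
print(alpha_cov, intercept)
```

This also explains the first entry of the printed output: regressing `^GSPC` on itself gives a beta of 1 and an intercept that is zero up to floating-point error.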


<a id="subsection-three-one-four"></a>
### 3.1.4 Predict the expected returns during the event period.

In [49]:
# Identify event period [-20,20]
event = data_ret[(data_ret['rel_day'] <= event_window_end) & (data_ret['rel_day'] >= event_window_start)]

In [50]:
event[["index","Date","AIG","^GSPC","event_date_index","rel_day"]].head(41)

Unnamed: 0,index,Date,AIG,^GSPC,event_date_index,rel_day
1385,1385,2020-07-06,0.027406,0.015882,1405,-20
1386,1386,2020-07-07,-0.051074,-0.010819,1405,-19
1387,1387,2020-07-08,0.023312,0.007827,1405,-18
1388,1388,2020-07-09,-0.061977,-0.005644,1405,-17
1389,1389,2020-07-10,0.075,0.010466,1405,-16
1390,1390,2020-07-13,-0.012625,-0.009363,1405,-15
1391,1391,2020-07-14,0.036003,0.013406,1405,-14
1392,1392,2020-07-15,0.031829,0.009082,1405,-13
1393,1393,2020-07-16,0.024866,-0.003406,1405,-12
1394,1394,2020-07-17,-0.013514,0.002849,1405,-11


In [51]:
# Expected Returns via Beta
# Need Numpy Array to do Calculations!
sp500array = event['^GSPC'].values

# numpy.outer() computes the outer product of two vectors
expected_returns = np.outer(sp500array, beta_np)
expected_returns = pd.DataFrame(expected_returns, index=event.index)
expected_returns.columns = symbols_list
expected_returns = expected_returns.rename(columns={"AIG": "expected_return"})
expected_returns[['expected_return']]

Unnamed: 0,expected_return
1385,0.024426
1386,-0.016639
1387,0.012039
1388,-0.00868
1389,0.016097
1390,-0.0144
1391,0.020619
1392,0.013968
1393,-0.005239
1394,0.004381
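
The expected returns above omit the intercept. As a sketch of the alternative, the block below reuses three S&P 500 returns and the `^GSPC` and AIG estimates printed earlier, and adds the intercept by broadcasting. The variable names are introduced here for illustration.

```python
import numpy as np

# Values copied from the outputs above: three S&P 500 returns from the event window
# and the (beta, alpha) estimates printed for ^GSPC and AIG
market = np.array([0.015882, -0.010819, 0.007827])
beta = np.array([1.0, 1.53801232])
alpha = np.array([1.08420217e-19, -2.00246311e-03])

# Notebook approach: intercept ignored, one column per ticker
expected_no_alpha = np.outer(market, beta)

# Variant with the intercept: alpha is broadcast across the rows
expected_with_alpha = alpha + np.outer(market, beta)

print(expected_no_alpha[:, 1])    # AIG expected returns without the intercept
print(expected_with_alpha[:, 1])  # AIG expected returns with the intercept
```

An intercept of about -0.002 per day accumulates to roughly -0.08 over a 41-day window, so whether omitting it is harmless depends on its size and on the length of the window being cumulated.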


<a id="subsection-three-one-five"></a>
### 3.1.5 Measure abnormal stock returns

In [52]:
# Abnormal Returns
event = pd.concat([event, expected_returns], axis=1)
event['abnormal_return'] = event['AIG'] - event['expected_return']


In [53]:
event.head()

Unnamed: 0,index,Date,AIG,DIS,GOOGL,NFLX,TSLA,WMT,XOM,^GSPC,...,^GSPC.1,TSLA.1,GOOGL.1,WMT.1,DIS.1,XOM.1,NFLX.1,DIS.2,expected_return,abnormal_return
1385,1385,2020-07-06,0.027406,0.020057,0.020219,0.03548,0.134794,-0.002684,0.007033,0.015882,...,0.015882,0.017725,0.01494,0.00827,0.016681,0.017487,0.010277,0.016681,0.024426,0.00298
1386,1386,2020-07-07,-0.051074,-0.006991,-0.006488,-0.001316,0.013328,0.067794,-0.025907,-0.010819,...,-0.010819,-0.012074,-0.010177,-0.005634,-0.011363,-0.011912,-0.007001,-0.011363,-0.016639,-0.034435
1387,1387,2020-07-08,0.023312,0.026666,0.009182,0.019507,-0.017254,-0.019772,-0.002312,0.007827,...,0.007827,0.008736,0.007363,0.004076,0.008221,0.008619,0.005065,0.008221,0.012039,0.011273
1388,1388,2020-07-09,-0.061977,0.001286,0.010016,0.009905,0.020792,0.026599,-0.041261,-0.005644,...,-0.005644,-0.006298,-0.005309,-0.002939,-0.005928,-0.006214,-0.003652,-0.005928,-0.00868,-0.053297
1389,1389,2020-07-10,0.075,0.021659,0.0134,0.080688,0.107848,0.022935,0.03119,0.010466,...,0.010466,0.011681,0.009845,0.00545,0.010993,0.011524,0.006773,0.010993,0.016097,0.058903


In [54]:
# Event CAR
winar1 = event[(event['rel_day'] <= 1)  & (event['rel_day'] >= -1)]['abnormal_return'].sum() # CAR[-1,+1]
winar2 = event[(event['rel_day'] <= 1)  & (event['rel_day'] >= 0)]['abnormal_return'].sum() # CAR[0,+1]

print("CAR [-1,+1]  = " + str(winar1))
print("CAR [0,1]  = " + str(winar2))
print("*"*80)

# Day-by-day AR
winar3 = event[(event['rel_day'] <= -1)  & (event['rel_day'] >= -1)]['abnormal_return'].sum() # Event Day -1
winar4 = event[(event['rel_day'] <= 0)  & (event['rel_day'] >= 0)]['abnormal_return'].sum() # Event Day 0
winar5 = event[(event['rel_day'] <= 1)  & (event['rel_day'] >= 1)]['abnormal_return'].sum() # Event Day 1

print("AR -1  = " + str(winar3))
print("AR 0  = " + str(winar4))
print("AR +1  = " + str(winar5))
print("*"*80)

# Post Event CAR
winar6 = event[(event['rel_day'] <= 5)  & (event['rel_day'] >= 2)]['abnormal_return'].sum() # CAR[2,5]
winar7 = event[(event['rel_day'] <= 10)  & (event['rel_day'] >= 2)]['abnormal_return'].sum() # CAR[2,10]
winar8 = event[(event['rel_day'] <= 20)  & (event['rel_day'] >= 2)]['abnormal_return'].sum() # CAR[2,20]

print("CAR [+2,+5]  = " + str(winar6))
print("CAR [+2,+10]  = " + str(winar7))
print("CAR [+2,+20]  = " + str(winar8))
print("*"*80)

# Pre Event CAR
winar9 = event[(event['rel_day'] <= -2)  & (event['rel_day'] >= -5)]['abnormal_return'].sum() # CAR[-5,-2]
winar10 = event[(event['rel_day'] <= -2)  & (event['rel_day'] >= -10)]['abnormal_return'].sum() # CAR[-10,-2]
winar11 = event[(event['rel_day'] <= -2)  & (event['rel_day'] >= -20)]['abnormal_return'].sum() # CAR[-20,-2]

print("CAR [-5,-2]  = " + str(winar9))
print("CAR [-10,-2]  = " + str(winar10))
print("CAR [-20,-2]  = " + str(winar11))
print("*"*80)

CAR [-1,+1]  = -0.09236485493051703
CAR [0,1]  = -0.09189529748055782
********************************************************************************
AR -1  = -0.00046955744995920944
AR 0  = -0.01104450248719498
AR +1  = -0.08085079499336284
********************************************************************************
CAR [+2,+5]  = 0.033811280207397695
CAR [+2,+10]  = -0.023757817712691403
CAR [+2,+20]  = -0.10233755436219362
********************************************************************************
CAR [-5,-2]  = 0.0021164283569551044
CAR [-10,-2]  = -0.01867606721051647
CAR [-20,-2]  = 0.013977639631129909
********************************************************************************
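
The eleven window sums above all follow one pattern: filter on `rel_day`, then sum `abnormal_return`. A hypothetical helper `car()` makes that explicit. The toy abnormal returns below are made up and are not the AIG results.

```python
import pandas as pd


def car(event: pd.DataFrame, start: int, end: int) -> float:
    """Sum abnormal returns over relative days start..end (both inclusive)."""
    in_window = event["rel_day"].between(start, end)
    return float(event.loc[in_window, "abnormal_return"].sum())


# Hypothetical abnormal returns for relative days -2..+2
toy = pd.DataFrame(
    {
        "rel_day": [-2, -1, 0, 1, 2],
        "abnormal_return": [0.01, -0.02, 0.03, 0.04, -0.01],
    }
)

windows = {"CAR[-1,+1]": (-1, 1), "CAR[0,+1]": (0, 1), "AR 0": (0, 0)}
results = {label: car(toy, lo, hi) for label, (lo, hi) in windows.items()}
print(results)
```

Keeping the windows in a dictionary keyed by label also avoids having to remember which `winar` number corresponds to which window.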


<a id="subsection-three-two"></a>
## 3.2 How do we run an event study (N events)?

<a id="subsection-three-two-one"></a>
### 3.2.1 Generate a function to iteratively run an event study for all events (N events)

In [22]:
def do_event_study(
    data_ret,
    eventdate,
    ticker,
    estimation_period=252,
    before_event=20,
    event_window_start=-20,
    event_window_end=20,
    benchmark="^GSPC",
):
    """
    Run a market-model event study for one event and return the abnormal returns (ARs)
    and cumulative abnormal returns (CARs) over a set of fixed windows.
    
    Parameters:
        data_ret (pd.DataFrame): daily returns with a "Date" column plus one column per ticker, including the benchmark
        eventdate (datetime): the event date to be studied. eventdate must be within the date range of data_ret
        ticker (str): ticker or CUSIP code of the stock to be studied. ticker/CUSIP must be found in data_ret columns
        estimation_period (int): number of trading days used to estimate the beta against the given benchmark
        before_event (int): number of trading days between the end of the estimation period and the event date
        event_window_start (int): a negative number specifying the first day of the event window relative to the event date
        event_window_end (int): a positive number specifying the last day of the event window relative to the event date
        benchmark (str): ticker symbol of the benchmark used. benchmark must be in data_ret.columns
        
    Returns:
        Tuple of 11 values, in this order:
        "CAR[-1,+1]", "CAR[0,+1]", "Event Day -1", "Event Day 0", "Event Day 1",
        "CAR[2,5]", "CAR[2,10]", "CAR[2,20]", "CAR[-5,-2]", "CAR[-10,-2]", "CAR[-20,-2]"
    """

    # Generate post-event indicator
    data_ret["post_event"] = (data_ret["Date"] >= eventdate).astype(
        int
    )  # 1 if after event, 0 otherwise
    data_ret = (
        data_ret.reset_index()
    )  # pushes out the current index column and create a new one

    # Identify the index for the event date
    event_date_index = data_ret.groupby(["post_event"])["index"].transform("min").max()
    data_ret["event_date_index"] = event_date_index

    # Create the variable day relative to event
    data_ret["rel_day"] = data_ret["index"] - data_ret["event_date_index"]

    # Identify estimation period
    estimation = data_ret[
        (data_ret["rel_day"] < -before_event)
        & (data_ret["rel_day"] >= -estimation_period - before_event)
    ]

    # Identify event period
    event = data_ret[
        (data_ret["rel_day"] <= event_window_end)
        & (data_ret["rel_day"] >= event_window_start)
    ]

    # Calculate expected returns with the market model
    x_df = estimation[benchmark].values.reshape(-1, 1)

    # Create an empty list to store betas
    betas = []

    # Calculate betas for the market model
    for y in [benchmark, ticker]:
        y_df = estimation[y].values.reshape(-1, 1)
        reg = linear_model.LinearRegression()
        betas.append(reg.fit(x_df, y_df).coef_)

    # Convert the list to a Numpy Array
    beta_np = np.array(betas)

    # Expected Returns via Beta
    # Need Numpy Array to do Calculations!
    sp500array = event[benchmark].values
    expected_returns = np.outer(sp500array, beta_np)
    expected_returns = pd.DataFrame(expected_returns, index=event.index)
    expected_returns.columns = [benchmark, ticker]
    expected_returns = expected_returns.rename(columns={ticker: "expected_return"})
    del expected_returns[benchmark]

    # Abnormal Returns
    event = pd.concat([event, expected_returns], axis=1, ignore_index=False)

    event["abnormal_return"] = event[ticker] - event["expected_return"]

    # Event CAR
    winar1 = event[(event["rel_day"] <= 1) & (event["rel_day"] >= -1)][
        "abnormal_return"
    ].sum()  # CAR[-1,+1]
    winar2 = event[(event["rel_day"] <= 1) & (event["rel_day"] >= 0)][
        "abnormal_return"
    ].sum()  # CAR[0,+1]

    # Day-by-day AR
    winar3 = event[(event["rel_day"] <= -1) & (event["rel_day"] >= -1)][
        "abnormal_return"
    ].sum()  # Event Day -1
    winar4 = event[(event["rel_day"] <= 0) & (event["rel_day"] >= 0)][
        "abnormal_return"
    ].sum()  # Event Day 0
    winar5 = event[(event["rel_day"] <= 1) & (event["rel_day"] >= 1)][
        "abnormal_return"
    ].sum()  # Event Day 1

    # Post Event CAR
    winar6 = event[(event["rel_day"] <= 5) & (event["rel_day"] >= 2)][
        "abnormal_return"
    ].sum()  # CAR[2,5]
    winar7 = event[(event["rel_day"] <= 10) & (event["rel_day"] >= 2)][
        "abnormal_return"
    ].sum()  # CAR[2,10]
    winar8 = event[(event["rel_day"] <= 20) & (event["rel_day"] >= 2)][
        "abnormal_return"
    ].sum()  # CAR[2,20]

    # Pre Event CAR
    winar9 = event[(event["rel_day"] <= -2) & (event["rel_day"] >= -5)][
        "abnormal_return"
    ].sum()  # CAR[-5,-2]
    winar10 = event[(event["rel_day"] <= -2) & (event["rel_day"] >= -10)][
        "abnormal_return"
    ].sum()  # CAR[-10,-2]
    winar11 = event[(event["rel_day"] <= -2) & (event["rel_day"] >= -20)][
        "abnormal_return"
    ].sum()  # CAR[-20,-2]

    return (
        winar1,
        winar2,
        winar3,
        winar4,
        winar5,
        winar6,
        winar7,
        winar8,
        winar9,
        winar10,
        winar11,
    )

<a id="subsection-three-two-two"></a>
### 3.2.2 Import earnings surprises from Zacks

In [23]:
# local_path = "C:/jupyter_workspace/Fintech/Codes/Event Studies/"

data_events = pd.read_csv(
    "C:/Users/sekso/Desktop/MBA/sem4/Fintech/Bootcamp Codes and Data/04 Event Studies/earnings_surprises.csv", na_values=["."], parse_dates=["Date"]
)
data_events.tail()

Unnamed: 0,Ticker,Date,Eps_surprise,Type
95,XOM,2017-01-31,25.00%,1
96,XOM,2016-10-28,5.00%,1
97,XOM,2016-07-29,-35.94%,-1
98,XOM,2016-04-29,53.57%,1
99,XOM,2016-02-02,4.69%,1


In [24]:
pos_events = data_events[data_events["Type"] == 1].set_index("Ticker")
del pos_events["Type"]
neg_events = data_events[data_events["Type"] == -1].set_index("Ticker")
del neg_events["Type"]
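
The `Eps_surprise` column is stored as text such as `25.00%`. If the numeric surprise is needed (for example to sort events by magnitude), it can be parsed as sketched below. The three rows are copied from the `data_events.tail()` output above; the column names `surprise_pct` and `type_from_sign` are hypothetical, and deriving `Type` from the sign of the surprise is an assumption about how the CSV was built, shown only as a consistency check.

```python
import numpy as np
import pandas as pd

# A few rows copied from the data_events.tail() output above
events = pd.DataFrame(
    {
        "Ticker": ["XOM", "XOM", "XOM"],
        "Date": pd.to_datetime(["2017-01-31", "2016-10-28", "2016-07-29"]),
        "Eps_surprise": ["25.00%", "5.00%", "-35.94%"],
        "Type": [1, 1, -1],
    }
)

# Strip the percent sign and convert to a float (still in percent units)
events["surprise_pct"] = events["Eps_surprise"].str.rstrip("%").astype(float)

# Sign of the surprise: +1 for a beat, -1 for a miss, 0 for an exact match
events["type_from_sign"] = np.sign(events["surprise_pct"]).astype(int)

print(events)
```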

<a id="subsection-three-two-three"></a>
### 3.2.3 Run event studies on positive earnings surprises

In [25]:
cars_pos = []

# for ticker, eventdate in pos_events.items():

for index, row in pos_events.iterrows():
    data_ret = main_data[["Date", index, "^GSPC"]].copy()
    cars_pos.append(do_event_study(data_ret, ticker=index, eventdate=row["Date"]))

In [26]:
cars_pos = pd.DataFrame(cars_pos)
cars_pos.columns = [
    "winar1",
    "winar2",
    "winar3",
    "winar4",
    "winar5",
    "winar6",
    "winar7",
    "winar8",
    "winar9",
    "winar10",
    "winar11",
]
cars_pos

Unnamed: 0,winar1,winar2,winar3,winar4,winar5,winar6,winar7,winar8,winar9,winar10,winar11
0,-0.020848,0.005475,-0.026323,0.004301,0.001174,0.017900,-0.004624,0.106787,-0.005455,0.017964,-0.011394
1,-0.074545,-0.027271,-0.047274,0.008969,-0.036240,-0.010510,-0.041705,0.194037,0.066162,0.141583,0.482898
2,-0.033462,-0.002377,-0.031085,0.010333,-0.012710,0.035198,0.058243,-0.005804,0.100586,0.115042,0.388649
3,0.127858,0.124896,0.002963,0.025971,0.098925,0.159350,0.186197,0.243014,0.050631,0.083022,0.270685
4,0.179732,0.166632,0.013100,-0.007426,0.174058,0.038206,0.061280,0.124227,-0.021926,0.002633,0.112584
...,...,...,...,...,...,...,...,...,...,...,...
64,0.007787,0.009974,-0.002187,0.006415,0.003559,-0.004237,0.014684,-0.006373,-0.006462,-0.034180,-0.016409
65,-0.024776,-0.022255,-0.002521,-0.010678,-0.011576,-0.007035,-0.013529,-0.031600,-0.004635,-0.016789,-0.071143
66,-0.038041,-0.038852,0.000811,-0.021744,-0.017108,0.020961,0.020289,0.014104,-0.000530,0.003135,0.012572
67,0.014143,0.009690,0.004453,0.009312,0.000378,0.004886,0.020215,0.009683,0.017374,0.029017,0.039674


<a id="subsection-three-two-four"></a>
### 3.2.4 Run event studies on negative earnings surprises

In [27]:
cars_neg = []

# for ticker, eventdate in neg_events.items():

for index, row in neg_events.iterrows():
    data_ret = main_data[["Date", index, "^GSPC"]].copy()
    cars_neg.append(do_event_study(data_ret, ticker=index, eventdate=row["Date"]))

In [28]:
cars_neg = pd.DataFrame(cars_neg)
cars_neg.columns = [
    "winar1",
    "winar2",
    "winar3",
    "winar4",
    "winar5",
    "winar6",
    "winar7",
    "winar8",
    "winar9",
    "winar10",
    "winar11",
]
cars_neg

Unnamed: 0,winar1,winar2,winar3,winar4,winar5,winar6,winar7,winar8,winar9,winar10,winar11
0,-0.108597,-0.11729,0.008693,0.012032,-0.129322,0.065295,0.073595,0.003053,0.02142,0.105085,0.119836
1,-0.066735,-0.058987,-0.007748,-0.01685,-0.042137,-0.053549,0.014819,-0.206869,-0.0137,-0.043176,-0.036775
2,0.004562,-0.0011,0.005662,0.016479,-0.017579,0.019194,-0.019925,0.004333,-0.013372,-0.142895,-0.188777
3,0.185755,0.165332,0.020423,0.010556,0.154776,0.048407,-0.019176,-0.166324,-0.014206,-0.10028,-0.17677
4,-0.067332,-0.101849,0.034516,-0.033638,-0.068211,0.009908,0.048754,-0.000299,-0.053768,-0.111238,-0.100725
5,-0.087385,-0.074057,-0.013327,-0.023479,-0.050579,0.093998,0.051214,0.130304,0.029124,0.055054,0.06985
6,-0.065421,-0.077264,0.011843,-0.012715,-0.064549,-0.038571,-0.034896,0.007298,-0.040972,0.027071,0.048538
7,0.006717,0.010932,-0.004215,-0.010148,0.02108,-0.027968,-0.042581,-0.087135,0.001634,0.018479,0.026242
8,-0.112788,-0.084091,-0.028697,-0.034821,-0.04927,-0.019993,0.000974,0.004574,-0.04178,-0.010745,-0.073316
9,0.033652,0.031115,0.002537,-0.030672,0.061787,0.050642,0.149838,0.219904,-0.171983,-0.225932,-0.283596


<a id="subsection-three-three"></a>
## 3.3 How do stock prices behave around positive earnings surprises?

In [29]:
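# Calculate the mean and the standard error of the mean for each window across events
# Note: .sem() returns the standard error of the mean, so the "STD AAR" column below holds standard errors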
mean_AAR = cars_pos.mean()
std_AAR = cars_pos.sem()
# Put everything in Dataframes
stats = pd.DataFrame(mean_AAR, columns=['Mean AAR'])
stats['STD AAR'] = std_AAR
stats['T-Test'] = mean_AAR / std_AAR

# Note method sf (survival function) from scipy.stats.t (or st.t) calculates P-values from T-stats
# The method sf takes two arguments: T-statistic and degree of freedom, i.e., sf(absolute value of t-statistic, degree of freedom)
# For one-tail test multiply the function output by 1, for two-tail test multiply it by 2
stats['P-Value']  = st.t.sf(np.abs(stats['T-Test']), len(cars_pos)-1)*2

# Display is a great method to show multiple outputs at once
display(stats)

# Double check the calculation of T-statistics and P-value
# winars = ['winar1', 'winar2', 'winar3', 'winar4', 'winar5', 'winar6', 'winar7', 'winar8', 'winar9', 'winar10', 'winar11']

# for winar in winars:
#     print(st.ttest_1samp(cars_pos[winar],0))

Unnamed: 0,Mean AAR,STD AAR,T-Test,P-Value
winar1,0.012645,0.0062,2.039538,0.045284
winar2,0.013741,0.005371,2.558605,0.012741
winar3,-0.001096,0.002585,-0.42391,0.672969
winar4,0.009125,0.003056,2.98545,0.003933
winar5,0.004616,0.004688,0.984803,0.328212
winar6,0.002038,0.004248,0.4797,0.63298
winar7,0.001641,0.004903,0.334634,0.738931
winar8,0.00726,0.007728,0.939354,0.350874
winar9,0.006232,0.003244,1.921219,0.058897
winar10,0.015498,0.004491,3.451191,0.000964
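
The t-statistic in the table is the mean divided by the standard error of the mean, which is the one-sample t-test of the null hypothesis that the mean is zero. The sketch below verifies on synthetic data (68 hypothetical CARs, not the notebook's results) that the manual calculation agrees with `scipy.stats.ttest_1samp`, as the commented-out double check in the cell above suggests.

```python
import numpy as np
import pandas as pd
import scipy.stats as st

rng = np.random.default_rng(1)

# Hypothetical CARs for 68 events
cars = pd.Series(rng.normal(0.01, 0.05, size=68))

# Manual t-statistic and two-tailed p-value, as in the notebook
t_manual = cars.mean() / cars.sem()
p_manual = st.t.sf(np.abs(t_manual), len(cars) - 1) * 2

# Library equivalent: one-sample t-test against a mean of zero
t_scipy, p_scipy = st.ttest_1samp(cars, 0)

print(t_manual, t_scipy)
print(p_manual, p_scipy)
```

This test treats the events as independent draws; events that cluster in calendar time (several firms announcing in the same week) may violate that assumption.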


<a id="subsection-three-four"></a>
## 3.4 How do stock prices behave around negative earnings surprises?

In [30]:
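# Calculate the mean and the standard error of the mean for each window across events
# Note: .sem() returns the standard error of the mean, so the "STD AAR" column below holds standard errors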
mean_AAR = cars_neg.mean()
std_AAR = cars_neg.sem()
# Put everything in Dataframes
stats = pd.DataFrame(mean_AAR, columns=['Mean AAR'])
stats['STD AAR'] = std_AAR
stats['T-Test'] = mean_AAR / std_AAR

# Note method sf (survival function) from scipy.stats.t (or st.t) calculates P-values from T-stats
# The method sf takes two arguments: T-statistic and degree of freedom, i.e., sf(absolute value of t-statistic, degree of freedom)
# For one-tail test multiply the function output by 1, for two-tail test multiply it by 2
stats['P-Value']  = st.t.sf(np.abs(stats['T-Test']), len(cars_neg)-1)*2

# Display is a great method to show multiple outputs at once
display(stats)

# Double check the calculation of T-statistics and P-value
# winars = ['winar1', 'winar2', 'winar3', 'winar4', 'winar5', 'winar6', 'winar7', 'winar8', 'winar9', 'winar10', 'winar11']

# for winar in winars:
#     print(st.ttest_1samp(cars_neg[winar],0))

Unnamed: 0,Mean AAR,STD AAR,T-Test,P-Value
winar1,-0.025378,0.010085,-2.516376,0.017438
winar2,-0.025676,0.009931,-2.585407,0.01483
winar3,0.000298,0.002811,0.105964,0.916316
winar4,-0.012266,0.004361,-2.812642,0.008583
winar5,-0.01341,0.008858,-1.513928,0.140509
winar6,0.000467,0.00592,0.078953,0.937594
winar7,0.006569,0.007969,0.824224,0.416318
winar8,-0.005748,0.015094,-0.380847,0.706
winar9,-0.007601,0.006592,-1.152978,0.258024
winar10,-0.011153,0.011442,-0.974743,0.337482


<a id="subsection-three-five"></a>
## 3.5 Repeat the analyses but include 3 more companies, NFLX, AIG, and DIS, in the earnings surprises data

<a id="subsection-three-five-one"></a>
### 3.5.1 Import earnings surprises from Zacks

In [31]:
# local_path = "C:/jupyter_workspace/Fintech/Codes/Event Studies/"

data_events = pd.read_csv(
    "C:/Users/sekso/Desktop/MBA/sem4/Fintech/Bootcamp Codes and Data/04 Event Studies/earnings_surprises_more_companies.csv", na_values=["."], parse_dates=["Date"]
)
data_events.tail()

Unnamed: 0,Ticker,Date,Eps_surprise,Type
155,AIG,2017-02-14,-32.26%,-1
156,AIG,2016-02-11,-17.36%,-1
157,AIG,2016-02-08,7.69%,1
158,AIG,2016-02-05,-34.34%,-1
159,AIG,2016-11-02,-18.28%,-1


In [32]:
pos_events = data_events[data_events["Type"] == 1].set_index("Ticker")
del pos_events["Type"]
neg_events = data_events[data_events["Type"] == -1].set_index("Ticker")
del neg_events["Type"]

<a id="subsection-three-five-two"></a>
### 3.5.2 Run event studies on positive earnings surprises

In [33]:
cars_pos = []

# for ticker, eventdate in pos_events.items():

for index, row in pos_events.iterrows():
    data_ret = main_data[["Date", index, "^GSPC"]].copy()
    cars_pos.append(do_event_study(data_ret, ticker=index, eventdate=row["Date"]))

In [34]:
cars_pos = pd.DataFrame(cars_pos)
cars_pos.columns = [
    "winar1",
    "winar2",
    "winar3",
    "winar4",
    "winar5",
    "winar6",
    "winar7",
    "winar8",
    "winar9",
    "winar10",
    "winar11",
]
# cars_pos

<a id="subsection-three-five-three"></a>
### 3.5.3 Run event studies on negative earnings surprises

In [35]:
cars_neg = []

# for ticker, eventdate in neg_events.items():

for index, row in neg_events.iterrows():
    data_ret = main_data[["Date", index, "^GSPC"]].copy()
    cars_neg.append(do_event_study(data_ret, ticker=index, eventdate=row["Date"]))

In [36]:
cars_neg = pd.DataFrame(cars_neg)
cars_neg.columns = [
    "winar1",
    "winar2",
    "winar3",
    "winar4",
    "winar5",
    "winar6",
    "winar7",
    "winar8",
    "winar9",
    "winar10",
    "winar11",
]
# cars_neg

<a id="subsection-three-five-four"></a>
### 3.5.4 How do stock prices behave around positive earnings surprises?

In [37]:
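# Calculate the mean and the standard error of the mean for each window across events
# Note: .sem() returns the standard error of the mean, so the "STD AAR" column below holds standard errors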
mean_AAR = cars_pos.mean()
std_AAR = cars_pos.sem()
# Put everything in Dataframes
stats = pd.DataFrame(mean_AAR, columns=['Mean AAR'])
stats['STD AAR'] = std_AAR
stats['T-Test'] = mean_AAR / std_AAR

# Note method sf (survival function) from scipy.stats.t (or st.t) calculates P-values from T-stats
# The method sf takes two arguments: T-statistic and degree of freedom, i.e., sf(absolute value of t-statistic, degree of freedom)
# For one-tail test multiply the function output by 1, for two-tail test multiply it by 2
stats['P-Value']  = st.t.sf(np.abs(stats['T-Test']), len(cars_pos)-1)*2

# Display is a great method to show multiple outputs at once
display(stats)

# Double check the calculation of T-statistics and P-value
# winars = ['winar1', 'winar2', 'winar3', 'winar4', 'winar5', 'winar6', 'winar7', 'winar8', 'winar9', 'winar10', 'winar11']

# for winar in winars:
#     print(st.ttest_1samp(cars_pos[winar],0))

| | Mean AAR | STD AAR | T-Test | P-Value |
|---|---|---|---|---|
| winar1 | 0.006726 | 0.005568 | 1.20806 | 0.22979 |
| winar2 | 0.006998 | 0.005079 | 1.37794 | 0.171208 |
| winar3 | -0.000272 | 0.001925 | -0.141247 | 0.887951 |
| winar4 | 0.006955 | 0.002632 | 2.642587 | 0.009511 |
| winar5 | 4.2e-05 | 0.004445 | 0.00956 | 0.992391 |
| winar6 | -0.002384 | 0.003687 | -0.646612 | 0.519322 |
| winar7 | -0.001191 | 0.00493 | -0.241664 | 0.809521 |
| winar8 | -0.000486 | 0.007039 | -0.069105 | 0.94504 |
| winar9 | 0.004337 | 0.002785 | 1.557608 | 0.122394 |
| winar10 | 0.014454 | 0.0041 | 3.525614 | 0.000632 |
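
The `winar` row labels in the table above are hard to read without knowing which window each one covers. The sketch below maps them to the windows named in the inline comments of the single-event cell in this notebook. It assumes that `do_event_study` returns its eleven values in that same order; `WINDOW_LABELS` and `label_windows` are hypothetical names used only for illustration.

```python
import pandas as pd

# Window labels taken from the inline comments of the single-event cell.
# Assumption: do_event_study returns the windows in this same order.
WINDOW_LABELS = {
    "winar1": "CAR[-1,+1]",
    "winar2": "CAR[0,+1]",
    "winar3": "AR[-1]",
    "winar4": "AR[0]",
    "winar5": "AR[+1]",
    "winar6": "CAR[+2,+5]",
    "winar7": "CAR[+2,+10]",
    "winar8": "CAR[+2,+20]",
    "winar9": "CAR[-5,-2]",
    "winar10": "CAR[-10,-2]",
    "winar11": "CAR[-20,-2]",
}


def label_windows(stats):
    """Return a copy of a stats table with winar rows renamed to their windows."""
    return stats.rename(index=WINDOW_LABELS)


# Two rows from the table above, used only to demonstrate the renaming.
demo = pd.DataFrame({"Mean AAR": [0.006726, 0.006955]}, index=["winar1", "winar4"])
labelled = label_windows(demo)
print(labelled)
```

Under this mapping, the significant `winar4` row corresponds to the abnormal return on the announcement day itself.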


<a id="subsection-three-five-five"></a>
### 3.5.5 How do stock prices behave around negative earnings surprises?

In [38]:
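# Calculate the mean and the standard error of the average abnormal return (AAR)
mean_AAR = cars_neg.mean()
std_AAR = cars_neg.sem()  # .sem() is the standard error of the mean, not the standard deviation
# Put everything in a DataFrame
stats = pd.DataFrame(mean_AAR, columns=['Mean AAR'])
stats['STD AAR'] = std_AAR  # holds the standard error; label kept to match the output below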
stats['T-Test'] = mean_AAR / std_AAR

# Note method sf (survival function) from scipy.stats.t (or st.t) calculates P-values from T-stats
# The method sf takes two arguments: T-statistic and degree of freedom, i.e., sf(absolute value of t-statistic, degree of freedom)
# For one-tail test multiply the function output by 1, for two-tail test multiply it by 2
stats['P-Value']  = st.t.sf(np.abs(stats['T-Test']), len(cars_neg)-1)*2

# Display is a great method to show multiple outputs at once
display(stats)

# Double check the calculation of T-statistics and P-value
# winars = ['winar1', 'winar2', 'winar3', 'winar4', 'winar5', 'winar6', 'winar7', 'winar8', 'winar9', 'winar10', 'winar11']

# for winar in winars:
#     print(st.ttest_1samp(cars_neg[winar],0))

| | Mean AAR | STD AAR | T-Test | P-Value |
|---|---|---|---|---|
| winar1 | -0.016293 | 0.007587 | -2.147622 | 0.036166 |
| winar2 | -0.016283 | 0.007352 | -2.214919 | 0.030926 |
| winar3 | -1e-05 | 0.002207 | -0.004508 | 0.996419 |
| winar4 | -0.008029 | 0.002841 | -2.826213 | 0.006554 |
| winar5 | -0.008254 | 0.006834 | -1.207664 | 0.232344 |
| winar6 | -0.000647 | 0.005756 | -0.11245 | 0.910876 |
| winar7 | 0.000693 | 0.00779 | 0.088953 | 0.929442 |
| winar8 | -0.00771 | 0.011556 | -0.667232 | 0.507414 |
| winar9 | -0.00773 | 0.005336 | -1.448485 | 0.153159 |
| winar10 | -0.005089 | 0.007922 | -0.642376 | 0.523298 |
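
The commented-out "double check" in the cells above can be made concrete. The sketch below compares the manual calculation (mean divided by the standard error, then `st.t.sf` for a two-tailed p-value) with `scipy.stats.ttest_1samp`. The data are synthetic and purely illustrative; `cars_demo` is a hypothetical stand-in for `cars_pos` or `cars_neg`.

```python
import numpy as np
import pandas as pd
import scipy.stats as st

# Synthetic CARs (hypothetical): one row per event, one column per window.
rng = np.random.default_rng(0)
cars_demo = pd.DataFrame(
    rng.normal(loc=0.005, scale=0.03, size=(60, 2)),
    columns=["winar1", "winar2"],
)

# Manual calculation, as in the cells above.
mean_aar = cars_demo.mean()
se_aar = cars_demo.sem()  # sample standard deviation / sqrt(N)
t_manual = mean_aar / se_aar
p_manual = st.t.sf(np.abs(t_manual), len(cars_demo) - 1) * 2  # two-tailed

# Library calculation: one-sample t-test of each column against a mean of 0.
t_scipy, p_scipy = st.ttest_1samp(cars_demo.to_numpy(), 0)

print(np.allclose(t_manual.to_numpy(), t_scipy))
print(np.allclose(np.asarray(p_manual), p_scipy))
```

Both approaches use N - 1 degrees of freedom, so the t-statistics and p-values should agree up to floating-point error.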


<a id="section-four"></a>
# 4. CONCLUSION

We look at positive earnings surprises, where the reported earnings per share (EPS) exceed the estimated EPS. The market reaction to positive earnings surprises is statistically significant. CAR[-1,+1] and CAR[0,+1] are 1.26% and 1.37%, respectively; both are significant at the 5% level. The day -1 to day +1 AR breakdown shows that most of the market reaction occurs on day 0 (i.e., the earnings announcement date). This is consistent with the EMH, specifically the semi-strong form. There is no evidence of return drift in the small sample we examine (69 positive earnings surprises); the post-event CARs are near zero and statistically insignificant. However, there is evidence of a pre-event market reaction. The CARs over the one-month and two-week windows before the event are 2.59% and 1.55%, respectively, and both are statistically significant; this may be attributable to confounding events or news. The CAR over the week before the earnings announcement is statistically and economically weaker. In the absence of major confounding events or news, the pre-event reaction is suggestive evidence of potential information leakage or insider trading.
## 4.1 Optional Q1
Do you find evidence suggesting underreaction for negative earnings surprises? Compare and contrast this with that of the positive earnings surprises. Expand the sample (i.e., include more companies). Do the results for earnings surprises change?

### Answer
According to our observations:

* The market's reaction to negative earnings surprises occurs mostly on the event day, similar to its reaction to positive earnings surprises.
* CAR[+2,+20] for the original dataset is -0.57% and statistically insignificant, suggesting that there is no clear sign of underreaction.
* When we include 3 more companies in the sample, signs of underreaction appear in the movements of individual stocks. For example, AIG announced an EPS surprise of -1.49% on 3 Aug 2020. CAR[+2,+10] fell to -2.38%, and CAR[+2,+20] fell further to -10.23%.

In [54]:
# Event CAR
winar1 = event[(event['rel_day'] <= 1)  & (event['rel_day'] >= -1)]['abnormal_return'].sum() # CAR[-1,+1]
winar2 = event[(event['rel_day'] <= 1)  & (event['rel_day'] >= 0)]['abnormal_return'].sum() # CAR[0,+1]

print("CAR [-1,+1]  = " + str(winar1))
print("CAR [0,1]  = " + str(winar2))
print("*"*80)

# Day-by-day AR
winar3 = event[(event['rel_day'] <= -1)  & (event['rel_day'] >= -1)]['abnormal_return'].sum() # Event Day -1
winar4 = event[(event['rel_day'] <= 0)  & (event['rel_day'] >= 0)]['abnormal_return'].sum() # Event Day 0
winar5 = event[(event['rel_day'] <= 1)  & (event['rel_day'] >= 1)]['abnormal_return'].sum() # Event Day 1

print("AR -1  = " + str(winar3))
print("AR 0  = " + str(winar4))
print("AR +1  = " + str(winar5))
print("*"*80)

# Post Event CAR
winar6 = event[(event['rel_day'] <= 5)  & (event['rel_day'] >= 2)]['abnormal_return'].sum() # CAR[2,5]
winar7 = event[(event['rel_day'] <= 10)  & (event['rel_day'] >= 2)]['abnormal_return'].sum() # CAR[2,10]
winar8 = event[(event['rel_day'] <= 20)  & (event['rel_day'] >= 2)]['abnormal_return'].sum() # CAR[2,20]

print("CAR [+2,+5]  = " + str(winar6))
print("CAR [+2,+10]  = " + str(winar7))
print("CAR [+2,+20]  = " + str(winar8))
print("*"*80)

# Pre Event CAR
winar9 = event[(event['rel_day'] <= -2)  & (event['rel_day'] >= -5)]['abnormal_return'].sum() # CAR[-5,-2]
winar10 = event[(event['rel_day'] <= -2)  & (event['rel_day'] >= -10)]['abnormal_return'].sum() # CAR[-10,-2]
winar11 = event[(event['rel_day'] <= -2)  & (event['rel_day'] >= -20)]['abnormal_return'].sum() # CAR[-20,-2]

print("CAR [-5,-2]  = " + str(winar9))
print("CAR [-10,-2]  = " + str(winar10))
print("CAR [-20,-2]  = " + str(winar11))
print("*"*80)

CAR [-1,+1]  = -0.09236485493051703
CAR [0,1]  = -0.09189529748055782
********************************************************************************
AR -1  = -0.00046955744995920944
AR 0  = -0.01104450248719498
AR +1  = -0.08085079499336284
********************************************************************************
CAR [+2,+5]  = 0.033811280207397695
CAR [+2,+10]  = -0.023757817712691403
CAR [+2,+20]  = -0.10233755436219362
********************************************************************************
CAR [-5,-2]  = 0.0021164283569551044
CAR [-10,-2]  = -0.01867606721051647
CAR [-20,-2]  = 0.013977639631129909
********************************************************************************
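
The eleven window sums in the cell above all follow the same pattern, so they could be expressed with one helper. The sketch below is one possible refactoring, not the notebook's own code: `car`, `WINDOWS` and `event_demo` are hypothetical names, and the demo abnormal returns are made up for illustration. It assumes a frame with `rel_day` and `abnormal_return` columns, as in the `event` frame used above.

```python
import pandas as pd


def car(event, start, end):
    """Sum abnormal returns over relative days start..end (both inclusive)."""
    in_window = event["rel_day"].between(start, end)
    return event.loc[in_window, "abnormal_return"].sum()


# Hypothetical abnormal returns for relative days -2..+2.
event_demo = pd.DataFrame(
    {
        "rel_day": [-2, -1, 0, 1, 2],
        "abnormal_return": [0.01, -0.02, 0.03, 0.01, -0.01],
    }
)

# A subset of the windows used in the cell above.
WINDOWS = {
    "CAR[-1,+1]": (-1, 1),
    "CAR[0,+1]": (0, 1),
    "AR[0]": (0, 0),
}

cars = {label: car(event_demo, lo, hi) for label, (lo, hi) in WINDOWS.items()}
for label, value in cars.items():
    print(f"{label} = {value:.4f}")
```

A single-day abnormal return is just the special case where `start` equals `end`, which removes the need for the paired `<=` and `>=` filters on the same day.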


<a id="section-five"></a>
# 5. REFERENCES

Ray Ball and Philip Brown (1968), [An Empirical Evaluation of Accounting Income Numbers](https://www.jstor.org/stable/2490232?seq=1). Journal of Accounting Research, 6: 159-178. 

This notebook is adapted from code written by [Jeroen Bouma](https://gist.github.com/JerBouma/56a3be80ce02392c4e4fcae2763c5bcf), ALM Advisor, and [Roland Gemayel](https://www.kcl.ac.uk/people/roland-gemayel), King's College, London. 