## Table of Contents
- [Introduction](#introduction)
- [Data Wrangling](#wrangling)
    - [Gather](#gather)
    - [Assess](#assess)
    - [Clean](#clean)
    - [Analyze](#analyze)
    - [Visualize](#visualize)
- [Conclusions](#conclusions)

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sb

## I) Introduction <a id = "introduction"></a>

**Aim:** Analyze the absolute difference, and possibly the margin of error, between forecasted and actual stock market price returns.

I will analyze quarterly price returns over the past 20 years for the firms in the 2019 S&P 500 Index.
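
To make the aim concrete, here is a minimal sketch of the two error measures. The column names and return values are hypothetical, and reading "margin of error" as the absolute difference scaled by the actual return is an assumption, not a definition taken from the data.

```python
import pandas as pd

# Hypothetical quarterly returns (in percent) for a single firm.
# These values are illustrative only; they do not come from ./data.
returns = pd.DataFrame(
    {
        "forecast_return": [2.0, -1.0, 4.0],
        "actual_return": [1.5, -2.0, 5.0],
    },
    index=["2018Q1", "2018Q2", "2018Q3"],
)

# Absolute difference between the forecasted and the realized return.
returns["abs_diff"] = (returns["forecast_return"] - returns["actual_return"]).abs()

# Relative error (assumed reading of "margin of error"):
# the absolute difference scaled by the magnitude of the actual return.
returns["rel_error"] = returns["abs_diff"] / returns["actual_return"].abs()

# Average absolute forecast miss across the quarters.
mean_abs_diff = returns["abs_diff"].mean()
print(returns)
```

The relative error is undefined when the actual return is zero, so those quarters would need separate handling.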

> At first, I wanted to analyze the forecasted vs. actual price earnings of the entire S&P over the past 20 years. However, because firms enter and leave stock indices every year, comparing annual S&P returns alone would introduce inconsistencies and errors of varying size. To address this problem, I have identified two approaches:
- Analyze the historical earnings of *only* the firms present in the 2019 S&P Index.
- Track every firm that was present in the S&P at any point in the past 20 years and count how many times each firm appeared in the Index. Then analyze the firms with the lowest counts individually to see how they differ from the firms that stayed longer.
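
The second approach boils down to a count per firm. A minimal sketch, assuming a hypothetical membership table with one row per year and ticker (no such table exists in `./data` yet, and the tickers are placeholders):

```python
import pandas as pd

# Hypothetical index membership: one row per (year, ticker) pair.
membership = pd.DataFrame(
    {
        "year": [2017, 2017, 2018, 2018, 2019, 2019],
        "ticker": ["AAA", "BBB", "AAA", "BBB", "AAA", "CCC"],
    }
)

# Approach 1: keep only the firms present in the most recent year.
latest_year = membership["year"].max()
latest_members = set(membership.loc[membership["year"] == latest_year, "ticker"])

# Approach 2: count the number of distinct years each firm appeared in the index.
years_in_index = membership.groupby("ticker")["year"].nunique().sort_values()

# Firms with the fewest appearances are the candidates for individual analysis.
shortest_tenure = years_in_index[years_in_index == years_in_index.min()].index.tolist()
print(years_in_index)
```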


## II) Data Wrangling <a id="wrangling"></a>

To gather the data stored in the `./data` folder, I used Bloomberg Excel functions.

### A) Gather <a id = "gather"></a>
> **APPROACH 1:** Focus on the firms that appear in the 2019 S&P Index and analyze their forecasted vs. actual price earnings for the last 20 years.

To keep the analysis consistent across firms, I group both the forecasted and actual price earnings by *calendar period* rather than fiscal period. Fiscal periods differ from firm to firm, whereas calendar periods cover the same dates for every firm.
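
As an illustration of the calendar-period choice, the sketch below maps a few quarter-end dates (written in the same `m/d/YYYY` style as `sp-2019.csv`) to calendar quarters with pandas periods. The firm with a fiscal year ending in September is a hypothetical example.

```python
import pandas as pd

# Quarter-end dates in m/d/YYYY format.
dates = pd.to_datetime(
    ["6/30/1999", "9/29/2006", "3/29/2018", "12/31/2018"], format="%m/%d/%Y"
)

# Calendar quarters: the same date always maps to the same label for every firm.
calendar_quarters = [str(p) for p in dates.to_period("Q")]

# Hypothetical firm whose fiscal year ends in September:
# the same date falls in a different (fiscal) quarter.
fiscal_quarter = str(pd.Timestamp("2018-12-31").to_period("Q-SEP"))

print(calendar_quarters)
print(fiscal_quarter)
```

Here 12/31/2018 is calendar quarter 2018Q4 but fiscal quarter 2019Q1 for the September-year-end firm, which is the kind of mismatch that grouping by calendar period avoids.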

### Forecasted Stock Returns (201? - 201?)

### Historic Stock Returns

In [16]:
sp_2019_path = './data/'
df_sp_2019 = pd.read_csv(sp_2019_path + 'sp-2019.csv')

## B) Assess <a id = "assess"></a>

### Forecasted Stock Returns

### Historic Stock Returns

In [13]:
# Display 10 randomly sampled rows
df_sp_2019.sample(10)

| Unnamed: 0 | date | A UN Equity | AAL UW Equity | AAP UN Equity | AAPL UW Equity | ABBV UN Equity | ABC UN Equity | ABMD UW Equity | ABT UN Equity | ACN UN Equity | ... | XEL UW Equity | XLNX UW Equity | XOM UN Equity | XRAY UW Equity | XRX UN Equity | XYL UN Equity | YUM UN Equity | ZBH UN Equity | ZION UW Equity | ZTS UN Equity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 30 | 9/29/2006 | 22.0256 | | 32.94 | 11.0043 | | 21.9296 | 14.79 | 23.2347 | 31.71 | ... | | 21.95 | 67.1 | 30.11 | 40.9943 | | 18.7134 | 67.5 | 79.81 | |
| 1 | 6/30/1999 | | | | 1.654 | | 6.1859 | 6.875 | 20.363 | | ... | | 28.625 | 38.5625 | 9.625 | 155.6057 | | 9.7297 | | 63.5 | |
| 45 | 6/30/2010 | 20.3299 | | 50.18 | 35.9329 | | 31.75 | 9.68 | 22.383 | 38.65 | ... | | 25.26 | 57.07 | 29.91 | 21.1821 | | 28.0719 | 54.05 | 21.57 | |
| 76 | 3/29/2018 | 66.9 | 51.96 | 118.55 | 167.78 | 94.65 | 86.21 | 290.99 | 59.92 | 153.5 | ... | 45.48 | 72.24 | 74.61 | 50.31 | 28.78 | 76.92 | 85.13 | 109.04 | 52.73 | 83.51 |
| 49 | 6/30/2011 | 36.5481 | | 58.49 | 47.9529 | | 41.4 | 16.2 | 25.1773 | 60.42 | ... | | 36.47 | 81.38 | 38.08 | 27.4261 | | 39.7206 | 63.2 | 24.01 | |
| 22 | 9/30/2004 | 14.5332 | | 22.933 | 2.7679 | | 13.0292 | 8.85 | 20.2681 | 27.05 | ... | | 27.0 | 48.33 | 25.97 | 37.0951 | | 14.6184 | 79.04 | 61.04 | |
| 79 | 12/31/2018 | 67.46 | 32.11 | 157.46 | 157.74 | 92.19 | 74.4 | 325.04 | 72.33 | 141.01 | ... | 49.27 | 85.17 | 68.19 | 37.21 | 19.76 | 66.72 | 91.92 | 103.72 | 40.74 | 85.54 |
| 42 | 9/30/2009 | 19.9009 | | 39.28 | 26.4814 | | 22.38 | 9.71 | 23.6701 | 37.27 | ... | | 23.42 | 68.61 | 34.54 | 20.3918 | | 24.2753 | 53.45 | 17.97 | |
| 73 | 6/30/2017 | 59.31 | 50.32 | 116.59 | 144.02 | 72.51 | 94.53 | 143.3 | 48.61 | 123.68 | ... | | 64.32 | 80.73 | 64.84 | 28.73 | 55.43 | 73.76 | 128.4 | 43.91 | 62.38 |
| 19 | 12/31/2003 | 19.701 | | 27.133 | 1.5264 | | 13.6211 | 6.91 | 20.8553 | 26.32 | ... | | 38.74 | 41.0 | 22.585 | 36.3574 | | 12.3677 | 70.4 | 61.33 | |
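
The `Unnamed: 0` column in the sample above suggests that `sp-2019.csv` was saved together with its row index. A minimal sketch of a load that avoids the extra column and parses `date` as a datetime, assuming that is indeed how the file was written. The inline CSV is a tiny stand-in built from three rows and two ticker columns of the sample, not the real file.

```python
import io

import pandas as pd

# Stand-in for './data/sp-2019.csv' (assumption: the first, unnamed column is a saved index).
csv_text = """,date,AAPL UW Equity,XOM UN Equity
1,6/30/1999,1.654,38.5625
19,12/31/2003,1.5264,41.0
79,12/31/2018,157.74,68.19
"""

# index_col=0 reuses the saved index instead of creating 'Unnamed: 0';
# parse_dates converts the 'date' strings to datetime64 values.
df = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=["date"])

print(df.dtypes)
print(df.shape)
```

Loaded this way, the frame has one column fewer, so any column count should be read with that in mind.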


In [15]:
df_sp_2019.shape

(84, 506)

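**Observation:** The DataFrame has 84 rows (quarterly observations) and 506 columns. Two of those columns are `Unnamed: 0` and `date`, so it covers at most 504 firms rather than 506.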




### Quality

**Missing Data**
- 
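
The empty cells in the sampled rows indicate that some tickers have no price in earlier quarters. A minimal sketch of how the gaps could be quantified per ticker, using a small stand-in frame built from three sampled rows. On the full data, `df_sp_2019` would take the place of `prices`.

```python
import numpy as np
import pandas as pd

# Stand-in for df_sp_2019: three sampled rows, three ticker columns.
prices = pd.DataFrame(
    {
        "date": ["6/30/1999", "12/31/2003", "12/31/2018"],
        "AAPL UW Equity": [1.654, 1.5264, 157.74],
        "ABBV UN Equity": [np.nan, np.nan, 92.19],
        "ZTS UN Equity": [np.nan, np.nan, 85.54],
    }
)

# Only the ticker columns carry prices; 'date' is excluded from the count.
ticker_cols = prices.columns.drop("date")

# Number and share of missing quarters per ticker.
missing_per_ticker = prices[ticker_cols].isna().sum().sort_values(ascending=False)
missing_share = prices[ticker_cols].isna().mean()

print(missing_per_ticker)
```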


### Tidiness

## C) Cleaning <a id = "clean"></a>

# III) Store Data

# IV) Explore Data

## Univariate

## Bivariate

## Multivariate

# V) Visualize Data