# 1. Introduction

### Quantitative Finance and Data Science
#### Rodrigo Lugo Frias and León Berdichevsky Acosta
#### ITAM Spring 2019


By playing with this notebook you can get familiar with Python and its use for data science.

---

_INSTRUCTIONS:_
* Run any cell with __Shift + Enter__ or __Ctrl + Enter__

_NOTES:_
* _Notebook adapted from various sources and projects_

First we import the libraries we are going to use

In [None]:
%matplotlib inline

# Main libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(font_scale=1.5)
import datetime as dt

#Silence all warnings
import warnings
warnings.filterwarnings('ignore')

* We read a dataset (.csv file)

In [None]:
data = pd.read_csv("Data/GAPB MM Equity.csv")

* General information about the dataset

In [None]:
data.info()

Note that the elements of the 'Date' column are of type int

In [None]:
type(data.Date[0])

Let's convert the format from int to datetime

In [None]:
new_date  = pd.to_datetime(data.Date, format='%Y%m%d')
data.Date = new_date
type(data.Date[0])
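
As a minimal sketch of the conversion above, the following cell uses a small hypothetical DataFrame (the values are made up, not taken from the CSV file) to show how integer dates in YYYYMMDD form become timestamps.

```python
import pandas as pd

# Hypothetical sample: dates stored as integers in YYYYMMDD form
sample = pd.DataFrame({
    "Date": [20190102, 20190103, 20190104],
    "Last": [100.0, 101.5, 99.8],
})

# format='%Y%m%d' tells pandas how to interpret the eight digits
sample["Date"] = pd.to_datetime(sample["Date"], format="%Y%m%d")

print(sample.dtypes)
print(type(sample["Date"][0]))
```

Passing an explicit `format` avoids the integers being interpreted as offsets from the Unix epoch, which is what `pd.to_datetime` does with plain numbers by default.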

## Summary statistics of the dataset

In [None]:
data.describe()

* We set the date column as the index

In [None]:
data.index = data.Date
data.tail()
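
One reason to index by date is that rows can then be selected with date strings. A minimal sketch, assuming a small hypothetical price table (the names `prices` and `indexed` and all values are illustrative only):

```python
import pandas as pd

# Hypothetical prices around a year end
prices = pd.DataFrame({
    "Date": pd.to_datetime(["2018-12-28", "2018-12-31", "2019-01-02", "2019-01-03"]),
    "Last": [100.0, 101.0, 99.5, 102.0],
})

# set_index moves the column into the index (it no longer appears as a column)
indexed = prices.set_index("Date")

# With a DatetimeIndex, .loc accepts partial date strings
jan_2019 = indexed.loc["2019-01"]
print(jan_2019)
```

Assigning `data.index = data.Date`, as in the cell above, keeps `Date` as a regular column as well, while `set_index` removes it from the columns. Either approach gives a DatetimeIndex.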

* We plot GAP's performance over the last year (the most recent 252 trading days)

In [None]:
fig, ax = plt.subplots()
data[['Last','Open','High','Low']].tail(252).plot(figsize = (9,5), ax = ax)
ax.set_xlabel('Date')
ax.set_ylabel('Price')
ax.set_title('GAPB MM Equity (201801 - Present)')
plt.show()

* We read a dataset from Quandl

In [None]:
import quandl
#######################################
#-------------------------------------#
# Get data from quandl                #
#-------------------------------------#
#######################################
token = 'cRFQueKc_6WsrdAU8GaH'
strt  = "2018-01-21"
endd  = "2019-01-21" 
stocks = ["EOD/AAPL"]
ql_data   = quandl.get(stocks, authtoken=token, start_date=strt, end_date=endd)
ql_data.columns.tolist()
adj_close = [s for s in ql_data.columns.tolist() if "Adj_Close" in s]

In [None]:
ql_data.columns

In [None]:
new_columns = [i.replace('EOD/AAPL - ','') for i in ql_data.columns]
ql_data.columns = new_columns
ql_data.columns
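
The column renaming above can also be done with the vectorized string methods of the column index. A minimal sketch on a hypothetical one-row frame (the column names mimic the Quandl prefix, the values are made up):

```python
import pandas as pd

# Hypothetical frame with prefixed column names
frame = pd.DataFrame(
    [[1.0, 2.0]],
    columns=["EOD/AAPL - Open", "EOD/AAPL - Adj_Close"],
)

# regex=False treats the prefix as a literal string, not a pattern
frame.columns = frame.columns.str.replace("EOD/AAPL - ", "", regex=False)
print(frame.columns.tolist())
```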

In [None]:
fig, ax = plt.subplots()
ql_data['Open'].plot(figsize = (9,5), ax = ax)
ax.set_title('AAPL Quote')
ax.set_xlabel('Date')
ax.set_ylabel('Price')
plt.show()

## Prices and returns

In [None]:
stocks = ["EOD/MSFT"]
ql_data = quandl.get(stocks, authtoken=token, start_date=strt, end_date=endd)
ql_data.columns.tolist()
adj_close = [s for s in ql_data.columns.tolist() if "Adj_Close" in s]
new_columns = [i.replace('EOD/MSFT - ','') for i in ql_data.columns]
ql_data.columns = new_columns
ql_data.columns
fig, ax = plt.subplots()
ql_data['Open'].plot(figsize = (9,5), ax = ax)
ax.set_title('MSFT Quote')
ax.set_xlabel('Date')
ax.set_ylabel('Price')
plt.show()

In [None]:
ql_data = ql_data[['Open', 'High', 'Low', 'Close']]
ql_data.head()

In [None]:
fig, ax = plt.subplots()
ql_data.plot.density(figsize = (9,5), ax = ax)
ax.set_title('MSFT Quote')
ax.set_xlabel('Price')
ax.set_ylabel('Prob Density')
plt.show()

### Returns

In [None]:
# diff() gives day-over-day price changes (in price units), used here as returns
ret = ql_data.diff().dropna()
ret[['Open', 'High', 'Low', 'Close']].head()
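
The cell above measures returns as price differences. For comparison, here is a minimal sketch of three common definitions on a hypothetical price series (the values are made up): absolute change, simple return, and log return.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices
close = pd.Series([100.0, 102.0, 101.0, 104.03], name="Close")

abs_change = close.diff().dropna()        # P_t - P_{t-1}, in price units
simple_ret = close.pct_change().dropna()  # P_t / P_{t-1} - 1, unitless
log_ret = np.log(close).diff().dropna()   # ln(P_t / P_{t-1})

print(pd.DataFrame({"abs": abs_change, "simple": simple_ret, "log": log_ret}))
```

Absolute changes depend on the price level, so they are hard to compare across stocks. Simple and log returns are scale free, and for small moves they are close to each other since ln(1 + r) ≈ r.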

In [None]:
mean = ret.Close.describe()['mean']
std  = ret.Close.describe()['std']

ret.describe()

In [None]:
from scipy.stats import norm

# Normal density with the sample mean and standard deviation of the daily Close changes
x = np.linspace(mean - 3*std, mean + 3*std, 100)
fig, ax = plt.subplots()
ax.set_title('MSFT Quote PX_Close')
ax.set_xlabel('Expected return')
ax.set_ylabel('Prob Density')
# norm.pdf replaces mlab.normpdf, which was removed from matplotlib
ax.plot(x, norm.pdf(x, mean, std))
plt.show()
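
The curve plotted above is the normal density with the sample mean and standard deviation. As a sketch, it can be written explicitly with NumPy; the helper `normal_pdf` and the parameter values below are hypothetical, chosen only for illustration.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

mu, sigma = 0.0, 1.5  # hypothetical parameters
x = np.linspace(mu - 3 * sigma, mu + 3 * sigma, 100)
density = normal_pdf(x, mu, sigma)

# Trapezoidal rule: the area within three standard deviations is roughly 0.997
area = np.sum((density[1:] + density[:-1]) / 2 * np.diff(x))
print(area)
```

Plotting over the mean plus or minus three standard deviations therefore covers almost all of the probability mass of the fitted normal.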

In [None]:
fig, ax = plt.subplots()
ret.plot.density(figsize = (9,5), ax = ax)
ax.set_title('MSFT Quote')
ax.set_xlabel('Expected return')
ax.set_ylabel('Prob Density')
plt.show()

### Higher moments of the distribution

In [None]:
# Skewness
ret.skew()

In [None]:
# Kurtosis
ret.kurt()
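
To help read these two numbers, here is a minimal sketch on synthetic data (all series are hypothetical). In pandas, `skew()` is zero for a symmetric sample, and `kurt()` reports excess kurtosis, so a normal sample gives a value near zero and heavier tails give larger values.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

symmetric = pd.Series([-2.0, -1.0, 0.0, 1.0, 2.0])
right_skewed = pd.Series([1.0, 1.0, 1.0, 2.0, 10.0])
normal_sample = pd.Series(rng.normal(size=10_000))
heavy_tailed = pd.Series(rng.standard_t(df=3, size=10_000))  # Student t, 3 d.o.f.

print("skew, symmetric:    ", symmetric.skew())
print("skew, right-skewed: ", right_skewed.skew())
print("kurt, normal:       ", normal_sample.kurt())
print("kurt, heavy-tailed: ", heavy_tailed.kurt())
```

Daily stock returns often show positive excess kurtosis, which is one reason the fitted normal curve and the empirical density can differ in the tails.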

### Monthly returns

In [None]:
month_ret = ret['Close'].groupby(pd.Grouper(freq='M'))
month_ret.describe()

#### Quarterly returns

In [None]:
quart_ret = ret['Close'].groupby(pd.Grouper(freq='Q'))
quart_ret.describe()
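
The grouping used above collects the daily values that fall in each calendar month or quarter and summarizes each group. A minimal sketch on synthetic data, assuming a hypothetical series `changes` indexed by business days; it groups by calendar period with `to_period` rather than `pd.Grouper`, a swapped-in technique that produces the same kind of buckets.

```python
import numpy as np
import pandas as pd

# Hypothetical daily changes on business days in the first half of 2018
idx = pd.bdate_range("2018-01-01", "2018-06-29")
rng = np.random.default_rng(1)
changes = pd.Series(rng.normal(0, 1, len(idx)), index=idx, name="Close")

# One group per calendar month, then one per calendar quarter
monthly = changes.groupby(changes.index.to_period("M")).agg(["mean", "std", "count"])
quarterly = changes.groupby(changes.index.to_period("Q")).agg(["mean", "std", "count"])

print(monthly)
print(quarterly)
```

Note that these are statistics of daily values within each period, not compounded monthly or quarterly returns.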