# Stock market

Predicting stock trading volume with a Random Forest Regressor model, including financial indicators such as CMO, DX, and MFI

* We will use Yahoo Finance data for PETR4 over a specific time window

* The data will be obtained through the Pandas DataReader library

Importing libraries

In [None]:
import datetime as dt
import pandas_datareader.data as web
import matplotlib.pyplot as plt
import numpy as np

Defining the time window

In [None]:
start = dt.datetime(2018,1,1)
end = dt.datetime(2020,9,30)

Fetching the PETR4 data

In [None]:
PETR4 = web.DataReader('PETR4.SA',"yahoo",start,end)

Displaying the first five rows

In [None]:
PETR4.head()

Adding indicators

In [None]:
High = PETR4['High'].values
Low = PETR4['Low'].values
Open = PETR4['Open'].values
Close = PETR4['Close'].values
# TA-Lib functions expect float64 arrays; Volume may come back as integers
Volume = PETR4['Volume'].values.astype(float)

In [None]:
# Import from the public talib namespace, listing each indicator once
from talib import ADX, APO, CCI, CMO, DX, MACD, MFI, ROC, RSI, ULTOSC

ADX indicator (Average Directional Movement Index)

In [None]:
PETR4['ADX'] = ADX(High, Low, Close, timeperiod=14)

APO indicator (Absolute Price Oscillator)

In [None]:
PETR4['APO'] = APO(Close, fastperiod=12, slowperiod=26, matype=0)

CCI indicator (Commodity Channel Index)

In [None]:
PETR4['CCI'] = CCI(High, Low, Close, timeperiod=14)

CMO indicator (Chande Momentum Oscillator)

In [None]:
PETR4['CMO'] = CMO(Close, timeperiod=14)
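To make the CMO concrete, here is a minimal sketch of its textbook definition: 100 times the difference between summed gains and summed losses over the window, divided by their total. The function name `cmo_textbook` and the toy prices are assumptions for illustration. TA-Lib's `CMO` may smooth gains and losses differently, so its values need not match this version exactly.

```python
import numpy as np
import pandas as pd

def cmo_textbook(close, period=14):
    """Textbook CMO: 100 * (gains - losses) / (gains + losses) over `period` changes."""
    close = pd.Series(close, dtype=float)
    delta = close.diff()                                    # day-to-day price change
    gains = delta.clip(lower=0).rolling(period).sum()       # sum of positive changes
    losses = (-delta.clip(upper=0)).rolling(period).sum()   # sum of absolute negative changes
    return 100 * (gains - losses) / (gains + losses)

# Toy series that rises every day, so every change in the window is a gain
cmo_demo = cmo_textbook(np.arange(1, 21), period=14)
```

With strictly rising prices the oscillator sits at its upper bound of 100, and the first 14 entries are NaN because the window is not yet full.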

DX indicator (Directional Movement Index)

In [None]:
PETR4['DX'] = DX(High, Low, Close, timeperiod=14)

MACD indicator (Moving Average Convergence/Divergence)

In [None]:
macd, macdsignal, macdhist = MACD(Close, fastperiod=12, slowperiod=26, signalperiod=9)

In [None]:
PETR4['MACD'] = macd

MFI indicator (Money Flow Index)

In [None]:
PETR4['MFI'] = MFI(High, Low, Close, Volume, timeperiod=14)

ROC indicator (Rate of Change)

In [None]:
PETR4['ROC'] = ROC(Close, timeperiod=10)
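As a rough illustration of what ROC measures, the sketch below computes the percentage change between each price and the price `period` bars earlier using plain NumPy. The helper name `roc_sketch` and the sample prices are assumptions, not part of the notebook.

```python
import numpy as np

def roc_sketch(close, period=10):
    """Rate of change: ((price / price `period` bars ago) - 1) * 100."""
    close = np.asarray(close, dtype=float)
    out = np.full(close.shape, np.nan)   # the first `period` values are undefined
    out[period:] = (close[period:] / close[:-period] - 1.0) * 100.0
    return out

# Ten flat prices followed by a jump from 100 to 110
prices = [100.0] * 10 + [110.0]
roc_demo = roc_sketch(prices, period=10)
```

The last value is about 10, a 10% rise relative to the price ten bars earlier, and the leading entries are NaN for the same reason the library output starts with NaNs.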

RSI indicator (Relative Strength Index)

In [None]:
PETR4['RSI'] = RSI(Close, timeperiod=14)

ULTOSC indicator (Ultimate Oscillator)

In [None]:
PETR4['ULTOSC'] = ULTOSC(High, Low, Close, timeperiod1=7, timeperiod2=14, timeperiod3=28)

Checking a sample

In [None]:
PETR4.head()

Removing NaNs

In [None]:
PETR4 = PETR4.dropna()

In [None]:
PETR4.head()
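The NaNs come from the warm-up period of each indicator: a window of length n has no value until n observations are available. The sketch below reproduces the effect on a toy frame, using rolling means as stand-ins for the indicators (the column names `SMA3` and `SMA5` are made up for this example).

```python
import numpy as np
import pandas as pd

# Ten days of toy closing prices
demo = pd.DataFrame(
    {"Close": np.arange(1.0, 11.0)},
    index=pd.date_range("2018-01-01", periods=10, freq="D"),
)

# Rolling windows leave NaN until the window is full
demo["SMA3"] = demo["Close"].rolling(3).mean()
demo["SMA5"] = demo["Close"].rolling(5).mean()

# dropna() keeps only the rows where every column is defined
clean = demo.dropna()
```

The longest window decides how many leading rows are lost: here the 5-day window removes the first four rows.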

* Removing Close and Adj Close

In [None]:
PETR4 = PETR4.drop(['Close', 'Adj Close'],axis=1)

* Normalizing the Volume data

In [None]:
from sklearn.preprocessing import RobustScaler

In [None]:
scaler = RobustScaler()

In [None]:
PETR4['Volume'] = scaler.fit_transform(PETR4['Volume'].values.reshape(-1, 1))
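For intuition about what `RobustScaler` does with its default settings, the sketch below compares it with a manual computation: subtract the median and divide by the interquartile range. The toy volumes are invented, and the single large value is there to show that an outlier does not distort the scale of the other points.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# Toy volumes with one extreme value
volume_demo = np.array([10.0, 20.0, 30.0, 40.0, 1000.0]).reshape(-1, 1)

demo_scaler = RobustScaler()
scaled = demo_scaler.fit_transform(volume_demo)

# Manual equivalent: (x - median) / (Q3 - Q1)
median = np.median(volume_demo)
q1, q3 = np.percentile(volume_demo, [25, 75])
manual = (volume_demo - median) / (q3 - q1)

# inverse_transform maps the scaled values back to the original units
restored = demo_scaler.inverse_transform(scaled)
```

Both computations agree, and `inverse_transform` recovers the original values, which is what the later cells rely on to report errors in units of traded volume.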

Defining the X and Y variables

In [None]:
X = PETR4.drop(['Volume'],axis=1)
Y = PETR4['Volume']

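* Creating the training and test samples. The date-based split in the first cell is overwritten by the random `train_test_split` in the cells that follow, so the model is trained on the random split.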

In [None]:
X_treino = X[X.index<'2020-01-01']
X_teste = X[X.index>='2020-01-01']

Y_treino = Y[X.index<'2020-01-01']
Y_teste = Y[X.index>='2020-01-01']

In [None]:
from sklearn.model_selection import train_test_split

In [None]:
X_treino, X_teste, Y_treino, Y_teste = train_test_split(X,Y,test_size=0.25,random_state=42)
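The two splitting strategies above behave differently on time-indexed data. The sketch below contrasts them on a small made-up frame: a date cutoff keeps every test row after every training row, while a random split mixes dates between the two sets. All names ending in `_demo`, `_time`, or `_rand` are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Ten consecutive days straddling the 2020-01-01 cutoff
idx = pd.date_range("2019-12-25", periods=10, freq="D")
X_demo = pd.DataFrame({"feature": np.arange(10)}, index=idx)

# Chronological split: train strictly before the cutoff, test from the cutoff on
cutoff = "2020-01-01"
X_train_time = X_demo[X_demo.index < cutoff]
X_test_time = X_demo[X_demo.index >= cutoff]

# Random split: rows are shuffled, so test dates can fall between training dates
X_train_rand, X_test_rand = train_test_split(X_demo, test_size=0.25, random_state=42)
```

With a random split, the model may see days on both sides of a test day during training, which tends to make the test error look more optimistic than a forward-looking forecast would be.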

* Predicting volume with the Random Forest Regressor model

In [None]:
from sklearn.ensemble import RandomForestRegressor

In [None]:
rfr = RandomForestRegressor()

In [None]:
rfr.fit(X_treino,Y_treino)

In [None]:
Y_previsto = rfr.predict(X_teste)
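The fit and predict steps can be tried in isolation on synthetic data. The sketch below is not the notebook's model: the data, the `random_state` values, and the `n_estimators` setting are assumptions chosen so the run is reproducible. It also reads `feature_importances_`, which can help judge which indicators the forest actually uses.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data: the target depends only on the first of three features
rng = np.random.default_rng(0)
X_demo = rng.uniform(-1, 1, size=(200, 3))
y_demo = 2.0 * X_demo[:, 0] + rng.normal(scale=0.1, size=200)

# Fixing random_state makes repeated runs give the same forest
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_demo[:150], y_demo[:150])

pred = model.predict(X_demo[150:])        # one prediction per held-out row
importances = model.feature_importances_  # non-negative weights that sum to 1
```

Without a fixed `random_state`, as in `RandomForestRegressor()`, the reported metrics can vary slightly from run to run.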

* Undoing the normalization

In [None]:
Y_previsto = scaler.inverse_transform(Y_previsto.reshape(-1, 1))

In [None]:
Y_teste = scaler.inverse_transform(Y_teste.values.reshape(-1, 1))

* Plotting Y_previsto against Y_teste

In [None]:
plt.scatter(Y_teste,Y_previsto)
plt.xlabel('Y_teste')
plt.ylabel('Y_previsto')
plt.tight_layout()

Calculating error metrics

In [None]:
from sklearn.metrics import mean_absolute_error, mean_squared_error

In [None]:
def mean_absolute_percentage_error(y_true, y_pred): 
    y_true, y_pred = np.array(y_true), np.array(y_pred)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100
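A small worked example may help in reading the MAPE. The sketch below uses a variant named `mape_sketch` (a hypothetical name) that flattens both inputs first, an added safeguard so that a column vector and a flat array are not broadcast against each other by accident. The three sample values are invented.

```python
import numpy as np

def mape_sketch(y_true, y_pred):
    """Mean absolute percentage error, in percent. Undefined if y_true contains zeros."""
    y_true = np.asarray(y_true, dtype=float).ravel()  # flatten to 1-D
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

# Relative errors are 10%, 10% and 0%, so the mean is 20/3 percent
y_true_demo = [100.0, 200.0, 400.0]
y_pred_demo = [110.0, 180.0, 400.0]
mape_demo = mape_sketch(y_true_demo, y_pred_demo)
```

The result is about 6.67%. Because each error is divided by the true value, days with very low volume weigh heavily in this metric.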

In [None]:
MAE = mean_absolute_error(Y_teste,Y_previsto)
MAPE = mean_absolute_percentage_error(Y_teste,Y_previsto)
MSE = mean_squared_error(Y_teste,Y_previsto)
RMSE = np.sqrt(MSE)

In [None]:
print("MAE = {:0.2e}".format(MAE))
print("MAPE = {:0.2f}%".format(MAPE))
print("MSE = {:0.2e}".format(MSE))
print("RMSE = {:0.2e}".format(RMSE))

We can predict the volume with an error (MAPE) of 24.26%, slightly better than in the case without indicators.