# ANOVA test for the financial market

The goal of this task is to run a test of means on the stocks in the portfolio built in the financial market analysis exercise. The details of that analysis are available here: https://github.com/demetriusjube/ppca-aedi/tree/master/mercado-financeiro

As defined there, we built a portfolio with the FAANG companies (Facebook, Amazon, Apple, Netflix and Google) plus Tesla, and we use the NASDAQ index as the benchmark.

The tickers of the companies and of the index are:
* Meta (formerly Facebook) - META
* Amazon - AMZN
* Apple - AAPL
* Netflix - NFLX
* Google - GOOGL
* Tesla - TSLA
* NASDAQ (NASDAQ-100 index) - ^NDX

## Preparing the data for the test

The first step is to build the portfolio data. The details of this process are described in the previous assignment and are not repeated here. The final result we want is the average rate of return of the stocks alongside the return of the NASDAQ index, since that comparison is what the ANOVA will test.

Importing the libraries that will be used:

In [1]:
import pandas as pd
from pandas_datareader import data
import numpy as np
import math
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.io as pio
import seaborn as sns
from scipy import stats
from scipy import optimize

pio.renderers.default = 'notebook_connected'

# Tickers of the portfolio stocks, plus the NASDAQ index used as benchmark
acoes = ['META', 'AMZN', 'AAPL', 'NFLX', 'GOOGL', 'TSLA', '^NDX']



Retrieving the data for the portfolio stocks:

In [2]:
# Retrieve the closing prices of each ticker from 2015 until now
acoes_df = pd.DataFrame()
for acao in acoes:
    acoes_df[acao] = data.DataReader(acao,
                                     data_source='yahoo', start='2015-01-01')['Close']

# Move the dates out of the index into a 'Date' column, leaving a sequential integer index
acoes_df.reset_index(inplace=True)

acoes_df

Unnamed: 0,Date,META,AMZN,AAPL,NFLX,GOOGL,TSLA,^NDX
0,2015-01-02,78.449997,15.426000,27.332500,49.848572,26.477501,43.862000,4230.240234
1,2015-01-05,77.190002,15.109500,26.562500,47.311428,25.973000,42.018002,4160.959961
2,2015-01-06,76.150002,14.764500,26.565001,46.501431,25.332001,42.256001,4110.830078
3,2015-01-07,76.150002,14.921000,26.937500,46.742859,25.257500,42.189999,4160.000000
4,2015-01-08,78.180000,15.023000,27.972500,47.779999,25.345501,42.124001,4240.549805
...,...,...,...,...,...,...,...,...
1907,2022-08-01,159.929993,135.389999,161.509995,226.210007,114.860001,891.830017,12940.780273
1908,2022-08-02,160.190002,134.160004,160.009995,221.419998,115.129997,901.760010,12901.599609
1909,2022-08-03,168.800003,139.520004,166.130005,226.729996,118.080002,922.190002,13253.259766
1910,2022-08-04,170.570007,142.570007,165.809998,229.910004,118.190002,925.900024,13311.040039


Computing the rates of return of the stocks:

In [3]:
dataset = acoes_df.copy()
# Drop the Date column, which is not used in the calculation
dataset.drop(labels = ['Date'], axis=1, inplace=True)
# Log return: divide each day's price by the previous day's price (shift(1)) and take
# the natural log; the first row has no previous day, so it is NaN
taxas_retorno = np.log(dataset / dataset.shift(1))
taxas_retorno

Unnamed: 0,META,AMZN,AAPL,NFLX,GOOGL,TSLA,^NDX
0,,,,,,,
1,-0.016191,-0.020731,-0.028576,-0.052238,-0.019238,-0.042950,-0.016513
2,-0.013565,-0.023098,0.000094,-0.017269,-0.024989,0.005648,-0.012121
3,0.000000,0.010544,0.013925,0.005178,-0.002945,-0.001563,0.011890
4,0.026309,0.006813,0.037703,0.021946,0.003478,-0.001566,0.019178
...,...,...,...,...,...,...,...
1907,0.005203,0.003255,-0.006172,0.005808,-0.012631,0.000426,-0.000555
1908,0.001624,-0.009126,-0.009331,-0.021402,0.002348,0.011073,-0.003032
1909,0.052354,0.039175,0.037534,0.023699,0.025300,0.022403,0.026892
1910,0.010431,0.021625,-0.001928,0.013928,0.000931,0.004015,0.004350
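
The log-return calculation used in the cell above can be illustrated on a tiny series. This is a minimal sketch with made-up prices; the names `prices`, `log_returns` and `total_log_return` are hypothetical and do not appear in the original notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices for a single asset (illustrative values only)
prices = pd.Series([100.0, 110.0, 99.0], name="PRICE")

# Log return: ln(P_t / P_{t-1}). shift(1) moves each price down one row,
# so the first row has no previous price and the result there is NaN.
log_returns = np.log(prices / prices.shift(1))

# Log returns are additive over time: their sum (NaN is skipped by default)
# equals ln(P_last / P_first).
total_log_return = log_returns.sum()

print(log_returns)
print(total_log_return, np.log(prices.iloc[-1] / prices.iloc[0]))
```

The additivity shown in the last step is one common reason for preferring log returns over simple returns when working with daily series.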


Combining the rates of return with their respective dates:

In [4]:
dataset_date = acoes_df.copy()
date = dataset_date.filter(["Date"]) 
taxas_retorno_date = pd.concat([date, taxas_retorno], axis=1)
taxas_retorno_date

Unnamed: 0,Date,META,AMZN,AAPL,NFLX,GOOGL,TSLA,^NDX
0,2015-01-02,,,,,,,
1,2015-01-05,-0.016191,-0.020731,-0.028576,-0.052238,-0.019238,-0.042950,-0.016513
2,2015-01-06,-0.013565,-0.023098,0.000094,-0.017269,-0.024989,0.005648,-0.012121
3,2015-01-07,0.000000,0.010544,0.013925,0.005178,-0.002945,-0.001563,0.011890
4,2015-01-08,0.026309,0.006813,0.037703,0.021946,0.003478,-0.001566,0.019178
...,...,...,...,...,...,...,...,...
1907,2022-08-01,0.005203,0.003255,-0.006172,0.005808,-0.012631,0.000426,-0.000555
1908,2022-08-02,0.001624,-0.009126,-0.009331,-0.021402,0.002348,0.011073,-0.003032
1909,2022-08-03,0.052354,0.039175,0.037534,0.023699,0.025300,0.022403,0.026892
1910,2022-08-04,0.010431,0.021625,-0.001928,0.013928,0.000931,0.004015,0.004350


Computing the average rate of return of the portfolio for comparison with the NASDAQ index:

In [5]:
# Equal-weighted average of the six stocks' log returns (the index itself is excluded)
taxas_retorno_date["CARTEIRA"] = (taxas_retorno_date['META'] + taxas_retorno_date['AMZN'] + 
                                   taxas_retorno_date['AAPL'] + taxas_retorno_date['NFLX'] + 
                                   taxas_retorno_date['GOOGL'] + taxas_retorno_date['TSLA']) / 6
# Keep only the date, the portfolio and the index columns for the comparison
taxas_retorno_port = taxas_retorno_date.filter(["Date", "CARTEIRA", '^NDX'])
taxas_retorno_port

Unnamed: 0,Date,CARTEIRA,^NDX
0,2015-01-02,,
1,2015-01-05,-0.029987,-0.016513
2,2015-01-06,-0.012196,-0.012121
3,2015-01-07,0.004190,0.011890
4,2015-01-08,0.015780,0.019178
...,...,...,...
1907,2022-08-01,-0.000685,-0.000555
1908,2022-08-02,-0.004136,-0.003032
1909,2022-08-03,0.033411,0.026892
1910,2022-08-04,0.008167,0.004350
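
The `CARTEIRA` column is an equal-weighted average: the six return columns are summed and divided by six. As a hedged illustration, the sketch below reproduces the same idea on a toy DataFrame and compares it with pandas' row-wise mean. The column names `A`, `B`, `C` and all values are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical daily log returns for three assets (illustrative values only);
# the first row is NaN, as it is for the real return table
returns = pd.DataFrame({
    "A": [np.nan, 0.010, -0.020],
    "B": [np.nan, 0.030, 0.000],
    "C": [np.nan, -0.010, 0.010],
})

# Explicit form: sum the columns and divide by how many there are
explicit = (returns["A"] + returns["B"] + returns["C"]) / 3

# Equivalent row-wise mean. skipna=False keeps NaN whenever any asset is
# missing, which matches the behaviour of the explicit sum.
row_mean = returns.mean(axis=1, skipna=False)

print(pd.DataFrame({"explicit": explicit, "row_mean": row_mean}))
```

With the default `skipna=True`, the row mean would instead average only the assets that have a value on a given day. That would differ from the explicit sum if the tickers had gaps on different dates.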


With this data, we can check whether the conditions required for the ANOVA test hold.
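
As a rough sketch of what such a check might look like, the block below applies `scipy.stats` to two synthetic samples standing in for the `CARTEIRA` and `^NDX` columns. The choice of Shapiro-Wilk for normality and Levene for equality of variances is an assumption, not necessarily the procedure the author follows, and the synthetic data and names (`portfolio`, `index`, `results`) are hypothetical. With the real table, the first row (NaN) would need to be dropped before testing.

```python
import numpy as np
from scipy import stats

# Synthetic stand-ins for the two return columns (assumption: normal draws,
# used only so the sketch is self-contained and reproducible)
rng = np.random.default_rng(42)
portfolio = rng.normal(loc=0.001, scale=0.02, size=500)
index = rng.normal(loc=0.0005, scale=0.015, size=500)

results = {
    # Normality within each group; H0: the sample comes from a normal distribution
    "Shapiro-Wilk (portfolio)": stats.shapiro(portfolio),
    "Shapiro-Wilk (index)": stats.shapiro(index),
    # Homogeneity of variances; H0: both groups have equal variance
    "Levene": stats.levene(portfolio, index),
    # One-way ANOVA; H0: both groups have the same mean
    "ANOVA": stats.f_oneway(portfolio, index),
}

for name, result in results.items():
    print(f"{name}: statistic={result.statistic:.4f}, p-value={result.pvalue:.4f}")
```

A small p-value in the Shapiro-Wilk or Levene test would suggest that the corresponding assumption does not hold, in which case the ANOVA result should be read with caution.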