# Time Series with Plotly

## Imports

In [32]:
from datetime import datetime

from pandas_datareader import data as web
import yfinance as yf

import plotly.express as px
from plotly import graph_objects

import pandas as pd
import numpy as np

## DataFrame with pandas-datareader

We will use the ```DataReader()``` function from the ```pandas_datareader``` package. This function is already set up to contact several data sources and request datasets from them. We will connect to Yahoo Finance and request data for stocks or indices, each of which is stored in its own dataset. On Yahoo Finance, index datasets have names that start with a caret (```^```), such as IBOVESPA (```^BVSP```), and **Brazilian** stock names end with ```.SA```, such as ```PETR4.SA```.
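As a small illustration of the naming conventions just described, the sketch below classifies a symbol by its prefix or suffix. The helper name `classify_symbol` and the `AAPL` symbol are hypothetical, chosen only for this example; they are not part of `pandas_datareader` or Yahoo Finance.

```python
def classify_symbol(symbol: str) -> str:
    """Classify a Yahoo Finance symbol using the prefix/suffix conventions above."""
    if symbol.startswith("^"):
        return "index"            # e.g. ^BVSP
    if symbol.endswith(".SA"):
        return "brazilian_stock"  # e.g. PETR4.SA
    return "other"                # anything that matches neither convention


# AAPL is included only as an example of a symbol that matches neither rule.
symbols = ["^BVSP", "PETR4.SA", "AAPL"]
kinds = {s: classify_symbol(s) for s in symbols}
print(kinds)
```

A check like this can help catch a missing ```.SA``` suffix before a request is sent.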

Parameters:

- **name** - a string with the name of the dataset (an asset or an index in our case)
- **data_source** - the name of the data source
- **start** - the start date of the request
- **end** - the end date of the request (if omitted, the data runs up to the most recent date)
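To make the **start** and **end** semantics concrete without a network call, here is a minimal sketch that applies the same idea to a local, date-indexed DataFrame. The helper `slice_period` is hypothetical, and the inclusive end boundary is a property of pandas `.loc` slicing; a remote data source may treat the end date differently.

```python
import pandas as pd


def slice_period(df, start, end=None):
    """Return rows from start to end (inclusive); end=None means 'up to the last row'."""
    return df.loc[start:end]


# Five business days, using the IBOVESPA closing values shown in this notebook.
idx = pd.date_range("2016-01-04", periods=5, freq="B")
prices = pd.DataFrame(
    {"Close": [42141.0, 42419.0, 41773.0, 40695.0, 40612.0]}, index=idx
)

open_ended = slice_period(prices, "2016-01-06")            # no end date given
bounded = slice_period(prices, "2016-01-05", "2016-01-07")  # both limits given
print(open_ended)
print(bounded)
```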

#### Extracting IBOVESPA data

In [39]:
%%time
ativos = '^BVSP'
dt_ini, dt_fim = '2016-01-01', datetime.now()

# pandas-datareader has an incompatibility, so use yfinance directly:
df_ibovespa = yf.download(ativos, start=dt_ini, end=dt_fim)
df_ibovespa.head()

[*********************100%%**********************]  1 of 1 completed

CPU times: total: 15.6 ms
Wall time: 657 ms





| Date | Open | High | Low | Close | Adj Close | Volume |
|---|---|---|---|---|---|---|
| 2016-01-04 | 43349.0 | 43349.0 | 42125.0 | 42141.0 | 42141.0 | 2976300 |
| 2016-01-05 | 42139.0 | 42534.0 | 42137.0 | 42419.0 | 42419.0 | 2557200 |
| 2016-01-06 | 42410.0 | 42410.0 | 41590.0 | 41773.0 | 41773.0 | 3935900 |
| 2016-01-07 | 41772.0 | 41772.0 | 40695.0 | 40695.0 | 40695.0 | 4032300 |
| 2016-01-08 | 40695.0 | 41218.0 | 40463.0 | 40612.0 | 40612.0 | 3221600 |


#### Extracting data for multiple assets

We can also use the ```get_data_yahoo``` function, which does the same thing without requiring us to specify the source.

If we do not give an end date, the function returns data up to the most recent date.

With Yahoo Finance, we can pass a list of names in the *name* parameter. Let's try it with some of the main Brazilian stocks. Here is the result:

In [44]:
%%time
ativos = ['^BVSP', 'ITUB3.SA', 'PETR4.SA', 'ABEV3.SA', 'VALE3.SA']
dt_ini = '2016-01-01'

# Caution: assigning to `yf` rebinds the name used for the yfinance module above.
yf = web.get_data_yahoo(ativos, start=dt_ini)
yf.head()

[*********************100%%**********************]  5 of 5 completed

CPU times: total: 125 ms
Wall time: 713 ms





| Date | (Adj Close, ABEV3.SA) | (Adj Close, ITUB3.SA) | (Adj Close, PETR4.SA) | (Adj Close, VALE3.SA) | (Adj Close, ^BVSP) | (Close, ABEV3.SA) | (Close, ITUB3.SA) | (Close, PETR4.SA) | (Close, VALE3.SA) | (Close, ^BVSP) | ... | (Open, ABEV3.SA) | (Open, ITUB3.SA) | (Open, PETR4.SA) | (Open, VALE3.SA) | (Open, ^BVSP) | (Volume, ABEV3.SA) | (Volume, ITUB3.SA) | (Volume, PETR4.SA) | (Volume, VALE3.SA) | (Volume, ^BVSP) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2016-01-04 | 12.837883 | 9.017251 | 2.50427 | 7.919021 | 42141.0 | 17.209999 | 14.0 | 6.87 | 12.69 | 42141.0 | ... | 17.73 | 14.769696 | 6.57 | 12.49 | 43349.0 | 13206900 | 278520 | 45962100 | 4587900 | 2976300.0 |
| 2016-01-05 | 13.039292 | 9.208522 | 2.435011 | 7.812937 | 42419.0 | 17.48 | 14.296969 | 6.68 | 12.52 | 42419.0 | ... | 17.25 | 14.212121 | 6.92 | 12.67 | 42139.0 | 10774200 | 99825 | 29446700 | 2693500 | 2557200.0 |
| 2016-01-06 | 12.912477 | 9.153873 | 2.332945 | 7.238822 | 41773.0 | 17.309999 | 14.212121 | 6.4 | 11.6 | 41773.0 | ... | 17.360001 | 14.236363 | 6.53 | 12.08 | 42410.0 | 7739100 | 181995 | 67507200 | 6758900 | 3935900.0 |
| 2016-01-07 | 12.569341 | 9.044572 | 2.281912 | 6.808238 | 40695.0 | 16.85 | 14.042424 | 6.26 | 10.91 | 40695.0 | ... | 17.17 | 14.036363 | 6.19 | 11.26 | 41772.0 | 15316400 | 221925 | 57387900 | 6450400 | 4032300.0 |
| 2016-01-08 | 12.73345 | 9.185097 | 2.285557 | 6.577343 | 40612.0 | 17.07 | 14.260606 | 6.27 | 10.54 | 40612.0 | ... | 16.93 | 14.175757 | 6.38 | 11.07 | 40695.0 | 10684000 | 122100 | 52100300 | 4429400 | 3221600.0 |


In [45]:
yf.shape

(2025, 30)

* The columns are tuples:

In [46]:
yf.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 2025 entries, 2016-01-04 to 2024-02-21
Data columns (total 30 columns):
 #   Column                 Non-Null Count  Dtype  
---  ------                 --------------  -----  
 0   (Adj Close, ABEV3.SA)  2025 non-null   float64
 1   (Adj Close, ITUB3.SA)  2025 non-null   float64
 2   (Adj Close, PETR4.SA)  2025 non-null   float64
 3   (Adj Close, VALE3.SA)  2025 non-null   float64
 4   (Adj Close, ^BVSP)     2017 non-null   float64
 5   (Close, ABEV3.SA)      2025 non-null   float64
 6   (Close, ITUB3.SA)      2025 non-null   float64
 7   (Close, PETR4.SA)      2025 non-null   float64
 8   (Close, VALE3.SA)      2025 non-null   float64
 9   (Close, ^BVSP)         2017 non-null   float64
 10  (High, ABEV3.SA)       2025 non-null   float64
 11  (High, ITUB3.SA)       2025 non-null   float64
 12  (High, PETR4.SA)       2025 non-null   float64
 13  (High, VALE3.SA)       2025 non-null   float64
 14  (High, ^BVSP)          2017 non-null  
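Because each column label is a `(Price, Ticker)` tuple, selection works a little differently from a flat DataFrame. The sketch below rebuilds a small DataFrame with the same two column levels and shows three common ways to select from it. The name `toy` is hypothetical; the values are the first three rows of the PETR4.SA and VALE3.SA `Close` and `Volume` columns shown earlier.

```python
import numpy as np
import pandas as pd

# Two-level column index with the same level names as the downloaded data.
dates = pd.date_range("2016-01-04", periods=3, freq="B")
columns = pd.MultiIndex.from_product(
    [["Close", "Volume"], ["PETR4.SA", "VALE3.SA"]],
    names=["Price", "Ticker"],
)
data = np.array([
    [6.87, 12.69, 45962100, 4587900],
    [6.68, 12.52, 29446700, 2693500],
    [6.40, 11.60, 67507200, 6758900],
])
toy = pd.DataFrame(data, index=dates, columns=columns)

# One column, addressed by its full tuple.
petr_close = toy[("Close", "PETR4.SA")]

# All tickers for one price field: selecting the outer level drops that level.
closes = toy["Close"]

# All price fields for one ticker: cross-section on the inner level.
petr_all = toy.xs("PETR4.SA", axis=1, level="Ticker")

print(closes)
print(petr_all)
```

Selecting the outer level first (for example ```toy["Close"]```) gives one column per ticker, which is usually a convenient shape for plotting one line per asset.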

In [47]:
yf.isna().sum()

Price      Ticker  
Adj Close  ABEV3.SA    0
           ITUB3.SA    0
           PETR4.SA    0
           VALE3.SA    0
           ^BVSP       8
Close      ABEV3.SA    0
           ITUB3.SA    0
           PETR4.SA    0
           VALE3.SA    0
           ^BVSP       8
High       ABEV3.SA    0
           ITUB3.SA    0
           PETR4.SA    0
           VALE3.SA    0
           ^BVSP       8
Low        ABEV3.SA    0
           ITUB3.SA    0
           PETR4.SA    0
           VALE3.SA    0
           ^BVSP       8
Open       ABEV3.SA    0
           ITUB3.SA    0
           PETR4.SA    0
           VALE3.SA    0
           ^BVSP       8
Volume     ABEV3.SA    0
           ITUB3.SA    0
           PETR4.SA    0
           VALE3.SA    0
           ^BVSP       8
dtype: int64
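The counts above show missing values only in the ```^BVSP``` columns. Two common ways to deal with such gaps are to drop the affected rows or to carry the last known value forward; which one is appropriate depends on the analysis. The sketch below shows both on a small hypothetical DataFrame (`toy`), where the position of the `NaN` is invented purely for illustration.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2016-01-04", periods=4, freq="B")
toy = pd.DataFrame(
    {
        "PETR4.SA": [6.87, 6.68, 6.40, 6.26],
        # The NaN below is invented for illustration; it is not taken from the real data.
        "^BVSP": [42141.0, np.nan, 41773.0, 40695.0],
    },
    index=dates,
)

missing_per_column = toy.isna().sum()  # same check as in the cell above
dropped = toy.dropna()                 # option 1: remove rows with any missing value
filled = toy.ffill()                   # option 2: repeat the last known value

print(missing_per_column)
print(dropped)
print(filled)
```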