# Chapter 5 – Web crawler development (hands-on)

## Lesson 5.1. Market data collection

### Example 1 - Collecting data from the Yahoo! Finance API

In the following example we request the price and volume series (OHLCV) for the stock MBLY3.

Install the yfinance library

In [None]:
!pip install yfinance

Import the yfinance library

In [None]:
import yfinance as yf

Define the instrument to request

To do so, first check the instrument's ticker at https://finance.yahoo.com/

In [None]:
MBLY3 = yf.Ticker("MBLY3.SA")

In [None]:
df_hist = MBLY3.history(period="5y", interval="1d")

In [None]:
df_hist.head()

In [None]:
df_hist.tail()

### Example 2 - Collecting market data from other providers via API

#### Example 2.1 - Marketstack (https://marketstack.com/)

In [None]:
key = ""

In [None]:
import requests

url = f"http://api.marketstack.com/v1/eod?access_key={key}&symbols=MBLY3.BVMF"

r = requests.get(url)
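
The request URL above is assembled with an f-string. As a minimal sketch of an alternative, the query string can be built from a dictionary so that each parameter is percent-encoded. The helper name `build_url` and the placeholder key `YOUR_KEY` are illustrative assumptions, not part of the Marketstack API.

```python
from urllib.parse import urlencode


def build_url(base_url, params):
    """Append percent-encoded query parameters to base_url."""
    return f"{base_url}?{urlencode(params)}"


# "YOUR_KEY" is a placeholder; the real key comes from the provider account
url = build_url(
    "http://api.marketstack.com/v1/eod",
    {"access_key": "YOUR_KEY", "symbols": "MBLY3.BVMF"},
)
print(url)
```

Passing the same dictionary as `requests.get(base_url, params=...)` lets `requests` do this encoding itself.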

In [None]:
r.json()

In [None]:
import pandas as pd

In [None]:
df = pd.DataFrame(r.json()["data"])
df

#### Example 2.2 - Alphavantage (https://www.alphavantage.co/)

In [None]:
key = ""

In [None]:
url = f"https://www.alphavantage.co/query?function=SYMBOL_SEARCH&keywords=mbly&apikey={key}"

In [None]:
r = requests.get(url)

In [None]:
r.json()

In [None]:
url = f"https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol=MBLY3.SAO&apikey={key}"

In [None]:
r = requests.get(url)

In [None]:
r.json()

In [None]:
pd.DataFrame(r.json()['Time Series (Daily)'])

In [None]:
pd.DataFrame(r.json()['Time Series (Daily)']).T
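
The `'Time Series (Daily)'` payload is a dictionary keyed by date, so `pd.DataFrame` puts the dates in the columns and `.T` flips them into rows. The same reshaping can be sketched in plain Python. The sample below is made up, and the field names of the form `"1. open"` are an assumption about the payload; the function name `flatten_daily_series` is hypothetical.

```python
def flatten_daily_series(series):
    """Turn {date: {"1. open": "..."}} into a list of row dicts sorted by date."""
    rows = []
    for date in sorted(series):
        row = {"date": date}
        for field, value in series[date].items():
            # "1. open" -> "open"; values arrive as strings, so cast to float
            name = field.split(". ", 1)[-1]
            row[name] = float(value)
        rows.append(row)
    return rows


# Made-up sample in the assumed shape of the 'Time Series (Daily)' payload
sample = {
    "2023-03-10": {"1. open": "2.50", "4. close": "2.45", "6. volume": "1000"},
    "2023-03-09": {"1. open": "2.60", "4. close": "2.50", "6. volume": "1500"},
}
rows = flatten_daily_series(sample)
print(rows[0])
```

Each row dict can then be passed to `pd.DataFrame(rows)`, which yields one row per date with numeric columns.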

#### Example 2.3 - EOD (https://eodhistoricaldata.com/)

In [None]:
key = ""

In [None]:
url = f"https://eodhistoricaldata.com/api/exchanges-list/?api_token={key}&fmt=json"

In [None]:
r = requests.get(url)

In [None]:
r.json()

In [None]:
EXCHANGE_CODE = "SA"

In [None]:
url = f"https://eodhistoricaldata.com/api/exchange-symbol-list/{EXCHANGE_CODE}?api_token={key}&fmt=json"

In [None]:
r = requests.get(url)

In [None]:
r.json()

In [None]:
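# Look up the exchange code of the company whose name contains "mobly".
# Starting from None makes a failed lookup visible instead of raising NameError later.
code = None
for dict_company in r.json():
    if "mobly" in dict_company["Name"].lower():
        code = dict_company["Code"]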
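
The lookup above can be wrapped in a small function that fails loudly when nothing matches. This is only a sketch: `find_code` is a hypothetical helper, the sample entries are made up, and unlike the loop above it returns the first match rather than the last.

```python
def find_code(symbols, name_fragment):
    """Return the Code of the first entry whose Name contains name_fragment."""
    fragment = name_fragment.lower()
    for entry in symbols:
        # Some entries may lack a name, so fall back to an empty string
        if fragment in (entry.get("Name") or "").lower():
            return entry["Code"]
    raise LookupError(f"no symbol name contains {name_fragment!r}")


# Made-up entries using the "Code" and "Name" keys read in the loop above
sample = [
    {"Code": "AAAA3", "Name": "Example Company SA"},
    {"Code": "MBLY3", "Name": "Mobly SA"},
]
print(find_code(sample, "mobly"))
```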

In [None]:
url = f"https://eodhistoricaldata.com/api/eod/{code}.{EXCHANGE_CODE}?api_token={key}&fmt=json"

In [None]:
r = requests.get(url)

In [None]:
r.json()

In [None]:
pd.DataFrame(r.json())
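
Each provider in this section identifies the same stock with a different exchange suffix. A minimal sketch that collects the suffixes used in the requests above in one place; the provider keys and the helper `provider_symbol` are illustrative names.

```python
# Exchange suffixes exactly as they appear in this section's requests
SUFFIX_BY_PROVIDER = {
    "yahoo": "SA",
    "marketstack": "BVMF",
    "alphavantage": "SAO",
    "eod": "SA",
}


def provider_symbol(ticker, provider):
    """Build the provider-specific symbol, e.g. MBLY3 -> MBLY3.SA."""
    return f"{ticker}.{SUFFIX_BY_PROVIDER[provider]}"


for provider in SUFFIX_BY_PROVIDER:
    print(provider, provider_symbol("MBLY3", provider))
```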

### Example 3 - Collecting derivatives data from B3

First, let's inspect the site https://www.b3.com.br/pt_br/market-data-e-indices/servicos-de-dados/market-data/historico/derivativos/ajustes-do-pregao/

In [None]:
url = "https://www2.bmf.com.br/pages/portal/bmfbovespa/lumis/lum-ajustes-do-pregao-ptBR.asp"

In [None]:
query = {
    "dData1": "10/03/2023"
}
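
The `dData1` value is a date written as text. As a sketch, it can be generated from a `datetime.date`, assuming the form expects day/month/year order as in the value above; `build_query` is a hypothetical helper name.

```python
from datetime import date


def build_query(trade_date):
    """Format the trading date as dd/mm/yyyy (assumed order) for the dData1 field."""
    return {"dData1": trade_date.strftime("%d/%m/%Y")}


query = build_query(date(2023, 3, 10))
print(query)
```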

In [None]:
import requests

In [None]:
r = requests.post(
    url, 
    params=query
)

In [None]:
r.content

In [None]:
!pip install requests-html

In [None]:
import requests_html

In [None]:
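# A plain requests.Response has no `.html` attribute, so
# r.html.xpath(...) raises AttributeError at this point.
# The request is repeated through an HTMLSession in the next cell.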

In [None]:
r = requests_html.HTMLSession().post(
    url, 
    params=query
)

In [None]:
table_html = r.html.xpath("//table[contains(@id,'tblDadosAjustes')]")

In [None]:
len(table_html)

In [None]:
table_html[0].full_text

In [None]:
for element_ in table_html[0].xpath("//tr"):
    print(element_.text.split("\n"))

In [None]:
list_linhas = []
instrumento = None

for element_ in table_html[0].xpath("//tr")[1:]:
    cells = element_.text.split("\n")
    if len(cells) == 6:
        # Full row: the first cell names the instrument
        instrumento = cells[0]
        linha = cells
    elif len(cells) == 5:
        # Continuation row: reuse the last instrument seen
        linha = [instrumento] + cells
    else:
        # Unexpected layout: skip instead of appending a stale row
        continue
    list_linhas.append(linha)

In [None]:
list_linhas

In [None]:
table_html[0].xpath("//tr")[0].text.split("\n")

In [None]:
import pandas as pd

In [None]:
df_futuros = pd.DataFrame(list_linhas, columns=table_html[0].xpath("//tr")[0].text.split("\n"))

In [None]:
df_futuros.head()

In [None]:
df_futuros.tail()
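
Every cell in `df_futuros` is text, because the rows were built by splitting strings. A minimal sketch of converting such values to floats, assuming the numeric columns use Brazilian formatting (`.` as thousands separator, `,` as decimal separator); this assumption has not been checked against the page, and `parse_br_number` is a hypothetical helper.

```python
def parse_br_number(text):
    """Convert text such as '1.234,56' to a float.

    Assumes '.' separates thousands and ',' marks the decimals.
    """
    cleaned = text.strip().replace(".", "").replace(",", ".")
    return float(cleaned)


print(parse_br_number("1.234,56"))
```

The function could be applied to a column with `df_futuros[column].map(parse_br_number)` once the numeric columns have been identified.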