![thiago-2.png](attachment:thiago-2.png)
# <font color='green'>Thiago Ribeiro</font>
### <font color='green'>Project - Business Analytics</font>
### <font color='blue'>AI Bot Trader - Investor Robot</font>

**Contact** - [LinkedIn](https://www.linkedin.com/in/thiagoribeirorj/) **and**
[WhatsApp](https://api.whatsapp.com/send?phone=5522998834213&text=Ol%C3%A1%2C%20tudo%20bem%3F)


_**Packages used in this project**_

In [1]:
# ccxt: exchange client used to import the data
#!pip install -q ccxt

In [2]:
# For Bayesian optimization
#! pip install -q bayesian-optimization==1.2

In [3]:
# Package versions used in this project
#!pip install pandas==1.2.1
#!pip install numpy==1.20.0
#!pip install seaborn==0.11.1
#!pip install ccxt==1.42.1
#!pip install matplotlib==3.3.4
# csv ships with the Python standard library, so it needs no pip install

In [4]:
# General packages
import csv
import ccxt
import time
import random
import types
import pkg_resources
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from bayes_opt import BayesianOptimization
from pprint import pprint
from datetime import datetime
sns.set()

In [5]:
# Versions used in this project
%reload_ext watermark
%watermark -a "Thiago Ribeiro" --iversions 

Author: Thiago Ribeiro

numpy     : 1.20.0
pandas    : 1.2.1
ccxt      : 1.42.1
seaborn   : 0.11.1
csv       : 1.0
matplotlib: 3.3.4



_**Real-time data extraction**_

_Function to write the data_

In [6]:
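def grava_csv(arquivo, dados):
    # newline='' lets the csv module control line endings (avoids blank rows on Windows)
    with open(arquivo, mode='w', newline='') as arquivo_saida:
        csv_writer = csv.writer(arquivo_saida, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
        # Header row, written without spaces after the commas so column names stay clean
        csv_writer.writerow(["Date", "Open", "Hight", "Low", "Close", "Adj Close", "Volume"])
        csv_writer.writerows(dados)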

_Function to connect to the exchange_

In [7]:
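def conecta_exchange(exchange, max_retries, symbol, timeframe, since, limit):
    num_retries = 0

    # Keep trying until the request succeeds or the retry budget is spent
    while True:
        try:
            num_retries += 1
            # Pass limit so each request returns at most `limit` candles
            return exchange.fetch_ohlcv(symbol, timeframe, since, limit)
        except Exception:
            # Re-raise the last error once max_retries has been exceeded
            if num_retries > max_retries:
                raise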
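
A retry helper only retries when the call sits inside a loop. The sketch below illustrates that pattern in isolation, with a short exponential backoff between attempts. The names `retry_call` and `flaky_fetch` are hypothetical and are not part of ccxt; the backoff is an assumption added for illustration.

```python
import time

def retry_call(func, max_retries, base_delay=0.0):
    """Call func() until it succeeds or max_retries retries are used up."""
    attempt = 0
    while True:
        try:
            return func()
        except Exception:
            if attempt >= max_retries:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            time.sleep(base_delay * (2 ** attempt))
            attempt += 1

# Hypothetical stand-in for a network call: fails twice, then succeeds
calls = {"n": 0}

def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return [[1674345600000, 1.0, 2.0, 0.5, 1.5, 10.0]]

candles = retry_call(flaky_fetch, max_retries=3)
print(calls["n"], "attempts,", len(candles), "candle returned")
```

With `max_retries=3` the helper allows one initial attempt plus three retries, so a call that fails twice still returns its data.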
        

_Function to extract the data_

In [8]:
# Function
def extrai_dados(exchange, max_retries, symbol, timeframe, since, limit):

    # Start from the current time and walk backwards
    earliest_timestamp = exchange.milliseconds()

    # Duration of one candle, in seconds and then in milliseconds
    timeframe_duration_in_seconds = exchange.parse_timeframe(timeframe)
    timeframe_duration_in_ms = timeframe_duration_in_seconds * 1000

    # Time span covered by one request of `limit` candles
    timedelta = limit * timeframe_duration_in_ms

    # List that accumulates the data
    all_ohlcv = []

    # Loop
    while True:

        # Start date for this request
        fetch_since = earliest_timestamp - timedelta

        # Connect to the exchange
        ohlcv = conecta_exchange(exchange, max_retries, symbol, timeframe, fetch_since, limit)

        # Stop when the exchange returns nothing older than what we already have
        if not ohlcv or ohlcv[0][0] >= earliest_timestamp:
            break

        # Update the earliest timestamp seen so far
        earliest_timestamp = ohlcv[0][0]

        # Prepend the new batch to the accumulated data
        all_ohlcv = ohlcv + all_ohlcv

        # Progress report
        print(len(all_ohlcv), 'registros extraídos de', exchange.iso8601(all_ohlcv[0][0]), 'a', exchange.iso8601(all_ohlcv[-1][0]))

        # Stop once the requested start date has been covered
        if fetch_since < since:
            break

    return all_ohlcv
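
The backward pagination can be tried offline. The sketch below repeats the same loop against a hypothetical `FakeExchange` class that serves synthetic daily candles. The class and the `fetch_backwards` helper are assumptions made for illustration; only the method names `milliseconds`, `parse_timeframe` and `fetch_ohlcv` mirror the calls used in this notebook.

```python
DAY_MS = 24 * 60 * 60 * 1000

class FakeExchange:
    """Hypothetical stand-in for an exchange client, serving daily candles."""

    def __init__(self, now_ms):
        self.now_ms = now_ms

    def milliseconds(self):
        return self.now_ms

    def parse_timeframe(self, timeframe):
        return 86400  # this sketch only models the '1d' timeframe

    def fetch_ohlcv(self, symbol, timeframe, since, limit):
        # Return up to `limit` daily candles starting at `since`, never past "now"
        rows = []
        t = since
        while len(rows) < limit and t < self.now_ms:
            rows.append([t, 1.0, 1.0, 1.0, 1.0, 0.0])
            t += DAY_MS
        return rows

def fetch_backwards(exchange, symbol, timeframe, since, limit):
    earliest = exchange.milliseconds()
    step = limit * exchange.parse_timeframe(timeframe) * 1000
    all_rows = []
    while True:
        fetch_since = earliest - step
        rows = exchange.fetch_ohlcv(symbol, timeframe, fetch_since, limit)
        # Nothing older came back: stop
        if not rows or rows[0][0] >= earliest:
            break
        earliest = rows[0][0]
        all_rows = rows + all_rows  # older batch goes in front
        if fetch_since < since:
            break
    return all_rows

fake = FakeExchange(now_ms=1000 * DAY_MS)
rows = fetch_backwards(fake, "BTC/USD", "1d", since=700 * DAY_MS, limit=100)
print(len(rows), "candles, first on day", rows[0][0] // DAY_MS)
```

In this run the loop collects 400 candles starting on day 600, although `since` is day 700: the stop condition is checked after a page is appended, so the result can reach back up to one page before the requested start date.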

_Function to extract the data and save it to CSV_

In [9]:
def extrai_dados_para_csv(filename, exchange_id, max_retries, symbol, timeframe, since, limit):

    # Instantiate the exchange with ccxt's built-in rate limiter enabled
    exchange = getattr(ccxt, exchange_id)({'enableRateLimit': True})

    # Convert an ISO 8601 start date into a millisecond timestamp
    if isinstance(since, str):
        since = exchange.parse8601(since)

    # Load the markets traded on the exchange
    exchange.load_markets()

    # Extract the data
    ohlcv = extrai_dados(exchange, max_retries, symbol, timeframe, since, limit)

    # Reshape each row to match the CSV header: date, prices, Adj Close, volume
    for item in ohlcv:
        epoch = int(item[0]) / 1000
        item[0] = datetime.utcfromtimestamp(epoch).strftime('%Y-%m-%d')
        volume = int(item[5])
        item[5] = item[4]  # Adj Close repeats Close
        item.append(volume)

    # Check the number of records
    ohlen = len(ohlcv)
    pprint("Números de Registros:" + str(ohlen))

    # Keep only the 399 most recent records
    if ohlen > 399:
        ohrem = ohlen - 399
        pprint("Removendo: " + str(ohrem))
        ohlcv = ohlcv[ohrem:]

    # Write the data to CSV
    grava_csv(filename, ohlcv)

    # Report what was saved
    print('Salvos', len(ohlcv), 'registros no arquivo', filename)
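
The row reshaping is easier to follow on a single candle. The sketch below converts one `[timestamp_ms, open, high, low, close, volume]` list into the seven-column layout written to the CSV. The helper name `to_csv_row` is hypothetical, the sample values are chosen for illustration, and a timezone-aware `datetime.fromtimestamp` call is used here in place of `utcfromtimestamp`.

```python
from datetime import datetime, timezone

def to_csv_row(candle):
    """Turn [timestamp_ms, open, high, low, close, volume] into
    [date, open, high, low, close, adj_close, volume]."""
    ts_ms, open_, high, low, close, volume = candle
    date = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc).strftime('%Y-%m-%d')
    # Adj Close simply repeats Close, matching the layout used in this notebook
    return [date, open_, high, low, close, close, int(volume)]

# Sample candle; 1674345600000 ms is 2023-01-22 00:00 UTC
candle = [1674345600000, 22778.5, 23066.0, 22265.0, 22697.5, 367080200]
row = to_csv_row(candle)
print(row)
```

Returning a new list instead of editing the candle in place keeps the raw exchange data intact if it is needed again later.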
    

_Defining the parameters for data extraction_

In [10]:
exchange = "bitmex"
simbolo = "BTC/USD"
janela = "1d"
data_inicio = "2018-01-01T00:00:00Z"
outfile = "dados/dataset.csv"

_Data extraction_

In [11]:
extrai_dados_para_csv(outfile, exchange, 3, simbolo, janela, data_inicio, 100)

100 registros extraídos de 2023-11-17T00:00:00.000Z a 2024-02-24T00:00:00.000Z
200 registros extraídos de 2023-08-09T00:00:00.000Z a 2024-02-24T00:00:00.000Z
300 registros extraídos de 2023-05-01T00:00:00.000Z a 2024-02-24T00:00:00.000Z
400 registros extraídos de 2023-01-21T00:00:00.000Z a 2024-02-24T00:00:00.000Z
500 registros extraídos de 2022-10-13T00:00:00.000Z a 2024-02-24T00:00:00.000Z
600 registros extraídos de 2022-07-05T00:00:00.000Z a 2024-02-24T00:00:00.000Z
700 registros extraídos de 2022-03-27T00:00:00.000Z a 2024-02-24T00:00:00.000Z
800 registros extraídos de 2021-12-17T00:00:00.000Z a 2024-02-24T00:00:00.000Z
900 registros extraídos de 2021-09-08T00:00:00.000Z a 2024-02-24T00:00:00.000Z
1000 registros extraídos de 2021-05-31T00:00:00.000Z a 2024-02-24T00:00:00.000Z
1100 registros extraídos de 2021-02-20T00:00:00.000Z a 2024-02-24T00:00:00.000Z
1200 registros extraídos de 2020-11-12T00:00:00.000Z a 2024-02-24T00:00:00.000Z
1300 registros extraídos de 2020-08-04T00:00:00.0

_**Loading and exploring the data**_

In [12]:
df = pd.read_csv(outfile)
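
If a CSV header has a space after each comma (for example `Date, Open, Close`), a parser that keeps that space produces column names such as `' Close'`, and lookups by the clean name fail. The sketch below shows the effect with the standard-library `csv` module on a hypothetical two-line sample.

```python
import csv
import io

# Hypothetical sample with a space after each comma in the header
sample = "Date, Open, Close\n2023-01-22,22778.5,22697.5\n"

# Default parsing keeps the leading space as part of each field name
raw_names = next(csv.reader(io.StringIO(sample)))

# skipinitialspace=True drops whitespace that follows a delimiter
clean_names = next(csv.reader(io.StringIO(sample), skipinitialspace=True))

print(raw_names)    # ['Date', ' Open', ' Close']
print(clean_names)  # ['Date', 'Open', 'Close']
```

`pd.read_csv` accepts the same `skipinitialspace=True` argument.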

In [13]:
df.head()

|   | Date | Open | Hight | Low | Close | Adj Close | Volume |
|---|---|---|---|---|---|---|---|
| 0 | 2023-01-22 | 22778.5 | 23066.0 | 22265.0 | 22697.5 | 22697.5 | 367080200 |
| 1 | 2023-01-23 | 22697.5 | 23175.0 | 22478.5 | 22910.5 | 22910.5 | 374543500 |
| 2 | 2023-01-24 | 22910.5 | 23154.5 | 22454.0 | 22629.5 | 22629.5 | 360087600 |
| 3 | 2023-01-25 | 22629.5 | 23903.0 | 22333.5 | 23055.0 | 23055.0 | 502615700 |
| 4 | 2023-01-26 | 23055.0 | 23277.5 | 22847.5 | 23004.0 | 23004.0 | 361910300 |


In [14]:
df.shape

(399, 7)

In [16]:
# Closing prices (strip stray spaces from the column names first)
df.columns = df.columns.str.strip()
close = df.Close.values.tolist()

In [None]:
# Other parameters for this version of the model
window_size = 30
skip = 5
l = len(close) - 1