# Notebook dedicated to selecting the cryptos that will be candidates for recommendation to the client and to loading their price and volume data

# 1. Libraries

In [12]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import yfinance as yf
import datetime as dt

In [13]:
#We will also need the CoinMarketCap API to download the list of available IDs:
#!pip install python-coinmarketcap

In [14]:
import coinmarketcapapi

# 2. Loading the general data

We will use the CoinMarketCap API to pull the cryptoassets available today. However, since the free tier is limited to a few credits per month and historical values are only accessible through the paid API, we will leave that second part to yfinance.

In [38]:
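import os
#Read the API key from an environment variable instead of hard-coding it (the name CMC_API_KEY is an assumption):
key = os.environ["CMC_API_KEY"]
cmc = coinmarketcapapi.CoinMarketCapAPI(key)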

In [44]:
data = cmc.cryptocurrency_info(symbol='BTC')


In [50]:
data

RESPONSE: 475ms OK: {'BTC': {'id': 1, 'name': 'Bitcoin', 'symbol': 'BTC', 'category': 'coin', 'description': 'Bitcoin (BTC) is a cryptocurrency . Users are able to generate BTC through the process of mining. Bitcoin has a current supply of 18,803,343. The last known price of Bitcoin is 47,594.92127839 USD and is down -0.03 over the last 24 hours. It is currently trading on 8849 active market(s) with $37,219,978,665.92 traded over the last 24 hours. More information can be found at https://bitcoin.org/.', 'slug': 'bitcoin', 'logo': 'https://s2.coinmarketcap.com/static/img/coins/64x64/1.png', 'subreddit': 'bitcoin', 'notice': '', 'tags': ['mineable', 'pow', 'sha-256', 'store-of-value', 'state-channels', 'coinbase-ventures-portfolio', 'three-arrows-capital-portfolio', 'polychain-capital-portfolio', 'binance-labs-portfolio', 'arrington-xrp-capital', 'blockchain-capital-portfolio', 'boostvc-portfolio', 'cms-holdings-portfolio', 'dcg-portfolio', 'dragonfly-capital-portfolio', 'electric-capit

In [16]:
data_id_map = cmc.cryptocurrency_map()
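
Since the free tier only grants a few credits per month, it may be worth caching the map response on disk so that re-running the notebook does not spend another credit. The sketch below is only an illustration: the helper `load_or_fetch` and the file name `cmc_map.json` are hypothetical, and `fake_fetch` stands in for the real call (something like `lambda: cmc.cryptocurrency_map().data`) so the example runs offline.

```python
import json
import tempfile
from pathlib import Path


def load_or_fetch(cache_file, fetch):
    """Return cached JSON data if the file exists; otherwise call fetch() and cache it."""
    cache_file = Path(cache_file)
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))
    data = fetch()  # in the notebook: lambda: cmc.cryptocurrency_map().data
    cache_file.write_text(json.dumps(data), encoding="utf-8")
    return data


# Stand-in for the real API call (hypothetical payload), so the sketch runs offline.
calls = []


def fake_fetch():
    calls.append(1)
    return [{"symbol": "BTC", "name": "Bitcoin", "rank": 1}]


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "cmc_map.json"
    first = load_or_fetch(cache, fake_fetch)   # calls the "API" and writes the cache
    second = load_or_fetch(cache, fake_fetch)  # served from disk, no new call

print(len(calls), first == second)
```

The cached list can be passed to `pd.DataFrame(...)` in the same way as `data_id_map.data`.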


In [17]:
df_info = pd.DataFrame(data_id_map.data, columns =['name','symbol','rank','is_active','first_historical_data'])
df_info.set_index('symbol',inplace=True)
df_info

| symbol | name | rank | is_active | first_historical_data |
|---|---|---|---|---|
| BTC | Bitcoin | 1 | 1 | 2013-04-28T18:47:21.000Z |
| LTC | Litecoin | 16 | 1 | 2013-04-28T18:47:22.000Z |
| NMC | Namecoin | 750 | 1 | 2013-04-28T18:47:22.000Z |
| TRC | Terracoin | 1876 | 1 | 2013-04-28T18:47:22.000Z |
| PPC | Peercoin | 707 | 1 | 2013-04-28T18:47:23.000Z |
| ... | ... | ... | ... | ... |
| RHYTHM | Rhythm | 3371 | 1 | 2021-08-31T19:42:22.000Z |
| UJENNY | Jenny Metaverse DAO Token | 3186 | 1 | 2021-08-31T18:30:27.000Z |
| KURAI | Kurai MetaVerse | 3063 | 1 | 2021-08-31T19:03:36.000Z |
| BGLG | BIG League | 3541 | 1 | 2021-09-01T05:48:31.000Z |


# 3. Crypto selection

We have the symbols, names, rank by market cap, an active/inactive flag and the date of the first quote, for more than 6,000 assets.

Let's keep only the assets that are still active today:

In [18]:
df_info[df_info['is_active']==1]

| symbol | name | rank | is_active | first_historical_data |
|---|---|---|---|---|
| BTC | Bitcoin | 1 | 1 | 2013-04-28T18:47:21.000Z |
| LTC | Litecoin | 16 | 1 | 2013-04-28T18:47:22.000Z |
| NMC | Namecoin | 750 | 1 | 2013-04-28T18:47:22.000Z |
| TRC | Terracoin | 1876 | 1 | 2013-04-28T18:47:22.000Z |
| PPC | Peercoin | 707 | 1 | 2013-04-28T18:47:23.000Z |
| ... | ... | ... | ... | ... |
| RHYTHM | Rhythm | 3371 | 1 | 2021-08-31T19:42:22.000Z |
| UJENNY | Jenny Metaverse DAO Token | 3186 | 1 | 2021-08-31T18:30:27.000Z |
| KURAI | Kurai MetaVerse | 3063 | 1 | 2021-08-31T19:03:36.000Z |
| BGLG | BIG League | 3541 | 1 | 2021-09-01T05:48:31.000Z |


All assets are active (that is a lot of coins...). Since we have the rank by market cap, let's select the 50 coins with the highest market value today:

In [19]:
df_info_top = df_info[df_info['rank'] <= 50 ].sort_values(by='rank', ascending=True)
df_info_top

| symbol | name | rank | is_active | first_historical_data |
|---|---|---|---|---|
| BTC | Bitcoin | 1 | 1 | 2013-04-28T18:47:21.000Z |
| ETH | Ethereum | 2 | 1 | 2015-08-07T14:49:30.000Z |
| ADA | Cardano | 3 | 1 | 2017-10-01T20:34:25.000Z |
| BNB | Binance Coin | 4 | 1 | 2017-07-25T04:30:05.000Z |
| USDT | Tether | 5 | 1 | 2015-02-25T13:34:26.000Z |
| XRP | XRP | 6 | 1 | 2013-08-04T18:51:05.000Z |
| DOGE | Dogecoin | 7 | 1 | 2013-12-15T14:42:34.000Z |
| SOL | Solana | 8 | 1 | 2020-04-10T04:59:18.000Z |
| DOT | Polkadot | 9 | 1 | 2020-08-20T03:29:22.000Z |
| USDC | USD Coin | 10 | 1 | 2018-10-08T18:49:28.000Z |


Now, let's admit only the cryptos with more than 3 years of data available, that is, those listed before 1 August 2018:

In [20]:
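#.copy() makes df_eligible an independent frame, avoiding SettingWithCopyWarning in later assignments
df_eligible = df_info_top[df_info_top['first_historical_data']<="2018-08-01"].copy()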
df_eligible

| symbol | name | rank | is_active | first_historical_data |
|---|---|---|---|---|
| BTC | Bitcoin | 1 | 1 | 2013-04-28T18:47:21.000Z |
| ETH | Ethereum | 2 | 1 | 2015-08-07T14:49:30.000Z |
| ADA | Cardano | 3 | 1 | 2017-10-01T20:34:25.000Z |
| BNB | Binance Coin | 4 | 1 | 2017-07-25T04:30:05.000Z |
| USDT | Tether | 5 | 1 | 2015-02-25T13:34:26.000Z |
| XRP | XRP | 6 | 1 | 2013-08-04T18:51:05.000Z |
| DOGE | Dogecoin | 7 | 1 | 2013-12-15T14:42:34.000Z |
| LINK | Chainlink | 14 | 1 | 2017-09-20T20:54:59.000Z |
| BCH | Bitcoin Cash | 15 | 1 | 2017-07-23T16:29:27.000Z |
| LTC | Litecoin | 16 | 1 | 2013-04-28T18:47:22.000Z |


In [21]:
df_eligible.shape

(22, 4)
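
The date filter above compares ISO strings against a hard-coded cutoff. As a hedged alternative, the cutoff could be derived from a reference date and compared against parsed timestamps. In the sketch below, the reference date of 2021-08-01 is an assumption inferred from the "3 years" rule, and the toy frame reuses three rows from the tables above.

```python
import pandas as pd

# Toy frame mimicking the 'first_historical_data' column (values taken from the tables above).
df = pd.DataFrame(
    {
        "first_historical_data": [
            "2013-04-28T18:47:21.000Z",
            "2020-04-10T04:59:18.000Z",
            "2017-10-01T20:34:25.000Z",
        ]
    },
    index=pd.Index(["BTC", "SOL", "ADA"], name="symbol"),
)

reference = pd.Timestamp("2021-08-01", tz="UTC")  # assumed reference date
cutoff = reference - pd.DateOffset(years=3)       # three years earlier

# Parse the ISO strings into timezone-aware timestamps before comparing.
listed = pd.to_datetime(df["first_historical_data"], utc=True)
eligible = df[listed <= cutoff].copy()

print(cutoff)
print(eligible.index.tolist())
```

Comparing real timestamps makes the rule explicit and keeps working if the date strings ever change format.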

To prepare the df for loading price information from Yahoo Finance, the coin symbols need to end with the suffix -USD. Let's do that in a new column:

In [57]:
df_eligible['ticker'] = df_eligible.index + "-USD"
df_eligible


| symbol | name | rank | is_active | first_historical_data | ticker | marketCap_1E9 | description |
|---|---|---|---|---|---|---|---|
| BTC | Bitcoin | 1 | 1 | 2013-04-28 18:47:21+00:00 | BTC-USD | 895.967101 | Bitcoin (BTC) is a cryptocurrency . Users are ... |
| ETH | Ethereum | 2 | 1 | 2015-08-07 14:49:30+00:00 | ETH-USD | 416.989282 | Ethereum (ETH) is a cryptocurrency . Users are... |
| ADA | Cardano | 3 | 1 | 2017-10-01 20:34:25+00:00 | ADA-USD | 90.905608 | Cardano (ADA) is a cryptocurrency . Users are ... |
| BNB | Binance Coin | 4 | 1 | 2017-07-25 04:30:05+00:00 | BNB-USD | 79.769387 | Binance Coin (BNB) is a cryptocurrency . Binan... |
| USDT | Tether | 5 | 1 | 2015-02-25 13:34:26+00:00 | USDT-USD | 65.645732 | Tether (USDT) is a cryptocurrency launched in ... |
| XRP | XRP | 6 | 1 | 2013-08-04 18:51:05+00:00 | XRP-USD | 55.544893 | XRP (XRP) is a cryptocurrency . XRP has a curr... |
| DOGE | Dogecoin | 7 | 1 | 2013-12-15 14:42:34+00:00 | DOGE-USD | 36.868809 | Dogecoin (DOGE) is a cryptocurrency . Users ar... |
| LINK | Chainlink | 14 | 1 | 2017-09-20 20:54:59+00:00 | LINK-USD | 12.443192 | Chainlink (LINK) is a cryptocurrency and opera... |
| BCH | Bitcoin Cash | 15 | 1 | 2017-07-23 16:29:27+00:00 | BCH-USD | 12.077183 | Bitcoin Cash (BCH) is a cryptocurrency . Users... |
| LTC | Litecoin | 16 | 1 | 2013-04-28 18:47:22+00:00 | LTC-USD | 11.649904 | Litecoin (LTC) is a cryptocurrency . Users are... |
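
When a column is added to a frame obtained by filtering another one, pandas may emit a `SettingWithCopyWarning`. A minimal sketch of one way to avoid it, using a toy frame (the names `df_top` and `subset` are illustrative only): take an explicit `.copy()` of the filtered frame before assigning the new column.

```python
import pandas as pd

df_top = pd.DataFrame(
    {"name": ["Bitcoin", "Solana"], "rank": [1, 8]},
    index=pd.Index(["BTC", "SOL"], name="symbol"),
)

# An explicit copy makes the filtered frame independent of df_top.
subset = df_top[df_top["rank"] <= 5].copy()

# Build the Yahoo Finance ticker by appending the -USD suffix to each symbol.
subset["ticker"] = subset.index + "-USD"

print(subset["ticker"].tolist())
print("ticker" in df_top.columns)
```

The original frame is left untouched, so the new column exists only in the copy.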


## Summary of the selection strategy:

- We kept only cryptoassets that are still active today (6,124);
- We filtered for the 50 largest assets, ranked by market cap;
- We excluded those with less than 3 years on the market, so that we have enough data for the projections;
- We ended up with 22 assets.
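
The steps above can be sketched as a single function that also records how many assets survive each filter. This is only an illustration on a toy frame: `selection_funnel` and its parameters are hypothetical names, not part of the original notebook.

```python
import pandas as pd


def selection_funnel(df, top_n=50, cutoff="2018-08-01"):
    """Apply the three filters in order and count the assets left after each one."""
    counts = {"all": len(df)}
    active = df[df["is_active"] == 1]
    counts["active"] = len(active)
    top = active[active["rank"] <= top_n].sort_values("rank")
    counts["top_n"] = len(top)
    listed = pd.to_datetime(top["first_historical_data"], utc=True)
    eligible = top[listed <= pd.Timestamp(cutoff, tz="UTC")].copy()
    counts["eligible"] = len(eligible)
    return eligible, counts


# Toy data reusing a few rows from the tables above.
toy = pd.DataFrame(
    {
        "name": ["Bitcoin", "Solana", "Namecoin", "Ethereum"],
        "rank": [1, 8, 750, 2],
        "is_active": [1, 1, 1, 1],
        "first_historical_data": [
            "2013-04-28T18:47:21.000Z",
            "2020-04-10T04:59:18.000Z",
            "2013-04-28T18:47:22.000Z",
            "2015-08-07T14:49:30.000Z",
        ],
    },
    index=pd.Index(["BTC", "SOL", "NMC", "ETH"], name="symbol"),
)

eligible, counts = selection_funnel(toy)
print(counts)
print(eligible.index.tolist())
```

On the real `df_info`, the returned counts would correspond to the figures quoted in the bullets above.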

# 4. Data loading

## Loading market cap after selecting the cryptos - yfinance

In [54]:
for ativo in df_eligible.index:
    info = yf.Ticker(ativo+"-USD").info  #fetch the metadata once per asset instead of twice
    df_eligible.loc[ativo,'marketCap'] = info['marketCap']
    df_eligible.loc[ativo,'description'] = info['description']



In [55]:
#Let's express marketCap in billions of dollars:
df_eligible['marketCap_1E9'] = df_eligible['marketCap']/1e9
df_eligible.drop(columns = ['marketCap'], inplace=True)
df_eligible['first_historical_data'] = pd.to_datetime(df_eligible['first_historical_data'])
df_eligible


| symbol | name | rank | is_active | first_historical_data | ticker | marketCap_1E9 | description |
|---|---|---|---|---|---|---|---|
| BTC | Bitcoin | 1 | 1 | 2013-04-28 18:47:21+00:00 | BTC-USD | 895.967101 | Bitcoin (BTC) is a cryptocurrency . Users are ... |
| ETH | Ethereum | 2 | 1 | 2015-08-07 14:49:30+00:00 | ETH-USD | 416.989282 | Ethereum (ETH) is a cryptocurrency . Users are... |
| ADA | Cardano | 3 | 1 | 2017-10-01 20:34:25+00:00 | ADA-USD | 90.905608 | Cardano (ADA) is a cryptocurrency . Users are ... |
| BNB | Binance Coin | 4 | 1 | 2017-07-25 04:30:05+00:00 | BNB-USD | 79.769387 | Binance Coin (BNB) is a cryptocurrency . Binan... |
| USDT | Tether | 5 | 1 | 2015-02-25 13:34:26+00:00 | USDT-USD | 65.645732 | Tether (USDT) is a cryptocurrency launched in ... |
| XRP | XRP | 6 | 1 | 2013-08-04 18:51:05+00:00 | XRP-USD | 55.544893 | XRP (XRP) is a cryptocurrency . XRP has a curr... |
| DOGE | Dogecoin | 7 | 1 | 2013-12-15 14:42:34+00:00 | DOGE-USD | 36.868809 | Dogecoin (DOGE) is a cryptocurrency . Users ar... |
| LINK | Chainlink | 14 | 1 | 2017-09-20 20:54:59+00:00 | LINK-USD | 12.443192 | Chainlink (LINK) is a cryptocurrency and opera... |
| BCH | Bitcoin Cash | 15 | 1 | 2017-07-23 16:29:27+00:00 | BCH-USD | 12.077183 | Bitcoin Cash (BCH) is a cryptocurrency . Users... |
| LTC | Litecoin | 16 | 1 | 2013-04-28 18:47:22+00:00 | LTC-USD | 11.649904 | Litecoin (LTC) is a cryptocurrency . Users are... |
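
The market cap lookup assumes that every asset's metadata contains both `marketCap` and `description`; a missing key would interrupt the loop. A hedged sketch of a more tolerant extraction follows. The helper `extract_fields` is hypothetical and the sample dictionary is a made-up payload standing in for `yf.Ticker(...).info`, so the example runs offline.

```python
import math


def extract_fields(info, fields=("marketCap", "description")):
    """Pick the requested keys from a metadata dict, falling back to NaN when absent."""
    return {field: info.get(field, math.nan) for field in fields}


# Made-up payload standing in for yf.Ticker("BTC-USD").info; 'description' is missing on purpose.
sample = {"marketCap": 895_967_101_000, "symbol": "BTC-USD"}

row = extract_fields(sample)
market_cap_billions = row["marketCap"] / 1e9  # same unit as the marketCap_1E9 column

print(market_cap_billions)
print(row["description"])
```

Assets with missing metadata then show up as NaN in the table instead of raising a `KeyError` halfway through the loop.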


## Loading and saving the price and volume data per asset after selecting the cryptos

In [26]:
#Let's fetch the data for each asset, export it to its own csv file and combine everything into a single dataframe, df_ativos.
#df_ativos will also be saved at the end.

final = dt.datetime(2021,8,20)
path = r"C:\Users\Alexandre\OneDrive\Documentos\1. PRO\Data Science\Projeto Integrador\Dados\cryptos"
extension = ".csv"

df_ativos = pd.DataFrame(columns = ['Ativo', 'Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume'])

for ativo in df_eligible.ticker:

    #ativo[:-4] strips the "-USD" suffix to recover the symbol used as the index
    inicio = df_eligible.loc[ativo[:-4], 'first_historical_data']
    print(f"Starting collection of asset {ativo} since {inicio}")

    dados = yf.download(ativo, inicio, final)
    dados['Ativo'] = ativo

    file = str(ativo)

    dados.to_csv(path + "\\" + file + extension)

    df_ativos = pd.concat([df_ativos, dados])



Starting collection of asset BTC-USD since 2013-04-28 18:47:21+00:00
[*********************100%***********************]  1 of 1 completed
Starting collection of asset ETH-USD since 2015-08-07 14:49:30+00:00
[*********************100%***********************]  1 of 1 completed
Starting collection of asset ADA-USD since 2017-10-01 20:34:25+00:00
[*********************100%***********************]  1 of 1 completed
Starting collection of asset BNB-USD since 2017-07-25 04:30:05+00:00
[*********************100%***********************]  1 of 1 completed
Starting collection of asset USDT-USD since 2015-02-25 13:34:26+00:00
[*********************100%***********************]  1 of 1 completed
Starting collection of asset XRP-USD since 2013-08-04 18:51:05+00:00
[*********************100%***********************]  1 of 1 completed
Starting collection of asset DOGE-USD since 2013-12-15 14:42:34+00:00
[*********************100%***********************]  1 of 1 completed
Starting collection of asset LINK-USD since 2017-09-20 2
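
The collection loop could also be written to gather the per-asset frames in a list and concatenate them once at the end, building file paths with `pathlib` rather than string concatenation. This is a sketch under assumptions: `collect_prices` is a hypothetical helper, and `fake_download` stands in for `yf.download` (with start and end dates already bound) so the example runs without network access.

```python
import tempfile
from pathlib import Path

import pandas as pd


def collect_prices(tickers, download, out_dir):
    """Download each ticker, save one CSV per asset and return one combined frame."""
    out_dir = Path(out_dir)
    frames = []
    for ticker in tickers:
        data = download(ticker).copy()
        data["Ativo"] = ticker
        data.to_csv(out_dir / f"{ticker}.csv")
        frames.append(data)
    return pd.concat(frames)  # a single concat instead of one per iteration


def fake_download(ticker):
    # Stand-in for yf.download(ticker, start, end); returns two days of made-up prices.
    idx = pd.date_range("2021-08-18", periods=2, freq="D")
    return pd.DataFrame({"Close": [1.0, 2.0], "Volume": [10, 20]}, index=idx)


with tempfile.TemporaryDirectory() as tmp:
    combined = collect_prices(["BTC-USD", "ETH-USD"], fake_download, tmp)
    saved = sorted(p.name for p in Path(tmp).glob("*.csv"))

print(combined.shape)
print(saved)
```

Collecting the frames first avoids repeatedly copying a growing DataFrame and removes the need for the empty placeholder frame.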

In [27]:
df_ativos

|  | Ativo | Open | High | Low | Close | Adj Close | Volume |
|---|---|---|---|---|---|---|---|
| 2014-09-17 | BTC-USD | 465.864014 | 468.174011 | 452.421997 | 457.334015 | 457.334015 | 21056800 |
| 2014-09-18 | BTC-USD | 456.859985 | 456.859985 | 413.104004 | 424.440002 | 424.440002 | 34483200 |
| 2014-09-19 | BTC-USD | 424.102997 | 427.834991 | 384.532013 | 394.795990 | 394.795990 | 37919700 |
| 2014-09-20 | BTC-USD | 394.673004 | 423.295990 | 389.882996 | 408.903992 | 408.903992 | 36863600 |
| 2014-09-21 | BTC-USD | 408.084991 | 412.425995 | 393.181000 | 398.821014 | 398.821014 | 26580100 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 2021-08-16 | MIOTA-USD | 1.173197 | 1.224370 | 1.100885 | 1.110911 | 1.110911 | 76953109 |
| 2021-08-17 | MIOTA-USD | 1.107984 | 1.142450 | 1.000799 | 1.020915 | 1.020915 | 75964472 |
| 2021-08-18 | MIOTA-USD | 1.019151 | 1.058414 | 0.970220 | 1.002860 | 1.002860 | 61076512 |
| 2021-08-19 | MIOTA-USD | 1.001436 | 1.059275 | 0.952945 | 1.059275 | 1.059275 | 62111103 |


# 5. Storing the data for the selected cryptos

Let's save df_eligible, with the overall data for the 22 assets:

In [56]:
df_eligible.to_csv(r"C:\Users\Alexandre\OneDrive\Documentos\1. PRO\Data Science\Projeto Integrador\Dados\df_eligible.csv")

In [29]:
df_ativos.to_csv(r"C:\Users\Alexandre\OneDrive\Documentos\1. PRO\Data Science\Projeto Integrador\Dados\df_ativos.csv", index=True)
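
When these files are read back in a later notebook, the date index has to be restored explicitly. A minimal round-trip sketch, using a temporary file and two rows from the table above rather than the real path:

```python
import tempfile
from pathlib import Path

import pandas as pd

prices = pd.DataFrame(
    {"Ativo": ["BTC-USD", "BTC-USD"], "Close": [457.334015, 424.440002]},
    index=pd.to_datetime(["2014-09-17", "2014-09-18"]),
)

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "df_ativos.csv"
    prices.to_csv(csv_path, index=True)
    # index_col=0 restores the first column as the index; parse_dates=True parses it as dates.
    restored = pd.read_csv(csv_path, index_col=0, parse_dates=True)

print(type(restored.index).__name__)
print(restored.shape)
```

Without `index_col=0`, the dates would come back as an ordinary unnamed column instead of the index.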