# Notebook dedicated to selecting the cryptos that will be candidates for recommendation to the client, and to loading their price and volume data

# 1. Libraries

In [111]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import yfinance as yf
import datetime as dt

In [4]:
#We will also need the CoinMarketCap API to download the list of available IDs:
#!pip install python-coinmarketcap

Collecting python-coinmarketcap
  Downloading python_coinmarketcap-0.2-py3-none-any.whl (7.8 kB)
Installing collected packages: python-coinmarketcap
Successfully installed python-coinmarketcap-0.2


In [5]:
import coinmarketcapapi

# 2. Loading the general data

We will use the CoinMarketCap API to pull the cryptoassets available today. However, because the free plan is limited to a few credits per month and historical values are only accessible through the paid API, we will leave that second part to yfinance.

In [16]:
key = "f07f99c4-b668-44d9-8ccd-08107b1eecac"
cmc = coinmarketcapapi.CoinMarketCapAPI(key)
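Because the free plan has a monthly credit budget tied to the key, it can help to keep the key out of the notebook itself. The sketch below reads it from an environment variable; the variable name `CMC_API_KEY` and the helper `load_api_key` are assumptions made for illustration, not part of the CoinMarketCap client.

```python
import os


def load_api_key(env_var="CMC_API_KEY"):
    """Return the API key stored in an environment variable.

    The variable name is a hypothetical choice for this sketch.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running this cell")
    return key


# Usage in the notebook (commented out so the sketch runs without the package or a real key):
# cmc = coinmarketcapapi.CoinMarketCapAPI(load_api_key())
```

With this approach the notebook can be shared or committed without exposing the credential.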

In [17]:
data_id_map = cmc.cryptocurrency_map()


In [27]:
df_info = pd.DataFrame(data_id_map.data, columns =['name','symbol','rank','is_active','first_historical_data'])
df_info.set_index('symbol',inplace=True)
df_info

symbol,name,rank,is_active,first_historical_data
BTC,Bitcoin,1,1,2013-04-28T18:47:21.000Z
LTC,Litecoin,14,1,2013-04-28T18:47:22.000Z
NMC,Namecoin,740,1,2013-04-28T18:47:22.000Z
TRC,Terracoin,1859,1,2013-04-28T18:47:22.000Z
PPC,Peercoin,657,1,2013-04-28T18:47:23.000Z
...,...,...,...,...
POLP,PolkaParty,3581,1,2021-08-23T08:33:03.000Z
GRIM,GrimToken,6127,1,2021-08-23T08:27:03.000Z
ROBODOGE,RoboDoge Coin,3267,1,2021-08-23T08:39:03.000Z
CSWAP,CardSwap,2944,1,2021-08-23T08:42:03.000Z


# 3. Selecting the cryptos

We have the symbols, names, rank by market cap, an active/inactive flag, and the date of the first quote, for more than 6k assets.

Let's keep only the assets that are still active today:

In [28]:
df_info[df_info['is_active']==1]

symbol,name,rank,is_active,first_historical_data
BTC,Bitcoin,1,1,2013-04-28T18:47:21.000Z
LTC,Litecoin,14,1,2013-04-28T18:47:22.000Z
NMC,Namecoin,740,1,2013-04-28T18:47:22.000Z
TRC,Terracoin,1859,1,2013-04-28T18:47:22.000Z
PPC,Peercoin,657,1,2013-04-28T18:47:23.000Z
...,...,...,...,...
POLP,PolkaParty,3581,1,2021-08-23T08:33:03.000Z
GRIM,GrimToken,6127,1,2021-08-23T08:27:03.000Z
ROBODOGE,RoboDoge Coin,3267,1,2021-08-23T08:39:03.000Z
CSWAP,CardSwap,2944,1,2021-08-23T08:42:03.000Z


All assets are active (that is a lot of coins...). Since we have the rank by market cap, let's select the 50 coins with the largest market value today:

In [33]:
df_info_top = df_info[df_info['rank'] <= 50 ].sort_values(by='rank', ascending=True)
df_info_top

symbol,name,rank,is_active,first_historical_data
BTC,Bitcoin,1,1,2013-04-28T18:47:21.000Z
ETH,Ethereum,2,1,2015-08-07T14:49:30.000Z
ADA,Cardano,3,1,2017-10-01T20:34:25.000Z
BNB,Binance Coin,4,1,2017-07-25T04:30:05.000Z
USDT,Tether,5,1,2015-02-25T13:34:26.000Z
XRP,XRP,6,1,2013-08-04T18:51:05.000Z
DOGE,Dogecoin,7,1,2013-12-15T14:42:34.000Z
DOT,Polkadot,8,1,2020-08-20T03:29:22.000Z
USDC,USD Coin,9,1,2018-10-08T18:49:28.000Z
SOL,Solana,10,1,2020-04-10T04:59:18.000Z


Now let's keep only the cryptos with more than 3 years of data available, that is, those listed before August 1, 2018 (2018-08-01):

In [53]:
df_eligible = df_info_top[df_info_top['first_historical_data']<="2018-08-01"]
df_eligible

symbol,name,rank,is_active,first_historical_data
BTC,Bitcoin,1,1,2013-04-28T18:47:21.000Z
ETH,Ethereum,2,1,2015-08-07T14:49:30.000Z
ADA,Cardano,3,1,2017-10-01T20:34:25.000Z
BNB,Binance Coin,4,1,2017-07-25T04:30:05.000Z
USDT,Tether,5,1,2015-02-25T13:34:26.000Z
XRP,XRP,6,1,2013-08-04T18:51:05.000Z
DOGE,Dogecoin,7,1,2013-12-15T14:42:34.000Z
LINK,Chainlink,12,1,2017-09-20T20:54:59.000Z
BCH,Bitcoin Cash,13,1,2017-07-23T16:29:27.000Z
LTC,Litecoin,14,1,2013-04-28T18:47:22.000Z


In [54]:
df_eligible.shape

(22, 4)
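The date filter above compares ISO 8601 strings, which works because that format sorts lexicographically in chronological order. A slightly more explicit variant, sketched here on toy data (the frame `toy` and its symbols are illustrative), parses the column to timezone-aware timestamps before comparing.

```python
import pandas as pd

# Toy data using the same timestamp format as the CoinMarketCap map
toy = pd.DataFrame(
    {
        "first_historical_data": [
            "2013-04-28T18:47:21.000Z",
            "2018-10-08T18:49:28.000Z",
            "2020-04-10T04:59:18.000Z",
        ]
    },
    index=pd.Index(["OLD", "MID", "NEW"], name="symbol"),
)

# Parse to UTC timestamps and compare against a UTC cutoff
listed = pd.to_datetime(toy["first_historical_data"], utc=True)
cutoff = pd.Timestamp("2018-08-01", tz="UTC")
eligible = toy[listed <= cutoff]

print(eligible)
```

One difference to be aware of: a timestamp exactly at midnight on the cutoff date passes the datetime comparison, but fails the string comparison, since the longer string sorts after `"2018-08-01"`.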

To prepare the df for loading price data from Yahoo Finance, the tickers must end with the -USD suffix. Let's add it in a new column:

In [55]:
df_eligible['ticker'] = df_eligible.index + "-USD"
df_eligible

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_eligible['ticker'] = df_eligible.index + "-USD"


symbol,name,rank,is_active,first_historical_data,ticker
BTC,Bitcoin,1,1,2013-04-28T18:47:21.000Z,BTC-USD
ETH,Ethereum,2,1,2015-08-07T14:49:30.000Z,ETH-USD
ADA,Cardano,3,1,2017-10-01T20:34:25.000Z,ADA-USD
BNB,Binance Coin,4,1,2017-07-25T04:30:05.000Z,BNB-USD
USDT,Tether,5,1,2015-02-25T13:34:26.000Z,USDT-USD
XRP,XRP,6,1,2013-08-04T18:51:05.000Z,XRP-USD
DOGE,Dogecoin,7,1,2013-12-15T14:42:34.000Z,DOGE-USD
LINK,Chainlink,12,1,2017-09-20T20:54:59.000Z,LINK-USD
BCH,Bitcoin Cash,13,1,2017-07-23T16:29:27.000Z,BCH-USD
LTC,Litecoin,14,1,2013-04-28T18:47:22.000Z,LTC-USD
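The warning shown above appears because `df_eligible` was produced by filtering another DataFrame, so pandas cannot tell whether the assignment is meant to affect the original. A common way to avoid it, sketched here on toy data (the frame `toy` is illustrative), is to take an explicit copy when filtering.

```python
import pandas as pd

toy = pd.DataFrame(
    {"rank": [1, 2, 60]},
    index=pd.Index(["BTC", "ETH", "XYZ"], name="symbol"),
)

# .copy() makes the filtered frame independent of toy,
# so adding a column does not trigger SettingWithCopyWarning
subset = toy[toy["rank"] <= 50].copy()
subset["ticker"] = subset.index + "-USD"

print(subset)
```

The original frame is left unchanged, and the new column lives only in the copy.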


## Summary of the selection strategy:

- We kept only cryptoassets that are still active today (6,124);
- We filtered for the 50 largest assets by market cap;
- We excluded those with less than 3 years on the market, so that we have enough data for the projections;
- We ended up with 22 assets.
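The three steps summarized above can be sketched as a single function. The name `select_eligible` and its parameters are hypothetical, and the toy data below is illustrative; the column names follow the `df_info` frame built earlier.

```python
import pandas as pd


def select_eligible(df_info, top_n=50, cutoff="2018-08-01"):
    """Apply the three selection filters: active, top_n by rank, listed on or before cutoff."""
    active = df_info[df_info["is_active"] == 1]
    top = active[active["rank"] <= top_n].sort_values(by="rank")
    listed = pd.to_datetime(top["first_historical_data"], utc=True)
    return top[listed <= pd.Timestamp(cutoff, tz="UTC")].copy()


# Toy data (illustrative values only)
toy = pd.DataFrame(
    {
        "rank": [1, 2, 3, 70],
        "is_active": [1, 1, 0, 1],
        "first_historical_data": [
            "2013-04-28T18:47:21.000Z",
            "2020-08-20T03:29:22.000Z",
            "2014-01-01T00:00:00.000Z",
            "2013-04-28T18:47:22.000Z",
        ],
    },
    index=pd.Index(["A", "B", "C", "D"], name="symbol"),
)

selected = select_eligible(toy)
print(selected)
```

Here B is dropped for being listed too recently, C for being inactive, and D for ranking outside the top 50.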

# 4. Loading the data

## Loading market cap after selecting the cryptos - yfinance

In [125]:
for ativo in df_eligible.index:
    df_eligible.loc[ativo,'marketCap'] = yf.Ticker(ativo+"-USD").info['marketCap']

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self.obj[key] = infer_fill_value(value)
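The loop above makes one `.info` request per asset and assumes the `marketCap` field is always present. A more defensive variant is sketched below: it collects the values into a Series and tolerates a missing field. The helper names `fetch_info_yfinance` and `load_market_caps` are assumptions for this sketch, and the fetcher is passed in as a parameter so the logic can be tried without network access.

```python
import pandas as pd


def fetch_info_yfinance(ticker):
    """Fetch the metadata dict for a ticker (yfinance is imported lazily)."""
    import yfinance as yf

    return yf.Ticker(ticker).info


def load_market_caps(tickers, fetch_info=fetch_info_yfinance):
    """Return a float Series of market caps indexed by ticker; NaN when unavailable."""
    caps = {}
    for ticker in tickers:
        info = fetch_info(ticker) or {}
        caps[ticker] = info.get("marketCap")  # None when the field is absent
    return pd.Series(caps, name="marketCap", dtype="float64")


# Offline demonstration with a fake fetcher (values are made up)
fake = {"AAA-USD": {"marketCap": 2e9}, "BBB-USD": {}}
caps = load_market_caps(["AAA-USD", "BBB-USD"], fetch_info=lambda t: fake[t])
print(caps / 1e9)  # market cap in billions of dollars
```

In the notebook, the result could be attached with something like `df_eligible["marketCap"] = load_market_caps(df_eligible["ticker"]).to_numpy()`, assuming the rows stay in the same order.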

In [126]:
#Let's express marketCap in billions of dollars:
df_eligible['marketCap_1E9'] = df_eligible['marketCap']/1e9
df_eligible.drop(columns = ['marketCap'], inplace=True)
df_eligible['first_historical_data'] = pd.to_datetime(df_eligible['first_historical_data'])
df_eligible

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_eligible['marketCap_1E9'] = df_eligible['marketCap']/1e9


symbol,name,rank,is_active,first_historical_data,ticker,marketCap_1E9
BTC,Bitcoin,1,1,2013-04-28 18:47:21+00:00,BTC-USD,924.158198
ETH,Ethereum,2,1,2015-08-07 14:49:30+00:00,ETH-USD,386.169176
ADA,Cardano,3,1,2017-10-01 20:34:25+00:00,ADA-USD,92.408947
BNB,Binance Coin,4,1,2017-07-25 04:30:05+00:00,BNB-USD,83.848086
USDT,Tether,5,1,2015-02-25 13:34:26+00:00,USDT-USD,64.845746
XRP,XRP,6,1,2013-08-04 18:51:05+00:00,XRP-USD,57.242149
DOGE,Dogecoin,7,1,2013-12-15 14:42:34+00:00,DOGE-USD,41.228456
LINK,Chainlink,12,1,2017-09-20 20:54:59+00:00,LINK-USD,12.712209
BCH,Bitcoin Cash,13,1,2017-07-23 16:29:27+00:00,BCH-USD,12.542516
LTC,Litecoin,14,1,2013-04-28 18:47:22+00:00,LTC-USD,12.314231


## Loading and saving price and volume data per asset after selecting the cryptos

In [133]:
ativo

'USDT-USD'

In [139]:
#Let's fetch the data for each asset, export it to its own csv file, and combine everything into a single dataframe, df_ativos.
#df_ativos is also saved at the end.

final = dt.datetime(2021,8,20)
path = r"C:\Users\Alexandre\OneDrive\Documentos\1. PRO\Data Science\Projeto Integrador\Data\cryptos"
extension = ".csv"

df_ativos = pd.DataFrame(columns = ['Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume'])

for ativo in df_eligible.ticker:

    #The index of df_eligible is the symbol, i.e. the ticker without the "-USD" suffix
    inicio = df_eligible.loc[ativo[:-4], 'first_historical_data']
    print(f"Starting download of asset {ativo} since {inicio}")

    dados = yf.download(ativo, inicio, final)

    file = str(ativo)

    dados.to_csv(path + "\\" + file + extension)

    df_ativos = pd.concat([df_ativos, dados])

Starting download of asset BTC-USD since 2013-04-28 18:47:21+00:00
[*********************100%***********************]  1 of 1 completed
Starting download of asset ETH-USD since 2015-08-07 14:49:30+00:00
[*********************100%***********************]  1 of 1 completed
Starting download of asset ADA-USD since 2017-10-01 20:34:25+00:00
[*********************100%***********************]  1 of 1 completed
Starting download of asset BNB-USD since 2017-07-25 04:30:05+00:00
[*********************100%***********************]  1 of 1 completed
Starting download of asset USDT-USD since 2015-02-25 13:34:26+00:00
[*********************100%***********************]  1 of 1 completed
Starting download of asset XRP-USD since 2013-08-04 18:51:05+00:00
[*********************100%***********************]  1 of 1 completed
Starting download of asset DOGE-USD since 2013-12-15 14:42:34+00:00
[*********************100%***********************]  1 of 1 completed
Starting download of asset LINK-USD since 2017-09-20 2

In [140]:
df_ativos

Date,Open,High,Low,Close,Adj Close,Volume
2014-09-17,465.864014,468.174011,452.421997,457.334015,457.334015,21056800
2014-09-18,456.859985,456.859985,413.104004,424.440002,424.440002,34483200
2014-09-19,424.102997,427.834991,384.532013,394.795990,394.795990,37919700
2014-09-20,394.673004,423.295990,389.882996,408.903992,408.903992,36863600
2014-09-21,408.084991,412.425995,393.181000,398.821014,398.821014,26580100
...,...,...,...,...,...,...
2021-08-16,215.874359,216.735474,199.934326,202.093277,202.093277,355403656
2021-08-17,201.299622,211.345093,186.345810,194.603912,194.603912,424686878
2021-08-18,194.697891,213.366608,187.295181,210.221527,210.221527,432819051
2021-08-19,209.888916,220.769531,199.233871,220.021484,220.021484,397627914


# 5. Storing the data of the selected cryptos

Let's save df_eligible, with the overall data for the 22 assets:

In [136]:
df_eligible.to_csv(r"C:\Users\Alexandre\OneDrive\Documentos\1. PRO\Data Science\Projeto Integrador\Data\df_eligible.csv")

In [142]:
df_ativos.to_csv(r"C:\Users\Alexandre\OneDrive\Documentos\1. PRO\Data Science\Projeto Integrador\Data\df_ativos.csv")
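The save calls above depend on an absolute Windows path. As a hedged alternative, the sketch below builds paths with `pathlib` and checks that a saved file can be read back with its dates parsed. The temporary directory, the file name `AAA-USD.csv`, and the toy prices are assumptions for the example.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical data directory; a temporary folder keeps the sketch self-contained
data_dir = Path(tempfile.mkdtemp()) / "cryptos"
data_dir.mkdir(parents=True, exist_ok=True)

prices = pd.DataFrame(
    {"Close": [1.0, 1.1]},
    index=pd.DatetimeIndex(["2021-08-18", "2021-08-19"], name="Date"),
)

# Path objects join with "/" regardless of the operating system
out_file = data_dir / "AAA-USD.csv"
prices.to_csv(out_file)

# Read back, restoring the date index
restored = pd.read_csv(out_file, index_col="Date", parse_dates=True)
print(restored)
```

Reading with `parse_dates=True` matters for downstream notebooks, since otherwise the dates come back as plain strings.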