# Predicting 100x crypto coins

## Time-series machine-learning model for cryptocurrencies

**Goal:** train a predictor capable of identifying "100x" coins and tokens based solely on past trace data.

**Definitions:**

* A "100x" asset is one whose price (in USD) grows a hundredfold within a short period.
* By "short period" we mean 3 months, 6 months or 12 months.
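The definition above can be turned into a label with a forward-looking rolling maximum. This is only a minimal sketch under stated assumptions: the helper name `label_100x` is hypothetical, the input is assumed to hold one price per day, and the horizons are approximated as 90, 180 and 365 days.

```python
import pandas as pd

def label_100x(prices: pd.Series, horizon_days: int, multiple: float = 100.0) -> pd.Series:
    """Flag each day on which buying would have reached `multiple` x within the horizon.

    Assumes one row per day, ordered oldest to newest (an assumption, not
    something the CoinGecko data guarantees).
    """
    # The window covers today plus the next `horizon_days` rows.
    window = horizon_days + 1
    # Reversing, taking a rolling max and reversing back gives a forward-looking max.
    future_max = prices[::-1].rolling(window=window, min_periods=1).max()[::-1]
    return (future_max / prices) >= multiple

# Toy prices, purely illustrative.
toy = pd.Series([1.0, 2.0, 150.0, 50.0, 300.0])
print(label_100x(toy, horizon_days=2))
```

For the real task the same function would be called with `horizon_days` set to roughly 90, 180 or 365, one label column per horizon.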

**Data:** historical token prices are taken from CoinGecko.

## Import libraries

In [1]:
import numpy as np
import pandas as pd

In [2]:
from pycoingecko import CoinGeckoAPI

In [3]:
import random

In [4]:
from tqdm import tqdm

In [5]:
import time
import datetime

In [6]:
import defi.defi_tools as dft

## Data

### CoinGecko API client

In [7]:
cg = CoinGeckoAPI()

#### Test with simple example

In [5]:
cg.get_price(ids='bitcoin', vs_currencies='usd')

{'bitcoin': {'usd': 27549}}

In [6]:
cg.get_price(ids='ethereum', 
             vs_currencies='usd,btc', 
             include_market_cap=True, 
             include_24hr_vol=True, 
             include_24hr_change=True, 
             include_last_updated_at=True)

{'ethereum': {'usd': 1868.69,
  'usd_market_cap': 225013094531.53198,
  'usd_24h_vol': 7352961352.311256,
  'usd_24h_change': 1.074688914060943,
  'btc': 0.0678303,
  'btc_market_cap': 8167591.1429964965,
  'btc_24h_vol': 266899.9426054162,
  'btc_24h_change': -0.07842932474885035,
  'last_updated_at': 1682246998}}

### Parse

In [7]:
derivatives = cg.get_derivatives()

In [8]:
f'Number of derivatives downloaded: {len(derivatives)}'

'Number of derivatives downloaded: 4175'

In [9]:
random.sample(derivatives, 1)

[{'market': 'BIT (Futures)',
  'symbol': 'USD-M:RSS3-USD-PERPETUAL',
  'index_id': 'RSS3',
  'price': '0.15',
  'price_percentage_change_24h': 6.9124424,
  'contract_type': 'perpetual',
  'index': 0.1508866,
  'basis': 0.0574271,
  'spread': 0.4,
  'funding_rate': 0.01,
  'open_interest': None,
  'volume_24h': 1168829.8388,
  'last_traded_at': 1682246850,
  'expired_at': None}]

#### Supported coins

In [8]:
all_coins_list = cg.get_coins_list()

In [13]:
random.sample(all_coins_list, 3)

[{'id': 'exodusext', 'symbol': 'ext', 'name': 'ExodusExt'},
 {'id': 'cryptopunk-7171-hoodie',
  'symbol': 'hoodie',
  'name': 'CryptoPunk #7171'},
 {'id': 'x42-protocol', 'symbol': 'x42', 'name': 'X42 Protocol'}]

In [12]:
f'Number of all known coin symbols: {len(all_coins_list)}'

'Number of all known coin symbols: 10723'

In [None]:
# TODO: check if this holds "tokens" (as categorized by CMC) as well 

In [16]:
cg.get_coin_by_id('shiba-world-cup')

# Looks like there are a lot of cool features around a single coin
# Many attributes are possibly useful for machine learning

{'id': 'shiba-world-cup',
 'symbol': 'swc',
 'name': 'Shiba World Cup',
 'asset_platform_id': 'binance-smart-chain',
 'platforms': {'binance-smart-chain': '0x27dcc73cbbbe57d006303316dd3e91a0d5d58eea'},
 'detail_platforms': {'binance-smart-chain': {'decimal_place': 18,
   'contract_address': '0x27dcc73cbbbe57d006303316dd3e91a0d5d58eea'}},
 'block_time_in_minutes': 0,
 'hashing_algorithm': None,
 'categories': [],
 'public_notice': None,
 'additional_notices': ["Kindly be aware of <a href='https://www.coingecko.com/en/glossary/rug-pulled' target='_blank'>liquidity-related risks</a>. This notice is not directed at any project in particular, and is more of a cautionary reminder.",
  'The following token has a variable tax function on the smart contract to <a href="https://support.coingecko.com/hc/en-us/articles/4499153900185-What-are-variable-taxes-on-Smart-Contracts-">change tax rates post deployment</a>. <br>\nDo your own research and be careful if you are trading this token.\n'],
 'loca

In [18]:
cg.get_search_trending()

# unfortunately there is no historical data for this

{'coins': [{'item': {'id': 'arbitrum',
    'coin_id': 16547,
    'name': 'Arbitrum',
    'symbol': 'ARB',
    'market_cap_rank': 34,
    'thumb': 'https://assets.coingecko.com/coins/images/16547/thumb/photo_2023-03-29_21.47.00.jpeg?1680097630',
    'small': 'https://assets.coingecko.com/coins/images/16547/small/photo_2023-03-29_21.47.00.jpeg?1680097630',
    'large': 'https://assets.coingecko.com/coins/images/16547/large/photo_2023-03-29_21.47.00.jpeg?1680097630',
    'slug': 'arbitrum',
    'price_btc': 5.325666542074022e-05,
    'score': 0}},
  {'item': {'id': 'chromaway',
    'coin_id': 5000,
    'name': 'Chromia',
    'symbol': 'CHR',
    'market_cap_rank': 299,
    'thumb': 'https://assets.coingecko.com/coins/images/5000/thumb/Chromia.png?1559038018',
    'small': 'https://assets.coingecko.com/coins/images/5000/small/Chromia.png?1559038018',
    'large': 'https://assets.coingecko.com/coins/images/5000/large/Chromia.png?1559038018',
    'slug': 'chromia',
    'price_btc': 6.1431521

### Query API for historical prices

#### Test single execution

In [26]:
btc_prices = cg.get_coin_market_chart_range_by_id(id='bitcoin', 
                                                  vs_currency='usd', 
                                                  from_timestamp='1661990400', 
                                                  to_timestamp='1664582399')  # from September 1, 2022 to September 30, 2022

In [36]:
btc_prices.keys()

dict_keys(['prices', 'market_caps', 'total_volumes'])

In [37]:
len(btc_prices['prices'])

722

In [42]:
f'A single day has {int(722/30)} price entries on average'

'A single day has 24 price entries on average'
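The three keys returned above can be combined into one time-indexed table. The sketch below assumes each key holds a list of `[timestamp_in_ms, value]` pairs; the helper name `market_chart_to_frame` is hypothetical and the sample values are made up for illustration.

```python
import pandas as pd

def market_chart_to_frame(chart: dict) -> pd.DataFrame:
    """Merge 'prices', 'market_caps' and 'total_volumes' into one DataFrame."""
    columns = []
    for key in ("prices", "market_caps", "total_volumes"):
        # Assumed layout: a list of [timestamp_ms, value] pairs per key.
        pairs = pd.DataFrame(chart[key], columns=["timestamp_ms", key])
        columns.append(pairs.set_index("timestamp_ms")[key])
    frame = pd.concat(columns, axis=1)
    # Millisecond epoch timestamps become timezone-aware UTC datetimes.
    frame.index = pd.to_datetime(frame.index, unit="ms", utc=True)
    frame.index.name = "time"
    return frame

# Illustrative sample shaped like the response, not real market data.
sample = {
    "prices": [[1661990400000, 20000.0], [1661994000000, 20100.0]],
    "market_caps": [[1661990400000, 3.8e11], [1661994000000, 3.9e11]],
    "total_volumes": [[1661990400000, 2.5e10], [1661994000000, 2.6e10]],
}
print(market_chart_to_frame(sample))
```

In the notebook this would be called as `market_chart_to_frame(btc_prices)`.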

In [30]:
btc_coin_ohlc = cg.get_coin_ohlc_by_id(id='bitcoin',
                                       vs_currency='usd',
                                       days=1)

In [32]:
coin_ohlc_df = pd.DataFrame(
    btc_coin_ohlc, columns = ['Time', 'Open', 'High', 'Low', 'Close'])
coin_ohlc_df.head() 

|   | Time | Open | High | Low | Close |
|---|---|---|---|---|---|
| 0 | 1682168400000 | 27335.45 | 27364.24 | 27335.45 | 27364.24 |
| 1 | 1682170200000 | 27370.72 | 27378.83 | 27346.68 | 27346.68 |
| 2 | 1682172000000 | 27317.43 | 27337.31 | 27277.85 | 27277.85 |
| 3 | 1682173800000 | 27276.35 | 27298.31 | 27268.28 | 27298.31 |
| 4 | 1682175600000 | 27363.04 | 27423.62 | 27323.74 | 27423.62 |


In [40]:
f'A single day holds {len(coin_ohlc_df)} entries'

'A single day holds 48 entries'
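The 48 entries per day correspond to 30-minute candles, which can be checked by converting the `Time` column (epoch milliseconds) into datetimes. The sketch below uses the first three rows of the table above, so it runs without calling the API.

```python
import pandas as pd

# First three OHLC rows copied from the output above.
rows = [
    [1682168400000, 27335.45, 27364.24, 27335.45, 27364.24],
    [1682170200000, 27370.72, 27378.83, 27346.68, 27346.68],
    [1682172000000, 27317.43, 27337.31, 27277.85, 27277.85],
]
ohlc = pd.DataFrame(rows, columns=["Time", "Open", "High", "Low", "Close"])

# Convert epoch milliseconds to UTC datetimes and use them as the index.
ohlc["Time"] = pd.to_datetime(ohlc["Time"], unit="ms", utc=True)
ohlc = ohlc.set_index("Time")

# Spacing between consecutive candles.
spacing = ohlc.index.to_series().diff().dropna()
print(spacing.unique())
```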

#### Date ranges

In [7]:
from_date = '2020-03-01'  # let's discard stuff pre-COVID
to_date = '2023-04-15'

In [46]:
from_date_timestamp = time.mktime(datetime.datetime.strptime(from_date, "%Y-%m-%d").timetuple())
from_date_timestamp

1583017200.0

In [47]:
to_date_timestamp = time.mktime(datetime.datetime.strptime(to_date, "%Y-%m-%d").timetuple())
to_date_timestamp

1681509600.0
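`time.mktime` interprets the parsed date in the machine's local time zone, so the two values above depend on where the notebook runs. A time-zone-independent alternative, sketched here with a hypothetical helper `utc_timestamp`, attaches UTC explicitly before converting.

```python
from datetime import datetime, timezone

def utc_timestamp(date_str: str) -> int:
    """Return the Unix timestamp of midnight UTC for a 'YYYY-MM-DD' date string."""
    parsed = datetime.strptime(date_str, "%Y-%m-%d")
    # Attach UTC explicitly so the result does not depend on the local time zone.
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())

print(utc_timestamp("2020-03-01"))  # 1583020800
print(utc_timestamp("2023-04-15"))  # 1681516800
```

These differ from the `time.mktime` results above by one and two hours respectively, which is the local UTC offset in effect on each date when that cell was run.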

In [43]:
# Initialize an empty dictionary to store the historical data for each coin
# coin_data = {}

In [44]:
# coin_features = {}

In [45]:
# for coin in tqdm(all_coins_list):
#     coin_id = coin['id']
#     coin_history = cg.get_coin_market_chart_range_by_id(id=coin_id, 
#                                                         vs_currency='usd', 
#                                                         from_timestamp=from_date_timestamp, 
#                                                         to_timestamp=to_date_timestamp)
    
#     coin_data[coin_id] = coin_history
#     coin_details = cg.get_coin_by_id(coin_id)
    
#     coin_features[coin_id] = coin_details

In [18]:
# coin_features['1inch']

In [None]:
# Takes a lot of time to compute, let's switch to something smarter!
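One reason the commented-out loop is slow is that it issues two requests per coin for every id in `all_coins_list`, and a single failed request aborts the whole run. A possible mitigation, shown here only as a sketch, is a throttled loop with retries. The helper `fetch_all` and its parameters are hypothetical, and the pause lengths are placeholders rather than documented API limits.

```python
import time
from typing import Any, Callable, Dict, Iterable, List, Tuple

def fetch_all(
    ids: Iterable[str],
    fetch_one: Callable[[str], Any],
    pause_s: float = 1.5,
    max_retries: int = 3,
) -> Tuple[Dict[str, Any], List[str]]:
    """Call `fetch_one` for every id, pausing between calls and retrying failures."""
    results: Dict[str, Any] = {}
    failed: List[str] = []
    for coin_id in ids:
        for attempt in range(max_retries):
            try:
                results[coin_id] = fetch_one(coin_id)
                break
            except Exception:
                # Back off a little longer after each failed attempt.
                time.sleep(pause_s * 2 ** attempt)
        else:
            # All attempts failed: remember the id instead of aborting the run.
            failed.append(coin_id)
        time.sleep(pause_s)
    return results, failed

# Possible use with the client from this notebook (not executed here):
# coin_data, failed = fetch_all(
#     [coin['id'] for coin in all_coins_list],
#     lambda cid: cg.get_coin_market_chart_range_by_id(
#         id=cid, vs_currency='usd',
#         from_timestamp=from_date_timestamp, to_timestamp=to_date_timestamp),
# )
```

Keeping the failed ids separate makes it possible to re-run only the missing coins later.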

### Async query for prices

In [14]:
btc_defi_df = dft.geckoHistorical('bitcoin')

In [15]:
btc_defi_df

| date | price | market_caps | total_volumes |
|---|---|---|---|
| 2013-04-28 00:00:00 | 135.300000 | 1.500518e+09 | 0.000000e+00 |
| 2013-04-29 00:00:00 | 141.960000 | 1.575032e+09 | 0.000000e+00 |
| 2013-04-30 00:00:00 | 135.300000 | 1.501657e+09 | 0.000000e+00 |
| 2013-05-01 00:00:00 | 117.000000 | 1.298952e+09 | 0.000000e+00 |
| 2013-05-02 00:00:00 | 103.430000 | 1.148668e+09 | 0.000000e+00 |
| ... | ... | ... | ... |
| 2023-04-20 00:00:00 | 28833.217501 | 5.587965e+11 | 2.413662e+10 |
| 2023-04-21 00:00:00 | 28255.578249 | 5.461411e+11 | 2.169590e+10 |
| 2023-04-22 00:00:00 | 27300.157129 | 5.283573e+11 | 2.042033e+10 |
| 2023-04-23 00:00:00 | 27861.640663 | 5.394547e+11 | 1.170023e+10 |


### Store the results