<h2>CryptoCompare API: An Introduction</h2>

While there are several websites out there that provide public API access to cryptocurrency data, I found CryptoCompare's API the easiest to work with for this project. You can use this API to download and keep up to date with all sorts of interesting data on thousands of cryptocurrencies and most of the popular exchanges. I've included a link to their API documentation <a href="https://www.cryptocompare.com/api/">here</a> and also at the end of this post.

For this post, however, I will focus on getting Open-High-Low-Close (OHLC) price data and then creating simple interactive plots with Plotly. Before we dive into writing some functions, I first want to introduce you to the API itself and how we can send requests to it to get data. Start by importing the necessary libraries; I will explain their use throughout the post.

In [1]:
'''import libraries to work with the data'''
import numpy as np
import pandas as pd
import os
import requests
import pickle
import datetime

'''import plotly packages for visualization'''
from plotly import tools
import plotly.offline as py
import plotly.graph_objs as go
import plotly.figure_factory as ff
py.init_notebook_mode(connected=True)

The <i>requests</i> library is going to help us send HTTP requests to the API in order to get the data. Check out the example below of a GET request to the API. Of all the different methods provided by the API, we will be using the <i>HistoDay</i> method, which returns historical OHLC data for a given coin pair; in our case that pair is BTC to USD.

In [2]:
myData = requests.get("https://min-api.cryptocompare.com/data/histoday?fsym=BTC&tsym=USD&limit=1&aggregate=1&e=CCCAGG")
myData

<Response [200]>

In [3]:
myData.json()

{'Aggregated': False,
 'ConversionType': {'conversionSymbol': '', 'type': 'direct'},
 'Data': [{'close': 8218.05,
   'high': 9400.99,
   'low': 7889.83,
   'open': 9251.27,
   'time': 1517702400,
   'volumefrom': 164609.06,
   'volumeto': 1413207410.82},
  {'close': 6954.47,
   'high': 8391.29,
   'low': 6930.13,
   'open': 8218.05,
   'time': 1517788800,
   'volumefrom': 208341.25,
   'volumeto': 1596635224.44}],
 'FirstValueInArray': True,
 'Response': 'Success',
 'TimeFrom': 1517702400,
 'TimeTo': 1517788800,
 'Type': 100}

Okay, so let's make sense of what we did here. The URL within the GET request has some parameters that are passed to the API, and it is crucial that these parameters are entered correctly. The <i>fsym</i> parameter (from symbol) is BTC (Bitcoin), while <i>tsym</i> (to symbol) is USD. <i>aggregate</i> is the number of days each OHLC data point is aggregated over, which is 1 here, <i>limit</i> controls how many data points are returned, and <i>e</i> is the exchange, here CCCAGG. We asked for a limit of 1; note that the response above actually contains two daily rows, which is why the helper function further down drops the first row when it updates the cache. The response to this request, <i>&lt;Response [200]&gt;</i>, means that we successfully communicated with the API. But it is important to confirm this by looking at the JSON data that is returned. If the value of the 'Response' key is 'Success', then all is well and you can see the requested data under the 'Data' key.
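
As a minimal sketch of the same request built programmatically, the snippet below assembles the query string with the standard library and checks the 'Response' key before touching 'Data'. The helper names <i>build_histoday_url()</i> and <i>extract_rows()</i> are hypothetical and exist only for illustration; the parameter names and response keys are the ones seen in the example above.

```python
from urllib.parse import urlencode

# Endpoint taken from the GET request shown above.
BASE_URL = "https://min-api.cryptocompare.com/data/histoday"


def build_histoday_url(fsym, tsym="USD", limit=1, aggregate=1, exchange="CCCAGG"):
    """Hypothetical helper: assemble the HistoDay URL from its query parameters."""
    params = {
        "fsym": fsym.upper(),
        "tsym": tsym.upper(),
        "limit": limit,
        "aggregate": aggregate,
        "e": exchange,
    }
    return BASE_URL + "?" + urlencode(params)


def extract_rows(payload):
    """Hypothetical helper: return the 'Data' rows, or raise if the API reported a failure."""
    if payload.get("Response") != "Success":
        raise ValueError(payload.get("Message", "request was not successful"))
    return payload["Data"]


url = build_histoday_url("btc")

# A trimmed-down copy of the JSON structure shown above, used here instead of a live request.
sample = {"Response": "Success", "Data": [{"time": 1517702400, "close": 8218.05}]}
rows = extract_rows(sample)
```

With a live request you would pass <i>requests.get(url).json()</i> to <i>extract_rows()</i> instead of the sample dictionary.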

<h2>Define Helper Functions</h2>

The first helper function, <i>cryptocompare_data()</i>, will download or update data depending on whether a cached version of the data is available. The parameters of the <i>HistoDay</i> API method are passed as arguments to this function and are then used to customize the URL (and ultimately the data download) based on the user's wishes. Note that this function will only update the data if at least one full day has passed since the last cached row.
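
The "at least one full day" rule can be sketched in isolation. The snippet below converts a Unix <i>time</i> value from the response into a datetime and counts the whole days elapsed since that row. The helper names are hypothetical, and using UTC explicitly is an assumption of this sketch; calling <i>fromtimestamp()</i> without a timezone returns local time instead.

```python
import datetime

UTC = datetime.timezone.utc


def to_utc_datetime(unix_seconds):
    """Hypothetical helper: convert a Unix timestamp in seconds to an aware UTC datetime."""
    return datetime.datetime.fromtimestamp(unix_seconds, tz=UTC)


def full_days_since(last_timestamp, now=None):
    """Hypothetical helper: whole days elapsed between the last cached row and now."""
    if now is None:
        now = datetime.datetime.now(tz=UTC)
    return (now - last_timestamp).days


# 1517788800 is the second 'time' value in the JSON response shown earlier.
last_row = to_utc_datetime(1517788800)

# Two fixed "now" values so the outcome does not depend on when this is run.
same_day = datetime.datetime(2018, 2, 5, 18, 0, tzinfo=UTC)
two_days_later = datetime.datetime(2018, 2, 7, 6, 0, tzinfo=UTC)

needs_update_same_day = full_days_since(last_row, same_day) > 0
needs_update_later = full_days_since(last_row, two_days_later) > 0
```

Because <i>timedelta.days</i> only counts complete 24-hour periods, a cache that is 18 hours old is treated as up to date, while one that is 2 days and 6 hours old asks for 2 days of data.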

In [4]:
def cryptocompare_data(symbol, convert_symbol = 'USD', limit= 1, aggregate= 1, exchange= 'CCCAGG'):
    
    '''this helper function will download data if not already downloaded and will update it if an earlier outdated 
       download is already available'''
    
    cache_path = '{}.pkl'.format(symbol.upper()+'_'+convert_symbol.upper())
    try:
        '''pickle module is used to load the locally cached data; the with block closes the file afterwards'''
        with open(cache_path, 'rb') as f:
            df = pickle.load(f)
        print('Loaded {} from cache'.format(symbol.upper()+'_'+convert_symbol.upper()))
        
        '''the outer if block will check to see if the last day in the downloaded file is the same as now. If not, 
            it assigns the new limit as the number of days since the last update and allData is set to false.'''
        
        if (datetime.datetime.now()-df['timestamp'].iloc[-1]).days > 0:
            limit = (datetime.datetime.now()-df['timestamp'].iloc[-1]).days
            print('{} day(s) since last update'.format(limit))
            print('Updating now...')
            url = "https://min-api.cryptocompare.com/data/histoday?fsym={}&tsym={}&limit={}&aggregate={}&e={}&allData=false"\
            .format(symbol.upper(), convert_symbol.upper(), limit, aggregate, exchange)
            json_dump2 = requests.get(url).json()
            
            '''this inner if block will run only if the update request was successful in downloading data. Then it 
            keeps data only for those days since the last update. We also drop the first row of the dataframe due to 
            an extra row being downloaded. We also concat the new DataFrame to the one with the last update (remember 
            to ignore the index so that a new index is formed). It then caches the updated data locally.'''
            
            if json_dump2['Response'] == 'Success':
                df2 = pd.DataFrame(json_dump2['Data'])
                df2.drop(df2.index[0], inplace = True)
                df2['timestamp'] = [datetime.datetime.fromtimestamp(d) for d in df2['time']]
                updated_df = pd.concat([df, df2], ignore_index = True)
                updated_df.to_pickle(cache_path)
                print('{} successfully updated on {}\n'.format(symbol.upper()+'_'+convert_symbol.upper(),\
                                                             datetime.datetime.now()))
                
                return updated_df
            
            else:
                print('Error while updating {} data from CryptoCompare.\n Message: {}'.format(symbol.upper(),\
                                                                                         json_dump2['Message']))
                '''fall back to the cached data so the caller still receives a DataFrame'''
                return df
            
        else:
            '''displays the timestamp for the last available OHLC price from the CryptoCompare API'''
            print('Data is up-to-date as of {}\n'.format(df['timestamp'].iloc[-1]))
            return df
            
    except(OSError, IOError) as e:
        print('Downloading {} data from {} exchange...'.format(symbol.upper(), exchange))
        url = "https://min-api.cryptocompare.com/data/histoday?fsym={}&tsym={}&limit={}&aggregate={}&e={}&allData=true"\
           .format(symbol.upper(), convert_symbol.upper(), limit, aggregate, exchange)
        
        json_dump = requests.get(url).json()
        if json_dump['Response'] == 'Success':
            df = pd.DataFrame(json_dump['Data'])
            df['timestamp'] = [datetime.datetime.fromtimestamp(d) for d in df.time]
            df.to_pickle(cache_path)
            print('Cached {} data from {} exchange at {}\n'\
                  .format(symbol.upper(), exchange, cache_path))
            return df
        
        else:
            print('Error in downloading {} data from CryptoCompare.\n Message: {}'.format(symbol.upper(),\
                                                                                         json_dump['Message']))

The <i>downloader()</i> function uses the <i>cryptocompare_data()</i> function to download/update the data for a given list of coins that the User passes to it. The User can also specify what month and year the OHLC data should begin from. The function returns a dictionary of DataFrames for each coin.
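
The month and year filter amounts to parsing a start date and keeping the rows on or after it. Here is a minimal sketch of that step on plain Python data rather than DataFrames. The names <i>parse_start()</i> and <i>rows_from()</i> are hypothetical, the close values are dummy numbers, and the <i>%b</i> directive assumes an English locale for month abbreviations.

```python
import datetime


def parse_start(month_onward="JAN", year_onward="2015"):
    """Hypothetical helper: build the first day of the given month and year."""
    text = "01{}{}".format(month_onward.upper(), year_onward)
    return datetime.datetime.strptime(text, "%d%b%Y")


def rows_from(rows, date_start):
    """Hypothetical helper: keep only rows whose timestamp is on or after date_start."""
    return [row for row in rows if row["timestamp"] >= date_start]


date_start = parse_start("jun", "2017")

# Dummy rows (not real prices) straddling the start date.
rows = [
    {"timestamp": datetime.datetime(2017, 5, 31), "close": 1.0},
    {"timestamp": datetime.datetime(2017, 6, 1), "close": 2.0},
    {"timestamp": datetime.datetime(2017, 6, 2), "close": 3.0},
]
kept = rows_from(rows, date_start)
```

The comparison is inclusive, so a row stamped exactly on the first of the month is kept.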

In [5]:
def downloader(symbol_list, month_onward= 'JAN', year_onward= '2015'):
    
    '''this helper function will call cryptocompare_data() to download data for coin symbols that you pass as a list. 
       Also note that by default it will return historical price data for the month-year JAN-2015 onward. You can change this by
       passing different month and year string values''' 
    
    coin_store = {}
    '''date_start is the date from which you want to filter the data'''
    date_start = datetime.datetime.strptime("01{}{}".format(month_onward.upper(), year_onward), "%d%b%Y")
    
    '''to make it easier to visualize close prices downstream in our analysis, we rename the close price 
       column in each DataFrame to the symbol of the coin'''
    for symbol in symbol_list:
        coin_store[symbol] = cryptocompare_data(symbol)[['timestamp', 'open', 'high', 'low', 'close', 'volumefrom']]\
                                            .rename(columns = {'close':symbol})
            
    for symbol in symbol_list:
        coin_store[symbol] = coin_store[symbol][coin_store[symbol].timestamp >= date_start]
    
    return coin_store
        

The <i>line_plotter()</i> function abstracts away the lengthy code that is required to build an interactive multi-line plot of the close prices. In a nutshell, we create a list of 'traces', one for each coin's price data, and then pass it to Plotly's plotting functions to create the plot. You can read more about the multi-line plot <a href= "https://plot.ly/python/line-charts/">here</a> on Plotly's website. To account for the huge difference in the price of different cryptocurrencies, the plot here uses a logarithmic axis to represent the prices.
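
To make the "list of traces" idea concrete, here is a stripped-down sketch that builds the same kind of structure from plain dictionaries, which Plotly also accepts as a figure specification. The function name <i>build_figure()</i>, the coin symbols, and the prices are all made up for illustration; only the overall shape (one line trace per coin, logarithmic y-axis) mirrors the approach described here.

```python
def build_figure(coin_store, title="Close prices"):
    """Hypothetical helper: one line trace per coin plus a log-scaled y-axis."""
    data = []
    for symbol, series in coin_store.items():
        data.append({
            "type": "scatter",
            "mode": "lines",
            "name": symbol,
            "x": series["timestamp"],
            "y": series["close"],
        })
    layout = {"title": title, "yaxis": {"type": "log", "autorange": True}}
    return {"data": data, "layout": layout}


# Made-up numbers several orders of magnitude apart, to show why a log axis helps:
# on a linear axis the second series would be squashed flat against zero.
coin_store = {
    "AAA": {"timestamp": ["2018-02-04", "2018-02-05"], "close": [8000.0, 7000.0]},
    "BBB": {"timestamp": ["2018-02-04", "2018-02-05"], "close": [0.9, 0.7]},
}
fig = build_figure(coin_store)
```

Passing a dictionary like <i>fig</i> to <i>py.plot()</i> should render it in much the same way as a figure built from graph objects.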

In [6]:
def line_plotter(coin_store, title= 'Cryptocurrency Historical Daily Prices', x_label= 'Date', y_label= 'Close Price (USD)'):
    
    '''this helper function makes it easier to use the plotly library to plot a line graph to visualize
       the Cryptocurrency data that we have stored'''
    
    data = []
    '''the for loop below is used to create a list of traces, one for each coin's pricing data'''
    for symbol in list(coin_store.keys()):
        trace= go.Scatter(
            x= coin_store[symbol]['timestamp'],
            y= coin_store[symbol][symbol],
            mode= 'lines',
            name= symbol)
        data.append(trace)
        
    layout = dict(width= 1000, height= 800, 
              title= title,
              xaxis= dict(title= x_label,
                          rangeselector= dict(
                                                buttons= list([
                                                    dict(count= 1,
                                                         label= '1m',
                                                         step= 'month',
                                                         stepmode= 'backward'),
                                                    dict(count= 6,
                                                         label= '6m',
                                                         step= 'month',
                                                         stepmode= 'backward'),
                                                    dict(count= 1,
                                                        label= '1y',
                                                        step= 'year',
                                                        stepmode= 'backward'),
                                                    dict(step= 'all')
                                                ])
                                            ),
                                            rangeslider= dict(),
                                            type= 'date'
                          ),
              yaxis= dict(title= y_label, type= 'log', autorange= True),
              )
    fig = dict(data= data, layout= layout)
    py.plot(fig, filename= 'close_prices.html')

The <i>ohlc_chart()</i> function also abstracts away some of the code that is required to create an OHLC chart for a given cryptocurrency. Note that I've modified this to include a plot of the daily volume of the currency traded. This gives us a better idea of what was going on in the market.

In [7]:
def ohlc_chart(coin_store, symbol):
    
    '''this helper function is used for creating an OHLC chart for a given coin symbol'''
    
    trace_ohlc = go.Ohlc(x= coin_store[symbol.upper()].timestamp,
                open= coin_store[symbol.upper()].open,
                high= coin_store[symbol.upper()].high,
                low= coin_store[symbol.upper()].low,
                close= coin_store[symbol.upper()][symbol.upper()])
    
    trace_line = go.Scatter(           
            x= coin_store[symbol.upper()]['timestamp'],
            y= coin_store[symbol.upper()]['volumefrom'],
            mode= 'lines',
            name= 'Volume'
                        )

    fig = tools.make_subplots(rows= 2, cols= 1, 
                              specs= [[{}], [{}]],
                              shared_xaxes= False, 
                              shared_yaxes= False,
                              subplot_titles= ('{} OHLC Chart'.format(symbol.upper()),\
                                               '{} Daily Trade Volume'.format(symbol.upper()))
                             )
    
    
    fig.append_trace(trace_ohlc, 1, 1)
    fig.append_trace(trace_line, 2, 1)

    fig['layout']['xaxis1'].update(title= 'Date')
    fig['layout']['xaxis2'].update(title= 'Date')
    fig['layout']['yaxis1'].update(title= 'Prices (USD)')
    fig['layout']['yaxis2'].update(title= 'Volume')
    
    fig['layout'].update(width= 1000, height= 700, title='')

    py.plot(fig, filename='olhc_line_subplot.html')

**----------------------------------------------------------------------------------------------------------------------**

<h2>Data Caching And Plotting</h2>

Now that we've written all our helper functions, let's download some data. Specifically, let's download data for some of the most popular cryptocurrencies being traded right now by passing their symbols in a list. Also note that I have filtered the data to June 2017 onward. Depending on whether you had previously cached or updated the data, the <i>downloader()</i> function will display the appropriate messages.

In [8]:
'''use the downloader function to download or update data'''
store = downloader(['BTC','ETH', 'XRP', 'BCH', 'ADA', 'LTC', 'NEO', 'XEM', 'DASH', 'XMR'], 'JUN','2017')

Downloading BTC data from CCCAGG exchange...
Cached BTC data from CCCAGG exchange at BTC_USD.pkl

Downloading ETH data from CCCAGG exchange...
Cached ETH data from CCCAGG exchange at ETH_USD.pkl

Downloading XRP data from CCCAGG exchange...
Cached XRP data from CCCAGG exchange at XRP_USD.pkl

Downloading BCH data from CCCAGG exchange...
Cached BCH data from CCCAGG exchange at BCH_USD.pkl

Downloading ADA data from CCCAGG exchange...
Cached ADA data from CCCAGG exchange at ADA_USD.pkl

Downloading LTC data from CCCAGG exchange...
Cached LTC data from CCCAGG exchange at LTC_USD.pkl

Downloading NEO data from CCCAGG exchange...
Cached NEO data from CCCAGG exchange at NEO_USD.pkl

Downloading XEM data from CCCAGG exchange...
Cached XEM data from CCCAGG exchange at XEM_USD.pkl

Downloading DASH data from CCCAGG exchange...
Cached DASH data from CCCAGG exchange at DASH_USD.pkl

Downloading XMR data from CCCAGG exchange...
Cached XMR data from CCCAGG exchange at XMR_USD.pkl



<h2>Visualizing Close Prices</h2>

In [9]:
'''plot the close prices'''
line_plotter(store, 'Cryptocurrency Daily Price Data')

It is interesting to see how the coins roughly form groups, with Bitcoin (BTC) high up there alone while NEM (XEM) and Ripple (XRP) sit together at the lower end of the price spectrum. Also note how the price trends seem to move in sync towards the latter half of the plot. Although this is purely an observation, it will be very interesting to look at the results of a correlation analysis on all these coins.

<h2>Creating An OHLC Chart</h2>

In [11]:
'''plot an OHLC chart for any coin'''
ohlc_chart(store, 'XRP')

This is the format of your plot grid:
[ (1,1) x1,y1 ]
[ (2,1) x2,y2 ]



To get a more thorough look at the OHLC price data of a single coin, use the <i>ohlc_chart()</i> function. I've modified it to include a plot of the daily traded volume, which gives you an idea of how the trading volume and the prices move over the same timeline.

This brings me to the end of this post. It started out as a curiosity about cryptocurrency mania, and the first thing I wanted to do was to get data about these coins and visualize it. I am sure to write more posts based on this one, specifically to test hypotheses and come to some conclusions. Let me know what you think in the comments below!

<h2>Inspiration</h2>

- CryptoCompare API: https://www.cryptocompare.com/api/
- CCCAGG Index: https://www.cryptocompare.com/coins/guides/how-does-our-cryptocurrecy-index-work/
- Alex Galea's Blog: https://medium.com/@agalea91/cryptocompare-api-quick-start-guide-ca4430a484d4
- Patrick Triest's Blog: https://blog.patricktriest.com/analyzing-cryptocurrencies-python/
- Plotly For Python: https://plot.ly/python/