# CoinMetrics Case Study

Objective: to evaluate skills and abilities across the following areas:
1. importing data
2. wrangling data
3. exploring data
4. analysis
5. modelling
6. communicating results

Provide:
1. A written explanation of how to approach the problem
2. The beginning phases of an implementation using Coin Metrics data

Of the four options made available in the case study, option 3 was chosen.

### Advocating for Coin Metrics' data

Produce quality research that is of value to potential clients (it doesn't have to be complete), with a particular focus on network data.

### Initial ideas

My first rough ideas were:
1. Comparing different Bitcoin-based chains (BTC, BCH, LTC, BSV) to test the influence of whales, and comparing the results with each chain's claim to be a store of value (SoV) or an alternative to cash.
2. Developing some of the research by Willy Woo, which I find particularly interesting, specifically:
    1. days destroyed,
    2. hodl waves,
    3. thermo cap,
    4. average cap.
 
I think the following ideas are also interesting and worth investigating, but not possible within the scope of this exercise:
 
1. Tracking the number of Twitter followers of various crypto-Twitter thought leaders and celebrities to test the hypothesis that *"an increase in follower numbers shows that new retail investors are entering crypto-markets, and an increase in price is expected soon"*.
 
 Thought leaders and crypto celebrities could be further grouped by the types of coins they speak about most: SoV, smart contracts, DeFi, etc.
 
 Weibo could be analysed alongside Twitter to understand Chinese markets, Korean Twitter could be analysed for the Korean retail market, and so on.

2. I have an existing side project that aims to use a recurrent neural net to predict BTC price movements. The app (model, stored data, data pipeline, visualization of results) will run autonomously on Google Cloud Platform. Candle data is consumed from CoinAPI.io and stored in BigQuery.

 Technical indicators will be calculated and used as additional factors in the model. Sentiment analysis of news outlets (Bloomberg, FT) would be added later.

 The model would be written using TensorFlow, and the BigQuery table names would use BQ's date-format capabilities, which should make the project faster and cheaper to run.

### 1. Testing the influence of whales on BTC forks and comparing it to each chain's claims, e.g. as a store of value or alternative to cash

If a country has a much lower median income than mean income, it probably has high income inequality. 

Similarly, if a chain has a much smaller median transaction size than mean transaction size, it is probably dominated by whales rather than used by regular users.

This would contradict any claims the fork makes to being a form of digital cash. 
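
To make the intuition concrete, here is a minimal sketch using made-up transfer sizes (the numbers are illustrative assumptions, not Coin Metrics data). One very large transfer pulls the mean far above the median, so the mean-to-median ratio can serve as a rough indicator of skew.

```python
from statistics import mean, median

# Hypothetical transfer sizes in USD: nine small payments and one whale-sized transfer.
transfers_usd = [5, 8, 10, 12, 15, 20, 25, 30, 40, 1_000_000]

mean_usd = mean(transfers_usd)
median_usd = median(transfers_usd)

# A ratio near 1 suggests transfers of similar size; a large ratio suggests
# that a few very large transfers dominate the total value moved.
skew_ratio = mean_usd / median_usd

print(f"mean={mean_usd:,.2f} median={median_usd:,.2f} ratio={skew_ratio:,.1f}")
```

The ratio is only a heuristic: it says the distribution is skewed, not who is behind the large transfers.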

In [2]:
# TxTfrValMeanUSD - The sum USD value of native units transferred divided by the count of transfers (i.e., the mean "size" in USD of a transfer) that interval.

# TxTfrValMedUSD - The median USD value transferred per transfer (i.e., the median "size" in USD of a transfer) that interval.

# TxTfrValUSD - The sum USD value of all native units transferred (i.e., the aggregate size in USD of all transfers) that interval.

# Chains: BTC, BCH, BSV, LTC, DOGE
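
Because the mean is defined as the sum divided by the count of transfers, these metrics also imply a transfer count: `TxTfrValUSD / TxTfrValMeanUSD`. The sketch below applies that to one BSV row (2018-11-19) taken from the `dataframes['bsv'].sample(20)` output in this notebook. `implied_transfer_count` is a hypothetical helper name used for illustration, not part of any Coin Metrics library.

```python
def implied_transfer_count(total_usd: float, mean_usd: float) -> float:
    """Back out the number of transfers from the sum and mean metrics.

    Follows from the definition: mean = sum / count, so count = sum / mean.
    """
    if mean_usd <= 0:
        raise ValueError("mean_usd must be positive")
    return total_usd / mean_usd


# BSV on 2018-11-19, values copied from the sample output in this notebook.
tx_tfr_val_usd = 24730557.26014584
tx_tfr_val_mean_usd = 14.44247478764306

count = implied_transfer_count(tx_tfr_val_usd, tx_tfr_val_mean_usd)
print(f"implied transfers: {count:,.0f}")
```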

In [66]:
import requests
import json

import pandas as pd

In [61]:
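def get_metricdata(asset_id, payload):
    """Fetch metric data for one asset from the Coin Metrics community API.

    Returns the parsed JSON response, or None if the request did not succeed.
    """
    url = f'https://community-api.coinmetrics.io/v2/assets/{asset_id}/metricdata'
    response = requests.get(
        url=url,
        params=payload,
        timeout=30,  # avoid hanging indefinitely on a stalled connection
    )

    if response.status_code == 200:
        print(f'{asset_id} - success!')
        return json.loads(response.content.decode('utf-8'))
    else:
        print(f'{asset_id} - status_code: {response.status_code}')
        return None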

In [62]:
payload = {
    'metrics':  'PriceUSD,'+
                'TxTfrValMeanUSD,'+
                'TxTfrValMedUSD,'+
                'TxTfrValUSD',
    'start': '2018-09-01',
}

asset_list = ['btc', 'ltc', 'bch', 'bsv', 'doge']
data = {}
for asset in asset_list:
    data[asset] = get_metricdata(asset, payload)

btc - success!
ltc - success!
bch - success!
bsv - success!
doge - success!


In [124]:
data.keys()

dict_keys(['btc', 'ltc', 'bch', 'bsv', 'doge'])

In [125]:
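dataframes = {}
cols = ['PriceUSD', 'TxTfrValMeanUSD', 'TxTfrValMedUSD', 'TxTfrValUSD']
for asset in data.keys():
    # Each entry in 'series' holds one timestamp and one value per requested metric.
    series = data[asset]['metricData']['series']
    values = [each['values'] for each in series]
    index = [each['time'] for each in series]

    df = pd.DataFrame.from_records(values, columns=cols)
    # Coerce to numeric in case the API returned the values as strings.
    df = df.apply(pd.to_numeric, errors='coerce')
    # Index by calendar date parsed from the timestamp strings.
    df.index = pd.to_datetime(index).date

    dataframes[asset] = df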
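
With one DataFrame per chain, the mean-versus-median comparison described earlier could be computed as an extra column. The following is a minimal sketch, not the analysis itself: `add_mean_median_ratio` and the `MeanMedianRatio` column are hypothetical names, and the three rows are BSV values copied (rounded) from the `dataframes['bsv'].sample(20)` output so the block runs without calling the API.

```python
import pandas as pd


def add_mean_median_ratio(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a MeanMedianRatio column (hypothetical name)."""
    out = df.copy()
    # Coerce in case the metric values arrived as strings.
    mean_usd = pd.to_numeric(out["TxTfrValMeanUSD"], errors="coerce")
    median_usd = pd.to_numeric(out["TxTfrValMedUSD"], errors="coerce")
    # Avoid dividing by zero: a non-positive median becomes NaN instead of inf.
    out["MeanMedianRatio"] = mean_usd / median_usd.where(median_usd > 0)
    return out


# Three BSV rows copied from the sample output in this notebook (values rounded).
bsv_sample = pd.DataFrame(
    {
        "TxTfrValMeanUSD": [14.4425, 5641.2442, 210.5440],
        "TxTfrValMedUSD": [0.0021881, 410.1804, 0.9475700],
    },
    index=pd.to_datetime(["2018-11-19", "2019-02-18", "2019-08-23"]).date,
)

result = add_mean_median_ratio(bsv_sample)
print(result["MeanMedianRatio"].round(1))
```

Applied to the real data, the same helper could be run over every entry, for example `{asset: add_mean_median_ratio(df) for asset, df in dataframes.items()}`, and the ratios then compared across chains.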

In [132]:
dataframes['bsv'].sample(20)

date,PriceUSD,TxTfrValMeanUSD,TxTfrValMedUSD,TxTfrValUSD
2019-05-28,122.08763368430292,5632.949328400788,0.0688769594193363,252553283.13884935
2018-11-19,61.035951584272745,14.44247478764306,0.0021881388642961,24730557.26014584
2019-02-18,67.63729227663508,5641.244216931395,410.1803697294224,144968693.886703
2019-01-22,75.53427260030611,9088.983412861171,0.000755342726003,91671486.70211805
2019-07-09,204.7523694682079,1849.8307323628744,0.0327726642570813,191790450.36170873
2019-02-01,64.05142613902203,7080.53702680855,0.6405142613902203,37965839.53774729
2018-12-23,106.9936690615481,47955.55864171543,3.397604824814927,191438590.09772775
2019-08-23,135.367137927592,210.54398729948292,0.947569965493144,26559492.325366028
2019-03-13,64.39860281528094,4734.27351323925,0.0515188822522247,41997740.33594539
2019-06-22,238.59589105252016,7506.694096529635,23.859589105252017,482627883.5231751
