In [1]:
import pandas as pd

<h2>Importing Data</h2>

The data was obtained from one of the authors of the following paper:

>Laura Alessandretti, Abeer ElBahrawy, Luca Maria Aiello, Andrea Baronchelli, "Anticipating Cryptocurrency Prices Using Machine Learning", Complexity, vol. 2018, Article ID 8983590, 16 pages, 2018. https://doi.org/10.1155/2018/8983590

In [5]:
data = pd.read_csv("./data/crypto-historical-data.csv")

In [6]:
data.head()

Unnamed: 0.1,Unnamed: 0,market cap,name,price,sym,time,volume
0,0,20461600.0,Viberate,0.114889,VIB,2018-04-01,4702470.0
1,1,19204400.0,Viberate,0.124845,VIB,2018-04-02,3688650.0
2,2,20825800.0,Viberate,0.133474,VIB,2018-04-03,3681530.0
3,3,22260000.0,Viberate,0.121755,VIB,2018-04-04,5583970.0
4,4,20086900.0,Viberate,0.118312,VIB,2018-04-05,2824800.0


<h2>Data Processing</h2>

The aim of this part is to create the following features:
1. Market Capitalization
2. Price
3. Rank: the rank of the cryptocurrency by market capitalization on a given day
4. Market Share: the ratio of the cryptocurrency's market capitalization to the total market capitalization of all cryptocurrencies on the same day
5. Volume
6. Age: the number of days since the cryptocurrency first traded on the market
7. ROI: return on investment, the gain on an investment relative to its cost. For cryptocurrency $c$ with price $P_{c, t}$ at time $t$, it is given by the formula below.

\begin{equation*}
ROI_{c, t} = \frac{P_{c, t} - P_{c, t-1}}{P_{c, t-1}}
\end{equation*}
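
As a rough illustration of the formula, the sketch below computes ROI on a small made-up table. The toy values and the `roi` column name are assumptions for the example; the `sym`, `time` and `price` columns mirror the dataset.

```python
import pandas as pd

# Made-up prices for two hypothetical coins (not taken from the dataset).
toy = pd.DataFrame({
    "sym": ["AAA", "AAA", "AAA", "BBB", "BBB"],
    "time": ["2018-04-01", "2018-04-02", "2018-04-03", "2018-04-01", "2018-04-02"],
    "price": [100.0, 110.0, 99.0, 2.0, 3.0],
})

# Put each coin's rows in chronological order before looking at the previous day.
toy = toy.sort_values(["sym", "time"]).reset_index(drop=True)

# P_{c, t-1}: the previous price of the same coin (NaN on a coin's first day).
prev_price = toy.groupby("sym")["price"].shift(1)

# ROI_{c, t} = (P_{c, t} - P_{c, t-1}) / P_{c, t-1}
toy["roi"] = (toy["price"] - prev_price) / prev_price
```

The first row of each coin has no previous price, so its ROI is left as NaN rather than set to zero.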


In [8]:
# Remove first index column
processed_data = data.copy()
processed_data = processed_data.iloc[:, 1:]

In [10]:
# Determine rank for each day (1 = largest market cap; ties share a rank)
processed_data['rank'] = processed_data.groupby("time")["market cap"].rank(method="dense", ascending=False)
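
To show what dense ranking does within each day, here is a minimal sketch on made-up numbers (the coins and market caps are hypothetical). Tied market caps receive the same rank, and the next rank follows without a gap.

```python
import pandas as pd

# Made-up market caps for three hypothetical coins over two days.
toy = pd.DataFrame({
    "time": ["2018-04-01"] * 3 + ["2018-04-02"] * 3,
    "sym": ["AAA", "BBB", "CCC"] * 2,
    "market cap": [300.0, 100.0, 300.0, 50.0, 70.0, 60.0],
})

# Rank within each day, largest market cap first.
toy["rank"] = toy.groupby("time")["market cap"].rank(method="dense", ascending=False)

print(toy)
```

On the first day, AAA and CCC tie for rank 1 and BBB gets rank 2, not 3.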

In [23]:
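# Determine total market cap per day (the denominator of market share)
# Rename before merging so the column does not clash with the per-coin 'market cap'
total_market_cap = processed_data.groupby("time")['market cap'].sum().rename("total market cap").reset_index()
processed_data = pd.merge(processed_data, total_market_cap, how='left', on='time')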
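
Market share, as defined in the feature list, is each coin's market cap divided by the total for that day. The following is a minimal sketch on made-up data, using `groupby(...).transform("sum")` as an alternative to the merge; the `market share` column name is an assumption.

```python
import pandas as pd

# Made-up market caps for two hypothetical coins over two days.
toy = pd.DataFrame({
    "time": ["2018-04-01", "2018-04-01", "2018-04-02", "2018-04-02"],
    "sym": ["AAA", "BBB", "AAA", "BBB"],
    "market cap": [75.0, 25.0, 60.0, 40.0],
})

# transform("sum") returns the daily total aligned to every original row,
# so no merge is needed.
daily_total = toy.groupby("time")["market cap"].transform("sum")

# Market share: coin's market cap over the total market cap on the same day.
toy["market share"] = toy["market cap"] / daily_total
```

With this approach the shares on each day sum to 1 by construction, which makes a convenient sanity check on the real data.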

In [17]:
processed_data.head()

Unnamed: 0,market cap,name,price,sym,time,volume,rank,total market cap
0,20461600.0,Viberate,0.114889,VIB,2018-04-01,4702470.0,250.0,
1,19204400.0,Viberate,0.124845,VIB,2018-04-02,3688650.0,260.0,
2,20825800.0,Viberate,0.133474,VIB,2018-04-03,3681530.0,253.0,
3,22260000.0,Viberate,0.121755,VIB,2018-04-04,5583970.0,250.0,
4,20086900.0,Viberate,0.118312,VIB,2018-04-05,2824800.0,251.0,
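
The Age feature from the list above could be sketched along the following lines. This assumes that a coin's earliest date in the table is its first trading day, which may not hold if the dataset starts after the coin was listed; the toy rows and the `age` column name are hypothetical.

```python
import pandas as pd

# Made-up observation dates for two hypothetical coins.
toy = pd.DataFrame({
    "sym": ["AAA", "AAA", "BBB", "BBB"],
    "time": ["2018-04-01", "2018-04-05", "2018-04-03", "2018-04-04"],
})

# Parse the date strings so they can be subtracted.
toy["time"] = pd.to_datetime(toy["time"])

# Earliest date seen for each coin, aligned to every row of that coin.
first_seen = toy.groupby("sym")["time"].transform("min")

# Age in whole days since the coin's first appearance (assumption: first row = first trade).
toy["age"] = (toy["time"] - first_seen).dt.days
```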
