# Preparing Bitcoin Data

The bitcoin data is taken from bitcoincharts.com (https://bitcoincharts.com/).
It represents data from the exchange Bitstamp (https://www.bitstamp.net/), one of the most liquid bitcoin exchanges in the world.

Import the required libraries

In [1]:
import numpy as np
import pandas as pd

Import the CSV file (bitcoin_price_USD_bitstamp.csv) containing the bitcoin data

In [2]:
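# DataFrame.from_csv was removed in pandas 1.0; read_csv with these arguments reproduces its defaults
bitcoin_price_daily_raw = pd.read_csv("bitcoin_price_USD_bitstamp.csv", index_col=0, parse_dates=True)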

In [3]:
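# Reverse the row order so the data runs from the oldest day to the newest
bitcoin_price_daily_raw = bitcoin_price_daily_raw.iloc[::-1]
# Move Date from the index into a regular column and number the rows from 0
bitcoin_price_daily_raw = bitcoin_price_daily_raw.reset_index()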

In [4]:
bitcoin_price_daily_raw.head()

Unnamed: 0,Date,Open,High,Low,Close,Volume (BTC),Volume (Currency),Weighted Price
0,2011-09-13,5.8,6.0,5.65,5.97,58.371382,346.097389,5.929231
1,2011-09-14,5.58,5.72,5.52,5.53,61.145984,341.854813,5.590798
2,2011-09-15,5.12,5.24,5.0,5.13,80.140795,408.259002,5.094272
3,2011-09-16,4.82,4.87,4.8,4.85,39.914007,193.763147,4.854515
4,2011-09-17,4.87,4.87,4.87,4.87,0.3,1.461,4.87


# 1. Price Data

## 1.1. Preparing Bitcoin Price Data for Google search volume

### 1.1.1 Delete unused rows
My time horizon for the analysis is 04/14/2013 - 04/14/2018, so I delete all other rows and reset the index

In [5]:
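# Rows 0-578 lie before the start of the window (row 579 is 2013-04-14)
dates_to_delete = list(range(0, 579))
# Rows 2406-2410 lie after the end of the window
dates_to_delete_2 = list(range(2406, 2411))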
bitcoin_price_daily_adjusted_rows_1 = bitcoin_price_daily_raw.drop(dates_to_delete)
bitcoin_price_daily_adjusted_rows = bitcoin_price_daily_adjusted_rows_1.drop(dates_to_delete_2)
bitcoin_price_daily_adjusted_rows = bitcoin_price_daily_adjusted_rows.reset_index(drop=True)
bitcoin_price_daily_adjusted_rows.head()

Unnamed: 0,Date,Open,High,Low,Close,Volume (BTC),Volume (Currency),Weighted Price
0,2013-04-14,90.44,99.99,81.94,91.11,5777.573331,549492.1,95.107761
1,2013-04-15,91.79,99.8,60.0,74.0,19578.128716,1685980.0,86.115488
2,2013-04-16,73.99,82.0,49.8,68.09,35706.586294,2335121.0,65.397494
3,2013-04-17,67.85,101.12,60.0,89.98,30854.424119,2607702.0,84.516292
4,2013-04-18,89.98,111.23,78.92,109.3,16527.459717,1599321.0,96.7675
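
The row positions used above (579 and 2406) are tied to this particular download of the CSV file. As a hedged alternative, the same window could be selected by date instead. The sketch below uses a small made-up frame (`example_raw`, `start`, and `end` are hypothetical names, and the prices are placeholders) and assumes the `Date` column holds datetimes.

```python
import pandas as pd

# Hypothetical stand-in for the raw daily frame: one row per day, placeholder prices.
example_raw = pd.DataFrame({
    "Date": pd.date_range("2013-04-12", periods=6, freq="D"),
    "Close": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

# Window boundaries (both inclusive); shortened here to fit the toy data.
start = pd.Timestamp("2013-04-14")
end = pd.Timestamp("2013-04-16")

# Boolean mask that is True for rows whose date lies inside the window.
mask = example_raw["Date"].between(start, end)

# Keep only those rows and renumber them from 0.
example_window = example_raw.loc[mask].reset_index(drop=True)

print(example_window)
```

Selecting by date keeps working if rows are added to or removed from the source file, whereas positional indices would then point at different days.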


### 1.1.2 Delete unused columns
I am interested in the daily closing prices for bitcoin, so I delete the other columns

In [6]:
bitcoin_price_daily_finished = bitcoin_price_daily_adjusted_rows.drop(["Open", "High", "Low", "Volume (BTC)", "Volume (Currency)", "Weighted Price"], axis=1)
bitcoin_price_daily_finished = bitcoin_price_daily_finished.rename(columns={"Close" : "Daily Closing Price"})

### 1.1.3 Bitcoin Daily Price

I arrive at a dataframe with the Daily Closing Price

In [7]:
bitcoin_price_daily_finished.head()

Unnamed: 0,Date,Daily Closing Price
0,2013-04-14,91.11
1,2013-04-15,74.0
2,2013-04-16,68.09
3,2013-04-17,89.98
4,2013-04-18,109.3


### 1.1.4 Calculate Bitcoin Weekly Average Price

For the analysis with the Google Trends search data I need to calculate the weekly average prices from the daily prices

In [8]:
bitcoin_price_daily_finished['Weekly Average Price'] = bitcoin_price_daily_finished['Daily Closing Price'].groupby(np.arange(len(bitcoin_price_daily_finished)) // 7).transform('mean')
bitcoin_price_finished = bitcoin_price_daily_finished

I arrive at a dataframe with the Daily Closing Prices and Weekly Average Prices

In [9]:
bitcoin_price_finished.head(n=8)

Unnamed: 0,Date,Daily Closing Price,Weekly Average Price
0,2013-04-14,91.11,96.677143
1,2013-04-15,74.0,96.677143
2,2013-04-16,68.09,96.677143
3,2013-04-17,89.98,96.677143
4,2013-04-18,109.3,96.677143
5,2013-04-19,117.71,96.677143
6,2013-04-20,126.55,96.677143
7,2013-04-21,118.81,133.834286
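
To make the grouping step easier to follow, here is a minimal sketch that reproduces it on the eight closing prices shown in the table above. The names `closes`, `block`, and `weekly_mean` are introduced only for this illustration. Because the toy series stops after the first day of the second week, the mean of that second block is just that single price, not the full weekly value from the table.

```python
import numpy as np
import pandas as pd

# The eight daily closing prices from the table above.
closes = pd.Series([91.11, 74.0, 68.09, 89.98, 109.3, 117.71, 126.55, 118.81])

# Integer division of the row position by 7 gives each row a block label:
# positions 0-6 get label 0, position 7 gets label 1.
block = np.arange(len(closes)) // 7

# transform("mean") returns one value per row, repeating the mean of the row's block.
weekly_mean = closes.groupby(block).transform("mean")

print(block.tolist())
print(weekly_mean.round(6).tolist())
```

The blocks are formed purely by row position, so this approach assumes one row per calendar day with no missing days.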


### 1.1.5 Create Dataframe with only the Average Weekly Price

I create a new dataframe with only every 7th row and I delete the "Daily Closing Price" column

In [10]:
bitcoin_price_weekly_finished = bitcoin_price_finished[bitcoin_price_finished.index % 7 == 0]
bitcoin_price_weekly_finished = bitcoin_price_weekly_finished.reset_index(drop=True)
bitcoin_price_weekly_finished = bitcoin_price_weekly_finished.drop(["Daily Closing Price"], axis=1)
bitcoin_price_weekly_finished.head()

Unnamed: 0,Date,Weekly Average Price
0,2013-04-14,96.677143
1,2013-04-21,133.834286
2,2013-04-28,121.818571
3,2013-05-05,113.905714
4,2013-05-12,113.572857
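
Keeping every 7th row only yields week starts if the daily data has no missing days. A quick way to check that assumption is sketched below on made-up dates with one deliberate gap; `dates`, `gaps`, `is_consecutive`, and `after_gap` are hypothetical names used only here.

```python
import pandas as pd

# Hypothetical daily dates with one missing day (2013-04-17).
dates = pd.Series(pd.to_datetime([
    "2013-04-14", "2013-04-15", "2013-04-16", "2013-04-18", "2013-04-19",
]))

# Difference between each date and the previous one; the first entry is NaT and is dropped.
gaps = dates.diff().dropna()

# True only if every step between neighbouring rows is exactly one day.
is_consecutive = gaps.eq(pd.Timedelta(days=1)).all()

# Dates that directly follow a gap, which helps locate the problem.
after_gap = dates[dates.diff() > pd.Timedelta(days=1)]

print(is_consecutive)
print(after_gap.tolist())
```

If the check fails on real data, position-based blocks of 7 rows would drift away from calendar weeks after the first gap.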


### 1.1.6 Save to CSV for Usage in Data Analysis:

Save the dataframe to a CSV file ("Bitcoin Weekly Price Data.csv") for use in the data analysis (see folder: Data Analysis). The line is commented out so that it is not executed (and no CSV file is created) every time the notebook is run.

In [11]:
#bitcoin_price_weekly_finished.to_csv("Bitcoin Weekly Price Data.csv")
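
An alternative to commenting the line out is to guard the write with a flag. The sketch below is one possible way to do that, not part of the original notebook: `save_if_enabled` and `example_weekly` are hypothetical names, and the demo writes into a temporary directory so that it leaves no file behind.

```python
import os
import tempfile

import pandas as pd


def save_if_enabled(df, path, enabled=False):
    """Write df to path only when enabled is True; return whether a file was written."""
    if enabled:
        df.to_csv(path)
    return enabled


# Two rows from the weekly table above, used only to demonstrate the call.
example_weekly = pd.DataFrame({
    "Date": ["2013-04-14", "2013-04-21"],
    "Weekly Average Price": [96.677143, 133.834286],
})

# Use a temporary directory so the demonstration does not create a lasting file.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "Bitcoin Weekly Price Data.csv")
    wrote_default = save_if_enabled(example_weekly, target)        # flag off: nothing written
    exists_after_default = os.path.exists(target)
    wrote_enabled = save_if_enabled(example_weekly, target, True)  # flag on: file written
    exists_after_enabled = os.path.exists(target)

print(wrote_default, exists_after_default, wrote_enabled, exists_after_enabled)
```

With the default `enabled=False`, re-running the notebook has the same effect as the commented-out line: no file is created.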

## 1.2. Preparing Bitcoin Price Data for Wikipedia

### 1.2.1 Delete unused rows
My time horizon for the analysis is 07/01/2015 - 04/14/2018, so I delete all other rows and reset the index

In [12]:
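# Rows 0-1386 lie before the start of the window (row 1387 is 2015-07-01)
dates_to_delete_wiki = list(range(0, 1387))
# Rows 2406-2410 lie after the end of the window
dates_to_delete_wiki_2 = list(range(2406, 2411))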
bitcoin_price_daily_adjusted_rows_wiki_1 = bitcoin_price_daily_raw.drop(dates_to_delete_wiki)
bitcoin_price_daily_adjusted_rows_wiki = bitcoin_price_daily_adjusted_rows_wiki_1.drop(dates_to_delete_wiki_2)
bitcoin_price_daily_adjusted_rows_wiki = bitcoin_price_daily_adjusted_rows_wiki.reset_index(drop=True)
bitcoin_price_daily_adjusted_rows_wiki.head()

Unnamed: 0,Date,Open,High,Low,Close,Volume (BTC),Volume (Currency),Weighted Price
0,2015-07-01,262.58,265.25,253.81,257.62,13395.153439,3453026.0,257.781733
1,2015-07-02,257.62,260.77,253.19,254.54,9826.994781,2515779.0,256.006904
2,2015-07-03,254.49,256.44,252.4,255.92,9153.852565,2331332.0,254.68315
3,2015-07-04,256.06,261.28,254.05,260.2,7909.856729,2037201.0,257.552232
4,2015-07-05,260.16,274.74,258.75,271.5,21362.61365,5689203.0,266.315871


### 1.2.2 Delete unused columns
I am only interested in the daily closing prices for bitcoin, so I delete the other columns

In [13]:
bitcoin_price_wiki_finished = bitcoin_price_daily_adjusted_rows_wiki.drop(["Open", "High", "Low", "Volume (BTC)", "Volume (Currency)", "Weighted Price"], axis=1)
bitcoin_price_wiki_finished = bitcoin_price_wiki_finished.rename(columns={"Close" : "Daily Closing Price"})

### 1.2.3 Bitcoin Daily Price

I arrive at a dataframe with the Daily Closing Price

In [14]:
bitcoin_price_wiki_finished.head()

Unnamed: 0,Date,Daily Closing Price
0,2015-07-01,257.62
1,2015-07-02,254.54
2,2015-07-03,255.92
3,2015-07-04,260.2
4,2015-07-05,271.5


### 1.2.4 Save to CSV for Usage in Data Analysis
Save the dataframe to a CSV file ("Bitcoin Daily Price Data.csv") for use in the data analysis (see folder: Data Analysis).
The line is commented out so that it is not executed (and no CSV file is created) every time the notebook is run.

In [15]:
#bitcoin_price_wiki_finished.to_csv("Bitcoin Daily Price Data.csv")

# 2. Trading Volume Data in USD

## 2.1 Preparing Bitcoin Trading Volume Data for Google search volume

In [16]:
bitcoin_price_daily_adjusted_rows.head()

Unnamed: 0,Date,Open,High,Low,Close,Volume (BTC),Volume (Currency),Weighted Price
0,2013-04-14,90.44,99.99,81.94,91.11,5777.573331,549492.1,95.107761
1,2013-04-15,91.79,99.8,60.0,74.0,19578.128716,1685980.0,86.115488
2,2013-04-16,73.99,82.0,49.8,68.09,35706.586294,2335121.0,65.397494
3,2013-04-17,67.85,101.12,60.0,89.98,30854.424119,2607702.0,84.516292
4,2013-04-18,89.98,111.23,78.92,109.3,16527.459717,1599321.0,96.7675


### 2.1.1 Delete unused columns
I am interested in the trading volume of bitcoin in USD (the "Volume (Currency)" column), so I delete the other columns

In [17]:
bitcoin_volume_daily_google = bitcoin_price_daily_adjusted_rows.drop(["Open", "High", "Low", "Close", "Volume (BTC)", "Weighted Price"], axis=1)
bitcoin_volume_daily_google.head()

Unnamed: 0,Date,Volume (Currency)
0,2013-04-14,549492.1
1,2013-04-15,1685980.0
2,2013-04-16,2335121.0
3,2013-04-17,2607702.0
4,2013-04-18,1599321.0


### 2.1.2 Calculate weekly average trading volume

In [18]:
bitcoin_volume_daily_google['Average Volume per Week'] = bitcoin_volume_daily_google['Volume (Currency)'].groupby(np.arange(len(bitcoin_volume_daily_google)) // 7).transform('mean')
bitcoin_volume_google_finished = bitcoin_volume_daily_google

In [19]:
bitcoin_volume_google_finished.head()

Unnamed: 0,Date,Volume (Currency),Average Volume per Week
0,2013-04-14,549492.1,1815588.0
1,2013-04-15,1685980.0,1815588.0
2,2013-04-16,2335121.0,1815588.0
3,2013-04-17,2607702.0,1815588.0
4,2013-04-18,1599321.0,1815588.0


### 2.1.3 Create Dataframe with only the Average Volume per Week

I create a new dataframe with only every 7th row and I delete the "Volume (Currency)" column

In [20]:
bitcoin_volume_google_finished = bitcoin_volume_google_finished[bitcoin_volume_google_finished.index % 7 == 0]
bitcoin_volume_google_finished = bitcoin_volume_google_finished.reset_index(drop=True)
bitcoin_volume_google_finished = bitcoin_volume_google_finished.drop(["Volume (Currency)"], axis=1)

bitcoin_volume_google_finished.head()

Unnamed: 0,Date,Average Volume per Week
0,2013-04-14,1815588.0
1,2013-04-21,1770383.0
2,2013-04-28,1770067.0
3,2013-05-05,1154001.0
4,2013-05-12,822310.2
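
The same three steps (label blocks of 7 rows, average within each block, keep one row per block) are used for both the price and the volume data. As a hedged sketch, they could be wrapped in a single helper. `weekly_average` is a hypothetical function, not part of the original notebook, and the example data is made up.

```python
import numpy as np
import pandas as pd


def weekly_average(daily, column, new_name):
    """Collapse consecutive 7-row blocks into one row each.

    Keeps the first Date of each block and the mean of `column`.
    Assumes `daily` has one row per day, sorted from oldest to newest.
    """
    block = np.arange(len(daily)) // 7
    weekly = daily.groupby(block).agg(**{
        "Date": ("Date", "first"),
        new_name: (column, "mean"),
    })
    return weekly.reset_index(drop=True)


# Made-up data: 14 days of volume, 10.0 per day in week one and 20.0 per day in week two.
example_daily = pd.DataFrame({
    "Date": pd.date_range("2013-04-14", periods=14, freq="D"),
    "Volume (Currency)": [10.0] * 7 + [20.0] * 7,
})

example_weekly = weekly_average(example_daily, "Volume (Currency)", "Average Volume per Week")
print(example_weekly)
```

Aggregating directly to one row per block avoids the intermediate column of repeated weekly means and the separate filtering on `index % 7 == 0`.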


### 2.1.4 Save to CSV for Usage in Data Analysis:

Save the dataframe to a CSV file ("Bitcoin Weekly Volume Data USD.csv") for use in the data analysis (see folder: Data Analysis).
The line is commented out so that it is not executed (and no CSV file is created) every time the notebook is run.

In [21]:
#bitcoin_volume_google_finished.to_csv("Bitcoin Weekly Volume Data USD.csv")

## 2.2 Preparing Bitcoin Trading Volume Data for Wikipedia

In [22]:
bitcoin_price_daily_adjusted_rows_wiki.head()

Unnamed: 0,Date,Open,High,Low,Close,Volume (BTC),Volume (Currency),Weighted Price
0,2015-07-01,262.58,265.25,253.81,257.62,13395.153439,3453026.0,257.781733
1,2015-07-02,257.62,260.77,253.19,254.54,9826.994781,2515779.0,256.006904
2,2015-07-03,254.49,256.44,252.4,255.92,9153.852565,2331332.0,254.68315
3,2015-07-04,256.06,261.28,254.05,260.2,7909.856729,2037201.0,257.552232
4,2015-07-05,260.16,274.74,258.75,271.5,21362.61365,5689203.0,266.315871


### 2.2.1 Delete unused columns
I am only interested in the daily trading volume of bitcoin in USD (the "Volume (Currency)" column), so I delete the other columns

In [23]:
bitcoin_volume_wiki_finished = bitcoin_price_daily_adjusted_rows_wiki.drop(["Open", "High", "Low", "Close", "Volume (BTC)", "Weighted Price"], axis=1)
bitcoin_volume_wiki_finished.head()

Unnamed: 0,Date,Volume (Currency)
0,2015-07-01,3453026.0
1,2015-07-02,2515779.0
2,2015-07-03,2331332.0
3,2015-07-04,2037201.0
4,2015-07-05,5689203.0


### 2.2.2 Save to CSV for Usage in Data Analysis:
Save the dataframe to a CSV file ("Bitcoin Daily Volume Data USD.csv") for use in the data analysis (see folder: Data Analysis).
The line is commented out so that it is not executed (and no CSV file is created) every time the notebook is run.

In [24]:
#bitcoin_volume_wiki_finished.to_csv("Bitcoin Daily Volume Data USD.csv")

# 3. Trading Volume Data in BTC

## 3.1 Preparing Bitcoin Trading Volume Data for Google search volume

In [25]:
bitcoin_price_daily_adjusted_rows.head()

Unnamed: 0,Date,Open,High,Low,Close,Volume (BTC),Volume (Currency),Weighted Price
0,2013-04-14,90.44,99.99,81.94,91.11,5777.573331,549492.1,95.107761
1,2013-04-15,91.79,99.8,60.0,74.0,19578.128716,1685980.0,86.115488
2,2013-04-16,73.99,82.0,49.8,68.09,35706.586294,2335121.0,65.397494
3,2013-04-17,67.85,101.12,60.0,89.98,30854.424119,2607702.0,84.516292
4,2013-04-18,89.98,111.23,78.92,109.3,16527.459717,1599321.0,96.7675


### 3.1.1 Delete unused columns
I am interested in the trading volume of bitcoin in BTC (the "Volume (BTC)" column), so I delete the other columns

In [26]:
bitcoin_volume_daily_google = bitcoin_price_daily_adjusted_rows.drop(["Open", "High", "Low", "Close", "Volume (Currency)", "Weighted Price"], axis=1)
bitcoin_volume_daily_google.head()

Unnamed: 0,Date,Volume (BTC)
0,2013-04-14,5777.573331
1,2013-04-15,19578.128716
2,2013-04-16,35706.586294
3,2013-04-17,30854.424119
4,2013-04-18,16527.459717


### 3.1.2 Calculate weekly average trading volume

In [27]:
bitcoin_volume_daily_google['Average Volume per Week (BTC)'] = bitcoin_volume_daily_google['Volume (BTC)'].groupby(np.arange(len(bitcoin_volume_daily_google)) // 7).transform('mean')
bitcoin_volume_google_finished = bitcoin_volume_daily_google

In [28]:
bitcoin_volume_google_finished.head()

Unnamed: 0,Date,Volume (BTC),Average Volume per Week (BTC)
0,2013-04-14,5777.573331,20202.562639
1,2013-04-15,19578.128716,20202.562639
2,2013-04-16,35706.586294,20202.562639
3,2013-04-17,30854.424119,20202.562639
4,2013-04-18,16527.459717,20202.562639


### 3.1.3 Create Dataframe with only the Average Volume per Week

I create a new dataframe with only every 7th row and I delete the "Volume (BTC)" column

In [29]:
bitcoin_volume_google_finished = bitcoin_volume_google_finished[bitcoin_volume_google_finished.index % 7 == 0]
bitcoin_volume_google_finished = bitcoin_volume_google_finished.reset_index(drop=True)
bitcoin_volume_google_finished = bitcoin_volume_google_finished.drop(["Volume (BTC)"], axis=1)

bitcoin_volume_google_finished.head()

Unnamed: 0,Date,Average Volume per Week (BTC)
0,2013-04-14,20202.562639
1,2013-04-21,13204.207
2,2013-04-28,15670.144724
3,2013-05-05,10198.091312
4,2013-05-12,7279.517695


### 3.1.4 Save to CSV for Usage in Data Analysis:

Save the dataframe to a CSV file ("Bitcoin Weekly Volume Data BTC.csv") for use in the data analysis (see folder: Data Analysis).
The line is commented out so that it is not executed (and no CSV file is created) every time the notebook is run.

In [30]:
#bitcoin_volume_google_finished.to_csv("Bitcoin Weekly Volume Data BTC.csv")

## 3.2 Preparing Bitcoin Trading Volume Data for Wikipedia

In [31]:
bitcoin_price_daily_adjusted_rows_wiki.head()

Unnamed: 0,Date,Open,High,Low,Close,Volume (BTC),Volume (Currency),Weighted Price
0,2015-07-01,262.58,265.25,253.81,257.62,13395.153439,3453026.0,257.781733
1,2015-07-02,257.62,260.77,253.19,254.54,9826.994781,2515779.0,256.006904
2,2015-07-03,254.49,256.44,252.4,255.92,9153.852565,2331332.0,254.68315
3,2015-07-04,256.06,261.28,254.05,260.2,7909.856729,2037201.0,257.552232
4,2015-07-05,260.16,274.74,258.75,271.5,21362.61365,5689203.0,266.315871


### 3.2.1 Delete unused columns
I am only interested in the daily trading volume of bitcoin in BTC (the "Volume (BTC)" column), so I delete the other columns

In [32]:
bitcoin_volume_wiki_finished = bitcoin_price_daily_adjusted_rows_wiki.drop(["Open", "High", "Low", "Close", "Volume (Currency)", "Weighted Price"], axis=1)
bitcoin_volume_wiki_finished.head()

Unnamed: 0,Date,Volume (BTC)
0,2015-07-01,13395.153439
1,2015-07-02,9826.994781
2,2015-07-03,9153.852565
3,2015-07-04,7909.856729
4,2015-07-05,21362.61365


### 3.2.2 Save to CSV for Usage in Data Analysis:
Save the dataframe to a CSV file ("Bitcoin Daily Volume Data BTC.csv") for use in the data analysis (see folder: Data Analysis).
The line is commented out so that it is not executed (and no CSV file is created) every time the notebook is run.

In [33]:
#bitcoin_volume_wiki_finished.to_csv("Bitcoin Daily Volume Data BTC.csv")