## Historical Prices of Cryptocurrency
Ashwin Jeyaseelan, Evan Kerekanich

### Introduction
Recently, a cryptocurrency called bitcoin has become a hot topic, but what is it? Bitcoin is the first digital currency that lets people transfer currency to one another without a third party. It was created in 2009 by an unknown person using the alias Satoshi Nakamoto. It can be used to buy merchandise just like normal currency. Surprisingly, bitcoin has built up a large enough community that it runs competitions, known as mining, in which participants are rewarded with bitcoins in exchange for solving complex math puzzles.

The purpose of this tutorial is to understand the trend of bitcoin prices to help users decide whether bitcoin is worth participating in. We will show how to gather, parse, and analyze the data, and how to conduct hypothesis testing on it. Finally, we will use machine learning to analyze the bitcoin prices.

### Getting Started 
First, download the bitcoin data from: https://www.kaggle.com/myonin/bitcoin-price-prediction-by-arima/data. Next, we will load the bitcoin data file with the pandas library. The dataset contains 1-minute bitcoin exchange data from January 2012 to October 2017; the file we load here covers, according to its filename, January 2012 through May 2017.

In [14]:
# Import libraries
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib as mpl
from scipy import stats
import statsmodels.api as sm

# Load data:
table = pd.read_csv("bitcoin-historical-data/btceUSD_1-min_data_2012-01-01_to_2017-05-31.csv")
# Show the first 10 rows of our data:
table.head(10)

Unnamed: 0,Timestamp,Open,High,Low,Close,Volume_(BTC),Volume_(Currency),Weighted_Price
0,1325292180,4.247,4.247,4.247,4.247,0.400000,1.698800,4.247000
1,1325292240,,,,,,,
2,1325292300,,,,,,,
3,1325292360,,,,,,,
4,1325292420,,,,,,,
5,1325292480,,,,,,,
6,1325292540,,,,,,,
7,1325292600,,,,,,,
8,1325292660,,,,,,,
9,1325292720,,,,,,,


Looking at our data, we can see the features: Timestamp (Unix time), Open (the opening, or starting, price of bitcoin in the minute), High (the highest price in the minute), Low (the lowest price in the minute), Close (the closing price in the minute), the trading volume (in BTC and in currency), and the weighted price.
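
Since the Timestamp column stores Unix time, it can help to convert it into readable dates before analyzing trends. The following is a minimal sketch, assuming the timestamps are in seconds; `sample` is a small illustrative frame built from the first few rows shown above, not the full dataset, and the `Datetime` column name is our own choice.

```python
import pandas as pd

# Small illustrative sample copied from the first rows of the output above
# (not the full dataset).
sample = pd.DataFrame({
    "Timestamp": [1325292180, 1325292240, 1325292300],
    "Weighted_Price": [4.247, None, None],
})

# Assumption: the integers are seconds since the Unix epoch (UTC).
sample["Datetime"] = pd.to_datetime(sample["Timestamp"], unit="s")

# Consecutive rows should be one minute apart for 1-minute data.
step = sample["Datetime"].diff().dropna()

print(sample)
print(step)
```

The same `pd.to_datetime(..., unit="s")` call could be applied to `table["Timestamp"]` once the full file is loaded.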

#### Tidying Our Data
In the head of the table, notice that only one of the rows has sufficient data to properly analyze the features. We can save space and time by getting rid of the rows that are missing data for any feature.

In [16]:
# drop rows with NaN for any of their features
table = table.dropna()
# view the first 5 rows of the tidy table:
table.head()

Unnamed: 0,Timestamp,Open,High,Low,Close,Volume_(BTC),Volume_(Currency),Weighted_Price
0,1325292180,4.247,4.247,4.247,4.247,0.4,1.6988,4.247
138,1325300460,4.1,4.1,4.1,4.1,0.623628,2.556875,4.1
212,1325304900,4.1,4.1,4.1,4.1,6.503072,26.662595,4.1
284,1325309220,4.045,4.045,4.044,4.044,2.3793,9.624254,4.044994
311,1325310840,4.044,4.044,4.011,4.011,0.8964,3.607223,4.024122
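
Before dropping rows, it can be useful to check how much data is actually missing and how many rows the drop removes. The sketch below is illustrative only: `sample` is a hand-built frame that mimics a few rows of the table (intermediate rows are omitted), and resetting the index afterwards is an optional step that the tutorial itself does not perform.

```python
import numpy as np
import pandas as pd

# Illustrative sample mimicking the table: rows 1 and 2 have no trade data.
sample = pd.DataFrame(
    {
        "Timestamp": [1325292180, 1325292240, 1325292300, 1325300460],
        "Open": [4.247, np.nan, np.nan, 4.1],
        "Close": [4.247, np.nan, np.nan, 4.1],
        "Weighted_Price": [4.247, np.nan, np.nan, 4.1],
    },
    index=[0, 1, 2, 138],
)

# Count the missing values in each column before tidying.
missing_per_column = sample.isna().sum()

# Drop every row that has a NaN in any column.
tidy = sample.dropna()
rows_dropped = len(sample) - len(tidy)

# Optional: renumber the rows so the index has no gaps (0, 1, ...).
tidy = tidy.reset_index(drop=True)

print(missing_per_column)
print(f"Dropped {rows_dropped} of {len(sample)} rows")
print(tidy)
```

Without the reset, the tidy table keeps the original row labels (0, 138, 212, ...), as seen in the output above.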


### Data Analysis
Now that we've prepared our data, we can analyze it.
