Author: Xinbin Huang
Last updated: Dec 16, 2017
The value of Bitcoin has risen substantially since its invention, and a growing number of people are interested in investing in it. This makes the factors that affect its price worth investigating.
This project performs a simple analysis of the effect of two factors on the Bitcoin price.
- `data`: raw data (two CSV files, `bitcoin_price.csv` and `bitcoin_dataset.csv`)
- `src`: code files and analysis scripts (`.R`, `.Rmd`)
- `results`: rendered documents and generated analysis results
- `doc`: rendered report (`bitcoin_report.md`)
- Does the difficulty of finding a new block affect the price of Bitcoin?
- Does the transaction volume of Bitcoin affect its price?
- The price of Bitcoin would be higher when the difficulty of finding a new block increases, because a lower supply of new blocks makes Bitcoin more valuable.
- The volume of Bitcoin would positively affect its price, because the higher the volume, the more investors would want to buy it.
The dataset contains historical price and feature data for the cryptocurrency Bitcoin. It was retrieved from the Kaggle dataset Cryptocurrency Historical Prices.
- The downloaded files are located in the `data` folder.
  - `bitcoin_dataset.csv`: includes some features describing Bitcoin
  - `bitcoin_price.csv`: includes price information about Bitcoin
- There are two `.csv` files (`features.csv` and `price.csv`) in the `results` folder for testing purposes.
- `Date` records the date, from 2013-04-28 to 2017-11-07.
- `Close` is the daily closing price of Bitcoin from 2013-04-28 to 2017-11-07.
- `btc_difficulty` is a relative measure of the difficulty of finding a new block.
- `Volume` is the volume of transactions on the given day.
I first generate a pair plot of the variables `Close`, `btc_difficulty`, and `Volume` to explore their relationships. I then fit a linear regression model to see whether the latter two variables affect the Bitcoin price. The following sections describe how to reproduce the analysis.
Dependency diagram for the analysis pipeline
- Get Docker Image:
docker pull xhuang09/bitcoin-analysis
- Clone the repo:
For HTTPS:
git clone https://github.com/xinbinhuang/bitcoin-analysis.git
For SSH:
git clone git@github.com:xinbinhuang/bitcoin-analysis.git
- Run the Docker Image:
docker run -it --rm -v YOUR_LOCAL_DIRECTORY_OF_CLONED_REPO/:/home/bitcoin-analysis xhuang09/bitcoin-analysis /bin/bash
- Change Directory:
cd /home/bitcoin-analysis/
- To run the project analysis:
make all
- To remove previously generated output files:
make clean
Run the following commands to regenerate the analysis step by step. All commands should be run from the project root directory. Dependency requirements aside, running them in order gives the same results as running `make all` in the root directory.
These commands download the two required data files to the `data` folder as `bitcoin_dataset.csv` and `bitcoin_price.csv`.
# first data frame
Rscript src/download-data.R https://raw.githubusercontent.com/xinbinhuang/data-bitcoin/master/bitcoin_dataset.csv data/bitcoin_dataset.csv
# second data frame
Rscript src/download-data.R https://raw.githubusercontent.com/xinbinhuang/data-bitcoin/master/bitcoin_price.csv data/bitcoin_price.csv
This command merges the two data frames into one for the subsequent analysis. The output CSV file is stored in `results/merged-data.csv`.
Rscript src/merge-data.R data/bitcoin_price.csv data/bitcoin_dataset.csv results/merged-data.csv
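The merge step can be pictured with a minimal R sketch. It assumes that both files share a `Date` column in the same format and that the merged table keeps `Date`, `Close`, `btc_difficulty`, and `Volume`; the actual `src/merge-data.R` may work differently. The two small data frames are invented stand-ins for the CSV files, not real Bitcoin data.

```r
# Toy stand-ins for data/bitcoin_price.csv and data/bitcoin_dataset.csv.
# All values are invented for illustration only.
price <- data.frame(
  Date   = c("2017-01-01", "2017-01-02", "2017-01-03"),
  Close  = c(100, 110, 105),
  Volume = c(1000, 1500, 1200),
  stringsAsFactors = FALSE
)
features <- data.frame(
  Date           = c("2017-01-02", "2017-01-03", "2017-01-04"),
  btc_difficulty = c(2.0, 2.1, 2.2),
  stringsAsFactors = FALSE
)

# Inner join on Date: only days present in both tables are kept.
merged <- merge(price, features, by = "Date")

# Keep the date plus the three analysis variables, in a fixed order.
merged <- merged[, c("Date", "Close", "btc_difficulty", "Volume")]
print(merged)
```

If the two files stored dates in different formats, both columns would need to be converted (for example with `as.Date`) before the join.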
This command performs a descriptive analysis of the three variables. The output CSV file is stored in `results/descriptive-result.csv`.
Rscript src/descriptive.R results/merged-data.csv results/descriptive-result.csv
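As a rough illustration of what a descriptive analysis of the three variables could produce, the R sketch below builds a small summary table. The choice of statistics (mean, standard deviation, minimum, maximum) is an assumption, not necessarily what `src/descriptive.R` computes, and the data values are invented.

```r
# Toy merged data; values are invented for illustration only.
merged <- data.frame(
  Close          = c(100, 110, 105, 120),
  btc_difficulty = c(2.0, 2.1, 2.2, 2.4),
  Volume         = c(1000, 1500, 1200, 1800)
)

vars <- c("Close", "btc_difficulty", "Volume")

# One row per variable with a few common summary statistics
# (the set of statistics is an assumption).
descriptive <- data.frame(
  variable  = vars,
  mean      = sapply(merged[vars], mean),
  sd        = sapply(merged[vars], sd),
  min       = sapply(merged[vars], min),
  max       = sapply(merged[vars], max),
  row.names = NULL
)
print(descriptive)

# In a script, the table could then be saved with, e.g.:
# write.csv(descriptive, "results/descriptive-result.csv", row.names = FALSE)
```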
This command performs a regression analysis of the three variables. The output CSV file is stored in `results/regression-result.csv`.
Rscript src/regression.R results/merged-data.csv results/regression-result.csv
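The regression step can be sketched in R as follows. The sketch assumes a single linear model with `Close` as the response and both `btc_difficulty` and `Volume` as predictors; `src/regression.R` may instead fit the predictors separately or transform the variables first. The data values are invented, so the printed coefficients carry no meaning.

```r
# Toy merged data; values are invented for illustration only.
merged <- data.frame(
  Close          = c(100, 110, 105, 120, 130, 125),
  btc_difficulty = c(2.0, 2.1, 2.2, 2.4, 2.5, 2.7),
  Volume         = c(1000, 1500, 1200, 1800, 2100, 1700)
)

# Ordinary least squares: Close explained by difficulty and volume.
fit <- lm(Close ~ btc_difficulty + Volume, data = merged)

# Coefficient table: estimate, standard error, t value, and p-value per term.
coef_table <- as.data.frame(summary(fit)$coefficients)
coef_table$term <- rownames(coef_table)
print(coef_table)
```

A positive, statistically significant coefficient on a predictor would be consistent with the corresponding hypothesis, although a regression on observational data of this kind cannot by itself establish causation.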
This command generates a pair plot of the three variables from the merged data. The output PNG file is stored in `results/figure/analysis-plot.png`.
Rscript src/plot.R results/merged-data.csv results/figure/analysis-plot.png
This command generates the report as a Markdown file from an R Markdown file.
The generated report can be found in `doc`.
Rscript -e 'ezknitr::ezknit("src/bitcoin_report.Rmd", out_dir = "doc")'