# Crypto price prediction

In this project, we're going to find out how to make a simple prediction of cryptocurrency prices. The code is heavily inspired by [this video](https://www.youtube.com/watch?v=GFSiL6zEZF0), and it's not a serious project. It's more of a fun project you'd do over a weekend, or one that shows your ability to turn your ideas into ML/DL projects.

## 0. Data Gathering

In [2]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd 
import pandas_datareader as web
import datetime as dt

After importing our essential dependencies such as `numpy` and `matplotlib`, it's time to decide what we're going to do with the project. In this part, we choose which currency to predict. For this particular project, I used Bitcoin. You can use another one such as Ethereum, Ripple or Doge. You can also change `against_currency` if you need something other than US Dollars. For example, you can set it to `CAD` for Canadian Dollar or `EUR` for Euro. The choice of currencies is completely up to you.

In [8]:
crypto_currency = "BTC"
against_currency = "USD"
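
The two codes are later joined into a single `BASE-QUOTE` symbol such as `BTC-USD` when the data is requested. As a minimal sketch of that step, the snippet below uses a hypothetical helper, `make_pair_symbol`, which is not part of the original notebook. Whether a given pair is actually available depends on the data source.

```python
# Hypothetical helper (not in the original notebook): joins a crypto code and
# a quote currency into the "BASE-QUOTE" form used later when fetching data.
def make_pair_symbol(base: str, quote: str) -> str:
    return f"{base.upper()}-{quote.upper()}"


# Two example pairs; the second one is only an illustration.
toy_pairs = [make_pair_symbol("btc", "usd"), make_pair_symbol("ETH", "EUR")]
print(toy_pairs)
```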

We also need a _time frame_ for our project. Here we use daily data and monitor the price of _BTC_ from the start of 2018 up to the present.

In [7]:
start_date = dt.datetime(2018, 1, 1)
end_date = dt.datetime.now()
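
Because `dt.datetime.now()` moves every time the notebook runs, the amount of data changes between runs. As a small sketch, assuming you want a reproducible window, you could pin the end date instead. The names `toy_start` and `toy_end` are illustrative, and the end date is simply the last day shown in the sample output further down.

```python
import datetime as dt

# Fixed window (illustrative names); the end date is pinned rather than "now"
# so that repeated runs cover exactly the same days.
toy_start = dt.datetime(2018, 1, 1)
toy_end = dt.datetime(2021, 11, 12)

# Number of calendar days in the window, counting both endpoints.
n_days = (toy_end - toy_start).days + 1
print(n_days)
```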

This part gathers the data from the _Yahoo Finance API_. For more information about the available data sources, I suggest taking a look at the `pandas_datareader` documentation.

In [9]:
data = web.DataReader(f'{crypto_currency}-{against_currency}', 'yahoo', start_date, end_date)

In [10]:
data.head(10)

Date,High,Low,Open,Close,Volume,Adj Close
2018-01-01,14112.200195,13154.700195,14112.200195,13657.200195,10291200000.0,13657.200195
2018-01-02,15444.599609,13163.599609,13625.0,14982.099609,16846600000.0,14982.099609
2018-01-03,15572.799805,14844.5,14978.200195,15201.0,16871900000.0,15201.0
2018-01-04,15739.700195,14522.200195,15270.700195,15599.200195,21783200000.0,15599.200195
2018-01-05,17705.199219,15202.799805,15477.200195,17429.5,23840900000.0,17429.5
2018-01-06,17712.400391,16764.599609,17462.099609,17527.0,18314600000.0,17527.0
2018-01-07,17579.599609,16087.700195,17527.300781,16477.599609,15866000000.0,16477.599609
2018-01-08,16537.900391,14208.200195,16476.199219,15170.099609,18413900000.0,15170.099609
2018-01-09,15497.5,14424.0,15123.700195,14595.400391,16660000000.0,14595.400391
2018-01-10,14973.299805,13691.200195,14588.5,14973.299805,18500800000.0,14973.299805


In [11]:
data.tail(10)

Date,High,Low,Open,Close,Volume,Adj Close
2021-11-03,63516.9375,61184.238281,63254.335938,62970.046875,36124730000.0,62970.046875
2021-11-04,63123.289062,60799.664062,62941.804688,61452.230469,32615850000.0,61452.230469
2021-11-05,62541.46875,60844.609375,61460.078125,61125.675781,30605100000.0,61125.675781
2021-11-06,61590.683594,60163.78125,61068.875,61527.480469,29094930000.0,61527.480469
2021-11-07,63326.988281,61432.488281,61554.921875,63326.988281,24726750000.0,63326.988281
2021-11-08,67673.742188,63344.066406,63344.066406,67566.828125,41125610000.0,67566.828125
2021-11-09,68530.335938,66382.0625,67549.734375,66971.828125,42357990000.0,66971.828125
2021-11-10,68789.625,63208.113281,66953.335938,64995.230469,48730830000.0,64995.230469
2021-11-11,65579.015625,64180.488281,64978.890625,64949.960938,35880630000.0,64949.960938
2021-11-12,65420.230469,64312.015625,64858.25,64986.277344,35246390000.0,64986.277344


## 1. Data Preparation

In [3]:
from sklearn.preprocessing import MinMaxScaler

Now we need to scale our data. As we're going to use a neural network, it's better to scale the data to a range such as 0 to 1 or -1 to 1; which one to pick depends on the input data. In this case, 0 to 1 is preferred, as _price_ is always a positive number.

In [12]:
scaler = MinMaxScaler(feature_range=(0, 1))

What we've chosen as our prediction target is the _Close_ price. You may choose others such as _Open_ or _High_, but the _Close_ price is what we consider when making decisions about future trades or purchases. I guess it'd be fun to mess around with other parts of the dataset as well.

In [15]:
scaled_data = scaler.fit_transform(data["Close"].values.reshape(-1, 1))
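
To make the scaling step less of a black box, here is a rough sketch of the min-max formula, `(x - min) / (max - min)`, written with plain NumPy. The prices are made up for illustration, and the `toy_*` names are not part of the notebook.

```python
import numpy as np

# Made-up closing prices, for illustration only (not real market data).
# reshape(-1, 1) gives one column, matching how the Close column is passed in.
toy_close = np.array([100.0, 150.0, 125.0, 200.0]).reshape(-1, 1)

# Min-max scaling to the range [0, 1], computed per column.
col_min = toy_close.min(axis=0)
col_max = toy_close.max(axis=0)
toy_scaled = (toy_close - col_min) / (col_max - col_min)

# The inverse mapping takes scaled values back to the price scale,
# which is what you need later to read predictions as prices.
toy_restored = toy_scaled * (col_max - col_min) + col_min

print(toy_scaled.ravel())
```

The smallest price maps to 0 and the largest to 1, and everything else lands proportionally in between.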

## 1.1. Making Neural-Network friendly data

First, we need to decide how many past days each prediction is based on. In this example, I have chosen 30 days. The main reason is that I don't intend to rely on this as a serious tool; I just wanted to test some ideas, so a 30-day window is more than enough.

In [16]:
prediction_days = 30 

I usually recommend `train_test_split` to friends working on ML/DL projects. In this particular case, though, each sample has to be a specific consecutive chunk of the series, so I build the training data this way:

In [20]:
x_train, y_train = [], []

for x in range(prediction_days, len(scaled_data)):
    x_train.append(scaled_data[x-prediction_days:x, 0])
    y_train.append(scaled_data[x, 0])
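
To see what this loop produces, here is the same sliding-window idea on a tiny made-up series with a window of 3 instead of 30. The `toy_*` names are illustrative only.

```python
import numpy as np

# A tiny stand-in for scaled_data: one column holding 0.0, 1.0, ..., 5.0.
toy_series = np.arange(6, dtype=float).reshape(-1, 1)
toy_window = 3  # plays the role of prediction_days

toy_x, toy_y = [], []
for i in range(toy_window, len(toy_series)):
    # Input: the toy_window values just before position i.
    toy_x.append(toy_series[i - toy_window:i, 0])
    # Target: the value at position i, i.e. the "next day".
    toy_y.append(toy_series[i, 0])

# Converted to arrays here only so the shapes are easy to inspect.
toy_x, toy_y = np.array(toy_x), np.array(toy_y)
print(toy_x.shape, toy_y.shape)
```

Each row of `toy_x` holds `toy_window` consecutive values, and the matching entry of `toy_y` is the value that comes right after them. A series of length `n` therefore yields `n - toy_window` samples.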

Remember that the neural network expects _NumPy arrays_ as input, so we need to convert our Python lists to NumPy arrays before doing anything serious with them.

In [23]:
x_train, y_train = np.array(x_train), np.array(y_train)

Finally, we reshape the `x_train` part of the data. `x` is usually called the _independent variable_, and if you look closely, you'll see that it holds our time chunks. Recurrent layers such as Keras's `LSTM` expect 3-dimensional input of shape `(samples, time steps, features)`, so we add a trailing dimension of size 1 for our single feature, the scaled _Close_ price.

In [24]:
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
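
As a quick sanity check of what this reshape does, the sketch below applies it to a dummy array of zeros. The sizes, 5 samples of 30 time steps, are arbitrary and chosen only for illustration.

```python
import numpy as np

# Dummy 2-D input: 5 samples, each with 30 time steps.
toy_x = np.zeros((5, 30))

# Add a trailing "features" axis of size 1: (samples, time steps, features).
toy_x_3d = np.reshape(toy_x, (toy_x.shape[0], toy_x.shape[1], 1))

# An equivalent way to add the same axis.
toy_x_alt = toy_x[..., np.newaxis]

print(toy_x.shape, toy_x_3d.shape)
```

The values are untouched; only the shape changes, from `(5, 30)` to `(5, 30, 1)`.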

In [4]:
from tensorflow.keras.layers import Dense, Dropout, LSTM
from tensorflow.keras.models import Sequential