# <strong>Bitcoin Recurrent Neural Network</strong>
### Justin Marlor & Habit Blunk
##### *Colorado State University*

This notebook automatically downloads the data from [this dataset hosted on Kaggle](https://www.kaggle.com/datasets/mczielinski/bitcoin-historical-data), plots the price series, and builds a recurrent neural network on it.

To run it:

1. Run the script at `./env-script` in this repository to set up the virtual environment.
2. Run `source ./venv/bin/activate` to enter that virtual environment. The notebook should then run on any machine that has Python 3.x and can install the dependencies listed in `./dependencies.txt`.
3. Paste this into `~/.config/kaggle/kaggle.json`:
    ```json
    {
      "username": "justinmarlor",
      "key": "b98017f9291bfa83686f6c6780d38e04"
    }
    ```
4. Execute each cell in sequence.
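
Step 3 can also be done from Python. The sketch below is one possible way, assuming a POSIX system: `write_kaggle_credentials` is a hypothetical helper (it is not part of the Kaggle client), and the demo writes placeholder values to a temporary directory so it does not touch real credentials. It restricts the file to owner-only access, since the Kaggle client typically warns when the key is readable by other users.

```python
import json
import stat
import tempfile
from pathlib import Path


def write_kaggle_credentials(username, key, config_dir=None):
    """Write kaggle.json with owner-only permissions and return its path.

    Hypothetical helper: defaults to ~/.config/kaggle, the location used in step 3.
    """
    config_dir = Path(config_dir) if config_dir else Path.home() / ".config" / "kaggle"
    config_dir.mkdir(parents=True, exist_ok=True)
    path = config_dir / "kaggle.json"
    path.write_text(json.dumps({"username": username, "key": key}, indent=2))
    path.chmod(stat.S_IRUSR | stat.S_IWUSR)  # same effect as `chmod 600`
    return path


# Demo against a temporary directory with placeholder values.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = write_kaggle_credentials("your-username", "your-api-key", config_dir=tmp)
    print(demo_path.name, oct(stat.S_IMODE(demo_path.stat().st_mode)))
```

Calling the helper without `config_dir` would write to `~/.config/kaggle/kaggle.json`, overwriting any file already there.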

#### Cell 1: imports and grabbing the dataset

```python
import subprocess

import matplotlib.pyplot as plt
import pandas as pd
import torch
import torch.nn as nn

# Run the repository script that grabs the dataset.
result = subprocess.run(['bash', './add-run-kaggle-bitcoin'], capture_output=True, text=True)

print(result.stdout)
print(result.stderr)

# Stop here if the script failed; otherwise later cells would hit an undefined `df`.
if result.returncode != 0:
    raise RuntimeError("./add-run-kaggle-bitcoin exited with a non-zero status")

df = pd.read_csv("kaggle-bitcoin/upload/btcusd_1-min_data.csv", dtype={"Volume": float}, low_memory=False)
display(df)
```

#### Cell 2: preprocessing and plotting dataset

```python
# Convert Unix timestamps (seconds) to datetimes; values that cannot be parsed become NaT.
df['datetime'] = pd.to_datetime(df['Timestamp'].astype('Int64'), unit='s', errors='coerce')
df['year'] = df['datetime'].dt.year
df['month'] = df['datetime'].dt.month
df['day'] = df['datetime'].dt.day
display(df)

# Plot the four price series on a shared time axis.
plt.plot(df['datetime'], df['Open'], label='open', color='blue')
plt.plot(df['datetime'], df['Close'], label='close', color='green')
plt.plot(df['datetime'], df['High'], label='high', color='red')
plt.plot(df['datetime'], df['Low'], label='low', color='orange')

plt.xlabel('datetime')
plt.ylabel('price')
plt.title('ohlc time series')
plt.legend()
plt.xticks(rotation=45)
plt.grid(True)
plt.tight_layout()
plt.show()
```
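
Plotting every 1-minute row means drawing a very large number of points, which can be slow. One option is to aggregate to daily bars before plotting. The sketch below assumes the column names used in this notebook; `to_daily_ohlc` is a hypothetical helper, and the demo frame is synthetic so the block runs on its own.

```python
import numpy as np
import pandas as pd


def to_daily_ohlc(frame):
    """Aggregate minute bars to daily bars: first open, max high, min low, last close."""
    return (
        frame.dropna(subset=['datetime'])  # rows with NaT cannot be placed in a bin
        .set_index('datetime')
        .resample('D')
        .agg({'Open': 'first', 'High': 'max', 'Low': 'min', 'Close': 'last', 'Volume': 'sum'})
        .dropna(subset=['Open'])  # drop days that had no rows at all
    )


# Synthetic stand-in for the real data: two days of minute bars.
rng = np.random.default_rng(0)
n = 2 * 24 * 60
close = 100 + rng.normal(0, 0.1, n).cumsum()
open_ = np.concatenate([[100.0], close[:-1]])
demo = pd.DataFrame({
    'datetime': pd.date_range('2020-01-01', periods=n, freq='min'),
    'Open': open_,
    'High': np.maximum(open_, close) + 0.05,
    'Low': np.minimum(open_, close) - 0.05,
    'Close': close,
    'Volume': np.ones(n),
})

daily = to_daily_ohlc(demo)
print(daily)
```

The result is indexed by day, so the same `plt.plot` calls work with `daily.index` on the x-axis in place of `df['datetime']`.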

#### Cell 3: building and training RNN against dataset

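```python
# Count only numeric columns as features; `df.shape[1]` would also count the
# non-numeric `datetime` column, which cannot be fed to the network directly.
input_size = df.select_dtypes(include='number').shape[1]

# Two stacked RNN layers with 50 hidden units; inputs are shaped (batch, seq_len, input_size).
rnn = nn.RNN(input_size=input_size, hidden_size=50, num_layers=2, batch_first=True)
```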
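
The cell above only constructs the network. As a rough illustration of the training step, here is a minimal sketch under several assumptions: it predicts the next value of a single series from a sliding window, uses a synthetic sine wave in place of the price data, and adds a linear layer on top of the RNN. The names `make_windows` and `PriceRNN`, the window length, the learning rate, and the epoch count are all hypothetical choices, not settings from this notebook.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


def make_windows(series, window):
    """Turn a 1-D tensor into (inputs, targets): `window` steps in, the next value out."""
    xs = series.unfold(0, window, 1)[:-1]  # (n - window, window)
    ys = series[window:]                   # (n - window,)
    return xs.unsqueeze(-1), ys.unsqueeze(-1)


class PriceRNN(nn.Module):
    def __init__(self, input_size=1, hidden_size=50, num_layers=2):
        super().__init__()
        self.rnn = nn.RNN(input_size=input_size, hidden_size=hidden_size,
                          num_layers=num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.rnn(x)          # out: (batch, seq_len, hidden_size)
        return self.head(out[:, -1])  # predict from the last time step


# Synthetic stand-in for a price series.
series = torch.sin(torch.linspace(0, 20, 400))
x, y = make_windows(series, window=30)

model = PriceRNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

losses = []
for epoch in range(30):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)  # full-batch training, for brevity
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```

With real prices, the inputs would likely need scaling (for example, to zero mean and unit variance) and a held-out split by time before the loss is meaningful.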