## Load the CSV File

In [1]:
import pandas as pd

# Load the Bitcoin data
data = pd.read_csv('data/coinbase.csv')

# Display the first few rows
data.head()


Unnamed: 0,Timestamp,Open,High,Low,Close,Volume_(BTC),Volume_(Currency),Weighted_Price
0,1417411980,300.0,300.0,300.0,300.0,0.01,3.0,300.0
1,1417412040,,,,,,,
2,1417412100,,,,,,,
3,1417412160,,,,,,,
4,1417412220,,,,,,,


## Preprocessing the Data     

The next step is to preprocess the data. Here that means dropping rows with missing values and scaling the Close prices to a fixed range; other common preprocessing steps include removing unwanted columns.

In [2]:
# Drop any missing values
data = data.dropna()

# Normalize the 'Close' prices to a range of 0-1
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
data['Close'] = scaler.fit_transform(data[['Close']])

# Display the cleaned data
data.head()

Unnamed: 0,Timestamp,Open,High,Low,Close,Volume_(BTC),Volume_(Currency),Weighted_Price
0,1417411980,300.0,300.0,300.0,0.015078,0.01,3.0,300.0
7,1417412400,300.0,300.0,300.0,0.015078,0.01,3.0,300.0
51,1417415040,370.0,370.0,370.0,0.018597,0.01,3.7,370.0
77,1417416600,370.0,370.0,370.0,0.018597,0.026556,9.82555,370.0
1436,1417498140,377.0,377.0,377.0,0.018949,0.01,3.77,377.0


### Explanation:         

- *dropna():* Removes rows with missing values.         
- *MinMaxScaler:* A scikit-learn transformer that rescales values to the range 0 to 1. Keeping inputs on a small, uniform scale generally makes neural network training more stable.
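
To make the scaling concrete, the sketch below reproduces by hand what `MinMaxScaler` computes with its default range of 0 to 1, using plain NumPy. The prices and variable names are made up for illustration and are not taken from the notebook's data.

```python
import numpy as np

# Toy closing prices (illustrative values, not from coinbase.csv)
prices = np.array([300.0, 370.0, 377.0, 420.0, 500.0])

# Min-max scaling: x_scaled = (x - min) / (max - min)
p_min, p_max = prices.min(), prices.max()
scaled = (prices - p_min) / (p_max - p_min)

# Inverse transform: x = x_scaled * (max - min) + min
restored = scaled * (p_max - p_min) + p_min

print(scaled)    # smallest price maps to 0.0, largest to 1.0
print(restored)  # recovers the original prices
```

One caveat: when the scaler is fitted on the whole series, the minimum and maximum of the later (test) period influence the scaled training values. Fitting the scaler on the training portion only is a stricter alternative.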

## Setting Up the tf.data.Dataset for Model Inputs     

Next, we prepare the inputs for the model. The cell below builds sliding-window NumPy arrays from the time series; these arrays can be passed to Keras directly or wrapped in a tf.data.Dataset.

In [4]:
import tensorflow as tf
import numpy as np

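# Create sliding-window samples for time series prediction
def create_dataset(data, time_step=1):
    X, y = [], []
    # Each sample is `time_step` consecutive values; the target is the value that follows
    for i in range(len(data) - time_step):
        X.append(data[i:(i + time_step), 0])
        y.append(data[i + time_step, 0])
    return np.array(X), np.array(y)

# Convert the 'Close' column to a 2D NumPy array of shape (n, 1)
dataset = data['Close'].values.reshape(-1, 1)

# Set time step to 60: use the 60 previous observations to predict the next one
# (rows are minute-level records with gaps left by dropna, so these are not days)
time_step = 60
X, y = create_dataset(dataset, time_step)

# Reshape X to (samples, time steps, features), the input shape the LSTM expects
X = X.reshape(X.shape[0], X.shape[1], 1)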

### Explanation:    

We split the series into inputs (X) and targets (y): each row of X is a window of time_step consecutive past values, and the matching y is the value that immediately follows that window.
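
A tiny example makes the windowing easier to see. The sketch below applies the same idea to a six-element series; `make_windows` is a hypothetical helper written for this illustration, and this toy version uses every available window.

```python
import numpy as np

def make_windows(series, time_step):
    """Return (X, y): each row of X holds `time_step` consecutive values,
    and y holds the value that immediately follows each window."""
    X, y = [], []
    for i in range(len(series) - time_step):
        X.append(series[i:i + time_step])
        y.append(series[i + time_step])
    return np.array(X), np.array(y)

series = np.arange(6, dtype=float)  # [0, 1, 2, 3, 4, 5]
X_demo, y_demo = make_windows(series, time_step=3)

print(X_demo)  # rows: [0 1 2], [1 2 3], [2 3 4]
print(y_demo)  # [3. 4. 5.]

# Keras LSTM layers expect input shaped (samples, time steps, features)
X_demo_3d = X_demo.reshape(X_demo.shape[0], X_demo.shape[1], 1)
print(X_demo_3d.shape)  # (3, 3, 1)
```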

## Model Architecture       

We’ll use an LSTM (Long Short-Term Memory) network, a recurrent architecture designed to retain information across many time steps, which makes it a common choice for time series data.

In [5]:
# Build LSTM Model
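model = tf.keras.Sequential([
    # Declare the input shape explicitly: (time steps, features)
    tf.keras.Input(shape=(time_step, 1)),
    # First LSTM returns the full sequence so the second LSTM can consume it
    tf.keras.layers.LSTM(50, return_sequences=True),
    tf.keras.layers.LSTM(50, return_sequences=False),
    tf.keras.layers.Dense(25),
    # Single output unit: the predicted next value
    tf.keras.layers.Dense(1)
])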

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

# Model summary
model.summary()


### Explanation:   

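- *LSTM:* A type of recurrent neural network (RNN) that captures patterns across time steps. The first LSTM layer uses return_sequences=True so that the second LSTM layer receives the full sequence.
- *Dense:* A fully connected layer. The final Dense(1) layer outputs the single predicted value.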
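
The table printed by `model.summary()` is not reproduced in this section. As a rough cross-check, the trainable parameter counts for this architecture can be worked out by hand, assuming the standard LSTM formulation with bias terms (the Keras default). The helper functions below are illustrative and not part of Keras.

```python
def lstm_params(units, input_dim):
    # Four gates, each with input weights, recurrent weights, and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    # One weight per input-output pair plus one bias per output unit
    return input_dim * units + units

layers = [
    ("LSTM(50) on 1 input feature", lstm_params(50, 1)),
    ("LSTM(50) on 50 inputs", lstm_params(50, 50)),
    ("Dense(25)", dense_params(25, 50)),
    ("Dense(1)", dense_params(1, 25)),
]

for name, count in layers:
    print(f"{name}: {count}")

total = sum(count for _, count in layers)
print("total:", total)
```

Most of the parameters sit in the second LSTM layer, because its input is the 50-dimensional output of the first layer rather than a single feature.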


## Splitting the Data into Training and Test Sets     

Before training the model, you need to split your dataset into training and testing sets. This allows you to train your model on one part of the data (training set) and test its performance on unseen data (test set).

In [7]:
from sklearn.model_selection import train_test_split

# Splitting data into 80% training and 20% testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)


### Explanation:    

- *train_test_split:* This function splits your data into training and testing sets.      
- *test_size=0.2:* This means 80% of the data is used for training, and 20% is used for testing.      
- *shuffle=False:* Since time series data is sequential, it’s important not to shuffle it when splitting.     
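
The training cell itself does not appear in this section. The sketch below shows one way the pieces could fit together, under several assumptions: it uses a synthetic sine wave instead of the Bitcoin data, a short window of 10 steps, a deliberately small model, and only two epochs so that it runs quickly. It also wraps the arrays in `tf.data.Dataset` pipelines, which is optional, since Keras accepts NumPy arrays directly.

```python
import numpy as np
import tensorflow as tf

# Toy stand-in for the scaled 'Close' series (a noisy sine wave)
rng = np.random.default_rng(0)
series = np.sin(np.linspace(0, 20, 300)) + 0.05 * rng.standard_normal(300)
series = series.astype("float32")

time_step = 10  # shorter than the notebook's 60 to keep the demo fast

# Build inputs of shape (samples, time_step, 1) and next-value targets
X = np.stack([series[i:i + time_step] for i in range(len(series) - time_step)])
y = series[time_step:]
X = X[..., np.newaxis]

# Chronological 80/20 split: earlier windows for training, later ones for testing
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Wrap the arrays in tf.data pipelines that yield batches of 32 samples
train_ds = (tf.data.Dataset.from_tensor_slices((X_train, y_train))
            .batch(32)
            .prefetch(tf.data.AUTOTUNE))
test_ds = tf.data.Dataset.from_tensor_slices((X_test, y_test)).batch(32)

# Small model for the demo only; the notebook's model is larger
model = tf.keras.Sequential([
    tf.keras.Input(shape=(time_step, 1)),
    tf.keras.layers.LSTM(8),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mean_squared_error")

history = model.fit(train_ds, validation_data=test_ds, epochs=2, verbose=0)
print(history.history["loss"])      # training loss per epoch
print(history.history["val_loss"])  # loss on the held-out windows per epoch
```

For the real data, the number of epochs and the batch size are tuning choices that this sketch does not try to settle.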

## Results and Evaluation    
After training the model, let's evaluate its performance and visualize the results.

In [11]:
import matplotlib.pyplot as plt

# Make predictions
predictions = model.predict(X_test)

# Inverse scaling to get actual price predictions
predictions = scaler.inverse_transform(predictions)

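# Inverse-scale the test targets so both curves are in price units
actual = scaler.inverse_transform(y_test.reshape(-1, 1))

# Plot actual vs. predicted prices over the test period
plt.plot(actual, label='Actual')
plt.plot(predictions, label='Predicted')
plt.legend()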
plt.show()
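
A plot gives a visual impression, but a numeric error metric is easier to compare across runs. The sketch below computes root mean squared error (RMSE) and mean absolute error (MAE) in price units. The `rmse` and `mae` helpers and the sample numbers are illustrative assumptions, not results from this notebook.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two arrays of equal length."""
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def mae(actual, predicted):
    """Mean absolute error between two arrays of equal length."""
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    return float(np.mean(np.abs(actual - predicted)))

# Made-up prices; predictions have shape (n, 1), like model.predict output
actual_prices = np.array([300.0, 310.0, 305.0, 320.0])
predicted_prices = np.array([[298.0], [312.0], [309.0], [316.0]])

print("RMSE:", rmse(actual_prices, predicted_prices))
print("MAE:", mae(actual_prices, predicted_prices))
```

Both metrics should be computed on inverse-scaled values so that the error reads in the same units as the price.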