# Task 6.


### RNN Task: Predict Gold Prices for Next Day, Week, and Month

### Objective
Use a Recurrent Neural Network (RNN) to predict gold prices for the next day, next week, and next month using the "Gold Price Prediction Dataset" from Kaggle.

### Steps

1. **Data Acquisition**
   - Import necessary libraries (pandas, numpy, TensorFlow/Keras, etc.).
   - Load the dataset: `gold_data = pd.read_csv('path_to_FINAL_USO.csv')`.

2. **Data Preprocessing**
   - Perform data cleaning and preprocessing.
   - Normalize the data if necessary.
   - Create appropriate time series sequences for RNN.

3. **Model Development**
   - Define an RNN model using PyTorch or Keras.
   - Split the data into training and testing sets.

4. **Training the Model**
   - Train the model on the training dataset.
   - Use time steps as per your prediction needs (next day, week, month).

5. **Model Evaluation**
   - Evaluate the model's performance on the test set.
   - Adjust model parameters and architecture as needed.

6. **Prediction**
   - Make predictions for the next day, week, and month.
   - Analyze the results and accuracy of predictions.

7. **Conclusion**
   - Document your findings.
   - Suggest potential improvements or further experiments.
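
Steps 2 to 4 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method used later in this notebook: the helper names `make_windows` and `chronological_split` are hypothetical, the horizons of 1, 5 and 21 trading days stand in for "next day, week, month", and the 80/20 split is an arbitrary choice.

```python
import numpy as np

# Assumed horizons in trading days: next day, next week, next month.
HORIZONS = (1, 5, 21)

def make_windows(prices, window, horizons=HORIZONS):
    """Return (X, Y): X[i] holds `window` consecutive prices and
    Y[i, k] is the price horizons[k] days after the end of that window."""
    prices = np.asarray(prices, dtype=np.float32)
    max_h = max(horizons)
    n_samples = len(prices) - window - max_h + 1
    if n_samples <= 0:
        raise ValueError("series too short for this window and horizon")
    X = np.stack([prices[i:i + window] for i in range(n_samples)])
    Y = np.stack([[prices[i + window + h - 1] for h in horizons]
                  for i in range(n_samples)])
    return X, Y

def chronological_split(X, Y, train_fraction=0.8):
    """Split without shuffling so test windows start later in time than training windows."""
    cut = int(len(X) * train_fraction)
    return (X[:cut], Y[:cut]), (X[cut:], Y[cut:])

# The toy series 0, 1, ..., 49 stands in for the price column.
X, Y = make_windows(np.arange(50), window=3)
(train_X, train_Y), (test_X, test_Y) = chronological_split(X, Y)
print(X.shape, Y.shape, len(train_X), len(test_X))
```

Predicting all three horizons directly from one window avoids feeding predictions back into the model, at the cost of losing the last 21 samples of the series as training targets.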

### Dataset
Download from: [Kaggle - Gold Price Prediction Dataset](https://www.kaggle.com/datasets/sid321axn/gold-price-prediction-dataset?select=FINAL_USO.csv)




In [1]:
import pandas as pd

# Load the dataset; adjust the path to wherever FINAL_USO.csv was saved
gold = pd.read_csv('C:/FINAL_USO.csv', sep=',', header=0)
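
Because the later cells rely on row order as time order, it may help to parse the `Date` column and sort by it when loading. The sketch below is one possible way to do that; the function name `load_prices` is hypothetical, and a small in-memory CSV stands in for the real file so the example runs on its own.

```python
import io
import pandas as pd

def load_prices(source):
    """Read the CSV, parse the Date column, and return rows in chronological order."""
    frame = pd.read_csv(source, parse_dates=["Date"])
    return frame.sort_values("Date").reset_index(drop=True)

# Two illustrative rows, deliberately given in reverse date order.
sample = io.StringIO(
    "Date,Adj Close\n"
    "2011-12-16,155.229996\n"
    "2011-12-15,152.330002\n"
)
gold_sorted = load_prices(sample)
print(gold_sorted)
```

With the real file, the call would be `load_prices('C:/FINAL_USO.csv')` or whatever path applies locally.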

In [2]:
gold

Unnamed: 0,Date,Open,High,Low,Close,Adj Close,Volume,SP_open,SP_high,SP_low,...,GDX_Low,GDX_Close,GDX_Adj Close,GDX_Volume,USO_Open,USO_High,USO_Low,USO_Close,USO_Adj Close,USO_Volume
0,2011-12-15,154.740005,154.949997,151.710007,152.330002,152.330002,21521900,123.029999,123.199997,121.989998,...,51.570000,51.680000,48.973877,20605600,36.900002,36.939999,36.049999,36.130001,36.130001,12616700
1,2011-12-16,154.309998,155.369995,153.899994,155.229996,155.229996,18124300,122.230003,122.949997,121.300003,...,52.040001,52.680000,49.921513,16285400,36.180000,36.500000,35.730000,36.270000,36.270000,12578800
2,2011-12-19,155.479996,155.860001,154.360001,154.869995,154.869995,12547200,122.059998,122.320000,120.029999,...,51.029999,51.169998,48.490578,15120200,36.389999,36.450001,35.930000,36.200001,36.200001,7418200
3,2011-12-20,156.820007,157.429993,156.580002,156.979996,156.979996,9136300,122.180000,124.139999,120.370003,...,52.369999,52.990002,50.215282,11644900,37.299999,37.610001,37.220001,37.560001,37.560001,10041600
4,2011-12-21,156.979996,157.529999,156.130005,157.160004,157.160004,11996100,123.930000,124.360001,122.750000,...,52.419998,52.959999,50.186852,8724300,37.669998,38.240002,37.520000,38.110001,38.110001,10728000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1713,2018-12-24,119.570000,120.139999,119.570000,120.019997,120.019997,9736400,239.039993,240.839996,234.270004,...,20.650000,21.090000,21.090000,60507000,9.490000,9.520000,9.280000,9.290000,9.290000,21598200
1714,2018-12-26,120.620003,121.000000,119.570000,119.660004,119.660004,14293500,235.970001,246.179993,233.759995,...,20.530001,20.620001,20.620001,76365200,9.250000,9.920000,9.230000,9.900000,9.900000,40978800
1715,2018-12-27,120.570000,120.900002,120.139999,120.570000,120.570000,11874400,242.570007,248.289993,238.960007,...,20.700001,20.969999,20.969999,52393000,9.590000,9.650000,9.370000,9.620000,9.620000,36578700
1716,2018-12-28,120.800003,121.080002,120.720001,121.059998,121.059998,6864700,249.580002,251.399994,246.449997,...,20.570000,20.600000,20.600000,49835000,9.540000,9.650000,9.380000,9.530000,9.530000,22803400


In [3]:
column = gold.columns
column

Index(['Date', 'Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume',
       'SP_open', 'SP_high', 'SP_low', 'SP_close', 'SP_Ajclose', 'SP_volume',
       'DJ_open', 'DJ_high', 'DJ_low', 'DJ_close', 'DJ_Ajclose', 'DJ_volume',
       'EG_open', 'EG_high', 'EG_low', 'EG_close', 'EG_Ajclose', 'EG_volume',
       'EU_Price', 'EU_open', 'EU_high', 'EU_low', 'EU_Trend', 'OF_Price',
       'OF_Open', 'OF_High', 'OF_Low', 'OF_Volume', 'OF_Trend', 'OS_Price',
       'OS_Open', 'OS_High', 'OS_Low', 'OS_Trend', 'SF_Price', 'SF_Open',
       'SF_High', 'SF_Low', 'SF_Volume', 'SF_Trend', 'USB_Price', 'USB_Open',
       'USB_High', 'USB_Low', 'USB_Trend', 'PLT_Price', 'PLT_Open', 'PLT_High',
       'PLT_Low', 'PLT_Trend', 'PLD_Price', 'PLD_Open', 'PLD_High', 'PLD_Low',
       'PLD_Trend', 'RHO_PRICE', 'USDI_Price', 'USDI_Open', 'USDI_High',
       'USDI_Low', 'USDI_Volume', 'USDI_Trend', 'GDX_Open', 'GDX_High',
       'GDX_Low', 'GDX_Close', 'GDX_Adj Close', 'GDX_Volume', 'USO_Open',
       'USO_Hig

In [4]:
index = gold.index
index

RangeIndex(start=0, stop=1718, step=1)

In [5]:
gold.isna()

Unnamed: 0,Date,Open,High,Low,Close,Adj Close,Volume,SP_open,SP_high,SP_low,...,GDX_Low,GDX_Close,GDX_Adj Close,GDX_Volume,USO_Open,USO_High,USO_Low,USO_Close,USO_Adj Close,USO_Volume
0,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
1,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
2,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
3,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
4,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1713,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
1714,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
1715,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
1716,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
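
The full boolean frame is hard to scan for a table this wide. A per-column count of missing values is usually easier to read. The sketch below uses a toy frame as a stand-in (an assumption made so the example is self-contained); in the notebook the same two calls would be applied to `gold`.

```python
import numpy as np
import pandas as pd

# Toy frame with one missing value; in the notebook this would be `gold`.
frame = pd.DataFrame({"Close": [1.0, np.nan, 3.0], "Volume": [10, 20, 30]})

missing_per_column = frame.isna().sum()          # number of NaN cells in each column
total_missing = int(missing_per_column.sum())    # single number for the whole frame

print(missing_per_column[missing_per_column > 0])
print("total missing:", total_missing)
```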


In [6]:
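# Note: the SP_* columns appear to hold the S&P 500 series; the gold price itself is in 'Close' / 'Adj Close'.
# X is overwritten in cell 9; only y (the adjusted close) feeds the model.
X = gold['SP_close']
y = gold['SP_Ajclose']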

In [7]:
X

0       122.180000
1       121.589996
2       120.290001
3       123.930000
4       124.169998
           ...    
1713    234.339996
1714    246.179993
1715    248.070007
1716    247.750000
1717    249.919998
Name: SP_close, Length: 1718, dtype: float64

In [8]:
y

0       105.441238
1       105.597549
2       104.468536
3       107.629784
4       107.838242
           ...    
1713    234.339996
1714    246.179993
1715    248.070007
1716    247.750000
1717    249.919998
Name: SP_Ajclose, Length: 1718, dtype: float64

In [9]:
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt

dataset_len = len(gold)

# Hyperparameter: number of past days in each input window
time_frame_window_size = 3

# Work on a plain NumPy array so that slicing is purely positional
prices = y.to_numpy()

# Each sample is a window of N consecutive prices; its target is the price on the following day
X_np = np.array([prices[n-time_frame_window_size:n] for n in range(time_frame_window_size, dataset_len)])
X = torch.tensor(X_np, dtype=torch.float).unsqueeze(-1)
Y = torch.tensor(prices[time_frame_window_size:].reshape(-1, 1), dtype=torch.float)


print("X.shape", X.shape)
print("Y.shape", Y.shape)

print("X[0]", X[0])
print("X[1]", X[1])
print("Y[0]", Y[0])

X.shape torch.Size([1715, 3, 1])
Y.shape torch.Size([1715, 1])
X[0] tensor([[105.4412],
        [105.5975],
        [104.4685]])
X[1] tensor([[105.5975],
        [104.4685],
        [107.6298]])
Y[0] tensor([107.6298])
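
The windows above hold raw prices of roughly 100 to 250. `nn.RNN` uses a tanh nonlinearity by default, which tends to saturate for inputs of that size, so scaling the series first is usually worthwhile. The sketch below shows min-max scaling fitted on the training portion only. The class name `MinMaxScaler1D` and the split point are assumptions made for illustration; this step is not part of the notebook.

```python
import numpy as np

class MinMaxScaler1D:
    """Scale a 1-D series to [0, 1] using statistics from the data passed to fit()."""

    def fit(self, values):
        values = np.asarray(values, dtype=np.float64)
        self.low = float(values.min())
        self.high = float(values.max())
        if self.high == self.low:
            raise ValueError("cannot scale a constant series")
        return self

    def transform(self, values):
        return (np.asarray(values, dtype=np.float64) - self.low) / (self.high - self.low)

    def inverse_transform(self, scaled):
        return np.asarray(scaled, dtype=np.float64) * (self.high - self.low) + self.low

# First values of SP_Ajclose as printed by the notebook, used here as a toy series.
series = np.array([105.441238, 105.597549, 104.468536, 107.629784, 107.838242])
train = series[:4]                       # assumed split: fit on the earliest values only
scaler = MinMaxScaler1D().fit(train)
scaled = scaler.transform(series)        # later values may fall slightly outside [0, 1]
restored = scaler.inverse_transform(scaled)
print(scaled)
```

Fitting on the training portion alone keeps information about future price levels out of the inputs. Predictions made on the scaled series go through `inverse_transform` to recover prices.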


In [10]:
# Model parameters
input_size = time_frame_window_size  # Each RNN step receives a window of N past prices as its feature vector
hidden_size = 6  # Size of the RNN's hidden state
output_size = 1  # We want to output one price

# Define the RNN model
class StockRNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        # Simple (Elman) RNN layer with the given input and hidden sizes.
        # 'batch_first=True' only matters for 3-D input of shape (batch, seq, feature).
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)  # Linear layer mapping each hidden state to a price

    def forward(self, x):
        # Drop the trailing size-1 dimension if present: (samples, window, 1) -> (samples, window).
        # nn.RNN treats this 2-D tensor as one unbatched sequence of `samples` steps,
        # each with `window` features.
        x = x.squeeze(-1)
        out, _ = self.rnn(x)  # Hidden state at every step: (samples, hidden_size)
        res = self.fc(out)  # One predicted price per step: (samples, 1)
        return res
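
In the cell above, `forward` squeezes the input to shape `(1715, 3)`, which `nn.RNN` reads as a single unbatched sequence of 1715 steps with 3 features each. A more common layout treats every window as its own short sequence with one price per time step and predicts from the final hidden state. The sketch below shows that alternative; the class name `WindowRNN` is hypothetical and the random batch is a stand-in for scaled price windows.

```python
import torch
import torch.nn as nn

class WindowRNN(nn.Module):
    """Hypothetical variant: reads one price per time step and predicts from the last step."""

    def __init__(self, hidden_size=6, output_size=1):
        super().__init__()
        # input_size=1: each time step carries a single price
        self.rnn = nn.RNN(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # x: (batch, window, 1)
        out, _ = self.rnn(x)             # out: (batch, window, hidden_size)
        return self.fc(out[:, -1, :])    # use the hidden state after the final step

torch.manual_seed(0)
demo_model = WindowRNN()
demo_batch = torch.rand(4, 3, 1)         # 4 windows of 3 scaled prices each
demo_out = demo_model(demo_batch)
print(demo_out.shape)                    # torch.Size([4, 1])
```

With this layout the tensor `X` of shape `(1715, 3, 1)` can be passed in directly, without the `squeeze`.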
    
   

In [11]:
# Instantiate the model
model = StockRNN(input_size, hidden_size, output_size)

# Loss and optimizer
criterion = nn.MSELoss()  # Mean Squared Error Loss
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)

# Training the model
for epoch in range(20):
    optimizer.zero_grad()
    output = model(X)  # Forward pass
    loss = criterion(output, Y)  # Compute loss
    loss.backward()  # Backpropagation
    optimizer.step()  # Update weights

    # Print loss every 2 epochs
    if (epoch+1) % 2 == 0:
        print(f'Epoch [{epoch+1}/20], Loss: {loss.item():.4f}')

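# Predict the following days by feeding each prediction back into the window

# X[-1] has shape (3, 1); transposing gives a single (1, 3) input row.
# Its target (249.92) is already in the data, so the first prediction is for the last known day.
last_working_week = X[-1].T
print("last week shape => ", last_working_week.shape)

# Intended behaviour: shift the window one step and append the predicted price for the NEXT day.
for i in range(time_frame_window_size):
    print("last_working_week => ", last_working_week)
    predicted_price = model(last_working_week)
    print(f"Predicted price for tomorrow: {predicted_price.item():.2f}")

    # Known issue: this flattens the whole X tensor and appends the sample count
    # (len(X) + 1 = 1716) instead of the predicted price, so the next window contains 1716.0.
    x = np.append(X, len(X)+1)
    y = np.append(y, predicted_price.detach().numpy()[0])

    print("Updated X => ", x)
    print("Updated Y => ", y)

    # Wrapping torch.from_numpy(...) in torch.tensor(...) triggers a copy-construct UserWarning
    last_working_week = torch.tensor(torch.from_numpy(x[-time_frame_window_size:]), dtype=torch.float)
    last_working_week = last_working_week.reshape(1,time_frame_window_size)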



Epoch [2/20], Loss: 38984.7227
Epoch [4/20], Loss: 38568.9141
Epoch [6/20], Loss: 38095.0547
Epoch [8/20], Loss: 37602.8984
Epoch [10/20], Loss: 37103.5820
Epoch [12/20], Loss: 36601.8555
Epoch [14/20], Loss: 36100.2383
Epoch [16/20], Loss: 35600.2188
Epoch [18/20], Loss: 35102.7773
Epoch [20/20], Loss: 34608.5742
last week shape =>  torch.Size([1, 3])
last_working_week =>  tensor([[246.1800, 248.0700, 247.7500]])
Predicted price for tomorrow: 13.42
Updated X =>  [ 105.4412384   105.59754944  104.46853638 ...  248.07000732  247.75
 1716.        ]
Updated Y =>  [105.441238   105.597549   104.468536   ... 247.75       249.919998
  13.41615772]
last_working_week =>  tensor([[ 248.0700,  247.7500, 1716.0000]])
Predicted price for tomorrow: 13.42
Updated X =>  [ 105.4412384   105.59754944  104.46853638 ...  248.07000732  247.75
 1716.        ]
Updated Y =>  [105.441238   105.597549   104.468536   ... 249.919998    13.41615772
  13.41615772]
last_working_week =>  tensor([[ 248.0700,  247.750

  last_working_week = torch.tensor(torch.from_numpy(x[-time_frame_window_size:]), dtype=torch.float)
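
The output above shows the window picking up the value 1716 (the sample count) rather than the predicted price. A recursive forecast normally starts from the most recent known prices and appends each prediction to the window before predicting again. The sketch below illustrates that idea only. The function `recursive_forecast` and its `predict_next` callable are hypothetical, and a simple mean stands in for the trained model.

```python
import numpy as np

def recursive_forecast(predict_next, history, window, steps):
    """Forecast `steps` values by feeding each prediction back into the window.

    predict_next: callable mapping a 1-D array of length `window` to a number
    (a stand-in for the trained model; hypothetical interface).
    """
    buffer = list(np.asarray(history, dtype=np.float64)[-window:])
    forecasts = []
    for _ in range(steps):
        next_value = float(predict_next(np.array(buffer)))
        forecasts.append(next_value)
        buffer = buffer[1:] + [next_value]   # drop the oldest price, append the prediction
    return forecasts

# Last three SP_Ajclose values from the notebook; the mean is a placeholder predictor.
last_prices = [248.070007, 247.750000, 249.919998]
week_ahead = recursive_forecast(lambda w: w.mean(), last_prices, window=3, steps=5)
print(week_ahead)
```

Calling it with `steps=1`, `steps=5` and `steps=21` would give next-day, next-week and next-month forecasts under the trading-day assumption. Errors compound with each step, so longer horizons are typically less reliable.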
