# Regression Model Version

# Reference Material

https://www.scribd.com/document/826659027/Novel-Optimization-Approach-for-Stock-Price-Forecasting-Using

# NOTES
The LSTM's ONLY ROLE is to predict prices, NOT to decide whether to buy / sell / hold.

A separate function takes the LSTM's predictions, analyzes the rise / fall trends, then decides whether to buy / sell / hold.

#-
While the LSTM aims to predict prices, WE DO NOT NECESSARILY CARE about matching prices. What we care about is whether the rise / fall trends match.

The purpose of the LSTM predictions is to know whether the stock will rise / fall today / tomorrow. Predicting the next price accurately helps, but what matters more is that the predicted rise / fall trends resemble the actual stock data. That way we can rely on the LSTM's predictions to decide whether to buy / sell / hold.

Ex: The predicted stock prices could all be offset from the real data by $1000. The rise / fall trends would still be relatively the same. While the prices were not 'accurately' predicted, the trends were, and the trends are what wind up getting used.
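
To make the offset example concrete, here is a minimal sketch of a direction-only check. The function name `directional_accuracy` and the sample prices are illustrative assumptions, not part of the notebook. It compares only the sign of each day-over-day move.

```python
def directional_accuracy(actual, predicted):
    """Fraction of day-over-day moves where predicted and actual agree in direction."""
    def sign(x):
        return (x > 0) - (x < 0)

    total = len(actual) - 1
    matches = 0
    for i in range(total):
        actual_move = sign(actual[i + 1] - actual[i])
        predicted_move = sign(predicted[i + 1] - predicted[i])
        if actual_move == predicted_move:
            matches += 1
    return matches / total


# Hypothetical prices: every prediction is offset by a constant $1000.
actual = [100.0, 102.0, 101.0, 105.0, 104.0]
predicted = [p + 1000.0 for p in actual]

print(directional_accuracy(actual, predicted))  # 1.0
```

Note that a constant offset preserves the direction of each move but not its percentage size, so a percent-based threshold would see smaller moves on the offset series.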

I couldn't treat this as just a Binary Classification problem because I also needed to know the rate of change of the rise / fall, which Binary Classification couldn't provide.

#-
This logic USES THE 2ND-TO-LAST SIGNAL to determine the buy / sell / hold decision for 'today'. The last signal always defaults to 'Hold' because there is no next-day prediction to compare it against, and I cannot consistently produce a scenario where we do not 'Hold' on the 'present day'. The buy / sell / hold function still tries to decide for every possible day, so the decision for 'yesterday' usually winds up being the same as the decision for 'today'.

I am assuming that the agent will implement some form of 'cooldown period' to prevent overbuying / overselling.
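
One possible shape for such a cooldown, as a sketch only: the agent's real mechanism is not defined in this notebook, and `apply_cooldown` and `cooldown_days` are hypothetical names. After any Buy or Sell, the next few signals are forced to Hold.

```python
def apply_cooldown(signals, cooldown_days=3):
    """Force Hold for `cooldown_days` signals after each Buy or Sell."""
    filtered = []
    days_remaining = 0
    for sig in signals:
        if days_remaining > 0:
            # Still cooling down from the previous trade.
            filtered.append("Hold")
            days_remaining -= 1
        elif sig in ("Buy", "Sell"):
            filtered.append(sig)
            days_remaining = cooldown_days
        else:
            filtered.append("Hold")
    return filtered


raw = ["Hold", "Sell", "Sell", "Sell", "Sell", "Sell", "Hold"]
print(apply_cooldown(raw, cooldown_days=2))
# ['Hold', 'Sell', 'Hold', 'Hold', 'Sell', 'Hold', 'Hold']
```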

#-
The model is trained on the latest 2 years' worth of data (65% Training - 35% Testing).
The data is chronologically ordered, so the testing data is always the 'latest' stock data.
This model was designed to be trained and tested on more than 100 data points. With a larger testing dataset, the accuracy ranges between 70% and 90%, but it drops as the testing set gets smaller. As such, I cannot test on just the latest 7 days' worth of data without getting poor accuracy metrics and unreliable rise / fall trends.
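
The split and the 100-point requirement can be sketched as follows. This assumes the training pipeline windows the data the same way the inference cell does (100 previous days per prediction). `chronological_split` and `make_windows` are hypothetical helpers, and the integer series stands in for real closing prices.

```python
def chronological_split(series, train_frac=0.65):
    """Split without shuffling, so the test slice is always the most recent data."""
    split = int(len(series) * train_frac)
    return series[:split], series[split:]


def make_windows(series, lookback=100):
    """Pair each value with the `lookback` values that precede it."""
    X, y = [], []
    for i in range(lookback, len(series)):
        X.append(series[i - lookback:i])
        y.append(series[i])
    return X, y


prices = list(range(500))  # stand-in for roughly 2 years of daily closes
train, test = chronological_split(prices)
X_test, y_test = make_windows(test)

print(len(train), len(test), len(X_test))  # 325 175 75
```

Under this windowing, a test slice of 100 points or fewer yields no windows at all, which is one reason a 7-day test set is not workable here.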

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
%pip install yfinance --upgrade --no-cache-dir



In [13]:
import numpy as np
import json
import yfinance as yf
from keras.models import load_model
from sklearn.preprocessing import MinMaxScaler

# -----------------------------------------------------
# 1. Load the trained LSTM model
# -----------------------------------------------------
model = load_model('lstm_model.keras')

# -----------------------------------------------------
# 2. Rebuild the scaler from saved JSON
# -----------------------------------------------------
scaler = MinMaxScaler()

with open('scaler.json', 'r') as f:
    scaler_params = json.load(f)

if 'min_' in scaler_params:
    scaler.min_ = np.array(scaler_params['min_'])
if 'scale_' in scaler_params:
    scaler.scale_ = np.array(scaler_params['scale_'])
if 'data_min_' in scaler_params:
    scaler.data_min_ = np.array(scaler_params['data_min_'])
if 'data_max_' in scaler_params:
    scaler.data_max_ = np.array(scaler_params['data_max_'])
if 'data_range_' in scaler_params:
    scaler.data_range_ = np.array(scaler_params['data_range_'])
if 'n_features_in_' in scaler_params:
    # Key must match the one checked above (trailing underscore included)
    scaler.n_features_in_ = scaler_params['n_features_in_']

# -----------------------------------------------------
# 3. Download stock data (AAPL)
#    You can adjust the period as needed
# -----------------------------------------------------
data = yf.download('AAPL', start='2020-01-01', end=None)[['Close']].dropna()

# -----------------------------------------------------
# 4. Scale the closing prices
# -----------------------------------------------------
scaled = scaler.transform(data)

# -----------------------------------------------------
# 5. Build sequences for the last 10 days before today
#    Each prediction needs 100 previous days
# -----------------------------------------------------
X_last10 = []
n = len(scaled)

for i in range(n - 10, n):
    seq = scaled[i-100:i]          # last 100 days before day i
    X_last10.append(seq)

X_last10 = np.array(X_last10).reshape(10, 100, 1)

# -----------------------------------------------------
# 6. Predict the last 10 days
# -----------------------------------------------------
pred_scaled_last10 = model.predict(X_last10)
pred_last10 = scaler.inverse_transform(pred_scaled_last10)

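# Actual closing prices for the last 10 days, flattened to 1D
# (data['Close'] can come back 2D when yfinance returns MultiIndex columns)
actual_last10 = data['Close'].to_numpy()[-10:].ravel()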

print("\n=== LAST 10-DAY BACKTEST PREDICTIONS ===")
for i in range(10):
    actual_val = float(actual_last10[i])
    pred_val = float(pred_last10[i][0])
    print(f"Day {i+1}: Actual = {actual_val:.2f}, Predicted = {pred_val:.2f}")


# -----------------------------------------------------
# 7. Predict TOMORROW (using the most recent 100 days)
# -----------------------------------------------------
last_100 = scaled[-100:].reshape(1, 100, 1)
scaled_pred_tomorrow = model.predict(last_100)
predicted_tomorrow = scaler.inverse_transform(scaled_pred_tomorrow)[0][0]

print("\nPredicted NEXT closing price:", predicted_tomorrow)


[*********************100%***********************]  1 of 1 completed


1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 310ms/step





=== LAST 10-DAY BACKTEST PREDICTIONS ===
Day 1: Actual = 266.25, Predicted = 256.79
Day 2: Actual = 271.49, Predicted = 255.78
Day 3: Actual = 275.92, Predicted = 255.18
Day 4: Actual = 276.97, Predicted = 255.43
Day 5: Actual = 277.55, Predicted = 256.53
Day 6: Actual = 278.85, Predicted = 258.22
Day 7: Actual = 283.10, Predicted = 260.29
Day 8: Actual = 286.19, Predicted = 262.74
Day 9: Actual = 284.15, Predicted = 265.53
Day 10: Actual = 280.70, Predicted = 268.11
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 247ms/step

Predicted NEXT closing price: 269.87833
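
Step 2 above rebuilds the scaler from saved parameters. For a single feature with the default (0, 1) range, `MinMaxScaler` computes `x * scale_ + min_` on transform and `(x - min_) / scale_` on inverse transform, so those two values fully determine the mapping. A pure-Python sketch with made-up bounds (64 and 320 are illustrative, not the values in `scaler.json`):

```python
# Hypothetical fitted bounds for one feature.
data_min, data_max = 64.0, 320.0

# How the two key parameters are derived for feature_range=(0, 1).
scale_ = 1.0 / (data_max - data_min)
min_ = 0.0 - data_min * scale_


def transform(x):
    return x * scale_ + min_


def inverse_transform(x_scaled):
    return (x_scaled - min_) / scale_


scaled_value = transform(192.0)
print(scaled_value)                     # 0.5
print(inverse_transform(scaled_value))  # 192.0

# A price above the fitted maximum maps above 1.0 (no clipping by default).
print(transform(448.0))                 # 1.5
```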


In [14]:
import numpy as np

# -----------------------------------------------------
# Generate Buy/Sell/Hold signals for the last 10 days
# using the predicted prices (pred_last10)
# -----------------------------------------------------

threshold = 0.005  # 0.5% move
pred_prices = pred_last10.flatten()   # shape (10,)
actual_prices = np.asarray(actual_last10).ravel()   # ensure 1D, shape (10,)

signals = []

for i in range(len(pred_prices)):
    if i < len(pred_prices) - 1:
        # % change from predicted today -> predicted tomorrow
        future_change = (pred_prices[i+1] - pred_prices[i]) / pred_prices[i]

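        # Mapping used here: predicted rise -> Sell, predicted fall -> Buy,
        # anything within +/- threshold -> Hold
        if future_change > threshold:
            sig = "Sell"
        elif future_change < -threshold:
            sig = "Buy"
        else:
            sig = "Hold"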
    else:
        # Last day has no "next day" to compare to, so default to Hold
        sig = "Hold"

    signals.append(sig)

# -----------------------------------------------------
# Print a nice summary for the 10 days
# -----------------------------------------------------
print("=== LAST 10 DAYS: SIGNALS BASED ON PREDICTED PRICES ===")
print("Day | Actual   | Predicted | Signal")
print("--------------------------------------")
for i in range(10):
    a = float(actual_prices[i])
    p = float(pred_prices[i])
    print(f"{i+1:3d} | {a:8.2f} | {p:9.2f} | {signals[i]}")

# -----------------------------------------------------
# "today's" decision (second-to-last day)
# -----------------------------------------------------
current_decision = signals[-2]  # signals[-1] always defaults to Hold, so use the one before it
print("\nSuggested action for 'today':", current_decision)

# one-hot encoding of the decision (Buy/Sell/Hold)
if current_decision == "Buy":
    one_hot = [1, 0, 0]
elif current_decision == "Sell":
    one_hot = [0, 1, 0]
else:
    one_hot = [0, 0, 1]

print("One-hot decision (Buy, Sell, Hold):", one_hot)


=== LAST 10 DAYS: SIGNALS BASED ON PREDICTED PRICES ===
Day | Actual   | Predicted | Signal
--------------------------------------
  1 |   266.25 |    256.79 | Hold
  2 |   271.49 |    255.78 | Hold
  3 |   275.92 |    255.18 | Hold
  4 |   276.97 |    255.43 | Hold
  5 |   277.55 |    256.53 | Sell
  6 |   278.85 |    258.22 | Sell
  7 |   283.10 |    260.29 | Sell
  8 |   286.19 |    262.74 | Sell
  9 |   284.15 |    265.53 | Sell
 10 |   280.70 |    268.11 | Hold

Suggested action for 'today': Sell
One-hot decision (Buy, Sell, Hold): [0, 1, 0]
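
The last day defaults to Hold because the loop has no next prediction to compare against, yet the previous cell already produces `predicted_tomorrow`. One possible way to give the final day a real signal is to compare the two. This is a sketch, not the notebook's logic: `signal_from_change` is a hypothetical helper that reuses the same 0.5% threshold and the same rise / fall mapping, with the two prices copied from the outputs above.

```python
def signal_from_change(current_pred, next_pred, threshold=0.005):
    """Same mapping as the notebook: predicted rise -> Sell, predicted fall -> Buy."""
    change = (next_pred - current_pred) / current_pred
    if change > threshold:
        return "Sell"
    if change < -threshold:
        return "Buy"
    return "Hold"


# Values copied from the printed outputs above.
last_pred = 268.11               # predicted price for day 10
predicted_tomorrow = 269.87833   # predicted next closing price

print(signal_from_change(last_pred, predicted_tomorrow))  # Sell
```

Here the predicted change is about 0.66%, which clears the 0.5% threshold.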


