# LSTM

# Unlocking Market Insight with LSTM

## Introduction
"What if we could capture the hidden memory of the market?"

Markets are shaped by patterns, momentum, and historical context. Many models struggle to use that context, whereas LSTMs (Long Short-Term Memory networks) are designed to retain information across long sequences. Let’s explore how LSTMs can be applied to stock market prediction.

---

## What is LSTM?
LSTM stands for Long Short-Term Memory. It is a special kind of Recurrent Neural Network (RNN) capable of learning long-term dependencies.

### Key Characteristics:
- **Recurrent Neural Network (RNN)** foundation, designed for sequence-based data.
- **LSTM cells** use gates (input, forget, output) to control information flow.
- Enables **selective memory retention** and **long-term dependency learning**.

---

## How Does LSTM Work?
LSTM networks process data through a series of steps within each LSTM cell:
- **Input Gate**: Decides how much of the new candidate information is written to the cell state.
- **Forget Gate**: Decides which parts of the existing cell state to discard.
- **Output Gate**: Decides how much of the cell state is exposed as the hidden state, which is the cell's output.

These mechanisms allow the network to maintain and update memory over time, which mitigates the vanishing gradient problem common in standard RNNs.
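
To make the gate mechanics concrete, here is a minimal NumPy sketch of a single LSTM time step. It is an illustration only, not how Keras implements its `LSTM` layer; the function name `lstm_step`, the stacked weight layout (input, forget, candidate, output), and the random weights are assumptions of this sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step (illustrative sketch).

    W: (4*hidden, input_dim), U: (4*hidden, hidden), b: (4*hidden,)
    Assumed gate order in the stacked weights: input, forget, candidate, output.
    """
    hidden = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0 * hidden:1 * hidden])   # input gate: how much new information to write
    f = sigmoid(z[1 * hidden:2 * hidden])   # forget gate: how much old cell state to keep
    g = np.tanh(z[2 * hidden:3 * hidden])   # candidate values for the cell state
    o = sigmoid(z[3 * hidden:4 * hidden])   # output gate: how much cell state to expose
    c_t = f * c_prev + i * g                # updated cell state (the "memory")
    h_t = o * np.tanh(c_t)                  # updated hidden state (the output)
    return h_t, c_t

# Run the step over a toy sequence of 20 scalar inputs with random weights.
rng = np.random.default_rng(0)
input_dim, hidden = 1, 4
W = rng.normal(scale=0.1, size=(4 * hidden, input_dim))
U = rng.normal(scale=0.1, size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)
h, c = np.zeros(hidden), np.zeros(hidden)
for x_t in rng.normal(size=(20, input_dim)):
    h, c = lstm_step(x_t, h, c, W, U, b)
print(h.shape, c.shape)
```

The additive update `c_t = f * c_prev + i * g` is the part that lets information, and gradients, pass through many time steps without being repeatedly squashed.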

---

## Why LSTM for Stock Markets?
Stock markets are inherently temporal and noisy. LSTM models are a natural candidate for this kind of data because they can:

- **Capture long-term dependencies** across market cycles.
- **Recognize complex patterns** in sequential data that span long time periods.
- **Handle multivariate inputs**, such as price, volume, sentiment, and macroeconomic data.
- **Adapt to temporal dynamics**, critical for understanding momentum and volatility.
- **Mitigate vanishing gradient issues** through their gated memory cell architecture.
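
As a rough sketch of what multivariate input looks like in practice, the windowing below builds arrays of shape `(samples, timesteps, features)` while keeping a single column as the prediction target. The helper name `create_multivariate_sequences`, the `target_col` parameter, and the random toy data are assumptions made for illustration; they are not part of the notebook's code.

```python
import numpy as np

def create_multivariate_sequences(data, target_col=0, seq_length=20):
    """Slice a 2-D array (time, features) into overlapping windows.

    X has shape (samples, seq_length, n_features);
    y holds only the target column at the step right after each window.
    """
    X, y = [], []
    for i in range(seq_length, len(data)):
        X.append(data[i - seq_length:i, :])   # all features for the window
        y.append(data[i, target_col])         # next value of the target only
    return np.array(X), np.array(y)

# Toy data: 100 time steps, 3 already-scaled features (hypothetical values).
rng = np.random.default_rng(0)
toy = rng.random((100, 3))
X_multi, y_multi = create_multivariate_sequences(toy, target_col=0, seq_length=20)
print(X_multi.shape, y_multi.shape)
```

With input like this, the first LSTM layer would need `input_shape=(seq_length, n_features)` rather than `(seq_length, 1)`.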

---

## Applications Beyond Finance
LSTMs have demonstrated success in many fields, reinforcing their versatility:
- **Text and speech modeling** (e.g., NLP, chatbots)
- **Machine translation**
- **Anomaly detection in time series**
- **Text classification and sentiment analysis**

Their ability to understand **temporally dynamic** and **sequential data** makes them valuable across domains.

---

## Conclusion
In a world where data flows in sequences and time-dependent patterns, LSTMs provide a powerful tool. For financial markets, this means:
- Potentially better prediction of future trends.
- More adaptive, memory-driven strategies.
- Competitive edge in a data-rich, fast-changing environment.

LSTM: Because in the markets, memory is money.



In [None]:
from sklearn.preprocessing import MinMaxScaler
import numpy as np
import pandas as pd

df = pd.read_csv("../data/processed_combined_data.csv")
# features = df[['close_NVDA', 'oil', 'Electricity_Proxy', 'Semiconductor_ETF', 'Lithium_ETF', 'Gold_Futures', 'VIX_Index']].values
features = df[['close_NVDA']].values  # close price only; seems to work best

# features is already 2-D (n_samples, n_features); MinMaxScaler scales each column to [0, 1]
scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(features)

In [None]:
def create_sequences(data, seq_length=60):
    X, y = [], []
    for i in range(seq_length, len(data)):
        X.append(data[i - seq_length:i])
        y.append(data[i])
    return np.array(X), np.array(y)

seq_length = 20  # seems to be better when small
X, y = create_sequences(scaled_data, seq_length)

In [None]:
# train_size = int(len(X) * 0.8)
# X_train, y_train = X[:train_size], y[:train_size]
# X_test, y_test = X[train_size:], y[train_size:]

In [None]:
# from tensorflow.keras.models import Sequential
# from tensorflow.keras.layers import LSTM, Dense

# model = Sequential()
# model.add(LSTM(units=50, return_sequences=True, input_shape=(X_train.shape[1], 1)))  # (timesteps, features)
# model.add(LSTM(units=50))
# model.add(Dense(units=1))

# model.compile(optimizer='adam', loss='mean_squared_error')
# model.summary()

In [None]:
# history = model.fit(X_train, y_train, epochs=20, batch_size=32, validation_data=(X_test, y_test))

# predicted = model.predict(X_test)
# predicted_prices = scaler.inverse_transform(np.array(predicted).reshape(-1, 1))
# actual_prices = scaler.inverse_transform(np.array(y_test).reshape(-1, 1))


In [None]:
import matplotlib.pyplot as plt

# plt.figure(figsize=(14, 6))
# plt.plot(actual_prices, color="black", label="Actual price")
# plt.plot(predicted_prices, color="green", label="Predicted price")
# plt.title("LSTM stock price prediction (X_test)")
# plt.xlabel("Time")
# plt.ylabel("Price")
# plt.legend()
# plt.show()

In [None]:
# train_predicted = model.predict(X_train)
# train_predicted_prices = scaler.inverse_transform(train_predicted)

# plt.plot(range(seq_length, seq_length + len(train_predicted_prices)),
#          train_predicted_prices, label="Predicted (train)", color="blue")

# # Plot of actual test prices
# plt.plot(range(seq_length + len(train_predicted_prices),
#                seq_length + len(train_predicted_prices) + len(actual_prices)),
#          actual_prices, label="Actual price (test)", color="black")

# # Plot of predicted test prices
# plt.plot(range(seq_length + len(train_predicted_prices),
#                seq_length + len(train_predicted_prices) + len(predicted_prices)),
#          predicted_prices, label="Predicted (test)", color="green")

# # Dividing line
# plt.axvline(x=seq_length + len(train_predicted_prices), color='red', linestyle='--', label='Train/test split')

# plt.title("LSTM stock price prediction")
# plt.xlabel("Time (index)")
# plt.ylabel("Price")
# plt.legend()

In [None]:
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# mse = mean_squared_error(actual_prices, predicted_prices)
# rmse = np.sqrt(mse)
# mae = mean_absolute_error(actual_prices, predicted_prices)
# r2 = r2_score(actual_prices, predicted_prices)

# print(f"📊 MSE  = {mse:.4f}")
# print(f"📊 RMSE = {rmse:.4f}")
# print(f"📊 MAE  = {mae:.4f}")
# print(f"📈 R²   = {r2:.4f}")

# With time series cross-validation
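
Before running the full training loop, it can help to see what the folds look like. The sketch below applies `TimeSeriesSplit` to a toy array of 12 samples; the names `toy_X` and `tscv_demo` are made up for this illustration and do not touch the notebook's own `X` or `tscv`.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Toy data: 12 consecutive time steps (values are just their own index).
toy_X = np.arange(12).reshape(-1, 1)

tscv_demo = TimeSeriesSplit(n_splits=3)
folds = list(tscv_demo.split(toy_X))

for fold, (train_idx, test_idx) in enumerate(folds, start=1):
    # Each fold trains on an expanding window and tests on the block right after it.
    print(f"Fold {fold}: train={train_idx.tolist()} test={test_idx.tolist()}")
```

In every fold the test indices come strictly after the training indices, so the model is never evaluated on data that precedes what it was trained on. A shuffled K-fold split would not give that guarantee.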

In [None]:
from sklearn.model_selection import TimeSeriesSplit
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense


tscv = TimeSeriesSplit(n_splits=5)

def create_model(input_shape=(20, 1)):  # input_shape default was (60, 1)
    model = Sequential()
    model.add(LSTM(units=50, return_sequences=True, input_shape=input_shape))
    model.add(LSTM(units=50))
    model.add(Dense(1)) 
    
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

cv_mae_scores = []
cv_r2_scores = []
for fold, (train_idx, test_idx) in enumerate(tscv.split(X)):
    print(f"Fold {fold+1}:")
    X_train, X_test = X[train_idx], X[test_idx]
    y_train, y_test = y[train_idx], y[test_idx]
    
    # Train and evaluate a fresh model for this fold
    model = create_model()
    model.fit(X_train, y_train, epochs=10, batch_size=32, verbose=0)
    predictions = model.predict(X_test)
    
    mae = mean_absolute_error(y_test, predictions)
    r2 = r2_score(y_test, predictions)
    print(f"  MAE: {mae:.4f}")
    print(f"  R² : {r2:.4f}")
    cv_mae_scores.append(mae)
    cv_r2_scores.append(r2)
    

In [None]:
# history = model.fit(X_train, y_train, epochs=50, batch_size=32, validation_data=(X_test, y_test), verbose=1)

# # Plot the loss for training and validation
# plt.figure(figsize=(12, 6))
# plt.plot(history.history['loss'], label='Training Loss', color='blue')
# plt.plot(history.history['val_loss'], label='Validation Loss', color='orange')
# plt.title('Training & Validation Loss over the Epochs')
# plt.xlabel('Epochs')
# plt.ylabel('Loss')
# plt.legend()
# plt.show()

In [None]:
# Predictions for training and test data
# Note: model, X_train, and X_test are those of the last CV fold
train_pred = model.predict(X_train)
test_pred = model.predict(X_test)

# Rescale predictions back to actual price values
train_pred = scaler.inverse_transform(train_pred)
test_pred = scaler.inverse_transform(test_pred)
y_train_actual = scaler.inverse_transform(y_train.reshape(-1, 1))
y_test_actual = scaler.inverse_transform(y_test.reshape(-1, 1))

# Plot
plt.figure(figsize=(14, 6))
plt.plot(y_train_actual, label='Actual prices (train)', color='blue')
plt.plot(train_pred, label='Predicted (train)', color='green')
plt.plot(range(len(y_train_actual), len(y_train_actual) + len(y_test_actual)), y_test_actual, label='Actual prices (test)', color='black')
plt.plot(range(len(y_train_actual), len(y_train_actual) + len(test_pred)), test_pred, label='Predicted (test)', color='red')

plt.axvline(x=len(y_train_actual), color='red', linestyle='--', label='Train/test split')

plt.title('Prediction vs. Actual Values')
plt.xlabel('Time (index)')
plt.ylabel('Price')
plt.legend()
plt.show()

In [None]:

print(f"\nAverage MAE across all folds: {np.mean(cv_mae_scores):.4f}")
print(f"Average R²  across all folds: {np.mean(cv_r2_scores):.4f}")

# -> Use TimeSeriesSplit for cross-validation!

🔍 **MAE comparison:**

- **Without CV:** `0.0554`
- **With CV**, close only: `0.0147`
- **With CV**, close only, input shape (20, 1), sequence length 20: `0.0137`
- **With CV**, close and oil: `0.0236`
- **With CV**, 'close_NVDA', 'oil', "Electricity_Proxy", "Semiconductor_ETF", "Lithium_ETF", "Gold_Futures", "VIX_Index": `0.0166`
- **Input shape (20, 1) instead of (60, 1):** `0.0151`
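
One caveat when reading these numbers: in the cross-validation loop, MAE is computed on `y_test` and `predictions`, both of which are still in the scaler's [0, 1] range. Assuming a value was produced that way, it can be mapped to price units by multiplying with the price range that `MinMaxScaler` saw. The sketch below checks this on made-up prices; the helper `scaled_mae_to_price_units` and all numbers in it are hypothetical.

```python
import numpy as np

def scaled_mae_to_price_units(scaled_mae, price_min, price_max):
    """Convert an MAE measured on min-max scaled data back to price units."""
    return scaled_mae * (price_max - price_min)

# Hypothetical prices and prediction errors, for illustration only.
prices = np.array([100.0, 120.0, 110.0, 150.0, 140.0])
p_min, p_max = prices.min(), prices.max()

# Min-max scaling to [0, 1], the same transformation MinMaxScaler applies by default.
scaled = (prices - p_min) / (p_max - p_min)
preds_scaled = scaled + np.array([0.02, -0.01, 0.03, -0.02, 0.01])

mae_scaled = np.mean(np.abs(scaled - preds_scaled))

# Invert the scaling and measure the error directly in price units.
preds_price = preds_scaled * (p_max - p_min) + p_min
mae_price = np.mean(np.abs(prices - preds_price))

print(mae_scaled, mae_price, scaled_mae_to_price_units(mae_scaled, p_min, p_max))
```

Because min-max scaling is linear, the two ways of computing the price-unit MAE agree. This also means MAE values are only directly comparable when they were measured in the same units.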