# Reservoir Production Forecasting using CNN-LSTM and SVR

This notebook demonstrates a complete workflow for forecasting reservoir production using:
- Data preprocessing (synthetic dataset derived from SPE9 format)
- CNN-LSTM deep learning model
- Support Vector Regression (SVR)
- Evaluation metrics (RMSE, MAE, R²)
- Comparison plots

The notebook itself stays thin: data loading, model construction, and training are delegated to modules in the `src/` directory.

## 1. Import Libraries

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Project modules from the src/ package
from src.data_preprocessing import load_and_preprocess_data
from src.cnn_lstm_model import build_cnn_lstm, train_cnn_lstm_model
from src.svr_model import train_svr_model

## 2. Load and Preprocess Dataset

The dataset is expected at `data/processed/spe9_synthetic.csv`.

Preprocessing includes:
- Normalization
- Train/test split
- Sequence generation for CNN-LSTM
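
The internals of `load_and_preprocess_data` are not shown in this notebook. As a rough sketch of what the three steps listed above could look like, the snippet below splits a toy series chronologically, scales it to [0, 1] using statistics from the training portion only, and builds sliding-window sequences. The helper `make_sequences`, the window length of 3, the 80/20 split, and the toy values are all illustrative assumptions, not the project's actual settings.

```python
import numpy as np

def make_sequences(series, window):
    """Hypothetical helper: build (samples, window, 1) inputs and next-step targets."""
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X[..., np.newaxis], y

# Toy production-rate series (made-up values, not SPE9 data)
rates = np.linspace(100.0, 50.0, 30)

# Chronological split: first 80% for training, last 20% for testing
split = int(len(rates) * 0.8)
train, test = rates[:split], rates[split:]

# Min-max scaling fitted on the training portion only, to avoid leakage
lo, hi = train.min(), train.max()
train_scaled = (train - lo) / (hi - lo)
test_scaled = (test - lo) / (hi - lo)

X_train_demo, y_train_demo = make_sequences(train_scaled, window=3)
X_test_demo, y_test_demo = make_sequences(test_scaled, window=3)

print(X_train_demo.shape, y_train_demo.shape)
print(X_test_demo.shape, y_test_demo.shape)
```

Because the test values fall outside the training range in this toy series, the scaled test data goes below 0. That is expected when the scaler is fitted on training data alone.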

In [None]:
X_train, X_test, y_train, y_test = load_and_preprocess_data("data/processed/spe9_synthetic.csv")
X_train.shape, X_test.shape

## 3. Train CNN-LSTM Model

In [None]:
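# Drop the leading sample axis so the model receives the per-sample input shape
model = build_cnn_lstm(input_shape=X_train.shape[1:])
# Train the model and collect its predictions on the test set
history, y_pred_cnnlstm = train_cnn_lstm_model(model, X_train, y_train, X_test, y_test)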

## 4. Train SVR Model

In [None]:
# SVR needs a 2-D feature matrix, so flatten each sequence into a single row
y_pred_svr = train_svr_model(X_train.reshape(len(X_train), -1),
                             y_train,
                             X_test.reshape(len(X_test), -1))
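
To make the `reshape(len(X), -1)` call above concrete, here is a small sketch with made-up dimensions. It assumes the sequences are laid out as (samples, time steps, features), which this notebook does not state explicitly; the array `X_demo` is purely illustrative.

```python
import numpy as np

# Hypothetical input: 5 samples, 3 time steps, 2 features per step
X_demo = np.arange(5 * 3 * 2).reshape(5, 3, 2)

# Keep the sample axis and merge all remaining axes into one feature axis
X_flat = X_demo.reshape(len(X_demo), -1)

print(X_flat.shape)  # (5, 6)
print(X_flat[0])     # [0 1 2 3 4 5]
```

After flattening, the time ordering survives only as column position, so SVR treats every time step of every feature as an independent input.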

## 5. Evaluate Both Models

In [None]:
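def evaluate_model(y_true, y_pred):
    """Return (RMSE, MAE, R²) for a set of predictions."""
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # root mean squared error
    mae = mean_absolute_error(y_true, y_pred)           # mean absolute error
    r2 = r2_score(y_true, y_pred)                       # coefficient of determination
    return rmse, mae, r2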

metrics_cnnlstm = evaluate_model(y_test, y_pred_cnnlstm)
metrics_svr = evaluate_model(y_test, y_pred_svr)

metrics_cnnlstm, metrics_svr
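
For readers who want to see what the three metrics compute, the sketch below reimplements them in plain NumPy on a tiny hand-checkable example. The function name `regression_metrics` and the demo arrays are illustrative and not part of the project code; the notebook itself relies on the scikit-learn functions.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Plain-NumPy RMSE, MAE, and R² for 1-D targets (illustrative helper)."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    residuals = y_true - y_pred
    rmse = np.sqrt(np.mean(residuals ** 2))
    mae = np.mean(np.abs(residuals))
    ss_res = np.sum(residuals ** 2)                  # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2

y_true_demo = np.array([1.0, 2.0, 3.0, 4.0])
y_pred_demo = np.array([1.0, 2.0, 3.0, 2.0])

rmse, mae, r2 = regression_metrics(y_true_demo, y_pred_demo)
print(f"RMSE={rmse:.3f} MAE={mae:.3f} R2={r2:.3f}")  # RMSE=1.000 MAE=0.500 R2=0.200
```

The single error of 2 illustrates how RMSE penalizes large misses more heavily than MAE: it dominates the RMSE (1.0) while contributing only 0.5 to the MAE.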

## 6. Plot Results

In [None]:
plt.figure(figsize=(10, 5))
plt.plot(y_test, label="Real", linewidth=3)
plt.plot(y_pred_cnnlstm, label="CNN-LSTM Prediction")
plt.plot(y_pred_svr, label="SVR Prediction")
plt.title("Real vs Predicted Production")
plt.xlabel("Time")
plt.ylabel("Production Rate")
plt.legend()
plt.grid(True)
plt.show()

## 7. Save Results

In [None]:
results = pd.DataFrame({
    "Model": ["CNN-LSTM", "SVR"],
    "RMSE": [metrics_cnnlstm[0], metrics_svr[0]],
    "MAE": [metrics_cnnlstm[1], metrics_svr[1]],
    "R2": [metrics_cnnlstm[2], metrics_svr[2]]
})

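# Make sure the output directory exists before writing the CSV
import os
os.makedirs("results", exist_ok=True)
results.to_csv("results/metrics.csv", index=False)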
results