# 📊 Multimodal Market Intelligence System: Final Submission-Ready Notebook

## Short-Term Commodity Price Forecasting with Interpretable Multimodal GRU

This notebook is the final, submission-ready version of the Multimodal Price Forecasting
project. It preserves the original project's objectives and pipeline while making the
implementation portable, reproducible, and robust across machines.

Key features:

- Maintains the original objective: predict next-day commodity price direction (up/down)
  using price time-series, news sentiment, and weather modalities.
- Uses a GRU-based multimodal neural network with exogenous feature fusion.
- Portable: synthetic-data fallback so the full pipeline runs without external APIs.
- Reproducible: seeded RNGs, train/val/test splits, scalers saved with model artifacts.
- Robust training: validation-based checkpointing, pos_weight handling for class imbalance,
  early stopping, and clear evaluation metrics (accuracy, precision, recall, F1, ROC-AUC).
- Interpretability: inspects the first fully connected layer's weights and the optional attention
  weights, and provides a hook for SHAP explainability (optional cells included).

Run order (recommended):
1. Environment check
2. (optional) Automated install cell (disabled by default)
3. Imports & configuration
4. Data generation / ingestion (synthetic fallback included)
5. Preprocessing & scaling
6. Dataloaders (train/val/test)
7. Model definition (GRU with optional attention)
8. Training (with early stopping & checkpointing)
9. Evaluation & visualizations
10. Ablation experiments (optional)
11. Interpretability / SHAP hooks (optional)
12. Save artifacts and final summary

If you prefer a quick run, run the demo cells (synthetic data) from top to bottom.

---


## Quick notes about changes from the original notebook
- Removed machine-specific and destructive environment modifications (no auto pip/conda uninstall/install from cells).
- Replaced hard-coded paths with project-relative paths and portable logging.
- Implemented a GRU multimodal model and added improvements for reproducibility and robustness.
- Added train/val/test split, scalers, checkpointing, class weight handling, and evaluation metrics.

If you want additional features (attention visualization, SHAP explanations, hyperparameter search), scroll to the relevant section near the end.

## 1. Environment check (safe, non-destructive)
This cell reports environment info and missing packages. It does not modify your environment.

In [None]:
import sys, platform, textwrap
from pathlib import Path
PROJECT_ROOT = Path.cwd()
print(f'Project root (cwd): {PROJECT_ROOT}')
print(f'Python: {platform.python_version()} ({platform.python_implementation()})')

def check(pkgs):
    missing = []
    for p in pkgs:
        import_name = p if p != 'scikit_learn' else 'sklearn'
        try:
            __import__(import_name)
        except Exception:
            missing.append(import_name)
    return missing

required = ['numpy','pandas','scikit_learn','matplotlib','seaborn','torch']
missing = check(required)
if missing:
    print('\nMissing packages:', missing)
    print(textwrap.dedent('''
        To install required packages run:
          pip install -r requirements.txt
        Or install manually:
          pip install numpy pandas scikit-learn matplotlib seaborn torch
    '''))
else:
    print('\nAll required packages appear installed.')
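
The check above imports each package, which can be slow for heavy libraries such as `torch`. As a hedged alternative, the sketch below asks the import system whether a module can be located without importing it, using `importlib.util.find_spec` from the standard library. The helper name `find_missing` is illustrative and not part of the notebook.

```python
from importlib.util import find_spec

def find_missing(import_names):
    """Return the names the import system cannot locate, without importing them."""
    missing = []
    for name in import_names:
        try:
            spec = find_spec(name)
        except (ImportError, ValueError):
            # Treat lookup errors the same as "not found"
            spec = None
        if spec is None:
            missing.append(name)
    return missing

# 'sklearn' is the import name of the scikit-learn distribution
print('Missing:', find_missing(['json', 'sklearn', 'torch']))
```

A located module can still fail at import time (for example, a broken binary dependency), so this is a lighter check rather than a stricter one.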


### Optional: Automated install (disabled by default)
Enable only if you want the notebook to attempt to pip-install packages from within the notebook.

In [None]:
AUTO_INSTALL = False
if AUTO_INSTALL:
    import subprocess, sys
    reqs = ['numpy','pandas','matplotlib','seaborn','scikit-learn','torch']
    subprocess.check_call([sys.executable,'-m','pip','install'] + reqs)
    print('Installed packages. Restart kernel and re-run the notebook.')
else:
    print('Auto-install disabled (recommended).')


## 1. Imports and configuration
Set deterministic flags and device selection. Logging writes to a project-local directory.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import logging
from pathlib import Path
import joblib
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import (accuracy_score, precision_score, recall_score, f1_score,
                             roc_auc_score, roc_curve, confusion_matrix)

import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

RANDOM_SEED = 42
np.random.seed(RANDOM_SEED)
torch.manual_seed(RANDOM_SEED)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Device:', DEVICE)

LOG_DIR = Path('.') / '.cursor'
LOG_DIR.mkdir(exist_ok=True)
logging.basicConfig(level=logging.INFO, format='%(asctime)s %(levelname)s %(message)s', handlers=[logging.StreamHandler(), logging.FileHandler(LOG_DIR / 'final_submission.log')])
logger = logging.getLogger('final_submission')
logger.info('Imports and configuration set')
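
The cell above seeds NumPy and PyTorch but not Python's built-in `random` module. A minimal sketch of a single seeding helper follows. The name `set_seed` is an assumption for illustration, and the NumPy and PyTorch calls are skipped when those packages are not installed.

```python
import random

def set_seed(seed: int = 42) -> None:
    """Seed Python's RNG and, when installed, NumPy and PyTorch."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass

set_seed(42)
first = random.random()
set_seed(42)
second = random.random()
print(first == second)  # the same seed reproduces the same draw
```

Seeding alone does not guarantee bit-identical GPU results; the cuDNN flags set in the cell above address part of that.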


## 2. Data generation / ingestion (synthetic fallback)
By default this notebook uses synthetic data so everything runs without external APIs. Replace this cell with your real data ingestion code if available.

In [None]:
def generate_synthetic_data(days: int = 900, seed: int = RANDOM_SEED):
    import numpy as _np
    import pandas as _pd
    _np.random.seed(seed)
    dates = _pd.date_range(end=_pd.Timestamp.today().normalize(), periods=days, freq='D')
    base = 100.0
    mu = 0.0004
    sigma = 0.02
    returns = _np.random.normal(loc=mu, scale=sigma, size=days)
    prices = base * _np.cumprod(1 + returns)
    price_df = _pd.DataFrame({'date': dates, 'price': prices})
    sentiment = _np.clip(_np.random.normal(0.0, 0.25, size=days), -1, 1)
    sentiment_df = _pd.DataFrame({'date': dates, 'sentiment': sentiment})
    doy = dates.dayofyear.values
    temp = 25 + 6 * _np.sin(2 * _np.pi * doy / 365) + _np.random.normal(0, 2, size=days)
    humidity = _np.clip(60 + 15 * _np.cos(2 * _np.pi * doy / 365) + _np.random.normal(0, 5, size=days), 0, 100)
    rainfall = (_np.random.rand(days) < 0.12).astype(float) * (_np.random.exponential(scale=2.0, size=days))
    weather_df = _pd.DataFrame({'date': dates, 'temp': temp, 'humidity': humidity, 'rainfall': rainfall})
    return price_df, sentiment_df, weather_df

# Generate synthetic data (default)
price_df, sentiment_df, weather_df = generate_synthetic_data(days=900)
print('Generated data shapes -> price:', price_df.shape, 'sentiment:', sentiment_df.shape, 'weather:', weather_df.shape)


## 3. Preprocessing: returns, sequences, exogenous features, scaling
We compute returns, create sequences for GRU input (SEQ_LEN), and use last-day exogenous features: sentiment, temp, humidity, rainfall.

In [None]:
def prepare_dataset_for_gru(price_df, sentiment_df, weather_df, seq_len=14):
    df = price_df.merge(sentiment_df, on='date', how='left').merge(weather_df, on='date', how='left')
    df = df.sort_values('date').reset_index(drop=True)
    df['return'] = df['price'].pct_change().fillna(0.0)
    df['target_next_up'] = (df['price'].shift(-1) > df['price']).astype(int)
    df = df.dropna(subset=['target_next_up']).reset_index(drop=True)
    sequences, exogs, targets, dates = [], [], [], []
    for i in range(seq_len - 1, len(df) - 1):
        seq_returns = df['return'].values[i - (seq_len - 1): i + 1]
        exog = df.loc[i, ['sentiment','temp','humidity','rainfall']].values.astype(float)
        target = df.loc[i, 'target_next_up']
        sequences.append(seq_returns.reshape(seq_len, 1))
        exogs.append(exog)
        targets.append(int(target))
        dates.append(df.loc[i, 'date'])
    import numpy as _np
    return _np.array(sequences), _np.array(exogs), _np.array(targets), pd.Series(dates)

SEQ_LEN = 14
X_seq, X_exog, y_all, dates = prepare_dataset_for_gru(price_df, sentiment_df, weather_df, seq_len=SEQ_LEN)
print('Prepared dataset shapes -> X_seq:', X_seq.shape, 'X_exog:', X_exog.shape, 'y:', y_all.shape)
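
To make the window and label alignment concrete, here is a small pure-Python sketch of the same indexing on a toy price series. The function `make_windows` and the toy prices are made up for illustration; the loop bounds mirror `prepare_dataset_for_gru`.

```python
def make_windows(prices, seq_len):
    """Pair each window of seq_len daily returns with the next-day direction."""
    # First return is defined as 0.0, matching pct_change().fillna(0.0)
    returns = [0.0] + [(prices[t] - prices[t - 1]) / prices[t - 1]
                       for t in range(1, len(prices))]
    samples = []
    # i is the last day inside the window; day i + 1 supplies the label
    for i in range(seq_len - 1, len(prices) - 1):
        window = returns[i - (seq_len - 1): i + 1]
        target = int(prices[i + 1] > prices[i])
        samples.append((window, target))
    return samples

toy_prices = [100.0, 101.0, 99.0, 102.0, 103.0, 101.0]
samples = make_windows(toy_prices, seq_len=3)
for window, target in samples:
    print([round(r, 4) for r in window], '->', target)
```

Each window ends on day `i` and is labelled with the move from day `i` to day `i + 1`, so no future return enters the input. A series of length `n` yields `n - seq_len` samples; with the notebook defaults (900 days, `SEQ_LEN = 14`) that is 886.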


## 4. Time-based train/validation/test split and scalers
Split chronologically: training -> validation -> test. Fit scalers only on training data.

In [None]:
from sklearn.preprocessing import StandardScaler
def time_based_split_and_scale(X_seq, X_exog, y, test_frac=0.2, val_frac=0.1):
    n = len(y)
    n_test = int(n * test_frac)
    n_trainval = n - n_test
    n_val = int(n_trainval * val_frac)
    n_train = n_trainval - n_val
    X_seq_train = X_seq[:n_train]
    X_seq_val = X_seq[n_train:n_train+n_val]
    X_seq_test = X_seq[n_train+n_val:]
    X_exog_train = X_exog[:n_train]
    X_exog_val = X_exog[n_train:n_train+n_val]
    X_exog_test = X_exog[n_train+n_val:]
    y_train = y[:n_train]
    y_val = y[n_train:n_train+n_val]
    y_test = y[n_train+n_val:]
    exog_scaler = StandardScaler().fit(X_exog_train)
    seq_scaler = StandardScaler().fit(X_seq_train.reshape(-1, X_seq_train.shape[-1]))
    def transform_seq(x, scaler):
        s = x.reshape(-1, x.shape[-1])
        s = scaler.transform(s)
        return s.reshape(x.shape)
    X_seq_train_s = transform_seq(X_seq_train, seq_scaler)
    X_seq_val_s = transform_seq(X_seq_val, seq_scaler)
    X_seq_test_s = transform_seq(X_seq_test, seq_scaler)
    X_exog_train_s = exog_scaler.transform(X_exog_train)
    X_exog_val_s = exog_scaler.transform(X_exog_val)
    X_exog_test_s = exog_scaler.transform(X_exog_test)
    return (X_seq_train_s, X_exog_train_s, y_train,
            X_seq_val_s, X_exog_val_s, y_val,
            X_seq_test_s, X_exog_test_s, y_test,
            exog_scaler, seq_scaler)

(X_seq_train, X_exog_train, y_train,
 X_seq_val, X_exog_val, y_val,
 X_seq_test, X_exog_test, y_test,
 exog_scaler, seq_scaler) = time_based_split_and_scale(X_seq, X_exog, y_all, test_frac=0.2, val_frac=0.1)
print('Train/Val/Test sizes:', len(y_train), len(y_val), len(y_test))
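
Note that `val_frac` is applied to what remains after the test set is removed, not to the full dataset. The sketch below reproduces only the size arithmetic of `time_based_split_and_scale`; the helper name `split_sizes` is illustrative, and 886 is the sample count implied by the default 900 days and `SEQ_LEN = 14`.

```python
def split_sizes(n, test_frac=0.2, val_frac=0.1):
    """Return (n_train, n_val, n_test) using the same rounding as the split cell."""
    n_test = int(n * test_frac)
    n_trainval = n - n_test
    n_val = int(n_trainval * val_frac)   # fraction of train+val, not of n
    n_train = n_trainval - n_val
    return n_train, n_val, n_test

n_train, n_val, n_test = split_sizes(886)
print(n_train, n_val, n_test)            # 639 70 177
print(round(n_val / 886, 3))             # validation share of the whole dataset
```

With these defaults the validation set is about 7.9% of all samples rather than 10%.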


## 5. PyTorch Dataset & Dataloaders
Shuffle only training loader; validation and test loaders are deterministic.

In [None]:
class MultimodalSequenceDataset(Dataset):
    def __init__(self, X_seq, X_exog, y):
        import torch
        self.X_seq = torch.tensor(X_seq, dtype=torch.float32)
        self.X_exog = torch.tensor(X_exog, dtype=torch.float32)
        self.y = torch.tensor(y, dtype=torch.float32)
    def __len__(self):
        return len(self.y)
    def __getitem__(self, idx):
        return self.X_seq[idx], self.X_exog[idx], self.y[idx]

def get_loaders(X_seq_train, X_exog_train, y_train, X_seq_val, X_exog_val, y_val, X_seq_test, X_exog_test, y_test, batch_size=128):
    train_ds = MultimodalSequenceDataset(X_seq_train, X_exog_train, y_train)
    val_ds = MultimodalSequenceDataset(X_seq_val, X_exog_val, y_val)
    test_ds = MultimodalSequenceDataset(X_seq_test, X_exog_test, y_test)
    train_loader = DataLoader(train_ds, batch_size=batch_size, shuffle=True)
    val_loader = DataLoader(val_ds, batch_size=batch_size, shuffle=False)
    test_loader = DataLoader(test_ds, batch_size=batch_size, shuffle=False)
    return train_loader, val_loader, test_loader

BATCH_SIZE = 128
train_loader, val_loader, test_loader = get_loaders(X_seq_train, X_exog_train, y_train, X_seq_val, X_exog_val, y_val, X_seq_test, X_exog_test, y_test, batch_size=BATCH_SIZE)
print('Batches -> train:', len(train_loader), 'val:', len(val_loader), 'test:', len(test_loader))
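
The batch counts printed above follow from ceiling division, because `DataLoader` keeps the final partial batch unless `drop_last=True`. A quick sketch of that arithmetic, assuming the split sizes 639/70/177 that the default settings produce (the helper `num_batches` is illustrative):

```python
import math

def num_batches(n_samples, batch_size, drop_last=False):
    """Number of batches a loader yields for n_samples items."""
    if drop_last:
        return n_samples // batch_size
    return math.ceil(n_samples / batch_size)

for name, n in [('train', 639), ('val', 70), ('test', 177)]:
    print(name, num_batches(n, 128))
```

With only a handful of training batches per epoch, each epoch performs very few optimizer steps; a smaller `BATCH_SIZE` is one option if training looks too coarse.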


## 6. Model definition: GRU multimodal with optional attention
You can toggle attention for interpretability via USE_ATTENTION flag.

In [None]:
USE_ATTENTION = True
import torch.nn.functional as F

class Attention(nn.Module):
    def __init__(self, hidden_dim):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, 1)
    def forward(self, encoder_outputs):
        scores = self.proj(encoder_outputs).squeeze(-1)
        weights = F.softmax(scores, dim=1)
        context = (encoder_outputs * weights.unsqueeze(-1)).sum(dim=1)
        return context, weights

class GRUMultimodal(nn.Module):
    def __init__(self, input_size=1, gru_hidden=128, gru_layers=2, exog_size=4, fc_hidden=64, dropout=0.2, use_attention=True):
        super().__init__()
        self.gru = nn.GRU(input_size=input_size, hidden_size=gru_hidden, num_layers=gru_layers, batch_first=True)
        self.use_attention = use_attention
        if use_attention:
            self.attn = Attention(gru_hidden)
        # The attention context and the last hidden state are both gru_hidden wide
        fc_in = gru_hidden + exog_size
        self.fc = nn.Sequential(
            nn.Linear(fc_in, fc_hidden),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(fc_hidden, 1)
        )
    def forward(self, x_seq, x_exog):
        out, h_n = self.gru(x_seq)
        if self.use_attention:
            context, weights = self.attn(out)
            x = torch.cat([context, x_exog], dim=1)
            return self.fc(x).squeeze(1), weights
        else:
            last_h = h_n[-1]
            x = torch.cat([last_h, x_exog], dim=1)
            return self.fc(x).squeeze(1), None

model = GRUMultimodal(input_size=1, gru_hidden=128, gru_layers=2, exog_size=X_exog.shape[1], fc_hidden=64, dropout=0.2, use_attention=USE_ATTENTION).to(DEVICE)
print(model)
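
For readers who want to see what the `Attention` module computes, here is a dependency-free sketch of the same pooling for a single sample: a linear score per time step, a softmax over time, and a weighted sum of the hidden vectors. The function names and the toy numbers are assumptions for illustration only.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(encoder_outputs, proj_weight, proj_bias=0.0):
    """encoder_outputs: T hidden vectors (each of length H) for one sample."""
    scores = [sum(w * h for w, h in zip(proj_weight, step)) + proj_bias
              for step in encoder_outputs]
    weights = softmax(scores)
    hidden_dim = len(encoder_outputs[0])
    context = [sum(weights[t] * encoder_outputs[t][j]
                   for t in range(len(encoder_outputs)))
               for j in range(hidden_dim)]
    return context, weights

outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # T=3 steps, H=2
context, weights = attention_pool(outputs, proj_weight=[1.0, 1.0])
print([round(w, 3) for w in weights], [round(c, 3) for c in context])
```

The weights sum to 1 across time steps, which is what makes the heatmap in the visualization section readable as a distribution over the input window.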


## 7. Training utilities: pos_weight, loss, optimizer, scheduler, early stopping
We derive pos_weight from training labels to handle class imbalance and use BCEWithLogitsLoss.
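
As a worked illustration of the weighting, the sketch below computes the same per-sample loss that `BCEWithLogitsLoss` defines, with the positive-class term scaled by `pos_weight = negatives / positives`. The function `bce_with_logits` and the label list are made up for the example.

```python
import math

def softplus(x):
    """Stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def bce_with_logits(logit, target, pos_weight=1.0):
    """Binary cross-entropy on a raw logit; pos_weight scales the positive term."""
    # -log(sigmoid(x)) = softplus(-x) and -log(1 - sigmoid(x)) = softplus(x)
    return pos_weight * target * softplus(-logit) + (1.0 - target) * softplus(logit)

labels = [1, 0, 0, 0, 1, 0, 0, 0]                 # 2 positives, 6 negatives
pos_weight = labels.count(0) / labels.count(1)    # 6 / 2 = 3.0
print(pos_weight)
print(bce_with_logits(0.0, 1, pos_weight))        # 3 * ln 2
print(bce_with_logits(0.0, 0, pos_weight))        # ln 2
```

An error on a positive sample costs three times as much as the same error on a negative one. When the labels are roughly balanced, the ratio is close to 1 and the weighting has little effect.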

In [None]:
import torch.optim as optim
import numpy as _np
def compute_pos_weight(y):
    pos = (y == 1).sum()
    neg = (y == 0).sum()
    if pos == 0:
        return 1.0
    return float(neg) / float(pos)

def train_with_val(model, train_loader, val_loader, epochs=40, lr=1e-3, weight_decay=1e-5, patience=6):
    y_train = []
    for _, _, y in train_loader:
        y_train.append(y.numpy())
    y_train = _np.concatenate(y_train)
    pos_weight = compute_pos_weight(y_train)
    logger.info('pos_weight=%.4f', pos_weight)
    criterion = nn.BCEWithLogitsLoss(pos_weight=torch.tensor(pos_weight, dtype=torch.float32).to(DEVICE))
    optimizer = optim.Adam(model.parameters(), lr=lr, weight_decay=weight_decay)
    # verbose is omitted: it is deprecated in recent PyTorch releases
    scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.5, patience=3)
    best_val_loss = float('inf')
    best_state = None
    epochs_no_improve = 0
    history = []
    for epoch in range(1, epochs+1):
        model.train()
        train_losses = []
        for x_seq, x_exog, y in train_loader:
            x_seq = x_seq.to(DEVICE)
            x_exog = x_exog.to(DEVICE)
            y = y.to(DEVICE)
            optimizer.zero_grad()
            logits, _ = model(x_seq, x_exog)
            loss = criterion(logits, y)
            loss.backward()
            optimizer.step()
            train_losses.append(loss.item())
        avg_train_loss = float(sum(train_losses)/len(train_losses)) if train_losses else 0.0
        model.eval()
        val_losses = []
        all_probs, all_preds, all_targets = [], [], []
        with torch.no_grad():
            for x_seq, x_exog, y in val_loader:
                x_seq = x_seq.to(DEVICE)
                x_exog = x_exog.to(DEVICE)
                y = y.to(DEVICE)
                logits, _ = model(x_seq, x_exog)
                loss = criterion(logits, y)
                val_losses.append(loss.item())
                probs = torch.sigmoid(logits).cpu().numpy()
                preds = (probs >= 0.5).astype(int)
                all_probs.append(probs)
                all_preds.append(preds)
                all_targets.append(y.cpu().numpy())
        avg_val_loss = float(sum(val_losses)/len(val_losses)) if val_losses else 0.0
        scheduler.step(avg_val_loss)
        if all_targets:
            y_prob = _np.concatenate(all_probs).flatten()
            y_pred = _np.concatenate(all_preds).flatten()
            y_true = _np.concatenate(all_targets).flatten()
            val_acc = float((y_pred == y_true).mean())
            try:
                val_auc = float(roc_auc_score(y_true, y_prob))
            except Exception:
                val_auc = None
        else:
            val_acc = 0.0
            val_auc = None
        history.append({'epoch': epoch, 'train_loss': avg_train_loss, 'val_loss': avg_val_loss, 'val_acc': val_acc, 'val_auc': val_auc})
        logger.info(f'Epoch {epoch}: train_loss={avg_train_loss:.6f} val_loss={avg_val_loss:.6f} val_acc={val_acc:.4f} val_auc={val_auc}')
        if avg_val_loss < best_val_loss - 1e-6:
            best_val_loss = avg_val_loss
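            # Clone so the snapshot does not alias live parameters (on CPU, .cpu() returns the same tensor)
            best_state = {k: v.detach().cpu().clone() for k, v in model.state_dict().items()}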
            epochs_no_improve = 0
        else:
            epochs_no_improve += 1
            if epochs_no_improve >= patience:
                logger.info('Early stopping')
                break
    if best_state is not None:
        model.load_state_dict(best_state)
    return model, pd.DataFrame(history)
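
The early-stopping rule inside `train_with_val` can be isolated into a few lines. This is a sketch of that rule only; the class name `EarlyStopper` and the validation losses are invented for the example.

```python
class EarlyStopper:
    """Stop once the validation loss has failed to improve for `patience` epochs."""

    def __init__(self, patience=6, min_delta=1e-6):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float('inf')
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
            return False
        self.bad_epochs += 1
        return self.bad_epochs >= self.patience

stopper = EarlyStopper(patience=2)
val_losses = [0.70, 0.68, 0.69, 0.685, 0.67]   # made-up values
stopped_at = None
for epoch, loss in enumerate(val_losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break
print(stopped_at, stopper.best)
```

Training stops at epoch 4 and never sees the better loss at epoch 5, which is the usual trade-off of a small `patience`.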


## 8. Train end-to-end (synthetic demo)
Run training. Adjust EPOCHS for longer experiments. This uses the previously defined splits and loaders.

In [None]:
EPOCHS = 40
LR = 1e-3
PATIENCE = 6
BATCH_SIZE = 128

train_loader, val_loader, test_loader = get_loaders(X_seq_train, X_exog_train, y_train, X_seq_val, X_exog_val, y_val, X_seq_test, X_exog_test, y_test, batch_size=BATCH_SIZE)
model = GRUMultimodal(input_size=1, gru_hidden=128, gru_layers=2, exog_size=X_exog.shape[1], fc_hidden=64, dropout=0.2, use_attention=USE_ATTENTION).to(DEVICE)
trained_model, history_df = train_with_val(model, train_loader, val_loader, epochs=EPOCHS, lr=LR, patience=PATIENCE)
print('Training finished. Recent history:')
display(history_df.tail())


## 9. Final evaluation on test set
Compute accuracy, precision, recall, F1, ROC-AUC and show the confusion matrix.

In [None]:
def evaluate_model(model, loader):
    model.eval()
    preds, probs, targets = [], [], []
    attentions = []
    with torch.no_grad():
        for x_seq, x_exog, y in loader:
            x_seq = x_seq.to(DEVICE)
            x_exog = x_exog.to(DEVICE)
            logits, attn = model(x_seq, x_exog)
            p = torch.sigmoid(logits).cpu().numpy()
            preds.append((p >= 0.5).astype(int))
            probs.append(p)
            targets.append(y.numpy())
            if attn is not None:
                attentions.append(attn.cpu().numpy())
    import numpy as _np
    y_pred = _np.concatenate(preds).flatten()
    y_prob = _np.concatenate(probs).flatten()
    y_true = _np.concatenate(targets).flatten()
    acc = accuracy_score(y_true, y_pred)
    prec = precision_score(y_true, y_pred, zero_division=0)
    rec = recall_score(y_true, y_pred, zero_division=0)
    f1 = f1_score(y_true, y_pred, zero_division=0)
    try:
        auc = roc_auc_score(y_true, y_prob)
    except Exception:
        auc = None
    cm = confusion_matrix(y_true, y_pred)
    return {'acc': acc, 'precision': prec, 'recall': rec, 'f1': f1, 'auc': auc, 'cm': cm, 'y_true': y_true, 'y_pred': y_pred, 'y_prob': y_prob, 'attentions': attentions}

test_results = evaluate_model(trained_model, test_loader)
print('Test metrics:')
for k in ['acc','precision','recall','f1','auc']:
    print(f'{k}:', test_results[k])
print('\nConfusion matrix:\n', test_results['cm'])
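
To relate the printed confusion matrix to the scalar metrics, the sketch below derives accuracy, precision, recall, and F1 from the four counts. The counts are invented; the zero fallbacks mirror `zero_division=0` in the cell above.

```python
def metrics_from_counts(tn, fp, fn, tp):
    """Binary classification metrics from confusion-matrix counts."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {'acc': accuracy, 'precision': precision, 'recall': recall, 'f1': f1}

# scikit-learn lays out a binary confusion matrix as [[tn, fp], [fn, tp]]
m = metrics_from_counts(tn=50, fp=10, fn=20, tp=20)
print({k: round(v, 4) for k, v in m.items()})
```

For next-day direction on near-random data, accuracy around 0.5 and AUC around 0.5 are the expected baseline, so small deviations should be read with caution.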


## 10. Visualizations
Plot training history, confusion matrix, ROC curve, attention weights (if used), and recent price time-series for context.

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

def plot_history(df):
    fig, ax = plt.subplots(1,2, figsize=(12,4))
    ax[0].plot(df['epoch'], df['train_loss'], label='train_loss')
    ax[0].plot(df['epoch'], df['val_loss'], label='val_loss')
    ax[0].legend(); ax[0].set_title('Loss')
    ax[1].plot(df['epoch'], df['val_acc'], label='val_acc')
    if 'val_auc' in df.columns:
        ax[1].plot(df['epoch'], df['val_auc'], label='val_auc')
    ax[1].legend(); ax[1].set_title('Validation metrics')
    plt.show()

def plot_confusion(cm):
    fig, ax = plt.subplots(figsize=(4,3))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', ax=ax)
    ax.set_xlabel('Predicted'); ax.set_ylabel('True')
    plt.show()

def plot_roc(y_true, y_prob):
    try:
        fpr, tpr, _ = roc_curve(y_true, y_prob)
        auc = roc_auc_score(y_true, y_prob)
        plt.figure(figsize=(6,4))
        plt.plot(fpr, tpr, label=f'AUC={auc:.3f}')
        plt.plot([0,1],[0,1],'--', color='gray')
        plt.xlabel('FPR'); plt.ylabel('TPR'); plt.legend(); plt.title('ROC')
        plt.show()
    except Exception as e:
        print('ROC unavailable:', e)

plot_history(history_df)
plot_confusion(test_results['cm'])
plot_roc(test_results['y_true'], test_results['y_prob'])

if USE_ATTENTION and test_results['attentions']:
    attn_batch = test_results['attentions'][0]
    plt.figure(figsize=(8,3))
    plt.imshow(attn_batch, aspect='auto', cmap='viridis')
    plt.colorbar(label='attention weight')
    plt.title('Attention weights (example batch)')
    plt.xlabel('Time step'); plt.ylabel('Sample index')
    plt.show()

display(price_df.tail(120).reset_index(drop=True))


## 11. Ablation experiments (quick)
Run these to quantify the contribution of each modality. These experiments retrain smaller models and may take time.

In [None]:
def run_ablation(exog_mask, epochs=20):
    # exog_mask: [use_sentiment, use_temp, use_humidity, use_rainfall]
    X_exog_train_mask = X_exog_train.copy()
    X_exog_val_mask = X_exog_val.copy()
    X_exog_test_mask = X_exog_test.copy()
    for i, use in enumerate(exog_mask):
        if not use:
            X_exog_train_mask[:, i] = 0.0
            X_exog_val_mask[:, i] = 0.0
            X_exog_test_mask[:, i] = 0.0
    tr, vl, te = get_loaders(X_seq_train, X_exog_train_mask, y_train, X_seq_val, X_exog_val_mask, y_val, X_seq_test, X_exog_test_mask, y_test, batch_size=BATCH_SIZE)
    m = GRUMultimodal(input_size=1, gru_hidden=128, gru_layers=2, exog_size=X_exog.shape[1], fc_hidden=64, dropout=0.2, use_attention=USE_ATTENTION).to(DEVICE)
    m_trained, _ = train_with_val(m, tr, vl, epochs=epochs, lr=LR, patience=4)
    res = evaluate_model(m_trained, te)
    return res

print('Running ablation (this may take a while)...')
res_price_only = run_ablation([False, False, False, False], epochs=20)
res_price_sent = run_ablation([True, False, False, False], epochs=20)
res_price_weather = run_ablation([False, True, True, True], epochs=20)
print('Ablation results:')
print('price_only -> acc,auc:', res_price_only['acc'], res_price_only['auc'])
print('price+sentiment -> acc,auc:', res_price_sent['acc'], res_price_sent['auc'])
print('price+weather -> acc,auc:', res_price_weather['acc'], res_price_weather['auc'])
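
Because the exogenous features are already standardized when `run_ablation` zeroes them, masking a column is equivalent to replacing it with its training-set mean rather than with a raw zero. The following sketch checks that on made-up temperatures using plain Python; `fit_standardizer` is an illustrative stand-in for `StandardScaler`.

```python
def fit_standardizer(values):
    """Return (mean, std) using the population standard deviation, as StandardScaler does."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var ** 0.5

temps = [20.0, 25.0, 30.0, 25.0]            # made-up training temperatures
mean, std = fit_standardizer(temps)
scaled = [(t - mean) / std for t in temps]

masked = [0.0 for _ in scaled]              # what run_ablation does to a dropped column
recovered = [z * std + mean for z in masked]
print(recovered)                            # every entry equals the training mean
```

The masked model therefore still receives a constant, uninformative input for that modality, which is a reasonable way to remove its signal without changing the network's input width.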


## 12. Interpretability: FC contributions & SHAP hook
Quickly inspect FC layer weights; SHAP optional (disabled by default, install shap separately).

In [None]:
fc = trained_model.fc if hasattr(trained_model, 'fc') else None
if fc is not None:
    try:
        w = fc[0].weight.detach().cpu().numpy()
        print('FC weights shape:', w.shape)
    except Exception as e:
        print('Could not get FC weights:', e)
else:
    print('FC layer not found')

SHAP_ENABLED = False
if SHAP_ENABLED:
    print('Make sure shap is installed: pip install shap')
    # Example: create a small explainer on the FC inputs (last hidden + exog)


## 13. Save artifacts
Save model state, scalers, and metadata for reproducibility.

In [None]:
ART_DIR = Path('models')
ART_DIR.mkdir(exist_ok=True)
MODEL_PATH = ART_DIR / 'gru_multimodal_best.pth'
SCALER_PATH = ART_DIR / 'scalers.joblib'
META_PATH = ART_DIR / 'meta.joblib'

torch.save(trained_model.state_dict(), MODEL_PATH)
joblib.dump({'exog_scaler': exog_scaler, 'seq_scaler': seq_scaler}, SCALER_PATH)
joblib.dump({'seq_len': SEQ_LEN, 'exog_size': X_exog.shape[1], 'device': str(DEVICE), 'use_attention': USE_ATTENTION}, META_PATH)
print('Saved artifacts to', ART_DIR)
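
For completeness, here is a hedged sketch of reading the artifacts back for inference. The helper `load_artifacts` and its `build_model` argument are assumptions introduced for illustration; the file names match the save cell above, and `joblib` and `torch` are imported lazily inside the function.

```python
from pathlib import Path

def load_artifacts(build_model, art_dir='models'):
    """Reload weights, scalers, and metadata written by the save cell.

    build_model is a caller-supplied function (hypothetical) that takes the
    metadata dict and returns a model with a matching architecture.
    """
    import joblib
    import torch

    art_dir = Path(art_dir)
    meta = joblib.load(art_dir / 'meta.joblib')
    scalers = joblib.load(art_dir / 'scalers.joblib')
    model = build_model(meta)
    # map_location='cpu' lets weights saved on a GPU machine load anywhere
    state = torch.load(art_dir / 'gru_multimodal_best.pth', map_location='cpu')
    model.load_state_dict(state)
    model.eval()
    return model, scalers, meta
```

In this notebook it could be called as `load_artifacts(lambda meta: GRUMultimodal(exog_size=meta['exog_size'], use_attention=meta['use_attention']))`. The metadata does not record `gru_hidden`, `gru_layers`, or `fc_hidden`, so the factory must use the same values as at training time.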


## 14. Final summary & submission checklist
- Objective preserved: predict next-day price direction using price, sentiment, and weather.
- Improvements included: scaling, val-split & checkpointing, pos_weight, optional attention, ablation experiments,
  deterministic flags, and artifact saving.


## 15. Link to GitHub Repository for Future Reference
By interacting with this notebook, you have already established a connection to the GitHub Repository for this project on AI-IN-MARKET-TREND-ANALYSIS. Utilize this link to revisit and reuse the code as necessary.