## Differences Between ConvLSTM_V3 and ConvLSTM_V2

Unlike ConvLSTM_V2, which adopts a direct multi-step forecasting strategy to predict multiple future time steps simultaneously, ConvLSTM_V3 reformulates the task as a single-step forecasting problem, focusing exclusively on predicting the next time step (t+1).

By reducing the output layer from parallel multi-horizon outputs to a single prediction target, ConvLSTM_V3 allows the model to allocate its full representational capacity to short-term forecasting, thereby avoiding gradient interference among different forecast horizons. This simplification leads to a more stable training process and improved short-term prediction accuracy.
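
To make the difference in training objective concrete, the following is a minimal sketch in plain Python. It assumes the multi-step model is trained with a mean squared error averaged over all of its outputs (the V3 notebook compiles with `loss="mse"`; the V2 loss is an assumption here). The helper `mse` and the toy numbers are illustrative only, and three horizons stand in for V2's seven to keep the example short.

```python
from typing import List


def mse(y_true: List[List[float]], y_pred: List[List[float]]) -> float:
    """Mean squared error averaged over all samples and all output columns."""
    squared = [
        (t - p) ** 2
        for row_true, row_pred in zip(y_true, y_pred)
        for t, p in zip(row_true, row_pred)
    ]
    return sum(squared) / len(squared)


# Two toy samples, three horizons each (t+1, t+2, t+3); numbers are made up.
y_true = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
y_pred = [[1.0, 2.0, 5.0], [2.0, 3.0, 2.0]]  # only the last horizon is wrong

# Multi-step objective: one scalar that mixes the errors of every horizon.
multi_step_loss = mse(y_true, y_pred)

# Single-step objective: keep only the t+1 column of labels and predictions.
single_step_loss = mse([[r[0]] for r in y_true], [[r[0]] for r in y_pred])

print(multi_step_loss, single_step_loss)
```

In this toy case the t+1 predictions are already perfect, yet the multi-step loss is non-zero because the far horizon contributes to the same scalar. The single-step loss depends on the t+1 error alone.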

ConvLSTM_V3 therefore serves as a baseline for evaluating how the multi-step forecasting strategy affects short-term prediction accuracy.

ConvLSTM_V3 maintains the same core architecture and hyperparameter settings as ConvLSTM_V2, including the number of filters, kernel size, and input sequence length. The primary difference lies in the forecasting objective: V2 performs direct multi-step prediction over seven future time steps, whereas V3 focuses on single-step prediction (t+1), serving as a baseline for comparison.

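In short: unlike ConvLSTM_V2, which uses a direct multi-step forecasting strategy, ConvLSTM_V3 reduces the task to a single-step problem and predicts only the next time step (t+1). Changing the output layer from parallel multi-horizon outputs (t+1 to t+7) to a single output lets the model devote its full representational capacity to the short-term target and avoids gradient interference between forecast horizons.

In addition, the single-step setup lowers training difficulty and uncertainty, so convergence on the validation set is more stable. ConvLSTM_V3 is designed as a control baseline for the multi-step model, used to assess how the multi-step strategy affects short-term prediction accuracy.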

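| Category | Parameter / setting | **V2 (multi-step)** | **V3 (single-step)** | Note |
| ----------- | ---------------- | ------------- | ------------ | ------ |
| **Forecasting task** | forecasting type | multi-step | single-step | Different task definition |
| **Forecast horizon** | `horizon` | **7** | **1** | The key difference |
| **Output layer** | `Dense(...)` | `Dense(7)` | `Dense(1)` | Different output dimension |
| **Prediction target** | `y` | `[t+1 … t+7]` | `t+1` | Different label construction |
| **Loss computation** | loss | Average over all steps | Single-step loss | Different optimization objective |
| **Label shape** | `y.shape` | `(N, 7)` | `(N, 1)` | Different data structure |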
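
The label shapes in the table can be reproduced with a small sliding-window sketch. `make_windows` is a hypothetical helper written for this illustration, not the function used in either notebook, and the exact way V2 builds its labels is an assumption. Plain Python lists are used so the example runs without extra dependencies.

```python
from typing import List, Tuple


def make_windows(
    series: List[float], window: int, horizon: int
) -> Tuple[List[List[float]], List[List[float]]]:
    """Pair each `window`-long input slice with the `horizon` values that follow it."""
    X, y = [], []
    # The last valid start must leave room for the window plus all horizon steps.
    for start in range(len(series) - window - horizon + 1):
        end = start + window
        X.append(series[start:end])
        y.append(series[end:end + horizon])
    return X, y


series = [float(day) for day in range(1, 31)]  # toy data: day numbers 1..30

X_v2, y_v2 = make_windows(series, window=14, horizon=7)  # V2-style labels
X_v3, y_v3 = make_windows(series, window=14, horizon=1)  # V3-style labels

print(len(y_v2), len(y_v2[0]))  # label shape (N, 7)
print(len(y_v3), len(y_v3[0]))  # label shape (N, 1)
```

With the same series and window, the single-step setting also yields more samples, because each sample needs only one future value instead of seven.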


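Within the training split, the samples are built as follows:

- First training sample: X = days 1–14, y = day 15
- Second training sample: X = days 2–15, y = day 16
- …
- Last training sample: X = days 56–69, y = day 70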
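
The sample layout above can be checked with a few lines of Python. This sketch assumes a 14-day window and a training split covering days 1 to 70, which is what the first and last samples imply. The variable names are illustrative.

```python
window = 14          # lookback length, as in the example above
n_train_days = 70    # assumption: the training split covers days 1..70

samples = []
for s in range(window, n_train_days):   # s = 0-based index of the target day
    first_input_day = s - window + 1    # 1-based day number of the first input row
    last_input_day = s                  # 1-based day number of the last input row
    target_day = s + 1                  # 1-based day number of the label
    samples.append((first_input_day, last_input_day, target_day))

print(samples[0])    # (1, 14, 15)
print(samples[1])    # (2, 15, 16)
print(samples[-1])   # (56, 69, 70)
print(len(samples))  # 56
```

A 70-day training split with a 14-day window therefore produces 56 training samples.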

In [1]:
import os
import json
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple, Optional

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import joblib
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error

import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau


# -----------------------------
# Config
# -----------------------------
@dataclass
class Config:
    # CSV paths (update if needed)
    csv_paths: Optional[Dict[str, str]] = None  # set in main()

    # Columns
    date_col: str = "date"
    probe_col: str = "probe_name"

    # Feature set (8 variables)
    feature_cols: Optional[List[str]] = None  # set in main()
    target_col: str = "sm_30cm"

    # Time series settings
    window_list: Optional[List[int]] = None  # lookback windows to try, set in main()
    horizon: int = 1                       # ONLY t+1 for formal experiment
    split_ratio: Tuple[float, float, float] = (0.70, 0.15, 0.15)

    # Model hyperparameters (small model as requested)
    filters: int = 16                      # you can set 8 or 16
    dropout: float = 0.2                   # Dropout(0.2)
    kernel_size: Tuple[int, int] = (1, 3)  # convolution over feature axis (cols)
    batch_size: int = 16
    max_epochs: int = 150
    learning_rate: float = 1e-3

    # Output root folder
    output_root: str = "outputs_tplus1_smallwin"

    # Reproducibility
    seed: int = 42


def set_seed(seed: int = 42):
    """Fix random seeds for reproducibility."""
    np.random.seed(seed)
    tf.random.set_seed(seed)


# -----------------------------
# Data utilities
# -----------------------------
def load_and_clean_csv(path: str, cfg: Config) -> pd.DataFrame:
    """
    Load a CSV, parse date, keep required columns, and drop missing values.
    """
    df = pd.read_csv(path)
    df.columns = [c.strip() for c in df.columns]

    required = [cfg.date_col, cfg.probe_col] + list(cfg.feature_cols)
    # The target must survive the column filter even if it is not an input feature
    if cfg.target_col not in required:
        required.append(cfg.target_col)
    for c in required:
        if c not in df.columns:
            raise ValueError(
                f"Missing column '{c}' in {path}. Available: {df.columns.tolist()}"
            )

    df[cfg.date_col] = pd.to_datetime(df[cfg.date_col])
    df = df.sort_values(cfg.date_col).reset_index(drop=True)

    # Keep only needed columns
    df = df[required].copy()

    # Convert numeric columns (features and target); unparseable values become NaN
    numeric_cols = [c for c in required if c not in (cfg.date_col, cfg.probe_col)]
    for c in numeric_cols:
        df[c] = pd.to_numeric(df[c], errors="coerce")

    # Drop rows with missing essential values
    df = df.dropna(subset=numeric_cols).reset_index(drop=True)
    return df


def split_by_probe(df: pd.DataFrame, cfg: Config) -> Dict[str, pd.DataFrame]:
    """
    Split a dataframe into multiple dataframes keyed by probe_name.
    """
    out = {}
    for probe_name, g in df.groupby(cfg.probe_col):
        g = g.sort_values(cfg.date_col).reset_index(drop=True)
        out[str(probe_name)] = g
    return out


def chronological_split_indices(n: int, ratios: Tuple[float, float, float]) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    """
    Chronological split indices: train -> val -> test.
    """
    r_train, r_val, r_test = ratios
    if abs((r_train + r_val + r_test) - 1.0) > 1e-9:
        raise ValueError("split_ratio must sum to 1.0")

    n_train = int(math.floor(n * r_train))
    n_val = int(math.floor(n * r_val))
    n_test = n - n_train - n_val

    idx_train = np.arange(0, n_train)
    idx_val = np.arange(n_train, n_train + n_val)
    idx_test = np.arange(n_train + n_val, n)

    return idx_train, idx_val, idx_test


def make_tplus1_samples_with_context_borrow(
    X_scaled: np.ndarray,       # [N, F]
    y_scaled: np.ndarray,       # [N, 1]
    dates: np.ndarray,          # [N]
    window: int,
    idx_split: np.ndarray,      # indices for current split
    idx_prev: Optional[np.ndarray] = None  # previous split indices (train for val, val for test)
) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    """
    Create t+1 samples for one split, using Scheme A (context borrowing):
    - For val: prepend last `window` rows from train as input context
    - For test: prepend last `window` rows from val as input context
    - Targets (y) are strictly inside the current split (no leakage)

    For t+1, with s the position of the target row:
    - Input window: X[s-window : s]  (the `window` rows immediately before s)
    - Target: y[s]  (the row right after the window ends)

    Returns:
      X_seq: [num_samples, window, 1, F, 1]
      y_seq: [num_samples, 1]   (kept 2D for Keras)
      d_seq: [num_samples]      date of each target day
    """
    if len(idx_split) == 0:
        return np.empty((0,)), np.empty((0,)), np.empty((0,))

    # Build context arrays
    if idx_prev is None or len(idx_prev) == 0:
        ctx_idx = idx_split
        offset = 0
    else:
        prev_tail = idx_prev[-window:] if len(idx_prev) >= window else idx_prev
        ctx_idx = np.concatenate([prev_tail, idx_split])
        offset = len(prev_tail)

    X_ctx = X_scaled[ctx_idx]
    y_ctx = y_scaled[ctx_idx]
    d_ctx = dates[ctx_idx]

    # Despite the name, `starts` holds target positions s. Each s must lie in the
    # current split (s >= offset) and have a full window of history (s >= window).
    starts = [s for s in range(offset, len(X_ctx)) if s >= window]

    if len(starts) == 0:
        return np.empty((0,)), np.empty((0,)), np.empty((0,))

    F = X_ctx.shape[1]
    X_seq = np.zeros((len(starts), window, 1, F, 1), dtype=np.float32)
    y_seq = np.zeros((len(starts), 1), dtype=np.float32)
    d_seq = np.zeros((len(starts),), dtype="datetime64[ns]")

    for i, s in enumerate(starts):
        Xw = X_ctx[s - window: s]     # [window, F]
        yt = y_ctx[s]                 # [1]
        X_seq[i, :, 0, :, 0] = Xw
        # Index explicitly: float() on a 1-element array is deprecated in NumPy >= 1.25
        y_seq[i, 0] = float(np.ravel(yt)[0])
        d_seq[i] = d_ctx[s]           # date of the target day

    return X_seq, y_seq, d_seq


# -----------------------------
# Model
# -----------------------------
def build_convlstm_model(window: int, n_features: int, cfg: Config) -> tf.keras.Model:
    """
    ConvLSTM-style model for tabular daily time series.
    We treat features as a "spatial axis":
      input shape = (time, rows=1, cols=n_features, channels=1)
    """
    tf.keras.backend.clear_session()

    inp = layers.Input(shape=(window, 1, n_features, 1))

    x = layers.ConvLSTM2D(
        filters=cfg.filters,
        kernel_size=cfg.kernel_size,
        padding="same",
        activation="tanh",
        recurrent_activation="sigmoid",
        return_sequences=False
    )(inp)

    x = layers.BatchNormalization()(x)
    x = layers.Dropout(cfg.dropout)(x)

    x = layers.Flatten()(x)
    x = layers.Dense(32, activation="relu")(x)
    x = layers.Dropout(cfg.dropout)(x)

    # horizon = 1 => output one value (t+1)
    out = layers.Dense(1, activation="linear")(x)

    model = models.Model(inputs=inp, outputs=out)
    opt = tf.keras.optimizers.Adam(learning_rate=cfg.learning_rate)
    model.compile(optimizer=opt, loss="mse")
    return model


# -----------------------------
# Plotting / metrics
# -----------------------------
def plot_loss(history: tf.keras.callbacks.History, out_png: str, title: str):
    """Plot training and validation loss."""
    plt.figure(figsize=(8, 5))
    plt.plot(history.history.get("loss", []), label="train_loss")
    plt.plot(history.history.get("val_loss", []), label="val_loss")
    plt.title(title)
    plt.xlabel("epoch")
    plt.ylabel("loss (MSE on scaled y)")
    plt.legend()
    plt.tight_layout()
    plt.savefig(out_png, dpi=150)
    plt.close()


def plot_true_vs_pred(dates: np.ndarray, y_true: np.ndarray, y_pred: np.ndarray, out_png: str, title: str, ylabel: str):
    """Plot true vs predicted series."""
    plt.figure(figsize=(8, 5))
    plt.plot(dates, y_true, label="actual")
    plt.plot(dates, y_pred, label="prediction")
    plt.title(title)
    plt.xlabel("date")
    plt.ylabel(ylabel)
    plt.legend()
    plt.tight_layout()
    plt.savefig(out_png, dpi=150)
    plt.close()


def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    return float(np.sqrt(mean_squared_error(y_true, y_pred)))


# -----------------------------
# One experiment: (region, probe, window)
# -----------------------------
def run_one_experiment(df_probe: pd.DataFrame, region: str, probe: str, window: int, cfg: Config):
    """
    Train / evaluate / save outputs for one probe and one window size (t+1 only).
    """
    out_dir = os.path.join(cfg.output_root, f"window_{window}", region, probe)
    os.makedirs(out_dir, exist_ok=True)

    # Prepare arrays
    dates = df_probe[cfg.date_col].values
    X = df_probe[cfg.feature_cols].values.astype(np.float32)     # [N, F]
    y = df_probe[[cfg.target_col]].values.astype(np.float32)     # [N, 1]

    N, F = X.shape
    # Minimal sanity: need enough points for window and some val/test
    if N < (window + 10):
        print(f"[SKIP] {region}/{probe} window={window} -> too few rows: {N}")
        return

    # Chronological split indices
    idx_train, idx_val, idx_test = chronological_split_indices(N, cfg.split_ratio)

    # Fit scalers on TRAIN ONLY (no leakage)
    scaler_x = StandardScaler()
    scaler_y = StandardScaler()
    scaler_x.fit(X[idx_train])
    scaler_y.fit(y[idx_train])

    Xs = scaler_x.transform(X)
    ys = scaler_y.transform(y)

    # Build samples with Scheme A context borrowing
    X_train, y_train, d_train = make_tplus1_samples_with_context_borrow(
        Xs, ys, dates, window, idx_train, idx_prev=None
    )
    X_val, y_val, d_val = make_tplus1_samples_with_context_borrow(
        Xs, ys, dates, window, idx_val, idx_prev=idx_train
    )
    X_test, y_test, d_test = make_tplus1_samples_with_context_borrow(
        Xs, ys, dates, window, idx_test, idx_prev=idx_val
    )

    if len(X_train) == 0 or len(X_val) == 0 or len(X_test) == 0:
        print(f"[SKIP] {region}/{probe} window={window} -> empty samples "
              f"(train={len(X_train)}, val={len(X_val)}, test={len(X_test)})")
        return

    # Build model
    model = build_convlstm_model(window=window, n_features=F, cfg=cfg)

    # Callbacks: early stop often stops before max_epochs (this is normal)
    early_stop = EarlyStopping(monitor="val_loss", patience=10, restore_best_weights=True)
    reduce_lr = ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=5, min_lr=1e-6, verbose=1)

    # Train
    history = model.fit(
        X_train, y_train,
        validation_data=(X_val, y_val),
        epochs=cfg.max_epochs,
        batch_size=cfg.batch_size,
        callbacks=[early_stop, reduce_lr],
        verbose=1
    )

    # Save training history
    hist_df = pd.DataFrame({
        "epoch": np.arange(len(history.history["loss"])),
        "loss": history.history["loss"],
        "val_loss": history.history["val_loss"]
    })
    hist_df.to_csv(os.path.join(out_dir, "history.csv"), index=False)

    # Plot loss curve
    plot_loss(
        history,
        out_png=os.path.join(out_dir, "loss_train_val.png"),
        title=f"LOSS: {region}/{probe} (window={window}, horizon=1)"
    )

    # Predict on test
    y_pred_scaled = model.predict(X_test, verbose=0)   # [Ntest, 1]

    # Inverse scale back to original units
    y_true = scaler_y.inverse_transform(y_test)        # [Ntest, 1]
    y_pred = scaler_y.inverse_transform(y_pred_scaled) # [Ntest, 1]

    # Compute metrics (t+1)
    mae = float(mean_absolute_error(y_true[:, 0], y_pred[:, 0]))
    r = rmse(y_true[:, 0], y_pred[:, 0])

    metrics = {
        "region": region,
        "probe_name": probe,
        "window": window,
        "horizon": 1,
        "n_rows": int(N),
        "n_train_samples": int(len(X_train)),
        "n_val_samples": int(len(X_val)),
        "n_test_samples": int(len(X_test)),
        "filters": cfg.filters,
        "dropout": cfg.dropout,
        "MAE_t+1": mae,
        "RMSE_t+1": r
    }
    with open(os.path.join(out_dir, "summary_metrics.json"), "w", encoding="utf-8") as f:
        json.dump(metrics, f, indent=2)

    # Plot test true vs pred (t+1)
    test_dates = pd.to_datetime(d_test)  # d_test already corresponds to target date
    plot_true_vs_pred(
        dates=test_dates,
        y_true=y_true[:, 0],
        y_pred=y_pred[:, 0],
        out_png=os.path.join(out_dir, "test_true_vs_pred_tplus1.png"),
        title=f"Test True vs Pred (t+1): {region}/{probe} | window={window}",
        ylabel="sm_30cm (m3/m3)"
    )

    # Save test compare CSV
    test_df = pd.DataFrame({
        "target_date": test_dates,
        "y_true": y_true[:, 0],
        "y_pred": y_pred[:, 0]
    })
    test_df.to_csv(os.path.join(out_dir, "test_compare_tplus1.csv"), index=False)

    # Save model and scalers
    model.save(os.path.join(out_dir, "model.h5"))
    joblib.dump(scaler_x, os.path.join(out_dir, "scaler_x.pkl"))
    joblib.dump(scaler_y, os.path.join(out_dir, "scaler_y.pkl"))

    # Forecast the next day AFTER the last available date
    # Use last `window` X values as input
    last_X_window = Xs[-window:]                                  # [window, F] scaled
    last_X_seq = last_X_window.reshape(1, window, 1, F, 1)        # [1, window, 1, F, 1]
    next_day_scaled = model.predict(last_X_seq, verbose=0)        # [1, 1]
    next_day_value = scaler_y.inverse_transform(next_day_scaled)[0, 0]

    last_date = pd.to_datetime(df_probe[cfg.date_col].iloc[-1])
    next_date = last_date + pd.to_timedelta(1, unit="D")

    future_df = pd.DataFrame({
        "forecast_date": [next_date],
        "pred_sm_30cm": [float(next_day_value)]
    })
    future_df.to_csv(os.path.join(out_dir, "last_date_forecast_next_1_day.csv"), index=False)

    print(f"[OK] {region}/{probe} window={window} -> saved to: {out_dir}")




In [2]:
# -----------------------------
# Main
# -----------------------------
def main():
    cfg = Config()

    # Your provided paths
    cfg.csv_paths = {
        "Grandvillers_Sec": r"D:\UV Projet\Soil Moisture\Grandvillers_Sec.csv",
        "Grandvillers_Canon": r"D:\UV Projet\Soil Moisture\Grandvillers-Canon.csv",
        "Grandvillers_Robot_20": r"D:\UV Projet\Soil Moisture\Grandvillers-Robot-20.csv",
        "Grandvillers_Robot": r"D:\UV Projet\Soil Moisture\Grandvillers-Robot.csv",
    }

    # Your 8 input variables (keep sm_30cm in X as autoregressive feature)
    cfg.feature_cols = [
        "sm_30cm",
        "irrig_mm",
        "IRRAD",
        "TMIN",
        "TMAX",
        "VAP",
        "WIND",
        "RAIN",
    ]

    # Only t+1 (formal)
    cfg.horizon = 1

    # Windows you want to test
    cfg.window_list = [14, 30]  # you can change to [14] only

    # Small model settings (as you requested)
    cfg.filters = 16     # or 8
    cfg.dropout = 0.2

    set_seed(cfg.seed)
    os.makedirs(cfg.output_root, exist_ok=True)

    all_probes = set()

    for region, path in cfg.csv_paths.items():
        df = load_and_clean_csv(path, cfg)
        probes = split_by_probe(df, cfg)

        print(f"\nRegion={region} | probes={list(probes.keys())} (count={len(probes)})")
        for p in probes.keys():
            all_probes.add(p)

        for probe_name, df_probe in probes.items():
            for window in cfg.window_list:
                run_one_experiment(df_probe, region, probe_name, window, cfg)

    print(f"\nAll unique probe_name values across {len(cfg.csv_paths)} CSV files:")
    print(sorted(all_probes))
    print(f"Total unique probe_name: {len(all_probes)}")
    print(f"\nDone. Outputs saved under: {cfg.output_root}")


if __name__ == "__main__":
    main()
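
Each run writes a `summary_metrics.json` under `output_root/window_<w>/<region>/<probe>/`, so results for different windows can be compared by walking that folder. The following is a minimal sketch using only the standard library. `collect_metrics` is a hypothetical helper that is not part of the notebook, and it assumes the JSON keys written by `run_one_experiment`.

```python
import json
import os
from typing import Dict, List


def collect_metrics(output_root: str) -> List[Dict]:
    """Gather every summary_metrics.json found under `output_root`."""
    rows = []
    for dirpath, _dirnames, filenames in os.walk(output_root):
        if "summary_metrics.json" in filenames:
            path = os.path.join(dirpath, "summary_metrics.json")
            with open(path, encoding="utf-8") as f:
                rows.append(json.load(f))
    # Sort so that runs for the same probe sit next to each other.
    rows.sort(key=lambda r: (r["region"], r["probe_name"], r["window"]))
    return rows


# Prints nothing if the folder does not exist yet.
for row in collect_metrics("outputs_tplus1_smallwin"):
    print(row["region"], row["probe_name"], row["window"],
          row["MAE_t+1"], row["RMSE_t+1"])
```

Listing the 14-day and 30-day runs of the same probe side by side makes it easier to see whether the longer lookback window changes the t+1 error.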



Region=Grandvillers_Sec | probes=['Sec'] (count=1)
Epoch 1/150
Epoch 2/150
Epoch 3/150
Epoch 4/150
Epoch 5/150
Epoch 6/150
Epoch 7/150
Epoch 8/150
Epoch 9/150
Epoch 10/150
1/5 [=====>........................] - ETA: 0s - loss: 0.2210
Epoch 00010: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
Epoch 11/150
Epoch 12/150
Epoch 13/150
Epoch 14/150
Epoch 15/150
1/5 [=====>........................] - ETA: 0s - loss: 0.3181
Epoch 00015: ReduceLROnPlateau reducing learning rate to 0.0002500000118743628.
[OK] Grandvillers_Sec/Sec window=14 -> saved to: outputs_tplus1_smallwin\window_14\Grandvillers_Sec\Sec
Epoch 1/150
Epoch 2/150
Epoch 3/150
Epoch 4/150
Epoch 5/150
Epoch 6/150
Epoch 00006: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
Epoch 7/150
Epoch 8/150
Epoch 9/150
Epoch 10/150
Epoch 11/150
Epoch 00011: ReduceLROnPlateau reducing learning rate to 0.0002500000118743628.
[OK] Grandvillers_Sec/Sec window=30 -> saved to: outputs_tplus1_smallwin\window_