**Time-Series Forecasting Pipeline – Quick Guide**

---

### Purpose

End-to-end pipeline for multivariate time-series forecasting with Darts deep-learning models
(**N-Beats, NHiTS, TiDE, TiDE+RIN, NLinear**) and classical baselines (ARIMA, AutoARIMA, LinearRegression).

---

### Supported Datasets

| Dataset       | Wells / Series                               | Target                            | File                                            |
| ------------- | -------------------------------------------- | --------------------------------- | ----------------------------------------------- |
| **VOLVE**     | 15/9-F-14, 15/9-F-12, 15/9-F-11, 15/9-F-15 D | `BORE_OIL_VOL`                    | `data/volve/Volve_Equinor.csv`                  |
| **UNISIM**    | Prod-1 … Prod-10                             | `QOOB`                            | `data/unisim/production.csv`                    |
| **UNISIM IV** | P16                                          | `BORE_OIL_VOL`                    | `data/UNISIM-IV-2026/Well_{well}_UNISIM-IV.csv` |
| **OPSD**      | wind / solar / load                          | `GB_GBN_<type>_generation_actual` | `data/OPSD/time_series_30min_singleindex.csv`   |

---

### Workflow

1. **Load & Clean**
   `DataSource(cfg).get_loader().load()`
   *Handles zero removal, cumulative sums, feature engineering.*

2. **Prepare Series**

   * Split → train | validation | test
   * Build Darts `TimeSeries` for target & covariates
   * Scale with `Scaler`

3. **Train** (`train_deep_encoder_model`)
   Common args: `input_chunk_length=7`, `output_chunk_length=56`, early stopping, LR scheduler.
   Works for deep nets **and** classical models (ARIMA, AutoARIMA, LinearRegression).

4. **Forecast**
   `fast_iterative_forecast` – forecasts the full horizon for the first window, then rolls ahead one step at a time.

5. **Evaluate**
   R², SMAPE, MAE + optional cumulative plots; metrics collected per *(dataset, well, model)*.
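
The one-line description of step 4 leaves some room for interpretation. The sketch below shows one plausible reading: forecast the full horizon once from the end of the training data, then slide the input window forward one observed step at a time and keep only the newest forecast point. `iterative_forecast` and the toy `naive_mean_model` are hypothetical stand-ins for illustration; they are not the pipeline's `fast_iterative_forecast` or a Darts model.

```python
from typing import Callable, List, Sequence

# A "model" here is any callable: (input window, n steps) -> n predictions.
Model = Callable[[Sequence[float], int], List[float]]


def naive_mean_model(window: Sequence[float], n: int) -> List[float]:
    """Toy stand-in for a trained model: repeat the window mean n times."""
    mean = sum(window) / len(window)
    return [mean] * n


def iterative_forecast(
    series: Sequence[float],
    train_end: int,
    lag: int,
    horizon: int,
    model: Model = naive_mean_model,
) -> List[float]:
    """Predict series[train_end:] (assumed reading of the roll-ahead scheme)."""
    n_test = len(series) - train_end

    # First window: predict the whole horizon in one shot.
    preds = list(model(series[train_end - lag:train_end], min(horizon, n_test)))

    # Afterwards: slide the window by one observed step and keep only
    # the newest (last) point of each fresh horizon-length forecast.
    for t in range(train_end + 1, train_end + n_test - horizon + 1):
        window = series[t - lag:t]
        preds.append(model(window, horizon)[-1])

    return preds


series = list(range(20))
preds = iterative_forecast(series, train_end=10, lag=3, horizon=4)
print(len(preds), preds[:5])
```

The result always has one prediction per test step, so it can be compared point by point with the held-out values. Whether the real function keeps the last or the first point of each rolled forecast is not stated in this guide.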

---

### Quick Configuration

```python
MODEL_TYPES      = ["TiDE", "NLinear"]
LAG_WINDOW       = 7
FORECAST_HORIZON = 56
TRAIN_SIZE       = 150 + FORECAST_HORIZON
DATA_SOURCES     = build_sources_with_opsd_variants()  # wind, solar, load
run_pipeline(DATA_SOURCES)
```

Everything else (parallel training, covariate handling, plotting) is managed internally.
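
To get a feel for what these settings imply, the sketch below counts the (input, target) windows a training split would yield. It assumes `TRAIN_SIZE` is the number of training time steps and that each training sample is one contiguous block of `LAG_WINDOW` inputs followed by `FORECAST_HORIZON` targets; both are assumptions about the pipeline, and the helper names are hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple

LAG_WINDOW = 7
FORECAST_HORIZON = 56
TRAIN_SIZE = 150 + FORECAST_HORIZON


def count_training_windows(n_points: int, lag: int, horizon: int) -> int:
    """Number of contiguous (lag inputs, horizon targets) pairs in n_points."""
    return max(0, n_points - lag - horizon + 1)


def sliding_windows(
    values: Sequence[float], lag: int, horizon: int
) -> Iterator[Tuple[List[float], List[float]]]:
    """Yield every (inputs, targets) pair, moving one step at a time."""
    for start in range(count_training_windows(len(values), lag, horizon)):
        split = start + lag
        yield list(values[start:split]), list(values[split:split + horizon])


n_windows = count_training_windows(TRAIN_SIZE, LAG_WINDOW, FORECAST_HORIZON)
print(f"{TRAIN_SIZE} training steps -> {n_windows} windows")
```

Under these assumptions, a short lag window paired with a long horizon means most of each sample is target rather than input, which is worth keeping in mind when changing either value.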

In [1]:
%%capture capturado
# To see the plots, comment out the line above
# =====================================================================
# 0. Imports
# =====================================================================
import logging
import warnings
from pprint import pprint

from data.data_loading      import DataSource
from common.preprocessing   import process_deep_encoder_data_source
from common.config_wells    import get_data_sources
from evaluation.evaluation  import display_metrics

# =====================================================================
# 1. Hyper-parameters (user editable)
# =====================================================================
# Available models: "NHiTS", "TiDE", "TiDE+RIN", "NLinear", "N-Beats", "ARIMA", "AutoARIMA", "LinearRegression"
MODEL_TYPES       = ["NHiTS", "TiDE", "TiDE+RIN", "NLinear", "ARIMA", "AutoARIMA", "LinearRegression"]
# MODEL_TYPES       = ["NHiTS"]
LAG_WINDOW        = 7
FORECAST_HORIZON  = 56
TRAIN_SIZE        = 150 + FORECAST_HORIZON
SAMPLING_RATE     = 1

base_sources = get_data_sources()                         # OPSD type defaults to 'wind'
non_opsd     = [d for d in base_sources if d["name"] != "OPSD"]

opsd_variants = []
for t in ("wind", "solar", "load"):
    cfg_opsd = next(
        d for d in get_data_sources(opsd_type=t) if d["name"] == "OPSD"
    )
    opsd_variants.append(cfg_opsd)

DATA_SOURCES = opsd_variants + non_opsd      # final list used by the loop

# =====================================================================
# 2. Silence warnings / logs
# =====================================================================
warnings.filterwarnings("ignore")
logging.getLogger("pytorch_lightning").setLevel(logging.ERROR)

# =====================================================================
# 3. Main loop – all datasets, all wells (incl. OPSD wind|solar|load)
# =====================================================================
all_metrics = []

for ds_cfg in DATA_SOURCES:
    print(f"\n=== Loading dataset: {ds_cfg['name']} | wells: {ds_cfg['wells']} ===")

    raw_data = DataSource(ds_cfg).get_loader().load()  # DataFrame or {well: DataFrame}

    process_deep_encoder_data_source(
        data_source         = ds_cfg,
        train_size          = TRAIN_SIZE,
        forecast_horizon    = FORECAST_HORIZON,
        lag_window          = LAG_WINDOW,
        sampling_rate       = SAMPLING_RATE,
        metrics_accumulator = all_metrics,
        preloaded_data      = raw_data,
        model_types         = MODEL_TYPES,
    )

# =====================================================================
# 4. Show final metrics
# =====================================================================
print("\n=========== Cumulative metrics ===========")
pprint(all_metrics)          # raw view
display_metrics(all_metrics) # your formatted view

In [2]:
display_metrics(all_metrics) # your formatted view

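| Unnamed: 0 | Poço (well) | Método (method)  | R²       | SMAPE   | MAE          |
| ---------- | ----------- | ---------------- | -------- | ------- | ------------ |
| 0          | wind        | NHiTS            | 0.8778   | 16.92%  | 1621191.0207 |
| 1          | wind        | TiDE             | 0.9985   | 3.35%   | 183539.4278  |
| 2          | wind        | TiDE+RIN         | 0.9864   | 5.77%   | 520753.8617  |
| 3          | wind        | NLinear          | 0.8356   | 20.78%  | 1801662.4469 |
| 4          | wind        | ARIMA            | 0.9459   | 8.15%   | 870471.1148  |
| 5          | wind        | AutoARIMA        | 0.9817   | 9.86%   | 640447.5073  |
| 6          | wind        | LinearRegression | 0.9685   | 11.14%  | 775771.2428  |
| 7          | solar       | NHiTS            | -61.8879 | 130.43% | 9592546.0663 |
| 8          | solar       | TiDE             | -10.0008 | 88.29%  | 4010932.6168 |
| 9          | solar       | TiDE+RIN         | -37.9547 | 119.24% | 7539571.1023 |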


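| Unnamed: 0 | Poço (well) | Método (method) | R²       | SMAPE     | MAE          |
| ---------- | ----------- | --------------- | -------- | --------- | ------------ |
| 0          | wind        | NHiTS           | 0.877751 | 16.924927 | 1.621191e+06 |
| 1          | wind        | TiDE            | 0.998467 | 3.352997  | 1.835394e+05 |
| 2          | wind        | TiDE+RIN        | 0.986379 | 5.765123  | 5.207539e+05 |
| 3          | wind        | NLinear         | 0.835616 | 20.784380 | 1.801662e+06 |
| 4          | wind        | ARIMA           | 0.945850 | 8.154996  | 8.704711e+05 |
| ...        | ...         | ...             | ...      | ...       | ...          |
| 121        | Prod-10     | TiDE+RIN        | 0.985956 | 4.971158  | 2.920369e+05 |
| 122        | Prod-10     | NLinear         | 0.988998 | 4.432929  | 2.595589e+05 |
| 123        | Prod-10     | ARIMA           | 0.999955 | 0.251758  | 1.464054e+04 |
| 124        | Prod-10     | AutoARIMA       | 0.990675 | 4.016034  | 2.377142e+05 |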
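
For reading the R², SMAPE and MAE columns, the sketch below implements the textbook definitions in plain Python. The pipeline's own metric code is not shown here, so these formulas are an assumption (in particular, SMAPE is taken on the 0 to 200% scale, which is consistent with values above 100% in the solar rows). It also illustrates why R² can be strongly negative: any forecast that fits worse than a constant prediction at the mean of the actual values scores below zero.

```python
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error, in the units of the target."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def smape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Symmetric MAPE in percent (0 to 200); pairs where both are 0 are skipped."""
    terms = [
        2 * abs(a - p) / (abs(a) + abs(p))
        for a, p in zip(actual, predicted)
        if abs(a) + abs(p) > 0
    ]
    return 100 * sum(terms) / len(terms) if terms else 0.0


def r2(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


actual = [1, 2, 3, 4]
reversed_forecast = [4, 3, 2, 1]  # deliberately poor: the trend is inverted

print("MAE  :", mae(actual, reversed_forecast))
print("SMAPE:", smape(actual, reversed_forecast))
print("R2   :", r2(actual, reversed_forecast))  # negative: worse than the mean
```

Because MAE is expressed in target units, it is only comparable between rows that share the same target series, whereas R² and SMAPE are scale-free.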
