# Experimentation - LSTM in Climate Change - ERA5 to ERA5 (Univariate)

This notebook trains and tests an LSTM on ERA5 data to predict daily total precipitation in Alegrete from its own past values.

## Workflow:
1. Load and inspect data
2. Create input/output sequences
3. Scale and reshape data
4. Train an LSTM model
5. Generate and evaluate forecasts

In [52]:
# Third-party libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from sklearn import preprocessing
import xarray as xr

# Project helper functions (local src package)
from src.data import load_interim
from src.data.preprocessing import create_sliding_windows
from src.data.preprocessing import prepare_data_seq_to_one


In [3]:
np.random.seed(42)
tf.random.set_seed(42)

## 1. Load and inspect data

In [38]:
# Load the ERA5 daily precipitation time series for Alegrete
path = load_interim('era5_precipitation_timeseries_alegrete_1D.nc')

nc_dataset = xr.open_dataset(path, engine="netcdf4")

# Extract the variable tp (total precipitation) and convert it to a DataFrame
tp = nc_dataset["tp"]
df = tp.to_dataframe(name="total_precipitation")

print(df)

            latitude  longitude  total_precipitation
valid_time                                          
1979-01-01    -29.75      -55.5             0.171185
1979-01-02    -29.75      -55.5             0.332832
1979-01-03    -29.75      -55.5             0.007153
1979-01-04    -29.75      -55.5             0.000000
1979-01-05    -29.75      -55.5             0.000000
...              ...        ...                  ...
2025-12-27    -29.75      -55.5            11.691570
2025-12-28    -29.75      -55.5            21.647930
2025-12-29    -29.75      -55.5            42.373180
2025-12-30    -29.75      -55.5             0.635624
2025-12-31    -29.75      -55.5             1.400471

[17167 rows x 3 columns]
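
Before building sequences, it can help to confirm that the daily index has no gaps, since sliding windows assume evenly spaced steps. The sketch below is illustrative only: `check_daily_series` and `demo_df` are hypothetical names, and the values are random stand-ins covering the same date range as the output above, not the ERA5 data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the notebook's `df`: same date range, random values.
index = pd.date_range("1979-01-01", "2025-12-31", freq="D", name="valid_time")
rng = np.random.default_rng(0)
demo_df = pd.DataFrame({"total_precipitation": rng.random(len(index))}, index=index)


def check_daily_series(frame, column):
    """Return (n_rows, n_missing_days, n_nan, n_negative) for a daily series."""
    # Every calendar day between the first and last timestamp should be present.
    expected = pd.date_range(frame.index.min(), frame.index.max(), freq="D")
    n_missing_days = len(expected.difference(frame.index))
    n_nan = int(frame[column].isna().sum())
    # Precipitation should never be negative.
    n_negative = int((frame[column] < 0).sum())
    return len(frame), n_missing_days, n_nan, n_negative


summary = check_daily_series(demo_df, "total_precipitation")
print(summary)
```

A complete daily calendar from 1979-01-01 to 2025-12-31 has 17167 days, which matches the row count printed above.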


## 2. Create input/output sequences

### Sequence configuration

- Timesteps (input window): 14 days
- Lead time (forecast horizon): 1 day ahead

### Create sliding windows

In [50]:
TIMESTEPS = 14    # past days used as input
HORIZON = 1       # predict 1 day ahead

series = df["total_precipitation"]

X, y = create_sliding_windows(series, TIMESTEPS, HORIZON)

print("X shape:", X.shape)
print("y shape:", y.shape)


X shape: (17153, 14)
y shape: (17153, 1)
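
The implementation of `create_sliding_windows` is not shown in this notebook. As a minimal sketch of the windowing idea, assuming each sample pairs `timesteps` consecutive values with the single value `horizon` steps after the window ends, a NumPy version could look like the following. The name `sliding_windows_sketch` is hypothetical and is not the project helper.

```python
import numpy as np


def sliding_windows_sketch(values, timesteps, horizon):
    """Build (X, y) pairs: X holds `timesteps` past values, y the value
    `horizon` steps after the end of each window."""
    values = np.asarray(values, dtype=float)
    n_samples = len(values) - timesteps - horizon + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested window")
    # Row i of `idx` selects positions i, i+1, ..., i+timesteps-1.
    idx = np.arange(n_samples)[:, None] + np.arange(timesteps)[None, :]
    X = values[idx]
    # The target sits `horizon` steps after the last input position.
    y = values[timesteps + horizon - 1:][:, None]
    return X, y


# Tiny example: the series 0..9 with a 3-step window and a 1-step horizon.
X_demo, y_demo = sliding_windows_sketch(np.arange(10.0), timesteps=3, horizon=1)
print(X_demo[0], y_demo[0])    # first window and its target
print(X_demo.shape, y_demo.shape)
```

Under this assumption the number of samples is `len(series) - TIMESTEPS - HORIZON + 1`, that is 17167 - 14 - 1 + 1 = 17153, which agrees with the shapes printed above.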


## 3. Scale and reshape data

In [57]:
X_scaled, y_scaled, scaler_x, scaler_y = prepare_data_seq_to_one(X, y, num_features=1)



0.0 1.0 0.0 1.0
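
The internals of `prepare_data_seq_to_one` are not shown here. The printed `0.0 1.0 0.0 1.0` looks consistent with min-max scaling of inputs and targets to the range [0, 1], so the sketch below assumes that behavior, plus a reshape to the `(samples, timesteps, features)` layout that Keras LSTM layers expect. The function name `scale_and_reshape_sketch` and the random demo data are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler


def scale_and_reshape_sketch(X, y):
    """Min-max scale a univariate (samples, timesteps) input and a
    (samples, 1) target, then reshape X to (samples, timesteps, 1)."""
    n_samples, timesteps = X.shape
    scaler_x = MinMaxScaler()
    scaler_y = MinMaxScaler()
    # Flatten to one column so every lag shares a single min and max.
    X_flat = scaler_x.fit_transform(X.reshape(-1, 1))
    X_scaled = X_flat.reshape(n_samples, timesteps, 1)
    y_scaled = scaler_y.fit_transform(y)
    return X_scaled, y_scaled, scaler_x, scaler_y


# Random, skewed stand-in data with the same window length as the notebook.
rng = np.random.default_rng(42)
X_raw = rng.gamma(shape=0.5, scale=8.0, size=(100, 14))
y_raw = rng.gamma(shape=0.5, scale=8.0, size=(100, 1))

X_s, y_s, sx, sy = scale_and_reshape_sketch(X_raw, y_raw)
print(X_s.shape, y_s.shape)

# Predictions made in scaled space can be mapped back to the original units.
y_back = sy.inverse_transform(y_s)
```

Note that this sketch fits the scalers on all samples at once. A common way to avoid leaking test-set statistics is to fit them on the training split only and reuse them on the test split; whether the project helper does this is not visible from the notebook.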
