Because training neural networks is computationally expensive, **we recommend running this notebook in [Google Colab](https://colab.research.google.com)**, although it can also be run locally.

In [Google Colab](https://colab.research.google.com), click `upload` and select this notebook. Remember to download it afterwards via `File->Download .ipynb`.

**The exam must be submitted with all cells executed; any cell that has not been executed will not be graded.**

The exam consists of the parts listed below, with the points indicated for each. The maximum score is 10.

    
- [Activity 1: Recurrent Networks](#actividad_1): 10 pts
    - [Question 1](#3.1): 2.5 pts
    - [Question 2](#3.2): 2.5 pts
    - [Question 3](#3.3): 2.5 pts
    - [Question 4](#3.4): 1.25 pts
    - [Question 5](#3.5): 1.25 pts



In [1]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

<a name='actividad_1'></a>
# Activity 1: Recurrent Networks


- [Question 1](#3.1): 2.5 pts
- [Question 2](#3.2): 2.5 pts
- [Question 3](#3.3): 2.5 pts
- [Question 4](#3.4): 1.25 pts
- [Question 5](#3.5): 1.25 pts

We will use a dataset of daily minimum temperatures in Melbourne. The task is to predict the minimum temperature two days ahead. You may use time-series techniques covered in other courses, but this is not required.


In [2]:
dataset_url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-min-temperatures.csv'
data_dir = tf.keras.utils.get_file('daily-min-temperatures.csv', origin=dataset_url)

In [3]:
df = pd.read_csv(data_dir, parse_dates=['Date'])
df.head()

| | Date | Temp |
|---|---|---|
| 0 | 1981-01-01 | 20.7 |
| 1 | 1981-01-02 | 17.9 |
| 2 | 1981-01-03 | 18.8 |
| 3 | 1981-01-04 | 14.6 |
| 4 | 1981-01-05 | 15.8 |


In [4]:
temperatures = df['Temp'].values
print('number of samples:', len(temperatures))
train_data = temperatures[:3000]
test_data = temperatures[3000:]
print('number of train samples:', len(train_data))
print('number of test samples:', len(test_data))
print('first train samples:', train_data[:10])

number of samples: 3650
number of train samples: 3000
number of test samples: 650
first train samples: [20.7 17.9 18.8 14.6 15.8 15.8 15.8 17.4 21.8 20. ]


<a name='3.1'></a>
## Question 1: Convert `train_data` and `test_data` into windows of size 5 to predict the value 2 days ahead

In the notation of [Introduction_to_RNN_Time_Series.ipynb](https://github.com/ezponda/intro_deep_learning/blob/main/class/RNN/Introduction_to_RNN_Time_Series.ipynb):
```python
past, future = (5, 2)
```

For the first 10 samples of `train_data`, `[20.7, 17.9, 18.8, 14.6, 15.8, 15.8, 15.8, 17.4, 21.8, 20.0]`, the result should be:

```python
x[0] : [20.7, 17.9, 18.8, 14.6, 15.8] , y[0]: 15.8
x[1] : [17.9, 18.8, 14.6, 15.8, 15.8] , y[1]: 17.4
x[2] : [18.8, 14.6, 15.8, 15.8, 15.8] , y[2]: 21.8
x[3] : [14.6, 15.8, 15.8, 15.8, 17.4] , y[3]: 20.0
```

In [5]:
def create_windows_np(data, window_size, horizon, shuffle=False):
    """
    Creates a dataset from the given time series data using NumPy.

    Parameters:
    data (np.ndarray): Time series data with one dimension.
    window_size (int): The number of past time steps to use as input features.
    horizon (int): How many steps after the end of the window the target lies.
    shuffle (bool): Shuffle the windows or not.

    Returns:
    tuple: A tuple containing the input-output pairs (windows, targets) as NumPy arrays.
    """

    X, y = [], []
    for i in range(len(data) - window_size - horizon + 1):
        X.append(data[i:i+window_size])
        y.append(data[i+window_size+horizon-1])

    X, y = np.array(X), np.array(y)

    if shuffle:
        indices = np.arange(len(X))
        np.random.shuffle(indices)
        X, y = X[indices], y[indices]

    return X, y

def create_windows_tf(data, window_size, horizon, shuffle=False):
    """
    Creates a dataset from the given time series data using tf.data.Dataset.

    Parameters:
    data (np.ndarray): Time series data with one dimension.
    window_size (int): The number of past time steps to use as input features.
    horizon (int): How many steps after the end of the window the target lies.
    shuffle (bool): Whether to shuffle the data or not.

    Returns:
    tf.data.Dataset: The resulting dataset.
    """
    ts_data = tf.data.Dataset.from_tensor_slices(data)
    ts_data = ts_data.window(window_size + horizon, shift=1, drop_remainder=True)
    ts_data = ts_data.flat_map(lambda window: window.batch(window_size + horizon))
    ts_data = ts_data.map(lambda window: (window[:window_size], window[-1]))
    if shuffle:
        ts_data = ts_data.shuffle(buffer_size=data.shape[0])
    return ts_data
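
The windowing rule above (inputs `data[i:i+window_size]`, target `data[i+window_size+horizon-1]`) can also be written without an explicit loop. The following is a minimal sketch, not part of the original solution; `create_windows_vectorized` is a hypothetical helper name, and it assumes NumPy 1.20 or later for `sliding_window_view`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def create_windows_vectorized(data, window_size, horizon):
    """Hypothetical vectorized variant of create_windows_np (no shuffling)."""
    data = np.asarray(data)
    # Each row holds window_size inputs followed by horizon future steps.
    full = sliding_window_view(data, window_size + horizon)
    X = full[:, :window_size]
    y = full[:, -1]  # the value `horizon` steps after the window ends
    # Copy so the results are independent arrays rather than views of `data`.
    return X.copy(), y.copy()


sample = np.array([20.7, 17.9, 18.8, 14.6, 15.8, 15.8, 15.8, 17.4, 21.8, 20.0])
X_demo, y_demo = create_windows_vectorized(sample, window_size=5, horizon=2)
for x_row, target in zip(X_demo, y_demo):
    print(x_row, target)
```

On the ten sample values this yields the same four pairs listed in the statement of Question 1.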

In [6]:
import time

# Time the NumPy implementation (windows are materialized eagerly).
inicio = time.time()
X_trial, y_trial = create_windows_np(train_data,
                                     window_size=5,
                                     horizon=2,
                                     shuffle=False)
fin = time.time()
print(f"Time taken by the NumPy function: {fin-inicio}")

# Time the tf.data implementation. The pipeline is lazy, so this only
# measures building it, not iterating over the windows.
inicio = time.time()
ts_dataset = create_windows_tf(train_data,
                               window_size=5,
                               horizon=2,
                               shuffle=False)
fin = time.time()
print(f"Time taken by the TensorFlow function: {fin-inicio}")

Time taken by the NumPy function: 0.010882854461669922
Time taken by the TensorFlow function: 0.6868076324462891


In [7]:
for ind in range(len(y_trial)):
    print(X_trial[ind, :], y_trial[ind])

[20.7 17.9 18.8 14.6 15.8] 15.8
[17.9 18.8 14.6 15.8 15.8] 17.4
[18.8 14.6 15.8 15.8 15.8] 21.8
[14.6 15.8 15.8 15.8 17.4] 20.0
[15.8 15.8 15.8 17.4 21.8] 16.2
[15.8 15.8 17.4 21.8 20. ] 13.3
[15.8 17.4 21.8 20.  16.2] 16.7
[17.4 21.8 20.  16.2 13.3] 21.5
[21.8 20.  16.2 13.3 16.7] 25.0
[20.  16.2 13.3 16.7 21.5] 20.7
[16.2 13.3 16.7 21.5 25. ] 20.6
[13.3 16.7 21.5 25.  20.7] 24.8
[16.7 21.5 25.  20.7 20.6] 17.7
[21.5 25.  20.7 20.6 24.8] 15.5
[25.  20.7 20.6 24.8 17.7] 18.2
[20.7 20.6 24.8 17.7 15.5] 12.1
[20.6 24.8 17.7 15.5 18.2] 14.4
[24.8 17.7 15.5 18.2 12.1] 16.0
[17.7 15.5 18.2 12.1 14.4] 16.5
[15.5 18.2 12.1 14.4 16. ] 18.7
[18.2 12.1 14.4 16.  16.5] 19.4
[12.1 14.4 16.  16.5 18.7] 17.2
[14.4 16.  16.5 18.7 19.4] 15.5
[16.  16.5 18.7 19.4 17.2] 15.1
[16.5 18.7 19.4 17.2 15.5] 15.4
[18.7 19.4 17.2 15.5 15.1] 15.3
[19.4 17.2 15.5 15.1 15.4] 18.8
[17.2 15.5 15.1 15.4 15.3] 21.9
[15.5 15.1 15.4 15.3 18.8] 19.9
[15.1 15.4 15.3 18.8 21.9] 16.6
[15.4 15.3 18.8 21.9 19.9] 16.8
[15.3 18

In [8]:
past, future = (5, 2)
X_train, y_train = create_windows_np(train_data, window_size=past, horizon=future, shuffle=True)
X_test, y_test = create_windows_np(test_data, window_size=past, horizon=future, shuffle=False)

In [9]:
print(f'Train shape: {X_train.shape}')
print(f'Test shape: {X_test.shape}')

Train shape: (2994, 5)
Test shape: (644, 5)
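
The shapes above follow from the loop bound `len(data) - window_size - horizon + 1`. A small sketch of that arithmetic, where `count_windows` is a hypothetical helper introduced only for illustration:

```python
def count_windows(n_samples, window_size, horizon):
    """Number of (window, target) pairs a series of n_samples values yields."""
    return max(n_samples - window_size - horizon + 1, 0)


print(count_windows(3000, 5, 2))  # 2994
print(count_windows(650, 5, 2))   # 644
```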


<a name='3.2'></a>
## Question 2: Build a recurrent model with two GRU layers to predict using the windows from the previous question.


In [10]:
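# axis=None learns a single scalar mean/variance. The default axis=-1 would
# learn one mean per window position, which does not match the (past, 1)
# input below.
norm = tf.keras.layers.Normalization(axis=None)
norm.adapt(X_train)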
inputs = keras.layers.Input(shape=(past, 1))
inputs_norm = norm(inputs)
gru_out_1 = keras.layers.GRU(32, return_sequences=True)(inputs_norm)
gru_out_2 = keras.layers.GRU(32, return_sequences=False)(gru_out_1)
outputs = keras.layers.Dense(1)(gru_out_2)
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer=keras.optimizers.Adam(), loss="mse")
model.summary()

In [11]:
es_callback = keras.callbacks.EarlyStopping(
    monitor="val_loss", min_delta=0, patience=10)

history = model.fit(
    X_train, y_train,
    epochs=200,
    validation_split=0.2,
    shuffle=True,
    batch_size=64,
    callbacks=[es_callback],
)

Epoch 1/200
38/38 ━━━━━━━━━━━━━━━━━━━━ 6s 25ms/step - loss: 135.8041 - val_loss: 114.8484
Epoch 2/200
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 89.0441 - val_loss: 42.5841
Epoch 3/200
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 22.9232 - val_loss: 12.6779
Epoch 4/200
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 13.4982 - val_loss: 10.5201
Epoch 5/200
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 12.8653 - val_loss: 9.4887
Epoch 6/200
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 11.7825 - val_loss: 8.6870
Epoch 7/200
38/38 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - loss: 10.4394 - val_loss: 8.4362
Epoch 8/200
38/38 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - loss: 10.2706 - val_loss: 8.2151
Epoch 9/200
...

In [12]:
results = model.evaluate(X_test, y_test, verbose=1)
print('Test Loss: {}'.format(results))

21/21 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 6.7803
Test Loss: 6.775528430938721
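
A test MSE is easier to interpret next to a naive reference. The sketch below is an addition, not part of the original solution: it computes the MSE of a persistence forecast that simply repeats the last value of each window. `persistence_mse` is a hypothetical helper, and the seasonal series is synthetic, generated only so the block runs on its own.

```python
import numpy as np


def persistence_mse(X, y):
    """MSE of a naive forecast that repeats the last value of each window."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # For 3-D input (samples, timesteps, features) assume the target is feature 0.
    last_value = X[:, -1] if X.ndim == 2 else X[:, -1, 0]
    return float(np.mean((last_value - y) ** 2))


# Synthetic seasonal series, used only to make the sketch runnable.
rng = np.random.default_rng(0)
days = np.arange(400)
series = 12 + 5 * np.sin(2 * np.pi * days / 365) + rng.normal(0, 2, size=days.size)

window_size, horizon = 5, 2
n_windows = len(series) - window_size - horizon + 1
X_demo = np.stack([series[i:i + window_size] for i in range(n_windows)])
y_demo = series[window_size + horizon - 1:]

print(f"Persistence baseline MSE: {persistence_mse(X_demo, y_demo):.3f}")
```

In the notebook, `persistence_mse(X_test, y_test)` would give a value to compare with the test loss reported above.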


<a name='3.3'></a>
## Question 3: Add more features to the time series, for example `portion_year`. Build a model that improves on the previous one.


In [13]:
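# You may add more features.
# portion_year: fraction of the year elapsed, derived from the day of year.
df['portion_year'] = df['Date'].dt.dayofyear / 365.0
df_multi = df[['Temp', 'portion_year']].copy()

# Train/test split (same cut as before, using positional indexing for both).
train_data = df_multi.iloc[:3000].copy()
test_data = df_multi.iloc[3000:].copy()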
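
Since the cell above invites additional features, one common option is a cyclical encoding of the day of year, so that late December and early January end up close together, which a linear `portion_year` does not capture. This is a sketch under that assumption, not the original author's choice; `cyclical_day_features` and the `period` default are hypothetical.

```python
import numpy as np


def cyclical_day_features(day_of_year, period=365.25):
    """Map day of year to (sin, cos) so the end of the year wraps to the start."""
    angle = 2 * np.pi * np.asarray(day_of_year, dtype=float) / period
    return np.sin(angle), np.cos(angle)


sin_doy, cos_doy = cyclical_day_features([1, 91, 182, 365])
print(np.round(sin_doy, 3), np.round(cos_doy, 3))
```

In the notebook the input would be `df['Date'].dt.dayofyear`, and the two arrays could be added as extra columns of `df_multi`; the model's input shape would then need to match the new number of features.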

In [14]:
def create_windows_multivariate_np(data, window_size, horizon, target_col_idx, shuffle=False):
    """
    Creates a dataset from the given time series data using NumPy.

    Parameters:
    data (np.ndarray or pd.DataFrame): Time series data with multiple features.
    window_size (int): The number of past time steps to use as input features.
    horizon (int): How many steps after the end of the window the target lies.
    target_col_idx (int): The index of the target column in the input data.
    shuffle (bool): Whether to shuffle the data or not.

    Returns:
    tuple: A tuple containing the input-output pairs (X, y) as NumPy arrays.
    """
    if isinstance(data, pd.DataFrame):
        data = data.values

    X, y = [], []
    for i in range(len(data) - window_size - horizon + 1):
        X.append(data[i:i+window_size, :])
        y.append(data[i+window_size+horizon-1, target_col_idx])

    X, y = np.array(X), np.array(y)

    if shuffle:
        indices = np.arange(X.shape[0])
        np.random.shuffle(indices)
        X, y = X[indices], y[indices]

    return X, y


def create_windows_multivariate_tf(data, window_size, horizon, target_col_idx, shuffle=False):
    """
    Creates a dataset from the given time series data using tf.data.Dataset.

    Parameters:
    data (np.ndarray or pd.DataFrame): Time series data with multiple features.
    window_size (int): The number of past time steps to use as input features.
    horizon (int): How many steps after the end of the window the target lies.
    target_col_idx (int): The index of the target column in the input data.
    shuffle (bool): Whether to shuffle the data or not.

    Returns:
    tf.data.Dataset: The resulting dataset.
    """
    if isinstance(data, pd.DataFrame):
        data = data.values

    ts_data = tf.data.Dataset.from_tensor_slices(data)
    ts_data = ts_data.window(window_size + horizon, shift=1, drop_remainder=True)
    ts_data = ts_data.flat_map(lambda window: window.batch(window_size + horizon))
    ts_data = ts_data.map(lambda window: (
        window[:window_size], window[-1, target_col_idx]))
    if shuffle:
        ts_data = ts_data.shuffle(buffer_size=data.shape[0])
    return ts_data

In [15]:
import time

# Time the NumPy implementation (windows are materialized eagerly).
inicio = time.time()
X_trial, y_trial = create_windows_multivariate_np(train_data,
                                                  window_size=past,
                                                  horizon=future,
                                                  target_col_idx=0,
                                                  shuffle=False)
fin = time.time()
print(f"Time taken by the NumPy function: {fin-inicio}")

# Time the tf.data implementation. The pipeline is lazy, so this only
# measures building it, not iterating over the windows.
inicio = time.time()
ts_dataset = create_windows_multivariate_tf(train_data,
                                            window_size=past,
                                            horizon=future,
                                            target_col_idx=0,
                                            shuffle=False)
fin = time.time()
print(f"Time taken by the TensorFlow function: {fin-inicio}")

Time taken by the NumPy function: 0.003501415252685547
Time taken by the TensorFlow function: 0.044977664947509766


In [16]:
X_train, y_train = create_windows_multivariate_np(
    train_data, window_size=past, horizon=future, target_col_idx=0, shuffle=True)
X_test, y_test = create_windows_multivariate_np(
    test_data, window_size=past, horizon=future, target_col_idx=0, shuffle=False)

In [17]:
print(f'Train shape: {X_train.shape}')
print(f'Test shape: {X_test.shape}')

Train shape: (2994, 5, 2)
Test shape: (644, 5, 2)


In [18]:
# axis=-1: one mean/variance per feature (Temp, portion_year).
norm = tf.keras.layers.Normalization(axis=-1)
norm.adapt(X_train)
inputs = keras.layers.Input(shape=(past, 2))
inputs_norm = norm(inputs)
bidirectional_1 = keras.layers.Bidirectional(keras.layers.GRU(64, return_sequences=True, recurrent_dropout=0.1))(inputs_norm)
bidirectional_2 = keras.layers.Bidirectional(keras.layers.GRU(64, return_sequences=False, recurrent_dropout=0.1))(bidirectional_1)
dense = layers.Dense(64, activation='relu')(bidirectional_2)
outputs = keras.layers.Dense(1)(dense)
model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-4, clipnorm=1.0), loss="mse")
model.summary()

In [19]:
es_callback = keras.callbacks.EarlyStopping(
    monitor="val_loss", min_delta=0, patience=10)

history = model.fit(
    X_train, y_train,
    epochs=200,
    validation_split=0.2,
    shuffle=True,
    batch_size=64,
    callbacks=[es_callback],
)

Epoch 1/200
38/38 ━━━━━━━━━━━━━━━━━━━━ 15s 93ms/step - loss: 142.2539 - val_loss: 134.6451
Epoch 2/200
38/38 ━━━━━━━━━━━━━━━━━━━━ 3s 73ms/step - loss: 132.0728 - val_loss: 121.8789
Epoch 3/200
38/38 ━━━━━━━━━━━━━━━━━━━━ 5s 59ms/step - loss: 119.7424 - val_loss: 102.7241
Epoch 4/200
38/38 ━━━━━━━━━━━━━━━━━━━━ 2s 58ms/step - loss: 97.2632 - val_loss: 76.1618
Epoch 5/200
38/38 ━━━━━━━━━━━━━━━━━━━━ 2s 58ms/step - loss: 71.7972 - val_loss: 48.7459
Epoch 6/200
38/38 ━━━━━━━━━━━━━━━━━━━━ 3s 77ms/step - loss: 45.8017 - val_loss: 28.2831
Epoch 7/200
38/38 ━━━━━━━━━━━━━━━━━━━━ 3s 65ms/step - loss: 25.9034 - val_loss: 14.7557
Epoch 8/200
38/38 ━━━━━━━━━━━━━━━━━━━━ 2s 59ms/step - loss: 14.3284 - val_loss: 10.5611
Epoch 9/200
...

In [20]:
results = model.evaluate(X_test, y_test, verbose=1)
print('Test Loss: {}'.format(results))

21/21 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - loss: 5.7847
Test Loss: 6.145246982574463


<a name='3.4'></a>
## Question 4: In which of these applications would a 'many-to-one' architecture be used?

**a)** Sentiment classification of text.

**b)** Voice verification to unlock a computer.

**c)** Music generation.

**d)** A classifier that assigns pieces of music to their composer.


a), b), d)
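
The distinction can be illustrated with a minimal recurrent cell in NumPy: a many-to-one model keeps only the last hidden state (as `return_sequences=False` does in the GRU layers above), whereas a many-to-many model keeps one state per time step. This is a sketch with random, untrained weights; `simple_rnn` is a hypothetical helper, not a Keras API.

```python
import numpy as np


def simple_rnn(x, W_x, W_h, b, return_sequences=False):
    """Minimal tanh RNN over one sequence x of shape (timesteps, features)."""
    h = np.zeros(W_h.shape[0])
    states = []
    for x_t in x:
        h = np.tanh(x_t @ W_x + h @ W_h + b)
        states.append(h)
    # many-to-many keeps every hidden state; many-to-one keeps only the last
    return np.stack(states) if return_sequences else h


rng = np.random.default_rng(0)
timesteps, features, units = 5, 1, 4
x = rng.normal(size=(timesteps, features))
W_x = rng.normal(size=(features, units))
W_h = rng.normal(size=(units, units))
b = np.zeros(units)

print(simple_rnn(x, W_x, W_h, b, return_sequences=True).shape)   # (5, 4)
print(simple_rnn(x, W_x, W_h, b, return_sequences=False).shape)  # (4,)
```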

<a name='3.5'></a>
## Question 5: What advantages do word embeddings provide?

**a)** They reduce the input dimension compared with one-hot encoding.

**b)** They reveal similarity between words more intuitively than one-hot encoding.

**c)** They are a way of doing transfer learning in NLP.

**d)** They make it possible to visualize relationships between words with dimensionality-reduction methods such as PCA.


a), b), c), d)
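
Options a) and b) can be illustrated with a toy comparison. In the sketch below the vocabulary and the 2-d embedding values are made up purely for illustration and do not come from a trained model; `cosine` is a hypothetical helper.

```python
import numpy as np

vocab = ["king", "queen", "apple"]

# One-hot: one dimension per vocabulary word, so every pair is orthogonal.
one_hot = np.eye(len(vocab))

# Toy 2-d embeddings with made-up values, chosen only for illustration.
embeddings = np.array([
    [0.9, 0.8],   # king
    [0.8, 0.9],   # queen
    [-0.7, 0.1],  # apple
])


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


print("one-hot  king~queen:", cosine(one_hot[0], one_hot[1]))
print("embedded king~queen:", round(cosine(embeddings[0], embeddings[1]), 3))
print("embedded king~apple:", round(cosine(embeddings[0], embeddings[2]), 3))
```

With one-hot vectors every pair of distinct words has similarity 0, so no notion of relatedness is available; with dense vectors the similarity depends on the learned (here, invented) coordinates.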