<a href="https://colab.research.google.com/github/theresaskruzna/riiid_knowledge_tracing/blob/main/03_Model_Building.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import numpy as np
import pandas as pd

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Input, Dropout, Dense, BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
from tensorflow.keras import regularizers

Building an RNN - LSTM model, binary classification

The model is built with the Keras Sequential API in TensorFlow. Its LSTM layers are designed for sequential data, which makes them suitable for time series forecasting, natural language processing, and other tasks with sequential patterns.

Sequential Model: A Sequential model is a linear stack of layers and is not limited to image classification. To work with sequential data, build the stack from recurrent layers such as LSTM (Long Short-Term Memory) or GRU (Gated Recurrent Unit) instead of image-specific layers.
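Recurrent layers such as LSTM expect a 3D input of shape (samples, timesteps, features). The sketch below illustrates that layout in plain Python, without TensorFlow. The helper names `pad_sequences_3d` and `shape_of` and the toy data are hypothetical and only serve to show how variable-length sequences can be brought to one fixed length.

```python
def pad_sequences_3d(sequences, timesteps, pad_value=0.0):
    """Left-pad or truncate each sequence of feature vectors to `timesteps` steps."""
    n_features = len(sequences[0][0])
    batch = []
    for seq in sequences:
        seq = seq[-timesteps:]  # keep only the most recent steps
        padding = [[pad_value] * n_features for _ in range(timesteps - len(seq))]
        batch.append(padding + [list(step) for step in seq])
    return batch


def shape_of(batch):
    """Return (samples, timesteps, features) for a nested list."""
    return (len(batch), len(batch[0]), len(batch[0][0]))


# two toy sequences with 2 features per step (made-up values)
sequences = [
    [[1.0, 0.2], [0.0, 0.5], [1.0, 0.9]],  # 3 steps
    [[0.0, 0.1]],                          # 1 step
]
batch = pad_sequences_3d(sequences, timesteps=4)
print(shape_of(batch))      # (samples, timesteps, features)
print(shape_of(batch)[1:])  # the per-sample shape a recurrent layer is given
```

The per-sample shape (timesteps, features) is what an expression like `X_train.shape[1:]` yields for a 3D array.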

In [None]:
def sequential_lstm_model(input_shape, hidden_units=128): # hidden_units sets the number of units in each LSTM layer (default 128)
    model = Sequential() # creates a sequential model = linear stack of layers
    # input layer; input_shape is (timesteps, features), e.g. X_train.shape[1:]
    model.add(Input(shape=input_shape))
    # First LSTM (long short-term memory) layer; return_sequences=True passes the full sequence to the next LSTM layer
    model.add(LSTM(hidden_units, return_sequences=True))
    model.add(Dropout(0.2)) # prevent overfitting - randomly zero 20% of the layer's outputs during training
    # use recurrent_dropout=0.2?

    # Second LSTM layer without return sequences for final output
    model.add(LSTM(hidden_units, return_sequences=False))
    model.add(Dropout(0.2))

    model.add(BatchNormalization()) # stabilize training and improve convergence

    # Output layer with one neuron (for binary classification), sigmoid produces a probability between 0 and 1
    model.add(Dense(1, activation='sigmoid', kernel_regularizer=tf.keras.regularizers.l2(0.01))) # L2 regularizer to prevent overfitting

    model.compile( # configure learning process of the model
        optimizer='adam', # specify optimisation algorithm for training
        loss='binary_crossentropy', # measure difference between predictions and actual values
        metrics=['accuracy', tf.keras.metrics.AUC(name='auc')] # metrics to track during training and evaluation; AUC (area under the ROC curve) is named explicitly so callbacks can monitor 'val_auc'
    )
    model.summary()

    return model

Tweak the number of hidden units and layers to optimise the model architecture.
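To make the output layer and the loss in the model above more concrete, here is a minimal plain-Python sketch of the sigmoid function and of binary cross-entropy averaged over a batch. The logits and labels are made-up values, and the clipping constant `eps` is an assumption chosen only to avoid `log(0)`.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


logits = [2.0, -1.0, 0.0]  # raw scores before the sigmoid (made-up values)
labels = [1, 0, 1]         # true classes
probs = [sigmoid(z) for z in logits]
loss = binary_crossentropy(labels, probs)
print(probs)
print(loss)
```

Confident, correct predictions contribute little to the loss, while the uncertain prediction of 0.5 for the last sample contributes the most.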

Early stopping

In [None]:
early_stopping = EarlyStopping(
    # monitor accuracy on the validation dataset
    monitor="val_accuracy",
    # an epoch counts as an improvement only if accuracy rises by at least 0.003 (0.3 percentage points)
    min_delta=0.003,
    # wait up to 10 epochs without such an improvement
    patience=10,
    # then stop training and restore the weights from the epoch with the highest validation accuracy
    restore_best_weights=True,
)
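The interplay of `min_delta` and `patience` can be traced by hand with a simplified simulation. This is a sketch of the general idea under stated assumptions, not the Keras implementation: the function `simulate_early_stopping` and the validation scores are hypothetical, and a higher score is assumed to be better.

```python
def simulate_early_stopping(val_scores, min_delta=0.003, patience=10):
    """Return (stop_epoch, best_epoch) for a list of validation scores.

    An epoch counts as an improvement only if it beats the best score
    seen so far by more than min_delta. Training stops once `patience`
    consecutive epochs have passed without an improvement.
    """
    best_score = float("-inf")
    best_epoch = -1
    wait = 0
    for epoch, score in enumerate(val_scores):
        if score - min_delta > best_score:
            best_score, best_epoch, wait = score, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_scores) - 1, best_epoch


# made-up validation accuracies; epochs 2 to 4 gain less than min_delta over epoch 1
scores = [0.60, 0.65, 0.651, 0.652, 0.650, 0.66]
stop_epoch, best_epoch = simulate_early_stopping(scores, min_delta=0.003, patience=3)
print(stop_epoch, best_epoch)
```

With a short patience, training stops before the later jump to 0.66 is ever seen, which is one reason to pick `patience` generously when validation scores are noisy.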

Model checkpoint

In [None]:
save = ModelCheckpoint(
    # where to save model
    filepath="best_model.keras",
    # monitor accuracy of the validation set
    monitor="val_accuracy",
    # save only one file with the highest metrics value
    save_best_only=True,
    # save architecture and weights into one file
    save_weights_only=False,
    # after every epoch
    save_freq="epoch"
)

In [None]:
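# build the model from the shape of one training sample: (timesteps, features)
model = sequential_lstm_model(input_shape=X_train.shape[1:])
model.fit(X_train, y_train, epochs=50, validation_data=(X_val, y_val), callbacks=[early_stopping])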

In [None]:
# validate on the validation split so the test set stays untouched for the final evaluation
history = model.fit(X_train, y_train, epochs=1000, validation_data=(X_val, y_val), batch_size=32, callbacks=[early_stopping, save])

In [None]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Input

# Example data
data = np.sin(np.linspace(0, 100, 500))  # simulated time series
window_size = 60
prediction_steps = 14

# Build the inputs and outputs
X, y = [], []
for i in range(len(data) - window_size - prediction_steps):
    X.append(data[i:i+window_size])
    y.append(data[i+window_size:i+window_size+prediction_steps])

X, y = np.array(X), np.array(y)
# Reshape the data (add a dimension for the individual channels)
X = X.reshape((X.shape[0], X.shape[1], 1))

model = Sequential([
    Input(shape=(X.shape[1], 1)),
    LSTM(64, activation='relu'),
    Dense(prediction_steps)  # Output layer with 14 values
])
model.compile(optimizer='adam', loss='mse')

# Training
model.fit(X, y, epochs=20, batch_size=32)

This cell demonstrates a basic time series forecasting setup with an LSTM layer: from a window of 60 past values, it predicts the next 14 values of the sequence, which is a common task in time series analysis.
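The windowing step can be checked on a tiny series where the result is easy to verify by eye. The sketch below is plain Python; the helper `make_windows` is hypothetical. Unlike the loop in the cell above, it also includes the last complete window, so it yields one sample more for the same series.

```python
def make_windows(series, window_size, prediction_steps):
    """Split a series into (input window, following target values) pairs."""
    X, y = [], []
    last_start = len(series) - window_size - prediction_steps
    for i in range(last_start + 1):  # +1 keeps the final complete window
        X.append(series[i:i + window_size])
        y.append(series[i + window_size:i + window_size + prediction_steps])
    return X, y


series = list(range(10))  # 0, 1, ..., 9
X, y = make_windows(series, window_size=4, prediction_steps=2)

print(len(X))        # number of samples
print(X[0], y[0])    # first window and its targets
print(X[-1], y[-1])  # last window and its targets
```

Each target slice starts exactly where its input window ends, so no future value leaks into the inputs.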

Training the model - Training function with callbacks

In [None]:
def train_model(model, X_train, y_train, X_val, y_val, batch_size=64, epochs=10):
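    # Define callbacks
    # 'val_auc' is only logged if the model was compiled with an AUC metric named 'auc'
    # Save the model whenever validation AUC reaches a new maximum
    checkpoint = ModelCheckpoint(
        'best_sequential_model.h5',
        monitor='val_auc',
        verbose=1,
        save_best_only=True,
        mode='max'
    )
    # Stop after 3 epochs without a better validation AUC and restore the best weights
    early_stopping = EarlyStopping(
        monitor='val_auc',
        patience=3,
        mode='max',
        restore_best_weights=True
    )
    # Halve the learning rate after 2 epochs without a lower validation loss, down to min_lr
    reduce_lr = ReduceLROnPlateau(
        monitor='val_loss',
        factor=0.5,
        patience=2,
        min_lr=0.0001
    )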

    # Train the model
    history = model.fit(
        X_train, y_train,
        validation_data=(X_val, y_val),
        batch_size=batch_size,
        epochs=epochs,
        callbacks=[checkpoint, early_stopping, reduce_lr],
        verbose=1
    )

    return model, history
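The effect of `factor`, `patience`, and `min_lr` in the learning-rate callback can be followed with a simplified simulation. This is a sketch under stated assumptions rather than the Keras implementation: the function `simulate_reduce_lr`, the starting rate of 0.001, and the validation losses are hypothetical, and details such as cooldown and the improvement threshold are left out.

```python
def simulate_reduce_lr(val_losses, initial_lr=0.001, factor=0.5, patience=2, min_lr=0.0001):
    """Return the learning rate in effect after each epoch."""
    lr = initial_lr
    best = float("inf")
    wait = 0
    schedule = []
    for loss in val_losses:
        if loss < best:
            best, wait = loss, 0  # new best loss: reset the counter
        else:
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)  # shrink, but never below min_lr
                wait = 0
        schedule.append(lr)
    return schedule


# made-up validation losses with two plateaus
losses = [0.70, 0.65, 0.66, 0.67, 0.66, 0.68, 0.64]
schedule = simulate_reduce_lr(losses)
print(schedule)
```

Because `min_lr` acts as a floor, a long plateau cannot shrink the learning rate indefinitely.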