# Lee-Carter Neural Network Model
In this Jupyter notebook we try to reproduce the neural network of Richman & Wüthrich, as presented in Lecture 10.

In [None]:
from tensorflow.keras.layers import Input, Embedding, Flatten, Concatenate, Dense, BatchNormalization, Dropout
from tensorflow.keras.models import Model
import numpy as np
import pandas as pd
from tensorflow.keras.optimizers import Adam

from tensorflow.keras.callbacks import ReduceLROnPlateau, ModelCheckpoint
from sklearn.preprocessing import LabelEncoder




As I could not find the dataset presented in the lecture, I assembled it from the data available in the Human Mortality Database (HMD) and saved it in the file `mortality_rates.csv`.

In [None]:
TrainData = pd.read_csv('mortality_rates.csv')
YearData = TrainData['Year'].values.reshape(-1, 1)
AgeData = TrainData['Age'].values.reshape(-1, 1)
_CountryData = TrainData['Country'].values
_GenderData = TrainData["sex"].values
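# Encode country and gender strings as integers (one encoder per column,
# so that each keeps its own classes_ for decoding later)
country_encoder = LabelEncoder()
gender_encoder = LabelEncoder()
CountryData = country_encoder.fit_transform(_CountryData).reshape(-1, 1)
GenderData = gender_encoder.fit_transform(_GenderData).reshape(-1, 1)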

ageLim = AgeData.max()+1
CountryLim = CountryData.max()+1
GenderLim = GenderData.max()+1

# I had to adapt the limits because my dataset seems to contain one more country
# than the Wüthrich dataset; gender and age are the same.
print("Age limit: ", ageLim, "Country limit", CountryLim, "Gender limit", GenderLim)

TargetData = TrainData['log_mx'].values.reshape(-1, 1)

Age limit:  101 Country limit 42 Gender limit 2
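
For reference, the encoding step amounts to mapping each distinct label to its position in the sorted list of labels, which is why `max() + 1` gives the number of categories that an embedding layer needs as `input_dim`. The following is a minimal pure-Python sketch of that idea; the function `encode_labels` and the country codes are made up for illustration and are not taken from the dataset.

```python
def encode_labels(labels):
    """Map each label to its index in the sorted list of distinct labels."""
    classes = sorted(set(labels))
    index = {label: i for i, label in enumerate(classes)}
    return [index[label] for label in labels], classes


# Hypothetical country codes, for illustration only
countries = ["CHE", "AUS", "USA", "CHE", "AUS"]
codes, classes = encode_labels(countries)

# Codes run from 0 to len(classes) - 1, so max + 1 is the category count
n_categories = max(codes) + 1
print(codes, classes, n_categories)
```

Keeping `classes` around makes it possible to translate an integer code back to the original label later.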


In [None]:

# Define embedding dimensions
age_embedding_dim = 5
gender_embedding_dim = 5
country_embedding_dim = 5

# Define input layers
Year = Input(shape=(1,), dtype='float32', name='Year')
Age = Input(shape=(1,), dtype='int32', name='Age')
Country = Input(shape=(1,), dtype='int32', name='Country')
Gender = Input(shape=(1,), dtype='int32', name='Gender')

# Define embedding layers
# (input_length is omitted: recent Keras versions flag it as deprecated, and the Input shape already fixes it)
Age_embed = Flatten()(Embedding(input_dim=ageLim, output_dim=age_embedding_dim, name='Age_embed')(Age))
Gender_embed = Flatten()(Embedding(input_dim=GenderLim, output_dim=gender_embedding_dim, name='Gender_embed')(Gender))
Country_embed = Flatten()(Embedding(input_dim=CountryLim, output_dim=country_embedding_dim, name='Country_embed')(Country))

# Concatenate features
features = Concatenate()([Year, Age_embed, Gender_embed, Country_embed])

# Define middle layers
middle = features
dropout_rate = 0.05
for _ in range(4):
    middle = Dense(units=128, activation='tanh')(middle)
    middle = BatchNormalization()(middle)
    middle = Dropout(dropout_rate)(middle)

# Define main output
main_output = Concatenate()([features, middle])
main_output = Dense(units=128, activation='tanh')(main_output)
main_output = BatchNormalization()(main_output)
main_output = Dropout(dropout_rate)(main_output)
main_output = Dense(units=1, activation='sigmoid', name='main_output')(main_output)

# Create model
model = Model(inputs=[Year, Age, Country, Gender], outputs=[main_output])
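
As a sanity check on the architecture above, the layer widths and parameter counts can be worked out by hand. The sketch below assumes the usual Keras conventions (a dense layer has one weight per input-output pair plus one bias per unit, and batch normalization stores four values per unit) and uses the limits printed earlier. The helper names are my own.

```python
def dense_params(n_in, n_out):
    # Weight matrix plus one bias per output unit
    return n_in * n_out + n_out


def embedding_params(n_categories, dim):
    # One vector of length dim per category
    return n_categories * dim


def batchnorm_params(n_units):
    # gamma and beta (trainable), moving mean and variance (non-trainable)
    return 4 * n_units


age_lim, gender_lim, country_lim = 101, 2, 42  # limits printed earlier
emb_dim, hidden = 5, 128

feature_width = 1 + 3 * emb_dim      # Year plus three flattened embeddings
skip_width = feature_width + hidden  # width after concatenating features and middle

total = (
    embedding_params(age_lim, emb_dim)
    + embedding_params(gender_lim, emb_dim)
    + embedding_params(country_lim, emb_dim)
    + dense_params(feature_width, hidden)   # first middle layer
    + 3 * dense_params(hidden, hidden)      # remaining three middle layers
    + 4 * batchnorm_params(hidden)
    + dense_params(skip_width, hidden)      # dense layer after the skip connection
    + batchnorm_params(hidden)
    + dense_params(hidden, 1)               # output unit
)
print(feature_width, skip_width, total)
```

If I have counted correctly, `total` should agree with the total parameter count reported by `model.summary()`.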



In [43]:
print(YearData.shape, AgeData.shape, CountryData.shape, GenderData.shape, TargetData.shape)


(391274, 1) (391274, 1) (391274, 1) (391274, 1) (391274, 1)


In [37]:

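# Generate some random test data for a quick smoke test.
# Note: this reuses the names of the real arrays loaded above and overwrites them;
# re-run the loading cell before training on the real data.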
num_samples = 1000
YearData = np.random.randint(2000, 2021, size=(num_samples, 1)).astype('int32')
AgeData = np.random.randint(0, 100, size=(num_samples, 1)).astype('int32')
CountryData = np.random.randint(0, 41, size=(num_samples, 1)).astype('int32')
GenderData = np.random.randint(0, 2, size=(num_samples, 1)).astype('int32')
TargetData = np.random.randint(0, 2, size=(num_samples, 1)).astype('float32')


In [38]:
print(YearData.shape, AgeData.shape, CountryData.shape, GenderData.shape, TargetData.shape)

(1000, 1) (1000, 1) (1000, 1) (1000, 1) (1000, 1)


In [45]:

# Compile the model (regression target, so MSE loss and no accuracy metric)
model.compile(optimizer=Adam(learning_rate=0.0005), loss='mse')

lr_callback = ReduceLROnPlateau(factor=0.80, patience=5, verbose=1, cooldown=5, min_lr=0.00005)

mc_callback = ModelCheckpoint(
    filepath='Lee_Carter_NN_model.keras',
    monitor='val_loss',
    verbose=1,
    save_best_only=True,
    save_weights_only=False
)


# Fit the model
model.fit(x=[YearData, AgeData, CountryData, GenderData], 
          y=TargetData, 
          epochs=10, 
          batch_size=32, 
          verbose=0,
          shuffle=True,
          validation_split=0.05,
          callbacks=[mc_callback, lr_callback])


Epoch 1: val_loss improved from inf to 29.57295, saving model to Lee_Carter_NN_model.keras

Epoch 2: val_loss improved from 29.57295 to 29.57294, saving model to Lee_Carter_NN_model.keras

Epoch 3: val_loss did not improve from 29.57294

Epoch 4: val_loss did not improve from 29.57294

Epoch 5: val_loss did not improve from 29.57294

Epoch 6: val_loss did not improve from 29.57294

Epoch 6: ReduceLROnPlateau reducing learning rate to 0.0004000000189989805.

Epoch 7: val_loss did not improve from 29.57294

Epoch 8: val_loss did not improve from 29.57294

Epoch 9: val_loss did not improve from 29.57294

Epoch 10: val_loss did not improve from 29.57294
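
The validation loss above barely moves from about 29.57. One possible explanation, which I have not verified against the data, is that the sigmoid output is confined to (0, 1) while `log_mx` is negative for any mortality rate below 1. In that case the best the network can do is push its output towards 0, which leaves an MSE of roughly the mean of `log_mx` squared. The sketch below illustrates this with made-up values and shows min-max scaling of the target as one common remedy. The helper names and the numbers are illustrative assumptions, not part of the lecture code.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mse(preds, targets):
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


def min_max_scale(values):
    """Scale values linearly to [0, 1]; also return the bounds for inversion."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values], lo, hi


def min_max_unscale(scaled, lo, hi):
    """Invert min_max_scale."""
    return [s * (hi - lo) + lo for s in scaled]


# Hypothetical log mortality rates (illustrative values, not from the dataset)
log_mx = [-8.0, -6.5, -5.0, -3.0, -1.0]

# A sigmoid output is always positive, so on negative targets the error
# cannot fall below what a constant prediction of 0 would give.
floor = mse([0.0] * len(log_mx), log_mx)
stuck = mse([sigmoid(-5.0)] * len(log_mx), log_mx)

# After scaling, the targets lie in [0, 1], the range a sigmoid can reach.
scaled, lo, hi = min_max_scale(log_mx)
print(floor, stuck, scaled)
```

With a target scaled this way, predictions would need to be mapped back with `min_max_unscale` before comparing them with observed log mortality rates.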


<keras.src.callbacks.history.History at 0x377989a00>