In [9]:
import sys
sys.path.append('../src')

from preprocessing import load_data
from features import splitting_data, feature_engineering, scaling, add_constant_column
from models import train_model, full_pipeline, calculate_rmse

## Model testing

In this notebook, the model is evaluated using the test dataset obtained from the train-test split. This step helps to assess how well the model performs on unseen data and whether it can generalise beyond the training set.

The evaluation uses RMSE, MAE and R-squared to measure prediction accuracy and model fit. RMSE and MAE summarise the size of the prediction errors (RMSE penalises large errors more heavily than MAE), whereas R-squared indicates the proportion of variance in the target that the model explains.
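To make the three metrics concrete, here is a minimal sketch of how each could be computed using only the standard library. The function names `rmse`, `mae` and `r_squared` and the toy values are illustrative assumptions; they are not the project's `calculate_rmse`, whose implementation is not shown in this notebook.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    absolute_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute_errors) / len(absolute_errors)


def r_squared(y_true, y_pred):
    """Proportion of variance in y_true explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy life-expectancy values (made up for illustration only)
y_true = [70.0, 72.0, 74.0, 76.0]
y_pred = [71.0, 72.0, 73.0, 78.0]

print(f"RMSE: {rmse(y_true, y_pred):.3f}")
print(f"MAE: {mae(y_true, y_pred):.3f}")
print(f"R-squared: {r_squared(y_true, y_pred):.3f}")
```

Because RMSE squares each residual before averaging, the single 2-year miss in the toy data pulls RMSE above MAE.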

## Train a model

In [10]:
minimal_cols = ['Region',
                'Under_five_deaths',
                'Adult_mortality',
                'GDP_per_capita',
                'Schooling',
                'Economy_status_Developed',
                'Life_expectancy'
                ]

elaborate_cols = minimal_cols + [
                  'Alcohol_consumption',
                  'Hepatitis_B',
                  'Measles',
                  'BMI',
                  'Polio',
                  'Diphtheria',
                  'Incidents_HIV',
                  'Thinness_ten_nineteen_years',
                  'Thinness_five_nine_years',
                 ]

## Model pipeline overview
The full_pipeline function below allows us to automate data preparation and model training.

This function performs the train-test split, feature engineering, scaling and model fitting. It then returns the processed data and trained model.

The pipeline is run twice: once using the minimal feature set (minimal_cols) and once using the elaborate feature set (elaborate_cols). This allows the simpler and the more complex model to be compared to see which performs better.
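As a rough illustration of the first and third steps (splitting and scaling), the sketch below uses only the standard library on toy data. The helpers `split_rows`, `fit_scaler` and `apply_scaler` are hypothetical and do not reproduce the internals of `full_pipeline`. The sketch assumes the scaler is fitted on the training split only and then reused on the test split, which is the usual way to avoid leaking test information into training.

```python
import random
import statistics


def split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle row indices reproducibly, then split rows into train and test."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_fraction)
    test_idx = set(indices[:n_test])
    train = [row for i, row in enumerate(rows) if i not in test_idx]
    test = [row for i, row in enumerate(rows) if i in test_idx]
    return train, test


def fit_scaler(values):
    """Learn the mean and standard deviation from the training values only."""
    return statistics.mean(values), statistics.pstdev(values)


def apply_scaler(values, mean, std):
    """Standardise values using statistics learned from the training split."""
    return [(v - mean) / std for v in values]


# Toy rows of (feature, target); made up for illustration only
rows = [(float(x), 2.0 * x + 1.0) for x in range(10)]

train, test = split_rows(rows)
mean, std = fit_scaler([x for x, _ in train])
x_train_scaled = apply_scaler([x for x, _ in train], mean, std)
x_test_scaled = apply_scaler([x for x, _ in test], mean, std)

print(len(train), len(test))
```

The test features are transformed with the training mean and standard deviation, so they will generally not have mean 0 and standard deviation 1 themselves.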

In [11]:
X_train_m_fe, X_test_m_fe, y_train_m, y_test_m, minimal_results, scaler_m, training_columns_m = full_pipeline(minimal_cols)
X_train_e_fe, X_test_e_fe, y_train_e, y_test_e, elaborate_results, scaler_e, training_columns_e = full_pipeline(elaborate_cols)

## Testing our minimal model

In this section, we tested how well the minimal model makes predictions on the training data. This was done using the .predict() method, which outputs the model's predicted values for the same inputs it was trained on.

The root mean squared error (RMSE) was then calculated between the predicted values and the actual values in the training set. The same was carried out for the test dataset, which the model had not seen during training. Comparing the two values helps to evaluate how well the model works on unseen data.

In [12]:
# Make predictions on the training data using the fitted minimal model ('minimal_results')
y_train_m_pred = minimal_results.predict(X_train_m_fe)

# Calculate RMSE from train data
minimal_rmse_train = calculate_rmse(y_train_m, y_train_m_pred)

print(f'Root Mean Squared Error for training data: {minimal_rmse_train}')

Root Mean Squared Error for training data: 1.420312266166898


In [13]:
# Do the same thing as above but with TEST data
y_test_m_pred = minimal_results.predict(X_test_m_fe)

minimal_rmse_test = calculate_rmse(y_test_m, y_test_m_pred)

print(f'Root Mean Squared Error for testing data: {minimal_rmse_test}')

Root Mean Squared Error for testing data: 1.411767033259221


The RMSE is 1.42 for the training data and 1.41 for the testing data. The two values are almost identical, which shows that the model performs consistently on both the data it was trained on and the unseen data. This suggests that the model is *not* overfitting and is able to perform well on new data.

The RMSE value itself shows that the model's predictions typically differ from the true value by only about 1.4 years. This indicates a good level of accuracy for this model.
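The train-test comparison above can be expressed as a single number. The sketch below computes the relative gap between the two RMSE values reported in the cell outputs; `relative_gap` is a hypothetical helper introduced here for illustration, and no particular threshold for "overfitting" is implied.

```python
def relative_gap(train_rmse, test_rmse):
    """Difference between test and train RMSE as a fraction of train RMSE."""
    return (test_rmse - train_rmse) / train_rmse


# RMSE values copied from the outputs of the minimal model cells
minimal_rmse_train = 1.420312266166898
minimal_rmse_test = 1.411767033259221

minimal_gap = relative_gap(minimal_rmse_train, minimal_rmse_test)
print(f"Relative train-test gap: {minimal_gap:.2%}")
```

A gap well under 1% in magnitude, with the test error marginally lower than the training error, is consistent with the conclusion that the minimal model is not overfitting.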

## Testing our elaborate model

In [14]:
# Make predictions on the training data using the fitted elaborate model ('elaborate_results')
y_train_e_pred = elaborate_results.predict(X_train_e_fe)

# Calculate RMSE from train data
elaborate_rmse_train = calculate_rmse(y_train_e, y_train_e_pred)

print(f'Root Mean Squared Error for training data: {elaborate_rmse_train}')

Root Mean Squared Error for training data: 1.3946036666072652


In [15]:
# Do the same thing as above but with TEST data
y_test_e_pred = elaborate_results.predict(X_test_e_fe)

elaborate_rmse_test = calculate_rmse(y_test_e, y_test_e_pred)

print(f'Root Mean Squared Error for testing data: {elaborate_rmse_test}')

Root Mean Squared Error for testing data: 1.3877250880253615


The RMSE is 1.395 for the training data and 1.388 for the testing data. Both values are slightly lower than those of the minimal model, showing that the elaborate model makes smaller prediction errors and so performs slightly better overall.

The training and testing RMSE values are again similar, which suggests that this model is not overfitting and generalises well to unseen data. In summary, the elaborate model provides a small improvement in accuracy while maintaining consistent performance across both datasets.
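To put a number on the "small improvement", the sketch below compares the two test RMSE values reported in the cell outputs. The dictionary and variable names are illustrative and not part of the project code.

```python
# Test RMSE values copied from the cell outputs above
test_rmse = {
    "minimal": 1.411767033259221,
    "elaborate": 1.3877250880253615,
}

# Absolute reduction in test error, in years of life expectancy
absolute_gain = test_rmse["minimal"] - test_rmse["elaborate"]

# The same reduction as a fraction of the minimal model's test error
relative_gain = absolute_gain / test_rmse["minimal"]

print(f"Absolute reduction in test RMSE: {absolute_gain:.3f} years")
print(f"Relative reduction in test RMSE: {relative_gain:.1%}")
```

The reduction is roughly 0.02 years, a little under 2% of the minimal model's test error, obtained by adding nine extra features.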