<p style="text-align:center;">
<img src="https://www.icn.ch/sites/default/files/styles/icn_free_ratio/public/2023-06/WHO.jpg?h=960eb3b3&itok=autONhTv"
     alt="DigitalFuturesLogo"
     style="float: center; margin-right: 10px;" />
</p>


# WHO Life Expectancy Project - The Function
## Root Team Squared Error

----
### Project Overview
The aim is to build two linear regression models of life expectancy from data provided by the WHO: one that uses as much data as needed for the best model performance, and one that excludes the data we judge to be sensitive. The data covers 183 countries and spans the years 2000 to 2015.

### Team Members
* Fátima González González
* Lydia Drabkin-Reiter
* Ollie Hanlon
* Rowan Jarvis
* Jake Haycocks

----


# FUNCTION

In [None]:
import numpy as np

In [None]:
def model_selector():
    """
    Selects a model based on whether the user consents to the use of advanced population data, which may include protected information, for better accuracy.

    User Inputs:
    Consent y/n

    Returns:
    best_performing_model or limited_model (dict): the chosen model
    """
    model_choice = input("Do you consent to using advanced population data, which may include protected information for better accuracy? (y/n) ").lower().strip()
    while True:
        if model_choice == 'y':
            return best_performing_model
        elif model_choice == 'n':
            return limited_model
        else:
            model_choice = input("Please, answer y/n. \nDo you consent to using advanced population data, which may include protected information for better accuracy? (y/n) ").lower().strip()
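Because the selector reads from `input()`, it is awkward to exercise without typing answers by hand. The sketch below shows one way to script the answers with `unittest.mock.patch`. The shortened prompt, the placeholder model dictionaries, and the `run_with_answers` helper are assumptions made for illustration; they are not part of the notebook.

```python
from unittest.mock import patch

# Hypothetical stand-ins for the two model dictionaries defined in the DATA cell.
best_performing_model = {"name": "best"}
limited_model = {"name": "limited"}

# Shortened prompt, used only in this sketch.
PROMPT = "Do you consent to using advanced population data? (y/n) "


def model_selector():
    """Same control flow as the notebook's selector, with a shortened prompt."""
    model_choice = input(PROMPT).lower().strip()
    while True:
        if model_choice == "y":
            return best_performing_model
        elif model_choice == "n":
            return limited_model
        else:
            model_choice = input("Please, answer y/n. \n" + PROMPT).lower().strip()


def run_with_answers(answers):
    """Call model_selector() while feeding it a scripted list of answers."""
    with patch("builtins.input", side_effect=answers):
        return model_selector()


# An invalid answer is rejected first, then " Y " is normalised to "y".
chosen = run_with_answers(["maybe", " Y "])
print(chosen is best_performing_model)
```

Patching `builtins.input` leaves the function itself untouched, which is why it may be preferable to adding an extra parameter just for testing.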

In [None]:
def get_data_from_user(model):
    """
    Prompts the user for each input the model needs and scales the values

    Args:
    model (dict) : model whose 'columns' list determines which inputs to ask for

    Returns:
    scaled_values (list): scaled values to feed into the model
    """
    # collect [column, value] pairs from user input
    values = []
    for col in model['columns']:
        values.append([col, float(input(f'Provide a value for {col}: ').strip())])
        # GDP is log-transformed before scaling
        if col == 'GDP':
            values[-1][1] = np.log(values[-1][1])
    # scaling the data (MinMaxScaler transformation) using the global scaler dictionary
    scaled_values = scale(values, scaler)
    return scaled_values
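As written, `get_data_from_user` raises a `ValueError` on a non-numeric reply, and a GDP of zero or below cannot be log-transformed meaningfully. A minimal sketch of a more forgiving prompt follows. The name `prompt_float` and its `positive` and `read` parameters are hypothetical and only illustrate the idea.

```python
import math


def prompt_float(label, positive=False, read=input):
    """Ask for `label` until the reply is a finite float (and > 0 if required).

    `read` defaults to the built-in input(); passing another callable makes
    the helper easy to drive from a script or a test.
    """
    while True:
        raw = read(f"Provide a value for {label}: ").strip()
        try:
            value = float(raw)
        except ValueError:
            print(f"'{raw}' is not a number, please try again.")
            continue
        if not math.isfinite(value):
            print("Please provide a finite number.")
            continue
        if positive and value <= 0:
            print(f"{label} must be greater than 0.")
            continue
        return value


# Scripted replies: a non-number, a non-positive number, then a valid value.
replies = iter(["abc", "-5", "35863"])
gdp = prompt_float("GDP", positive=True, read=lambda _prompt: next(replies))
print(gdp)
```

Requiring a positive value only for GDP keeps the other fields unrestricted, which matches how the notebook treats them.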

In [None]:
def scale(values, scaler):
    """
    Scales list of values using MinMaxScaler transformation

    Args:
    values (list) : [column, input value] pairs
    scaler (dict) : dictionary with columns as keys and dictionaries of mins and maxes as values

    Returns:
    scaled_list (list): scaled list of input values
    """
    scaled_list = []
    for col, value in values:
        X_std = (value - scaler[col]['min']) / (scaler[col]['max'] - scaler[col]['min'])
        # The general formula is X_std * (max - min) + min; with the default range (min 0, max 1) it reduces to X_std
        scaled_list.append(X_std)
    return scaled_list
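To make the min-max step concrete, the sketch below applies the same formula to the testing values used later in this notebook, with bounds copied from the `scaler` dictionary. The helper name `min_max` is an assumption for illustration only.

```python
import math

# Bounds copied from the notebook's `scaler` dictionary (GDP bounds are on the log scale).
scaler = {
    "GDP": {"max": 11.629979, "min": 4.997212},
    "Adult Mortality": {"max": 703.677, "min": 49.384},
}


def min_max(value, lo, hi):
    """Map `value` to the 0-1 range defined by the bounds `lo` and `hi`."""
    return (value - lo) / (hi - lo)


am = scaler["Adult Mortality"]
gdp = scaler["GDP"]

# Adult Mortality is scaled directly; GDP is log-transformed first.
scaled_mortality = min_max(176.668, am["min"], am["max"])
scaled_gdp = min_max(math.log(35863), gdp["min"], gdp["max"])
print(round(scaled_mortality, 4), round(scaled_gdp, 4))

# A value below the stored minimum scales to a negative number.
print(min_max(0, am["min"], am["max"]) < 0)
```

Inputs outside the stored minimum and maximum land outside the 0 to 1 range, which is how extreme inputs can push a prediction far from plausible values.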

In [None]:
def life_expectancy_predictor(model, scaled_values):
    """
    Performs Life Expectancy prediction using a specified model with the data provided by the user.

    Args:
    model (dict) : model to use for the prediction
    scaled_values (list) : scaled values to use for the prediction

    Returns:
    life_expectancy_prediction (float) : predicted life expectancy

    """
    # initialising life_expectancy_prediction with the constant (intercept) value
    life_expectancy_prediction = model['params'][0]
    # implementing the model: add coefficient * scaled value for each feature
    for p, x in zip(model['params'][1:], scaled_values):
        life_expectancy_prediction += p*x
    # print statement with the final life expectancy prediction
    if 40.639251 < life_expectancy_prediction < 97.072899:  # Prediction within 3 standard deviations of the mean
        print('The estimated Life Expectancy is ', round(life_expectancy_prediction, 2))
    else:
        # Life Expectancy outside the expected range
        print('\nWarning: The estimated Life Expectancy is far out the expected range.\n')
        # Negative Life Expectancy returns 0
        if life_expectancy_prediction < 0:
            life_expectancy_prediction = 0
        print('The estimated Life Expectancy is ', round(life_expectancy_prediction, 2))
    # returns life_expectancy_prediction value (float) in case it is wanted for further use
    return life_expectancy_prediction

In [None]:
def main():
    """Runs the full flow: model selection, data entry, and prediction."""
    model = model_selector()
    scaled_values = get_data_from_user(model)
    life_expectancy_predictor(model, scaled_values)

In [None]:
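# DATA
# Each model lists its input columns and its params: the intercept first, then one coefficient per column, in order
best_performing_model = {'columns': ['Adult Mortality', 'Infant Deaths', 'GDP'], 'params': [76.6453, -30.258006, -17.446125, 4.720947]}
limited_model = {'columns': ['Adult Mortality', 'GDP'], 'params': [71.265829, -40.091971, 12.347190]}
# Min and max per column for the MinMax scaling; GDP is log-transformed before scaling, so its bounds are log values
scaler = {'GDP': {'max': 11.629979, 'min': 4.997212}, 'Adult Mortality': {'max': 703.677, 'min': 49.384}, 'Year': {'max': 2015, 'min': 2000}, 'Infant Deaths': {'max': 135.6, 'min': 1.8}}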
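For batch use or testing, the same calculation can be run without prompts. The following is a minimal sketch that combines the log transform, the scaling, and the linear model in one function. The `predict` helper and its dictionary-based input are assumptions; the coefficients and bounds are copied from the data above.

```python
import math

best_performing_model = {
    "columns": ["Adult Mortality", "Infant Deaths", "GDP"],
    "params": [76.6453, -30.258006, -17.446125, 4.720947],
}
limited_model = {
    "columns": ["Adult Mortality", "GDP"],
    "params": [71.265829, -40.091971, 12.347190],
}
scaler = {
    "GDP": {"max": 11.629979, "min": 4.997212},
    "Adult Mortality": {"max": 703.677, "min": 49.384},
    "Infant Deaths": {"max": 135.6, "min": 1.8},
}


def predict(model, raw_values):
    """Predict life expectancy from a dict of unscaled values keyed by column name."""
    prediction = model["params"][0]  # intercept
    for coef, col in zip(model["params"][1:], model["columns"]):
        value = raw_values[col]
        if col == "GDP":
            value = math.log(value)  # GDP enters the model on the log scale
        bounds = scaler[col]
        scaled = (value - bounds["min"]) / (bounds["max"] - bounds["min"])
        prediction += coef * scaled
    return prediction


sample = {"Adult Mortality": 176.668, "Infant Deaths": 14.3, "GDP": 35863}
print(round(predict(best_performing_model, sample), 2))
print(round(predict(limited_model, sample), 2))
```

With the testing data from the Examples section, this reproduces the interactive results of 73.04 and 73.69 years. Columns a model does not use, such as Infant Deaths for the limited model, are simply ignored.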

### Field descriptions

|Field|Description|
|---:|:---|
|Life expectancy|Life expectancy in years|
|Adult Mortality|Adult Mortality Rates of both sexes (probability of dying between 15 and 60 years per 1000 population)|
|Infant Deaths|Number of Infant Deaths per 1000 population|
|GDP|Gross Domestic Product per capita (in USD)|

### Execution of main function

In [None]:
main()

Do you consent to using advanced population data, which may include protected information for better accuracy? (y/n) y
Provide a value for Adult Mortality: 73
Provide a value for Infant Deaths: -20
Provide a value for GDP: 55
The estimated Life Expectancy is  77.69


## Examples

Testing data

|Field|Value|
|---:|:---|
|Adult Mortality|176.6680|
|Infant Deaths|14.3|
|GDP|35863|
|**Life expectancy**|**72.0**|



### Model 1: general model (answering 'y')

In [None]:
main()

Do you consent to using advanced population data, which may include protected information for better accuracy? (y/n) y
Provide a value for Adult Mortality: 176.6680
Provide a value for Infant Deaths: 14.3
Provide a value for GDP: 35863 
The estimated Life Expectancy is  73.04


In [None]:
print('Off by', round(73.04 - 72, 2), 'years' )

Off by 1.04 years


RMSE: **1.471**

### Model 2: limited model (answering 'n')

In [None]:
main()

Do you consent to using advanced population data, which may include protected information for better accuracy? (y/n) n
Provide a value for Adult Mortality: 176.6680
Provide a value for GDP: 35863
The estimated Life Expectancy is  73.69


In [None]:
print('Off by',round(73.69 - 72, 2), 'years')

Off by 1.69 years


RMSE: **2.418**
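The notebook reports an RMSE for each model but does not show the calculation. Assuming the usual definition (the square root of the mean squared difference between actual and predicted values), it can be sketched as follows. The sample numbers are made up for illustration and are not the project's test set.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences of numbers."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values, for illustration only.
actual = [72.0, 65.0, 80.0]
predicted = [73.0, 63.0, 82.0]
print(rmse(actual, predicted))
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones, so a single-row error such as the 1.04 or 1.69 years above need not match a model's overall RMSE.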

### Out of range values

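The predictor prints a warning when the estimate falls outside the expected range of 40.64 to 97.07 years (three standard deviations from the mean). Negative estimates are also replaced with 0.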

In [None]:
main()

Do you consent to using advanced population data, which may include protected information for better accuracy? (y/n) n
Provide a value for Adult Mortality: 10000000
Provide a value for GDP: 10000000000

Warning: The estimated Life Expectancy is far out the expected range.

The estimated Life Expectancy is  0


In [None]:
main()

Do you consent to using advanced population data, which may include protected information for better accuracy? (y/n) n
Provide a value for Adult Mortality: 0
Provide a value for GDP: 1000000000

Warning: The estimated Life Expectancy is far out the expected range.

The estimated Life Expectancy is  103.57
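The range check can also be separated from the printing, so that callers can decide how to react to an out-of-range estimate. This is a sketch only: `check_prediction` is a hypothetical helper, and the bounds are the ones hard-coded in `life_expectancy_predictor`.

```python
# Bounds copied from life_expectancy_predictor (three standard deviations from the mean).
LOWER_BOUND = 40.639251
UPPER_BOUND = 97.072899


def check_prediction(prediction):
    """Return (value, in_range): negative values are replaced with 0 and
    in_range says whether the original prediction was inside the bounds."""
    in_range = LOWER_BOUND < prediction < UPPER_BOUND
    value = max(prediction, 0)
    return value, in_range


for estimate in (73.04, 103.57, -2500.0):
    value, in_range = check_prediction(estimate)
    status = "ok" if in_range else "out of range"
    print(round(value, 2), status)
```

Returning a flag instead of printing keeps the behaviour of the two runs above (a warning for 103.57 and a value of 0 for a negative estimate) while making the check easy to reuse.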
