
</p> <p style="text-align:center;">
<img 
    src="https://images-ext-1.discordapp.net/external/8rgDaCvLqrqclKam5YznFYmsGOTU899pxnFf406bX4I/https/i.postimg.cc/tgyfnHYs/WHOlytics.png?format=webp&quality=lossless"
     alt="WHOlytics"
    width="200"
     style="float: right; margin-left: 0px;"/>
</p>

<p style="text-align:center;">
<img 
    src="https://i.postimg.cc/9fQzwcLM/WHO-cropped.jpg"
     alt="WHO"
     style="float: center; margin-left: 0px;"/>

# WHOlytics Life Expectancy Prediction Model
(modelled on data for each country between 2000 - 2015)

Thanks for using our model to predict the life expectancy for your country. Please note all metrics should be calculated on an annual basis specific to the year you want to predict for. 

Please be informed that you will be required to provide the following information at <I> minimum </I> for the model to predict:

- The <B>region</B> your country is in (Africa, Asia etc.);
- The <B>year</B> you would like to predict your country's life expectancy for;
- Your country's <B>adult mortality rate</B> per 1000 population;
- Your country's <B>Polio (Pol3) immunization</B> coverage among 1-year-olds as a percentage;
- Your country's <B>average Body Mass Index (BMI)</B>; and
- Your country's <B>Gross Domestic Product (GDP) per capita (in USD)</B>

If you would like to use the advanced model, you will be required to provide the following <I>additional</I> information:

- Your country's number of <B>infant deaths</B> per 1000 population;
- The number of deaths per 1,000 live births due to <B>HIV/AIDS</B> for children aged 0-4 in your country;
- The percentage <B>prevalence of thinness</B> among children and adolescents (ages 10 to 19) in your country; and
- The average years of number of <B>years of schooling</B> in your country.


In [1]:
# Code for the model

import numpy as np
import pandas as pd

def equation1(
    year_input,
    infant_deaths_input,
    adult_mortality_input,
    bmi_input,
    polio_input,
    thinness_ten_nineteen_years_input,
    schooling_input,
    hiv_input,
    gdp_input,
    asia_input,
    central_america_and_caribbean_input,
    european_union_input,
    middle_east_input,
    north_america_input,
    oceania_input,
    rest_of_europe_input,
    south_america_input
):
    return (
        68.2364 +
        (0.1831 * ((year_input - 2007.5) / 4.610577)) +
        (-3.1212 * ((infant_deaths_input - 30.363792) / 27.538117)) +
        (-5.0882 * ((adult_mortality_input - 192.251775) / 114.910281)) +
        (-0.484 * ((bmi_input - 25.032926) / 2.193905)) +
        (0.1278 * ((polio_input - 86.499651) / 15.080365)) +
        (-0.0803 * ((thinness_ten_nineteen_years_input - 4.865852) / 4.438234)) +
        (0.4661 * ((schooling_input - 7.632123) / 3.171556)) +
        (-0.2834 * ((np.log(hiv_input) -1.594968) / 1.572341)) +
        (1.0065 * ((np.log(gdp_input) - 8.399358) / 1.444216)) +
        (0.443 * asia_input) +
        (1.9606 * central_america_and_caribbean_input) +
        (0.9536 * european_union_input) +
        (0.1072 * middle_east_input) +
        (1.8726 * north_america_input) +
        (-0.081 * oceania_input) +
        (0.6233 * rest_of_europe_input) +
        (1.6252 * south_america_input)
    )



def equation2(
    year_input,
    adult_mortality_input,
    bmi_input,
    polio_input, 
    gdp_input,
    asia_input,
    central_america_and_caribbean_input,
    european_union_input,
    middle_east_input,
    north_america_input,
    oceania_input,
    rest_of_europe_input,
    south_america_input
):
    return (
        67.0214 +
        (0.3767 * ((year_input - 2007.5) / 4.610577)) +
        (-5.9612 * ((adult_mortality_input - 192.251775) / 114.910281)) +
        (-0.0396 * ((bmi_input - 25.032926) / 2.193905)) +
        (1.2279 * ((polio_input - 86.499651) / 15.080365)) +
        (2.0575 * ((np.log(gdp_input) - 8.399358) / 1.444216)) +
        (0.443 * asia_input) +
        (1.9606 * central_america_and_caribbean_input) +
        (0.9536 * european_union_input) +
        (0.1072 * middle_east_input) +
        (1.8726 * north_america_input) +
        (-0.081 * oceania_input) +
        (0.6233 * rest_of_europe_input) +
        (1.6252 * south_america_input)
    )



def get_data_advanced():
    country_data = {}
    # REGION
    while True:
        try:
            region_dict = {
                0: 'Africa', 1: 'Asia', 2: 'Central America and Caribbean',
                3: 'European Union', 4: 'Middle East', 5: 'North America',
                6: 'Oceania', 7: 'Rest of Europe', 8: 'South America'
            }
            region_number = int(input(
            "🌍 Please enter the number that best represents the region of your country out of the following: (0-8):\n"
            " 0. Africa \n"
            " 1. Asia \n"
            " 2. Central America and Caribbean \n"
            " 3. European Union \n"
            " 4. Middle East \n"
            " 5. North America \n"
            " 6. Oceania \n"
            " 7. Rest of Europe \n"
            " 8. South America \n ➡️"
        ))
            if 0 <= region_number <= 8:
                country_data['region'] = region_dict[region_number]
                for key in region_dict:
                    country_data[f"region_{region_dict[key].replace(' ', '_').lower()}"] = (key == region_number)
                break
            else:
                print("🚫 Invalid input. Please enter a number between 0 and 8.")
        except ValueError:
            print("🚫 Invalid input. Please enter a valid number.")

    # YEAR
    while True:
        try:
            year = int(input("\n📅 Please enter the year of the dataset you would like to predict for in number format, any year from 2000 onwards (2000 or later): ➡️ "))
            if year >= 2000:
                country_data['year'] = year
                break
            else:
                print("🚫 Invalid input. The year must be 2000 or later.")
        except ValueError:
            print("🚫 Invalid input. Please enter a valid number.")

    # ADULT MORTALITY
    while True:
        try:
            adult_mortality = float(input("\n💔 Enter adult mortality rate per 1000 population (0-1000): ➡️ "))
            if 0 <= adult_mortality <= 1000:
                country_data['adult_mortality'] = adult_mortality
                break
            else:
                print("🚫 Invalid input. Must be between 0 and 1000.")
        except ValueError:
            print("🚫 Invalid input. Please enter a valid number.")

    # INFANT DEATHS
    while True:
        try:
            infant_deaths = float(input("\n👶 Enter infant deaths per 1000 population (0-1000): ➡️ "))
            if 0 <= infant_deaths < 1000:
                country_data['infant_deaths'] = infant_deaths
                break
            else:
                print("🚫 Invalid input. Must be between 0 and 1000.")
        except ValueError:
            print("🚫 Invalid input. Please enter a valid number.")

    # POLIO
    while True:
        try:
            polio = float(input("\n💉 Enter your Polio (Pol3) immunization coverage among 1-year-olds as a percentage (%) (0-100): ➡️ "))
            if 0 <= polio <= 100:
                country_data['polio'] = polio
                break
            else:
                print("🚫 Invalid input. Must be between 0 and 100.")
        except ValueError:
            print("🚫 Invalid input. Please enter a valid number.")

    # HIV
    while True:
        try:
            incidents_hiv = float(input("\n🦠 Enter the number of deaths per 1,000 live births due to HIV/AIDS for children aged 0-4 in your country (0-1000): ➡️ "))
            if 0 <= incidents_hiv <= 1000:
                country_data['incidents_hiv'] = incidents_hiv
                break
            else:
                print("🚫 Invalid input. Must be between 0 and 1000.")
        except ValueError:
            print("🚫 Invalid input. Please enter a valid number.")

    # BMI
    while True:
        try:
            bmi = float(input("\n⚖️ Enter your country's average Body Mass Index (BMI): ➡️ "))
            if 0 <= bmi <= 1000:
                country_data['bmi'] = bmi
                break
            else:
                print("🚫 Invalid input. Must be between 0 and 1000.")
        except ValueError:
            print("🚫 Invalid input. Please enter a valid number.")

    # THINNESS
    while True:
        try:
            thinness = float(input("\n📉 Enter prevalence of thinness among ages 10-19 (%) (0-100): ➡️ "))
            if 0 <= thinness <= 100:
                country_data['thinness_ten_nineteen_years'] = thinness
                break
            else:
                print("🚫 Invalid input. Must be between 0 and 100.")
        except ValueError:
            print("🚫 Invalid input. Please enter a valid number.")

    # SCHOOLING
    while True:
        try:
            schooling = float(input("\n📚 Enter average years of schooling (0-100): ➡️ "))
            if 0 <= schooling <= 100:
                country_data['schooling'] = schooling
                break
            else:
                print("🚫 Invalid input. Must be between 0 and 100.")
        except ValueError:
            print("🚫 Invalid input. Please enter a valid number.")

    # GDP PER CAPITA
    while True:
        try:
            gdp = float(input("\n💰 Enter your country's Gross domestic product (GDP) per capita in USD (without the $, e.g., 5000): ➡️ "))
            country_data['gdp_per_capita'] = gdp
            break
        except ValueError:
            print("🚫 Invalid input. Please enter a valid number.")

    print("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━")
    print("✅ Advanced data collection complete. Proceeding with analysis...")
    return country_data


def get_data_sensitive():
    country_data = {}
    # REGION
    while True:
        try:
            region_dict = {
                0: 'Africa', 1: 'Asia', 2: 'Central America and Caribbean',
                3: 'European Union', 4: 'Middle East', 5: 'North America',
                6: 'Oceania', 7: 'Rest of Europe', 8: 'South America'
            }
            region_number = int(input(
            "🌍 Please enter the number that best represents the region of your country out of the following: (0-8):\n"
            " 0. Africa \n"
            " 1. Asia \n"
            " 2. Central America and Caribbean \n"
            " 3. European Union \n"
            " 4. Middle East \n"
            " 5. North America \n"
            " 6. Oceania \n"
            " 7. Rest of Europe \n"
            " 8. South America \n ➡️"
        ))
            if 0 <= region_number <= 8:
                country_data['region'] = region_dict[region_number]
                for key in region_dict:
                    country_data[f"region_{region_dict[key].replace(' ', '_').lower()}"] = (key == region_number)
                break
            else:
                print("🚫 Invalid input. Please enter a number between 0 and 8.")
        except ValueError:
            print("🚫 Invalid input. Please enter a valid number.")

    # YEAR
    while True:
        try:
            year = int(input("\n📅 Please enter the year of the dataset you would like to predict for in number format, any year from 2000 onwards (2000 or later): ➡️ "))
            if year >= 2000:
                country_data['year'] = year
                break
            else:
                print("🚫 Invalid input. The year must be 2000 or later.")
        except ValueError:
            print("🚫 Invalid input. Please enter a valid number.")

    # ADULT MORTALITY
    while True:
        try:
            adult_mortality = float(input("\n💔 Enter adult mortality rate per 1000 population (0-1000): ➡️ "))
            if 0 <= adult_mortality <= 1000:
                country_data['adult_mortality'] = adult_mortality
                break
            else:
                print("🚫 Invalid input. Must be between 0 and 1000.")
        except ValueError:
            print("🚫 Invalid input. Please enter a valid number.")

        # POLIO
    while True:
        try:
            polio = float(input("\n💉 Enter your Polio (Pol3) immunization coverage among 1-year-olds as a percentage (%) (0-100): ➡️ "))
            if 0 <= polio <= 100:
                country_data['polio'] = polio
                break
            else:
                print("🚫 Invalid input. Must be between 0 and 100.")
        except ValueError:
            print("🚫 Invalid input. Please enter a valid number.")
    
    # BMI
    while True:
        try:
            bmi = float(input("\n⚖️ Enter your country's average Body Mass Index (BMI): ➡️ "))
            if 0 <= bmi <= 1000:
                country_data['bmi'] = bmi
                break
            else:
                print("🚫 Invalid input. Must be between 0 and 1000.")
        except ValueError:
            print("🚫 Invalid input. Please enter a valid number.")

    # GDP PER CAPITA
    while True:
        try:
            gdp = float(input("\n💰 Enter your country's Gross domestic product (GDP) per capita in USD (without the $, e.g., 5000): ➡️ "))
            country_data['gdp_per_capita'] = gdp
            break
        except ValueError:
            print("🚫 Invalid input. Please enter a valid number.")

    print("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━")
    print("✅ Data collection complete. Proceeding with analysis...")
    return country_data


import json


def start_predict():
    print("🌟 Welcome to the Life Expectancy Predictor! 🌟")
    print("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━")
    input_data = None
    life_expectancy = None

    try:
        while True:
            print("🔍 This tool helps predict life expectancy based on your data! \n")
            use_advanced = input(
                "🤔 Do you consent to using advanced population data (including protected info) for better accuracy? (Y/N): \n"
            ).strip().lower()

            # Decide which function to call based on consent
            if use_advanced == 'y':
                print("✨ You've chosen the advanced mode! Collecting detailed data... 🌍 \n")
                use_dict = input("\n Do you want to enter the data in the format of a dictionary? (Y/N) \n")
                if use_dict.lower() == 'y':
                    input_dict = input('Please ensure that the JSON dictionary is of the following example format (JSON string): {"region": "Africa", "region_africa": 0, "year": 2006, "adult_mortality": 515.718, "infant_deaths": 48.7, "polio": 79.0, "incidents_hiv": 11.13, "bmi": 26.6, "thinness_ten_nineteen_years": 1.6, "schooling": 9.0, "gdp_per_capita": 5827.0} \n')
                    input_data = json.loads(input_dict)
                elif use_dict.lower() == 'n':
                    input_data = get_data_advanced()
                life_expectancy = equation1(
                    year_input=input_data.get('year'),
                    infant_deaths_input=input_data.get('infant_deaths', 30.363792),
                    adult_mortality_input=input_data.get('adult_mortality', 192.251775),
                    bmi_input=input_data.get('bmi', 25.032926),
                    polio_input=input_data.get('polio', 86.499651),
                    thinness_ten_nineteen_years_input=input_data.get('thinness_ten_nineteen_years', 4.865852),
                    schooling_input=input_data.get('schooling', 7.632123),
                    hiv_input=input_data.get('incidents_hiv', 0.8942877094972066),
                    gdp_input=input_data.get('gdp_per_capita', 11540.924930167597),
                    asia_input=input_data.get('region_asia', 0),
                    central_america_and_caribbean_input=input_data.get('region_central_america_and_caribbean', 0),
                    european_union_input=input_data.get('region_european_union', 0),
                    middle_east_input=input_data.get('region_middle_east', 0),
                    north_america_input=input_data.get('region_north_america', 0),
                    oceania_input=input_data.get('region_oceania', 0),
                    rest_of_europe_input=input_data.get('region_rest_of_europe', 0),
                    south_america_input=input_data.get('region_south_america', 0)
                )
                break
            elif use_advanced == 'n':
                print("📋 You've chosen the basic mode! Collecting essential data... 🌎 \n")
                use_dict = input("\n Do you want to enter the data in the format of a dictionary? (Y/N) \n")
                if use_dict.lower() == 'y':
                    input_dict = input('Please ensure that the JSON dictionary is of the following example format (JSON string): {"region": "Africa", "region_africa": 0, "year": 2006, "adult_mortality": 515.718, "polio": 79.0, "bmi": 26.6, "gdp_per_capita": 5827.0} \n')
                    input_data = json.loads(input_dict)
                elif use_dict.lower() == 'n':
                    input_data = get_data_sensitive()
                
                life_expectancy = equation2(
                year_input=input_data.get('year'),
                adult_mortality_input=input_data.get('adult_mortality', 192.251775),
                bmi_input=input_data.get('bmi', 25.032926),
                polio_input=input_data.get('polio', 86.499651),
                gdp_input=input_data.get('gdp_per_capita', 11540.924930167597),
                asia_input=input_data.get('region_asia', 0),
                central_america_and_caribbean_input=input_data.get('region_central_america_and_caribbean', 0),
                european_union_input=input_data.get('region_european_union', 0),
                middle_east_input=input_data.get('region_middle_east', 0),
                north_america_input=input_data.get('region_north_america', 0),
                oceania_input=input_data.get('region_oceania', 0),
                rest_of_europe_input=input_data.get('region_rest_of_europe', 0),
                south_america_input=input_data.get('region_south_america', 0)
                )
                break
            else:
                print("🚫 Invalid input. Please enter 'Y' or 'N'.")

    except ValueError as e:
        print(f"⚠️ A value error occurred: {e}")
    except Exception as e:
        print(f"❗ An unexpected error occurred: {e}")
    
    print("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━")
    print(f"✅ Prediction complete! \n \n 🎉 Please run the next cell to see the results. \n")
    return input_data, life_expectancy

def print_output(input_data, life_expectancy):
    if input_data:
        output_message = "🌏✨ WORLD HEALTH DATA INSIGHTS ✨🌏\n"
        output_message += "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n" 
        output_message += "📊 KEY DATA COLLECTED:\n"
        for key, value in input_data.items():
            if key not in [
                "region_africa",
                "region_asia",
                "region_central_america_and_caribbean",
                "region_european_union",
                "region_middle_east",
                "region_north_america",
                "region_oceania",
                "region_rest_of_europe",
                "region_south_america",
            ]:
                # Format each key-value pair
                formatted_key = key.replace('_', ' ').title().replace('Hiv', 'HIV').replace('Bmi', 'BMI').replace('Gdp', 'GDP')
                output_message += f"★ {formatted_key} ➡️ {value}\n"
                
        output_message += "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n"

        year = input_data.get('year', 'N/A')
        region = input_data.get('region', 'N/A')

        output_message += "\n⚡ LIFE EXPECTANCY REPORT ⚡\n"
        output_message += f"🏯 Region: {region.upper()}\n"
        output_message += f"🗓️ Year: {year}\n"
        output_message += f"🎉 PREDICTED LIFE EXPECTANCY: {life_expectancy:.2f} YEARS\n"
        output_message += "\n🌀 This analysis contributes to WHO's mission to monitor and improve global health. 🌍✨\n"
        output_message += "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n"
        output_message += "🔥 THANK YOU FOR BEING PART OF A HEALTHIER WORLD! 💪⚡\n"
        print(output_message)
    else:
        print("No data available to display.")

# User Data Collection and Life Prediction Result
---

**Please note that you can input the data as a JSON string if preferable, please see examples below:**

For Advanced Mode: (JSON Data Example):
```
{"region": "Africa", "region_africa": 0, "year": 2006, "adult_mortality": 515.718, "infant_deaths": 48.7, "polio": 79.0, "incidents_hiv": 11.13, "bmi": 26.6, "thinness_ten_nineteen_years": 1.6, "schooling": 9.0, "gdp_per_capita": 5827.0}
```
For Sensitive Mode: (JSON Data Example):
```
{"region": "South America", "region_south_america": 1, "year": 2012, "adult_mortality": 150.2245, "polio": 96, "bmi": 26.1, "gdp_per_capita": 9057.0}
```
Please note that the region should be inputed into the JSON string in the following pairs for accurate data entry - this is due to the One-hot encoding (OHE) of the region columns:

```
"region": "Africa", "region_africa": 0
"region": "Asia", "region_asia": 1
"region": "Central America and Caribbean", "region_central_america_and_caribbean": 1
"region": "European Union", "region_european_union": 1
"region": "Middle East", "region_middle_east": 1
"region": "North America", "region_north_america": 1
"region": "Oceania", "region_oceania": 1
"region": "Rest of Europe","region_rest_of_europe": 1
"region": "South America", "region_south_america": 1
```

Please run the code block below to input your data for the model:

In [2]:
input_data, life_expectancy = start_predict()

🌟 Welcome to the Life Expectancy Predictor! 🌟
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🔍 This tool helps predict life expectancy based on your data! 

📋 You've chosen the basic mode! Collecting essential data... 🌎 

❗ An unexpected error occurred: 'NoneType' object has no attribute 'get'
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
✅ Prediction complete! 
 
 🎉 Please run the next cell to see the results. 



In [3]:
print_output(input_data, life_expectancy)

No data available to display.
