# Life Expectancy Analysis

![Life Cycle](life.extension)

## 1. Business Understanding

### a) Introduction

The objective of this project is to analyze life expectancy data using multiple regression modeling techniques. By employing multiple regression, we aim to identify and quantify the relationships that exist between various factors or predictors and life expectancy. This analysis will provide insights into the key drivers of life expectancy and help stakeholders make informed decisions related to public health, healthcare policies, and socioeconomic development.

### b) Problem Statement

This project addresses the need to understand which factors significantly affect life expectancy. Traditional approaches to public health analysis often rely on narrow analyses or anecdotal evidence, which leads to incomplete insights. This knowledge gap hampers policymakers' ability to predict and interpret changes in life expectancy. To address it, we propose using multiple regression modeling to analyze life expectancy data comprehensively. Multiple regression lets us identify the relationships between different factors and life expectancy while accounting for the collective impact of multiple independent variables.
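For reference, the standard form of a multiple linear regression model is written out here. The notation is generic and is an assumption of this sketch, not something defined in the project: $y_i$ stands for the life expectancy of observation $i$, $x_{i1}, \dots, x_{ip}$ for its $p$ predictors, $\beta_0$ for the intercept, $\beta_j$ for the coefficient of predictor $j$, and $\varepsilon_i$ for the error term.

$$
y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip} + \varepsilon_i
$$

Each $\beta_j$ is read as the expected change in $y_i$ for a one-unit increase in $x_{ij}$ with the other predictors held fixed.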

### c) Main Objective

The main objective of this project is to analyze life expectancy data using multiple regression modeling techniques to identify and quantify the key factors influencing life expectancy. The analysis will help stakeholders make data-driven decisions in areas such as public health planning, resource allocation, and policy formulation.

### d) Specific Objectives

1. Conduct exploratory data analysis to gain insights into the relationships between different variables and life expectancy, assisting in the selection of relevant variables for the regression model.

2. Develop a multiple regression model to predict life expectancy, considering the selected independent variables and their impact on the dependent variable.

3. Validate the model assumptions, assess its goodness of fit, and refine the model if necessary.

4. Interpret the coefficients of the independent variables in the model to determine their individual impact on life expectancy, identifying the most influential factors affecting life expectancy and their respective effects.

5. Evaluate and validate the performance of the model through appropriate metrics and statistical tests.

6. Provide actionable insights and recommendations based on the analysis to assist policymakers and public health officials in making informed decisions regarding healthcare strategies, resource allocation, and intervention planning.
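The modeling objectives above (fit a multiple regression, read its coefficients, and evaluate it on held-out data) can be sketched as follows. This is a minimal illustration, assuming scikit-learn's `LinearRegression` as the estimator, which the project text does not name. The data is synthetic, and the column names (`schooling`, `adult_mortality`, `log_gdp`) and generating coefficients are hypothetical, not taken from the WHO dataset.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: the column names and the "true" coefficients
# below are hypothetical and are NOT taken from the WHO dataset.
rng = np.random.default_rng(42)
n = 500
X = pd.DataFrame({
    "schooling": rng.normal(12, 3, n),
    "adult_mortality": rng.normal(160, 40, n),
    "log_gdp": rng.normal(8, 1.5, n),
})
y = (
    50.0
    + 1.2 * X["schooling"]
    - 0.05 * X["adult_mortality"]
    + 0.8 * X["log_gdp"]
    + rng.normal(0, 1.0, n)  # random noise term
)

# Hold out 20% of the rows so the model is evaluated on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit an ordinary least squares multiple regression.
model = LinearRegression().fit(X_train, y_train)

# One coefficient per predictor: the estimated change in the target for a
# one-unit change in that predictor, holding the others fixed.
coefficients = pd.Series(model.coef_, index=X.columns)

# Goodness of fit on the held-out rows.
y_pred = model.predict(X_test)
r2 = r2_score(y_test, y_pred)
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))

print(coefficients.round(3))
print(f"intercept={model.intercept_:.2f}  R^2={r2:.3f}  RMSE={rmse:.3f}")
```

Because the predictors here are on very different scales, the raw coefficients are not directly comparable in magnitude; standardizing the predictors first (for example with `StandardScaler`) is one common way to compare their relative influence.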

### e) Notebook Structure

1. Reading the data.
2. Data Cleaning and Preprocessing.
3. Exploratory Data Analysis (EDA) and Feature Selection.
4. Multiple Regression Modeling.
5. Model Evaluation and Interpretation.
6. Results Presentation and Conclusions.
7. Recommendations.
8. References.

### f) Data Understanding

The dataset used in this project contains information about various factors affecting life expectancy, such as healthcare indicators, socioeconomic factors, and lifestyle variables. Exploratory data analysis will be used to gain a clear understanding of the dataset, including handling missing values, checking data types, identifying outliers, and selecting relevant features for analysis.

The data was compiled by the World Health Organization (WHO) and covers a range of factors related to life expectancy. The dataset contains the following columns:

 **Country**: The name of the country for which the data is recorded.

 **Year**: The year for which the data is reported.

 **Developing Status**: Indicates whether the country is classified as "developing" or "developed."

 **Adult Mortality**: A measure of the probability of dying between the ages of 15 and 60 years per 1000 population.

 **Life Expectancy**: The average number of years a person is expected to live, based on current mortality rates.

 **Infant Deaths**: The number of infant deaths (children under one year old) in the country in a given year.

 **Alcohol Consumption per Capita**: The average amount of alcohol consumed per person in the country.

 **Country's Expenditure on Health**: The amount of money spent on healthcare by the country.

 **Immunization Coverage**: The percentage of the population covered by immunization programs.

 **BMI (Body Mass Index)**: A measure of body fat based on height and weight.

 **Deaths Under 5-years-old**: The number of deaths of children under five years old in the country in a given year.

 **Deaths due to HIV/AIDS**: The number of deaths attributed to HIV/AIDS in the country in a given year.

 **GDP (Gross Domestic Product)**: The total economic output of the country.

 **Population**: The total population of the country in a given year.

 **Body Condition**: Possibly a measure of overall health or well-being.

 **Income Information**: Information about the average income or income distribution in the country.
 
 **Education**: Possibly information about educational attainment or literacy rates in the country.
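The data-understanding checks mentioned above (missing values, data types, outliers) can be sketched with pandas. This is a minimal illustration on a tiny hypothetical frame: the values are invented, the column labels are placeholders rather than the dataset's actual headers, and the 1.5 × IQR rule in the hypothetical helper `iqr_outlier_mask` is one common outlier convention, not necessarily the one used in this project.

```python
import numpy as np
import pandas as pd

# Tiny hypothetical frame; column labels and values are illustrative only.
df = pd.DataFrame({
    "Country": ["A", "B", "C", "D", "E", "F"],
    "Life expectancy": [65.0, 70.2, np.nan, 68.4, 71.1, 69.3],
    "GDP": [1200.0, 1500.0, 1350.0, 1420.0, 98000.0, np.nan],
})

# Number of missing values in each column.
missing_counts = df.isna().sum()

# Data type pandas inferred for each column.
column_types = df.dtypes


def iqr_outlier_mask(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]; NaN is never flagged."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


# Rows whose GDP value falls outside the IQR fences.
gdp_outliers = df.loc[iqr_outlier_mask(df["GDP"]), ["Country", "GDP"]]

print(missing_counts)
print(column_types)
print(gdp_outliers)
```

Whether a flagged value is an error or a genuine extreme (for instance, a very wealthy country) is a judgment call, so flagged rows are worth inspecting before any are dropped or capped.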


## Import Relevant Packages

```python
import numpy as np  # Mathematical functions and operations
import pandas as pd  # Data manipulation and analysis

import warnings
warnings.filterwarnings('ignore')  # Suppress warning output in the notebook

import matplotlib.pyplot as plt  # Data visualisation
# Show the visuals inline in the current notebook
%matplotlib inline
import seaborn as sns  # Enhanced data visualisation

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split, GridSearchCV, cross_val_score
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline

from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay, classification_report
```
