### BUSINESS UNDERSTANDING
Background:

Sustainable Development Goals (SDGs) are a set of global goals adopted by United Nations member states in 2015 to address social, economic, and environmental challenges.

Predicting progress towards SDGs is essential for monitoring development outcomes, guiding policy interventions, and mobilizing resources effectively.


Business Problem:

- Without reliable methods for forecasting progress towards the SDGs, decision-makers cannot plan informed, targeted interventions.
- Predictive models built on historical data from the Millennium Development Goals (MDGs) era can reveal the factors that drove successful development outcomes and inform future policy actions.

Business Objectives:

- Develop predictive models to forecast progress towards the Sustainable Development Goals (SDGs) using machine learning techniques.
- Identify key factors and indicators that contribute to successful development outcomes and progress towards the SDGs.
- Provide actionable insights and policy recommendations based on the predictive models to support decision-making and resource allocation strategies.

### DATA UNDERSTANDING

### IMPORTING NECESSARY LIBRARIES

In [2]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

### LOADING THE DATASET

In [8]:
# Load the dataset from the CSV copy (disabled; the Excel file is loaded in the next cell instead)
#data = pd.read_csv('dataset/data/africa_millennium_development_goals_xlsx_1.csv')
#data.head(15)

In [4]:
# Load the dataset from the original Excel file
original_data = pd.read_excel('dataset/original/africa-millennium-development-goals-xlsx-1.xlsx')
original_data.head(15)

Unnamed: 0,CountryName,Country,GoalName,Goal,IndicatorName,Indicator,Social GroupName,Social Group,Units,Scale,Frequency,Date,Value
0,Kenya,25,Goal 4: Reduce child mortality,KN.1000030,Infant mortality rate per 1000 live births,20267305,Women,20136705,,1,A,1990-01-01,123.0
1,Kenya,25,Goal 4: Reduce child mortality,KN.1000030,Infant mortality rate per 1000 live births,20267305,Women,20136705,,1,A,1992-01-01,107.2
2,Kenya,25,Goal 4: Reduce child mortality,KN.1000030,Infant mortality rate per 1000 live births,20267305,Women,20136705,,1,A,1996-01-01,108.9
3,Kenya,25,Goal 4: Reduce child mortality,KN.1000030,Infant mortality rate per 1000 live births,20267305,Women,20136705,,1,A,2000-01-01,110.0
4,Kenya,25,Goal 4: Reduce child mortality,KN.1000030,Infant mortality rate per 1000 live births,20267305,Women,20136705,,1,A,2001-01-01,95.0
5,Kenya,25,Goal 4: Reduce child mortality,KN.1000030,Infant mortality rate per 1000 live births,20267305,Women,20136705,,1,A,2007-01-01,70.0
6,Somalia,44,Goal 5: Improve maternal health,KN.1000040,Contraceptive prevalence rate 15-49,20267705,Women,20136705,,1,A,2000-01-01,11.7
7,Somalia,44,Goal 5: Improve maternal health,KN.1000040,Contraceptive prevalence rate 15-49,20267705,Women,20136705,,1,A,2007-01-01,14.6
8,Somalia,44,Goal 1: Eradicate extreme poverty and hunger,KN.1000000,Prevalence of underweight children under-five ...,20266005,Women,20136705,,1,A,2005-01-01,26.0
9,Somalia,44,Goal 1: Eradicate extreme poverty and hunger,KN.1000000,Prevalence of underweight children under-five ...,20266005,Women,20136705,,1,A,2007-01-01,36.0
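
The preview shows that the data is in long format: each row holds one value for one country, indicator, and date. A minimal sketch of pulling a single country/indicator time series out of that layout follows. The `sample` frame and the `indicator_series` helper are hypothetical and only mirror a few of the rows above; with the real data, `original_data` would take the place of `sample`, and `pd.to_datetime` is harmless if `Date` is already parsed.

```python
import pandas as pd

# Hypothetical sample mirroring a few rows of the preview above.
sample = pd.DataFrame({
    "CountryName": ["Kenya"] * 3 + ["Somalia"] * 2,
    "IndicatorName": ["Infant mortality rate per 1000 live births"] * 3
                     + ["Contraceptive prevalence rate 15-49"] * 2,
    "Date": ["1990-01-01", "1992-01-01", "1996-01-01",
             "2000-01-01", "2007-01-01"],
    "Value": [123.0, 107.2, 108.9, 11.7, 14.6],
})

# Parse the dates so the year can serve as a time index.
sample["Date"] = pd.to_datetime(sample["Date"])
sample["Year"] = sample["Date"].dt.year


def indicator_series(df, country, indicator):
    """Return one country/indicator series of values, indexed by year."""
    mask = (df["CountryName"] == country) & (df["IndicatorName"] == indicator)
    return df.loc[mask].set_index("Year")["Value"].sort_index()


kenya_imr = indicator_series(
    sample, "Kenya", "Infant mortality rate per 1000 live births"
)
print(kenya_imr)
```

Working with one series at a time keeps values measured on different scales (rates per 1000, percentages) from being mixed together.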


In [5]:
# Explore the CSV copy (disabled; the output below is retained from an earlier run)
#print("Dimensions of the dataset:",data.shape)
#print("\nFirst few rows of the dataset:")
#data.head()

Dimensions of the dataset: (10610, 13)

First few rows of the dataset:


Unnamed: 0,countryname,country,goalname,goal,indicatorname,indicator,social_groupname,social_group,units,scale,frequency,date,value
0,Kenya,25,Goal 4: Reduce child mortality,KN.1000030,Infant mortality rate per 1000 live births,20267305,Women,20136705,,True,A,1990-01-01,123.0
1,Kenya,25,Goal 4: Reduce child mortality,KN.1000030,Infant mortality rate per 1000 live births,20267305,Women,20136705,,True,A,1992-01-01,107.2
2,Kenya,25,Goal 4: Reduce child mortality,KN.1000030,Infant mortality rate per 1000 live births,20267305,Women,20136705,,True,A,1996-01-01,108.9
3,Kenya,25,Goal 4: Reduce child mortality,KN.1000030,Infant mortality rate per 1000 live births,20267305,Women,20136705,,True,A,2000-01-01,110.0
4,Kenya,25,Goal 4: Reduce child mortality,KN.1000030,Infant mortality rate per 1000 live births,20267305,Women,20136705,,True,A,2001-01-01,95.0


In [7]:
# Explore the dataset: dimensions and first few rows
print("Dimensions of the dataset:",original_data.shape)
print("\nFirst few rows of the dataset:")
original_data.head()

Dimensions of the dataset: (10610, 13)

First few rows of the dataset:


Unnamed: 0,CountryName,Country,GoalName,Goal,IndicatorName,Indicator,Social GroupName,Social Group,Units,Scale,Frequency,Date,Value
0,Kenya,25,Goal 4: Reduce child mortality,KN.1000030,Infant mortality rate per 1000 live births,20267305,Women,20136705,,1,A,1990-01-01,123.0
1,Kenya,25,Goal 4: Reduce child mortality,KN.1000030,Infant mortality rate per 1000 live births,20267305,Women,20136705,,1,A,1992-01-01,107.2
2,Kenya,25,Goal 4: Reduce child mortality,KN.1000030,Infant mortality rate per 1000 live births,20267305,Women,20136705,,1,A,1996-01-01,108.9
3,Kenya,25,Goal 4: Reduce child mortality,KN.1000030,Infant mortality rate per 1000 live births,20267305,Women,20136705,,1,A,2000-01-01,110.0
4,Kenya,25,Goal 4: Reduce child mortality,KN.1000030,Infant mortality rate per 1000 live births,20267305,Women,20136705,,1,A,2001-01-01,95.0


In [9]:
# Descriptive statistics for the numeric columns
print("\nSummary statistics:")
original_data.describe()


Summary statistics:


Unnamed: 0,Country,Indicator,Social Group,Units,Scale,Value
count,10610.0,10610.0,10610.0,0.0,10610.0,10610.0
mean,27.962677,20267660.0,20137020.0,,1.0,6365.211
std,15.145405,1734.153,163.0599,,0.0,129894.2
min,1.0,20265000.0,20136600.0,,1.0,0.0
25%,14.0,20266400.0,20137000.0,,1.0,10.5
50%,30.0,20267300.0,20137100.0,,1.0,39.4
75%,39.0,20269300.0,20137100.0,,1.0,73.1
max,53.0,20270700.0,20137100.0,,1.0,4366048.0
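
The summary above suggests a few things worth checking before modelling: `Units` has a count of 0 (no non-null values), `Scale` has a standard deviation of 0 (a single constant value), and `Value` has a mean (6365.2) far above its median (39.4), presumably because it pools indicators measured on very different scales. The sketch below shows one possible way to detect such columns and to summarise `Value` per indicator instead. The `sample` frame is a hypothetical stand-in; with the real data, `original_data` would be used in its place.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; column names follow the summary above.
sample = pd.DataFrame({
    "IndicatorName": ["Infant mortality rate per 1000 live births"] * 3
                     + ["Contraceptive prevalence rate 15-49"] * 2,
    "Units": [np.nan] * 5,
    "Scale": [1] * 5,
    "Value": [123.0, 107.2, 108.9, 11.7, 14.6],
})

# Columns with no non-null values at all (like Units, count 0).
empty_cols = [c for c in sample.columns if sample[c].notna().sum() == 0]

# Columns with exactly one distinct non-null value (like Scale, std 0).
constant_cols = [c for c in sample.columns
                 if sample[c].nunique(dropna=True) == 1]

# Drop the uninformative columns.
cleaned = sample.drop(columns=empty_cols + constant_cols)

# Summarise Value within each indicator rather than across the pooled column.
per_indicator = cleaned.groupby("IndicatorName")["Value"].describe()
print(per_indicator[["count", "min", "50%", "max"]])
```

Note also that `Country`, `Indicator`, and `Social Group` appear to be numeric codes, so their means and quartiles in the summary are not meaningful as statistics.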

