In [3]:
# Load and activate the SQL extension, which allows us to execute SQL in a Jupyter notebook.
# If you get an error here, make sure that mysql and pymysql are installed correctly.

%load_ext sql

In [5]:
# Establish a connection to the local database using the '%sql' magic command.
# The connection string has the form mysql+pymysql://user:password@host:port/db_name;
# replace the password and database name below with your own.
# If you get an error here, make sure the database name and password are correct.

%sql mysql+pymysql://root:ofge@localhost:3306/united_nations
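
The connection string above embeds the password directly in the notebook. As a minimal sketch of one alternative, the URL could be assembled in Python from an environment variable. The helper `build_mysql_url` and the variable name `DB_PASSWORD` are hypothetical names chosen for this illustration; only the user, host, port, and database name come from the cell above.

```python
import os
from urllib.parse import quote_plus


def build_mysql_url(user, password, host, port, db_name):
    """Assemble a mysql+pymysql URL, percent-escaping the user and password."""
    return (
        f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{db_name}"
    )


# DB_PASSWORD is a hypothetical variable name; "password" is only a placeholder fallback.
password = os.environ.get("DB_PASSWORD", "password")
connection_url = build_mysql_url("root", password, "localhost", 3306, "united_nations")
```

Inside the notebook, the resulting string can be passed to the `%sql` magic in place of the literal URL. IPython normally expands `$connection_url` in magic arguments, but check this against your installed ipython-sql version.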

In [16]:
%%sql

SELECT
    *
FROM
    Access_to_Basic_Services
LIMIT 5;

 * mysql+pymysql://root:***@localhost:3306/united_nations
5 rows affected.


Region,Sub_region,Country_name,Time_period,Pct_managed_drinking_water_services,Pct_managed_sanitation_services,Est_population_in_millions,Est_gdp_in_billions,Land_area,Pct_unemployment
Central and Southern Asia,Central Asia,Kazakhstan,2015,94.67,98.0,17.542806,184.39,2699700.0,4.93
Central and Southern Asia,Central Asia,Kazakhstan,2016,94.67,98.0,17.794055,137.28,2699700.0,4.96
Central and Southern Asia,Central Asia,Kazakhstan,2017,95.0,98.0,18.037776,166.81,2699700.0,4.9
Central and Southern Asia,Central Asia,Kazakhstan,2018,95.0,98.0,18.276452,179.34,2699700.0,4.85
Central and Southern Asia,Central Asia,Kazakhstan,2019,95.0,98.0,18.513673,181.67,2699700.0,4.8


## Task 1
Select data from the Sub-Saharan African region during the year 2020.

In [70]:
%%sql

SELECT
	Country_name,
	Time_period,
	Pct_managed_drinking_water_services,
	Pct_managed_sanitation_services,
	Est_population_in_millions,
	Est_gdp_in_billions
FROM
	united_nations.Access_to_Basic_Services
WHERE
	region = 'Sub-Saharan Africa'
AND
	Time_period = 2020;

 * mysql+pymysql://root:***@localhost:3306/united_nations
47 rows affected.


Country_name,Time_period,Pct_managed_drinking_water_services,Pct_managed_sanitation_services,Est_population_in_millions,Est_gdp_in_billions
Burundi,2020,70.33,44.33,12.220227,2.65
Djibouti,2020,69.0,56.0,1.090156,3.18
Ethiopia,2020,58.0,11.67,117.190911,107.66
Kenya,2020,67.0,33.67,51.98578,100.67
Madagascar,2020,56.33,13.0,28.225177,13.05
Malawi,2020,74.33,28.67,19.377061,12.18
Mauritius,2020,100.0,96.0,1.26574,11.4
Mayotte,2020,96.0,100.0,,
Mozambique,2020,66.67,40.33,31.178239,14.03
Rwanda,2020,66.33,64.0,13.146362,10.18


## Task 2
Sometimes there are NULL values in our entries. Any country with a NULL value for its GDP should not be included in our query, as it will not help us determine whether there is any correlation between GDP and access to basic services. For this task, determine whether there are any NULL values in the GDP column.

In [69]:
%%sql

SELECT
	Country_name,
	Time_period,
	Pct_managed_drinking_water_services,
	Pct_managed_sanitation_services,
	Est_gdp_in_billions,
	region
FROM
	united_nations.Access_to_Basic_Services
WHERE
	region = 'Sub-Saharan Africa'
AND
	Time_period = 2020
AND
	Est_gdp_in_billions IS NULL;

 * mysql+pymysql://root:***@localhost:3306/united_nations
9 rows affected.


Country_name,Time_period,Pct_managed_drinking_water_services,Pct_managed_sanitation_services,Est_gdp_in_billions,region
Mayotte,2020,96.0,100.0,,Sub-Saharan Africa
Réunion,2020,100.0,100.0,,Sub-Saharan Africa
South Sudan,2020,48.33,22.33,,Sub-Saharan Africa
United Republic of Tanzania,2020,65.0,34.0,,Sub-Saharan Africa
Congo,2020,69.0,17.67,,Sub-Saharan Africa
Democratic Republic of the Congo,2020,47.67,15.33,,Sub-Saharan Africa
Côte d'Ivoire,2020,70.67,34.67,,Sub-Saharan Africa
Gambia,2020,79.33,44.33,,Sub-Saharan Africa
Saint Helena,2020,99.0,100.0,,Sub-Saharan Africa
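
The query above tests for missing GDP values with `IS NULL` rather than `= NULL`. The sketch below illustrates why, using an in-memory SQLite database as a stand-in for MySQL (an assumption of this example; both follow the standard rule that a comparison with NULL is never true). The three rows are copied from the outputs above, and the helper `countries` is a hypothetical name.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Access_to_Basic_Services "
    "(Country_name TEXT, Est_gdp_in_billions REAL)"
)
# Rows copied from the query outputs above; Python None is stored as NULL.
conn.executemany(
    "INSERT INTO Access_to_Basic_Services VALUES (?, ?)",
    [("Burundi", 2.65), ("Mayotte", None), ("Gambia", None)],
)


def countries(where_clause):
    """Return the country names matching a WHERE clause, sorted by name."""
    query = (
        "SELECT Country_name FROM Access_to_Basic_Services "
        f"WHERE {where_clause} ORDER BY Country_name"
    )
    return [row[0] for row in conn.execute(query)]


is_null = countries("Est_gdp_in_billions IS NULL")       # rows with missing GDP
equals_null = countries("Est_gdp_in_billions = NULL")    # comparison is never true
not_null = countries("Est_gdp_in_billions IS NOT NULL")  # rows with a GDP value
```

Here `is_null` holds the two countries without a GDP figure, `equals_null` is empty, and `not_null` holds only Burundi.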


## Task 3
If there are any NULL values, exclude them from your query.

In [68]:
%%sql

SELECT
	Country_name,
	Time_period,
	Pct_managed_drinking_water_services,
	Pct_managed_sanitation_services,
	Est_gdp_in_billions,
	region
FROM
	united_nations.Access_to_Basic_Services
WHERE
	region = 'Sub-Saharan Africa'
AND
	Time_period = 2020
AND
	Est_gdp_in_billions IS NOT NULL
ORDER BY Est_gdp_in_billions
LIMIT 10;

 * mysql+pymysql://root:***@localhost:3306/united_nations
10 rows affected.


Country_name,Time_period,Pct_managed_drinking_water_services,Pct_managed_sanitation_services,Est_gdp_in_billions,region
Sao Tome and Principe,2020,77.33,46.0,0.47,Sub-Saharan Africa
Guinea-Bissau,2020,60.0,19.33,1.43,Sub-Saharan Africa
Cabo Verde,2020,87.33,78.0,1.7,Sub-Saharan Africa
Lesotho,2020,76.33,49.67,2.23,Sub-Saharan Africa
Central African Republic,2020,38.33,15.0,2.33,Sub-Saharan Africa
Burundi,2020,70.33,44.33,2.65,Sub-Saharan Africa
Liberia,2020,75.0,17.67,3.04,Sub-Saharan Africa
Djibouti,2020,69.0,56.0,3.18,Sub-Saharan Africa
Eswatini,2020,76.67,61.33,3.98,Sub-Saharan Africa
Sierra Leone,2020,65.0,17.33,4.06,Sub-Saharan Africa
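
Excluding NULLs matters here because of how the rows are ordered: in an ascending `ORDER BY`, MySQL places NULL values first, so without the `IS NOT NULL` filter the `LIMIT` would be filled with countries that have no GDP figure. The sketch below shows the effect on five rows copied from the outputs above, using an in-memory SQLite database as a stand-in for MySQL (an assumption of this example; SQLite also sorts NULLs first in ascending order).

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Access_to_Basic_Services "
    "(Country_name TEXT, Est_gdp_in_billions REAL)"
)
# Rows copied from the query outputs above; Python None is stored as NULL.
conn.executemany(
    "INSERT INTO Access_to_Basic_Services VALUES (?, ?)",
    [
        ("Sao Tome and Principe", 0.47),
        ("Guinea-Bissau", 1.43),
        ("Cabo Verde", 1.7),
        ("Mayotte", None),
        ("South Sudan", None),
    ],
)

# Without the filter, the NULL rows sort first and take up the LIMIT.
# Country_name is added as a tie-breaker so the order of the NULL rows is fixed.
unfiltered = [
    row[0]
    for row in conn.execute(
        "SELECT Country_name FROM Access_to_Basic_Services "
        "ORDER BY Est_gdp_in_billions, Country_name LIMIT 3"
    )
]

# With the filter, the three smallest known GDP values are returned.
filtered = [
    row[0]
    for row in conn.execute(
        "SELECT Country_name FROM Access_to_Basic_Services "
        "WHERE Est_gdp_in_billions IS NOT NULL "
        "ORDER BY Est_gdp_in_billions LIMIT 3"
    )
]
```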


## Task 4
Let's get an idea of whether there's any correlation between GDP and access to basic services for the top 5 economies in Sub-Saharan Africa. The top 5 GDPs are: ('Nigeria','South Africa','Ethiopia','Kenya','Ghana'). Make sure your query only includes these countries.

In [62]:
%%sql

SELECT
	Country_name,
	Time_period,
	Pct_managed_drinking_water_services,
	Pct_managed_sanitation_services,
	Est_population_in_millions,
	Est_gdp_in_billions
FROM
	united_nations.Access_to_Basic_Services
WHERE
	Region = 'Sub-Saharan Africa'
AND
	Time_period = 2020
AND
	Est_gdp_in_billions IS NOT NULL
AND
	Country_name IN ('Nigeria','South Africa','Ethiopia','Kenya','Ghana')
ORDER BY Pct_managed_sanitation_services;

 * mysql+pymysql://root:***@localhost:3306/united_nations
5 rows affected.


Country_name,Time_period,Pct_managed_drinking_water_services,Pct_managed_sanitation_services,Est_population_in_millions,Est_gdp_in_billions
Ethiopia,2020,58.0,11.67,117.190911,107.66
Ghana,2020,84.67,23.0,32.180401,70.04
Kenya,2020,67.0,33.67,51.98578,100.67
Nigeria,2020,77.33,42.67,208.327405,432.2
South Africa,2020,92.0,78.67,58.801927,337.62


## Task 5
We only looked at 5 countries in the previous query. Let's have a look at the rest of Sub-Saharan Africa. Exclude the countries mentioned in the previous task.

In [66]:
%%sql

SELECT
	Country_name,
	Time_period,
	Pct_managed_drinking_water_services,
	Pct_managed_sanitation_services,
	Est_population_in_millions,
	Est_gdp_in_billions
FROM
	united_nations.Access_to_Basic_Services
WHERE
	Region = 'Sub-Saharan Africa'
AND
	Time_period = 2020
AND
	Est_gdp_in_billions IS NOT NULL
AND
	Country_name NOT IN ('Nigeria','South Africa','Ethiopia','Kenya','Ghana')
ORDER BY Est_gdp_in_billions
LIMIT 10;

 * mysql+pymysql://root:***@localhost:3306/united_nations
10 rows affected.


Country_name,Time_period,Pct_managed_drinking_water_services,Pct_managed_sanitation_services,Est_population_in_millions,Est_gdp_in_billions
Sao Tome and Principe,2020,77.33,46.0,0.218641,0.47
Guinea-Bissau,2020,60.0,19.33,2.015828,1.43
Cabo Verde,2020,87.33,78.0,0.58264,1.7
Lesotho,2020,76.33,49.67,2.2541,2.23
Central African Republic,2020,38.33,15.0,5.34302,2.33
Burundi,2020,70.33,44.33,12.220227,2.65
Liberia,2020,75.0,17.67,5.087584,3.04
Djibouti,2020,69.0,56.0,1.090156,3.18
Eswatini,2020,76.67,61.33,1.180655,3.98
Sierra Leone,2020,65.0,17.33,8.23397,4.06
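
Tasks 4 and 5 split the region with `IN` and `NOT IN`, and Task 4 asks whether GDP and access to basic services are correlated for the five largest economies. The sketch below reproduces both steps on a handful of rows copied from the outputs above. It assumes an in-memory SQLite database as a stand-in for MySQL, and the helpers `select_countries` and `pearson` are hypothetical names written for this illustration; the notebook itself does not compute a correlation coefficient.

```python
import math
import sqlite3

TOP_FIVE = ("Nigeria", "South Africa", "Ethiopia", "Kenya", "Ghana")

# (Country_name, Pct_managed_sanitation_services, Est_gdp_in_billions),
# copied from the 2020 query outputs above.
ROWS = [
    ("Ethiopia", 11.67, 107.66),
    ("Ghana", 23.0, 70.04),
    ("Kenya", 33.67, 100.67),
    ("Nigeria", 42.67, 432.2),
    ("South Africa", 78.67, 337.62),
    ("Burundi", 44.33, 2.65),
    ("Djibouti", 56.0, 3.18),
]

# In-memory SQLite database used as a stand-in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Access_to_Basic_Services ("
    "Country_name TEXT, "
    "Pct_managed_sanitation_services REAL, "
    "Est_gdp_in_billions REAL)"
)
conn.executemany("INSERT INTO Access_to_Basic_Services VALUES (?, ?, ?)", ROWS)

# One "?" placeholder per country, so the names are passed as bound parameters.
PLACEHOLDERS = ", ".join("?" for _ in TOP_FIVE)


def select_countries(operator):
    """Run the filter with operator 'IN' or 'NOT IN' against TOP_FIVE."""
    query = (
        "SELECT Country_name, Pct_managed_sanitation_services, Est_gdp_in_billions "
        "FROM Access_to_Basic_Services "
        f"WHERE Country_name {operator} ({PLACEHOLDERS}) "
        "ORDER BY Pct_managed_sanitation_services"
    )
    return conn.execute(query, TOP_FIVE).fetchall()


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


top_five = select_countries("IN")   # only the five largest economies
rest = select_countries("NOT IN")   # every other row: the complement

# Correlation between GDP and sanitation access for the five largest economies.
r = pearson([row[2] for row in top_five], [row[1] for row in top_five])
```

With only five data points, the coefficient `r` is indicative at best, so treat it as a rough first look rather than evidence of a relationship.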
