#  Subquery in the WHERE Clause

###  Description
In this notebook, we’ll use **subqueries inside the WHERE clause** to filter rows against a value that is computed when the query runs, rather than a hard-coded number.  
This lets us answer questions such as which countries are economically strong but still face infrastructure challenges.



###  Learning Objectives
By the end of this notebook, you will:
- Understand how to use **subqueries within the WHERE clause**.  
- Learn how to **join two tables** on shared key columns (`Country_name` and `Time_period`).  
- Filter rows dynamically against a value computed by another query.



###  Connecting to the MySQL Database
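We’ll use the **Basic_Services** and **Economic_Indicators** tables in the **united_nations** database.
The cell below loads the `sql` extension, which provides the `%%sql` cell magic used in the rest of the notebook.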


In [1]:
%load_ext sql

###  Preview the Basic_Services Table
Let’s look at the first few records from the `Basic_Services` table to understand its structure.


In [3]:
%%sql

SELECT 
    *
FROM 
    Basic_Services
LIMIT 5;


 * mysql+pymysql://root:***@localhost:3306/united_nations
5 rows affected.


Country_name,Time_period,Access_to_clean_water,Access_to_sanitation
Kazakhstan,2015,94.67,98.0
Kazakhstan,2016,94.67,98.0
Kazakhstan,2017,95.0,98.0
Kazakhstan,2018,95.0,98.0
Kazakhstan,2019,95.0,98.0


###  Preview the Economic_Indicators Table
We’ll also explore the `Economic_Indicators` table.


In [4]:
%%sql

SELECT 
    *
FROM 
    Economic_Indicators
LIMIT 5;


 * mysql+pymysql://root:***@localhost:3306/united_nations
5 rows affected.


Country_name,Time_period,Est_gdp_in_billions,Est_population_in_millions,Pct_unemployment,Land_area
Kazakhstan,2015,184.39,17.542806,4.93,2699700.0
Kazakhstan,2016,137.28,17.794055,4.96,2699700.0
Kazakhstan,2017,166.81,18.037776,4.9,2699700.0
Kazakhstan,2018,179.34,18.276452,4.85,2699700.0
Kazakhstan,2019,181.67,18.513673,4.8,2699700.0


##  Exercise

**Question:**  
For the year **2020**, which countries have a **GDP above the global average**,  
but still have **less than 90%** of their population with access to clean water?

This identifies countries with strong economies where access to clean water still falls short.


###  Task 1 — Calculate the Global Average GDP (for 2020)
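We first calculate the **global average GDP** across all countries for the year 2020.  
`AVG` skips rows where `Est_gdp_in_billions` is NULL, so countries without a GDP estimate do not affect the result.  
The same query will later serve as the subquery in the main query’s WHERE clause.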


In [8]:
%%sql

SELECT
    AVG(Est_gdp_in_billions) AS Global_Avg_GDP
FROM
    Economic_Indicators
WHERE
    Time_period = 2020;


 * mysql+pymysql://root:***@localhost:3306/united_nations
1 rows affected.


Global_Avg_GDP
301.176825
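
To make the NULL behaviour of `AVG` concrete, here is a minimal sketch that runs the same query shape on a toy table. It rests on some assumptions: an in-memory SQLite database (Python's built-in `sqlite3` module) stands in for the MySQL connection, and the rows for countries `A`, `B` and `C` are invented for illustration, not taken from `united_nations`.

```python
import sqlite3

# In-memory SQLite stands in for the notebook's MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Economic_Indicators ("
    "Country_name TEXT, Time_period INTEGER, Est_gdp_in_billions REAL)"
)

# Toy rows invented for illustration; country C has no GDP estimate for 2020.
conn.executemany(
    "INSERT INTO Economic_Indicators VALUES (?, ?, ?)",
    [("A", 2020, 100.0), ("B", 2020, 300.0), ("C", 2020, None), ("A", 2019, 900.0)],
)

# COUNT(*) counts every 2020 row; COUNT(column) and AVG(column) skip NULLs.
avg_gdp, rows_2020, rows_with_gdp = conn.execute(
    """
    SELECT
        AVG(Est_gdp_in_billions),
        COUNT(*),
        COUNT(Est_gdp_in_billions)
    FROM
        Economic_Indicators
    WHERE
        Time_period = 2020
    """
).fetchone()

print(avg_gdp, rows_2020, rows_with_gdp)
```

On this toy data the average is taken over two of the three 2020 rows, so comparing `COUNT(*)` with `COUNT(Est_gdp_in_billions)` is a quick way to see how many countries the "global" average really covers.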


###  Task 2 — Join the Economic_Indicators and Basic_Services Tables
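We’ll join both tables on `Country_name` and `Time_period`  
to combine economic and basic service data for each country and year.  
Matching on both columns pairs each country-year row in one table with the same country-year row in the other.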


In [9]:
%%sql

SELECT
    e.Country_name,
    e.Time_period,
    e.Est_gdp_in_billions,
    b.Access_to_clean_water
FROM
    Economic_Indicators AS e
JOIN
    Basic_Services AS b
ON
    e.Country_name = b.Country_name
    AND e.Time_period = b.Time_period
LIMIT 10;


 * mysql+pymysql://root:***@localhost:3306/united_nations
10 rows affected.


Country_name,Time_period,Est_gdp_in_billions,Access_to_clean_water
Kazakhstan,2015,184.39,94.67
Kazakhstan,2016,137.28,94.67
Kazakhstan,2017,166.81,95.0
Kazakhstan,2018,179.34,95.0
Kazakhstan,2019,181.67,95.0
Kazakhstan,2020,171.08,95.0
Kyrgyzstan,2015,,89.67
Kyrgyzstan,2016,,90.33
Kyrgyzstan,2017,,91.0
Kyrgyzstan,2018,,91.33
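
The sketch below shows why the join condition uses both columns. It is an illustration under stated assumptions: an in-memory SQLite database replaces the MySQL connection, and the rows for countries `A`, `B` and `C` are made up. It compares a join on `Country_name` and `Time_period` with a join on `Country_name` alone.

```python
import sqlite3

# In-memory SQLite stands in for the notebook's MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Economic_Indicators (
    Country_name TEXT, Time_period INTEGER, Est_gdp_in_billions REAL);
CREATE TABLE Basic_Services (
    Country_name TEXT, Time_period INTEGER, Access_to_clean_water REAL);
-- Toy rows invented for illustration; not values from the real tables.
INSERT INTO Economic_Indicators VALUES
    ('A', 2019, 90.0), ('A', 2020, 100.0), ('B', 2020, 300.0);
INSERT INTO Basic_Services VALUES
    ('A', 2020, 95.0), ('B', 2020, 80.0), ('C', 2020, 70.0);
""")

SELECT_PART = """
SELECT e.Country_name, e.Time_period, b.Time_period
FROM Economic_Indicators AS e
JOIN Basic_Services AS b
  ON e.Country_name = b.Country_name
"""
ORDER_PART = " ORDER BY e.Country_name, e.Time_period"

# Join on both key columns: each row pairs one country with one year.
both_keys = conn.execute(
    SELECT_PART + " AND e.Time_period = b.Time_period" + ORDER_PART
).fetchall()

# Join on the name only: A's 2019 GDP row is paired with A's 2020 services row.
name_only = conn.execute(SELECT_PART + ORDER_PART).fetchall()

print(both_keys)
print(name_only)
```

With the name-only condition, the 2019 economic row for `A` is matched to the 2020 services row, which mixes years. Note also that the inner join drops country `C`, which has no row in the toy `Economic_Indicators` table.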


###  Task 3 — Use a Subquery in the WHERE Clause
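Now we’ll filter the joined rows on three conditions:
- Year = **2020**
- GDP > global average GDP (from the subquery)
- Access to clean water < 90%

The subquery returns a single value, the 2020 average from Task 1, and the outer query compares each country’s GDP against it in the **WHERE clause**.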


In [10]:
%%sql

SELECT
    e.Country_name,
    e.Est_gdp_in_billions,
    b.Access_to_clean_water,
    e.Time_period
FROM
    Economic_Indicators AS e
JOIN
    Basic_Services AS b
ON
    e.Country_name = b.Country_name
    AND e.Time_period = b.Time_period
WHERE
    e.Time_period = 2020
    AND e.Est_gdp_in_billions > (
        SELECT
            AVG(Est_gdp_in_billions)
        FROM
            Economic_Indicators
        WHERE
            Time_period = 2020
    )
    AND b.Access_to_clean_water < 90
ORDER BY
    e.Est_gdp_in_billions DESC;


 * mysql+pymysql://root:***@localhost:3306/united_nations
1 rows affected.


Country_name,Est_gdp_in_billions,Access_to_clean_water,Time_period
Nigeria,432.2,77.33,2020
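
As a check on how the subquery behaves, the sketch below runs the same query shape in two ways: once with the average computed inline by a subquery, and once in two steps with the average fetched first and passed in as a parameter. The setup is an assumption made for illustration: an in-memory SQLite database replaces the MySQL connection, and the rows for countries `A` to `D` are invented.

```python
import sqlite3

# In-memory SQLite stands in for the notebook's MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Economic_Indicators (
    Country_name TEXT, Time_period INTEGER, Est_gdp_in_billions REAL);
CREATE TABLE Basic_Services (
    Country_name TEXT, Time_period INTEGER, Access_to_clean_water REAL);
-- Toy rows invented for illustration; not values from the real tables.
INSERT INTO Economic_Indicators VALUES
    ('A', 2019, 5000.0), ('A', 2020, 100.0), ('B', 2020, 300.0),
    ('C', 2020, 500.0), ('D', 2020, NULL);
INSERT INTO Basic_Services VALUES
    ('A', 2020, 95.0), ('B', 2020, 80.0), ('C', 2020, 85.0), ('D', 2020, 60.0);
""")

# The 2020 average; the 2019 row is excluded by the subquery's own year filter.
AVG_2020 = """
SELECT AVG(Est_gdp_in_billions)
FROM Economic_Indicators
WHERE Time_period = 2020
"""

# Main query with a placeholder for the GDP threshold.
MAIN_QUERY = """
SELECT e.Country_name, e.Est_gdp_in_billions, b.Access_to_clean_water
FROM Economic_Indicators AS e
JOIN Basic_Services AS b
  ON e.Country_name = b.Country_name AND e.Time_period = b.Time_period
WHERE e.Time_period = 2020
  AND e.Est_gdp_in_billions > {threshold}
  AND b.Access_to_clean_water < 90
ORDER BY e.Est_gdp_in_billions DESC
"""

# One step: the threshold is a scalar subquery inside the WHERE clause.
one_step = conn.execute(MAIN_QUERY.format(threshold=f"({AVG_2020})")).fetchall()

# Two steps: fetch the average first, then bind it as a query parameter.
(avg_2020,) = conn.execute(AVG_2020).fetchone()
two_step = conn.execute(MAIN_QUERY.format(threshold="?"), (avg_2020,)).fetchall()

print(avg_2020)
print(one_step)
print(two_step)
```

Both versions return the same rows; the subquery form simply keeps the threshold in sync with the data without a separate round trip. Two details are worth noticing in the toy data: country `D` has low water access but a NULL GDP, so the `>` comparison is never true for it and it is filtered out, and the subquery needs its own `Time_period = 2020` filter because the outer query's filter does not apply inside it.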


##  Summary

In this notebook, we learned to:
- Use a **subquery inside the WHERE clause** to filter data dynamically.
- Combine information from multiple tables using **JOIN**.
- Identify high-GDP countries where **access to clean water** is still below 90%.

This helps analysts spot gaps between a country’s economic performance and its population’s access to public services.
