# Subqueries in the SELECT Clause

In this notebook, we explore **subqueries** — queries nested inside other queries — to perform dynamic calculations in SQL.

We’ll use the **Access_to_Basic_Services** table from the **united_nations** database.

### Learning objectives
In this exercise, you’ll learn:
- How to use **subqueries** instead of static values for dynamic calculations.
- How to convert a **regular subquery** into a **correlated subquery** to perform calculations based on the current row.



### Scenario
We want to calculate the **percentage of total land area** that each country occupies within its **sub-region**.

To achieve this, we’ll:
1. Find the total land area for a given sub-region.
2. Calculate each country’s land area as a percentage of that total.
3. Use both static values and subqueries.
4. Finally, apply a **correlated subquery** to handle all regions dynamically.


In [1]:
%load_ext sql

In [12]:
%%sql

-- Step 1: Calculate the total land area of the 'Central Asia' sub-region
SELECT 
    Sub_region,
    SUM(Land_area) AS total_land_area
FROM 
    Access_to_Basic_Services
WHERE 
    Sub_region = 'Central Asia';


 * mysql+pymysql://root:***@localhost:3306/united_nations
1 rows affected.


Sub_region,total_land_area
Central Asia,22494091.0
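The outputs further down show an empty `Land_area` for Kyrgyzstan, so it helps to know how `SUM` treats missing values. In standard SQL (MySQL included), aggregates such as `SUM` skip `NULL` rows, and dividing `NULL` by a number yields `NULL`. The sketch below illustrates this with Python's built-in `sqlite3` module as a stand-in for MySQL; the table name `services` and its rows are hypothetical toy data, not the real table.

```python
import sqlite3

# Hypothetical toy table; names and numbers are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE services (Country_name TEXT, Sub_region TEXT, Land_area REAL)"
)
conn.executemany(
    "INSERT INTO services VALUES (?, ?, ?)",
    [
        ("A", "Region X", 100.0),
        ("B", "Region X", None),  # missing land area, stored as NULL
        ("C", "Region X", 300.0),
    ],
)

# SUM and COUNT(column) skip NULLs; COUNT(*) counts every row.
total, counted, rows = conn.execute(
    """
    SELECT SUM(Land_area), COUNT(Land_area), COUNT(*)
    FROM services
    WHERE Sub_region = 'Region X'
    """
).fetchone()
print(total, counted, rows)  # 400.0 2 3

# NULL divided by a number is still NULL (None in Python).
null_share = conn.execute(
    "SELECT Land_area / 400.0 * 100 FROM services WHERE Country_name = 'B'"
).fetchone()[0]
print(null_share)  # None
```

A sub-region total can therefore be an undercount when some countries have no recorded land area.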


## Step 2: Calculate Land Area Percentages Using a Static Value

This step hardcodes a total of **13,498,500.00** for 'Central Asia' and divides each country's land area by it to get a percentage share.  
Note that this value does not match the **22,494,091.0** returned by Step 1, so the percentages below are overstated. A static value has to be re-entered by hand every time the underlying total changes.


In [13]:
%%sql

SELECT 
    Country_name,
    Sub_region,
    Land_area,
    (Land_area / 13498500.00) * 100 AS pct_regional_land
FROM 
    Access_to_Basic_Services
WHERE 
    Sub_region = 'Central Asia'
-- GROUP BY with no aggregate collapses identical rows, like SELECT DISTINCT
GROUP BY 
    Country_name, Sub_region, Land_area;


 * mysql+pymysql://root:***@localhost:3306/united_nations
10 rows affected.


Country_name,Sub_region,Land_area,pct_regional_land
Kazakhstan,Central Asia,2699700.0,20.0
Kyrgyzstan,Central Asia,,
Tajikistan,Central Asia,138790.0,1.028188
Turkmenistan,Central Asia,469930.0,3.48135
Uzbekistan,Central Asia,440600.0,3.264066
Uzbekistan,Central Asia,440591.0,3.264
Uzbekistan,Central Asia,440588.0,3.263977
Uzbekistan,Central Asia,440587.0,3.26397
Uzbekistan,Central Asia,440555.0,3.263733
Uzbekistan,Central Asia,440650.0,3.264437
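The effect of the hardcoded divisor can be checked by hand. The short sketch below uses only numbers that appear in this notebook (Kazakhstan's `Land_area`, the divisor typed into the Step 2 query, and the total returned by Step 1) and compares the two resulting percentages. The variable names are illustrative.

```python
# Values copied from the notebook outputs above.
kazakhstan_area = 2699700.0
hardcoded_total = 13498500.0  # divisor typed into the Step 2 query
step1_total = 22494091.0      # total returned by the Step 1 query

pct_hardcoded = kazakhstan_area / hardcoded_total * 100
pct_step1 = kazakhstan_area / step1_total * 100

print(round(pct_hardcoded, 6))  # 20.0
print(round(pct_step1, 6))      # 12.001819
```

The two figures differ because the hardcoded divisor is not the total that Step 1 computed, which is exactly the kind of drift a subquery avoids.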


## Step 3: Calculate Land Area Percentages Using a Subquery (Dynamic)

Now, let’s make the query **dynamic** using a **subquery** to calculate the total land area for 'Central Asia' automatically.
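As a minimal, self-contained sketch of the idea, the snippet below runs the same pattern against an in-memory SQLite database using Python's built-in `sqlite3` module. The table `services`, the helper `regional_shares`, and the rows are hypothetical toy data; SQLite stands in for MySQL here, and the scalar-subquery syntax is the same in both.

```python
import sqlite3

# Hypothetical toy table; names and numbers are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE services (Country_name TEXT, Sub_region TEXT, Land_area REAL)"
)
conn.executemany(
    "INSERT INTO services VALUES (?, ?, ?)",
    [("A", "Region X", 100.0), ("B", "Region X", 300.0), ("C", "Region Y", 50.0)],
)


def regional_shares(connection):
    """Return each Region X country's share of the Region X total."""
    return connection.execute(
        """
        SELECT Country_name,
               Land_area / (SELECT SUM(Land_area)
                            FROM services
                            WHERE Sub_region = 'Region X') * 100
        FROM services
        WHERE Sub_region = 'Region X'
        ORDER BY Country_name
        """
    ).fetchall()


before = regional_shares(conn)
print(before)  # [('A', 25.0), ('B', 75.0)]

# Adding a row changes the total; the subquery picks this up with no edit.
conn.execute("INSERT INTO services VALUES ('D', 'Region X', 400.0)")
after = regional_shares(conn)
print(after)  # [('A', 12.5), ('B', 37.5), ('D', 50.0)]
```

Because the divisor is recomputed on every run, the percentages stay consistent with whatever data the table currently holds.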


In [14]:
%%sql

SELECT 
    Country_name,
    Sub_region,
    Land_area,
    (Land_area / 
        (SELECT SUM(Land_area)
         FROM Access_to_Basic_Services
         WHERE Sub_region = 'Central Asia')
    ) * 100 AS pct_regional_land
FROM 
    Access_to_Basic_Services
WHERE 
    Sub_region = 'Central Asia'
GROUP BY 
    Country_name, Sub_region, Land_area;


 * mysql+pymysql://root:***@localhost:3306/united_nations
10 rows affected.


Country_name,Sub_region,Land_area,pct_regional_land
Kazakhstan,Central Asia,2699700.0,12.001819
Kyrgyzstan,Central Asia,,
Tajikistan,Central Asia,138790.0,0.617006
Turkmenistan,Central Asia,469930.0,2.089126
Uzbekistan,Central Asia,440600.0,1.958737
Uzbekistan,Central Asia,440591.0,1.958697
Uzbekistan,Central Asia,440588.0,1.958683
Uzbekistan,Central Asia,440587.0,1.958679
Uzbekistan,Central Asia,440555.0,1.958537
Uzbekistan,Central Asia,440650.0,1.958959
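One caveat worth checking: the output lists Uzbekistan six times with slightly different areas, which suggests the table holds several rows per country. If that is the case (an assumption; perhaps one row per year), `SUM(Land_area)` over every row counts each country more than once, and the listed percentages no longer add up to 100. The sketch below reproduces the effect on hypothetical toy data with Python's built-in `sqlite3` module. The `Year` column and the table name `services` are assumptions made for illustration, not taken from the real table.

```python
import sqlite3

# Hypothetical toy table with two rows per country (one per assumed year).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE services (Country_name TEXT, Year INTEGER, Land_area REAL)"
)
conn.executemany(
    "INSERT INTO services VALUES (?, ?, ?)",
    [
        ("A", 2019, 100.0), ("A", 2020, 100.0),
        ("B", 2019, 300.0), ("B", 2020, 300.0),
    ],
)

# Denominator summed over every row: each country is counted twice.
all_rows = conn.execute(
    """
    SELECT DISTINCT Country_name,
           Land_area / (SELECT SUM(Land_area) FROM services) * 100
    FROM services
    ORDER BY Country_name
    """
).fetchall()

# Restricting both queries to a single year counts each country once.
one_year = conn.execute(
    """
    SELECT Country_name,
           Land_area / (SELECT SUM(Land_area)
                        FROM services
                        WHERE Year = 2020) * 100
    FROM services
    WHERE Year = 2020
    ORDER BY Country_name
    """
).fetchall()

print(all_rows)  # [('A', 12.5), ('B', 37.5)]
print(one_year)  # [('A', 25.0), ('B', 75.0)]
```

Whether a filter like this is appropriate depends on how the real table is structured, so it is worth inspecting the rows before relying on the totals.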


## Step 4: Calculate Land Area Percentages for All Sub-regions Using a Correlated Subquery

We’ll now make the query fully dynamic using a **correlated subquery**.  
This allows the query to automatically calculate land area percentages for **all sub-regions** without specifying any particular one.


In [15]:
%%sql

SELECT 
    outer_table.Country_name,
    outer_table.Sub_region,
    outer_table.Land_area,
    (outer_table.Land_area /
        (SELECT SUM(inner_table.Land_area)
         FROM Access_to_Basic_Services AS inner_table
         WHERE inner_table.Sub_region = outer_table.Sub_region)
    ) * 100 AS pct_regional_land
FROM 
    Access_to_Basic_Services AS outer_table
GROUP BY 
    outer_table.Country_name, outer_table.Sub_region, outer_table.Land_area;


 * mysql+pymysql://root:***@localhost:3306/united_nations
227 rows affected.


Country_name,Sub_region,Land_area,pct_regional_land
Kazakhstan,Central Asia,2699700.0,12.001819
Kyrgyzstan,Central Asia,,
Tajikistan,Central Asia,138790.0,0.617006
Turkmenistan,Central Asia,469930.0,2.089126
Uzbekistan,Central Asia,440600.0,1.958737
Uzbekistan,Central Asia,440591.0,1.958697
Uzbekistan,Central Asia,440588.0,1.958683
Uzbekistan,Central Asia,440587.0,1.958679
Uzbekistan,Central Asia,440555.0,1.958537
Uzbekistan,Central Asia,440650.0,1.958959
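To see the correlation at work on more than one sub-region, here is a small self-contained sketch using Python's built-in `sqlite3` module as a stand-in for MySQL. The table `services` and its rows are hypothetical toy data. For comparison, it also computes the same shares with a join against a per-region totals subquery, which is a different technique from the one this notebook teaches and is shown only as a cross-check.

```python
import sqlite3

# Hypothetical toy table with two sub-regions; values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE services (Country_name TEXT, Sub_region TEXT, Land_area REAL)"
)
conn.executemany(
    "INSERT INTO services VALUES (?, ?, ?)",
    [
        ("A", "Region X", 100.0), ("B", "Region X", 300.0),
        ("C", "Region Y", 50.0), ("D", "Region Y", 150.0),
    ],
)

# Correlated subquery: the inner WHERE refers to the outer row's Sub_region.
correlated = conn.execute(
    """
    SELECT o.Country_name,
           o.Land_area / (SELECT SUM(i.Land_area)
                          FROM services AS i
                          WHERE i.Sub_region = o.Sub_region) * 100
    FROM services AS o
    ORDER BY o.Country_name
    """
).fetchall()

# Alternative: compute every regional total once, then join it back.
joined = conn.execute(
    """
    SELECT o.Country_name, o.Land_area / t.total * 100
    FROM services AS o
    JOIN (SELECT Sub_region, SUM(Land_area) AS total
          FROM services
          GROUP BY Sub_region) AS t
      ON t.Sub_region = o.Sub_region
    ORDER BY o.Country_name
    """
).fetchall()

print(correlated)  # [('A', 25.0), ('B', 75.0), ('C', 25.0), ('D', 75.0)]
print(correlated == joined)  # True
```

The two forms agree on this data. One difference to keep in mind: rows whose `Sub_region` is `NULL` would be dropped by the inner join but kept, with a `NULL` percentage, by the correlated form.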


## Summary

- **Uncorrelated subquery:** Does not reference the outer query, so it can be evaluated once and returns a single value (e.g., the total for 'Central Asia').
- **Correlated subquery:** References a column from the outer query's current row (here, `Sub_region`), so it is conceptually re-evaluated for each row.
- Subqueries replace hardcoded values with values computed from the data, so the same query keeps working as the data changes and across all sub-regions.
