#  Conditional Calculations Using IF

##  Learning Objectives
By the end of this session, you should be able to:
- Use the **IF statement** (MySQL's `IF()` function) for conditional data manipulation.
- Apply **nested IF statements** to test multiple conditions in sequence.

##  Connecting to the MySQL Database
We’ll use the **Access_to_Basic_Services** table from the `united_nations` MySQL database to explore how conditional logic can fill in missing values and classify data.

If you’re running this in Jupyter Notebook, make sure the `ipython-sql` extension (loaded below with `%load_ext sql`) and the `pymysql` driver are installed, and that your connection parameters (username, password, database name) are valid.


In [1]:
%load_ext sql

##  Exercise 1 — Replace NULL Values for Central and Southern Asia

We’ll start by checking whether **Region** is `Central and Southern Asia` and whether `Pct_unemployment` is `NULL`.  
If both conditions are true, the query returns `19.59` in a computed column aliased as `New_pct_unemployment`.  
Otherwise, it returns `-999` as a placeholder. The `SELECT` only computes this column in the result set; the table itself is not modified.


In [3]:
%%sql

SELECT 
    Region,
    Country_name,
    IF(
        Region = 'Central and Southern Asia' AND Pct_unemployment IS NULL,
        19.59,
        -999
    ) AS New_pct_unemployment
FROM Access_to_Basic_Services;


 * mysql+pymysql://root:***@localhost:3306/united_nations
1048 rows affected.


Region,Country_name,New_pct_unemployment
Central and Southern Asia,Kazakhstan,-999.0
Central and Southern Asia,Kazakhstan,-999.0
Central and Southern Asia,Kazakhstan,-999.0
Central and Southern Asia,Kazakhstan,-999.0
Central and Southern Asia,Kazakhstan,-999.0
Central and Southern Asia,Kazakhstan,-999.0
Central and Southern Asia,Kyrgyzstan,19.59
Central and Southern Asia,Kyrgyzstan,19.59
Central and Southern Asia,Kyrgyzstan,19.59
Central and Southern Asia,Kyrgyzstan,19.59
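
To see why the condition above uses `IS NULL` rather than `= NULL`, the following minimal sketch evaluates `IF()` on literal values, so it needs no table. The column aliases are made up for illustration. In the notebook, it would go in a `%%sql` cell like the one above.

```sql
-- General form: IF(condition, value_if_true, value_if_false)
SELECT
    IF(1 < 2, 'true branch', 'false branch')        AS simple_comparison,
    -- NULL = NULL evaluates to NULL, which IF() does not treat as true
    IF(NULL = NULL, 'true branch', 'false branch')  AS equals_null,
    -- IS NULL is the test that actually detects a missing value
    IF(NULL IS NULL, 'true branch', 'false branch') AS is_null_test;
```

In MySQL, the first and third columns return `true branch` and the second returns `false branch`, which is why a comparison such as `Pct_unemployment = NULL` would never match a missing value.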


##  Exercise 2 — Replace NULL Values for Eastern and South-Eastern Asia

Now we extend our logic to include **Eastern and South-Eastern Asia**.  
We’ll use a **nested IF statement** to assign different values based on the region, while still using `-999` as the fallback for unmatched cases.


In [4]:
%%sql

SELECT 
    Region,
    Country_name,
    IF(
        Region = 'Central and Southern Asia' AND Pct_unemployment IS NULL,
        19.59,
        IF(
            Region = 'Eastern and South-Eastern Asia' AND Pct_unemployment IS NULL,
            22.64,
            -999
        )
    ) AS New_pct_unemployment
FROM Access_to_Basic_Services;


 * mysql+pymysql://root:***@localhost:3306/united_nations
1048 rows affected.


Region,Country_name,New_pct_unemployment
Central and Southern Asia,Kazakhstan,-999.0
Central and Southern Asia,Kazakhstan,-999.0
Central and Southern Asia,Kazakhstan,-999.0
Central and Southern Asia,Kazakhstan,-999.0
Central and Southern Asia,Kazakhstan,-999.0
Central and Southern Asia,Kazakhstan,-999.0
Central and Southern Asia,Kyrgyzstan,19.59
Central and Southern Asia,Kyrgyzstan,19.59
Central and Southern Asia,Kyrgyzstan,19.59
Central and Southern Asia,Kyrgyzstan,19.59
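
The rows displayed above all belong to Central and Southern Asia, so the newly added branch is not visible in them. One way to check it, sketched here on the assumption that the table contains rows for Eastern and South-Eastern Asia, is to filter on that region:

```sql
-- Same nested IF as above, restricted to the newly added region
SELECT
    Region,
    Country_name,
    IF(
        Region = 'Central and Southern Asia' AND Pct_unemployment IS NULL,
        19.59,
        IF(
            Region = 'Eastern and South-Eastern Asia' AND Pct_unemployment IS NULL,
            22.64,
            -999
        )
    ) AS New_pct_unemployment
FROM Access_to_Basic_Services
WHERE Region = 'Eastern and South-Eastern Asia'
LIMIT 10;  -- only a small sample is needed for a spot check
```

Rows in this region with a missing `Pct_unemployment` should show `22.64`, and rows with a recorded value should show `-999`.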


##  Exercise 3 — Replace NULL Values for Remaining Regions

We’ll now extend our nested IF logic to include all remaining regions.  
Each region will have its own default value assigned when `Pct_unemployment` is `NULL`, based on the data below:

| Region | Replacement Value |
|--------|-------------------|
| Europe and Northern America | 24.43 |
| Latin America and the Caribbean | 24.23 |
| Northern Africa and Western Asia | 17.84 |
| Oceania | 4.98 |
| Sub-Saharan Africa | 33.65 |

All other records keep the placeholder `-999`.
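
Before writing the longer expression, it can help to see how many values are actually missing in each region. A possible check is sketched below; the aliases `total_rows` and `missing_rows` are illustrative names, not part of the lesson.

```sql
-- COUNT(column) skips NULLs, so the difference is the number of missing values
SELECT
    Region,
    COUNT(*)                           AS total_rows,
    COUNT(*) - COUNT(Pct_unemployment) AS missing_rows
FROM Access_to_Basic_Services
GROUP BY Region
ORDER BY Region;
```

Regions with `missing_rows` equal to zero would not be affected by their replacement value at all.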


In [5]:
%%sql

SELECT 
    Region,
    Country_name,
    IF(Region = 'Central and Southern Asia' AND Pct_unemployment IS NULL, 19.59,
        IF(Region = 'Eastern and South-Eastern Asia' AND Pct_unemployment IS NULL, 22.64,
            IF(Region = 'Europe and Northern America' AND Pct_unemployment IS NULL, 24.43,
                IF(Region = 'Latin America and the Caribbean' AND Pct_unemployment IS NULL, 24.23,
                    IF(Region = 'Northern Africa and Western Asia' AND Pct_unemployment IS NULL, 17.84,
                        IF(Region = 'Oceania' AND Pct_unemployment IS NULL, 4.98,
                            IF(Region = 'Sub-Saharan Africa' AND Pct_unemployment IS NULL, 33.65,
                                -999
                            )
                        )
                    )
                )
            )
        )
    ) AS New_pct_unemployment
FROM Access_to_Basic_Services;


 * mysql+pymysql://root:***@localhost:3306/united_nations
1048 rows affected.


Region,Country_name,New_pct_unemployment
Central and Southern Asia,Kazakhstan,-999.0
Central and Southern Asia,Kazakhstan,-999.0
Central and Southern Asia,Kazakhstan,-999.0
Central and Southern Asia,Kazakhstan,-999.0
Central and Southern Asia,Kazakhstan,-999.0
Central and Southern Asia,Kazakhstan,-999.0
Central and Southern Asia,Kyrgyzstan,19.59
Central and Southern Asia,Kyrgyzstan,19.59
Central and Southern Asia,Kyrgyzstan,19.59
Central and Southern Asia,Kyrgyzstan,19.59
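
Seven levels of nesting are hard to read, and it is easy to misplace a closing parenthesis. As an alternative to this lesson's IF-based approach, the same logic can be sketched as a flat `CASE` expression, which MySQL also supports:

```sql
-- Each WHEN mirrors one level of the nested IF; ELSE is the final fallback
SELECT
    Region,
    Country_name,
    CASE
        WHEN Region = 'Central and Southern Asia'        AND Pct_unemployment IS NULL THEN 19.59
        WHEN Region = 'Eastern and South-Eastern Asia'   AND Pct_unemployment IS NULL THEN 22.64
        WHEN Region = 'Europe and Northern America'      AND Pct_unemployment IS NULL THEN 24.43
        WHEN Region = 'Latin America and the Caribbean'  AND Pct_unemployment IS NULL THEN 24.23
        WHEN Region = 'Northern Africa and Western Asia' AND Pct_unemployment IS NULL THEN 17.84
        WHEN Region = 'Oceania'                          AND Pct_unemployment IS NULL THEN 4.98
        WHEN Region = 'Sub-Saharan Africa'               AND Pct_unemployment IS NULL THEN 33.65
        ELSE -999
    END AS New_pct_unemployment
FROM Access_to_Basic_Services;
```

Conditions are evaluated from top to bottom and the first match wins, just as the nested IFs are evaluated from the outside in.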


##  Exercise 4 — Replace Placeholder Values

Finally, we’ll swap the `-999` fallback for the column itself, so that records that don’t meet any of the specified conditions  
keep their original **Pct_unemployment** value.


In [6]:
%%sql

SELECT 
    Region,
    Country_name,
    IF(Region = 'Central and Southern Asia' AND Pct_unemployment IS NULL, 19.59,
        IF(Region = 'Eastern and South-Eastern Asia' AND Pct_unemployment IS NULL, 22.64,
            IF(Region = 'Europe and Northern America' AND Pct_unemployment IS NULL, 24.43,
                IF(Region = 'Latin America and the Caribbean' AND Pct_unemployment IS NULL, 24.23,
                    IF(Region = 'Northern Africa and Western Asia' AND Pct_unemployment IS NULL, 17.84,
                        IF(Region = 'Oceania' AND Pct_unemployment IS NULL, 4.98,
                            IF(Region = 'Sub-Saharan Africa' AND Pct_unemployment IS NULL, 33.65,
                                Pct_unemployment
                            )
                        )
                    )
                )
            )
        )
    ) AS New_pct_unemployment
FROM Access_to_Basic_Services;


 * mysql+pymysql://root:***@localhost:3306/united_nations
1048 rows affected.


Region,Country_name,New_pct_unemployment
Central and Southern Asia,Kazakhstan,4.93
Central and Southern Asia,Kazakhstan,4.96
Central and Southern Asia,Kazakhstan,4.9
Central and Southern Asia,Kazakhstan,4.85
Central and Southern Asia,Kazakhstan,4.8
Central and Southern Asia,Kazakhstan,4.89
Central and Southern Asia,Kyrgyzstan,19.59
Central and Southern Asia,Kyrgyzstan,19.59
Central and Southern Asia,Kyrgyzstan,19.59
Central and Southern Asia,Kyrgyzstan,19.59
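
Because every branch tests `Pct_unemployment IS NULL`, the query can also be read as "use the original value if there is one, otherwise look up the regional default". The sketch below expresses that reading with `COALESCE` and a simple `CASE`, swapped in here in place of the nested IFs used in this lesson:

```sql
-- COALESCE returns its first non-NULL argument:
-- the original value when present, otherwise the regional default
SELECT
    Region,
    Country_name,
    COALESCE(
        Pct_unemployment,
        CASE Region
            WHEN 'Central and Southern Asia'        THEN 19.59
            WHEN 'Eastern and South-Eastern Asia'   THEN 22.64
            WHEN 'Europe and Northern America'      THEN 24.43
            WHEN 'Latin America and the Caribbean'  THEN 24.23
            WHEN 'Northern Africa and Western Asia' THEN 17.84
            WHEN 'Oceania'                          THEN 4.98
            WHEN 'Sub-Saharan Africa'               THEN 33.65
        END
    ) AS New_pct_unemployment
FROM Access_to_Basic_Services;
```

A row with a missing value and a region that is not listed stays `NULL`, which matches the behaviour of the nested IF version.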


##  Summary

In these exercises, we used the **IF** statement to conditionally replace missing values in our dataset.  
We started by assigning a default unemployment rate for a single region and then nested multiple IF conditions to cover every region.

Finally, we replaced the placeholder with the original column value for records that matched none of the conditions.  

This pattern is common in real-world data cleaning: it fills missing values according to explicit rules while leaving recorded values untouched, so the result is complete and ready for analysis.
