<div align="right" style=" font-size: 80%; text-align: center; margin: 0 auto">
<img src="https://raw.githubusercontent.com/Explore-AI/Pictures/master/ExploreAI_logos/Logo blue_dark.png"  style="width:25px" align="right";/>
</div>

# Transform columns using numeric functions
© ExploreAI Academy

In this notebook, we demonstrate how to use numeric functions to transform columns in a table in SQL.

> ⚠️ This notebook will not run on Google Colab because it cannot connect to a local database. Please make sure that this notebook is running on the same local machine as your MySQL Workbench installation and MySQL `united_nations` database.

## Learning objectives

By the end of this train, you should:
- Understand how to use the `SQRT`, `LOG`, and `ROUND` functions to transform columns in a table in SQL.

## Connecting to our MySQL database

Continuing with the numerical analysis of our `Access_to_Basic_Services` table, we want to understand the scope of our dataset, specifically its numerical columns. If we connect to our MySQL server, we can run the same queries in this notebook as in MySQL Workbench.
Since we have a MySQL database, we can connect to it using mysql and pymysql.

In [1]:
# Load and activate the SQL extension to allow us to execute SQL in a Jupyter notebook. 
# If you get an error here, make sure that mysql and pymysql are installed correctly. 

%load_ext sql

In [2]:
# Establish a connection to the local database using the '%sql' magic command.
# Replace 'password' with our connection password. The database name here is 'united_nations'.
# If you get an error here, please make sure the database name or password is correct.

%sql mysql+pymysql://root:password@localhost:3306/united_nations

'Connected: root@united_nations'
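
If the connection password contains characters such as `@` or `/`, they can clash with the URL syntax of the connection string. A minimal sketch, assuming we build the string in Python first: the helper name `build_connection_url` and the sample password are hypothetical, and `quote_plus` comes from Python's standard library.

```python
from urllib.parse import quote_plus


def build_connection_url(user, password, host, port, database):
    """Build a mysql+pymysql connection URL with a URL-encoded password."""
    # quote_plus escapes characters such as '@' and '/' that would otherwise
    # be read as URL delimiters.
    return f"mysql+pymysql://{user}:{quote_plus(password)}@{host}:{port}/{database}"


# Hypothetical password, used only for illustration.
url = build_connection_url("root", "p@ss/word", "localhost", 3306, "united_nations")
print(url)
```

The printed string has the same shape as the one used in the `%sql` cell above, with the special characters in the password escaped.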

To run a query, we add the `%%sql` command at the start of a cell, leave one blank line, write the query as shown below, and then run the cell.

In [3]:
%%sql

SELECT 
    *
FROM
    Access_to_Basic_Services
LIMIT 5;

 * mysql+pymysql://root:***@localhost:3306/united_nations
5 rows affected.


Region,Sub_region,Country_name,Time_period,Pct_managed_drinking_water_services,Pct_managed_sanitation_services,Est_population_in_millions,Est_gdp_in_billions,Land_area,Pct_unemployment
Central and Southern Asia,Central Asia,Kazakhstan,2015,94.67,98.0,17.542806,184.39,2699700.0,4.93
Central and Southern Asia,Central Asia,Kazakhstan,2016,94.67,98.0,17.794055,137.28,2699700.0,4.96
Central and Southern Asia,Central Asia,Kazakhstan,2017,95.0,98.0,18.037776,166.81,2699700.0,4.9
Central and Southern Asia,Central Asia,Kazakhstan,2018,95.0,98.0,18.276452,179.34,2699700.0,4.85
Central and Southern Asia,Central Asia,Kazakhstan,2019,95.0,98.0,18.513673,181.67,2699700.0,4.8


## Exercise

We want to determine the following:

1. What is the GDP per year for each country?
2. What are the rounded-off values of the `Est_gdp_in_billions` column?
3. What is the logarithm of GDP for each country over the time period?
4. What is the square root of GDP for each country over the time period?

### 1. What is the GDP per year for each country?

Calculate the GDP per year for each country using the `Country_name`, `Time_period`, and `Est_gdp_in_billions` columns.

In [10]:
%%sql
SELECT DISTINCT
    Country_name,
    Time_period,
    Est_gdp_in_billions AS GDP_per_year_per_country
FROM
    Access_to_Basic_Services
LIMIT 15;

 * mysql+pymysql://root:***@localhost:3306/united_nations
15 rows affected.


Country_name,Time_period,GDP_per_year_per_country
Kazakhstan,2015,184.39
Kazakhstan,2016,137.28
Kazakhstan,2017,166.81
Kazakhstan,2018,179.34
Kazakhstan,2019,181.67
Kazakhstan,2020,171.08
Kyrgyzstan,2015,
Kyrgyzstan,2016,
Kyrgyzstan,2017,
Kyrgyzstan,2018,


### 2. What are the rounded-off values of the `Est_gdp_in_billions` column?

When looking at many billion-dollar figures, the decimal places can be a little distracting. We can round off the numbers in the `Est_gdp_in_billions` column to make them more manageable.

Using the same query executed above, round off the values in the `Est_gdp_in_billions` column using the `ROUND` function.


In [13]:
%%sql
SELECT DISTINCT
    Country_name,
    Time_period,
    ROUND(Est_gdp_in_billions, 0) AS GDP_per_year_per_country
FROM
    Access_to_Basic_Services
LIMIT 15;

 * mysql+pymysql://root:***@localhost:3306/united_nations
15 rows affected.


Country_name,Time_period,GDP_per_year_per_country
Kazakhstan,2015,184.0
Kazakhstan,2016,137.0
Kazakhstan,2017,167.0
Kazakhstan,2018,179.0
Kazakhstan,2019,182.0
Kazakhstan,2020,171.0
Kyrgyzstan,2015,
Kyrgyzstan,2016,
Kyrgyzstan,2017,
Kyrgyzstan,2018,
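
The second argument of `ROUND` sets the number of decimal places to keep: `0` gives whole numbers, and a negative value rounds to the left of the decimal point. As a rough illustration outside the database, the Python sketch below applies the built-in `round` to Kazakhstan's 2015 figure from the output above. It is an analogy, not MySQL itself: the two can treat exact halves (such as 2.5) differently, so the sample value deliberately avoids ties.

```python
# Kazakhstan's 2015 value of Est_gdp_in_billions, taken from the output above.
gdp_2015 = 184.39

whole = round(gdp_2015, 0)        # comparable to ROUND(Est_gdp_in_billions, 0)
one_decimal = round(gdp_2015, 1)  # comparable to ROUND(Est_gdp_in_billions, 1)
tens = round(gdp_2015, -1)        # comparable to ROUND(Est_gdp_in_billions, -1)

print(whole, one_decimal, tens)
```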


### 3. What is the logarithm of GDP for each country over the time period?

To analyse how GDP grows over the time period, we can work with the logarithm of GDP. Logarithms turn large numbers into smaller, more digestible ones, and they capture proportional rather than absolute changes, which is often more meaningful when analysing economic growth rates. In MySQL, `LOG` with a single argument returns the natural logarithm.

Calculate the logarithm of the `Est_gdp_in_billions` column using the `LOG` function.

In [18]:
%%sql
SELECT DISTINCT
    Country_name,
    Time_period,
    ROUND(Est_gdp_in_billions, 0) AS GDP_per_year_per_country,
    LOG(Est_gdp_in_billions) AS Log_est_gdp_in_billions
FROM
    Access_to_Basic_Services
LIMIT 15;

 * mysql+pymysql://root:***@localhost:3306/united_nations
15 rows affected.


Country_name,Time_period,GDP_per_year_per_country,Log_est_gdp_in_billions
Kazakhstan,2015,184.0,5.217053079717073
Kazakhstan,2016,137.0,4.922022635739652
Kazakhstan,2017,167.0,5.116855440165964
Kazakhstan,2018,179.0,5.189283445523902
Kazakhstan,2019,182.0,5.202191854450653
Kazakhstan,2020,171.0,5.142131283358708
Kyrgyzstan,2015,,
Kyrgyzstan,2016,,
Kyrgyzstan,2017,,
Kyrgyzstan,2018,,
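
To see why logarithms suit growth comparisons, note that the difference between the natural logs of two values equals the log of their ratio, which is close to the proportional change when that change is small. The sketch below checks this in Python with Kazakhstan's 2018 and 2019 figures from the first query output in this notebook. `math.log` is the natural logarithm, like single-argument `LOG` in MySQL. The variable names are illustrative only.

```python
import math

# Est_gdp_in_billions for Kazakhstan, taken from the query output above.
gdp_2018 = 179.34
gdp_2019 = 181.67

# Difference of natural logs, i.e. log(gdp_2019 / gdp_2018).
log_growth = math.log(gdp_2019) - math.log(gdp_2018)

# Ordinary proportional change for comparison.
pct_growth = (gdp_2019 - gdp_2018) / gdp_2018

print(f"log difference: {log_growth:.4f}")
print(f"proportional change: {pct_growth:.4f}")
```

Both numbers come out at roughly 1.3%, which is why a difference in log GDP between two years can be read as an approximate growth rate.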


### 4. What is the square root of GDP for each country over the time period?

Alternatively, we could use the `SQRT` function to calculate the square roots of the values in the `Est_gdp_in_billions` column. This has a similar effect: it gives a smaller representation of these values.

Calculate the square root of the `Est_gdp_in_billions` column using the `SQRT` function.

In [17]:
%%sql
SELECT DISTINCT
    Country_name,
    Time_period,
    ROUND(Est_gdp_in_billions, 0) AS GDP_per_year_per_country,
    LOG(Est_gdp_in_billions) AS Log_est_gdp_in_billions,
    SQRT(Est_gdp_in_billions) AS SQRT_est_gdp_in_billions
FROM
    Access_to_Basic_Services
LIMIT 15;

 * mysql+pymysql://root:***@localhost:3306/united_nations
15 rows affected.


Country_name,Time_period,GDP_per_year_per_country,Log_est_gdp_in_billions,SQRT_est_gdp_in_billions
Kazakhstan,2015,184.0,5.217053079717073,13.579027947537334
Kazakhstan,2016,137.0,4.922022635739652,11.716654812701448
Kazakhstan,2017,167.0,5.116855440165964,12.915494570476191
Kazakhstan,2018,179.0,5.189283445523902,13.391788528796294
Kazakhstan,2019,182.0,5.202191854450653,13.478501400378308
Kazakhstan,2020,171.0,5.142131283358708,13.079755349393963
Kyrgyzstan,2015,,,
Kyrgyzstan,2016,,,
Kyrgyzstan,2017,,,
Kyrgyzstan,2018,,,
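
Both transformations shrink large values, but not equally: for figures of this size, the square root compresses less strongly than the logarithm. A small Python sketch, using the highest and lowest of Kazakhstan's values from the first query output in this notebook, compares how far apart they remain after each transformation. The `spread` helper is hypothetical and only for illustration.

```python
import math

# Highest (2015) and lowest (2016) Est_gdp_in_billions values for Kazakhstan
# in the query output above.
gdp_high = 184.39
gdp_low = 137.28


def spread(transform):
    """Return the ratio of the transformed high value to the transformed low value."""
    return transform(gdp_high) / transform(gdp_low)


raw_spread = spread(lambda x: x)  # no transformation
sqrt_spread = spread(math.sqrt)   # comparable to SQRT(...)
log_spread = spread(math.log)     # comparable to LOG(...)

print(raw_spread, sqrt_spread, log_spread)
```

The closer a ratio is to 1, the more the transformation has pulled the two values together.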


### Summary

We can also combine all of our queries into a single query, so that one result set includes all of the values.

In [19]:
%%sql

SELECT
    Country_name,
    Time_period,
    ROUND(Est_gdp_in_billions) AS GDP_per_year_per_country,
    LOG(Est_gdp_in_billions) AS Log_est_gdp_in_billions,
    SQRT(Est_gdp_in_billions) AS SQRT_est_gdp_in_billions
FROM
    Access_to_Basic_Services;


 * mysql+pymysql://root:***@localhost:3306/united_nations
1048 rows affected.


Country_name,Time_period,GDP_per_year_per_country,Log_est_gdp_in_billions,SQRT_est_gdp_in_billions
Kazakhstan,2015,184.0,5.217053079717073,13.579027947537334
Kazakhstan,2016,137.0,4.922022635739652,11.716654812701448
Kazakhstan,2017,167.0,5.116855440165964,12.915494570476191
Kazakhstan,2018,179.0,5.189283445523902,13.391788528796294
Kazakhstan,2019,182.0,5.202191854450653,13.478501400378308
Kazakhstan,2020,171.0,5.142131283358708,13.079755349393963
Kyrgyzstan,2015,,,
Kyrgyzstan,2016,,,
Kyrgyzstan,2017,,,
Kyrgyzstan,2018,,,
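
The empty cells for Kyrgyzstan in the output above appear to be missing (`NULL`) GDP values. In MySQL, `ROUND`, `LOG`, and `SQRT` each return `NULL` for a `NULL` input rather than raising an error. The Python sketch below mirrors the combined query on two rows copied from that output, with `None` standing in for `NULL`. The `transform_row` helper and the dictionary keys are illustrative, not part of the notebook's database.

```python
import math


def transform_row(country, year, gdp):
    """Apply the three transformations, passing missing values through as None."""
    if gdp is None:
        return {"country": country, "year": year,
                "rounded": None, "log": None, "sqrt": None}
    return {"country": country, "year": year,
            "rounded": float(round(gdp)),  # comparable to ROUND(gdp)
            "log": math.log(gdp),          # comparable to LOG(gdp), natural log
            "sqrt": math.sqrt(gdp)}        # comparable to SQRT(gdp)


rows = [("Kazakhstan", 2015, 184.39), ("Kyrgyzstan", 2015, None)]
results = [transform_row(*row) for row in rows]

for result in results:
    print(result)
```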


<div align="center" style=" font-size: 80%; text-align: center; margin: 0 auto">
<img src="https://raw.githubusercontent.com/Explore-AI/Pictures/master/ExploreAI_logos/EAI_Blue_Dark.png"  style="width:200px";/>
</div>