In [1]:
# Load and activate the SQL extension to allow us to execute SQL in a Jupyter notebook. 
# If you get an error here, make sure that mysql and pymysql are installed correctly. 

%load_ext sql

In [2]:
# Establish a connection to the local database using the '%sql' magic command.
# The connection URL has the form mysql+pymysql://user:password@host:port/db_name.
# Replace the password and the database name below with your own.
# If you get an error here, check that the database name and password are correct.

%sql mysql+pymysql://root:ofge@localhost:3306/united_nations
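
Typing the password directly into a cell makes it easy to leak when the notebook is shared. As a minimal sketch, the same connection URL can be assembled in Python from an environment variable. The variable name `MYSQL_PASSWORD` and the helper `build_mysql_url` are hypothetical and not part of the original notebook.

```python
import os
from urllib.parse import quote_plus


def build_mysql_url(user, password, host="localhost", port=3306, db_name="united_nations"):
    """Return a connection URL of the form mysql+pymysql://user:password@host:port/db_name.

    The password is percent-encoded so that characters such as '@' or '/'
    do not break the structure of the URL.
    """
    return f"mysql+pymysql://{user}:{quote_plus(password)}@{host}:{port}/{db_name}"


# MYSQL_PASSWORD is a hypothetical variable name; "password" is only a placeholder fallback.
connection_url = build_mysql_url("root", os.environ.get("MYSQL_PASSWORD", "password"))
```

How the resulting string is handed to the `%sql` magic depends on the installed SQL extension and its version, so check its documentation before relying on variable expansion.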

## Exercise
We can use the following command to check the data types of all the columns in our table.

In [6]:
%%sql

SHOW COLUMNS
FROM
    united_nations.Access_to_Basic_Services;

 * mysql+pymysql://root:***@localhost:3306/united_nations
10 rows affected.


Field,Type,Null,Key,Default,Extra
Region,varchar(32),YES,,,
Sub_region,varchar(25),YES,,,
Country_name,varchar(37),NO,,,
Time_period,int,NO,,,
Pct_managed_drinking_water_services,"decimal(5,2)",YES,,,
Pct_managed_sanitation_services,"decimal(5,2)",YES,,,
Est_population_in_millions,"decimal(11,6)",YES,,,
Est_gdp_in_billions,"decimal(8,2)",YES,,,
Land_area,"decimal(10,2)",YES,,,
Pct_unemployment,"decimal(5,2)",YES,,,


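We can see that each column is stored with a suitable data type: VARCHAR(37) for Country_name, INT for Time_period, and DECIMAL(11,6) for Est_population_in_millions. In DECIMAL(p,s), p is the precision (the total number of digits) and s is the scale (the number of digits after the decimal point).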
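
To make the meaning of precision and scale concrete, here is a small Python sketch that reads a type string as reported by SHOW COLUMNS and works out the largest value it can hold. The helper name `decimal_range` is hypothetical and only for illustration.

```python
import re
from decimal import Decimal


def decimal_range(type_name):
    """Return (precision, scale, max_value) for a type string such as 'decimal(11,6)'."""
    match = re.fullmatch(r"decimal\((\d+),(\d+)\)", type_name.strip().lower())
    if match is None:
        raise ValueError(f"not a DECIMAL(p,s) type: {type_name!r}")
    precision, scale = int(match.group(1)), int(match.group(2))
    # Precision counts all digits; scale counts the digits after the decimal point.
    integer_digits = precision - scale
    # Largest value: every digit is 9, e.g. 999.99 for DECIMAL(5,2).
    max_value = Decimal(10) ** integer_digits - Decimal(1).scaleb(-scale)
    return precision, scale, max_value


# Type strings copied from the SHOW COLUMNS output above.
for column_type in ("decimal(5,2)", "decimal(11,6)", "decimal(8,2)"):
    print(column_type, decimal_range(column_type))
```

For example, the percentage columns declared as DECIMAL(5,2) leave three digits before the decimal point, which is enough to hold 100.00.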

The following query returns the estimated population for each distinct combination of country and year. LIMIT 10 restricts the output to the first ten rows.

In [7]:
%%sql

SELECT DISTINCT
    Country_name,
    Time_period,
    Est_population_in_millions
FROM
    united_nations.Access_to_Basic_Services
LIMIT 10;

 * mysql+pymysql://root:***@localhost:3306/united_nations
10 rows affected.


Country_name,Time_period,Est_population_in_millions
Kazakhstan,2015,17.542806
Kazakhstan,2016,17.794055
Kazakhstan,2017,18.037776
Kazakhstan,2018,18.276452
Kazakhstan,2019,18.513673
Kazakhstan,2020,18.755666
Kyrgyzstan,2015,
Kyrgyzstan,2016,
Kyrgyzstan,2017,
Kyrgyzstan,2018,


## 2. Convert to the DECIMAL data type with the preferred scale and precision
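We use the CAST function to convert the Est_population_in_millions column from DECIMAL(11,6) to DECIMAL(6,2), that is, at most six digits in total with two after the decimal point. The converted values are rounded to two decimal places and returned in a new result column, Est_population_in_millions_2dp. The cast only affects the query output; the data stored in the table is unchanged, and rows with no population estimate (NULL) stay empty.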

In [9]:
%%sql

SELECT DISTINCT
    Country_name,
    Time_period,
    CAST(Est_population_in_millions AS DECIMAL(6,2)) AS Est_population_in_millions_2dp
FROM
    united_nations.Access_to_Basic_Services
LIMIT 20;

 * mysql+pymysql://root:***@localhost:3306/united_nations
20 rows affected.


Country_name,Time_period,Est_population_in_millions_2dp
Kazakhstan,2015,17.54
Kazakhstan,2016,17.79
Kazakhstan,2017,18.04
Kazakhstan,2018,18.28
Kazakhstan,2019,18.51
Kazakhstan,2020,18.76
Kyrgyzstan,2015,
Kyrgyzstan,2016,
Kyrgyzstan,2017,
Kyrgyzstan,2018,
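
The rounding seen in the result can be reproduced outside the database. The following Python sketch mimics the cast to DECIMAL(6,2) for a single value, assuming round-half-up behaviour. The helper name `cast_to_decimal` is hypothetical, and the out-of-range handling is an assumption, since what MySQL does in that case depends on the SQL mode.

```python
from decimal import Decimal, ROUND_HALF_UP


def cast_to_decimal(value, precision=6, scale=2):
    """Mimic CAST(value AS DECIMAL(precision, scale)) for one value.

    None stands in for SQL NULL and is passed through unchanged.
    """
    if value is None:
        return None
    quantum = Decimal(1).scaleb(-scale)  # 0.01 when scale is 2
    rounded = Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)
    # Assumption: reject values that need more integer digits than the type allows.
    if abs(rounded) >= Decimal(10) ** (precision - scale):
        raise ValueError(f"{value} does not fit in DECIMAL({precision},{scale})")
    return rounded


# Sample values taken from the Est_population_in_millions output above.
for population in ("17.542806", "18.037776", "18.755666", None):
    print(population, "->", cast_to_decimal(population))
```

The Kazakhstan values printed by this loop match the Est_population_in_millions_2dp column returned by the query, and the `None` entry stays empty just like the Kyrgyzstan rows.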
