<div align="right" style=" font-size: 80%; text-align: center; margin: 0 auto">
<img
 src="https://raw.githubusercontent.com/Explore-AI/Pictures/master/alx-courses/aice/assets/Content_page_banner_blue_dots.png"
 alt="ALX Content Header"
 class="full-width-image"
/>
</div>

# SELECT and SELECT WHERE

In this walk-through we demonstrate how to get data out of a table using the `SELECT` statement.
We also show how to filter data using the `WHERE` clause.



> ⚠️ This notebook will not run on Google Colab because it cannot connect to a local database. Please make sure that this notebook is running on the same local machine as your MySQL Workbench installation and MySQL `united_nations` database.

## Learning objectives

In this train, we will learn:
- How to use SELECT and SELECT DISTINCT to select columns.
- How to use WHERE to filter data based on a condition.
- How to save result sets as new tables.

## Connecting to our MySQL database

We want to answer some questions about the `Access_to_Basic_Services` table in the `united_nations` database we created in MySQL Workbench. Once we connect to our MySQL server by running the cells below, we can run the same queries in this notebook that we used in MySQL Workbench.


In [1]:
# Load and activate the SQL extension to allow us to execute SQL in a Jupyter notebook. 
# If you get an error here, make sure that mysql and pymysql are installed correctly. 

%load_ext sql

In [8]:
# Establish a connection to the local database using the '%sql' magic command.
# Replace 'password' with our connection password, and 'united_nations' with our database name if it differs.
# If you get an error here, please make sure the database name or password is correct.

%sql mysql+pymysql://root:password@localhost:3306/united_nations
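
A password that contains characters such as `@` or `/` has to be percent-encoded before it goes into a connection URL, otherwise the URL is parsed incorrectly. The sketch below shows one way to assemble the URL in Python. The helper name `build_mysql_url` and the placeholder password are assumptions made for illustration and are not part of the original notebook.

```python
from urllib.parse import quote_plus


def build_mysql_url(user, password, host="localhost", port=3306, database="united_nations"):
    """Assemble a SQLAlchemy-style URL for the pymysql driver (hypothetical helper)."""
    # Percent-encode the password so characters like '@' or '/' do not break the URL.
    return f"mysql+pymysql://{user}:{quote_plus(password)}@{host}:{port}/{database}"


# 'p@ss/word' is a made-up placeholder, not a real credential.
connection_url = build_mysql_url("root", "p@ss/word")
print(connection_url)
```

The resulting string has the same shape as the URL passed to `%sql` above, with the special characters in the password escaped.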


To run a query, we put the `%%sql` cell magic on the first line of a cell, leave one blank line, write the query as shown below, and run the cell.

In [24]:
# Remove the cap on the number of rows displayed in query results.
%config SqlMagic.displaylimit = None

In [9]:
%%sql

SELECT 
    *
FROM
    Access_to_Basic_Services
LIMIT 5;

Region,Sub_region,Country_name,Time_period,Pct_managed_drinking_water_services,Pct_managed_sanitation_services,Est_population_in_millions,Est_gdp_in_billions,Land_area,Pct_unemployment
Central and Southern Asia,Central Asia,Kazakhstan,2015,94.67,98.0,17.542806,184.39,2699700.0,4.93
Central and Southern Asia,Central Asia,Kazakhstan,2016,94.67,98.0,17.794055,137.28,2699700.0,4.96
Central and Southern Asia,Central Asia,Kazakhstan,2017,95.0,98.0,18.037776,166.81,2699700.0,4.9
Central and Southern Asia,Central Asia,Kazakhstan,2018,95.0,98.0,18.276452,179.34,2699700.0,4.85
Central and Southern Asia,Central Asia,Kazakhstan,2019,95.0,98.0,18.513673,181.67,2699700.0,4.8


## Exercise


Suppose we want to find out which country had the lowest percentage of people with access to managed drinking water services in 2020.

### 1. Exploring the database

Use the `SELECT` statement to display all the columns from the `Access_to_Basic_Services` table. This will help us get a feel for the data we're working with. 

In [25]:
%%sql

SELECT 
    *
FROM
    Access_to_Basic_Services
LIMIT 100; 

Region,Sub_region,Country_name,Time_period,Pct_managed_drinking_water_services,Pct_managed_sanitation_services,Est_population_in_millions,Est_gdp_in_billions,Land_area,Pct_unemployment
Central and Southern Asia,Central Asia,Kazakhstan,2015,94.67,98.0,17.542806,184.39,2699700.0,4.93
Central and Southern Asia,Central Asia,Kazakhstan,2016,94.67,98.0,17.794055,137.28,2699700.0,4.96
Central and Southern Asia,Central Asia,Kazakhstan,2017,95.0,98.0,18.037776,166.81,2699700.0,4.9
Central and Southern Asia,Central Asia,Kazakhstan,2018,95.0,98.0,18.276452,179.34,2699700.0,4.85
Central and Southern Asia,Central Asia,Kazakhstan,2019,95.0,98.0,18.513673,181.67,2699700.0,4.8
Central and Southern Asia,Central Asia,Kazakhstan,2020,95.0,98.0,18.755666,171.08,2699700.0,4.89
Central and Southern Asia,Central Asia,Kyrgyzstan,2015,89.67,96.67,,,,
Central and Southern Asia,Central Asia,Kyrgyzstan,2016,90.33,96.67,,,,
Central and Southern Asia,Central Asia,Kyrgyzstan,2017,91.0,97.33,,,,
Central and Southern Asia,Central Asia,Kyrgyzstan,2018,91.33,97.33,,,,


The previous query may return a large number of rows, which could slow down our system. Modify the query to limit the number of rows returned to 10.

In [12]:
%%sql

SELECT 
    *
FROM
    Access_to_Basic_Services
LIMIT 10; 

Region,Sub_region,Country_name,Time_period,Pct_managed_drinking_water_services,Pct_managed_sanitation_services,Est_population_in_millions,Est_gdp_in_billions,Land_area,Pct_unemployment
Central and Southern Asia,Central Asia,Kazakhstan,2015,94.67,98.0,17.542806,184.39,2699700.0,4.93
Central and Southern Asia,Central Asia,Kazakhstan,2016,94.67,98.0,17.794055,137.28,2699700.0,4.96
Central and Southern Asia,Central Asia,Kazakhstan,2017,95.0,98.0,18.037776,166.81,2699700.0,4.9
Central and Southern Asia,Central Asia,Kazakhstan,2018,95.0,98.0,18.276452,179.34,2699700.0,4.85
Central and Southern Asia,Central Asia,Kazakhstan,2019,95.0,98.0,18.513673,181.67,2699700.0,4.8
Central and Southern Asia,Central Asia,Kazakhstan,2020,95.0,98.0,18.755666,171.08,2699700.0,4.89
Central and Southern Asia,Central Asia,Kyrgyzstan,2015,89.67,96.67,,,,
Central and Southern Asia,Central Asia,Kyrgyzstan,2016,90.33,96.67,,,,
Central and Southern Asia,Central Asia,Kyrgyzstan,2017,91.0,97.33,,,,
Central and Southern Asia,Central Asia,Kyrgyzstan,2018,91.33,97.33,,,,
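
To see the effect of `LIMIT` without a MySQL connection, the following sketch runs the same kind of query against an in-memory SQLite database from Python. SQLite is used here only as a stand-in (an assumption), the table is reduced to three columns, and the sample rows are copied from the output above.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Access_to_Basic_Services ("
    "Country_name TEXT, Time_period INTEGER, "
    "Pct_managed_drinking_water_services REAL)"
)

# A few rows copied from the query output above.
sample_rows = [
    ("Kazakhstan", 2015, 94.67),
    ("Kazakhstan", 2016, 94.67),
    ("Kazakhstan", 2017, 95.0),
    ("Kyrgyzstan", 2015, 89.67),
    ("Kyrgyzstan", 2016, 90.33),
]
conn.executemany("INSERT INTO Access_to_Basic_Services VALUES (?, ?, ?)", sample_rows)

# Without LIMIT every row comes back; with LIMIT 3 at most three rows do.
all_rows = conn.execute("SELECT * FROM Access_to_Basic_Services").fetchall()
limited_rows = conn.execute("SELECT * FROM Access_to_Basic_Services LIMIT 3").fetchall()

print(len(all_rows))      # row count without LIMIT
print(len(limited_rows))  # row count with LIMIT 3
```

Without an `ORDER BY` clause, which rows `LIMIT` returns is not guaranteed, so only the row count is compared here.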


### 2. Unique country names
Extract a list of unique country names in the database.

In [13]:
%%sql

SELECT DISTINCT
    Country_name
FROM
    Access_to_Basic_Services
LIMIT 10;

Country_name
Kazakhstan
Kyrgyzstan
Tajikistan
Turkmenistan
Uzbekistan
Afghanistan
Bangladesh
Bhutan
India
Iran (Islamic Republic of)
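
The difference between `SELECT` and `SELECT DISTINCT` can be checked on a small sample. This is a minimal sketch that assumes an in-memory SQLite database in place of MySQL and uses a handful of country and year pairs taken from the earlier output.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Access_to_Basic_Services (Country_name TEXT, Time_period INTEGER)")
conn.executemany(
    "INSERT INTO Access_to_Basic_Services VALUES (?, ?)",
    [
        ("Kazakhstan", 2015),
        ("Kazakhstan", 2016),
        ("Kazakhstan", 2017),
        ("Kyrgyzstan", 2015),
        ("Kyrgyzstan", 2016),
    ],
)

# A plain SELECT returns one value per row, so country names repeat.
with_duplicates = [
    row[0] for row in conn.execute("SELECT Country_name FROM Access_to_Basic_Services")
]
# SELECT DISTINCT collapses the repeats into one entry per country.
unique_names = [
    row[0] for row in conn.execute("SELECT DISTINCT Country_name FROM Access_to_Basic_Services")
]

print(len(with_duplicates))
print(sorted(unique_names))
```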


Create a new table called `Country_list` and save the unique country names into this table.

In [29]:
%%sql

CREATE TABLE Country_list(Country VARCHAR(255));
INSERT INTO Country_list(Country)
SELECT DISTINCT 
    Country_name 
FROM 
    united_nations.Access_to_Basic_Services;
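
Running a `CREATE TABLE` cell a second time fails because the table already exists. One possible way to make the step repeatable is to drop the table first, as in the sketch below. It assumes an in-memory SQLite database in place of MySQL, and `save_country_list` is a hypothetical helper name.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Access_to_Basic_Services (Country_name TEXT, Time_period INTEGER)")
conn.executemany(
    "INSERT INTO Access_to_Basic_Services VALUES (?, ?)",
    [("Kazakhstan", 2015), ("Kazakhstan", 2016), ("Kyrgyzstan", 2015), ("Kyrgyzstan", 2016)],
)


def save_country_list(connection):
    """Rebuild Country_list from the distinct country names (hypothetical helper)."""
    # Dropping first makes the step safe to re-run; CREATE TABLE alone
    # raises an error when the table already exists.
    connection.execute("DROP TABLE IF EXISTS Country_list")
    connection.execute("CREATE TABLE Country_list (Country VARCHAR(255))")
    connection.execute(
        "INSERT INTO Country_list (Country) "
        "SELECT DISTINCT Country_name FROM Access_to_Basic_Services"
    )


save_country_list(conn)
save_country_list(conn)  # the second run neither fails nor duplicates rows

saved = sorted(row[0] for row in conn.execute("SELECT Country FROM Country_list"))
print(saved)
```

Note that `DROP TABLE IF EXISTS` discards whatever the table held before, so it suits derived tables like this one that can always be rebuilt from the source data.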

### 3. Selecting specific fields

Select the `country_name`, `time_period`, and `pct_managed_drinking_water_services` fields from the `Access_to_Basic_Services` table.

In [14]:
%%sql

SELECT 
    country_name, time_period, pct_managed_drinking_water_services
FROM
    Access_to_Basic_Services
LIMIT 20;

country_name,time_period,pct_managed_drinking_water_services
Kazakhstan,2015,94.67
Kazakhstan,2016,94.67
Kazakhstan,2017,95.0
Kazakhstan,2018,95.0
Kazakhstan,2019,95.0
Kazakhstan,2020,95.0
Kyrgyzstan,2015,89.67
Kyrgyzstan,2016,90.33
Kyrgyzstan,2017,91.0
Kyrgyzstan,2018,91.33


Rename the `pct_managed_drinking_water_services` field to `pct_access_to_water` in your query results.

In [28]:
%%sql

SELECT 
    country_name, 
    time_period, 
    pct_managed_drinking_water_services AS pct_access_to_water
FROM
    Access_to_Basic_Services
LIMIT 20;
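
An alias written with `AS` only relabels the column in the query results; the column name stored in the table stays the same. The sketch below checks both, assuming an in-memory SQLite database in place of MySQL. `PRAGMA table_info` is SQLite-specific and is used here only to list the table's stored column names.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Access_to_Basic_Services ("
    "Country_name TEXT, Time_period INTEGER, "
    "Pct_managed_drinking_water_services REAL)"
)
conn.execute("INSERT INTO Access_to_Basic_Services VALUES ('Kazakhstan', 2015, 94.67)")

cursor = conn.execute(
    "SELECT Country_name, Time_period, "
    "Pct_managed_drinking_water_services AS pct_access_to_water "
    "FROM Access_to_Basic_Services"
)
# Column labels of the result set: the last one carries the alias.
result_columns = [column[0] for column in cursor.description]

# Column names stored in the table itself (SQLite-specific PRAGMA): unchanged.
table_columns = [
    row[1] for row in conn.execute("PRAGMA table_info(Access_to_Basic_Services)")
]

print(result_columns[-1])
print(table_columns)
```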

### 4. Filtering and sorting data

Modify your query to only display data for the year `2020`.

In [20]:
%%sql
SELECT 
    *
FROM
    Access_to_Basic_Services
WHERE 
    Time_period = 2020
LIMIT 20;

Region,Sub_region,Country_name,Time_period,Pct_managed_drinking_water_services,Pct_managed_sanitation_services,Est_population_in_millions,Est_gdp_in_billions,Land_area,Pct_unemployment
Central and Southern Asia,Central Asia,Kazakhstan,2020,95.0,98.0,18.755666,171.08,2699700.0,4.89
Central and Southern Asia,Central Asia,Kyrgyzstan,2020,92.67,97.67,,,,
Central and Southern Asia,Central Asia,Tajikistan,2020,85.0,96.33,9.543207,8.13,138790.0,
Central and Southern Asia,Central Asia,Turkmenistan,2020,100.0,99.33,6.250438,,469930.0,
Central and Southern Asia,Central Asia,Uzbekistan,2020,98.0,100.0,34.23205,59.89,440650.0,5.29
Central and Southern Asia,Southern Asia,Afghanistan,2020,80.33,54.0,38.97223,20.14,652230.0,11.71
Central and Southern Asia,Southern Asia,Bangladesh,2020,97.67,54.0,167.420951,373.9,130170.0,
Central and Southern Asia,Southern Asia,Bhutan,2020,97.33,76.67,0.772506,2.33,38140.0,5.03
Central and Southern Asia,Southern Asia,India,2020,91.0,72.33,1396.387127,2667.69,2973190.0,7.86
Central and Southern Asia,Southern Asia,Iran (Islamic Republic of),2020,96.67,88.33,,,,
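
The `WHERE` clause keeps only the rows for which the condition is true. As a minimal sketch, assuming an in-memory SQLite database in place of MySQL and a few rows copied from the outputs above, the filter on `Time_period` behaves like this:

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Access_to_Basic_Services ("
    "Country_name TEXT, Time_period INTEGER, "
    "Pct_managed_drinking_water_services REAL)"
)
conn.executemany(
    "INSERT INTO Access_to_Basic_Services VALUES (?, ?, ?)",
    [
        ("Kazakhstan", 2019, 95.0),
        ("Kazakhstan", 2020, 95.0),
        ("Kyrgyzstan", 2018, 91.33),
        ("Kyrgyzstan", 2020, 92.67),
        ("Tajikistan", 2020, 85.0),
    ],
)

# Only rows whose Time_period equals 2020 pass the filter.
rows_2020 = conn.execute(
    "SELECT Country_name, Time_period FROM Access_to_Basic_Services WHERE Time_period = 2020"
).fetchall()

print(len(rows_2020))
print(sorted(name for name, _ in rows_2020))
```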


To find the country with the lowest access, sort the 2020 results by `pct_access_to_water` in ascending order and limit the number of rows returned to 10.

In [23]:
%%sql

SELECT 
    country_name, 
    time_period, 
    pct_managed_drinking_water_services AS pct_access_to_water
FROM 
    united_nations.Access_to_Basic_Services 
WHERE 
    Time_period = 2020
ORDER BY pct_access_to_water
LIMIT 10;

country_name,time_period,pct_access_to_water
Central African Republic,2020,38.33
Democratic Republic of the Congo,2020,47.67
South Sudan,2020,48.33
Angola,2020,52.33
Chad,2020,52.67
Burkina Faso,2020,53.33
Madagascar,2020,56.33
Papua New Guinea,2020,56.67
Somalia,2020,57.33
Niger,2020,57.33
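
Because `ORDER BY` sorts in ascending order by default, combining it with `LIMIT 1` returns the single lowest value. The sketch below illustrates this on a few rows copied from the outputs above, assuming an in-memory SQLite database in place of MySQL.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Access_to_Basic_Services ("
    "Country_name TEXT, Time_period INTEGER, "
    "Pct_managed_drinking_water_services REAL)"
)
conn.executemany(
    "INSERT INTO Access_to_Basic_Services VALUES (?, ?, ?)",
    [
        ("Kazakhstan", 2020, 95.0),
        ("Afghanistan", 2020, 80.33),
        ("Central African Republic", 2020, 38.33),
        ("South Sudan", 2020, 48.33),
        ("Kyrgyzstan", 2018, 91.33),
    ],
)

# Filter to 2020, sort ascending on the alias, and keep only the first row.
lowest = conn.execute(
    "SELECT Country_name, Pct_managed_drinking_water_services AS pct_access_to_water "
    "FROM Access_to_Basic_Services "
    "WHERE Time_period = 2020 "
    "ORDER BY pct_access_to_water ASC "
    "LIMIT 1"
).fetchone()

print(lowest)
```

Replacing `ASC` with `DESC` would return the country with the highest value instead.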


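And there is the answer in the first row of the result: 

In [None]:
#Answer: Central African Republic, with 38.33% access to managed drinking water services in 2020.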

## Solutions

### 1. Exploring the database

Use the `SELECT` statement to display all the columns from the `Access_to_Basic_Services` table. This will help us get a feel for the data we're working with.

In [None]:
%%sql

SELECT 
    * 
FROM 
    united_nations.Access_to_Basic_Services
LIMIT 30;

The previous query may return a large number of rows, which could slow down our system. Modify the query to limit the number of rows returned to 10.

In [None]:
%%sql

SELECT 
    * 
FROM 
    united_nations.Access_to_Basic_Services
LIMIT 10;

### 2. Unique country names
Extract a list of unique country names in the database.

In [None]:
%%sql

SELECT DISTINCT 
    Country_name 
FROM 
    united_nations.Access_to_Basic_Services
LIMIT 20;

Create a new table called `Country_list` and save the unique country names into this table.

In [None]:
%%sql

CREATE TABLE Country_list(Country VARCHAR(255));
INSERT INTO Country_list(Country)
SELECT DISTINCT 
    Country_name 
FROM 
    united_nations.Access_to_Basic_Services;

### 3. Selecting specific fields

Select the `country_name`, `time_period`, and `pct_managed_drinking_water_services` fields from the `Access_to_Basic_Services` table.

In [None]:
%%sql

SELECT 
    country_name, 
    time_period, 
    pct_managed_drinking_water_services 
FROM 
    united_nations.Access_to_Basic_Services
LIMIT 20;


Rename the `pct_managed_drinking_water_services` field to `pct_access_to_water` in your query results.

In [None]:
%%sql

SELECT 
    country_name, 
    time_period, 
    pct_managed_drinking_water_services AS pct_access_to_water
FROM 
    united_nations.Access_to_Basic_Services
LIMIT 20;

### 4. Filtering and sorting data

Modify your query to only display data for the year 2020.

In [None]:
%%sql

SELECT 
    country_name, 
    time_period, 
    pct_managed_drinking_water_services AS pct_access_to_water
FROM 
    united_nations.Access_to_Basic_Services 
WHERE 
    Time_period = 2020
LIMIT 100;

To find the country with the lowest access, sort the 2020 results by `pct_access_to_water` in ascending order and limit the number of rows returned to 10.

In [None]:
%%sql

SELECT 
    country_name, 
    time_period, 
    pct_managed_drinking_water_services AS pct_access_to_water
FROM 
    united_nations.Access_to_Basic_Services 
WHERE 
    Time_period = 2020
ORDER BY pct_access_to_water
LIMIT 10;

## Summary
Congratulations! You have used SQL commands to filter and sort data to answer a specific question. Please review your results and think about what other questions could be answered with this data.


<div align="center" style=" font-size: 80%; text-align: center; margin: 0 auto">
<img src="https://raw.githubusercontent.com/Explore-AI/Pictures/refs/heads/master/ALX_banners/ALX_Navy.png" style="width:100px" />
</div>