## Simple selections

It's time to begin writing your own queries! In this first coding exercise, you will use `SELECT` statements to retrieve columns from a database table. You'll be working with the `eurovision` table, which contains data on each country's performance at the [Eurovision Song Contest](https://en.wikipedia.org/wiki/Eurovision_Song_Contest) from 1998 to 2012.

After selecting columns, you'll also practice renaming columns and limiting the number of rows returned.

Instructions

1. `SELECT` the `country` column `FROM` the `eurovision` table.
2. Amend your query to return the `points` column instead of the `country` column.
3. Use `TOP` to change the existing query so that only the first `50` rows are returned.
4. Return a list of unique countries using `DISTINCT`. Give the results an alias of `unique_country`.

In [None]:
-- SELECT the country column FROM the eurovision table
SELECT country
FROM eurovision;

# country
# Israel
# France
# Sweden
# Croatia
# Portugal
# ...

In [None]:
-- Select the points column
SELECT points
FROM eurovision;

# points
# 53
# 107
# 33
# 45
# 57
# ...

In [None]:
-- Limit the number of rows returned
SELECT TOP(50) points 
FROM eurovision;

# points
# 53
# 107
# 33
# 45
# 57
# ...

In [None]:
-- Return unique countries and use an alias
SELECT DISTINCT country AS unique_country 
FROM eurovision;

# unique_country
# Albania
# Andorra
# Armenia
# Austria
# Azerbaijan
# ...
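
The three ideas from this exercise (`TOP`, `DISTINCT`, and an alias) can be combined in one query. The following is a minimal sketch that runs without the `eurovision` table: the inline rows and the derived-table alias `e` are made up for illustration.

```sql
-- Hypothetical sample rows standing in for the eurovision table
SELECT DISTINCT TOP (2) country AS unique_country
FROM (VALUES ('Israel'), ('France'), ('Israel'), ('Sweden')) AS e(country)
-- With DISTINCT, ORDER BY may only use items from the select list;
-- the alias unique_country qualifies
ORDER BY unique_country;
-- Duplicates are removed first, then the first two rows in sorted order
-- are kept: France, Israel
```

Note that the exercise queries use `TOP` without `ORDER BY`. In that case SQL Server does not guarantee which rows are returned, so adding an `ORDER BY` is a reasonable habit when the choice of rows matters.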

## More selections

Now that you've practiced how to select one column at a time, it's time to practice selecting more than one column. You'll continue working with the `eurovision` table.

Instructions

1. `SELECT` the `country` and `event_year` columns from the `eurovision` table.
2. Use a shortcut to amend the current query, returning ALL rows from ALL columns in the table.
3. This time, restrict the rows to the first half using `PERCENT`, using the same shortcut as before to return all columns.

In [None]:
-- Select country and event_year from eurovision
SELECT country,
       event_year
FROM eurovision;

# country    event_year
# Israel     2009
# France     2009
# Sweden     2009
# Croatia    2009
# Portugal   2009
# ...

In [None]:
-- Amend the code to select all rows and columns
SELECT *
FROM eurovision;

# euro_id   event_year   country    gender   group_type   place   points   host_country   host_region   is_final   sf_number   song_in_english
# 1         2009         Israel     Female   Group        16      53       Away           Away          1          null        1
# 2         2009         France     Female   Solo         8       107      Away           Away          1          null        0
# 3         2009         Sweden     Female   Solo         21      33       Away           Away          1          null        1
# 4         2009         Croatia    Both     Group        18      45       Away           Away          1          null        0
# 5         2009         Portugal   Both     Group        15      57       Away           Away          1          null        0
# ...

In [None]:
-- Return all columns, restricting the percent of rows returned
SELECT TOP (50) PERCENT * 
FROM eurovision;

# euro_id   event_year   country    gender   group_type   place   points   host_country   host_region   is_final   sf_number   song_in_english
# 1         2009         Israel     Female   Group        16      53       Away           Away          1          null        1
# 2         2009         France     Female   Solo         8       107      Away           Away          1          null        0
# 3         2009         Sweden     Female   Solo         21      33       Away           Away          1          null        1
# 4         2009         Croatia    Both     Group        18      45       Away           Away          1          null        0
# 5         2009         Portugal   Both     Group        15      57       Away           Away          1          null        0
# ...
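
To see what `TOP (n) PERCENT` does on a small scale, here is a minimal sketch using four inline rows instead of the `eurovision` table. The alias `t` and the column `n` are assumptions made for the example.

```sql
-- Hypothetical sample rows; the alias t and column n are illustrative only
SELECT TOP (50) PERCENT n
FROM (VALUES (1), (2), (3), (4)) AS t(n)
ORDER BY n;
-- 50 percent of 4 rows is 2 rows, so this returns 1 and 2
```

When the percentage does not work out to a whole number of rows, SQL Server rounds the row count up to the next integer.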

## Order by

In this exercise, you'll practice the use of `ORDER BY` using the `grid` dataset. It's loaded and waiting for you! It contains a subset of wider publicly available information on US power outages.

Some of the main columns include:

- `description`: The reason for, or cause of, the outage.
- `nerc_region`: The North American Electric Reliability Corporation (NERC) was formed to ensure the reliability of the grid and comprises several regional entities.
- `demand_loss_mw`: How much energy was not transmitted or consumed during the outage.

Instructions

1. Select `description` and `event_date` from `grid`. Your query should return the first 5 rows, ordered by `event_date`.
2. Modify your code based on the comments provided in the second query.

In [None]:
-- Select the first 5 rows from the specified columns
SELECT TOP(5) description, 
              event_date 
FROM grid 
-- Order your results by the event_date column
ORDER BY event_date;

# description                     event_date
# Electrical Fault at Generator   2011-01-11
# Winter Storm                    2011-01-12
# Firm System Load Shed           2011-01-13
# Vandalism                       2011-01-18
# Vandalism                       2011-01-23

In [None]:
-- Select the top 20 rows from description, nerc_region and event_date
SELECT TOP (20) description,
               nerc_region,
               event_date
FROM grid 
-- Order by nerc_region, affected_customers & event_date
-- Event_date should be in descending order
ORDER BY nerc_region,
         affected_customers,
         event_date DESC;

# description                                               nerc_region   event_date
# Suspected Physical Attack                                 ERCOT         2014-06-12
# Fuel Supply Emergency  Coal                               ERCOT         2014-06-06
# Physical Attack  Vandalism                                ERCOT         2014-06-03
# Suspected Physical Attack                                 FRCC          2013-03-18
# Load Shed of 100+ MW Under Emergency Operational Policy   FRCC          2013-06-17
# ...
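
The second query sorts by `affected_customers` even though that column is not in the select list, which can make the output look unordered at first glance. The sketch below reproduces the same pattern on a few invented rows; the alias `g` and every value are hypothetical, and the dates are plain strings in `YYYY-MM-DD` form so they sort chronologically.

```sql
-- Hypothetical sample rows; not taken from the grid table
SELECT nerc_region,
       event_date
FROM (VALUES
        ('WECC',  '2014-01-05', 100),
        ('ERCOT', '2014-03-01', 50),
        ('ERCOT', '2014-06-12', 50),
        ('ERCOT', '2014-02-01', 10)
     ) AS g(nerc_region, event_date, affected_customers)
-- Sort by region, then by customers affected (ascending by default),
-- then by date with the most recent first
ORDER BY nerc_region,
         affected_customers,
         event_date DESC;
-- Resulting order:
-- ERCOT   2014-02-01   (affected_customers = 10)
-- ERCOT   2014-06-12   (affected_customers = 50, later date first)
-- ERCOT   2014-03-01   (affected_customers = 50)
-- WECC    2014-01-05
```

`DESC` applies only to the column it follows, so `nerc_region` and `affected_customers` are still sorted in ascending order.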

## Where

You won't usually want to retrieve _every_ row in your database. You'll have specific information you need in order to answer questions from your boss or colleagues.

The `WHERE` clause is essential for selecting, updating (and deleting!) data from your tables. You'll continue working with the `grid` dataset for this exercise.

Instructions

1. Select the `description` and `event_year` columns.
2. Return rows `WHERE` the description is `'Vandalism'`.

In [None]:
-- Select description and event_year
SELECT description,
       event_year 
FROM grid 
-- Filter the results
WHERE description = 'Vandalism';

# description   event_year
# Vandalism     2014
# Vandalism     2013
# Vandalism     2013
# Vandalism     2013
# Vandalism     2013
# ...

## Where again

When filtering strings, you need to wrap your value in 'single quotes', as you did in the previous exercise. You don't need to do this for numeric values, but you DO need to use single quotes for date values.

In this course, dates are always represented in the `YYYY-MM-DD` format (Year-Month-Day), which is the default in Microsoft SQL Server.

Instructions

1. Select the `nerc_region` and `demand_loss_mw` columns, limiting the results to those where `affected_customers` is greater than or equal to 500000 (500,000).
2. Update your code to select `description` and `affected_customers`, returning records where the `event_date` was 22 December 2013.
3. Limit the results to those where `affected_customers` is `BETWEEN` `50000` and `150000`, and order in descending order of `event_date`.

In [None]:
-- Select nerc_region and demand_loss_mw
SELECT nerc_region,
       demand_loss_mw
FROM grid 
-- Retrieve rows where affected_customers is >= 500000  (500,000)
WHERE affected_customers >= 500000;

# nerc_region   demand_loss_mw
# WECC          3900
# WECC          3300
# WECC          9750
# RFC           null
# SERC          4545
# ...

In [None]:
-- Select description and affected_customers
SELECT description, 
       affected_customers 
FROM grid 
-- Retrieve rows where the event_date was the 22nd December, 2013    
WHERE event_date = '2013-12-22';

# description               affected_customers
# Severe Weather  IceSnow   59000
# Severe Weather  IceSnow   50000
# Severe Weather  IceSnow   140735

In [None]:
-- Select description, affected_customers and event date
SELECT description, 
       affected_customers,
       event_date
FROM grid 
-- The affected_customers column should be >= 50000 and <=150000   
WHERE affected_customers BETWEEN 50000 AND 150000 
-- Define the order
ORDER BY event_date DESC;

# description                     affected_customers   event_date
# Severe Weather  Thunderstorms   127000               2014-06-30
# Severe Weather  Thunderstorms   120000               2014-06-30
# Severe Weather  Thunderstorms   138802               2014-06-18
# Severe Weather  Thunderstorms   55951                2014-06-15
# Severe Weather  Thunderstorms   66383                2014-06-10
# ...
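
`BETWEEN` includes both endpoints, as the comment in the last query indicates. A minimal sketch on invented numbers (the alias `t` and column `n` are assumptions for the example) makes the boundary behavior visible:

```sql
-- Hypothetical sample values chosen to sit on and around the boundaries
SELECT n
FROM (VALUES (49999), (50000), (100000), (150000), (150001)) AS t(n)
WHERE n BETWEEN 50000 AND 150000
ORDER BY n;
-- Returns 50000, 100000 and 150000; both endpoints are included

-- The same filter written with explicit comparison operators
SELECT n
FROM (VALUES (49999), (50000), (100000), (150000), (150001)) AS t(n)
WHERE n >= 50000 AND n <= 150000
ORDER BY n;
```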

## Working with NULL values

A `NULL` value could mean 'zero': if something doesn't happen, it can't be logged in a table. However, `NULL` can also mean 'unknown' or 'missing', so consider whether it is appropriate to replace `NULL` values in your results. `NULL` values also provide feedback on data quality: if you have `NULL` values and didn't expect any, there is an issue with either how the data is captured or how it's entered into the database.

In this exercise, you'll practice filtering for NULL values, excluding them from results, and replacing them with alternative values.

Instructions

1. Use a shortcut to select all columns from `grid`. Then filter the results to only include rows where `demand_loss_mw` is unknown or missing.
2. Adapt your code to return rows where `demand_loss_mw` is neither unknown nor missing.

In [None]:
-- Retrieve all columns
SELECT * 
FROM grid 
-- Return only rows where demand_loss_mw is missing or unknown  
WHERE demand_loss_mw IS NULL;

# grid_id   description                     event_year   event_date   restore_date   nerc_region   demand_loss_mw   affected_customers
# 1         Severe Weather  Thunderstorms   2014         2014-06-30   2014-07-01     RFC           null             127000
# 3         Fuel Supply Emergency  Coal     2014         2014-06-27   null           MRO           null             null
# 4         Physical Attack  Vandalism      2014         2014-06-24   2014-06-24     SERC          null             null
# 5         Physical Attack  Vandalism      2014         2014-06-19   2014-06-19     SERC          null             null
# 6         Physical Attack  Vandalism      2014         2014-06-18   2014-06-18     WECC          null             null
# ...

In [None]:
-- Retrieve all columns
SELECT * 
FROM grid 
-- Return rows where demand_loss_mw is not missing or unknown   
WHERE demand_loss_mw IS NOT NULL;

# grid_id   description                                             event_year   event_date   restore_date   nerc_region   demand_loss_mw   affected_customers
# 2         Severe Weather  Thunderstorms                           2014         2014-06-30   2014-07-01     MRO           424              120000
# 14        Severe Weather  Thunderstorms                           2014         2014-06-07   2014-06-08     SERC          217              65000
# 16        Severe Weather  Thunderstorms                           2014         2014-06-05   2014-06-07     SERC          494              38500
# 18        Electrical System Islanding                             2014         2014-06-03   2014-06-03     WECC          338              null
# 24        Public Appeal to Reduce Electricity Usage  Wild Fires   2014         2014-05-16   2014-05-16     WECC          3900             1400000
# ...
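
Two points from this exercise are easy to check on a handful of rows: why the filter is written `IS NULL` rather than `= NULL`, and what replacing a `NULL` might look like. The sketch below uses invented values and a hypothetical alias `g`; `COALESCE` is shown as one possible way to substitute a value, and whether `0` is an appropriate substitute depends on what `NULL` means in your data.

```sql
-- Hypothetical sample rows; one value is deliberately missing
SELECT COUNT(*) AS rows_found
FROM (VALUES (424), (NULL), (217)) AS g(demand_loss_mw)
WHERE demand_loss_mw IS NULL;
-- rows_found = 1

-- Comparing with = NULL evaluates to unknown, never true,
-- under SQL Server's default ANSI_NULLS ON setting
SELECT COUNT(*) AS rows_found
FROM (VALUES (424), (NULL), (217)) AS g(demand_loss_mw)
WHERE demand_loss_mw = NULL;
-- rows_found = 0

-- One way to replace NULL with an alternative value in the output
SELECT demand_loss_mw,
       COALESCE(demand_loss_mw, 0) AS demand_loss_or_zero
FROM (VALUES (424), (NULL), (217)) AS g(demand_loss_mw);
```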

## Exploring classic rock songs

It's time to rock and roll! In this set of exercises, you'll use the `songlist` table, which contains songs featured on the playlists of 25 classic rock radio stations.

First, let's get familiar with the data.

Instructions

1. Retrieve the `song`, `artist`, and `release_year` columns from the `songlist` table.
2. Make sure there are no `NULL` values in the `release_year` column.
3. Order the results by `artist` and `release_year`.

In [None]:
-- Retrieve the song, artist and release_year columns
SELECT song,
       artist,
       release_year 
FROM songlist;

# song                    artist           release_year
# Keep On Loving You      REO Speedwagon   1980
# Keep Pushin  1977       REO Speedwagon   null
# Like You Do             REO Speedwagon   null
# Ridin the Storm Out     REO Speedwagon   null
# Roll With the Changes   REO Speedwagon   null
# ...

In [None]:
-- Retrieve the song, artist and release_year columns
SELECT song, 
       artist, 
       release_year 
FROM songlist 
-- Ensure there are no missing or unknown values in the release_year column
WHERE release_year IS NOT NULL;

# song                 artist             release_year
# Keep On Loving You   REO Speedwagon     1980
# Take It on the Run   REO Speedwagon     1981
# Jessies Girl         Rick Springfield   1981
# Back Off Boogaloo    Ringo Starr        1972
# Early 1970 [*]       Ringo Starr        1971
# ...

In [None]:
-- Retrieve the song, artist and release_year columns
SELECT song, 
       artist, 
       release_year 
FROM songlist 
-- Ensure there are no missing or unknown values in the release_year column
WHERE release_year IS NOT NULL 
-- Arrange the results by the artist and release_year columns
ORDER BY artist,
         release_year;
    
# song                    artist         release_year
# Rockin Into the Night   .38 Special    1980
# Hold On Loosely         .38 Special    1981
# Caught Up in You        .38 Special    1982
# Art For Arts Sake       10cc           1975
# Kryptonite              3 Doors Down   2000
# ...

## Exploring classic rock songs - AND/OR

Having familiarized yourself with the `songlist` table, you'll now extend your `WHERE` clause from the previous exercise.

Instructions

1. Extend the `WHERE` clause so that the results are those with a `release_year` greater than or equal to `1980` and less than or equal to `1990`.
2. Update your query to use an `OR` instead of an `AND`.

In [None]:
SELECT song, 
       artist, 
       release_year
FROM songlist 
-- Retrieve records from 1980 onwards (inclusive)
WHERE release_year >= 1980 AND 
-- Also retrieve records up to and including 1990
      release_year <= 1990 
ORDER BY artist, 
         release_year;

# song                    artist        release_year
# Rockin Into the Night   .38 Special   1980
# Hold On Loosely         .38 Special   1981
# Caught Up in You        .38 Special   1982
# Take On Me              a-ha          1985
# Back In Black           AC/DC         1980
# ...

In [None]:
SELECT song, 
       artist, 
       release_year
FROM songlist 
-- Retrieve records from 1980 onwards (inclusive)
WHERE release_year >= 1980 OR 
-- Also retrieve records up to and including 1990
      release_year <= 1990 
ORDER BY artist, 
         release_year;
    
# song                    artist         release_year
# Rockin Into the Night   .38 Special    1980
# Hold On Loosely         .38 Special    1981
# Caught Up in You        .38 Special    1982
# Art For Arts Sake       10cc           1975
# Kryptonite              3 Doors Down   2000
# ...
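
The `OR` version returns songs from 1975 and 2000 because every known year is either at least 1980 or at most 1990, so the filter no longer narrows anything. The following sketch, built on four invented values with a hypothetical alias `s`, shows the difference side by side:

```sql
-- Hypothetical sample years, including one NULL
SELECT release_year
FROM (VALUES (1975), (1985), (2000), (NULL)) AS s(release_year)
WHERE release_year >= 1980 AND
      release_year <= 1990;
-- Both conditions must hold: returns 1985 only

SELECT release_year
FROM (VALUES (1975), (1985), (2000), (NULL)) AS s(release_year)
WHERE release_year >= 1980 OR
      release_year <= 1990
ORDER BY release_year;
-- Either condition is enough: returns 1975, 1985 and 2000
-- The NULL row is still excluded, because both comparisons are unknown
```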

## Using parentheses in your queries

You can use parentheses to make the intention of your code clearer. This becomes very important when combining `AND` and `OR` in a `WHERE` clause: `AND` is evaluated before `OR`, so parentheses ensure your queries return the exact subsets you need.

Instructions

1. Select all artists beginning with `B` who released tracks in `1986`, but also retrieve any records where the `release_year` is greater than `1990`.

In [None]:
SELECT artist, 
       release_year, 
       song 
FROM songlist 
-- Choose the correct artist and specify the release year
WHERE (artist LIKE 'B%' AND release_year = 1986) OR
-- Or return all songs released after 1990
      release_year > 1990 
-- Order the results
ORDER BY release_year, 
         artist, 
         song;
        
# artist         release_year   song
# Beastie Boys   1986           (You Gotta) Fight for Your Right (To Party)
# Beastie Boys   1986           No Sleep Till Brooklyn
# Bon Jovi       1986           Livin On A Prayer
# Bon Jovi       1986           Wanted Dead or Alive
# Bon Jovi       1986           You Give Love A Bad Name
# ...
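
Moving the parentheses changes which rows come back. As a minimal sketch, the two queries below run the same conditions over four invented rows (the alias `s` and the rows themselves are hypothetical) with the parentheses in different places:

```sql
-- Hypothetical sample rows; not taken from the songlist table
-- Grouping used in the exercise: (B artists in 1986) OR anything after 1990
SELECT artist, release_year
FROM (VALUES ('Bon Jovi', 1986), ('Nirvana', 1991),
             ('Beck', 1994), ('Boston', 1976)) AS s(artist, release_year)
WHERE (artist LIKE 'B%' AND release_year = 1986) OR
      release_year > 1990
ORDER BY release_year;
-- Returns Bon Jovi 1986, Nirvana 1991, Beck 1994

-- Alternative grouping: B artists only, in 1986 or after 1990
SELECT artist, release_year
FROM (VALUES ('Bon Jovi', 1986), ('Nirvana', 1991),
             ('Beck', 1994), ('Boston', 1976)) AS s(artist, release_year)
WHERE artist LIKE 'B%' AND
      (release_year = 1986 OR release_year > 1990)
ORDER BY release_year;
-- Returns Bon Jovi 1986, Beck 1994; Nirvana no longer qualifies
```

Because `AND` binds more tightly than `OR`, the first query would return the same rows without its parentheses. Writing them anyway documents the intent.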