## Left join

Now you'll explore the differences between performing an inner join and a left join using the `cities` and `countries` tables.

You'll begin by performing an inner join with the `cities` table on the left and the `countries` table on the right. Remember to alias the name of the city field as `city` and the name of the country field as `country`.

You will then change the query to a left join. Take note of how many records are in each query result!

Instructions

1. Fill in the code based on the instructions in the code comments to complete the inner join. Note how many records the join result contains.
2. Change the code to perform a `LEFT JOIN` instead of an `INNER JOIN`. After executing this query, note how many records the query result contains.
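
The change in record counts can be reproduced on a tiny scale. The following is a minimal sketch, assuming Python's built-in `sqlite3` module and two toy tables whose rows are made up for illustration; they only mimic the shape of `cities` and `countries` and are not the course data.

```python
import sqlite3

# Toy stand-ins for the course tables (hypothetical rows, in-memory database).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (code TEXT, name TEXT);
    CREATE TABLE cities (name TEXT, country_code TEXT);
    INSERT INTO countries VALUES ('ZWE', 'Zimbabwe'), ('ZMB', 'Zambia');
    -- 'TWN' has no matching row in countries
    INSERT INTO cities VALUES ('Harare', 'ZWE'), ('Lusaka', 'ZMB'), ('Taipei', 'TWN');
""")

# Same query twice; only the join type changes.
query = """
    SELECT c1.name AS city, c2.name AS country
    FROM cities AS c1
    {join_type} JOIN countries AS c2
    ON c1.country_code = c2.code
    ORDER BY city;
"""

inner_rows = conn.execute(query.format(join_type="INNER")).fetchall()
left_rows = conn.execute(query.format(join_type="LEFT")).fetchall()

print(len(inner_rows), inner_rows)  # 2 rows: only cities with a matching country
print(len(left_rows), left_rows)    # 3 rows: Taipei is kept, with country = None
```

The left join keeps every row of the left table, so the unmatched city appears with `None` (SQL `NULL`) in the columns that come from `countries`.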

In [None]:
-- Select the city name (with alias), the country code, the country name (with alias), the region, and the city proper population
SELECT c1.name AS city,
       code,
       c2.name AS country,
       region,
       city_proper_pop
-- From left table (with alias)
FROM cities AS c1
-- Join to right table (with alias)
INNER JOIN countries AS c2
-- Match on country code
ON c1.country_code = c2.code
-- Order by descending country code
ORDER BY code DESC;

# city           code   country        region            city_proper_pop
# Harare         ZWE    Zimbabwe       Eastern Africa    1606000
# Lusaka         ZMB    Zambia         Eastern Africa    1742980
# Cape Town      ZAF    South Africa   Southern Africa   3740030
# Johannesburg   ZAF    South Africa   Southern Africa   4434830
# Durban         ZAF    South Africa   Southern Africa   3442360
# ...

In [None]:
SELECT c1.name AS city,
       code,
       c2.name AS country,
       region, 
       city_proper_pop
FROM cities AS c1
-- Join right table (with alias)
LEFT JOIN countries AS c2
-- Match on country code
ON c1.country_code = c2.code
-- Order by descending country code
ORDER BY code DESC;

# city        code   country   region   city_proper_pop
# Taichung    null   null      null     2752410
# Tainan      null   null      null     1885250
# Kaohsiung   null   null      null     2778920
# Bucharest   null   null      null     1883420
# Taipei      null   null      null     2704970
# ...

## Left join (2)

Next, you'll try out another example comparing an inner join to its corresponding left join. Before you begin, though, take note of how many records are in each of the `countries` and `languages` tables.

You will begin with an inner join with the `countries` table on the left and the `languages` table on the right. Then you'll change the code to a left join in the next step.

Note the use of multi-line comments here using `/*` and `*/`.

Instructions

1. 
    1. Perform an inner join and alias the name of the country field as `country` and the name of the language field as `language`.
    2. Sort based on descending country name.
2. 
    1. Perform a left join instead of an inner join. Observe the result, and also note the change in the number of records in the result.
    2. Carefully review which records appear in the left join result, but not in the inner join result.
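
One common way to list the records that appear only in the left join result is to keep the left join and filter for rows where the right-hand key is `NULL`. The sketch below assumes Python's built-in `sqlite3` module and invented toy rows (`Nowhereland` and code `XXX` are hypothetical); it is an illustration, not part of the exercise solution.

```python
import sqlite3

# Toy tables: one country with languages, one without (hypothetical rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (code TEXT, name TEXT);
    CREATE TABLE languages (code TEXT, name TEXT);
    INSERT INTO countries VALUES ('ZWE', 'Zimbabwe'), ('XXX', 'Nowhereland');
    INSERT INTO languages VALUES ('ZWE', 'Shona'), ('ZWE', 'English');
""")

unmatched = conn.execute("""
    SELECT c.name AS country
    FROM countries AS c
    LEFT JOIN languages AS l
    ON c.code = l.code
    -- Keep only left rows for which the join found no language
    WHERE l.code IS NULL;
""").fetchall()

print(unmatched)  # [('Nowhereland',)]
```

These are exactly the countries that an inner join would drop, because they have no matching row in `languages`.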

In [None]:
/*
Select country name AS country, the country's local name,
the language name AS language, and
the percent of the language spoken in the country
*/
SELECT c.name AS country,
       local_name,
       l.name AS language,
       percent
-- From left table (alias as c)
FROM countries AS c
-- Join to right table (alias as l)
INNER JOIN languages AS l
-- Match on fields
ON c.code = l.code
-- Order by descending country
ORDER BY country DESC;

# country    local_name   language   percent
# Zimbabwe   Zimbabwe     Shona      null
# Zimbabwe   Zimbabwe     Venda      null
# Zimbabwe   Zimbabwe     Tswana     null
# Zimbabwe   Zimbabwe     Tonga      null
# Zimbabwe   Zimbabwe     Sotho      null
# ...

In [None]:
/*
Select country name AS country, the country's local name,
the language name AS language, and
the percent of the language spoken in the country
*/
SELECT c.name AS country,
       local_name,
       l.name AS language,
       percent
-- From left table (alias as c)
FROM countries AS c
-- Join to right table (alias as l)
LEFT JOIN languages AS l
-- Match on fields
ON c.code = l.code
-- Order by descending country
ORDER BY country DESC;

# country    local_name   language   percent
# Zimbabwe   Zimbabwe     Chibarwe   null
# Zimbabwe   Zimbabwe     Shona      null
# Zimbabwe   Zimbabwe     Ndebele    null
# Zimbabwe   Zimbabwe     English    null
# Zimbabwe   Zimbabwe     Chewa      null
# ...

## Left join (3)

You'll now revisit the use of the `AVG()` function introduced in our introductory SQL course. You will use it in combination with left join to determine the average gross domestic product (GDP) per capita **by region** in 2010.

Instructions

1. 
    1. Begin with a left join with the `countries` table on the left and the `economies` table on the right.
    2. Focus only on records with 2010 as the `year`.
2. 
    1. Modify your code to calculate the average GDP per capita `AS avg_gdp` for **each region** in 2010.
    2. Select the region and `avg_gdp` fields.
3. Arrange the regions from highest to lowest average GDP per capita in 2010.
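
The three steps can be traced on data small enough to check by hand. This is a minimal sketch, assuming Python's built-in `sqlite3` module and made-up countries, regions, and GDP values; only the column names follow the exercise.

```python
import sqlite3

# Toy data: two countries in 'North', one in 'South' (hypothetical values).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (code TEXT, name TEXT, region TEXT);
    CREATE TABLE economies (code TEXT, year INTEGER, gdp_percapita REAL);
    INSERT INTO countries VALUES
        ('AAA', 'Alpha', 'North'),
        ('BBB', 'Beta', 'North'),
        ('CCC', 'Gamma', 'South');
    INSERT INTO economies VALUES
        ('AAA', 2010, 100.0),
        ('BBB', 2010, 300.0),
        ('CCC', 2010, 50.0),
        ('CCC', 2015, 70.0);  -- excluded by the year filter
""")

rows = conn.execute("""
    SELECT region, AVG(gdp_percapita) AS avg_gdp
    FROM countries AS c
    LEFT JOIN economies AS e
    ON c.code = e.code
    WHERE year = 2010
    GROUP BY region
    ORDER BY avg_gdp DESC;
""").fetchall()

print(rows)  # [('North', 200.0), ('South', 50.0)]
```

`North` averages (100 + 300) / 2 = 200, and the 2015 row for `Gamma` never reaches the average because `WHERE` filters it out before grouping.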

In [None]:
-- Select name, region, and gdp_percapita
SELECT name, region, gdp_percapita
-- From countries (alias as c)
FROM countries AS c
-- Left join with economies (alias as e)
LEFT JOIN economies AS e
-- Match on code fields
ON c.code = e.code
-- Focus on 2010
WHERE year = 2010;

# name                   region                      gdp_percapita
# Afghanistan            Southern and Central Asia   539.667
# Angola                 Central Africa              3599.27
# Albania                Southern Europe             4098.13
# United Arab Emirates   Middle East                 34628.6
# Argentina              South America               10413
# Armenia                Middle East                 3121.78

In [None]:
-- Select fields
SELECT region,
       AVG(gdp_percapita) AS avg_gdp
-- From countries (alias as c)
FROM countries AS c
-- Left join with economies (alias as e)
LEFT JOIN economies AS e
-- Match on code fields
ON c.code = e.code
-- Focus on 2010
WHERE year = 2010
-- Group by region
GROUP BY region;

# region            avg_gdp
# Southern Africa   5051.59797363281
# Caribbean         11413.339454064
# Eastern Africa    1757.34816169739
# Southern Europe   22926.4109108665
# Eastern Asia      24962.8076171875
# ...

In [None]:
-- Select fields
SELECT region, 
       AVG(gdp_percapita) AS avg_gdp
-- From countries (alias as c)
FROM countries AS c
-- Left join with economies (alias as e)
LEFT JOIN economies AS e
-- Match on code fields
ON c.code = e.code
-- Focus on 2010
WHERE year = 2010
-- Group by region
GROUP BY region
-- Order by descending avg_gdp
ORDER BY avg_gdp DESC;

# region                      avg_gdp
# Western Europe              58130.9614955357
# Nordic Countries            57073.99765625
# North America               47911.509765625
# Australia and New Zealand   44792.384765625
# British Islands             43588.330078125
# ...

## Right join

Right joins aren't as common as left joins. One reason is that you can always rewrite a right join as a left join.

Instructions

1. The left join code is commented out here. Your task is to write a new query using right joins that produces the same result as the query using left joins. Keep the left join code commented out as you write your own query just below it using right joins.

    Note that the order of the joins matters in your conversion to right joins!

In [None]:
-- convert this code to use RIGHT JOINs instead of LEFT JOINs
/*
SELECT cities.name AS city,
       urbanarea_pop,
       countries.name AS country,
       indep_year,
       languages.name AS language,
       percent
FROM cities
LEFT JOIN countries
ON cities.country_code = countries.code
LEFT JOIN languages
ON countries.code = languages.code
ORDER BY city, language;
*/

SELECT cities.name AS city,
       urbanarea_pop,
       countries.name AS country,
       indep_year,
       languages.name AS language,
       percent
FROM languages
RIGHT JOIN countries
ON languages.code = countries.code
RIGHT JOIN cities
ON countries.code = cities.country_code
ORDER BY city, language;

# city        urbanarea_pop   country                indep_year   language   percent
# Abidjan     4765000         Cote d'Ivoire          1960         French     null
# Abidjan     4765000         Cote d'Ivoire          1960         Other      null
# Abu Dhabi   1145000         United Arab Emirates   1971         Arabic     null
# Abu Dhabi   1145000         United Arab Emirates   1971         English    null
# Abu Dhabi   1145000         United Arab Emirates   1971         Hindi      null
# ...

## Full join

In this exercise, you'll examine how your results differ when using a full join versus using a left join versus using an inner join with the `countries` and `currencies` tables.

You will focus on records in the North America `region`, as well as records where the `name` of the country is missing. Dig in to see what we mean!

Begin with a full join with `countries` on the left and `currencies` on the right. The fields of interest have been `SELECT`ed for you throughout this exercise.

Then complete a similar left join and conclude with an inner join.

Instructions

1. Choose records in which `region` corresponds to North America or is `NULL`.
2. Repeat the same query as before, using a `LEFT JOIN` instead of a `FULL JOIN`. Note what has changed compared to the `FULL JOIN` result!
3. Repeat the same query again but use an `INNER JOIN` instead of a `FULL JOIN`. Note what has changed compared to the `FULL JOIN` and `LEFT JOIN` results!
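
The shrinking result across the three join types can be seen with a handful of rows. This is a sketch under stated assumptions: it uses Python's built-in `sqlite3` module and invented toy rows, and because SQLite versions before 3.39 do not support `FULL JOIN`, the full join is emulated here as a left join plus the unmatched right-hand rows (`UNION ALL`). The exercise itself uses a real `FULL JOIN`.

```python
import sqlite3

# Toy tables (hypothetical rows): Greenland has no currency row,
# and code 'TMP' has a currency but no country row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (code TEXT, name TEXT, region TEXT);
    CREATE TABLE currencies (code TEXT, basic_unit TEXT);
    INSERT INTO countries VALUES
        ('CAN', 'Canada', 'North America'),
        ('GRL', 'Greenland', 'North America'),
        ('FRA', 'France', 'Western Europe');
    INSERT INTO currencies VALUES
        ('CAN', 'Canadian dollar'),
        ('FRA', 'Euro'),
        ('TMP', 'United States dollar');
""")

base = """
    SELECT name AS country, code, region, basic_unit
    FROM countries
    {join_type} JOIN currencies
    USING (code)
    WHERE region = 'North America' OR region IS NULL
"""

inner_rows = conn.execute(base.format(join_type="INNER")).fetchall()
left_rows = conn.execute(base.format(join_type="LEFT")).fetchall()

# Emulated full join: the left join result plus currencies with no country.
# Those extra rows have a NULL region, so they also satisfy the filter above.
full_rows = conn.execute(base.format(join_type="LEFT") + """
    UNION ALL
    SELECT NULL, cur.code, NULL, cur.basic_unit
    FROM currencies AS cur
    LEFT JOIN countries AS c
    ON cur.code = c.code
    WHERE c.code IS NULL
""").fetchall()

print(len(full_rows), len(left_rows), len(inner_rows))  # 3 2 1
```

Moving from full to left join drops the `TMP` row (no matching country), and moving from left to inner join also drops Greenland (no matching currency).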

In [None]:
SELECT name AS country, code, region, basic_unit
-- From countries
FROM countries
-- Join to currencies
FULL JOIN currencies
-- Match on code
USING (code)
-- Where region is North America or null
WHERE region = 'North America' OR region IS NULL
-- Order by region
ORDER BY region;

# country         code   region          basic_unit
# Bermuda         BMU    North America   Bermudian dollar
# United States   USA    North America   United States dollar
# Canada          CAN    North America   Canadian dollar
# Greenland       GRL    North America   null
# null            TMP    null            United States dollar
# ...

In [None]:
SELECT name AS country, code, region, basic_unit
-- From countries
FROM countries
-- Join to currencies
LEFT JOIN currencies
-- Match on code
USING (code)
-- Where region is North America or null
WHERE region = 'North America' OR region IS NULL
-- Order by region
ORDER BY region;

# country         code   region          basic_unit
# Bermuda         BMU    North America   Bermudian dollar
# Canada          CAN    North America   Canadian dollar
# United States   USA    North America   United States dollar
# Greenland       GRL    North America   null

In [None]:
SELECT name AS country, code, region, basic_unit
-- From countries
FROM countries
-- Join to currencies
INNER JOIN currencies
-- Match on code
USING (code)
-- Where region is North America or null
WHERE region = 'North America' OR region IS NULL
-- Order by region
ORDER BY region;

# country         code   region          basic_unit
# Bermuda         BMU    North America   Bermudian dollar
# Canada          CAN    North America   Canadian dollar
# United States   USA    North America   United States dollar

## Full join (2)

You'll now investigate a similar exercise to the last one, but this time focused on using a table with more records on the left than the right. You'll work with the `languages` and `countries` tables.

Begin with a full join with `languages` on the left and `countries` on the right. Appropriate fields have been selected for you again here.

Instructions

1. 
    1. Choose records in which `countries.name` starts with the capital letter `'V'` or is `NULL`.
    2. Arrange by `countries.name` in ascending order to more clearly see the results.
2. Repeat the same query as before, using a `LEFT JOIN` instead of a `FULL JOIN`. Note what has changed compared to the `FULL JOIN` result!
3. Repeat once more, but use an `INNER JOIN` instead of a `LEFT JOIN`. Note what has changed compared to the `FULL JOIN` and `LEFT JOIN` results.

In [None]:
SELECT countries.name, code, languages.name AS language
-- From languages
FROM languages
-- Join to countries
FULL JOIN countries
-- Match on code
USING (code)
-- Where countries.name starts with V or is null
WHERE countries.name LIKE 'V%' OR countries.name IS NULL
-- Order by ascending countries.name
ORDER BY countries.name;

# name      code   language
# Vanuatu   VUT    Tribal Languages
# Vanuatu   VUT    English
# Vanuatu   VUT    French
# Vanuatu   VUT    Other
# Vanuatu   VUT    Bislama
# ...

In [None]:
SELECT countries.name, code, languages.name AS language
-- From languages
FROM languages
-- Join to countries
LEFT JOIN countries
-- Match on code
USING (code)
-- Where countries.name starts with V or is null
WHERE countries.name LIKE 'V%' OR countries.name IS NULL
-- Order by ascending countries.name
ORDER BY countries.name;

# name      code   language
# Vanuatu   VUT    English
# Vanuatu   VUT    Other
# Vanuatu   VUT    French
# Vanuatu   VUT    Tribal Languages
# Vanuatu   VUT    Bislama
# ...

In [None]:
SELECT countries.name, code, languages.name AS language
-- From languages
FROM languages
-- Join to countries
INNER JOIN countries
-- Match using code
USING (code)
-- Where countries.name starts with V or is null
WHERE countries.name LIKE 'V%' OR countries.name IS NULL
-- Order by ascending countries.name
ORDER BY countries.name;

# name      code   language
# Vanuatu   VUT    Tribal Languages
# Vanuatu   VUT    Bislama
# Vanuatu   VUT    English
# Vanuatu   VUT    French
# Vanuatu   VUT    Other
# ...

## Full join (3)

You'll now explore using two consecutive full joins on the three tables you worked with in the previous two exercises.

Instructions

1. Complete a full join with `countries` on the left and `languages` on the right.
2. Next, full join this result with `currencies` on the right.
3. Use `LIKE` to choose the Melanesia and Micronesia regions (Hint: `'M%esia'`).
4. Select the fields corresponding to the country name `AS country`, region, language name `AS language`, and basic and fractional units of currency.

In [None]:
-- Select fields (with aliases)
SELECT c1.name AS country, region, 
       l.name AS language,
       basic_unit, frac_unit
-- From countries (alias as c1)
FROM countries AS c1
-- Join with languages (alias as l)
FULL JOIN languages AS l
-- Match on code
USING (code)
-- Join with currencies (alias as c2)
FULL JOIN currencies AS c2
-- Match on code
USING (code)
-- Where region like Melanesia and Micronesia
WHERE region LIKE 'M%esia';

# country            region       language      basic_unit             frac_unit
# Kiribati           Micronesia   English       Australian dollar      Cent
# Kiribati           Micronesia   Kiribati      Australian dollar      Cent
# Marshall Islands   Micronesia   Other         United States dollar   Cent
# Marshall Islands   Micronesia   Marshallese   United States dollar   Cent
# Nauru              Micronesia   Other         Australian dollar      Cent
# ...

## Review outer joins

A(n) ... join is a join combining the results of a ... join and a ... join.

None of the above are true.

## A table of two cities

In this exercise, you'll explore the languages potentially _and_ most frequently spoken in the cities of Hyderabad, India and Hyderabad, Pakistan.

Instructions

1. 
    1. Create a `CROSS JOIN` with `cities AS c` on the left and `languages AS l` on the right.
    2. Make use of `LIKE` and `Hyder%` to choose Hyderabad in both countries.
    3. Select only the city name `AS city` and language name `AS language`.
2. Use an `INNER JOIN` instead of a `CROSS JOIN`. Think about what the difference will be in the results for this `INNER JOIN` result and the one for the `CROSS JOIN`.
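
To preview the difference, the sketch below runs both joins on toy data. It assumes Python's built-in `sqlite3` module and a few invented rows that mimic the shape of `cities` and `languages`; the language lists are illustrative, not the course data.

```python
import sqlite3

# Toy tables (hypothetical rows): two Hyderabads, one other city, three languages.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities (name TEXT, country_code TEXT);
    CREATE TABLE languages (code TEXT, name TEXT);
    INSERT INTO cities VALUES
        ('Hyderabad (India)', 'IND'),
        ('Hyderabad', 'PAK'),
        ('Delhi', 'IND');
    INSERT INTO languages VALUES
        ('IND', 'Hindi'),
        ('IND', 'Telugu'),
        ('PAK', 'Urdu');
""")

# CROSS JOIN pairs every selected city with every language, matching or not.
cross_rows = conn.execute("""
    SELECT c.name AS city, l.name AS language
    FROM cities AS c
    CROSS JOIN languages AS l
    WHERE c.name LIKE 'Hyder%';
""").fetchall()

# INNER JOIN keeps only the pairs whose country codes match.
inner_rows = conn.execute("""
    SELECT c.name AS city, l.name AS language
    FROM cities AS c
    INNER JOIN languages AS l
    ON c.country_code = l.code
    WHERE c.name LIKE 'Hyder%';
""").fetchall()

print(len(cross_rows))  # 6 = 2 cities x 3 languages
print(len(inner_rows))  # 3 = 2 languages for India + 1 for Pakistan
```

The cross join lists every language as *potentially* spoken in each city, while the inner join narrows the list to languages recorded for that city's country.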

In [None]:
-- Select fields
SELECT c.name AS city, l.name AS language
-- From cities (alias as c)
FROM cities AS c        
-- Join to languages (alias as l)
CROSS JOIN languages AS l
-- Where c.name like Hyderabad
WHERE c.name LIKE 'Hyder%';

# city                language
# Hyderabad (India)   Dari
# Hyderabad           Dari
# Hyderabad (India)   Pashto
# Hyderabad           Pashto
# Hyderabad (India)   Turkic
# ...

In [None]:
-- Select fields
SELECT c.name AS city, l.name AS language
-- From cities (alias as c)
FROM cities AS c      
-- Join to languages (alias as l)
INNER JOIN languages AS l
-- Match on country code
ON c.country_code = l.code
-- Where c.name like Hyderabad
WHERE c.name LIKE 'Hyder%';

# city                language
# Hyderabad (India)   Hindi
# Hyderabad (India)   Bengali
# Hyderabad (India)   Telugu
# Hyderabad (India)   Marathi
# Hyderabad (India)   Tamil
# ...

## Outer challenge

Now that you're fully equipped to use `OUTER JOIN`s, try a challenge problem to test your knowledge!

Determine the names and regions of the five countries with the lowest life expectancy in 2010.

Instructions

1. Select country name `AS country`, `region`, and life expectancy `AS life_exp`.
2. Make sure to use `LEFT JOIN`, `WHERE`, `ORDER BY`, and `LIMIT`.

In [None]:
-- Select fields
SELECT c.name AS country, region, life_expectancy AS life_exp
-- From countries (alias as c)
FROM countries AS c
-- Join to populations (alias as p)
LEFT JOIN populations AS p
-- Match on country code
ON c.code = p.country_code
-- Focus on 2010
WHERE year = 2010
-- Order by life_exp
ORDER BY life_exp
-- Limit to 5 records
LIMIT 5;

# country                    region            life_exp
# Lesotho                    Southern Africa   47.4834
# Central African Republic   Central Africa    47.6253
# Sierra Leone               Western Africa    48.229
# Swaziland                  Southern Africa   48.3458
# Zimbabwe                   Eastern Africa    49.5747