# Joining Data with SQL

## For increased clarity, please view the [Summary_README](https://github.com/ursumarius/sql-datacamp/blob/main/Summary_README.ipynb) in the root directory.

Here you can access every table used in the course. To access each table, you will need to specify the `world` schema in your queries (e.g., `world.countries` for the `countries` table, and `world.languages` for the `languages` table).

--- 
_Note: When using sample integrations such as those that contain course data, you have read-only access. You can run queries, but cannot make any changes such as adding, deleting, or modifying the data (e.g., creating tables, views, etc.)._

## Notes

Concepts I want to keep.

### Join: Inner, Left, Right, Full, Cross

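Inner: keeps only the rows whose join key has a match in both the left and right tables

Left: keeps every row from the first (left) table; columns from the second table are `NULL` where there is no match

Right: keeps every row from the second (right) table; columns from the first table are `NULL` where there is no match

Full: keeps every row from both tables; columns from whichever side has no match are `NULL`

Cross: no need to stipulate a column on which to join, because every possible combination of rows from the two tables is returned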
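
To make the five join types concrete, here is a minimal sketch that runs them on two tiny in-memory tables using Python's built-in `sqlite3`. The tables `left_t` and `right_t` and their contents are hypothetical and not part of the `world` schema. Because older SQLite builds lack `RIGHT JOIN` and `FULL JOIN`, the sketch writes the right join as a mirrored `LEFT JOIN` and emulates the full join with `UNION ALL`; in PostgreSQL you would write `RIGHT JOIN` and `FULL JOIN` directly.

```python
import sqlite3

# Hypothetical toy tables, not part of the course's `world` schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE left_t  (code TEXT, country  TEXT);
    CREATE TABLE right_t (code TEXT, language TEXT);
    INSERT INTO left_t  VALUES ('A', 'country_A'), ('B', 'country_B');
    INSERT INTO right_t VALUES ('B', 'language_B'), ('C', 'language_C');
""")

def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Inner: only code 'B' exists in both tables.
inner = run("""
    SELECT l.code, country, language
    FROM left_t AS l
    INNER JOIN right_t AS r ON l.code = r.code
""")

# Left: every left row survives; 'A' has no match, so language is NULL (None).
left = run("""
    SELECT l.code, country, language
    FROM left_t AS l
    LEFT JOIN right_t AS r ON l.code = r.code
    ORDER BY l.code
""")

# Right, written as a mirrored LEFT JOIN: every right row survives.
right = run("""
    SELECT r.code, country, language
    FROM right_t AS r
    LEFT JOIN left_t AS l ON l.code = r.code
    ORDER BY r.code
""")

# Full, emulated: the left join plus the right rows that found no match.
full = run("""
    SELECT l.code, country, language
    FROM left_t AS l
    LEFT JOIN right_t AS r ON l.code = r.code
    UNION ALL
    SELECT r.code, NULL, language
    FROM right_t AS r
    WHERE r.code NOT IN (SELECT code FROM left_t)
    ORDER BY 1
""")

# Cross: no join condition, so 2 x 2 = 4 combinations.
cross = run("SELECT country, language FROM left_t CROSS JOIN right_t")

for label, rows in [("inner", inner), ("left", left), ("right", right),
                    ("full", full), ("cross", cross)]:
    print(label, rows)
```

Comparing the printed row counts (1, 2, 2, 3 and 4) is a quick way to check which join you actually need.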


**Objective:** While researching Melanesia and Micronesia, pull information about languages and currencies into the data for these regions from the `countries` table.

Since `languages` and `currencies` exist in separate tables, this will require two consecutive full joins involving the `countries`, `languages` and `currencies` tables.

In [2]:
-- Countries in Melanesia and Micronesia with their languages and currencies
SELECT
    c1.name AS country,
    region,
    l.name AS language,
    basic_unit,
    frac_unit
FROM world.countries AS c1
-- Full join with languages (alias as l)
FULL JOIN world.languages AS l
USING (code)
-- Full join with currencies (alias as c2)
FULL JOIN world.currencies AS c2
USING (code)
WHERE region LIKE 'M%esia';

|  | country | region | language | basic_unit | frac_unit |
| --- | --- | --- | --- | --- | --- |
| 0 | Kiribati | Micronesia | English | Australian dollar | Cent |
| 1 | Kiribati | Micronesia | Kiribati | Australian dollar | Cent |
| 2 | Marshall Islands | Micronesia | Other | United States dollar | Cent |
| 3 | Marshall Islands | Micronesia | Marshallese | United States dollar | Cent |
| 4 | Nauru | Micronesia | Other | Australian dollar | Cent |
| 5 | Nauru | Micronesia | English | Australian dollar | Cent |
| 6 | Nauru | Micronesia | Nauruan | Australian dollar | Cent |
| 7 | New Caledonia | Melanesia | Other | CFP franc | Centime |
| 8 | New Caledonia | Melanesia | French | CFP franc | Centime |
| 9 | Palau | Micronesia | Other | United States dollar | Cent |
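
Kiribati appears twice in the output above because each country row is repeated once per matching language and then paired with its currency. The sketch below reproduces that row multiplication on hand-typed toy tables in an in-memory SQLite database. The table layouts are simplified assumptions, and the `ZZZ` row is a made-up placeholder. It chains `LEFT JOIN ... USING (code)` instead of `FULL JOIN` so that it also runs on older SQLite builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hand-typed stand-ins for the course tables (assumption).
    CREATE TABLE countries  (code TEXT, name TEXT, region TEXT);
    CREATE TABLE languages  (code TEXT, name TEXT);
    CREATE TABLE currencies (code TEXT, basic_unit TEXT);

    INSERT INTO countries VALUES
        ('KIR', 'Kiribati', 'Micronesia'),
        ('ZZZ', 'Placeholder', 'Micronesia');  -- hypothetical, no matches
    INSERT INTO languages VALUES ('KIR', 'English'), ('KIR', 'Kiribati');
    INSERT INTO currencies VALUES ('KIR', 'Australian dollar');
""")

# Two consecutive joins on the shared `code` column.
rows = conn.execute("""
    SELECT c1.name AS country, l.name AS language, basic_unit
    FROM countries AS c1
    LEFT JOIN languages AS l USING (code)
    LEFT JOIN currencies AS c2 USING (code)
    WHERE region LIKE 'M%esia'
    ORDER BY country, language
""").fetchall()

for row in rows:
    print(row)
```

One country with two languages and one currency yields two rows, while the placeholder country with no matches still yields one row padded with `NULL` values.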


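**Objective:** Compare a table with itself (a self join) to see each country's population in 2010 next to its population in 2015.

Take care to write precise `WHERE` clauses.

Joining the table with itself lets one output row draw on several input rows. Aliases then turn those entries into separate columns.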

In [1]:
-- Select size twice, aliased as size2010 and size2015
SELECT
    p1.country_code,
    p1.size AS size2010,
    p2.size AS size2015
FROM world.populations AS p1
INNER JOIN world.populations AS p2
ON p1.country_code = p2.country_code
WHERE
    p1.year = 2010
-- Filter such that p1.year is always five years before p2.year
    AND p2.year = p1.year + 5;

|  | country_code | size2010 | size2015 |
| --- | --- | --- | --- |
| 0 | ABW | 101597 | 103889.0 |
| 1 | AFG | 27962208 | 32526562.0 |
| 2 | AGO | 21219954 | 25021974.0 |
| 3 | ALB | 2913021 | 2889167.0 |
| 4 | AND | 84419 | 70473.0 |
| ... | ... | ... | ... |
| 212 | XKX | 1775680 | 1801800.0 |
| 213 | YEM | 23591972 | 26832216.0 |
| 214 | ZAF | 50979432 | 55011976.0 |
| 215 | ZMB | 13917439 | 16211767.0 |
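
Why both `WHERE` conditions matter can be seen on a small example. The sketch below uses a made-up `populations` table (one hypothetical country `AAA` with invented sizes for three years) in an in-memory SQLite database and counts the rows each version of the self join returns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: one country, three years, invented sizes.
    CREATE TABLE populations (country_code TEXT, year INTEGER, size INTEGER);
    INSERT INTO populations VALUES
        ('AAA', 2005, 100), ('AAA', 2010, 110), ('AAA', 2015, 120);
""")

base = """
    SELECT p1.country_code, p1.size AS size2010, p2.size AS size2015
    FROM populations AS p1
    INNER JOIN populations AS p2
    ON p1.country_code = p2.country_code
"""

# No WHERE clause: every year is paired with every year (3 x 3 rows).
unfiltered = conn.execute(base).fetchall()

# Only the five-year gap: both 2005 -> 2010 and 2010 -> 2015 qualify.
gap_only = conn.execute(base + "WHERE p2.year = p1.year + 5").fetchall()

# Both conditions: exactly the 2010 -> 2015 pair remains.
both = conn.execute(
    base + "WHERE p1.year = 2010 AND p2.year = p1.year + 5"
).fetchall()

print(len(unfiltered), len(gap_only), both)
```

With only the gap condition, the column alias `size2010` would be misleading for the 2005 row, which is why the start year is pinned as well.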


### WHERE Subqueries:

**Objective:** We are interested in analyzing the `inflation` and `unemployment` rates for certain countries in 2015.

We are not interested in countries with a form of government that includes the terms `Republic` or `Monarchy`, but are interested in all other forms of government.

In [3]:
-- Select relevant fields
SELECT code, inflation_rate, unemployment_rate
FROM world.economies
WHERE year = 2015
  AND code NOT IN -- Subquery returning country codes filtered on gov_form
    (SELECT code
     FROM world.countries
     WHERE gov_form LIKE '%Monarchy%' OR gov_form LIKE '%Republic%')
ORDER BY inflation_rate;

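|  | code | inflation_rate | unemployment_rate |
| --- | --- | --- | --- |
| 0 | AFG | -1.549 |  |
| 1 | CHE | -1.14 | 3.178 |
| 2 | PRI | -0.751 | 12.0 |
| 3 | ROU | -0.596 | 6.812 |
| 4 | TLS | 0.553 |  |
| 5 | MNE | 1.204 |  |
| 6 | SRB | 1.392 | 18.2 |
| 7 | HKG | 3.037 | 3.296 |
| 8 | ARE | 4.07 |  |
| 9 | MAC | 4.564 | 1.825 |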
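
The `NOT IN` subquery pattern can be tried on toy data. The sketch below uses hypothetical `countries` and `economies` tables (codes `AAA`, `BBB`, `CCC` and their values are invented) in an in-memory SQLite database. It also illustrates a general SQL caveat that the course query does not run into as far as the output shows: if the subquery returns a `NULL`, `NOT IN` matches no rows at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical toy data, not the course's `world` schema.
    CREATE TABLE countries (code TEXT, gov_form TEXT);
    CREATE TABLE economies (code TEXT, inflation_rate REAL);
    INSERT INTO countries VALUES
        ('AAA', 'Republic'),
        ('BBB', 'Constitutional Monarchy'),
        ('CCC', 'Federation');
    INSERT INTO economies VALUES ('AAA', 1.0), ('BBB', 2.0), ('CCC', 3.0);
""")

# Keep only economies whose country is neither a republic nor a monarchy.
kept = conn.execute("""
    SELECT code, inflation_rate
    FROM economies
    WHERE code NOT IN
        (SELECT code
         FROM countries
         WHERE gov_form LIKE '%Monarchy%' OR gov_form LIKE '%Republic%')
    ORDER BY inflation_rate
""").fetchall()

# Caveat: a NULL in the subquery result makes NOT IN unknown for every
# non-matching row, so nothing is returned.
with_null = conn.execute("""
    SELECT code
    FROM economies
    WHERE code NOT IN (SELECT NULL UNION ALL SELECT 'AAA')
""").fetchall()

print(kept, with_null)
```

If the subquery column can contain `NULL`, filtering it with `IS NOT NULL` inside the subquery avoids the empty result.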


**Objective:** Determine the top 10 capital cities in `Europe` and the `Americas` by `city_perc`, a calculated metric. `city_perc` expresses the population of the city `proper` as a percentage of the total population of the wider `metro` area.

In [4]:
-- Select fields from cities
SELECT name, country_code, city_proper_pop, metroarea_pop,
    city_proper_pop / metroarea_pop * 100 AS city_perc
FROM world.cities

-- Use subquery to filter city name
WHERE name IN
    (SELECT capital
    FROM world.countries
    WHERE continent = 'Europe' OR region LIKE '%America')

-- Add filter condition such that metroarea_pop does not have null values
AND metroarea_pop IS NOT NULL

-- Sort and limit the result
ORDER BY city_perc DESC
LIMIT 10;

|  | name | country_code | city_proper_pop | metroarea_pop | city_perc |
| --- | --- | --- | --- | --- | --- |
| 0 | Lima | PER | 8852000 | 10750000 | 82.344186 |
| 1 | Bogota | COL | 7878783 | 9800000 | 80.395746 |
| 2 | Moscow | RUS | 12197596 | 16170000 | 75.433493 |
| 3 | Vienna | AUT | 1863881 | 2600000 | 71.687728 |
| 4 | Montevideo | URY | 1305082 | 1947604 | 67.009616 |
| 5 | Caracas | VEN | 1943901 | 2923959 | 66.481817 |
| 6 | Rome | ITA | 2877215 | 4353775 | 66.085523 |
| 7 | Brasilia | BRA | 2556149 | 3919864 | 65.210146 |
| 8 | London | GBR | 8673713 | 13879757 | 62.491822 |
| 9 | Budapest | HUN | 1759407 | 2927944 | 60.090184 |
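
The `city_perc` formula can be checked by hand against the Lima row above. The output shows fractional percentages, so the course columns evidently do not behave as integers there. As a hedged aside, the sketch below shows what would happen if such columns were stored as integers in an engine with integer division (SQLite here; PostgreSQL behaves the same way for integer operands), and how multiplying by `100.0` first sidesteps it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Lima's figures from the output table, written as integer literals.
# Integer / integer truncates to 0 before the multiplication happens.
truncated = conn.execute(
    "SELECT 8852000 / 10750000 * 100"
).fetchone()[0]

# Multiplying by 100.0 first forces floating-point arithmetic.
city_perc = conn.execute(
    "SELECT 8852000 * 100.0 / 10750000"
).fetchone()[0]

print(truncated, round(city_perc, 6))
```

The second expression agrees with the `city_perc` value listed for Lima once rounded to six decimal places.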
