# Data Visualization with Modern Data Science

> Querying Data with SQL

Yao-Jen Kuo <yaojenkuo@ntu.edu.tw> from [DATAINPOINT](https://www.datainpoint.com)

In [1]:
%LOAD sqlite3 db=data/covid19.db timeout=2 shared_cache=true

## The Elements of a SQL Statement

## (Recap) What is a SQL statement

A SQL statement is a combination of

- Keywords.
- Object names (e.g. databases/tables/columns/functions).
- Constants.
- Operators.

## A SQL statement

Try to identify the keywords, object names, constants, and operators in the following statement.

In [2]:
SELECT lookup_table.Country_Region,
       SUM(daily_report.Confirmed)*100000 / SUM(lookup_table.Population) AS Incidence_Rate
  FROM daily_report
  JOIN lookup_table
    ON daily_report.Combined_Key = lookup_table.Combined_Key
 GROUP BY lookup_table.Country_Region
 ORDER BY Incidence_Rate DESC
 LIMIT 5;

Country_Region,Incidence_Rate
Andorra,49180
Denmark,47297
Slovenia,42936
San Marino,42338
Israel,41990


## The list of SQL keywords

- [SQLite Keywords](https://www.sqlite.org/lang_keywords.html)
- [Standard SQL Keywords](https://www.w3schools.com/sql/sql_ref_keywords.asp)

## The list of tables within the connected database

In [3]:
SELECT name
  FROM sqlite_master
 WHERE name NOT LIKE 'sqlite%';

name
lookup_table
daily_report
time_series


## The list of columns of a specified table

In [4]:
SELECT name
  FROM PRAGMA_TABLE_INFO('daily_report');

name
Combined_Key
Last_Update
Confirmed
Deaths


In [5]:
SELECT name
  FROM PRAGMA_TABLE_INFO('lookup_table');

name
UID
Combined_Key
iso2
iso3
Country_Region
Province_State
Admin2
Lat
Long_
Population


In [6]:
SELECT name
  FROM PRAGMA_TABLE_INFO('time_series');

name
Date
Country_Region
Confirmed
Deaths
Daily_Cases
Daily_Deaths


## Calculated fields

## What is a calculated field

A calculated field is derived from the data in existing fields of the connected database.

## Several ways to generate a calculated field

- Functions.
- Constants.
- Operators.
- `CASE` statement.

## What is a function

In the context of programming, a function is a named sequence of statements that performs a desired operation.

## Using functions with SQL

```sql
SELECT FUNCTION_NAME(columns/constants, parameters);
```

## Functions in SQLite

- Scalar functions.
    - For type checking.
    - For numerics.
    - For texts.
- Date and time functions.
- Aggregate functions.

## Useful functions for type checking

- `TYPEOF(X)`: a scalar function that returns a string indicating the datatype of `X`.
- `PRAGMA_TABLE_INFO(table_name)`: a table-valued function, used in the `FROM` clause, that returns one row for each column in the named table.

## `TYPEOF(X)` function for a certain column

In [7]:
SELECT TYPEOF(iso2),
       TYPEOF(Lat),
       TYPEOF(Population)
  FROM lookup_table
 LIMIT 1;

TYPEOF(iso2),TYPEOF(Lat),TYPEOF(Population)
text,real,integer


## `PRAGMA_TABLE_INFO(table_name)` for an entire table

In [8]:
SELECT *
  FROM PRAGMA_TABLE_INFO('lookup_table');

cid,name,type,notnull,dflt_value,pk
0,UID,INTEGER,0,,1
1,Combined_Key,TEXT,0,,0
2,iso2,TEXT,0,,0
3,iso3,TEXT,0,,0
4,Country_Region,TEXT,0,,0
5,Province_State,TEXT,0,,0
6,Admin2,TEXT,0,,0
7,Lat,REAL,0,,0
8,Long_,REAL,0,,0
9,Population,INTEGER,0,,0
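
The same metadata queries can be tried without the course database. The following is a minimal sketch, assuming Python's built-in `sqlite3` module and an in-memory database; the table `demo` and its columns are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
# `demo` is a hypothetical table used only for this illustration.
conn.execute("CREATE TABLE demo (uid INTEGER PRIMARY KEY, name TEXT, lat REAL)")

# PRAGMA_TABLE_INFO is used like a table in the FROM clause.
columns = conn.execute(
    "SELECT name, type, pk FROM PRAGMA_TABLE_INFO('demo')"
).fetchall()
# sqlite_master lists the objects stored in the database.
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
conn.close()

print(columns)  # [('uid', 'INTEGER', 1), ('name', 'TEXT', 0), ('lat', 'REAL', 0)]
print(tables)   # [('demo',)]
```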


## Using constants to examine datatype

- `TEXT`
- `INTEGER`
- `REAL`
- `NULL`

## Using `AS` for alias of calculated fields

In [9]:
SELECT 'I am a TEXT' AS text_constant,
       5566 AS integer_constant,
       3.14159 AS real_constant,
       NULL AS null_constant;

text_constant,integer_constant,real_constant,null_constant
I am a TEXT,5566,3.14159,


In [10]:
SELECT TYPEOF('I am a TEXT') AS text_constant,
       TYPEOF(5566) AS integer_constant,
       TYPEOF(3.14159) AS real_constant,
       TYPEOF(NULL) AS null_constant;

text_constant,integer_constant,real_constant,null_constant
text,integer,real,null


## Using `CAST(X AS datatype)` to convert data type for query results

Before casting.

In [11]:
SELECT Lat,
       Population
  FROM lookup_table
 LIMIT 1;

Lat,Population
33.93911,38928341


## Using `CAST(X AS datatype)` to convert data type for query results

After casting.

In [12]:
SELECT CAST(Lat AS INTEGER),
       CAST(Population AS REAL)
  FROM lookup_table
 LIMIT 1;

CAST(Lat AS INTEGER),CAST(Population AS REAL)
33,38928341.0
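
To see how `CAST` treats different inputs, here is a small sketch that runs a few casts on constants through Python's built-in `sqlite3` module (an assumption of this example, not part of the course setup). Note that casting a `REAL` to `INTEGER` truncates toward zero rather than rounding.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # no tables needed, only constants
row = conn.execute(
    """
    SELECT CAST(33.93911 AS INTEGER),
           CAST(-71.9499 AS INTEGER),
           CAST(38928341 AS REAL),
           CAST('5566' AS INTEGER)
    """
).fetchone()
conn.close()

# Positive and negative reals both lose their fractional part,
# the integer becomes a float, and the numeric text becomes an integer.
print(row)  # (33, -71, 38928341.0, 5566)
```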


## Operators

- Numeric operators
    - `+`, `-`, `*`, `/`
- Text operators
    - `||`
- Relational operators
    - `=`, `!=`, `>`, `>=`, `<`, `<=`, `IN`, `BETWEEN`, `IS NULL`
- Logical operators
    - `AND`, `OR`, `NOT`

## Beware of integer division when using `/`

In [13]:
SELECT 2/5,
       2*1.0/5,
       2/5*1.0,
       2/(5*1.0);

2/5,2*1.0/5,2/5*1.0,2/(5*1.0)
0,0.4,0.0,0.4
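
The rule behind these results is that `/` between two integers yields an integer, so at least one operand must be `REAL` before the division happens. A minimal sketch, assuming Python's built-in `sqlite3` module, that also shows `CAST` as an alternative to multiplying by `1.0`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
row = conn.execute(
    """
    SELECT 2 / 5,
           2 * 1.0 / 5,
           CAST(2 AS REAL) / 5,
           2 / 5 * 1.0
    """
).fetchone()
conn.close()

# The last expression is 0.0 because 2 / 5 is evaluated first, as integers.
print(row)  # (0, 0.4, 0.4, 0.0)
```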


## Using `||` to concatenate texts

In [14]:
SELECT 'Tony' || ' ' || 'Stark' AS ironman;

ironman
Tony Stark
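
One caveat with `||` is that concatenating with `NULL` yields `NULL`. The sketch below, which assumes Python's built-in `sqlite3` module, shows the behaviour and one possible workaround with `COALESCE`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
row = conn.execute(
    """
    SELECT 'Tony' || ' ' || 'Stark',
           'Tony' || NULL,
           'Tony' || COALESCE(NULL, '')
    """
).fetchone()
conn.close()

# SQL NULL is returned to Python as None.
print(row)  # ('Tony Stark', None, 'Tony')
```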


## Using `IS NULL` to find rows with missing values

In [15]:
SELECT *
  FROM lookup_table
 WHERE Province_State IS NULL
 LIMIT 3;

UID,Combined_Key,iso2,iso3,Country_Region,Province_State,Admin2,Lat,Long_,Population
4,Afghanistan,AF,AFG,Afghanistan,,,33.93911,67.709953,38928341.0
8,Albania,AL,ALB,Albania,,,41.1533,20.1683,2877800.0
10,Antarctica,AQ,ATA,Antarctica,,,-71.9499,23.347,


In [16]:
SELECT *
  FROM lookup_table
 WHERE Province_State IS NOT NULL
 LIMIT 3;

UID,Combined_Key,iso2,iso3,Country_Region,Province_State,Admin2,Lat,Long_,Population
16,"American Samoa, US",AS,ASM,US,American Samoa,,-14.271,-170.132,55641
60,"Bermuda, United Kingdom",BM,BMU,United Kingdom,Bermuda,,32.3078,-64.7505,62273
92,"British Virgin Islands, United Kingdom",VG,VGB,United Kingdom,British Virgin Islands,,18.4207,-64.64,30237
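
`IS NULL` is needed because a comparison such as `= NULL` never evaluates to true. A minimal sketch, assuming Python's built-in `sqlite3` module; the table `places` and its three rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `places` is a hypothetical stand-in for lookup_table.
conn.execute("CREATE TABLE places (name TEXT, province TEXT)")
conn.executemany(
    "INSERT INTO places VALUES (?, ?)",
    [("Afghanistan", None), ("Albania", None), ("Bermuda, United Kingdom", "Bermuda")],
)

# Comparing with = NULL matches nothing, even for rows where province is NULL.
equals_null = conn.execute(
    "SELECT name FROM places WHERE province = NULL"
).fetchall()
is_null = conn.execute(
    "SELECT name FROM places WHERE province IS NULL ORDER BY name"
).fetchall()
conn.close()

print(equals_null)  # []
print(is_null)      # [('Afghanistan',), ('Albania',)]
```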


## Other scalar functions for numerics and texts

<https://www.sqlite.org/lang_corefunc.html>

## Common date and time functions

- `DATE(X)`
- `TIME(X)`
- `DATETIME(X)`
- `STRFTIME(format, X)`

Source: <https://www.sqlite.org/lang_datefunc.html>

In [None]:
SELECT DATE('now') AS date_of_now;

In [None]:
SELECT TIME('now', 'localtime') AS time_of_now;

In [None]:
SELECT DATETIME('now', 'localtime') AS datetime_of_now;

## Common format strings for date/time/datetime

- `%d` day of month: 01-31.
- `%H` hour: 00-24.
- `%j` day of year: 001-366.
- `%m` month: 01-12.
- `%M` minute: 00-59.
- `%S` seconds: 00-59.
- `%w` day of week 0-6 with Sunday==0.
- `%W` week of year: 00-53.
- `%Y` year: 0000-9999.

In [None]:
SELECT DATE('now') AS date_of_now,
       TIME('now', 'localtime') AS time_of_now, 
       STRFTIME('%d', DATE('now')) AS day_part,
       STRFTIME('%H', TIME('now', 'localtime')) AS hour_part,
       STRFTIME('%j', DATE('now')) AS year_day,
       STRFTIME('%m', DATE('now')) AS month_part,
       STRFTIME('%M', TIME('now', 'localtime')) AS minute_part,
       STRFTIME('%S', TIME('now', 'localtime')) AS second_part,
       STRFTIME('%w', DATE('now')) AS weekday,
       STRFTIME('%W', DATE('now')) AS nth_week,
       STRFTIME('%Y', DATE('now')) AS year_part;
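
Because `'now'` changes on every run, a fixed date makes the format strings easier to check. The following sketch assumes Python's built-in `sqlite3` module and uses `2022-02-28`, a Monday, as the input. Note that `STRFTIME` returns text, not numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
row = conn.execute(
    """
    SELECT STRFTIME('%Y', '2022-02-28'),
           STRFTIME('%m', '2022-02-28'),
           STRFTIME('%d', '2022-02-28'),
           STRFTIME('%j', '2022-02-28'),
           STRFTIME('%w', '2022-02-28')
    """
).fetchone()
conn.close()

# year, month, day of month, day of year, day of week (Sunday is 0)
print(row)  # ('2022', '02', '28', '059', '1')
```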

## `CASE` statement is a conditional expression

We can add some "if this, then that..." logic to a SQL statement.

```sql
CASE WHEN condition_1 THEN result_1
     WHEN condition_2 THEN result_2
     ELSE result_else END AS calculated_field_alias
```
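
The conditions are checked from top to bottom and the first one that is true decides the result. A minimal sketch of this, assuming Python's built-in `sqlite3` module; the table `cases`, its rows, and the thresholds are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `cases` and its values are hypothetical.
conn.execute("CREATE TABLE cases (country TEXT, confirmed INTEGER)")
conn.executemany(
    "INSERT INTO cases VALUES (?, ?)",
    [("A", 500), ("B", 50000), ("C", 5000000)],
)

levels = conn.execute(
    """
    SELECT country,
           CASE WHEN confirmed < 1000 THEN 'low'
                WHEN confirmed < 1000000 THEN 'medium'
                ELSE 'high' END AS level
      FROM cases
     ORDER BY country
    """
).fetchall()
conn.close()

# 500 also satisfies the second condition, but the first match wins.
print(levels)  # [('A', 'low'), ('B', 'medium'), ('C', 'high')]
```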

## `CASE` statements can be used to

- Turn numeric values into categories.
- Turn categories into new categories.

## Turn numeric values into categories

In [21]:
SELECT Combined_Key,
       CASE WHEN Lat >= 0 THEN 'Northern Hemisphere'
            ELSE 'Southern Hemisphere' END AS ns_hemisphere,
       CASE WHEN Long_ >= 0 THEN 'Eastern Hemisphere'
            ELSE 'Western Hemisphere' END AS ew_hemisphere
  FROM lookup_table
 LIMIT 10;

Combined_Key,ns_hemisphere,ew_hemisphere
Afghanistan,Northern Hemisphere,Eastern Hemisphere
Albania,Northern Hemisphere,Eastern Hemisphere
Antarctica,Southern Hemisphere,Eastern Hemisphere
Algeria,Northern Hemisphere,Eastern Hemisphere
"American Samoa, US",Southern Hemisphere,Western Hemisphere
Andorra,Northern Hemisphere,Eastern Hemisphere
Angola,Southern Hemisphere,Eastern Hemisphere
Antigua and Barbuda,Northern Hemisphere,Western Hemisphere
Azerbaijan,Northern Hemisphere,Eastern Hemisphere
Argentina,Southern Hemisphere,Western Hemisphere


## Turn categories into new categories

In [22]:
SELECT Combined_Key,
       CASE WHEN Combined_Key LIKE '%, US' THEN 'US'
            ELSE 'Non-US' END AS us_non_us
  FROM lookup_table
 LIMIT 10;

Combined_Key,us_non_us
Afghanistan,Non-US
Albania,Non-US
Antarctica,Non-US
Algeria,Non-US
"American Samoa, US",US
Andorra,Non-US
Angola,Non-US
Antigua and Barbuda,Non-US
Azerbaijan,Non-US
Argentina,Non-US


## We can also roughly divide SQL functions into 2 subcategories

1. Scalar functions.
2. Aggregate functions.

## The difference between these 2 categories

The major difference is whether the number of output rows equals the number of input rows.

## The number of rows of `LOWER(iso3)` equals the number of rows of input `iso3`

`LOWER()` is a scalar function.

In [23]:
SELECT iso3,
       LOWER(iso3)
  FROM lookup_table
 LIMIT 5;

iso3,LOWER(iso3)
AFG,afg
ALB,alb
ATA,ata
DZA,dza
ASM,asm


## Aggregate functions combine values from multiple rows and return a single result based on an operation on those values

In [24]:
SELECT SUM(Confirmed) AS total_confirmed
  FROM daily_report;

total_confirmed
436993162
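
The row-count difference between the two kinds of function can be checked directly. A small sketch, assuming Python's built-in `sqlite3` module; the table `codes` and its three rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `codes` is a hypothetical three-row table.
conn.execute("CREATE TABLE codes (iso3 TEXT, confirmed INTEGER)")
conn.executemany(
    "INSERT INTO codes VALUES (?, ?)",
    [("AFG", 10), ("ALB", 20), ("DZA", 30)],
)

# Scalar function: one output row per input row.
scalar_rows = conn.execute(
    "SELECT LOWER(iso3) FROM codes ORDER BY iso3"
).fetchall()
# Aggregate function: all input rows collapse into a single output row.
aggregate_rows = conn.execute("SELECT SUM(confirmed) FROM codes").fetchall()
conn.close()

print(scalar_rows)     # [('afg',), ('alb',), ('dza',)]
print(aggregate_rows)  # [(60,)]
```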


## Common aggregate functions

- `AVG(X)`
- `COUNT(X)`: the count of the number of times that X is not NULL.
- `COUNT(*)`: the total number of rows in the group.
- `MAX(X)`
- `MIN(X)`
- `SUM(X)`

Source: <https://www.sqlite.org/lang_aggfunc.html>

In [25]:
SELECT COUNT(Province_State) AS number_of_non_nulls,
       COUNT(*) AS number_of_rows,
       COUNT(*) - COUNT(Province_State) AS number_of_nulls
  FROM lookup_table;

number_of_non_nulls,number_of_rows,number_of_nulls
4019,4218,199
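
`NULL` handling matters for the other aggregates too: `AVG(X)` and `SUM(X)` skip `NULL` values just as `COUNT(X)` does. A minimal sketch, assuming Python's built-in `sqlite3` module; the table `regions` and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `regions` is a hypothetical table with one NULL in each column.
conn.execute("CREATE TABLE regions (province TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO regions VALUES (?, ?)",
    [("X", 10), (None, 20), ("Y", None)],
)

row = conn.execute(
    """
    SELECT COUNT(province),
           COUNT(*),
           COUNT(*) - COUNT(province),
           AVG(population),
           SUM(population)
      FROM regions
    """
).fetchone()
conn.close()

# AVG divides by the 2 non-NULL populations, not by the 3 rows.
print(row)  # (2, 3, 1, 15.0, 30)
```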


## How to count the number of columns of a table

By querying metadata!

In [26]:
SELECT COUNT(*)
  FROM PRAGMA_TABLE_INFO('lookup_table');

COUNT(*)
10


## Aggregating Data with `GROUP BY`

## `GROUP BY` keyword

Used on its own, `GROUP BY` eliminates duplicate values from the results, much like `DISTINCT`. The grouped output often comes back sorted, but only an explicit `ORDER BY` guarantees the row order.

In [27]:
SELECT DISTINCT Province_State
  FROM lookup_table
 WHERE Country_Region = 'Australia'
 ORDER BY Province_State;

Province_State
""
Australian Capital Territory
New South Wales
Northern Territory
Queensland
South Australia
Tasmania
Victoria
Western Australia


## `GROUP BY` keyword (Cont'd)

In [28]:
SELECT Province_State
  FROM lookup_table
 WHERE Country_Region = 'Australia'
 GROUP BY Province_State;

Province_State
""
Australian Capital Territory
New South Wales
Northern Territory
Queensland
South Australia
Tasmania
Victoria
Western Australia
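
The equivalence of the two queries can be sketched on toy data. The example assumes Python's built-in `sqlite3` module; the table `regions` and its rows are made up, and an explicit `ORDER BY` is added to both queries so the comparison does not depend on the order in which SQLite happens to return groups.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `regions` is a hypothetical table with a duplicated province.
conn.execute("CREATE TABLE regions (country TEXT, province TEXT)")
conn.executemany(
    "INSERT INTO regions VALUES (?, ?)",
    [
        ("Australia", "Victoria"),
        ("Australia", "Tasmania"),
        ("Australia", "Victoria"),
        ("Japan", "Tokyo"),
    ],
)

distinct_rows = conn.execute(
    """
    SELECT DISTINCT province
      FROM regions
     WHERE country = 'Australia'
     ORDER BY province
    """
).fetchall()
grouped_rows = conn.execute(
    """
    SELECT province
      FROM regions
     WHERE country = 'Australia'
     GROUP BY province
     ORDER BY province
    """
).fetchall()
conn.close()

print(distinct_rows)  # [('Tasmania',), ('Victoria',)]
print(grouped_rows)   # [('Tasmania',), ('Victoria',)]
```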


## Using `GROUP BY` with `COUNT(*)`

In [29]:
SELECT Country_Region,
       COUNT(*) AS number_of_rows
  FROM lookup_table
 GROUP BY Country_Region
 ORDER BY number_of_rows DESC
 LIMIT 5;

Country_Region,number_of_rows
US,3406
Russia,85
Japan,50
Nigeria,39
India,38


## Using `GROUP BY` with `SUM()`

In [30]:
SELECT Country_Region,
       SUM(Confirmed) AS sum_confirmed
  FROM time_series
 WHERE Date = '2022-02-28'
 GROUP BY Country_Region
 LIMIT 5;

Country_Region,sum_confirmed
Afghanistan,173659
Albania,271563
Algeria,264936
Andorra,37999
Angola,98741


## Filtering an aggregate query using `HAVING`

We are already familiar with using `WHERE` for filtering, but aggregate functions cannot be used in a `WHERE` clause: `WHERE` filters individual rows before grouping, whereas aggregate functions work across the rows of each group. `HAVING` filters the groups after aggregation.
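
A short sketch of this restriction, assuming Python's built-in `sqlite3` module and a made-up table `regions`: the query with an aggregate in `WHERE` is rejected, while the `HAVING` version runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `regions` is a hypothetical table: two rows for Australia, one for Japan.
conn.execute("CREATE TABLE regions (country TEXT, province TEXT)")
conn.executemany(
    "INSERT INTO regions VALUES (?, ?)",
    [("Australia", "Victoria"), ("Australia", "Tasmania"), ("Japan", "Tokyo")],
)

try:
    conn.execute("SELECT country FROM regions WHERE COUNT(*) = 1 GROUP BY country")
    where_error = None
except sqlite3.OperationalError as exc:
    # SQLite refuses to compile an aggregate inside WHERE.
    where_error = str(exc)

having_rows = conn.execute(
    """
    SELECT country,
           COUNT(*) AS number_of_rows
      FROM regions
     GROUP BY country
    HAVING COUNT(*) = 1
     ORDER BY country
    """
).fetchall()
conn.close()

print(where_error)
print(having_rows)  # [('Japan', 1)]
```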

## Combining `GROUP BY`, `COUNT(*)`, and `HAVING`

In [31]:
SELECT Country_Region,
       COUNT(*) AS number_of_rows
  FROM lookup_table
 GROUP BY Country_Region
HAVING number_of_rows = 1
 LIMIT 5;

Country_Region,number_of_rows
Afghanistan,1
Albania,1
Algeria,1
Andorra,1
Angola,1


## Combining `GROUP BY`, `SUM()`, and `HAVING`

In [32]:
SELECT Country_Region,
       SUM(Confirmed) AS sum_confirmed
  FROM time_series
 WHERE Date = '2022-02-28'
 GROUP BY Country_Region
HAVING sum_confirmed > 10000000;

Country_Region,sum_confirmed
Brazil,28796571
France,22877926
Germany,14912626
India,42931045
Italy,12782836
Russia,16161596
Spain,10977524
Turkey,14089456
US,79044330
United Kingdom,19021076


## Sub-queries

## What is a sub-query?

A sub-query is a query nested inside another query. We enclose the sub-query in parentheses and use it wherever it is needed.

## Useful sub-query skills

- Filtering with sub-queries in a `WHERE` statement.
- Creating a calculated field with sub-queries in a `SELECT` statement.
- Looking up columns of other tables with sub-queries in a `FROM` statement.

## Filtering with sub-queries in a `WHERE` statement

Which `Country_Region` has the largest `Daily_Cases` in `time_series`?

## Which `Country_Region` has the largest `Daily_Cases` in `time_series`

- First query for the largest `Daily_Cases`.
- Second query for the `Country_Region`.

In [33]:
SELECT MAX(Daily_Cases) AS max_daily_cases
  FROM time_series;

max_daily_cases
1368167


In [34]:
SELECT Country_Region,
       Date
  FROM time_series
 WHERE Daily_Cases = 1368167;

Country_Region,Date
US,2022-01-10


## Which `Country_Region` has the largest `Daily_Cases` in `time_series` (Cont'd)

Sub-query: combining 2 queries into one.

In [35]:
SELECT Country_Region,
       Date
  FROM time_series
 WHERE Daily_Cases = (
                         SELECT MAX(Daily_Cases) AS max_daily_cases
                           FROM time_series
                     );

Country_Region,Date
US,2022-01-10
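
The same pattern on toy data, as a minimal sketch assuming Python's built-in `sqlite3` module; the table `ts` and its rows are made up for illustration. If several rows shared the maximum, the outer query would return all of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `ts` is a hypothetical, tiny stand-in for time_series.
conn.execute("CREATE TABLE ts (country TEXT, date TEXT, daily_cases INTEGER)")
conn.executemany(
    "INSERT INTO ts VALUES (?, ?, ?)",
    [
        ("US", "2022-01-09", 300),
        ("US", "2022-01-10", 900),
        ("India", "2022-01-10", 400),
    ],
)

# The inner query returns a single value that the outer WHERE compares against.
max_rows = conn.execute(
    """
    SELECT country,
           date
      FROM ts
     WHERE daily_cases = (
               SELECT MAX(daily_cases)
                 FROM ts
           )
    """
).fetchall()
conn.close()

print(max_rows)  # [('US', '2022-01-10')]
```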


## Creating a calculated field with sub-queries in a `SELECT` statement

Decomposing `Country_Region` in `lookup_table`.

## Decomposing `Country_Region` in `lookup_table`

- First query for the number of rows (denominator).
- Second query for the number of rows by `Country_Region` (numerator).

In [36]:
SELECT COUNT(*) AS number_of_rows
  FROM lookup_table;

number_of_rows
4218


In [37]:
SELECT Country_Region,
       COUNT(*) / 4218.0 AS country_region_composition
  FROM lookup_table
 GROUP BY Country_Region
 ORDER BY country_region_composition DESC
 LIMIT 5;

Country_Region,country_region_composition
US,0.807491702228544
Russia,0.0201517306780465
Japan,0.0118539592223803
Nigeria,0.0092460881934566
India,0.009009009009009


In [38]:
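-- Cast to REAL first: dividing one integer count by another would truncate every share to 0
SELECT Country_Region,
       CAST(COUNT(*) AS REAL) / (
                       SELECT COUNT(*) AS number_of_rows
                         FROM lookup_table
                  )
       AS country_region_composition
  FROM lookup_table
 GROUP BY Country_Region
 ORDER BY country_region_composition DESC
 LIMIT 5;

Country_Region,country_region_composition
US,0.807491702228544
Russia,0.0201517306780465
Japan,0.0118539592223803
Nigeria,0.0092460881934566
India,0.009009009009009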


## Looking up columns of other tables with sub-queries in a `FROM` statement

Filtering Northern hemisphere observations in `daily_report`.

## Filtering Northern hemisphere observations in `daily_report`

- First query for the `Combined_Key` values of the Northern hemisphere.
- Second query for filtering `daily_report`.
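
The sub-query for this task can sit in a `WHERE ... IN` filter or directly in the `FROM` clause as a derived table. The sketch below shows the `FROM` form on toy data, assuming Python's built-in `sqlite3` module; the tables `lookup` and `report` and their values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `lookup` and `report` are hypothetical stand-ins for lookup_table and daily_report.
conn.execute("CREATE TABLE lookup (combined_key TEXT, lat REAL)")
conn.execute("CREATE TABLE report (combined_key TEXT, confirmed INTEGER)")
conn.executemany(
    "INSERT INTO lookup VALUES (?, ?)",
    [("Taiwan", 23.7), ("Argentina", -38.4), ("Italy", 41.9)],
)
conn.executemany(
    "INSERT INTO report VALUES (?, ?)",
    [("Taiwan", 10), ("Argentina", 20), ("Italy", 30)],
)

# The parenthesised sub-query acts as a temporary table named `north`.
northern_rows = conn.execute(
    """
    SELECT north.combined_key,
           report.confirmed
      FROM (
               SELECT combined_key
                 FROM lookup
                WHERE lat >= 0
           ) AS north
      JOIN report
        ON report.combined_key = north.combined_key
     ORDER BY north.combined_key
    """
).fetchall()
conn.close()

print(northern_rows)  # [('Italy', 30), ('Taiwan', 10)]
```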

In [39]:
SELECT *
  FROM daily_report
 WHERE Combined_Key IN (
           SELECT Combined_Key
             FROM lookup_table
            WHERE Lat >= 0
       )
 LIMIT 5;

Combined_Key,Last_Update,Confirmed,Deaths
"Abbeville, South Carolina, US",2022-03-01 04:21:09,6612,68
"Abruzzo, Italy",2022-03-01 04:21:09,263351,2958
"Acadia, Louisiana, US",2022-03-01 04:21:09,14964,286
"Accomack, Virginia, US",2022-03-01 04:21:09,6902,103
"Ada, Idaho, US",2022-03-01 04:21:09,127102,989


## Joining Tables

## (Recap) What is a relational database

> A relational database is a digital database based on the relational model of data.

Source: <https://en.wikipedia.org/wiki/Relational_database>

## Why relational model

Using the relational model, we can build tables that eliminate duplicate data, are easier to maintain, and provide for increased flexibility in writing queries to get just the data we want.

## Joining tables using `JOIN...ON...` statement

The query examines both tables and returns only the rows whose values match in the columns specified in the `ON` clause; the selected columns can come from either table.

```sql
SELECT left_table.columns,
       right_table.columns
  FROM left_table 
  JOIN right_table
    ON left_table.join_key = right_table.join_key
```

## Joining `daily_report` with `lookup_table`

```sql
SELECT daily_report.columns,
       lookup_table.columns
  FROM daily_report
  JOIN lookup_table
    ON daily_report.Combined_Key = lookup_table.Combined_Key;
```

In [40]:
SELECT lookup_table.Country_Region,
       SUM(daily_report.Confirmed) AS sum_confirmed
  FROM daily_report
  JOIN lookup_table
    ON daily_report.Combined_Key = lookup_table.Combined_Key
 GROUP BY lookup_table.Country_Region
 LIMIT 5;

Country_Region,sum_confirmed
Afghanistan,173659
Albania,271563
Algeria,264936
Andorra,37999
Angola,98741


## Joining tables with key columns: Primary key

One or more columns whose values uniquely identify each row in a table.

1. The combination of values in the key columns must be unique for each row.
2. No column in the key can have missing values.

## Joining tables with key columns: Foreign key

One or more columns in a table that match the primary key of another table. A foreign key constraint ensures that we don't end up with rows in one table that have no related rows in the tables we join them to.
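
Both kinds of key can be seen rejecting bad rows in a small sketch. It assumes Python's built-in `sqlite3` module and two made-up tables, `lookup` and `report`. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
# `lookup` and `report` are hypothetical tables for this illustration.
conn.execute("CREATE TABLE lookup (combined_key TEXT PRIMARY KEY, population INTEGER)")
conn.execute(
    """
    CREATE TABLE report (
        combined_key TEXT PRIMARY KEY REFERENCES lookup (combined_key),
        confirmed INTEGER
    )
    """
)
conn.execute("INSERT INTO lookup VALUES ('Taiwan', 100)")
conn.execute("INSERT INTO report VALUES ('Taiwan', 10)")

bad_statements = [
    "INSERT INTO lookup VALUES ('Taiwan', 200)",   # duplicate primary key
    "INSERT INTO report VALUES ('Atlantis', 5)",   # no matching row in lookup
]
errors = []
for statement in bad_statements:
    try:
        conn.execute(statement)
    except sqlite3.IntegrityError as exc:
        errors.append(str(exc))
conn.close()

for message in errors:
    print(message)
```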

## We can examine key columns via metadata

In [41]:
SELECT *
  FROM PRAGMA_TABLE_INFO('daily_report')
 WHERE pk >= 1;

cid,name,type,notnull,dflt_value,pk
0,Combined_Key,TEXT,0,,1


In [42]:
SELECT *
  FROM PRAGMA_TABLE_INFO('lookup_table')
 WHERE pk >= 1;

cid,name,type,notnull,dflt_value,pk
0,UID,INTEGER,0,,1


In [43]:
SELECT *
  FROM PRAGMA_TABLE_INFO('time_series')
 WHERE pk >= 1;

cid,name,type,notnull,dflt_value,pk
0,Date,TEXT,0,,1
1,Country_Region,TEXT,0,,2


## `JOIN` returns rows from the left and the right table where matching values are found

In [44]:
SELECT tw_kr.Combined_Key,
       tw_kr.Confirmed,
       tw_sg.Population
  FROM (
           SELECT *
             FROM daily_report
            WHERE Combined_Key IN ('Taiwan', 'Korea, South') 
       )
       AS tw_kr
       JOIN
       (
           SELECT *
             FROM lookup_table
            WHERE iso2 IN ('TW', 'SG') 
       )
       AS tw_sg ON tw_kr.Combined_Key = tw_sg.Combined_Key;

Combined_Key,Confirmed,Population
Taiwan,20489,23816775


## In contrast to `JOIN`, `LEFT JOIN` returns all rows from the left table and fills the right table's columns with `NULL` when no matching values are found

```sql
SELECT left_table.columns,
       right_table.columns
  FROM left_table 
  LEFT JOIN right_table
    ON left_table.join_key = right_table.join_key
```

In [45]:
SELECT tw_kr.Combined_Key,
       tw_kr.Confirmed,
       tw_sg.Population
  FROM (
           SELECT *
             FROM daily_report
            WHERE Combined_Key IN ('Taiwan', 'Korea, South') 
       )
       AS tw_kr
       LEFT JOIN
       (
           SELECT *
             FROM lookup_table
            WHERE iso2 IN ('TW', 'SG') 
       )
       AS tw_sg ON tw_kr.Combined_Key = tw_sg.Combined_Key;

Combined_Key,Confirmed,Population
"Korea, South",3273449,
Taiwan,20489,23816775.0
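
The difference between the two joins can be reproduced on toy data. A minimal sketch, assuming Python's built-in `sqlite3` module; the tables `report` and `lookup` and their numbers are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: only 'Taiwan' appears in both.
conn.execute("CREATE TABLE report (combined_key TEXT, confirmed INTEGER)")
conn.execute("CREATE TABLE lookup (combined_key TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO report VALUES (?, ?)", [("Taiwan", 10), ("Korea, South", 20)]
)
conn.executemany(
    "INSERT INTO lookup VALUES (?, ?)", [("Taiwan", 100), ("Singapore", 200)]
)

query = """
    SELECT r.combined_key,
           r.confirmed,
           l.population
      FROM report AS r
      {join_type} lookup AS l
        ON r.combined_key = l.combined_key
     ORDER BY r.combined_key
"""
inner_rows = conn.execute(query.format(join_type="JOIN")).fetchall()
left_rows = conn.execute(query.format(join_type="LEFT JOIN")).fetchall()
conn.close()

print(inner_rows)  # [('Taiwan', 10, 100)]
print(left_rows)   # [('Korea, South', 20, None), ('Taiwan', 10, 100)]
```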


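## Other `JOIN` types that are not directly supported by older versions of SQLite

- `RIGHT JOIN`
- `FULL JOIN`

SQLite has supported both natively since version 3.39.0; on earlier versions we can emulate them as shown in the following examples.

## We can perform `RIGHT JOIN` in SQLite by swapping the left and right tables of a `LEFT JOIN`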

In [46]:
SELECT tw_kr.Combined_Key,
       tw_kr.Confirmed,
       tw_sg.Population
  FROM (
           SELECT *
             FROM lookup_table
            WHERE iso2 IN ('TW', 'SG')
       )
       AS tw_sg
       LEFT JOIN
       (
           SELECT *
             FROM daily_report
            WHERE Combined_Key IN ('Taiwan', 'Korea, South')
       )
       AS tw_kr ON tw_kr.Combined_Key = tw_sg.Combined_Key;

Combined_Key,Confirmed,Population
Taiwan,20489.0,23816775
,,5850343


## Joining tables is like concatenating tables horizontally

![Imgur](https://i.imgur.com/hq7fS67.png)

Source: [Pandas User Guide](https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html)

## We can also concatenate tables vertically via the `UNION` keyword

![Imgur](https://i.imgur.com/B7xawvp.png)

Source: [Pandas User Guide](https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html)

In [47]:
SELECT Country_Region,
       iso2 AS iso,
       'iso2' AS iso_type
  FROM lookup_table
 UNION
SELECT Country_Region,
       iso3 AS iso,
       'iso3' AS iso_type
  FROM lookup_table
 LIMIT 10;

Country_Region,iso,iso_type
Afghanistan,AF,iso2
Afghanistan,AFG,iso3
Albania,AL,iso2
Albania,ALB,iso3
Algeria,DZ,iso2
Algeria,DZA,iso3
Andorra,AD,iso2
Andorra,AND,iso3
Angola,AGO,iso3
Angola,AO,iso2
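
`UNION` removes duplicate rows from the combined result, while `UNION ALL` keeps them. A small sketch on constants, assuming Python's built-in `sqlite3` module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
union_rows = conn.execute(
    "SELECT 'TW' AS iso UNION SELECT 'TW'"
).fetchall()
union_all_rows = conn.execute(
    "SELECT 'TW' AS iso UNION ALL SELECT 'TW'"
).fetchall()
conn.close()

print(union_rows)      # [('TW',)]
print(union_all_rows)  # [('TW',), ('TW',)]
```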


## We can perform `FULL JOIN` in SQLite via `UNION`

In [48]:
SELECT tw_kr.Combined_Key,
       tw_kr.Confirmed,
       tw_sg.Population
  FROM (
           SELECT *
             FROM daily_report
            WHERE Combined_Key IN ('Taiwan', 'Korea, South') 
       )
       AS tw_kr
       LEFT JOIN
       (
           SELECT *
             FROM lookup_table
            WHERE iso2 IN ('TW', 'SG') 
       )
       AS tw_sg ON tw_kr.Combined_Key = tw_sg.Combined_Key
 UNION
SELECT tw_kr.Combined_Key,
       tw_kr.Confirmed,
       tw_sg.Population
  FROM (
           SELECT *
             FROM lookup_table
            WHERE iso2 IN ('TW', 'SG')
       )
       AS tw_sg
       LEFT JOIN
       (
           SELECT *
             FROM daily_report
            WHERE Combined_Key IN ('Taiwan', 'Korea, South')
       )
       AS tw_kr ON tw_kr.Combined_Key = tw_sg.Combined_Key;

Combined_Key,Confirmed,Population
,,5850343.0
"Korea, South",3273449.0,
Taiwan,20489.0,23816775.0
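
The same emulation can be sketched on toy data: a `LEFT JOIN` in each direction, stacked with `UNION` so that the matched row appears only once. The example assumes Python's built-in `sqlite3` module; the tables `report` and `lookup` and their numbers are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: only 'Taiwan' appears in both.
conn.execute("CREATE TABLE report (combined_key TEXT, confirmed INTEGER)")
conn.execute("CREATE TABLE lookup (combined_key TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO report VALUES (?, ?)", [("Taiwan", 10), ("Korea, South", 20)]
)
conn.executemany(
    "INSERT INTO lookup VALUES (?, ?)", [("Taiwan", 100), ("Singapore", 200)]
)

full_rows = conn.execute(
    """
    SELECT r.combined_key, r.confirmed, l.population
      FROM report AS r
      LEFT JOIN lookup AS l
        ON r.combined_key = l.combined_key
     UNION
    SELECT r.combined_key, r.confirmed, l.population
      FROM lookup AS l
      LEFT JOIN report AS r
        ON r.combined_key = l.combined_key
     ORDER BY 1
    """
).fetchall()
conn.close()

# The Singapore row has no match in report, so its key and count are NULL.
print(full_rows)
```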


## Putting what we have so far all together

The clauses of a SQL statement must appear in a fixed order, so follow this convention:

```sql
SELECT column_names,
       CASE WHEN conditions THEN result
            ...
            ELSE result_else END AS alias
  FROM left_table
  JOIN right_table
    ON left_table.join_key = right_table.join_key
 WHERE conditions
 GROUP BY column_names
HAVING aggregated_conditions
 UNION ...
 ORDER BY column_names
 LIMIT n;
```
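
As a closing illustration, the template can be filled in on toy data. This is a minimal sketch, assuming Python's built-in `sqlite3` module; the tables `report` and `lookup`, their rows, and the threshold of 10 are all made up for illustration, and `UNION` is left out to keep the query short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables and values.
conn.execute("CREATE TABLE lookup (combined_key TEXT, country TEXT, lat REAL)")
conn.execute("CREATE TABLE report (combined_key TEXT, confirmed INTEGER)")
conn.executemany(
    "INSERT INTO lookup VALUES (?, ?, ?)",
    [("A1", "Alpha", 10.0), ("A2", "Alpha", 12.0), ("B1", "Beta", 20.0), ("C1", "Gamma", -5.0)],
)
conn.executemany(
    "INSERT INTO report VALUES (?, ?)",
    [("A1", 5), ("A2", 7), ("B1", 3), ("C1", 50)],
)

summary = conn.execute(
    """
    SELECT lookup.country,
           SUM(report.confirmed) AS total,
           CASE WHEN SUM(report.confirmed) >= 10 THEN 'high'
                ELSE 'low' END AS level
      FROM report
      JOIN lookup
        ON report.combined_key = lookup.combined_key
     WHERE lookup.lat >= 0
     GROUP BY lookup.country
    HAVING SUM(report.confirmed) > 1
     ORDER BY total DESC
     LIMIT 2
    """
).fetchall()
conn.close()

# Gamma is removed by WHERE before grouping because its latitude is negative.
print(summary)  # [('Alpha', 12, 'high'), ('Beta', 3, 'low')]
```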