# Advanced SQL: Fifty Exercises

> A quick review of the data query language

[數聚點](https://www.datainpoint.com) | 郭耀仁 <yaojenkuo@datainpoint.com>

In [1]:
%LOAD mysql db=imdb user=root password='hahowsql'

## Selecting columns from a table

## With the reserved words `SELECT` and `FROM`

- `SELECT` chooses the columns to return.
- `FROM` specifies which table, in which database, to query "from".

```sql
SELECT columns
  FROM database.table;
```

In [None]:
SELECT id,
       title,
       release_year
  FROM imdb.movies;

## Or specify the database first, then select columns from the table

```sql
USE database;

SELECT columns
  FROM table;
```

In [3]:
USE imdb;

In [None]:
SELECT id,
       title,
       release_year
  FROM movies;

## Limiting the number of rows displayed

## With the reserved words `LIMIT` and `OFFSET`

- `LIMIT m` restricts the query result to `m` rows.
- `OFFSET n` skips the first `n` rows of observations.

```sql
SELECT columns
  FROM table
 LIMIT m
OFFSET n;
```

## Examples of the reserved words `LIMIT` and `OFFSET`

- For example, `LIMIT 5 OFFSET 6` shows rows 7 through 11.
- For example, `LIMIT 8 OFFSET 9` shows rows 10 through 17.
- For example, `LIMIT 6 OFFSET 4` shows rows 5 through 10.
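
As a general rule, `LIMIT m OFFSET n` shows rows `n + 1` through `n + m`. A minimal sketch of paging built on that rule, assuming the `movies` table, a hypothetical page size of 5, and sorting by `id` so that the row order is well defined (without `ORDER BY`, the order of rows is not guaranteed):

```sql
-- Page 3 at 5 rows per page: skip (3 - 1) * 5 = 10 rows,
-- then show rows 11 through 15 in id order.
SELECT id,
       title
  FROM movies
 ORDER BY id
 LIMIT 5
OFFSET 10;
```

MySQL also accepts the shorthand `LIMIT 10, 5`, where the offset comes first and the row count second.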

In [5]:
SELECT id,
       title
  FROM movies
 LIMIT 5
OFFSET 6;

id,title
7,The Lord of the Rings: The Return of the King
8,Pulp Fiction
9,The Lord of the Rings: The Fellowship of the Ring
10,"The Good, the Bad and the Ugly"
11,Forrest Gump


In [6]:
SELECT id,
       title
  FROM movies
 LIMIT 8
OFFSET 9;

id,title
10,"The Good, the Bad and the Ugly"
11,Forrest Gump
12,Fight Club
13,The Lord of the Rings: The Two Towers
14,Inception
15,Spider-Man: Across the Spider-Verse
16,Star Wars: Episode V - The Empire Strikes Back
17,The Matrix


In [7]:
SELECT id,
       title
  FROM movies
 LIMIT 6
OFFSET 4;

id,title
5,12 Angry Men
6,Schindler's List
7,The Lord of the Rings: The Return of the King
8,Pulp Fiction
9,The Lord of the Rings: The Fellowship of the Ring
10,"The Good, the Bad and the Ugly"


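## Sorting query results

## With the reserved word `ORDER BY`

- Sorts the query result by the values of the specified variables.
- The default sort order is ascending.
- Adding the reserved word `DESC` sorts the result in descending order.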

```sql
SELECT columns
  FROM table
 ORDER BY columns,
          columns DESC;
```
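
The template sorts by more than one column, and `DESC` applies only to the column it follows. A small sketch, assuming the `movies` columns used in this section: newest movies first and, within the same year, shortest runtime first.

```sql
SELECT title,
       release_year,
       runtime
  FROM movies
 ORDER BY release_year DESC,  -- newest year first
          runtime             -- ties within a year: ascending runtime
 LIMIT 5;
```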

In [8]:
SELECT title,
       runtime
  FROM movies
 ORDER BY runtime
 LIMIT 5;

title,runtime
Sherlock Jr,45
The Kid,68
The General,78
Before Sunset,80
Toy Story,81


In [9]:
SELECT title,
       runtime
  FROM movies
 ORDER BY runtime DESC
 LIMIT 5;

title,runtime
Gone with the Wind,238
Once Upon a Time in America,229
Lawrence of Arabia,218
Ben-Hur,212
Seven Samurai,207


## Filtering observations

## With the reserved word `WHERE`

Conditions serve as the basis for filtering observations.

```sql
SELECT columns
  FROM table
 WHERE conditions;
```
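
Several conditions can be combined with logical operators. A sketch, assuming the `movies` columns used in this section; the thresholds are arbitrary and chosen only for illustration.

```sql
-- Movies released from 2000 onward that run under two hours,
-- highest rated first.
SELECT title,
       release_year,
       runtime,
       rating
  FROM movies
 WHERE release_year >= 2000
   AND runtime < 120
 ORDER BY rating DESC
 LIMIT 5;
```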

In [10]:
SELECT title,
       rating,
       release_year
  FROM movies
 WHERE release_year >= 2022;

title,rating,release_year
Spider-Man: Across the Spider-Verse,8.9,2023
Oppenheimer,8.8,2023
Top Gun: Maverick,8.3,2022


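## Derived calculated columns

## Three ways to create a derived calculated column

- With operators.
- With conditional logic.
- With functions.

## Creating derived calculated columns with operators

- Kinds of operators:
    - Numeric operators, e.g. `+`.
    - Relational operators, e.g. `=`.
    - Logical operators, e.g. `AND`.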
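
A quick way to see the three kinds of operators side by side is a `SELECT` without a table. The literal values and aliases are arbitrary; in MySQL, relational and logical operators return `1` for true and `0` for false.

```sql
SELECT 7 + 2 AS numeric_result,              -- 9
       7 = 2 AS relational_result,           -- 0 (false)
       (7 > 2 AND 2 > 1) AS logical_result;  -- 1 (true)
```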

In [11]:
SELECT title,
       runtime,
       runtime DIV 60 AS hours,
       runtime % 60 AS minutes
  FROM movies
 LIMIT 5;

title,runtime,hours,minutes
The Shawshank Redemption,142,2,22
The Godfather,175,2,55
The Dark Knight,152,2,32
The Godfather Part II,202,3,22
12 Angry Men,96,1,36


## Conditional logic

## With the `CASE WHEN ... THEN ... ELSE ...` expression

```sql
SELECT CASE WHEN condition_1 THEN result_1
            WHEN condition_2 THEN result_2
            ELSE result_3
        END AS alias
  FROM table;
```
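
`CASE` checks its conditions from top to bottom and returns the result of the first one that is true. A sketch that relies on this, assuming the `movies` table and illustrative thresholds of 90 and 180 minutes: a row only reaches the second `WHEN` if the first condition was not true, so the upper bound does not need to be repeated.

```sql
SELECT title,
       runtime,
       CASE WHEN runtime > 180 THEN 'Long'
            WHEN runtime >= 90 THEN 'Medium'  -- checked only if the first condition was not true
            ELSE 'Short'
        END AS movie_length
  FROM movies
 LIMIT 5;
```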

In [12]:
SELECT title,
       runtime,
       CASE WHEN runtime > 180 THEN 'Long'
            WHEN runtime BETWEEN 90 AND 180 THEN 'Medium'
            ELSE 'Short'
        END AS movie_length
  FROM movies
 LIMIT 5;

title,runtime,movie_length
The Shawshank Redemption,142,Medium
The Godfather,175,Medium
The Dark Knight,152,Medium
The Godfather Part II,202,Long
12 Angry Men,96,Medium


## Functions

## Broad categories of functions

- General functions.
- Aggregate functions.

```sql
SELECT FUNCTION(columns, parameters) AS alias;
```
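
The difference between the two categories shows in the number of rows returned: a general function is applied to each row, while an aggregate function condenses many rows into one value. A sketch, assuming the `movies` table; `UPPER`, `ROUND`, `COUNT`, `MIN`, and `MAX` are built-in MySQL functions, and the aliases are hypothetical.

```sql
-- General functions: one result per row.
SELECT UPPER(title) AS title_upper,
       ROUND(runtime / 60, 1) AS runtime_hours
  FROM movies
 LIMIT 3;

-- Aggregate functions: one result for the whole table.
SELECT COUNT(*) AS number_of_movies,
       MIN(runtime) AS shortest_runtime,
       MAX(runtime) AS longest_runtime
  FROM movies;
```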

In [13]:
SELECT title,
       runtime,
       FLOOR(runtime / 60) AS hours,
       MOD(runtime, 60) AS minutes
  FROM movies
 LIMIT 5;

title,runtime,hours,minutes
The Shawshank Redemption,142,2,22
The Godfather,175,2,55
The Dark Knight,152,2,32
The Godfather Part II,202,3,22
12 Angry Men,96,1,36


In [14]:
SELECT AVG(runtime) AS avg_runtime
  FROM movies;

avg_runtime
129.052


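## Grouped aggregation

## With the reserved word `GROUP BY`

The distinct values of the specified variable define the groups, and the aggregate function is computed for each group.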

```sql
SELECT columns,
       AGGREGATE_FUNCTION(columns, parameters) AS alias
  FROM table
 GROUP BY columns;
```
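
More than one aggregate can be computed per group, and the grouped result can be sorted like any other. A sketch, assuming the `movies` table; the aliases are hypothetical.

```sql
-- The five release years with the most movies in the table,
-- together with the average rating of each year.
SELECT release_year,
       COUNT(*) AS number_of_movies,
       AVG(rating) AS avg_rating
  FROM movies
 GROUP BY release_year
 ORDER BY number_of_movies DESC
 LIMIT 5;
```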

In [None]:
SELECT release_year,
       AVG(runtime) AS avg_runtime
  FROM movies
 GROUP BY release_year;

## Adding the reserved word `HAVING`

Filters further on the results of the aggregate functions.

```sql
SELECT columns,
       AGGREGATE_FUNCTION(columns, parameters) AS alias
  FROM table
 GROUP BY columns
HAVING conditions;
```
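
`WHERE` and `HAVING` can appear in the same statement because they act at different stages: `WHERE` filters observations before they are grouped, and `HAVING` filters the groups after aggregation. A sketch, assuming the `movies` table and arbitrary thresholds:

```sql
SELECT release_year,
       AVG(runtime) AS avg_runtime
  FROM movies
 WHERE release_year >= 2000    -- filters rows before grouping
 GROUP BY release_year
HAVING AVG(runtime) >= 150;    -- filters groups after aggregation
```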

In [16]:
SELECT release_year,
       AVG(runtime) AS avg_runtime
  FROM movies
 GROUP BY release_year
HAVING AVG(runtime) >= 150;

release_year,avg_runtime
1972,175.0
1974,166.0
2002,150.0
2023,160.0
1946,150.0
1962,160.0
1968,157.0
1984,165.3333
1963,157.5
1983,150.5


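## Subqueries

## A structure that passes one statement's query result to another statement

- The result of a subquery can be used by another query as
    - a condition.
    - calculated content.
    - a table.

## Which movies in the imdb database were directed by Christopher Nolan?

- Query 1: look up Christopher Nolan's director id in the `directors` table.
- Query 2: using the result of query 1, look up the movie ids in the `movies_directors` table.
- Query 3: using the result of query 2, look up the movie titles in the `movies` table.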
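
The worked example in this section uses subqueries as conditions. The other two uses can be sketched as follows, assuming the `movies` table; the alias `yearly` and the column name `diff_from_avg` are hypothetical.

```sql
-- Subquery as calculated content: compare each runtime with the overall average.
SELECT title,
       runtime,
       runtime - (SELECT AVG(runtime)
                    FROM movies) AS diff_from_avg
  FROM movies
 LIMIT 5;

-- Subquery as a table: MySQL requires an alias for a derived table.
SELECT yearly.release_year,
       yearly.avg_runtime
  FROM (SELECT release_year,
               AVG(runtime) AS avg_runtime
          FROM movies
         GROUP BY release_year) AS yearly
 WHERE yearly.avg_runtime >= 150;
```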

In [17]:
SELECT id
  FROM directors
 WHERE name = 'Christopher Nolan';

id
24


In [18]:
SELECT movie_id
  FROM movies_directors
 WHERE director_id = 24;

movie_id
3
14
23
25
43
56
70
129


In [19]:
SELECT title,
       release_year
  FROM movies
 WHERE id IN (3, 14, 23, 25, 43, 56, 70, 129)
 ORDER BY release_year DESC;

title,release_year
Oppenheimer,2023
Interstellar,2014
The Dark Knight Rises,2012
Inception,2010
The Dark Knight,2008
The Prestige,2006
Batman Begins,2005
Memento,2000


## Rewritten as a subquery

In [20]:
SELECT title,
       release_year
  FROM movies
 WHERE id IN (SELECT movie_id
                FROM movies_directors
               WHERE director_id = (SELECT id
                                      FROM directors
                                     WHERE name = 'Christopher Nolan'))
 ORDER BY release_year DESC;

title,release_year
Oppenheimer,2023
Interstellar,2014
The Dark Knight Rises,2012
Inception,2010
The Dark Knight,2008
The Prestige,2006
Batman Begins,2005
Memento,2000


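## Joins

## With the reserved words `JOIN` and `ON`

- `JOIN` means to connect, combine, or merge.
- It brings columns from different tables together in one query result through a shared join key.
- The join key is usually designed as a foreign key in the child table and the primary key in the parent table.
- The primary key is usually named `id`; the foreign key is usually named after the singular form of the parent table followed by `_id`, that is, `{parent_table_singular}_id`.
- An entity-relationship diagram (ER diagram) gives a quick view of the join keys.

## Entity-relationship diagram of the `imdb` learning database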

![](imdb-erd.png)

## Entity-relationship diagram of the `covid19` learning database

![](covid19-erd.png)

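## Joins supported by MySQL

- `JOIN`: inner join; keeps the observations in the intersection of the two tables.
- `LEFT JOIN`: left outer join; keeps all observations of the left table (the one written after `FROM`).
- `RIGHT JOIN`: right outer join; keeps all observations of the right table (the one written after `JOIN`).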

```sql
SELECT columns
  FROM left_table
  JOIN | LEFT JOIN | RIGHT JOIN right_table
    ON left_table.join_key = right_table.primary_key;
```
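
An outer join differs from an inner join only for observations that have no match. A sketch of a `LEFT JOIN`, assuming the `movies` and `movies_directors` tables of the `imdb` database: every movie is kept, and the columns of the right table are `NULL` where no match exists. Whether this query returns any rows depends on the data.

```sql
-- Movies that have no entry in movies_directors.
SELECT movies.title,
       movies_directors.director_id
  FROM movies
  LEFT JOIN movies_directors
    ON movies.id = movies_directors.movie_id
 WHERE movies_directors.director_id IS NULL;
```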

In [21]:
SELECT movies.title,
       directors.name AS directed_by,
       movies.release_year
  FROM movies_directors
  JOIN movies
    ON movies_directors.movie_id = movies.id
  JOIN directors
    ON movies_directors.director_id = directors.id
 WHERE directors.name = 'Christopher Nolan'
 ORDER BY movies.release_year DESC;

title,directed_by,release_year
Oppenheimer,Christopher Nolan,2023
Interstellar,Christopher Nolan,2014
The Dark Knight Rises,Christopher Nolan,2012
Inception,Christopher Nolan,2010
The Dark Knight,Christopher Nolan,2008
The Prestige,Christopher Nolan,2006
Batman Begins,Christopher Nolan,2005
Memento,Christopher Nolan,2000


## Summary of key points

```sql
SELECT DISTINCT columns AS alias,
       CASE WHEN condition_1 THEN result_1
            WHEN condition_2 THEN result_2
            ...
            ELSE result_n END AS alias
  FROM left_table
  JOIN | LEFT JOIN | RIGHT JOIN right_table
    ON left_table.join_key = right_table.primary_key
 WHERE conditions
 GROUP BY columns
HAVING conditions
 ORDER BY columns DESC
 LIMIT m
OFFSET n;
```
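
A sketch that exercises most of the clauses in the template at once, assuming the `imdb` tables used in this section. The thresholds (1990 and 3 movies) and the aliases are arbitrary, and `OFFSET 0` is written out only to show where the clause goes.

```sql
SELECT directors.name AS director,
       COUNT(*) AS number_of_movies,
       AVG(movies.rating) AS avg_rating
  FROM movies_directors
  JOIN movies
    ON movies_directors.movie_id = movies.id
  JOIN directors
    ON movies_directors.director_id = directors.id
 WHERE movies.release_year >= 1990   -- filter rows first
 GROUP BY directors.id,
          directors.name             -- one group per director
HAVING COUNT(*) >= 3                 -- then filter groups
 ORDER BY number_of_movies DESC
 LIMIT 5
OFFSET 0;
```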