# Fifty SQL Exercises

> Filtering data

郭耀仁 <yaojenkuo@datainpoint.com>，[數據交點](https://www.datainpoint.com/)

In [1]:
%LOAD ../databases/imdb.db

In [2]:
ATTACH "../databases/nba.db" AS nba;

In [3]:
ATTACH "../databases/twElection2020.db" AS twElection2020;

In [4]:
ATTACH "../databases/covid19.db" AS covid19;

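## Filtering data with the `WHERE` clause

## So far, almost every SQL statement we have written returns "all" observations in a table as the query result.

## In practice, we more often need "specific" observations from a table, for example:

- Finding the players we want to pick for a Fantasy Game in the nba database.
- Finding the classic movies released in 1994 in the imdb database.
- Finding the data for Taipei City in the twElection2020 database.

## Adding a `WHERE` clause lets us filter observations based on a "condition"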

```sql
-- WHERE clause
SELECT column_name
  FROM table_name
 WHERE condition;
```
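
As a minimal sketch of the template above, the following Python snippet runs a `WHERE` query through the standard-library `sqlite3` module. The three-row `teams` table is a hand-typed stand-in for illustration, not the course's `nba.db`.

```python
import sqlite3

# In-memory stand-in for the teams table (three hand-typed rows; an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teams (fullName TEXT, confName TEXT)")
conn.executemany(
    "INSERT INTO teams VALUES (?, ?)",
    [
        ("Atlanta Hawks", "East"),
        ("Boston Celtics", "East"),
        ("Golden State Warriors", "West"),
    ],
)

# Only the rows for which the condition holds are returned.
east_teams = [
    row[0]
    for row in conn.execute(
        "SELECT fullName FROM teams WHERE confName = 'East' ORDER BY fullName"
    )
]
print(east_teams)  # ['Atlanta Hawks', 'Boston Celtics']
```

The row for the Western Conference team is dropped because its condition does not hold.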

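## When writing a condition, we need to understand two concepts:

1. Comparison operators: operators that produce a Boolean.
2. Boolean: one of the two values (true, false) that represent the result of a comparison.

## Basic comparison operators

|Comparison operator|Description|
|--------|-------|
|`=`|Equal to|
|`!=`|Not equal to|
|`>`|Greater than|
|`<`|Less than|
|`>=`|Greater than or equal to|
|`<=`|Less than or equal to|

## SQLite represents a Boolean whose comparison result is "true" as `1`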

In [5]:
SELECT 55 = 55 AS bool_true;

bool_true
1


## SQLite represents a Boolean whose comparison result is "false" as `0`

In [6]:
SELECT 55 != 55 AS bool_false;

bool_false
0


## Writing a condition on a table column produces a query result made up of Booleans

In [7]:
SELECT confName = 'East' AS bool
  FROM teams;

bool
1
1
1
0
1
0
0
0
0
0


## Placing the condition after `WHERE` returns only the observations whose Boolean is true (`1`)

In [8]:
SELECT confName = 'East' AS bool
  FROM teams
 WHERE confName = 'East';

bool
1
1
1
1
1
1
1
1
1
1


In [9]:
-- Filter the Eastern Conference teams
SELECT fullName
  FROM teams
 WHERE confName = 'East';

fullName
Atlanta Hawks
Boston Celtics
Cleveland Cavaliers
Chicago Bulls
Miami Heat
Milwaukee Bucks
Brooklyn Nets
New York Knicks
Orlando Magic
Indiana Pacers


## Conditions can be written not only on text variables but also on numeric variables

In [10]:
SELECT personId,
       ppg > 20 AS bool
  FROM career_summaries
 LIMIT 10;

personId,bool
2544,1
2546,1
2617,0
2730,0
2738,0
2772,0
101108,0
101150,0
200746,0
200752,0


In [11]:
-- Filter players whose career points per game exceed 20
SELECT personId,
       ppg
  FROM career_summaries
 WHERE ppg > 20;

personId,ppg
2544,27.0
2546,23.4
201142,27.1
201566,23.2
201933,21.5
201935,25.2
201939,23.6
202326,20.9
202331,20.1
202681,22.6


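## Pattern matching

## Besides the basic comparison operators, conditions on text variables can also use `LIKE`, a comparison operator that performs pattern matching.

## The `LIKE` comparison operator is used together with wildcards

|Wildcard|Description|
|-------|------|
|`%`|Matches any sequence of characters, including the empty string|
|`_`|Matches exactly one character|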

In [12]:
-- Filter players whose first name starts with L
SELECT firstName
  FROM players
 WHERE firstName LIKE 'L%';

firstName
LeBron
Lou
LaMarcus
Langston
Larry
Lonzo
Lauri
Luke
Luke
Landry


In [13]:
-- Filter players whose first name is L followed by five characters
SELECT firstName
  FROM players
 WHERE firstName LIKE 'L_____'; -- five underscores

firstName
LeBron
Landry
Lonnie
LaMelo


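## Logical operators

## When more than one condition follows `WHERE`, the conditions must be connected with logical operators.

## The basic logical operators are:

- `AND`: the intersection of conditions.
- `OR`: the union of conditions.
- `BETWEEN`: the intersection of two numeric range conditions.
- `IN`: the union of several text equality conditions.
- `NOT`: reverses true and false.

## When two conditions are joined with `AND`, the result is true (`1`) only if both are true; otherwise it is false (`0`)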

In [14]:
SELECT heightMeters >= 1.9 AS condition_1,
       heightMeters <= 2.0 AS condition_2,
       heightMeters >= 1.9 AND heightMeters <= 2.0 AS condition_1_and_2
  FROM players
 LIMIT 10;

condition_1,condition_2,condition_1_and_2
1,0,0
1,0,0
1,0,0
1,0,0
1,1,1
1,0,0
0,1,0
0,1,0
1,0,0
1,0,0


In [15]:
SELECT heightMeters
  FROM players
 WHERE heightMeters >= 1.9 AND 
       heightMeters <= 2.0
 LIMIT 10;

heightMeters
1.98
1.9
1.96
1.98
1.9
1.9
1.93
1.9
1.96
1.9


## When `AND` joins two numeric comparison conditions, the `BETWEEN` logical operator can shorten the code

In [16]:
SELECT heightMeters
  FROM players
 WHERE heightMeters BETWEEN 1.9 AND 2.0
 LIMIT 10;

heightMeters
1.98
1.9
1.96
1.98
1.9
1.9
1.93
1.9
1.96
1.9
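
`BETWEEN` includes both endpoints, which is why it can stand in for the `>=` and `<=` pair. A minimal sketch on five hand-typed heights (an assumption, not the course data) checks that the two forms return the same rows:

```python
import sqlite3

# In-memory stand-in for the players table (hand-typed heights; an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (heightMeters REAL)")
conn.executemany(
    "INSERT INTO players VALUES (?)",
    [(1.85,), (1.9,), (1.96,), (2.0,), (2.06,)],
)

with_and = conn.execute(
    "SELECT heightMeters FROM players "
    "WHERE heightMeters >= 1.9 AND heightMeters <= 2.0 "
    "ORDER BY heightMeters"
).fetchall()

with_between = conn.execute(
    "SELECT heightMeters FROM players "
    "WHERE heightMeters BETWEEN 1.9 AND 2.0 "
    "ORDER BY heightMeters"
).fetchall()

print(with_between)              # [(1.9,), (1.96,), (2.0,)]
print(with_and == with_between)  # True
```

The boundary values 1.9 and 2.0 both appear in the result, while 1.85 and 2.06 are excluded.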


## When two conditions are joined with `OR`, the result is false (`0`) only if both are false; otherwise it is true (`1`)

In [17]:
SELECT divName = 'Atlantic' AS condition_1,
       divName = 'Pacific' AS condition_2,
       divName = 'Atlantic' OR divName = 'Pacific' AS condition_1_or_2
  FROM teams;

condition_1,condition_2,condition_1_or_2
0,0,0
1,0,1
0,0,0
0,0,0
0,0,0
0,0,0
0,0,0
0,1,1
0,0,0
0,1,1


In [18]:
SELECT fullName,
       divName
  FROM teams
 WHERE divName = 'Atlantic' OR
       divName = 'Pacific';

fullName,divName
Boston Celtics,Atlantic
Golden State Warriors,Pacific
LA Clippers,Pacific
Los Angeles Lakers,Pacific
Brooklyn Nets,Atlantic
New York Knicks,Atlantic
Philadelphia 76ers,Atlantic
Phoenix Suns,Pacific
Sacramento Kings,Pacific
Toronto Raptors,Atlantic


## When `OR` joins two text comparison conditions, the `IN` logical operator can shorten the code

In [19]:
SELECT fullName,
       divName
  FROM teams
 WHERE divName IN ('Atlantic', 'Pacific');

fullName,divName
Boston Celtics,Atlantic
Golden State Warriors,Pacific
LA Clippers,Pacific
Los Angeles Lakers,Pacific
Brooklyn Nets,Atlantic
New York Knicks,Atlantic
Philadelphia 76ers,Atlantic
Phoenix Suns,Pacific
Sacramento Kings,Pacific
Toronto Raptors,Atlantic


## `NOT` reverses the result of a condition, swapping true and false

In [20]:
SELECT 55 = 55 AS true,
       NOT 55 = 55 AS not_true;

true,not_true
1,0


In [21]:
SELECT divName = 'Atlantic' AS condition_1,
       divName = 'Pacific' AS condition_2,
       NOT (divName = 'Atlantic' OR divName = 'Pacific') AS not_condition_1_or_2
  FROM teams;

condition_1,condition_2,not_condition_1_or_2
0,0,1
1,0,0
0,0,1
0,0,1
0,0,1
0,0,1
0,0,1
0,1,0
0,0,1
0,1,0


In [22]:
SELECT fullName,
       divName
  FROM teams
 WHERE divName NOT IN ('Atlantic', 'Pacific');

fullName,divName
Atlanta Hawks,Southeast
Cleveland Cavaliers,Central
New Orleans Pelicans,Southwest
Chicago Bulls,Central
Dallas Mavericks,Southwest
Denver Nuggets,Northwest
Houston Rockets,Southwest
Miami Heat,Southeast
Milwaukee Bucks,Central
Minnesota Timberwolves,Northwest


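## Comparison operators for missing values

## The basic comparison operators do not work on `NULL`, the missing value (also called a null value): the comparison returns `NULL` rather than `1` or `0`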

In [23]:
SELECT NULL = NULL AS null_equals_null,
       NULL != NULL AS null_does_not_equal_null;

null_equals_null,null_does_not_equal_null
,


## To test for a missing value, we must use `IS NULL` as the comparison operator

In [24]:
SELECT NULL IS NULL AS null_is_null,
       NULL IS NOT NULL AS null_is_not_null;

null_is_null,null_is_not_null
1,0


In [25]:
SELECT Province_State,
       Country_Region
  FROM daily_report
 WHERE Province_State = NULL;

Province_State,Country_Region


In [26]:
SELECT Province_State,
       Country_Region
  FROM daily_report
 WHERE Province_State IS NULL
 LIMIT 10;

Province_State,Country_Region
,Afghanistan
,Albania
,Algeria
,Andorra
,Angola
,Antigua and Barbuda
,Argentina
,Armenia
,Austria
,Azerbaijan


In [27]:
SELECT Province_State,
       Country_Region
  FROM daily_report
 WHERE Province_State IS NOT NULL
 LIMIT 10;

Province_State,Country_Region
Australian Capital Territory,Australia
New South Wales,Australia
Northern Territory,Australia
Queensland,Australia
South Australia,Australia
Tasmania,Australia
Victoria,Australia
Western Australia,Australia
Antwerp,Belgium
Brussels,Belgium


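## Key takeaways

- Adding a `WHERE` clause lets us filter observations based on a "condition".
- Conditions on text variables can use the pattern-matching operator `LIKE` together with wildcards.
- When more than one condition follows `WHERE`, the conditions must be connected with logical operators.
- To test for a missing value, we must use `IS NULL` as the comparison operator.

## The SQL we know so far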

```sql
SELECT DISTINCT column_name AS alias,
       column_name (+, -, *, /, %, ||) column_name AS alias,
       FUNCTION_NAME(column_name) AS alias
  FROM table_name
 WHERE condition (AND, OR, NOT, BETWEEN, IN, IS NULL)
       condition
 ORDER BY column_name ASC|DESC
 LIMIT m;
```