# Fifty SQL Exercises

> Demo exercises

郭耀仁 <yaojenkuo@datainpoint.com>, [數據交點](https://www.datainpoint.com)

In [1]:
import sqlite3
import unittest
import numpy as np
import pandas as pd
conn = sqlite3.connect('databases/nba.db')
conn.execute("""ATTACH 'databases/covid19.db' AS covid19""")
conn.execute("""ATTACH 'databases/twElection2020.db' AS twElection2020""")
conn.execute("""ATTACH 'databases/imdb.db' AS imdb""")

<sqlite3.Cursor at 0x7fc9a6a6f5e0>
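
The setup cell opens `nba.db` as the main database and attaches three more files under their own schema names. As a minimal sketch of how an attached schema is addressed, the following uses in-memory databases and a hypothetical table `demo` instead of the course files; the names `demo_conn`, `extra`, and `demo` are assumptions made for illustration only.

```python
import sqlite3

# In-memory stand-ins for the course database files (assumption for illustration).
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("ATTACH ':memory:' AS extra")

# Tables in an attached database are addressed as schema_name.table_name.
demo_conn.execute("CREATE TABLE extra.demo (id INTEGER, label TEXT)")
demo_conn.execute("INSERT INTO extra.demo VALUES (1, 'a'), (2, 'b')")

# A schema-qualified lookup.
qualified = demo_conn.execute("SELECT label FROM extra.demo ORDER BY id").fetchall()

# An unqualified name also resolves when it is unique across the attached databases.
unqualified = demo_conn.execute("SELECT COUNT(*) FROM demo").fetchone()[0]

print(qualified, unqualified)
```

In the notebook itself this means a table can be referenced either by its bare name or with its schema prefix, for example `covid19.daily_report`, as long as the bare name is unambiguous.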

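## From the 2020 presidential election table `presidential`, compute how many votes each of the three tickets received, ordered by ticket number.

- Expected input: a SQL query.
- Expected output: a query result of shape (3, 3).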

```
  number candidate  total_votes
0      1    宋楚瑜/余湘       608590
1      2   韓國瑜/張善政      5522119
2      3   蔡英文/賴清德      8170231
```
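
The expected output has one summed row per ticket, which suggests a `GROUP BY` with `SUM` followed by `ORDER BY`. The sketch below shows that pattern on a toy table; the table `ballots` and its columns are hypothetical and are not the schema of `presidential`, so it is a hint rather than the answer.

```python
import sqlite3

toy = sqlite3.connect(":memory:")
# Hypothetical schema: one row per polling place and ticket.
toy.execute("CREATE TABLE ballots (number INTEGER, candidate TEXT, votes INTEGER)")
toy.executemany(
    "INSERT INTO ballots VALUES (?, ?, ?)",
    [(2, "B", 30), (1, "A", 10), (2, "B", 5), (1, "A", 7), (3, "C", 40)],
)

toy_query = """
SELECT number,
       candidate,
       SUM(votes) AS total_votes
  FROM ballots
 GROUP BY number, candidate
 ORDER BY number;
"""
# Rows are collapsed per ticket, summed, then sorted by ticket number.
totals = toy.execute(toy_query).fetchall()
print(totals)
```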

In [2]:
summarize_presidential_2020 =\
"""
-- Start of SQL query
-- End of SQL query
"""

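## From the 2020-12-31 daily report table `daily_report`, compute the number of confirmed COVID-19 cases for each country, sorted from largest to smallest (descending order).

- Expected input: a SQL query.
- Expected output: a query result of shape (191, 2).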

```
       Country_Region  n_confirmed
0                  US     20032035
1               India     10266674
2              Brazil      7675973
3              Russia      3127347
4              France      2677666
..                ...          ...
186   Solomon Islands           17
187        MS Zaandam            9
188  Marshall Islands            4
189             Samoa            2
190           Vanuatu            1

[191 rows x 2 columns]
```

In [3]:
aggregate_confirmed_by_countries =\
"""
-- Start of SQL query
-- End of SQL query
"""

## From the NBA players table `players`, find the tallest player.

- Expected input: a SQL query.
- Expected output: a query result of shape (1, 3).

```
  firstName lastName  heightMeters
0     Tacko     Fall          2.26
```
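
One common way to select the row that holds a maximum is to compare the column against a scalar subquery. The sketch below shows the idea on a toy table; `people` and its columns are hypothetical and do not reflect the actual `players` schema.

```python
import sqlite3

toy = sqlite3.connect(":memory:")
# Hypothetical schema for illustration only.
toy.execute("CREATE TABLE people (first_name TEXT, last_name TEXT, height REAL)")
toy.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [("Ann", "Lee", 1.70), ("Bo", "Chen", 1.92), ("Cy", "Wu", 1.85)],
)

# The inner query yields a single value; the outer query keeps the rows that match it.
tallest = toy.execute(
    """
    SELECT first_name, last_name, height
      FROM people
     WHERE height = (SELECT MAX(height) FROM people);
    """
).fetchall()
print(tallest)
```

Sorting with `ORDER BY height DESC LIMIT 1` is an alternative, though it returns only one row when several rows tie for the maximum.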

In [4]:
find_the_tallest_nba_player =\
"""
-- Start of SQL query
-- End of SQL query
"""

## From the table of 250 top-rated movies, `top_rated_movies`, find the movies in which actor Tom Hanks appears.

- Expected input: a SQL query.
- Expected output: a query result of shape (6, 3).

```
                 title  release_year  rating
0         Forrest Gump          1994     8.8
1            Toy Story          1995     8.3
2  Saving Private Ryan          1998     8.6
3       The Green Mile          1999     8.6
4  Catch Me If You Can          2002     8.1
5          Toy Story 3          2010     8.2

```
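
If the cast is stored as free text in a single column, a `LIKE` pattern is one way to filter on a name. That storage layout is an assumption: the source does not show how `top_rated_movies` records its actors. The table `films` and the column `cast_list` below are hypothetical.

```python
import sqlite3

toy = sqlite3.connect(":memory:")
# Hypothetical schema: the cast is kept as comma-separated text.
toy.execute("CREATE TABLE films (title TEXT, release_year INTEGER, cast_list TEXT)")
toy.executemany(
    "INSERT INTO films VALUES (?, ?, ?)",
    [
        ("Film A", 2001, "Actor X, Actor Y"),
        ("Film B", 1999, "Actor Y, Actor Z"),
        ("Film C", 2005, "Actor Z"),
    ],
)

# % matches any run of characters, so the name may appear anywhere in the text.
matches = toy.execute(
    """
    SELECT title, release_year
      FROM films
     WHERE cast_list LIKE ?
     ORDER BY release_year;
    """,
    ("%Actor Y%",),
).fetchall()
print(matches)
```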

In [5]:
find_tom_hanks_is_casting =\
"""
-- Start of SQL query
-- End of SQL query
"""

## Run the tests!

Kernel -> Restart & Run All.

In [6]:
class TestDemoExercises(unittest.TestCase):
    def test_summarize_presidential_2020(self):
        presidential_2020 = pd.read_sql(summarize_presidential_2020, conn)
        self.assertEqual(presidential_2020.shape, (3, 3))
        np.testing.assert_equal(presidential_2020['candidate'].values,
                                 np.array(['宋楚瑜/余湘', '韓國瑜/張善政', '蔡英文/賴清德']))
        np.testing.assert_equal(presidential_2020['total_votes'].values,
                                 np.array([608590, 5522119, 8170231]))
    def test_find_the_tallest_nba_player(self):
        the_tallest_nba_player = pd.read_sql(find_the_tallest_nba_player, conn)
        self.assertEqual(the_tallest_nba_player.shape, (1, 3))
        self.assertEqual(the_tallest_nba_player['firstName'][0], 'Tacko')
        self.assertEqual(the_tallest_nba_player['lastName'][0], 'Fall')
        self.assertAlmostEqual(the_tallest_nba_player['heightMeters'][0], 2.26)
    def test_aggregate_confirmed_by_countries(self):
        confirmed_by_countries = pd.read_sql(aggregate_confirmed_by_countries, conn)
        self.assertEqual(confirmed_by_countries.shape, (191, 2))
        np.testing.assert_equal(confirmed_by_countries['Country_Region'].values[:5], 
                               np.array(['US', 'India', 'Brazil', 'Russia', 'France']))
        np.testing.assert_equal(confirmed_by_countries['Country_Region'].values[-5:], 
                               np.array(['Solomon Islands', 'MS Zaandam', 'Marshall Islands', 'Samoa', 'Vanuatu']))
        np.testing.assert_equal(confirmed_by_countries['n_confirmed'].values[:5], 
                               np.array([20032035, 10266674, 7675973, 3127347, 2677666]))
        np.testing.assert_equal(confirmed_by_countries['n_confirmed'].values[-5:], 
                               np.array([17, 9, 4, 2, 1]))
    def test_find_tom_hanks_is_casting(self):
        tom_hanks_is_casting = pd.read_sql(find_tom_hanks_is_casting, conn)
        self.assertEqual(tom_hanks_is_casting.shape, (6, 3))
        np.testing.assert_equal(tom_hanks_is_casting['title'].values,
                               np.array(['Forrest Gump', 'Toy Story', 'Saving Private Ryan', 'The Green Mile', 'Catch Me If You Can', 'Toy Story 3']))
        np.testing.assert_equal(tom_hanks_is_casting['release_year'].values,
                               np.array([1994, 1995, 1998, 1999, 2002, 2010]))
        np.testing.assert_equal(tom_hanks_is_casting['rating'].values,
                               np.array([8.8, 8.3, 8.6, 8.6, 8.1, 8.2]))

suite = unittest.TestLoader().loadTestsFromTestCase(TestDemoExercises)
runner = unittest.TextTestRunner(verbosity=2)
test_results = runner.run(suite)
number_of_failures = len(test_results.failures)
number_of_errors = len(test_results.errors)
number_of_test_runs = test_results.testsRun
number_of_successes = number_of_test_runs - (number_of_failures + number_of_errors)

test_aggregate_confirmed_by_countries (__main__.TestDemoExercises) ... ERROR
test_find_the_tallest_nba_player (__main__.TestDemoExercises) ... ERROR
test_find_tom_hanks_is_casting (__main__.TestDemoExercises) ... ERROR
test_summarize_presidential_2020 (__main__.TestDemoExercises) ... ERROR

ERROR: test_aggregate_confirmed_by_countries (__main__.TestDemoExercises)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython-input-6-269a3a945c22>", line 16, in test_aggregate_confirmed_by_countries
    confirmed_by_countries = pd.read_sql(aggregate_confirmed_by_countries, conn)
  File "/Users/kuoyaojen/pyda/lib/python3.6/site-packages/pandas/io/sql.py", line 489, in read_sql
    chunksize=chunksize,
  File "/Users/kuoyaojen/pyda/lib/python3.6/site-packages/pandas/io/sql.py", line 1728, in read_query
    columns = [col_desc[0] for col_desc in cursor.description]
TypeError: 'NoneType' object is not iterable

ERROR: test_find_the_
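
The traceback above shows pandas iterating over `cursor.description` and finding `None`. The sketch below reproduces that mechanism with plain `sqlite3`: a statement that produces no result set leaves `description` as `None`, while a `SELECT` populates it even when it returns zero rows. The table `t` is hypothetical. The unanswered templates, which contain only SQL comments, presumably hit the same condition, so these errors are expected until the queries are filled in.

```python
import sqlite3

probe = sqlite3.connect(":memory:")

# A statement without a result set: description stays None.
no_result_set = probe.execute("CREATE TABLE t (x INTEGER)")
print(no_result_set.description)

# A SELECT exposes its column metadata even when the table is empty.
with_result_set = probe.execute("SELECT x FROM t")
column_names = [col[0] for col in with_result_set.description]
print(column_names)
```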

In [7]:
print("Out of {} SQL exercises, you answered {} correctly.".format(number_of_test_runs, number_of_successes))

Out of 4 SQL exercises, you answered 0 correctly.
