# Python Data Analysis

> Working with tabular data in Pandas

[數據交點](https://www.datainpoint.com/) | 郭耀仁 <yaojenkuo@datainpoint.com>

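## Exercise guidelines

- The first code cell loads the modules (packages) that may be needed, along with the unit-testing module `unittest`.
- If an exercise needs to load a file, that file is stored in the same folder as the exercise, so it can be loaded relative to the working directory.
- Each exercise already defines the name of the function or class and the names of its parameters; we only need to write the body.
- The `"""docstring"""` of each function or class describes how the test is carried out.
- Reading the `"""docstring"""` shows how the inputs relate to the expected outputs, which helps in understanding the exercise.
- Write the body of the function or class between the two single-line comments `### BEGIN SOLUTION` and `### END SOLUTION`.
- To run the tests, choose Kernel -> Restart Kernel And Run All Cells -> Restart from the menu at the top.
- You can run the tests after each exercise, or after finishing all of them.
- An exercise session that sits idle for more than 10 minutes disconnects automatically; clicking the exercise link again restarts it.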
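
The workflow described in the list above can be sketched on a toy exercise. The function `add_one` and the test class `TestAddOne` are hypothetical and appear only in this illustration; the loader and runner calls are the same `unittest` calls used in the test cell at the end of this notebook.

```python
import unittest

def add_one(x):
    """
    >>> add_one(1)
    2
    """
    ### BEGIN SOLUTION
    return x + 1
    ### END SOLUTION

class TestAddOne(unittest.TestCase):
    def test_add_one(self):
        # The test compares the returned value with the expected output
        self.assertEqual(add_one(1), 2)

# Collect the test methods of the class and run them
suite = unittest.TestLoader().loadTestsFromTestCase(TestAddOne)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```

Only the line between the two `###` comments is the learner's work; everything else is supplied by the exercise.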

In [1]:
import os
import json
import unittest
import numpy as np
import pandas as pd

## 01. Define a function named `create_nba_teams` that is able to create a DataFrame as expected given a JSON file `teams.json`.

- Expected inputs: a JSON file `teams.json`.
- Expected outputs: a (30, 5) DataFrame.

```
   tricode confName    divName           city                fullName
0      ATL     East  Southeast        Atlanta           Atlanta Hawks
1      BOS     East   Atlantic         Boston          Boston Celtics
2      BKN     East   Atlantic       Brooklyn           Brooklyn Nets
3      CHA     East  Southeast      Charlotte       Charlotte Hornets
4      CHI     East    Central        Chicago           Chicago Bulls
5      CLE     East    Central      Cleveland     Cleveland Cavaliers
6      DAL     West  Southwest         Dallas        Dallas Mavericks
7      DEN     West  Northwest         Denver          Denver Nuggets
8      DET     East    Central        Detroit         Detroit Pistons
9      GSW     West    Pacific   Golden State   Golden State Warriors
10     HOU     West  Southwest        Houston         Houston Rockets
11     IND     East    Central        Indiana          Indiana Pacers
12     LAC     West    Pacific             LA             LA Clippers
13     LAL     West    Pacific    Los Angeles      Los Angeles Lakers
14     MEM     West  Southwest        Memphis       Memphis Grizzlies
15     MIA     East  Southeast          Miami              Miami Heat
16     MIL     East    Central      Milwaukee         Milwaukee Bucks
17     MIN     West  Northwest      Minnesota  Minnesota Timberwolves
18     NOP     West  Southwest    New Orleans    New Orleans Pelicans
19     NYK     East   Atlantic       New York         New York Knicks
20     OKC     West  Northwest  Oklahoma City   Oklahoma City Thunder
21     ORL     East  Southeast        Orlando           Orlando Magic
22     PHI     East   Atlantic   Philadelphia      Philadelphia 76ers
23     PHX     West    Pacific        Phoenix            Phoenix Suns
24     POR     West  Northwest       Portland  Portland Trail Blazers
25     SAC     West    Pacific     Sacramento        Sacramento Kings
26     SAS     West  Southwest    San Antonio       San Antonio Spurs
27     TOR     East   Atlantic        Toronto         Toronto Raptors
28     UTA     West  Northwest           Utah               Utah Jazz
29     WAS     East  Southeast     Washington      Washington Wizards
```

In [2]:
def create_nba_teams(json_file_path):
    """
    >>> nba_teams = create_nba_teams('teams.json')
    >>> print(type(nba_teams))
    <class 'pandas.core.frame.DataFrame'>
    >>> print(nba_teams.shape)
    (30, 5)
    """
    ### BEGIN SOLUTION
    with open(json_file_path) as f:
        teams_json = json.load(f)
    teams_df = pd.DataFrame(teams_json['league']['standard'])
    teams_df_selected = teams_df[['tricode', 'confName', 'divName', 'city', 'fullName']]
    return teams_df_selected
    ### END SOLUTION
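
The same idea can be tried on a small in-memory structure before touching `teams.json`. The sketch below assumes the records sit under `league` -> `standard`, as the solution above does; the two sample records and the `nickname` key are made up for illustration.

```python
import pandas as pd

# Hypothetical miniature of the nested structure assumed for teams.json
sample = {
    "league": {
        "standard": [
            {"tricode": "ATL", "confName": "East", "city": "Atlanta", "nickname": "Hawks"},
            {"tricode": "DAL", "confName": "West", "city": "Dallas", "nickname": "Mavericks"},
        ]
    }
}

# A list of dicts becomes one row per dict and one column per key
teams_df = pd.DataFrame(sample["league"]["standard"])
# Indexing with a list of names keeps only those columns, in that order
selected = teams_df[["tricode", "confName", "city"]]
print(selected.shape)  # (2, 3)
```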

## 02. Define a function named `find_east_teams` that is able to create a DataFrame as expected given a JSON file `teams.json`.

- Expected inputs: a JSON file `teams.json`.
- Expected outputs: a (15, 5) DataFrame.

```
   tricode confName    divName          city             fullName
0      ATL     East  Southeast       Atlanta        Atlanta Hawks
1      BOS     East   Atlantic        Boston       Boston Celtics
2      BKN     East   Atlantic      Brooklyn        Brooklyn Nets
3      CHA     East  Southeast     Charlotte    Charlotte Hornets
4      CHI     East    Central       Chicago        Chicago Bulls
5      CLE     East    Central     Cleveland  Cleveland Cavaliers
6      DET     East    Central       Detroit      Detroit Pistons
7      IND     East    Central       Indiana       Indiana Pacers
8      MIA     East  Southeast         Miami           Miami Heat
9      MIL     East    Central     Milwaukee      Milwaukee Bucks
10     NYK     East   Atlantic      New York      New York Knicks
11     ORL     East  Southeast       Orlando        Orlando Magic
12     PHI     East   Atlantic  Philadelphia   Philadelphia 76ers
13     TOR     East   Atlantic       Toronto      Toronto Raptors
14     WAS     East  Southeast    Washington   Washington Wizards
```

In [3]:
def find_east_teams(json_file_path):
    """
    >>> east_teams = find_east_teams('teams.json')
    >>> print(type(east_teams))
    <class 'pandas.core.frame.DataFrame'>
    >>> print(east_teams.shape)
    (15, 5)
    """
    ### BEGIN SOLUTION
    teams_df = create_nba_teams(json_file_path)
    east_teams = teams_df[teams_df['confName'] == 'East']
    return east_teams.reset_index(drop=True)
    ### END SOLUTION
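
The call to `reset_index(drop=True)` matters because a boolean filter keeps the original row labels. A minimal sketch on a made-up four-row frame shows the difference:

```python
import pandas as pd

teams = pd.DataFrame({
    "tricode": ["ATL", "DAL", "BOS", "DEN"],
    "confName": ["East", "West", "East", "West"],
})

is_east = teams["confName"] == "East"      # boolean Series, one value per row
east = teams[is_east]                      # rows keep their original labels
east_reset = east.reset_index(drop=True)   # rows are relabelled from 0
print(list(east.index), list(east_reset.index))  # [0, 2] [0, 1]
```

Without the reset, the expected output in this exercise would show the labels of the full 30-team table rather than 0 to 14.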

## 03. Define a function named `create_head_coaches` that is able to create a DataFrame as expected given a JSON file `coaches.json`.

- Expected inputs: a JSON file `coaches.json`.
- Expected outputs: a (30, 3) DataFrame.

```
   team_tricode  first_name    last_name
0           PHI         Doc       Rivers
1           POR    Chauncey      Billups
2           MIL        Mike  Budenholzer
3           CHI       Billy      Donovan
4           CLE  John-Blair  Bickerstaff
5           BOS         Ime        Udoka
6           LAC      Tyronn          Lue
7           MEM      Taylor      Jenkins
8           ATL        Nate     McMillan
9           MIA        Erik    Spoelstra
10          CHA       James      Borrego
11          UTA        Quin       Snyder
12          SAC        Luke       Walton
13          NYK         Tom    Thibodeau
14          LAL       Frank        Vogel
15          ORL      Jamahl       Mosley
16          DAL       Jason         Kidd
17          BKN       Steve         Nash
18          DEN     Michael       Malone
19          IND        Rick     Carlisle
20          NOP      Willie        Green
21          DET       Dwane        Casey
22          TOR        Nick        Nurse
23          HOU     Stephen        Silas
24          SAS       Gregg     Popovich
25          PHX       Monty     Williams
26          OKC        Mark   Daigneault
27          MIN       Chris        Finch
28          GSW       Steve         Kerr
29          WAS         Wes       Unseld
```

In [4]:
def create_head_coaches(json_file_path):
    """
    >>> head_coaches = create_head_coaches('coaches.json')
    >>> print(type(head_coaches))
    <class 'pandas.core.frame.DataFrame'>
    >>> print(head_coaches.shape)
    (30, 3)
    """
    ### BEGIN SOLUTION
    with open(json_file_path) as f:
        coaches_json = json.load(f)
    coach_list = coaches_json['league']['standard']
    first_names, last_names, team_tricodes = [], [], []
    for coach in coach_list:
        if not coach['isAssistant']:
            first_names.append(coach['firstName'])
            last_names.append(coach['lastName'])
            team_tricodes.append(coach['teamSitesOnly']['teamTricode'])
    out = pd.DataFrame()
    out['team_tricode'] = team_tricodes
    out['first_name'] = first_names
    out['last_name'] = last_names
    return out
    ### END SOLUTION
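
As an alternative to the explicit loop, `pd.json_normalize` can flatten nested records in one call. This is only a sketch: the two records below are hypothetical stand-ins shaped like the entries the solution iterates over, and the second coach is invented.

```python
import pandas as pd

# Hypothetical records; the assistant entry is invented for illustration
coaches = [
    {"firstName": "Doc", "lastName": "Rivers", "isAssistant": False,
     "teamSitesOnly": {"teamTricode": "PHI"}},
    {"firstName": "Sam", "lastName": "Example", "isAssistant": True,
     "teamSitesOnly": {"teamTricode": "PHI"}},
]

# Nested keys are flattened into dotted column names
flat = pd.json_normalize(coaches)
head_coaches = (
    flat.loc[~flat["isAssistant"], ["teamSitesOnly.teamTricode", "firstName", "lastName"]]
    .rename(columns={
        "teamSitesOnly.teamTricode": "team_tricode",
        "firstName": "first_name",
        "lastName": "last_name",
    })
    .reset_index(drop=True)
)
print(head_coaches.shape)  # (1, 3)
```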

## 04. Define a function named `create_nba_player_heights` that is able to create a DataFrame as expected given a JSON file `players.json`.

PS: You have to exclude players who are not active (`isActive == False`).

- Expected inputs: a JSON file `players.json`.
- Expected outputs: a (503, 3) DataFrame.

```
    first_name         last_name  height_meter
0     Precious           Achiuwa          2.03
1       Steven             Adams          2.11
2          Bam           Adebayo          2.06
3      Ty-Shon         Alexander          1.90
4      Nickeil  Alexander-Walker          1.98
..         ...               ...           ...
498   Thaddeus             Young          2.03
499       Trae             Young          1.85
500       Omer         Yurtseven          2.13
501       Cody            Zeller          2.11
502      Ivica             Zubac          2.13

[503 rows x 3 columns]
```

In [5]:
def create_nba_player_heights(json_file_path):
    """
    >>> nba_player_heights = create_nba_player_heights('players.json')
    >>> print(type(nba_player_heights))
    <class 'pandas.core.frame.DataFrame'>
    >>> print(nba_player_heights.shape)
    (503, 3)
    """
    ### BEGIN SOLUTION
    with open(json_file_path) as f:
        players_json = json.load(f)
    player_list = players_json['league']['standard']
    first_names, last_names, height_meters = [], [], []
    for player in player_list:
        if player['isActive']:
            first_names.append(player['firstName'])
            last_names.append(player['lastName'])
            height_meters.append(player['heightMeters'])
    out = pd.DataFrame()
    out['first_name'] = first_names
    out['last_name'] = last_names
    out['height_meter'] = np.array(height_meters, dtype=float)
    return out
    ### END SOLUTION
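
The solution converts the heights with `dtype=float`, which suggests they arrive as strings. If some values were blank, that conversion would raise an error; whether `players.json` contains blanks is an assumption, not something shown here. A tolerant sketch uses `pd.to_numeric` with `errors="coerce"`:

```python
import pandas as pd

# Heights as strings, with one blank value added as an assumed messy case
raw_heights = ["2.03", "1.85", ""]

# Values that cannot be parsed become NaN instead of raising an error
heights = pd.Series(pd.to_numeric(raw_heights, errors="coerce"))
print(heights.isna().sum())  # 1
```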

## 05. Define a function named `find_tallest_shortest_players` that is able to create a DataFrame as expected given a JSON file `players.json`.

PS: You have to exclude players who are not active (`isActive == False`).

- Expected inputs: a JSON file `players.json`.
- Expected outputs: a (5, 4) DataFrame.

```
  first_name last_name  height_meter       tag
0    Facundo  Campazzo          1.78  shortest
1      Tacko      Fall          2.26   tallest
2      Jared    Harper          1.78  shortest
3     Markus    Howard          1.78  shortest
4    Tremont    Waters          1.78  shortest
```

In [6]:
def find_tallest_shortest_players(json_file_path):
    """
    >>> tallest_shortest_players = find_tallest_shortest_players('players.json')
    >>> print(type(tallest_shortest_players))
    <class 'pandas.core.frame.DataFrame'>
    >>> print(tallest_shortest_players.shape)
    (5, 4)
    """
    ### BEGIN SOLUTION
    nba_player_heights = create_nba_player_heights(json_file_path)
    max_height = nba_player_heights['height_meter'].max()
    min_height = nba_player_heights['height_meter'].min()
    condition = (nba_player_heights['height_meter'] == max_height) | (nba_player_heights['height_meter'] == min_height)
    out = nba_player_heights[condition]
    tag = out['height_meter'].map(lambda x: 'tallest' if x == max_height else 'shortest').values
    ncols = out.shape[1]
    out.insert(ncols, 'tag', tag)
    return out.reset_index(drop=True)
    ### END SOLUTION
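
The filter-and-tag step can also be written with `isin` and `np.where`. This is a sketch of an equivalent approach on four rows taken from the expected outputs above, not a replacement for the solution.

```python
import numpy as np
import pandas as pd

heights = pd.DataFrame({
    "last_name": ["Fall", "Young", "Campazzo", "Harper"],
    "height_meter": [2.26, 1.85, 1.78, 1.78],
})
tallest = heights["height_meter"].max()
shortest = heights["height_meter"].min()

# .copy() gives an independent frame, so adding a column leaves `heights` untouched
extremes = heights[heights["height_meter"].isin([tallest, shortest])].copy()
# np.where picks the first label where the condition holds, the second elsewhere
extremes["tag"] = np.where(extremes["height_meter"] == tallest, "tallest", "shortest")
extremes = extremes.reset_index(drop=True)
print(extremes["tag"].tolist())  # ['tallest', 'shortest', 'shortest']
```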

## 06. Define a function named `calculate_death_rate_by_countries` according to the following formula given `07-22-2021.csv`.

\begin{equation}
\text{Death Rate} = \frac{\text{Deaths}}{\text{Confirmed}}
\end{equation}

- Expected inputs: a CSV file `07-22-2021.csv`.
- Expected outputs: a Series of length 195.

```
Country_Region
Vanuatu                 0.250000
MS Zaandam              0.222222
Yemen                   0.195972
Peru                    0.093219
Mexico                  0.087693
                          ...   
Summer Olympics 2020    0.000000
Samoa                   0.000000
Solomon Islands         0.000000
Marshall Islands        0.000000
Palau                        NaN
Length: 195, dtype: float64
```

In [7]:
def calculate_death_rate_by_countries(csv_file_path):
    """
    >>> death_rate_by_countries = calculate_death_rate_by_countries('07-22-2021.csv')
    >>> print(type(death_rate_by_countries))
    <class 'pandas.core.series.Series'>
    >>> print(death_rate_by_countries.size)
    195
    """
    ### BEGIN SOLUTION
    daily_report = pd.read_csv(csv_file_path)
    groupby_summary = daily_report.groupby('Country_Region')[['Confirmed', 'Deaths']].sum()
    out = groupby_summary['Deaths'] / groupby_summary['Confirmed']
    return out.sort_values(ascending=False)
    ### END SOLUTION
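
The `NaN` at the end of the expected output is consistent with a country whose confirmed and death totals are both zero, because pandas evaluates 0 / 0 as `NaN`. The toy report below uses made-up country labels and counts to show this, and to show that `sort_values` places `NaN` last by default.

```python
import pandas as pd

# Made-up rows; country "C" has no confirmed cases and no deaths
report = pd.DataFrame({
    "Country_Region": ["A", "A", "B", "C"],
    "Confirmed": [100, 300, 50, 0],
    "Deaths": [1, 3, 5, 0],
})

# Sum the rows of each country first, then divide the totals
totals = report.groupby("Country_Region")[["Confirmed", "Deaths"]].sum()
death_rate = (totals["Deaths"] / totals["Confirmed"]).sort_values(ascending=False)
print(death_rate.index.tolist())  # ['B', 'A', 'C']
```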

## 07. Define a function named `calculate_confirmed_rate_by_countries` according to the following formula given `07-22-2021.csv` and `UID_ISO_FIPS_LookUp_Table.csv`.

\begin{equation}
\text{Confirmed Rate} = \frac{\text{Confirmed}}{\text{Population}}
\end{equation}

- Expected inputs: None.
- Expected outputs: a (195, 3) DataFrame.

```
                      Confirmed  Population  Confirmed_Rate
Country_Region                                             
Summer Olympics 2020         91         0.0             inf
Diamond Princess            712         0.0             inf
MS Zaandam                    9         0.0             inf
Andorra                   14464     77265.0        0.187200
Seychelles                17747     98340.0        0.180466
...                         ...         ...             ...
Samoa                         3    196130.0        0.000015
Vanuatu                       4    292680.0        0.000014
Micronesia                    1    113815.0        0.000009
Tanzania                    509  59734213.0        0.000009
Palau                         0     18008.0        0.000000

[195 rows x 3 columns]
```

In [8]:
def calculate_confirmed_rate_by_countries():
    """
    >>> confirmed_rate_by_countries = calculate_confirmed_rate_by_countries()
    >>> print(type(confirmed_rate_by_countries))
    <class 'pandas.core.frame.DataFrame'>
    >>> print(confirmed_rate_by_countries.shape)
    (195, 3)
    """
    ### BEGIN SOLUTION
    daily_report = pd.read_csv("07-22-2021.csv")
    lookup_table = pd.read_csv("UID_ISO_FIPS_LookUp_Table.csv")
    confirmed_by_countries = pd.DataFrame(daily_report.groupby('Country_Region')['Confirmed'].sum())
    population_by_countries = pd.DataFrame(lookup_table.groupby('Country_Region')['Population'].sum())
    out = confirmed_by_countries.join(population_by_countries)
    out['Confirmed_Rate'] = out['Confirmed'] / out['Population']
    return out.sort_values('Confirmed_Rate', ascending=False)
    ### END SOLUTION
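
The `inf` values in the expected output come from dividing a positive count by a population of zero. The sketch below reproduces this with two rows copied from the expected output; `join` aligns the two frames on their shared index labels.

```python
import numpy as np
import pandas as pd

labels = pd.Index(["Diamond Princess", "Andorra"], name="Country_Region")
confirmed = pd.DataFrame({"Confirmed": [712, 14464]}, index=labels)
population = pd.DataFrame({"Population": [0.0, 77265.0]}, index=labels)

# join is a left join on the index by default
rates = confirmed.join(population)
rates["Confirmed_Rate"] = rates["Confirmed"] / rates["Population"]
print(np.isinf(rates.loc["Diamond Princess", "Confirmed_Rate"]))  # True
```

Sorting in descending order places `inf` first, which is why the entries with zero population lead the expected table.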

## 08. Define a function named `find_death_confirmed_rate_of_taiwan` that is able to retrieve the death rate/confirmed rate of Taiwan given `07-22-2021.csv` and `UID_ISO_FIPS_LookUp_Table.csv`.

PS: Taiwan might not be labeled "Taiwan" in the COVID-19 Data Repository by CSSE at Johns Hopkins University.

- Expected inputs: None.
- Expected outputs: a (1, 5) DataFrame.

```
  Country_Region  Confirmed  Population  Confirmed_Rate  Death_Rate
0        Taiwan*      15511  23816775.0        0.000651    0.050416
```

In [9]:
def find_death_confirmed_rate_of_taiwan():
    """
    >>> death_confirmed_rate_of_taiwan = find_death_confirmed_rate_of_taiwan()
    >>> print(type(death_confirmed_rate_of_taiwan))
    <class 'pandas.core.frame.DataFrame'>
    >>> print(death_confirmed_rate_of_taiwan.shape)
    (1, 5)
    """
    ### BEGIN SOLUTION
    death_rate_by_countries = calculate_death_rate_by_countries('07-22-2021.csv')
    death_rate_by_countries = pd.DataFrame(death_rate_by_countries)
    death_rate_by_countries.columns = ["Death_Rate"]
    confirmed_rate_by_countries = calculate_confirmed_rate_by_countries()
    joined_df = confirmed_rate_by_countries.join(death_rate_by_countries)
    tw_df = joined_df[joined_df.index == 'Taiwan*']
    return tw_df.reset_index()
    ### END SOLUTION
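
Selecting one row by label while keeping a DataFrame can also be done with `.loc` and a list of labels. A minimal sketch, using two confirmed counts copied from the expected outputs above:

```python
import pandas as pd

rates = pd.DataFrame(
    {"Confirmed": [15511, 14464]},
    index=pd.Index(["Taiwan*", "Andorra"], name="Country_Region"),
)

as_series = rates.loc["Taiwan*"]    # a single label returns a Series
as_frame = rates.loc[["Taiwan*"]]   # a list of labels returns a DataFrame
# reset_index turns the index into an ordinary column
print(as_frame.reset_index().shape)  # (1, 2)
```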

## 09. Define a function named `summarize_time_series` that is able to sum the confirmed cases and deaths by `Country/Region` and date given `time_series_covid19_confirmed_global.csv` and `time_series_covid19_deaths_global.csv`.

- Expected inputs: None.
- Expected outputs: a (106860, 4) DataFrame.

```
       Country/Region       Date  Confirmed  Deaths
0         Afghanistan 2020-01-22          0       0
1         Afghanistan 2020-01-23          0       0
2         Afghanistan 2020-01-24          0       0
3         Afghanistan 2020-01-25          0       0
4         Afghanistan 2020-01-26          0       0
...               ...        ...        ...     ...
106855       Zimbabwe 2021-07-18      83619    2622
106856       Zimbabwe 2021-07-19      85732    2697
106857       Zimbabwe 2021-07-20      88415    2747
106858       Zimbabwe 2021-07-21      91120    2809
106859       Zimbabwe 2021-07-22      93421    2870

[106860 rows x 4 columns]
```

In [10]:
def summarize_time_series():
    """
    >>> summarized_time_series = summarize_time_series()
    >>> print(type(summarized_time_series))
    <class 'pandas.core.frame.DataFrame'>
    >>> print(summarized_time_series.shape)
    (106860, 4)
    """
    ### BEGIN SOLUTION
    time_series_confirmed = pd.read_csv('time_series_covid19_confirmed_global.csv')
    time_series_deaths = pd.read_csv('time_series_covid19_deaths_global.csv')
    time_series_confirmed_selected = time_series_confirmed.drop(['Lat', 'Long'], axis=1)
    time_series_deaths_selected = time_series_deaths.drop(['Lat', 'Long'], axis=1)
    idVars = ['Province/State', 'Country/Region']
    transposed_confirmed = pd.melt(time_series_confirmed_selected, id_vars=idVars, var_name='Date', value_name='Confirmed')
    transposed_deaths = pd.melt(time_series_deaths_selected, id_vars=idVars, var_name='Date', value_name='Deaths')
    transposed_confirmed_groupby = pd.DataFrame(transposed_confirmed.groupby(['Country/Region', 'Date'])['Confirmed'].sum())
    transposed_deaths_groupby = pd.DataFrame(transposed_deaths.groupby(['Country/Region', 'Date'])['Deaths'].sum())
    out = transposed_confirmed_groupby.join(transposed_deaths_groupby).reset_index()
    out['Date'] = pd.to_datetime(out['Date'])
    return out.sort_values(['Country/Region', 'Date']).reset_index(drop=True)
    ### END SOLUTION
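
The key step in this solution is `pd.melt`, which turns one column per date into one row per date. The sketch below applies it to a made-up wide table; the date headers written as `1/22/20` are an assumption about how the CSV labels its columns.

```python
import pandas as pd

# Made-up wide table: one row per province, one column per date
wide = pd.DataFrame({
    "Province/State": ["P1", "P2", None],
    "Country/Region": ["X", "X", "Y"],
    "1/22/20": [1, 2, 3],
    "1/23/20": [4, 5, 6],
})

# Every date column becomes a (Date, Confirmed) pair on its own row
long = pd.melt(wide, id_vars=["Province/State", "Country/Region"],
               var_name="Date", value_name="Confirmed")
# Provinces of the same country are added together for each date
summary = long.groupby(["Country/Region", "Date"])["Confirmed"].sum().reset_index()
summary["Date"] = pd.to_datetime(summary["Date"], format="%m/%d/%y")
print(summary["Confirmed"].tolist())  # [3, 9, 3, 6]
```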

## 10. Define a function named `calculate_daily_cases_of_taiwan` that is able to calculate the daily cases of Taiwan and return a DataFrame as expected given `time_series_covid19_confirmed_global.csv` and `time_series_covid19_deaths_global.csv`.

- Expected inputs: None.
- Expected outputs: a (548, 5) DataFrame.

```
           Country/Region  Confirmed  Deaths  Daily_Confirmed  Daily_Deaths
Date                                                                       
2020-01-22         Taiwan          1       0              NaN           NaN
2020-01-23         Taiwan          1       0              0.0           0.0
2020-01-24         Taiwan          3       0              2.0           0.0
2020-01-25         Taiwan          3       0              0.0           0.0
2020-01-26         Taiwan          4       0              1.0           0.0
...                   ...        ...     ...              ...           ...
2021-07-18         Taiwan      15408     768             18.0           4.0
2021-07-19         Taiwan      15429     769             21.0           1.0
2021-07-20         Taiwan      15453     773             24.0           4.0
2021-07-21         Taiwan      15478     778             25.0           5.0
2021-07-22         Taiwan      15511     782             33.0           4.0

[548 rows x 5 columns]
```

In [11]:
def calculate_daily_cases_of_taiwan():
    """
    >>> daily_cases_of_taiwan = calculate_daily_cases_of_taiwan()
    >>> print(type(daily_cases_of_taiwan))
    <class 'pandas.core.frame.DataFrame'>
    >>> print(daily_cases_of_taiwan.shape)
    (548, 5)
    """
    ### BEGIN SOLUTION
    ts = summarize_time_series()
    ts_tw = ts[ts['Country/Region'] == 'Taiwan*']
    daily_confirmed = np.diff(ts_tw['Confirmed'].values.astype(float))
    daily_deaths = np.diff(ts_tw['Deaths'].values.astype(float))
    daily_confirmed = np.insert(daily_confirmed, 0, np.nan)
    daily_deaths = np.insert(daily_deaths, 0, np.nan)
    ts_tw.insert(ts_tw.shape[1], 'Daily_Confirmed', daily_confirmed)
    ts_tw.insert(ts_tw.shape[1], 'Daily_Deaths', daily_deaths)
    ts_tw_set_index = ts_tw.set_index('Date')
    out = ts_tw_set_index.replace(to_replace={r"\*": ""}, regex=True)
    return out
    ### END SOLUTION
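
The combination of `np.diff` and `np.insert` in the solution can also be expressed with `Series.diff`, which leaves `NaN` in the first row on its own. A short sketch on the first five cumulative counts from the expected output:

```python
import pandas as pd

cumulative = pd.DataFrame(
    {"Confirmed": [1, 1, 3, 3, 4]},
    index=pd.to_datetime(
        ["2020-01-22", "2020-01-23", "2020-01-24", "2020-01-25", "2020-01-26"]
    ),
)

# diff() subtracts the previous row; the first row has no predecessor, so it is NaN
cumulative["Daily_Confirmed"] = cumulative["Confirmed"].diff()
print(cumulative["Daily_Confirmed"].tolist())  # [nan, 0.0, 2.0, 0.0, 1.0]
```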

## Run the tests!

Kernel -> Restart Kernel And Run All Cells -> Restart

In [12]:
class TestDataFrameWranglingWithPandasAdvanced(unittest.TestCase):
    def test_01_create_nba_teams(self):
        nba_teams = create_nba_teams('teams.json')
        self.assertIsInstance(nba_teams, pd.core.frame.DataFrame)
        self.assertEqual(nba_teams.shape, (30, 5))
        column_names = nba_teams.columns
        self.assertTrue('tricode' in column_names)
        self.assertTrue('confName' in column_names)
        self.assertTrue('divName' in column_names)
        self.assertTrue('city' in column_names)
        self.assertTrue('fullName' in column_names)
    def test_02_find_east_teams(self):
        east_teams = find_east_teams('teams.json')
        self.assertIsInstance(east_teams, pd.core.frame.DataFrame)
        self.assertEqual(east_teams.shape, (15, 5))
        div_names = east_teams['divName'].values
        self.assertTrue('Atlantic' in div_names)
        self.assertTrue('Southeast' in div_names)
        self.assertTrue('Central' in div_names)
    def test_03_create_head_coaches(self):
        head_coaches = create_head_coaches('coaches.json')
        self.assertIsInstance(head_coaches, pd.core.frame.DataFrame)
        self.assertEqual(head_coaches.shape, (30, 3))
        tricodes = head_coaches.iloc[:, 0].values
        self.assertTrue('PHI' in tricodes)
        self.assertTrue('BKN' in tricodes)
        self.assertTrue('WAS' in tricodes)
    def test_04_create_nba_player_heights(self):
        nba_player_heights = create_nba_player_heights('players.json')
        self.assertIsInstance(nba_player_heights, pd.core.frame.DataFrame)
        self.assertEqual(nba_player_heights.shape, (503, 3))
    def test_05_find_tallest_shortest_players(self):
        tallest_shortest_players = find_tallest_shortest_players('players.json')
        self.assertIsInstance(tallest_shortest_players, pd.core.frame.DataFrame)
        self.assertEqual(tallest_shortest_players.shape, (5, 4))
        tags = tallest_shortest_players.iloc[:, 3].values
        self.assertTrue('tallest' in tags)
        self.assertTrue('shortest' in tags)
        first_names = tallest_shortest_players.iloc[:, 0].values
        self.assertTrue('Tacko' in first_names)
        last_names = tallest_shortest_players.iloc[:, 1].values
        self.assertTrue('Fall' in last_names)
    def test_06_calculate_death_rate_by_countries(self):
        death_rate_by_countries = calculate_death_rate_by_countries('07-22-2021.csv')
        self.assertIsInstance(death_rate_by_countries, pd.core.series.Series)
        self.assertEqual(death_rate_by_countries.size, 195)
        ser_index = death_rate_by_countries.index
        self.assertTrue('Vanuatu' in ser_index)
        self.assertTrue('MS Zaandam' in ser_index)
        self.assertTrue('Yemen' in ser_index)
        self.assertTrue('Mexico' in ser_index)
        self.assertTrue('Sudan' in ser_index)
    def test_07_calculate_confirmed_rate_by_countries(self):
        confirmed_rate_by_countries = calculate_confirmed_rate_by_countries()
        self.assertIsInstance(confirmed_rate_by_countries, pd.core.frame.DataFrame)
        self.assertEqual(confirmed_rate_by_countries.shape, (195, 3))
        df_index = confirmed_rate_by_countries.index
        self.assertTrue('Andorra' in df_index)
        self.assertTrue('Montenegro' in df_index)
        self.assertTrue('Czechia' in df_index)
        df_columns = confirmed_rate_by_countries.columns
        self.assertTrue('Confirmed' in df_columns)
        self.assertTrue('Population' in df_columns)
        self.assertTrue('Confirmed_Rate' in df_columns)
    def test_08_find_death_confirmed_rate_of_taiwan(self):
        death_confirmed_rate_of_taiwan = find_death_confirmed_rate_of_taiwan()
        self.assertIsInstance(death_confirmed_rate_of_taiwan, pd.core.frame.DataFrame)
        self.assertEqual(death_confirmed_rate_of_taiwan.shape, (1, 5))
    def test_09_summarize_time_series(self):
        summarized_time_series = summarize_time_series()
        self.assertIsInstance(summarized_time_series, pd.core.frame.DataFrame)
        self.assertEqual(summarized_time_series.shape, (106860, 4))
    def test_10_calculate_daily_cases_of_taiwan(self):
        daily_cases_of_taiwan = calculate_daily_cases_of_taiwan()
        self.assertIsInstance(daily_cases_of_taiwan, pd.core.frame.DataFrame)
        self.assertEqual(daily_cases_of_taiwan.shape, (548, 5))

suite = unittest.TestLoader().loadTestsFromTestCase(TestDataFrameWranglingWithPandasAdvanced)
runner = unittest.TextTestRunner(verbosity=2)
test_results = runner.run(suite)
number_of_failures = len(test_results.failures)
number_of_errors = len(test_results.errors)
number_of_test_runs = test_results.testsRun
number_of_successes = number_of_test_runs - (number_of_failures + number_of_errors)
cwd = os.getcwd()
folder_name = os.path.basename(cwd)
with open("../exercise_index.json", "r") as content:
    exercise_index = json.load(content)
chapter_name = exercise_index[folder_name]

test_01_create_nba_teams (__main__.TestDataFrameWranglingWithPandasAdvanced) ... ok
test_02_find_east_teams (__main__.TestDataFrameWranglingWithPandasAdvanced) ... ok
test_03_create_head_coaches (__main__.TestDataFrameWranglingWithPandasAdvanced) ... ok
test_04_create_nba_player_heights (__main__.TestDataFrameWranglingWithPandasAdvanced) ... ok
test_05_find_tallest_shortest_players (__main__.TestDataFrameWranglingWithPandasAdvanced) ... ok
test_06_calculate_death_rate_by_countries (__main__.TestDataFrameWranglingWithPandasAdvanced) ... ok
test_07_calculate_confirmed_rate_by_countries (__main__.TestDataFrameWranglingWithPandasAdvanced) ... ok
test_08_find_death_confirmed_rate_of_taiwan (__main__.TestDataFrameWranglingWithPandasAdvanced) ... ok
test_09_summarize_time_series (__main__.TestDataFrameWranglingWithPandasAdvanced) ... ok
test_10_calculate_daily_cases_of_taiwan (__main__.TestDataFrameWranglingWithPandasAdvanced) ... ok

----------------------------------------------------------

In [13]:
print("In the chapter '{}', out of {} Python exercises you answered {} correctly.".format(chapter_name, number_of_test_runs, number_of_successes))

In the chapter '以 Pandas 處理表格式資料', out of 10 Python exercises you answered 10 correctly.
