### Exercise: Class score statistics

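Suppose a team has 5 students whose scores are shown in the table below. Use NumPy to compute the mean, minimum, maximum, variance, and standard deviation of their scores in language (Chinese), English, and math. Then sort the students by total score and print the resulting ranking.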

![alt text](https://static001.geekbang.org/resource/image/44/5c/442a89eed30c13b543e5f717c538325c.jpg?wh=1142*1031)
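
As a warm-up, here is a minimal sketch of the per-subject statistics on a plain 2D array instead of a structured array. The layout (one row per student, one column per subject) and the names `scores` and `subjects` are assumptions made for illustration; the values are the ones used in this exercise.

```python
import numpy as np

# Assumed layout: rows are students, columns are (language, english, math).
subjects = ['language', 'english', 'math']
scores = np.array([
    [66, 65, 30],
    [95, 85, 98],
    [93, 92, 96],
    [90, 88, 77],
    [80, 90, 90],
], dtype=float)

# axis=0 reduces over the rows (students), giving one value per subject.
means = scores.mean(axis=0)
mins = scores.min(axis=0)
maxs = scores.max(axis=0)
variances = scores.var(axis=0)  # population variance (ddof=0 is the default)
stds = scores.std(axis=0)

for i, subject in enumerate(subjects):
    print(f'{subject:>8}: mean={means[i]:.2f}, min={mins[i]:.0f}, max={maxs[i]:.0f}, '
          f'var={variances[i]:.2f}, std={stds[i]:.2f}')
```

Reducing along `axis=0` avoids looping over subjects by hand. A structured array is still convenient when the names need to travel with the scores.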

In [86]:
import numpy as np

persontype = np.dtype({
    'names': ['name', 'language', 'english', 'math'],
    'formats': ['U32', 'f', 'f', 'f'],
})

peoples = np.array(
    [
        ('ZhangFei', 66, 65, 30),
        ('GuanYv', 95, 85, 98),
        ('ZhaoYun', 93, 92, 96),
        ('HuangZhong', 90, 88, 77),
        ('DianWei', 80, 90, 90),
    ],
    dtype=persontype,
)

dash_line = '=' * 40

# Per-subject mean, min, max, variance, and standard deviation
tfuncs = [np.average, np.min, np.max, np.var, np.std]

for tfunc in tfuncs:
    for tname in persontype.names[1:]:
        print(f'subject: {tname:>8}, {tfunc.__name__} score: {tfunc(peoples[tname]):.2f}')

print(dash_line)

# Ranking within each subject
for tname in persontype.names[1:]:
    rank = np.sort(peoples, order=tname)[::-1][['name', tname]]
    for i, (name, score) in enumerate(zip(rank['name'], rank[tname])):
        print(f'subject: {tname:>8}, rank: {i+1}, name: {name:>10}, score: {score:.2f}')

print(dash_line)

# Ranking by total score
total_scores = np.zeros_like(peoples['language'])
for tname in persontype.names[1:]:
    total_scores += peoples[tname]
sorted_indices = np.argsort(total_scores)[::-1]
sorted_total_scores = total_scores[sorted_indices]
sorted_names = peoples['name'][sorted_indices]

for i, (name, total_score) in enumerate(zip(sorted_names, sorted_total_scores)):
    print(f'rank: {i+1}, name: {name:>10}, total_score: {total_score:.2f}')

subject: language, average score: 84.80
subject:  english, average score: 84.00
subject:     math, average score: 78.20
subject: language, min score: 66.00
subject:  english, min score: 65.00
subject:     math, min score: 30.00
subject: language, max score: 95.00
subject:  english, max score: 92.00
subject:     math, max score: 98.00
subject: language, var score: 114.96
subject:  english, var score: 95.60
subject:     math, var score: 634.56
subject: language, std score: 10.72
subject:  english, std score: 9.78
subject:     math, std score: 25.19
subject: language, rank: 1, name:     GuanYv, score: 95.00
subject: language, rank: 2, name:    ZhaoYun, score: 93.00
subject: language, rank: 3, name: HuangZhong, score: 90.00
subject: language, rank: 4, name:    DianWei, score: 80.00
subject: language, rank: 5, name:   ZhangFei, score: 66.00
subject:  english, rank: 1, name:    ZhaoYun, score: 92.00
subject:  english, rank: 2, name:    DianWei, score: 90.00
subject:  english, rank: 3, name: 
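
The variance and standard deviation printed above are the values `np.var` and `np.std` return by default, which divide by N (population statistics). The sketch below contrasts that with the sample version obtained by passing `ddof=1`; the variable names are hypothetical, and the math scores are the ones from the exercise.

```python
import numpy as np

# Math scores from the exercise.
math_scores = np.array([30, 98, 96, 77, 90], dtype=float)

# Default ddof=0: divide by N (population variance).
pop_var = np.var(math_scores)
# ddof=1: divide by N - 1 (sample variance).
sample_var = np.var(math_scores, ddof=1)

print(f'population variance: {pop_var:.2f}')
print(f'sample variance:     {sample_var:.2f}')
print(f'population std:      {np.std(math_scores):.2f}')
print(f'sample std:          {np.std(math_scores, ddof=1):.2f}')
```

Which one is appropriate depends on whether the five students are treated as the whole population or as a sample of a larger class.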

### Exercise

Create a Pandas DataFrame from the data in the table below and clean it. Then add a new "sum" column that holds the total of each person's three subject scores.

![image.png](https://p9-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/1833733e77fd4e0bbbfa8597235972c4~tplv-k3u1fbpfcp-jj-mark:0:0:0:0:q75.image#?w=423&h=241&s=38864&e=png&b=fefdfd)

In [104]:
import pandas as pd

dash_line = '=' * 40

data = [
    ['ZhangFei', 66, 65],
    ['GuanYv', 95, 85, 98],
    ['ZhaoYun', 95, 92, 96],
    ['HuangZhong', 90, 88, 77],
    ['DianWei', 80, 90, 90],
    ['DianWei', 80, 90, 90],
]

df = pd.DataFrame(data, columns=['name', 'language', 'english', 'math'])

print(df)

dfuni = df.drop_duplicates()
dfuni = dfuni.fillna(0)
print(dash_line)
print(dfuni)

# Add a 'sum' column with each person's total across the three subjects.
subjects = ['language', 'english', 'math']
dfsum = dfuni.assign(sum=dfuni[subjects].sum(axis=1))

print(dash_line)
print(dfsum)

         name  language  english  math
0    ZhangFei        66       65   NaN
1      GuanYv        95       85  98.0
2     ZhaoYun        95       92  96.0
3  HuangZhong        90       88  77.0
4     DianWei        80       90  90.0
5     DianWei        80       90  90.0
         name  language  english  math
0    ZhangFei        66       65   0.0
1      GuanYv        95       85  98.0
2     ZhaoYun        95       92  96.0
3  HuangZhong        90       88  77.0
4     DianWei        80       90  90.0
         name  language  english  math    sum
0    ZhangFei        66       65   0.0  131.0
1      GuanYv        95       85  98.0  278.0
2     ZhaoYun        95       92  96.0  283.0
3  HuangZhong        90       88  77.0  255.0
4     DianWei        80       90  90.0  260.0
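
Before cleaning, it can help to see what actually needs fixing. The following is a minimal sketch, assuming the same rows as above, that counts missing values per column and flags exact duplicate rows; the names `missing_per_column` and `duplicate_mask` are hypothetical.

```python
import pandas as pd

# Same rows as the exercise; the first row has no math score.
data = [
    ['ZhangFei', 66, 65],
    ['GuanYv', 95, 85, 98],
    ['ZhaoYun', 95, 92, 96],
    ['HuangZhong', 90, 88, 77],
    ['DianWei', 80, 90, 90],
    ['DianWei', 80, 90, 90],
]
df = pd.DataFrame(data, columns=['name', 'language', 'english', 'math'])

# Count missing values in each column.
missing_per_column = df.isna().sum()
# Flag rows that exactly repeat an earlier row.
duplicate_mask = df.duplicated()

print(missing_per_column)
print(df[duplicate_mask])
```

Filling the missing math score with 0, as done above, is one possible choice; it lowers that student's total, so whether it is appropriate depends on what a missing score means in the data.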
