# Common statistics

## Sum: `df.sum()`

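- `axis=0` (the default) sums down the rows and returns one total per column; `axis=1` sums across the columns and returns one total per row.
- `skipna=True` (the default) skips NaN values, so in a sum they contribute nothing, as if they were 0.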
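
The two parameters above can be illustrated with a minimal sketch. The `demo` frame and its values are hypothetical, fixed here only so that the results are reproducible:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value in "language".
demo = pd.DataFrame({
    "language": [91.0, 25.0, np.nan],
    "math": [48.0, 34.0, 1.0],
})

# axis=0 (default): sum down the rows, one total per column.
col_totals = demo.sum()

# axis=1: sum across the columns, one total per row.
row_totals = demo.sum(axis=1)

# skipna=False: a NaN in a column makes that column's total NaN.
strict_totals = demo.sum(skipna=False)

print(col_totals)
print(row_totals)
print(strict_totals)
```

With `skipna=False` the `language` total becomes NaN, which can be useful when a missing value should not go unnoticed.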



In [6]:
import pandas as pd
import numpy as np

classes = ["A", "B", "C"]
score = pd.DataFrame({
    "class":[classes[x] for x in np.random.randint(0, len(classes), 3)],
    "language":np.random.randint(0, 100, 3),
    "math":np.random.randint(0, 100, 3)
})

score

| | class | language | math |
|---|---|---|---|
| 0 | B | 91 | 48 |
| 1 | A | 25 | 34 |
| 2 | B | 77 | 1 |


In [7]:
score['language'].sum()

193

## Mean: `df.mean()`

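`mean()` automatically skips `np.nan` but does not skip 0. To exclude zeros as well, first replace them with `np.nan` using `df.replace(0, np.nan)`. (The alias `np.NaN` was removed in NumPy 2.0, so use `np.nan`.)

Reference: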

- [mean calculation in pandas excluding zeros](https://stackoverflow.com/questions/33217636/mean-calculation-in-pandas-excluding-zeros)
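
A minimal sketch of the zero-exclusion approach described above, using a hypothetical `scores` frame with fixed values in which 0 is assumed to mean "no score recorded":

```python
import numpy as np
import pandas as pd

# Hypothetical data: a 0 stands for a missing score.
scores = pd.DataFrame({
    "language": [20, 0, 65],
    "math": [49, 8, 0],
})

# Zeros count as ordinary values in the default mean.
mean_with_zeros = scores.mean()

# Replace 0 with NaN first; mean() then skips those entries.
mean_without_zeros = scores.replace(0, np.nan).mean()

print(mean_with_zeros)
print(mean_without_zeros)
```

Note that `replace` returns a new frame, so `scores` itself is left unchanged.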

In [17]:
import pandas as pd
import numpy as np

score = pd.DataFrame({
    "language":np.random.randint(0, 100, 3),
    "math":np.random.randint(0, 100, 3)
})

score

| | language | math |
|---|---|---|
| 0 | 20 | 49 |
| 1 | 19 | 8 |
| 2 | 65 | 97 |


In [18]:
score.loc[2, 'math'] = np.nan  # np.NaN was removed in NumPy 2.0
score

| | language | math |
|---|---|---|
| 0 | 20 | 49.0 |
| 1 | 19 | 8.0 |
| 2 | 65 | NaN |


In [20]:
# By default skipna=True, so NaN values are left out of the mean
score.mean()

language    34.666667
math        28.500000
dtype: float64

# Other statistics

## Variance and standard deviation


In [21]:
score = pd.DataFrame({
    "language":np.random.randint(0, 100, 3),
    "math":np.random.randint(0, 100, 3)
})

score

| | language | math |
|---|---|---|
| 0 | 7 | 78 |
| 1 | 79 | 73 |
| 2 | 74 | 70 |


In [22]:
score.var()

language    1616.333333
math          16.333333
dtype: float64

In [23]:
score.std()

language    40.203648
math         4.041452
dtype: float64

# Data formatting

## Setting decimal places and percentages

In [26]:
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'A': np.random.random(3),
    'B': np.random.random(3)
})

df

| | A | B |
|---|---|---|
| 0 | 0.060626 | 0.682458 |
| 1 | 0.825480 | 0.319137 |
| 2 | 0.687598 | 0.977458 |


In [27]:
df.round({'A':1, 'B':2})

| | A | B |
|---|---|---|
| 0 | 0.1 | 0.68 |
| 1 | 0.8 | 0.32 |
| 2 | 0.7 | 0.98 |


In [28]:
df['A'].apply(lambda x : format(x, '.2%'))

0     6.06%
1    82.55%
2    68.76%
Name: A, dtype: object

In [31]:
df['A'].map(lambda x : '{:.2%}'.format(x))

0     6.06%
1    82.55%
2    68.76%
Name: A, dtype: object