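### The agg() function
Use DataFrame.agg(), DataFrame.groupby().agg(), or DataFrame.resample().agg() for flexible column-wise aggregation. A single call can compute several standard statistics (sum, mean, std, ...) and can also apply user-defined functions.<br/>
(1) groupby splits the rows into groups, while agg aggregates column by column; together they make it easy to slice and summarize a dataset. <br/>
(2) Like apply, agg accepts user-defined functions, which makes pandas more flexible still.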
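
As a minimal sketch of the two points above, the snippet below uses a small made-up table (the `toy` frame, its column names, and the `value_range` helper are illustrative assumptions, not part of the sales data used later). It contrasts aggregating whole columns with aggregating per group.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
toy = pd.DataFrame({
    "team": ["a", "a", "b", "b"],
    "score": [1, 2, 3, 4],
    "cost": [10.0, 20.0, 30.0, 40.0],
})

def value_range(s):
    """Custom aggregation: spread between the largest and smallest value."""
    return s.max() - s.min()

# DataFrame.agg: several standard statistics per column, no grouping
overall = toy[["score", "cost"]].agg(["sum", "mean", "std"])
print(overall)

# groupby(...).agg: rows are split by team, then each column gets its own
# list of aggregations, mixing a built-in name with a custom function
per_team = toy.groupby("team").agg({"score": ["sum", value_range], "cost": ["mean"]})
print(per_team)
```

The dict form yields a two-level column index such as `("score", "value_range")`, where the second level is taken from the function's name.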

#### 1. Using agg() with groupby()/resample()
Reference: https://segmentfault.com/a/1190000012394176

In [1]:
import pandas as pd

df = pd.read_excel("https://github.com/chris1610/pbpython/blob/master/data/sample-salesv3.xlsx?raw=True")  # read the Excel file directly from the web
df["date"] = pd.to_datetime(df['date'])  # convert "date" to pandas datetime so the data can be resampled
print(df.head())

   account number                         name       sku  quantity  \
0          740150                   Barton LLC  B1-20000        39   
1          714466              Trantow-Barrows  S2-77896        -1   
2          218895                    Kulas Inc  B1-69924        23   
3          307599  Kassulke, Ondricka and Metz  S1-65481        41   
4          412290                Jerde-Hilpert  S2-34077         6   

   unit price  ext price                date  
0       86.69    3380.91 2014-01-01 07:21:51  
1       63.16     -63.16 2014-01-01 10:00:47  
2       90.70    2086.10 2014-01-01 13:24:58  
3       21.05     863.05 2014-01-01 15:05:22  
4       83.21     499.26 2014-01-01 23:26:55  


In [2]:
df1 = df.set_index('date').resample('M')['ext price'].sum()
print('Aggregate ext price: monthly sum')
print(df1)
df2 = df.set_index('date').groupby('name')['ext price'].resample("M").sum()
print('Aggregate ext price: monthly sum, grouped by name')
print(df2)
df3 = df.set_index('date').resample("M")[['unit price', 'ext price']].agg(['sum', 'mean'])
print('Several aggregations over several columns: monthly')
print(df3)
df4 = df.groupby('name').agg({'ext price':['sum', 'mean', 'std'], 'unit price':['mean']})
print('Different aggregations for different columns: grouped by name')
print(df4)
print('Several aggregations over several columns: monthly, grouped by name')
print('Note: on a ResamplerGroupby (or RollingGroupby) object, .agg([...]) with multiple statistics raises an error (a single statistic works)!')
print('github issue: https://github.com/pandas-dev/pandas/issues/15072')
try:
    df5 = df.set_index('date').groupby('name').resample("M")[['unit price', 'ext price']].agg(['sum', 'mean'])
except Exception as e:
    print(e)

Aggregate ext price: monthly sum
date
2014-01-31    185361.66
2014-02-28    146211.62
2014-03-31    203921.38
2014-04-30    174574.11
2014-05-31    165418.55
2014-06-30    174089.33
2014-07-31    191662.11
2014-08-31    153778.59
2014-09-30    168443.17
2014-10-31    171495.32
2014-11-30    119961.22
2014-12-31    163867.26
Freq: M, Name: ext price, dtype: float64
Aggregate ext price: monthly sum, grouped by name
name                             date      
Barton LLC                       2014-01-31     6177.57
                                 2014-02-28    12218.03
                                 2014-03-31     3513.53
                                 2014-04-30    11474.20
                                 2014-05-31    10220.17
                                 2014-06-30    10463.73
                                 2014-07-31     6750.48
                                 2014-08-31    17541.46
                                 2014-09-30    14053.61
                                 2014-10-31     9351.68
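
One possible way around the multi-statistic limitation noted in the cell above is to skip the groupby-then-resample chain and instead group by name and by calendar month in a single `groupby` call. The sketch below is not the notebook author's method. It uses made-up data (the `toy` frame and its values are assumptions), and it labels each month with a monthly `Period` rather than a month-end timestamp, so the index differs slightly from the resample output.

```python
import pandas as pd

# Hypothetical toy data with the same column names as the sales file
toy = pd.DataFrame({
    "name": ["A", "A", "A", "B", "B"],
    "date": pd.to_datetime(["2014-01-05", "2014-01-20", "2014-02-03",
                            "2014-01-10", "2014-02-15"]),
    "ext price": [100.0, 50.0, 30.0, 20.0, 40.0],
    "unit price": [10.0, 5.0, 3.0, 2.0, 4.0],
})

# Map each timestamp to its calendar month (a monthly Period)
month = toy["date"].dt.to_period("M").rename("month")

# Group by name and month at once, then request several statistics per column
result = toy.groupby(["name", month])[["unit price", "ext price"]].agg(["sum", "mean"])
print(result)
```

Because this is an ordinary groupby, `.agg` accepts a list of statistics in the same way as the `df4` example.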
  

#### 2. Aggregating with custom functions

In [3]:
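get_count = lambda x: x[x>10].count()  # lambda: number of values greater than 10
get_count.__name__ = "count(gt 10)"  # label shown in the result index
def get_max(x):  # regular function: despite its name, returns the most frequent value
    return x.value_counts(dropna=False).index[0]


df.agg({'ext price': ['sum', 'mean'],
        'quantity': ['sum', get_count],
        'unit price': ['mean'],
        'sku': [get_max]})

| | ext price | quantity | unit price | sku |
|---|---|---|---|---|
| count(gt 10) | NaN | 1158.0 | NaN | NaN |
| get_max | NaN | NaN | NaN | S2-77896 |
| mean | 1345.856 | NaN | 55.007527 | NaN |
| sum | 2018784.0 | 36463.0 | NaN | NaN |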
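
Setting `__name__` by hand is one way to label the result of a custom function. With `groupby`, pandas also supports named aggregation, where each output column is given as `new_name=(column, function)`. The sketch below uses made-up data (the `toy` frame and the `count_gt_10` helper are assumptions for illustration).

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
toy = pd.DataFrame({
    "name": ["A", "A", "B", "B", "B"],
    "quantity": [5, 12, 20, 8, 30],
    "ext price": [50.0, 120.0, 200.0, 80.0, 300.0],
})

def count_gt_10(s):
    """Number of values in the group strictly greater than 10."""
    return int(s.gt(10).sum())

# Named aggregation: keyword = (source column, aggregation)
summary = toy.groupby("name").agg(
    total_ext_price=("ext price", "sum"),
    mean_ext_price=("ext price", "mean"),
    quantity_gt_10=("quantity", count_gt_10),
)
print(summary)
```

The result has flat, explicitly named columns, so no `__name__` assignment is needed.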
