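# What are month-over-month and year-over-year growth?

First, **month-over-month (MoM, 环比) growth**  

Month-over-month growth compares the current month with the previous month. For example, comparing the figure for October 2021 with the figure for September 2021 gives the month-over-month growth.  

Next, **year-over-year (YoY, 同比) growth**  

Year-over-year growth compares a month of this year with the same month of the previous year. For example, comparing the figure for October 2021 with the figure for October 2020 gives the year-over-year growth.  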

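# How do the two differ?

Month-over-month and year-over-year growth differ in two ways: 

First, the years compared. **Month-over-month usually compares data within the same year**; only January is compared with December of the previous year. Year-over-year compares **this year's data with the previous year's data**. 

Second, the months compared. Month-over-month compares this month with the previous month, while year-over-year compares the same month in two consecutive years. 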
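
The two comparison rules can be sketched with `pd.Period` arithmetic. The helper name `comparison_periods` is made up for this illustration; it returns the previous month and the same month one year earlier, and shows the January case crossing into the previous year.

```python
import pandas as pd

def comparison_periods(month: str):
    """Return (previous month, same month last year) for a 'YYYY-MM' string."""
    p = pd.Period(month, freq="M")
    return p - 1, p - 12  # subtracting an integer moves back that many months

print(comparison_periods("2021-10"))  # previous month is 2021-09, last year's month is 2020-10
print(comparison_periods("2021-01"))  # previous month is 2020-12, in the previous year
```
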

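# How are month-over-month and year-over-year growth calculated?

The formulas are as follows: 

(1) Year-over-year growth  

YoY growth = (current period − same period last year) ÷ same period last year × 100%  

(2) Month-over-month growth

MoM growth = (current period − previous period) ÷ previous period × 100%  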


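An example: suppose the housing price in Beijing was 85,000 yuan per square meter in September 2021 and 82,000 yuan per square meter in October 2021, while it was 75,000 yuan per square meter in September 2020 and 79,000 yuan per square meter in October 2020. What are the month-over-month and year-over-year growth for October 2021?   

MoM growth = (82000 - 85000) / 85000 ≈ -0.035  

YoY growth = (82000 - 79000) / 79000 ≈ 0.038 

Both figures can be positive or negative. 

**A positive value means growth; a negative value means decline**. 

Multiply the values above by 100 to express the change as a percentage: about -3.5% month over month and +3.8% year over year. 

The calculation is the same in every industry; just substitute the relevant data. 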
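
The worked example can be checked with a few lines of plain Python. The helper name `growth` is made up for this sketch; the prices are the ones from the example.

```python
def growth(current, base):
    """Relative change of `current` against `base`, as a fraction."""
    return (current - base) / base

# Prices from the example above (yuan per square meter)
oct_2021, sep_2021, oct_2020 = 82000, 85000, 79000

mom = growth(oct_2021, sep_2021)  # month over month: October 2021 vs September 2021
yoy = growth(oct_2021, oct_2020)  # year over year: October 2021 vs October 2020

print(f"MoM: {mom:.3f} ({mom:.1%})")  # MoM: -0.035 (-3.5%)
print(f"YoY: {yoy:.3f} ({yoy:.1%})")  # YoY: 0.038 (3.8%)
```
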

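# Computing month-over-month and year-over-year change with pandas

Business data analysis often calls for metrics such as month-over-month change, year-over-year change, and growth rates. The methods below show how to compute them with pandas.  

## 1. Data preparation
For the demonstration, the data to be analyzed is generated up front and is already sorted by time.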

In [1]:
import pandas as pd
import numpy as np

months = pd.date_range(start='2010-01-01', end='2020-12-31', freq='M')
month_num = months.shape[0]
print(month_num, 'months')
test_df = pd.DataFrame({'month': months, 'v': 100*np.random.rand(month_num, 1).reshape(month_num)})
test_df

132 months


Unnamed: 0,month,v
0,2010-01-31,19.966536
1,2010-02-28,47.576077
2,2010-03-31,64.095895
3,2010-04-30,39.735408
4,2010-05-31,81.231337
...,...,...
127,2020-08-31,84.394486
128,2020-09-30,65.352021
129,2020-10-31,90.816197
130,2020-11-30,0.232423
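
Because the values come from an unseeded random generator, every run produces different numbers. A possible variation, not part of the original notebook, is to seed the generator so the tables can be reproduced. The name `demo_df` and the seed value are arbitrary choices for this sketch, and `pd.offsets.MonthEnd()` is used instead of a frequency string so the code does not depend on which alias a given pandas version accepts.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)  # arbitrary seed, chosen only for reproducibility

# 132 month-end dates from 2010-01-31 through 2020-12-31
months = pd.date_range(start="2010-01-31", periods=132, freq=pd.offsets.MonthEnd())

# Same layout as test_df: one row per month, values drawn uniformly from [0, 100)
demo_df = pd.DataFrame({"month": months, "v": 100 * rng.random(len(months))})
print(demo_df.head())
```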


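## 2. Month-over-month calculation
### 2.1 Method 1

Shift `v` down by one row with `shift(1)` so that each row also holds the previous month's value, then divide and subtract 1.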

In [2]:
test_df['v_last']=test_df['v'].shift(1)
test_df['month_erlier_1']=test_df['v']/test_df['v_last'] - 1
test_df

Unnamed: 0,month,v,v_last,month_erlier_1
0,2010-01-31,19.966536,,
1,2010-02-28,47.576077,19.966536,1.382791
2,2010-03-31,64.095895,47.576077,0.347229
3,2010-04-30,39.735408,64.095895,-0.380063
4,2010-05-31,81.231337,39.735408,1.044306
...,...,...,...,...
127,2020-08-31,84.394486,24.896987,2.389747
128,2020-09-30,65.352021,84.394486,-0.225636
129,2020-10-31,90.816197,65.352021,0.389646
130,2020-11-30,0.232423,90.816197,-0.997441


### 2.2 Method 2

Take the month-to-month difference with `diff()` and divide it by the previous month's value.

In [3]:
test_df['m_m_diff']=test_df['v'].diff()
test_df['month_erlier_2']=test_df['m_m_diff']/test_df['v'].shift(1)
test_df

Unnamed: 0,month,v,v_last,month_erlier_1,m_m_diff,month_erlier_2
0,2010-01-31,19.966536,,,,
1,2010-02-28,47.576077,19.966536,1.382791,27.609541,1.382791
2,2010-03-31,64.095895,47.576077,0.347229,16.519817,0.347229
3,2010-04-30,39.735408,64.095895,-0.380063,-24.360487,-0.380063
4,2010-05-31,81.231337,39.735408,1.044306,41.495929,1.044306
...,...,...,...,...,...,...
127,2020-08-31,84.394486,24.896987,2.389747,59.497499,2.389747
128,2020-09-30,65.352021,84.394486,-0.225636,-19.042465,-0.225636
129,2020-10-31,90.816197,65.352021,0.389646,25.464176,0.389646
130,2020-11-30,0.232423,90.816197,-0.997441,-90.583775,-0.997441


### 2.3 Method 3

`pct_change()` computes the same ratio in a single call.

In [4]:
test_df['month_erlier_3']=test_df['v'].pct_change()
test_df

Unnamed: 0,month,v,v_last,month_erlier_1,m_m_diff,month_erlier_2,month_erlier_3
0,2010-01-31,19.966536,,,,,
1,2010-02-28,47.576077,19.966536,1.382791,27.609541,1.382791,1.382791
2,2010-03-31,64.095895,47.576077,0.347229,16.519817,0.347229,0.347229
3,2010-04-30,39.735408,64.095895,-0.380063,-24.360487,-0.380063,-0.380063
4,2010-05-31,81.231337,39.735408,1.044306,41.495929,1.044306,1.044306
...,...,...,...,...,...,...,...
127,2020-08-31,84.394486,24.896987,2.389747,59.497499,2.389747,2.389747
128,2020-09-30,65.352021,84.394486,-0.225636,-19.042465,-0.225636,-0.225636
129,2020-10-31,90.816197,65.352021,0.389646,25.464176,0.389646,0.389646
130,2020-11-30,0.232423,90.816197,-0.997441,-90.583775,-0.997441,-0.997441
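
The three month-over-month columns in the output above are identical. A minimal check on a small series with made-up values shows the same equivalence:

```python
import numpy as np
import pandas as pd

s = pd.Series([100.0, 110.0, 99.0, 120.0])  # hypothetical monthly values

m1 = s / s.shift(1) - 1      # method 1: ratio to the previous row, minus 1
m2 = s.diff() / s.shift(1)   # method 2: difference divided by the previous row
m3 = s.pct_change()          # method 3: built-in percentage change

# The first element is NaN in all three, so compare with equal_nan=True
assert np.allclose(m1, m2, equal_nan=True)
assert np.allclose(m1, m3, equal_nan=True)
print(m3.round(4).tolist())  # [nan, 0.1, -0.1, 0.2121]
```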


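## 3. Year-over-year calculation
The calculation continues with the data built above.

### 3.1 Method 1

The approach mirrors the month-over-month case with a lag of 12 rows, which assumes one row per month with no gaps.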

In [5]:
test_df["last_year_v"]=test_df['v'].shift(12)
test_df['year_erlier_1']=test_df['v']/test_df['last_year_v'] - 1
test_df

Unnamed: 0,month,v,v_last,month_erlier_1,m_m_diff,month_erlier_2,month_erlier_3,last_year_v,year_erlier_1
0,2010-01-31,19.966536,,,,,,,
1,2010-02-28,47.576077,19.966536,1.382791,27.609541,1.382791,1.382791,,
2,2010-03-31,64.095895,47.576077,0.347229,16.519817,0.347229,0.347229,,
3,2010-04-30,39.735408,64.095895,-0.380063,-24.360487,-0.380063,-0.380063,,
4,2010-05-31,81.231337,39.735408,1.044306,41.495929,1.044306,1.044306,,
...,...,...,...,...,...,...,...,...,...
127,2020-08-31,84.394486,24.896987,2.389747,59.497499,2.389747,2.389747,99.052091,-0.147979
128,2020-09-30,65.352021,84.394486,-0.225636,-19.042465,-0.225636,-0.225636,38.051781,0.717450
129,2020-10-31,90.816197,65.352021,0.389646,25.464176,0.389646,0.389646,82.214969,0.104619
130,2020-11-30,0.232423,90.816197,-0.997441,-90.583775,-0.997441,-0.997441,87.849989,-0.997354


### 3.2 Method 2

Take the 12-month difference with `diff(12)` and divide it by the value 12 rows earlier.

In [24]:
test_df["year_diff"]=test_df['v'].diff(12)
#test_df['year_diff'].fillna(0,inplace=True)
test_df['year_erlier_2']=test_df['year_diff']/test_df['v'].shift(12)
test_df

Unnamed: 0,month,v,v_last,month_erlier_1,m_m_diff,month_erlier_2,month_erlier_3,last_year_v,year_erlier_1,year_diff,year_erlier_2,year_erlier_3
0,2010-01-31,19.966536,,,,,,,,,,
1,2010-02-28,47.576077,19.966536,1.382791,27.609541,1.382791,1.382791,,,,,
2,2010-03-31,64.095895,47.576077,0.347229,16.519817,0.347229,0.347229,,,,,
3,2010-04-30,39.735408,64.095895,-0.380063,-24.360487,-0.380063,-0.380063,,,,,
4,2010-05-31,81.231337,39.735408,1.044306,41.495929,1.044306,1.044306,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...
127,2020-08-31,84.394486,24.896987,2.389747,59.497499,2.389747,2.389747,99.052091,-0.147979,-14.657605,-0.147979,-0.147979
128,2020-09-30,65.352021,84.394486,-0.225636,-19.042465,-0.225636,-0.225636,38.051781,0.717450,27.300240,0.717450,0.717450
129,2020-10-31,90.816197,65.352021,0.389646,25.464176,0.389646,0.389646,82.214969,0.104619,8.601228,0.104619,0.104619
130,2020-11-30,0.232423,90.816197,-0.997441,-90.583775,-0.997441,-0.997441,87.849989,-0.997354,-87.617567,-0.997354,-0.997354


### 3.3 Method 3

`pct_change(periods=12)` gives the same result in a single call.

In [7]:
test_df['year_erlier_3']=test_df["v"].pct_change(periods=12)
test_df

Unnamed: 0,month,v,v_last,month_erlier_1,m_m_diff,month_erlier_2,month_erlier_3,last_year_v,year_erlier_1,year_diff,year_erlier_2,year_erlier_3
0,2010-01-31,19.966536,,,,,,,,0.000000,0.000000,
1,2010-02-28,47.576077,19.966536,1.382791,27.609541,1.382791,1.382791,,,0.000000,0.000000,
2,2010-03-31,64.095895,47.576077,0.347229,16.519817,0.347229,0.347229,,,0.000000,0.000000,
3,2010-04-30,39.735408,64.095895,-0.380063,-24.360487,-0.380063,-0.380063,,,0.000000,0.000000,
4,2010-05-31,81.231337,39.735408,1.044306,41.495929,1.044306,1.044306,,,0.000000,0.000000,
...,...,...,...,...,...,...,...,...,...,...,...,...
127,2020-08-31,84.394486,24.896987,2.389747,59.497499,2.389747,2.389747,99.052091,-0.147979,-14.657605,-0.147979,-0.147979
128,2020-09-30,65.352021,84.394486,-0.225636,-19.042465,-0.225636,-0.225636,38.051781,0.717450,27.300240,0.717450,0.717450
129,2020-10-31,90.816197,65.352021,0.389646,25.464176,0.389646,0.389646,82.214969,0.104619,8.601228,0.104619,0.104619
130,2020-11-30,0.232423,90.816197,-0.997441,-90.583775,-0.997441,-0.997441,87.849989,-0.997354,-87.617567,-0.997354,-0.997354
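
All three year-over-year methods are positional: they compare each row with the row 12 positions earlier, which is only correct when no month is missing. When months can be missing, one option is to align by month label instead. This is a sketch of that idea rather than part of the original notebook; the series values are made up, and 2020-03 is deliberately left out.

```python
import pandas as pd

# Hypothetical monthly series with a gap (2020-03 is missing)
idx = pd.PeriodIndex(
    ["2019-01", "2019-02", "2019-03", "2020-01", "2020-02", "2020-04"], freq="M"
)
s = pd.Series([100.0, 110.0, 120.0, 150.0, 99.0, 130.0], index=idx)

# Relabel every value with the month one year later, then align on the labels
prev_year = s.copy()
prev_year.index = s.index + 12  # adding an integer to a PeriodIndex moves it forward by that many months
yoy = s / prev_year.reindex(s.index) - 1

print(yoy)  # NaN wherever the same month of the previous year is not in the data
```

With this alignment, 2020-04 gets NaN because 2019-04 is absent, instead of being silently compared with some other month.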


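## 4. About the pct_change() function
`pct_change` mainly takes the following parameters:

`periods=1`: the number of periods (rows) to shift when computing the change.  
`fill_method='pad'`: how missing values (NA) are handled before the percentage change is computed.  
`limit=None`: the maximum number of consecutive missing values to fill; filling stops once that many values in a run have been filled.  
`freq=None`: the increment to use from the time series API (for example `'M'` or `BDay()`).  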
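
As a quick illustration of `periods`, here is a minimal sketch on made-up values that grow by 10% per step:

```python
import pandas as pd

s = pd.Series([100.0, 110.0, 121.0, 133.1])  # hypothetical values, +10% per step

step1 = s.pct_change()           # each value vs the one directly before it
step2 = s.pct_change(periods=2)  # each value vs the one two rows earlier

print(step1.round(4).tolist())   # [nan, 0.1, 0.1, 0.1]
print(step2.round(4).tolist())   # [nan, nan, 0.21, 0.21]
```
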
### 4.1 Example 1

In [12]:
# Build the data
months = pd.date_range(start='2020-01-01', end='2020-12-31', freq='M')
test_df2 = pd.DataFrame({'month': months,
                         'v': 100*np.random.rand(months.shape[0], 1).reshape(months.shape[0])})
test_df2.loc[((test_df2.index>5) & (test_df2.index<9) ),'v']=np.nan
test_df2.loc[test_df2.index==3,'v']=np.nan
test_df2.loc[test_df2.index==10,'v']=np.nan
test_df2

Unnamed: 0,month,v
0,2020-01-31,1.423735
1,2020-02-29,87.849192
2,2020-03-31,42.760506
3,2020-04-30,
4,2020-05-31,68.371061
5,2020-06-30,95.850213
6,2020-07-31,
7,2020-08-31,
8,2020-09-30,
9,2020-10-31,22.547071


Compute the month-over-month change:

In [17]:
# Forward-fill missing values; at most 2 consecutive missing values are filled
# filling NAs with last valid observation forward to next valid
test_df2['month_erlier_1'] = test_df2['v'].pct_change(1,fill_method='ffill',limit=2)
test_df2

Unnamed: 0,month,v,month_erlier_1
0,2020-01-31,1.423735,
1,2020-02-29,87.849192,60.70334
2,2020-03-31,42.760506,-0.513251
3,2020-04-30,,0.0
4,2020-05-31,68.371061,0.59893
5,2020-06-30,95.850213,0.401912
6,2020-07-31,,0.0
7,2020-08-31,,0.0
8,2020-09-30,,
9,2020-10-31,22.547071,
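
To my knowledge, recent pandas releases (2.1 and later) deprecate the `fill_method` and `limit` arguments of `pct_change`. A sketch of what should be the equivalent explicit form is to forward-fill first and then call `pct_change` with `fill_method=None`. The values below are made up; the run of three missing values shows the effect of `limit=2`.

```python
import numpy as np
import pandas as pd

# Hypothetical series: one isolated gap and one run of three missing values
s = pd.Series([1.0, 2.0, np.nan, 4.0, np.nan, np.nan, np.nan, 8.0])

filled = s.ffill(limit=2)                     # fill at most 2 consecutive NaNs
change = filled.pct_change(fill_method=None)  # no further filling inside pct_change

print(change.tolist())  # [nan, 1.0, 0.0, 1.0, 0.0, 0.0, nan, nan]
```

The third NaN of the run stays unfilled, so both that row and the row after it have no valid change.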


### 4.2 Example 2

In [18]:
# Generate sample data
test_df3 = pd.DataFrame({'2020': 100*np.random.rand(5).reshape(5),
                         '2019': 100*np.random.rand(5).reshape(5),
                         '2018':  100*np.random.rand(5).reshape(5)})
test_df3

Unnamed: 0,2020,2019,2018
0,18.62355,99.163046,16.735136
1,27.258955,61.940332,42.035729
2,68.149457,44.146108,65.547646
3,40.121564,87.582442,95.435632
4,31.814846,39.561549,46.616974


Compute the change between adjacent columns (here, adjacent years):

In [19]:
test_df3.pct_change(axis='columns',periods=-1)

Unnamed: 0,2020,2019,2018
0,-0.812193,4.92544,
1,-0.559916,0.473516,
2,0.543725,-0.326504,
3,-0.541899,-0.082288,
4,-0.195814,-0.151349,


In [20]:
test_df3.pct_change(axis='columns',periods=1)

Unnamed: 0,2020,2019,2018
0,,4.324605,-0.831236
1,,1.272293,-0.321351
2,,-0.352216,0.484789
3,,1.182927,0.089666
4,,0.243493,0.17834
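
With `axis='columns'`, `periods=-1` compares each column with the column to its right, so `2020` is measured against `2019`. A small check with made-up numbers, comparing the call against a by-hand calculation:

```python
import pandas as pd

# Hypothetical yearly values, columns ordered from newest to oldest as above
df = pd.DataFrame({"2020": [110.0, 90.0], "2019": [100.0, 120.0], "2018": [80.0, 100.0]})

by_pct_change = df.pct_change(axis="columns", periods=-1)
by_hand_2020 = df["2020"] / df["2019"] - 1  # 2020 relative to 2019
by_hand_2019 = df["2019"] / df["2018"] - 1  # 2019 relative to 2018

print(by_pct_change)  # the last column is NaN: there is nothing to its right
```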


### 4.3 Example 3

In [21]:
# Build the sample data
months = pd.date_range(start='2020-01-01', end='2020-12-31', freq='M')

test_df4 = pd.DataFrame({'v': 100*np.random.rand(months.shape[0], 1).reshape(months.shape[0])}, index=months)
test_df4

Unnamed: 0,v
2020-01-31,80.499444
2020-02-29,25.238978
2020-03-31,65.631277
2020-04-30,69.010749
2020-05-31,81.933116
2020-06-30,39.740206
2020-07-31,60.881351
2020-08-31,72.80313
2020-09-30,8.722079
2020-10-31,57.490831


Compute the change at each quarter end:

In [22]:
test_df4["v"].pct_change(freq="Q")

2020-01-31         NaN
2020-02-29         NaN
2020-03-31   -0.184699
2020-04-30         NaN
2020-05-31         NaN
2020-06-30   -0.394493
2020-07-31         NaN
2020-08-31         NaN
2020-09-30   -0.780523
2020-10-31         NaN
2020-11-30         NaN
2020-12-31    6.551663
Freq: M, Name: v, dtype: float64

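How the values are computed:  
Value at row 2020-03-31: March is compared with January, i.e. 65.631277/80.499444-1  
Value at row 2020-06-30: June is compared with March, i.e. 39.740206/65.631277-1  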
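
The two quarter-end values can be reproduced from the first six rows of the table. This sketch passes `pd.offsets.QuarterEnd()` as an offset object in place of the `"Q"` string, on the assumption that the two produce the same quarter-end dates, and compares the result with the by-hand ratios.

```python
import pandas as pd

# First six values of column v from the table above
idx = pd.date_range("2020-01-31", periods=6, freq=pd.offsets.MonthEnd())
s = pd.Series([80.499444, 25.238978, 65.631277, 69.010749, 81.933116, 39.740206], index=idx)

q = s.pct_change(freq=pd.offsets.QuarterEnd())

jan, mar, jun = (pd.Timestamp(d) for d in ("2020-01-31", "2020-03-31", "2020-06-30"))
print(round(q[mar], 6), round(s[mar] / s[jan] - 1, 6))  # March vs January
print(round(q[jun], 6), round(s[jun] / s[mar] - 1, 6))  # June vs March
```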

## 5. Summary

**Computing month-over-month growth (`huanbi`)**  

Method 1:  

```python
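# Assumes `data` is a DataFrame sorted by time with a numeric 'mony' column
data['huanbi'] = 'null'  # the first row has no previous period to compare with
for i in range(1, len(data)):
    prev = data['mony'].iloc[i-1]
    # format(res, '.2%') renders a decimal as a percentage string
    data.loc[data.index[i], 'huanbi'] = format((data['mony'].iloc[i] - prev) / prev, '.2%')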
```

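Method 2:  

Use the first-order difference function `diff(periods=1, axis=0)` and divide the result by the previous period's value.   

`periods`: how many rows (or columns) back to take the difference against; the default is 1.   

`axis`: the direction of the difference, {0 or 'index', 1 or 'columns'}. With 0 or 'index' the difference is taken down the rows; with 1 or 'columns' it is taken across the columns. The default is 0.  

```python
# diff() alone gives the absolute change; dividing by the previous value gives the growth rate
data['huanbi_1'] = data.mony.diff() / data.mony.shift()
```  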

Method 3:  

Use `pct_change()`  

```python
data['huanbi_1'] = data.mony.pct_change()  
data.fillna(0,inplace=True)
``` 

**Computing year-over-year growth (`tongbi`)**  

Use the first-order difference function `diff()` with a lag of 12:
```python
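data['tongbi_shu'] = data.mony.diff(12)  # difference from the same month last year
data.fillna(0,inplace=True)              # the first 12 rows have no prior-year value
data['tongbi'] = data['tongbi_shu']/(data['mony'] - data['tongbi_shu'])  # mony - diff = last year's value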
```
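
The snippets above assume an existing `data` frame. A self-contained sketch with made-up values is shown here; it uses `pct_change` for both metrics and builds percentage strings only for display, so the numeric columns remain usable and rows without a comparison period stay NaN instead of becoming 0. The name `shown` and the sample values are assumptions for this illustration.

```python
import pandas as pd

# Hypothetical data: 14 consecutive months of an amount column named 'mony'
data = pd.DataFrame({"mony": [100.0 + 10 * i for i in range(14)]})

data["huanbi"] = data["mony"].pct_change()            # month over month
data["tongbi"] = data["mony"].pct_change(periods=12)  # year over year

def as_percent(x):
    """Render a fraction as a percentage string; empty when there is no comparison period."""
    return "" if pd.isna(x) else f"{x:.2%}"

# Display-only copy with percentage strings
shown = data[["huanbi", "tongbi"]].apply(lambda col: col.map(as_percent))
print(shown.tail(3))
```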


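That covers the ways to compute year-over-year and month-over-month change with pandas. In practice, clean the data first according to its condition, then choose the method that fits.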