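Reindexing changes the row labels and column labels of a DataFrame. To reindex means to conform the data to match a given set of labels along a particular axis. Reindexing can accomplish several operations:

* Reorder the existing data to match a new set of labels
* Insert missing-value (NaN) markers at label locations where no data for that label existed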
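
As a minimal sketch of the two operations listed above, the following uses a small hand-written frame (the values and the labels 7 and "B" are made up for illustration) so the effect of each requested label is easy to verify by eye:

```python
import pandas as pd

# Small deterministic frame; the values are illustrative only.
small = pd.DataFrame(
    {"A": [10, 20, 30, 40], "C": ["Low", "Medium", "High", "Low"]},
    index=[0, 1, 2, 3],
)

# Rows 2 and 0 are reordered; row label 7 and column label "B"
# do not exist in `small`, so they are filled with NaN.
result = small.reindex(index=[2, 0, 7], columns=["C", "A", "B"])
print(result)
```

Existing labels keep their data in the new order, while every label that was not present before appears as a row or column of NaN.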

In [3]:
import pandas as pd
import numpy as np

N = 20
df = pd.DataFrame({
    'A': pd.date_range(start='2016-01-01', periods=N, freq='D'),
    'x': np.linspace(0, stop=N-1, num=N),
    'y': np.random.rand(N),
    'C': np.random.choice(['Low', 'Medium', 'High'], N).tolist(),
    'D': np.random.normal(100, 10, size=(N)).tolist()
})
print(df)
# Keep rows 0, 2 and 5; column 'B' does not exist in df, so it is added as all NaN
df_reindexed = df.reindex(index=[0, 2, 5], columns=['A', 'C', 'B'])
print(df_reindexed)

            A       C           D     x         y
0  2016-01-01  Medium  106.265157   0.0  0.295538
1  2016-01-02     Low   78.237318   1.0  0.153346
2  2016-01-03     Low   94.963131   2.0  0.697968
3  2016-01-04    High  104.917026   3.0  0.331079
4  2016-01-05     Low   86.994989   4.0  0.775861
5  2016-01-06     Low  103.636295   5.0  0.616176
6  2016-01-07  Medium   97.255238   6.0  0.477146
7  2016-01-08  Medium   86.008271   7.0  0.968608
8  2016-01-09  Medium  107.270366   8.0  0.928698
9  2016-01-10  Medium   99.670510   9.0  0.871012
10 2016-01-11     Low   93.518821  10.0  0.649499
11 2016-01-12     Low  111.249369  11.0  0.009892
12 2016-01-13    High   87.917952  12.0  0.741582
13 2016-01-14     Low  107.717665  13.0  0.534884
14 2016-01-15    High  114.813900  14.0  0.278571
15 2016-01-16    High   84.657928  15.0  0.579414
16 2016-01-17  Medium   99.428451  16.0  0.273731
17 2016-01-18    High  103.114903  17.0  0.199907
18 2016-01-19    High  107.326929  18.0  0.410198


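## Reindexing to Align with Other Objects

Sometimes you may want to take an object and reindex it so that its axes are labeled the same as those of another object.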

In [6]:
df1 = pd.DataFrame(np.random.randn(10, 3), columns=['col1', 'col2', 'col3'])
df2 = pd.DataFrame(np.random.randn(7, 3), columns=['col1', 'col2', 'col3'])

print(df1)
print(df2)
# Reindex df1 to match df2; column names should match, otherwise the entire column is filled with NaN
df1 = df1.reindex_like(df2)
print(df1)

       col1      col2      col3
0 -2.154776 -2.544893  0.179949
1 -0.377879  1.075211 -1.745613
2 -1.631973 -0.707619  0.298654
3  1.189085 -0.179246  2.101088
4 -1.476757 -2.125389  1.072406
5  0.164640 -0.931041 -0.879949
6  1.169516  1.944424 -0.314656
7  0.273267 -0.360891 -0.285548
8 -0.990223 -0.616278  0.935378
9 -0.701698  0.613181 -1.498197
       col1      col2      col3
0 -0.703012 -0.251066 -0.237921
1  0.232135 -0.418187  0.520454
2  0.055573  0.432991 -0.133974
3 -0.179175  1.098769 -0.911175
4 -1.119787  0.172555 -0.768496
5  0.322172 -0.428272  0.605681
6  1.140175  0.222858 -0.315845
       col1      col2      col3
0 -2.154776 -2.544893  0.179949
1 -0.377879  1.075211 -1.745613
2 -1.631973 -0.707619  0.298654
3  1.189085 -0.179246  2.101088
4 -1.476757 -2.125389  1.072406
5  0.164640 -0.931041 -0.879949
6  1.169516  1.944424 -0.314656
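
The case where the column names do not match can be sketched as follows. The frames `left` and `other` and their values are hypothetical, chosen only to make the result easy to check:

```python
import pandas as pd

left = pd.DataFrame({"col1": [1.0, 2.0, 3.0], "col2": [4.0, 5.0, 6.0]})
other = pd.DataFrame({"col1": [0.0, 0.0], "col3": [0.0, 0.0]})

# `aligned` takes its row and column labels from `other`,
# but its values from `left` wherever the labels exist there.
aligned = left.reindex_like(other)
print(aligned)
```

Here `col2` is dropped because `other` has no such column, `col3` is entirely NaN because `left` has no data for it, and only the first two rows survive.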


### Filling While Reindexing

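reindex() accepts an optional method parameter, a filling method with the following values:

* pad/ffill    - fill values forward
* bfill/backfill    - fill values backward
* nearest    - fill from the nearest index values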
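
To compare the three methods side by side, here is a small sketch on a Series with a sparse, monotonically increasing index (the fill methods require a monotonic index). The Series `s` and the labels in `new_index` are made up for illustration:

```python
import pandas as pd

s = pd.Series([10.0, 20.0, 30.0], index=[0, 5, 10])
new_index = [0, 2, 4, 6, 10, 12]

# Forward fill: each new label takes the value of the last original label at or before it.
forward = s.reindex(new_index, method="ffill")
# Backward fill: each new label takes the value of the next original label at or after it.
backward = s.reindex(new_index, method="bfill")
# Nearest: each new label takes the value of the closest original label.
nearest = s.reindex(new_index, method="nearest")

print(pd.DataFrame({"ffill": forward, "bfill": backward, "nearest": nearest}))
```

Label 12 lies past the last original label, so the backward fill has nothing to copy from and leaves NaN there, while the other two methods reuse the value at label 10.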

In [8]:
df1 = pd.DataFrame(np.random.randn(6,3),columns=['col1','col2','col3'])
df2 = pd.DataFrame(np.random.randn(2,3),columns=['col1','col2','col3'])
print(df2.reindex_like(df1))
df2.reindex_like(df1, method='ffill')

       col1      col2      col3
0  0.662009  0.834706 -0.849691
1  1.450630 -0.353223 -0.974180
2       NaN       NaN       NaN
3       NaN       NaN       NaN
4       NaN       NaN       NaN
5       NaN       NaN       NaN


       col1      col2      col3
0  0.662009  0.834706 -0.849691
1  1.450630 -0.353223 -0.974180
2  1.450630 -0.353223 -0.974180
3  1.450630 -0.353223 -0.974180
4  1.450630 -0.353223 -0.974180
5  1.450630 -0.353223 -0.974180


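### Limits on Filling While Reindexing

The limit argument provides additional control over filling while reindexing. It specifies the maximum number of consecutive labels to fill.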

In [10]:
df1 = pd.DataFrame(np.random.randn(6,3),columns=['col1','col2','col3'])
df2 = pd.DataFrame(np.random.randn(2,3),columns=['col1','col2','col3'])

print(df2.reindex_like(df1, method='ffill', limit=1))

       col1      col2      col3
0  1.554232 -0.248377 -1.346286
1 -1.972903 -1.052737  0.078422
2 -1.972903 -1.052737  0.078422
3       NaN       NaN       NaN
4       NaN       NaN       NaN
5       NaN       NaN       NaN
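
Because the frames above are random, the effect of different limit values can be easier to see on fixed data. The sketch below uses a hypothetical two-row frame and calls reindex() directly with an explicit list of labels:

```python
import pandas as pd

two_rows = pd.DataFrame({"col1": [1.0, 2.0]}, index=[0, 1])
target = [0, 1, 2, 3, 4, 5]

# Forward-fill at most one, then at most two, consecutive new labels.
limit_one = two_rows.reindex(target, method="ffill", limit=1)
limit_two = two_rows.reindex(target, method="ffill", limit=2)

print(limit_one)
print(limit_two)
```

With limit=1 only label 2 receives the value from label 1; with limit=2 labels 2 and 3 are filled, and the remaining labels stay NaN.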


## Renaming

The rename() method lets you relabel an axis based on a mapping (a dict or Series) or an arbitrary function.

In [15]:
df1 = pd.DataFrame(np.random.randn(6,3),columns=['col1','col2','col3'])
print(df1)
print(df1.rename(columns={'col1': 'c1', 'col2': 'c2'}, 
                 index={0: 'apple', 1: 'banana', 2: 'durian'}))


       col1      col2      col3
0 -0.841419 -1.047778  0.526923
1 -0.240298  2.084587  2.137572
2 -1.502691  0.141445 -0.562457
3  0.233206  1.348206 -0.061894
4 -0.043736  0.090903  0.285587
5 -0.303958 -0.654622 -0.519046
              c1        c2      col3
apple  -0.841419 -1.047778  0.526923
banana -0.240298  2.084587  2.137572
durian -1.502691  0.141445 -0.562457
3       0.233206  1.348206 -0.061894
4      -0.043736  0.090903  0.285587
5      -0.303958 -0.654622 -0.519046
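
The example above uses dicts. As a sketch of the other two options, the following passes a function and a Series as the mapper; the frame `frame` and the new label names are made up for illustration:

```python
import pandas as pd

frame = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})

# A function is applied to every label on the given axis.
by_function = frame.rename(columns=str.upper, index=lambda i: f"row{i}")

# A Series works like a dict: its index holds the old labels, its values the new ones.
# Labels missing from the mapping (here "col2") are left unchanged.
by_series = frame.rename(columns=pd.Series({"col1": "c1"}))

print(by_function)
print(by_series)
```

By default rename() returns a new object, so `frame` itself keeps its original labels.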
