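### pandas time indexes (DatetimeIndex)
pandas provides time-based indexes for DataFrame (and Series) because in many applications the index really is a sequence of points in time; dedicated functions for this case make code both easier to write and faster to run.<br/>
(1) Each element of a DatetimeIndex is a Timestamp, and pd.Timestamp is a subclass of datetime.datetime, so a DatetimeIndex can be treated as an index of datetime.datetime values (DatetimeIndex itself is an Index subclass, not a datetime.datetime subclass); <br/>
(2) Time-based functions such as resample only work when the index is a DatetimeIndex, TimedeltaIndex, or PeriodIndex; <br/>
(3) Looking at standard spans of time instead of single points (freq sets the length of the span) leads to the Period class, a coarser-grained view of time as I currently understand it; an index made of Periods is a PeriodIndex; <br/>
(4) date_range and period_range generate the corresponding indexes, and data indexed by either a DatetimeIndex or a PeriodIndex can be converted to another frequency with the asfreq method.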
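
The class relationships in point (1) can be checked directly. This is a minimal sketch; the two dates are illustrative values, not data from the cells in this notebook.

```python
import datetime
import pandas as pd

# Illustrative index with two dates
idx = pd.DatetimeIndex(['2018-12-01', '2018-12-02'])
first = idx[0]

# Each element of a DatetimeIndex is a pd.Timestamp
is_ts = isinstance(first, pd.Timestamp)
# pd.Timestamp subclasses datetime.datetime
ts_is_datetime = issubclass(pd.Timestamp, datetime.datetime)
# The index class itself does not
index_is_datetime = issubclass(pd.DatetimeIndex, datetime.datetime)

print(is_ts, ts_is_datetime, index_is_datetime)
```

Because the elements are datetime.datetime subclass instances, attributes such as first.year or first.month are available on them.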

In [1]:
import pandas as pd
import datetime
import numpy as np

# A pandas point in time (pd.Timestamp) corresponds conceptually to datetime.datetime
now = datetime.datetime.now()
pd_now = pd.to_datetime(now)
type(pd_now)

pandas._libs.tslibs.timestamps.Timestamp

In [2]:
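# resample and other pandas time functions need a datetime-like index (pandas Timestamps), so convert the index first
# Note: recent pandas releases spell the hourly alias in lowercase ('8h'); '8H' is kept to match the recorded output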
df = pd.DataFrame({'date': ['2018-12-01', '2018-12-02', '2018-12-03', '2018-12-04'],
                            'number': list(range(4))})
# df1: calling resample directly on a string index raises an error
df1 = df.copy()
df1.set_index('date', inplace=True)
print(df1.index)
try:
    df1 = df1.resample('8H')
except Exception as error:
    print('--cannot resample df1--')
    print(f'--error: {error}--')
    
# df2: convert the column with pd.to_datetime
df2 = df.copy()
df2['date'] = pd.to_datetime(df2['date'])
df2.set_index('date', inplace=True)
print(df2.index)
df2 = df2.resample('8H').asfreq()
print(df2)

# df3: build the index with pd.DatetimeIndex
df3 = df.copy()
df3.set_index('date', inplace=True)
df3.index = pd.DatetimeIndex(df3.index)
print(df3.index)

# df4: convert with datetime.datetime.strptime
df4 = df.copy()
df4['date'] = df4['date'].apply(lambda x: datetime.datetime.strptime(x, '%Y-%m-%d'))
df4.set_index('date', inplace=True)
print(df4.index)
df4 = df4.resample('8H').ffill()
print(df4)

Index(['2018-12-01', '2018-12-02', '2018-12-03', '2018-12-04'], dtype='object', name='date')
--cannot resample df1--
--error: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'--
DatetimeIndex(['2018-12-01', '2018-12-02', '2018-12-03', '2018-12-04'], dtype='datetime64[ns]', name='date', freq=None)
                     number
date                       
2018-12-01 00:00:00     0.0
2018-12-01 08:00:00     NaN
2018-12-01 16:00:00     NaN
2018-12-02 00:00:00     1.0
2018-12-02 08:00:00     NaN
2018-12-02 16:00:00     NaN
2018-12-03 00:00:00     2.0
2018-12-03 08:00:00     NaN
2018-12-03 16:00:00     NaN
2018-12-04 00:00:00     3.0
DatetimeIndex(['2018-12-01', '2018-12-02', '2018-12-03', '2018-12-04'], dtype='datetime64[ns]', name='date', freq=None)
DatetimeIndex(['2018-12-01', '2018-12-02', '2018-12-03', '2018-12-04'], dtype='datetime64[ns]', name='date', freq=None)
                     number
date                       
2018-12-01 00:00:00       0
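
The error message above names three index types that resample accepts. As a minimal sketch of one besides DatetimeIndex, the following resamples a Series indexed by elapsed time (a TimedeltaIndex). The data is illustrative and not taken from the cells above; the rule is passed as a pd.Timedelta so that the example does not depend on the spelling of a frequency alias.

```python
import pandas as pd

# Elapsed-time index: 0, 30, 60 and 90 minutes (illustrative data)
elapsed = pd.to_timedelta([0, 30, 60, 90], unit='min')
s = pd.Series([0, 1, 2, 3], index=elapsed)

# Downsample into one-hour bins and sum the values in each bin
hourly = s.resample(pd.Timedelta(hours=1)).sum()

print(type(s.index).__name__)
print(hourly)
```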

In [3]:
# The Period class
## Create a Period with pd.Period.now()
print('Create a Period with pd.Period.now()')
now_day_H = pd.Period.now(freq="H")
print(now_day_H)
print(type(now_day_H))
now_day_Q = pd.Period.now(freq="Q")
print(now_day_Q)
print(now_day_Q.start_time)  # the start_time property returns a pd.Timestamp
print(type(now_day_Q.start_time))
print(now_day_Q.start_time.month)  # Timestamp attributes (inherited from datetime.datetime) work on the result
print(now_day_Q.end_time)
print(type(now_day_Q.end_time))
now_day_W = pd.Period.now(freq="W-FRI")
print(now_day_W)
now_day_A = pd.Period.now(freq="A")
print(now_day_A)

## Create a Period with Timestamp.to_period()
print('Create a Period with Timestamp.to_period()')
now = datetime.datetime.now()
pd_now = pd.to_datetime(now)
print(pd_now.to_period('H'))
print(pd_now.to_period('H').end_time)

Create a Period with pd.Period.now()
2018-12-25 18:00
<class 'pandas._libs.tslibs.period.Period'>
2018Q4
2018-10-01 00:00:00
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
10
2018-12-31 23:59:59.999999999
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
2018-12-22/2018-12-28
2018
Create a Period with Timestamp.to_period()
2018-12-25 18:00
2018-12-25 18:59:59.999999999
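
The cell above depends on the current time, so its output changes from run to run. As a reproducible sketch of the same Timestamp-to-Period relationship, the following uses a fixed, illustrative timestamp and also shows that a Period supports integer arithmetic in units of its own frequency.

```python
import pandas as pd

# Fixed, illustrative timestamp so the result is reproducible
ts = pd.Timestamp('2018-12-25 18:30')

# The monthly Period that contains the timestamp
p = ts.to_period('M')
print(p)

# A Period spans start_time..end_time, and the timestamp lies inside it
contains = p.start_time <= ts <= p.end_time
print(p.start_time, contains)

# Adding an integer moves by whole periods of the same frequency
next_p = p + 1
print(next_p)
```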


In [4]:
# pd.date_range and pd.period_range
# In date_range, freq='M' means month end, so the last stamp is 2017-12-31;
# period_range includes the period that contains the end date (2018-01)
dr = pd.date_range(start='2017-01-01', end='2018-01-01', freq='M')
print(dr)
pr = pd.period_range(start='2017-01-01', end='2018-01-01', freq='M')
print(pr)

# DataFrame.asfreq() with a DatetimeIndex
print('df with a DatetimeIndex:')
df = pd.DataFrame({'A': range(len(dr)), 'B': np.random.randn(len(dr))}, index=dr)
print(df)
df = df.asfreq('15D', method='ffill', how='end')  # how applies only to a PeriodIndex
print('After asfreq() on the DatetimeIndex df (effectively upsampling)')
print(df)

# DataFrame.asfreq() with a PeriodIndex
print('df with a PeriodIndex:')
df = pd.DataFrame({'A': range(len(pr)), 'B': np.random.randn(len(pr))}, index=pr)
print(df)
df = df.asfreq('Q', how='end')
df = df.groupby(df.index).sum()
print('After asfreq() on the PeriodIndex df, followed by groupby().sum() (effectively downsampling)')
print(df)

DatetimeIndex(['2017-01-31', '2017-02-28', '2017-03-31', '2017-04-30',
               '2017-05-31', '2017-06-30', '2017-07-31', '2017-08-31',
               '2017-09-30', '2017-10-31', '2017-11-30', '2017-12-31'],
              dtype='datetime64[ns]', freq='M')
PeriodIndex(['2017-01', '2017-02', '2017-03', '2017-04', '2017-05', '2017-06',
             '2017-07', '2017-08', '2017-09', '2017-10', '2017-11', '2017-12',
             '2018-01'],
            dtype='period[M]', freq='M')
df with a DatetimeIndex:
             A         B
2017-01-31   0 -0.006997
2017-02-28   1  1.088747
2017-03-31   2 -0.781522
2017-04-30   3 -0.398976
2017-05-31   4  0.759961
2017-06-30   5 -2.219370
2017-07-31   6 -0.226897
2017-08-31   7 -1.566285
2017-09-30   8  2.713642
2017-10-31   9 -1.388524
2017-11-30  10  1.402581
2017-12-31  11 -0.486642
After asfreq() on the DatetimeIndex df (effectively upsampling)
             A         B
2017-01-31   0 -0.006997
2017-02-15   0 -0.006997
2017-03-02   1  1.088747
2017-03-17   1  1.088747
2017-04-01 