# Working with Time Series in Python

## 1. DatetimeIndex
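### 1.1 Creating a DatetimeIndex
1. A DatetimeIndex is a time series index. pandas provides the date_range function to create one; its main parameters are the start date, the frequency (freq), and either the number of periods (periods) or an end date. \
2. The to_datetime function converts strings (object dtype) to the datetime type. Alternatively, read_csv() accepts index_col to choose the index column and parse_dates to list the columns that should be converted to datetime. The examples here consistently use explicit type conversion.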

In [1]:
import pandas as pd 
import numpy as np 
pd.options.plotting.backend = "plotly"

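# Create time series indexes with pandas -----------------------
di_1 = pd.date_range('2022-09-04', periods=5, freq='W')           # 5 weekly (Sunday) timestamps
di_2 = pd.date_range('2022-09-04', '2022-09-30', freq='D')        # every day from start to end
# Build a time series DataFrame that uses di_1 as its index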
df_ti_1 = pd.DataFrame(
    data = [183, 562, 18, 97, 49] ,
    columns = ["visitors"],
    index = di_1 
)
df_ti_1 


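# Change data types: use to_datetime to convert the Date column
msft = pd.read_csv("data/MSFT.csv")   # forward slash avoids the invalid "\M" escape
# Assign whole columns directly; .loc[:, col] = ... may keep the old dtype in recent pandas
msft["Date"] = pd.to_datetime(msft["Date"])
msft["Volume"] = msft["Volume"].astype("int")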
# msft.info()
msft

,Date,Open,High,Low,Close,Adj Close,Volume
0,1986-03-13,0.088542,0.101563,0.088542,0.097222,0.062205,1031788800
1,1986-03-14,0.097222,0.102431,0.097222,0.100694,0.064427,308160000
2,1986-03-17,0.100694,0.103299,0.100694,0.102431,0.065537,133171200
3,1986-03-18,0.102431,0.103299,0.098958,0.099826,0.063871,67766400
4,1986-03-19,0.099826,0.100694,0.097222,0.098090,0.062760,47894400
...,...,...,...,...,...,...,...
8617,2020-05-20,184.809998,185.850006,183.940002,185.660004,185.660004,31261300
8618,2020-05-21,185.399994,186.669998,183.289993,183.429993,183.429993,29119500
8619,2020-05-22,183.190002,184.460007,182.539993,183.509995,183.509995,20826900
8620,2020-05-26,186.339996,186.500000,181.100006,181.570007,181.570007,36073600
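
The read_csv() alternative mentioned in 1.1 can be sketched as follows. This is a minimal illustration, not the approach used in the rest of this notebook: the inline `csv_text` stands in for data/MSFT.csv and contains only the first three rows of the output above, and the names `csv_text` and `sample` are introduced here for the example.

```python
import io
import pandas as pd

# A few rows copied from the MSFT output above, standing in for data/MSFT.csv
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
1986-03-13,0.088542,0.101563,0.088542,0.097222,0.062205,1031788800
1986-03-14,0.097222,0.102431,0.097222,0.100694,0.064427,308160000
1986-03-17,0.100694,0.103299,0.100694,0.102431,0.065537,133171200
"""

# parse_dates converts the Date column to datetime; index_col makes it the index
sample = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=["Date"])

print(type(sample.index).__name__)
print(sample.head())
```

With this approach the frame arrives already indexed by a DatetimeIndex, so no separate to_datetime or set_index step is needed.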


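### 1.2 Filtering a Time Series DataFrame & Handling Time Zones
1. When a DataFrame is indexed by a DatetimeIndex, filtering is straightforward: a string argument selects rows by year, month, or day. \
2. pd.DateOffset represents a time offset, which can be expressed in hours, minutes, and so on. \
3. tz_localize sets the time zone: it attaches a zone to time-zone-naive timestamps without changing the clock time.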
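
As a small sketch of points 2 and 3, the snippet below shifts a naive index by six hours and then attaches a time zone. The frame `naive` and its values are hypothetical, and the tz_convert step is an addition for contrast; it is not used elsewhere in this notebook.

```python
import pandas as pd

# Hypothetical naive (time-zone-free) daily index, for illustration only
naive = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0]},
    index=pd.date_range("2020-01-02", periods=3, freq="D"),
)

# DateOffset shifts every timestamp by a fixed amount (here 6 hours)
shifted = naive.copy()
shifted.index = shifted.index + pd.DateOffset(hours=6)

# tz_localize attaches a zone; the wall-clock time (06:00) is unchanged
ny = shifted.tz_localize("America/New_York")

# tz_convert re-expresses the same instants in another zone
utc = ny.tz_convert("UTC")

print(ny.index[0])
print(utc.index[0])
```

The localized and converted indexes refer to the same moments in time; only the displayed clock time and offset differ.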

In [9]:
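# msft = msft.set_index("Date")      # uncomment on the first run only; a second call fails because Date is no longer a column
msft.loc["1987-01":"2000-06", "High"].plot()
msft_close = msft.loc[:,["Adj Close"]].copy()
msft_close.index = msft_close.index + pd.DateOffset(hours=6)   # add 6 hours to every index value
msft_close = msft_close.tz_localize("America/New_York")   # attach the New York time zone (localize, not convert)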
# msft_close.info()
msft_close

# Select the price data for January 2020
msft.loc[ "2020-01" ,: ]
msft_202001 = msft.loc[ "2020-01" , "Low":"Volume" ]

Date,Low,Close,Adj Close,Volume
2020-01-02,158.330002,160.619995,159.737595,22622100
2020-01-03,158.059998,158.619995,157.748581,21116200
2020-01-06,156.509995,159.029999,158.156342,20813700
2020-01-07,157.320007,157.580002,156.71431,21634100
2020-01-08,157.949997,160.089996,159.210495,27746500
2020-01-09,161.029999,162.089996,161.199509,21385000
2020-01-10,161.179993,161.339996,160.453644,20725900
2020-01-13,161.259995,163.279999,162.38298,21626500
2020-01-14,161.720001,162.130005,161.239304,23477400
2020-01-15,162.570007,163.179993,162.283539,21417900
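
The string-based selection used above works at several granularities. The following is a minimal sketch on hypothetical daily data (the frame `daily` and its values are made up for illustration), showing selection by year, by month, and by an inclusive day range.

```python
import pandas as pd

# Hypothetical daily data running from 2019-12-30 to 2020-02-07
idx = pd.date_range("2019-12-30", periods=40, freq="D")
daily = pd.DataFrame({"value": range(40)}, index=idx)

by_year = daily.loc["2020"]                       # every row in 2020
by_month = daily.loc["2020-01"]                   # every row in January 2020
by_range = daily.loc["2020-01-15":"2020-01-17"]   # both endpoints are included

print(len(by_year), len(by_month), len(by_range))
```

Note that, unlike positional slicing, a slice between two date strings includes the end point.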


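## 2. Time Series Operations
1. The pandas shift method moves values down by one row by default (everything except the index counts as a value). A positive argument shifts values down; a negative argument shifts them up. \
2. The built-in pct_change method, by default, computes the change relative to the previous row as a fraction (for example, 0.05 means a 5% increase).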
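
To make the direction of shift concrete, here is a tiny sketch on a hypothetical three-value series (the name `s` and its values are illustrative only). The index stays in place while the values move.

```python
import pandas as pd

# Hypothetical series, for illustration only
s = pd.Series(
    [10.0, 11.0, 12.1],
    index=pd.date_range("2020-01-01", periods=3, freq="D"),
)

down = s.shift(1)    # values move down one row; the first row becomes NaN
up = s.shift(-1)     # values move up one row; the last row becomes NaN

print(pd.DataFrame({"original": s, "shift(1)": down, "shift(-1)": up}))
```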

In [21]:
# The following two computations return the same result
msft_202001_rate = msft_202001/msft_202001.shift(1) - 1 
msft_202001_pct_change = msft_202001.pct_change()


Date,Low,Close,Adj Close,Volume
2020-01-02,,,,
2020-01-03,-0.001705,-0.012452,-0.012452,-0.066568
2020-01-06,-0.009806,0.002585,0.002585,-0.014325
2020-01-07,0.005175,-0.009118,-0.009118,0.039416
2020-01-08,0.004005,0.015928,0.015928,0.282535
2020-01-09,0.0195,0.012493,0.012493,-0.229272
2020-01-10,0.000931,-0.004627,-0.004627,-0.030821
2020-01-13,0.000496,0.012024,0.012024,0.043453
2020-01-14,0.002853,-0.007043,-0.007043,0.085585
2020-01-15,0.005256,0.006476,0.006476,-0.087723


In [20]:
msft_202001_pct_change

Date,Low,Close,Adj Close,Volume
2020-01-02,,,,
2020-01-03,-0.001705,-0.012452,-0.012452,-0.066568
2020-01-06,-0.009806,0.002585,0.002585,-0.014325
2020-01-07,0.005175,-0.009118,-0.009118,0.039416
2020-01-08,0.004005,0.015928,0.015928,0.282535
2020-01-09,0.0195,0.012493,0.012493,-0.229272
2020-01-10,0.000931,-0.004627,-0.004627,-0.030821
2020-01-13,0.000496,0.012024,0.012024,0.043453
2020-01-14,0.002853,-0.007043,-0.007043,0.085585
2020-01-15,0.005256,0.006476,0.006476,-0.087723
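
The equivalence of the two computations can also be checked programmatically rather than by comparing printed tables. The sketch below rebuilds the first three January 2020 rows (Low and Close only) from the output in section 1.2; the names `df`, `manual`, and `builtin` are introduced here for the example.

```python
import pandas as pd

# First three January 2020 rows from the table in section 1.2 (Low and Close only)
df = pd.DataFrame(
    {
        "Low": [158.330002, 158.059998, 156.509995],
        "Close": [160.619995, 158.619995, 159.029999],
    },
    index=pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"]),
)

manual = df / df.shift(1) - 1    # ratio to the previous row, minus 1
builtin = df.pct_change()        # same quantity via the built-in method

# Raises AssertionError if the two frames differ beyond floating-point tolerance
pd.testing.assert_frame_equal(manual, builtin)

# Multiply by 100 to read the fractions as percentages
print((builtin * 100).round(2))
```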
