# Handling Time Series with Python

## 1. DatetimeIndex
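### 1.1 Creating a DatetimeIndex
1. A DatetimeIndex is a time-series index. pandas provides the date_range function to create one; its main parameters are the start date, the frequency (freq), and either the number of periods (periods) or an end date. \
2. The to_datetime function converts strings (object dtype) to datetime. Alternatively, read_csv() accepts index_col to specify the index column and parse_dates to specify which columns to parse as datetime. The examples here consistently use explicit type conversion.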
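
For comparison, the read_csv() route mentioned in item 2 might look like the following minimal sketch. The inline CSV text is a hypothetical stand-in for a file such as MSFT.csv (three sample rows, rounded), and the variable names `csv_text` and `prices` are illustrative only.

```python
import io
import pandas as pd

# Hypothetical inline CSV standing in for a file on disk
csv_text = """Date,Close,Volume
2020-05-20,185.66,31261300
2020-05-21,183.43,29119500
2020-05-22,183.51,20826900
"""

# index_col picks the index column; parse_dates parses it as datetime
prices = pd.read_csv(
    io.StringIO(csv_text),
    index_col="Date",
    parse_dates=["Date"],
)

print(type(prices.index).__name__)   # DatetimeIndex
print(prices)
```

With this approach the index is already a DatetimeIndex after loading, so no separate to_datetime or set_index step is needed.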

In [29]:
import pandas as pd 
import numpy as np 
pd.options.plotting.backend = "plotly"

# Create time-series indexes with pandas -----------------------
di_1 = pd.date_range('2022-09-04', periods=5, freq='W')     # 5 weekly dates, anchored on Sundays
di_2 = pd.date_range('2022-09-04', '2022-09-30', freq='D')  # daily dates, both endpoints included
# Build a time-series DataFrame indexed by di_1
df_ti_1 = pd.DataFrame(
    data = [183, 562, 18, 97, 49],
    columns = ["visitors"],
    index = di_1
)
df_ti_1


# Change column dtypes; to_datetime converts the date strings
msft = pd.read_csv("data/MSFT.csv")   # forward slash avoids the invalid "\M" escape
msft["Date"] = pd.to_datetime(msft["Date"])
msft["Volume"] = msft["Volume"].astype("int")
# msft.info()
msft

Unnamed: 0,Date,Open,High,Low,Close,Adj Close,Volume
0,1986-03-13,0.088542,0.101563,0.088542,0.097222,0.062205,1031788800
1,1986-03-14,0.097222,0.102431,0.097222,0.100694,0.064427,308160000
2,1986-03-17,0.100694,0.103299,0.100694,0.102431,0.065537,133171200
3,1986-03-18,0.102431,0.103299,0.098958,0.099826,0.063871,67766400
4,1986-03-19,0.099826,0.100694,0.097222,0.098090,0.062760,47894400
...,...,...,...,...,...,...,...
8617,2020-05-20,184.809998,185.850006,183.940002,185.660004,185.660004,31261300
8618,2020-05-21,185.399994,186.669998,183.289993,183.429993,183.429993,29119500
8619,2020-05-22,183.190002,184.460007,182.539993,183.509995,183.509995,20826900
8620,2020-05-26,186.339996,186.500000,181.100006,181.570007,181.570007,36073600


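### 1.2 Filtering a time-series DataFrame
1. When a DataFrame is indexed by a DatetimeIndex, filtering is straightforward: pass a string to .loc to select by year, month, or day. \
2. pd.DateOffset represents a time offset, for example a number of hours or minutes. \
3. tz_localize assigns a time zone to a time-zone-naive index.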
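
To make item 1 concrete, here is a small sketch of string-based selection on a synthetic daily series. The frame `demo` and its values are made up for illustration and are not part of the MSFT data.

```python
import numpy as np
import pandas as pd

# Synthetic daily data covering all of 2022 (values are arbitrary)
idx = pd.date_range("2022-01-01", "2022-12-31", freq="D")
demo = pd.DataFrame({"value": np.arange(len(idx))}, index=idx)

year_rows = demo.loc["2022"]                     # every row in the year
month_rows = demo.loc["2022-09"]                 # every row in September
day_rows = demo.loc["2022-09-04":"2022-09-10"]   # slice, both endpoints included

print(len(year_rows), len(month_rows), len(day_rows))   # 365 30 7
```

Unlike positional slicing, a string slice on a DatetimeIndex includes the end label, which is why the last selection returns seven rows.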

In [48]:
# msft = msft.set_index("Date")      # run only once; Date must be the index for the lines below
msft.loc["1987-01":"2000-06", "High"].plot()
msft_close = msft.loc[:,["Adj Close"]].copy()
msft_close.index = msft_close.index + pd.DateOffset(hours=6)   # shift every index value by 6 hours
msft_close = msft_close.tz_localize("America/New_York")   # attach a time zone (localize, not convert)
# msft_close.info()
msft_close

Date,Adj Close
1986-03-13 06:00:00-05:00,0.062205
1986-03-14 06:00:00-05:00,0.064427
1986-03-17 06:00:00-05:00,0.065537
1986-03-18 06:00:00-05:00,0.063871
1986-03-19 06:00:00-05:00,0.062760
...,...
2020-05-20 06:00:00-04:00,185.660004
2020-05-21 06:00:00-04:00,183.429993
2020-05-22 06:00:00-04:00,183.509995
2020-05-26 06:00:00-04:00,181.570007
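
The offsets in the output above switch between -05:00 and -04:00 because America/New_York observes daylight saving time. The sketch below, using two made-up timestamps, illustrates that and also contrasts tz_localize with tz_convert, which the original cell does not use and is shown here only for comparison.

```python
import pandas as pd

# Two made-up naive timestamps: one in winter, one in summer
naive = pd.DatetimeIndex(["2020-01-15 06:00", "2020-07-15 06:00"])

# tz_localize attaches a zone; the wall-clock time (06:00) is unchanged
ny = naive.tz_localize("America/New_York")

# tz_convert re-expresses an aware index in another zone; the instant is unchanged
utc = ny.tz_convert("UTC")

print(ny)    # 06:00-05:00 in January, 06:00-04:00 in July
print(utc)   # 11:00 UTC in January, 10:00 UTC in July
```

Calling tz_convert on a naive index raises an error, so localizing first is the usual order.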
