In [5]:
import numpy as np
import pandas as pd
import datetime

# 1. Creation

## 1.1 to_datetime
`pd.to_datetime(arg, errors='raise', dayfirst=False, yearfirst=False, utc=None, box=True, format=None, exact=True, unit=None, infer_datetime_format=False, origin='unix', cache=True)`

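- arg: integer, float, string, datetime, list, tuple, 1-d array, Series or DataFrame/dict-like
- errors: {'ignore', 'raise', 'coerce'}, default 'raise'
    - 'ignore': a string that cannot be parsed is returned unchanged
    - 'raise': a string that cannot be parsed raises an exception
    - 'coerce': a string that cannot be parsed is converted to NaT
- dayfirst: sets the parsing order when arg is a string or list-like. If True, 10/11/12 is parsed as 2012/11/10
- yearfirst: sets the parsing order when arg is a string or list-like. If True, 10/11/12 is parsed as 2010/11/12
    - If both dayfirst and yearfirst are True, yearfirst takes precedence (this is the default behavior)
- format: the strftime-style format used for parsing.
    - pd.to_datetime('12-2010-10 00:00', format='%d-%Y-%m %H:%M') is parsed as 2010-10-12 00:00:00

The return type depends on the input:
- a scalar returns a Timestamp
- an array-like returns a DatetimeIndex
- a Series or DataFrame returns a Series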
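
The `errors`, `dayfirst`, and `format` options described above can be tried side by side. This is a minimal sketch; the input strings are made up for illustration, and a four-digit year is used in the `dayfirst` case to keep the result unambiguous.

```python
import pandas as pd

# errors='coerce': entries that cannot be parsed become NaT instead of raising.
coerced = pd.to_datetime(['2019-01-01', 'not a date'], errors='coerce')
print(coerced)

# dayfirst=True: the first field is read as the day, the second as the month.
day_first = pd.to_datetime('10/11/2012', dayfirst=True)
print(day_first)

# format: an explicit strftime-style pattern removes any ambiguity.
explicit = pd.to_datetime('12-2010-10 00:00', format='%d-%Y-%m %H:%M')
print(explicit)
```

When the layout of the input is known in advance, passing `format` is usually the safer choice, since the result then does not depend on how pandas guesses the field order.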

In [7]:
# A scalar input returns a Timestamp
pd.to_datetime('2019')

Timestamp('2019-01-01 00:00:00')

In [8]:
# An array-like input returns a DatetimeIndex
pd.to_datetime(['20190101', '20190201', '20190301'])

DatetimeIndex(['2019-01-01', '2019-02-01', '2019-03-01'], dtype='datetime64[ns]', freq=None)

In [9]:
# A Series input returns a Series
s = pd.Series(['20190101', '20190201', '20190301'])
pd.to_datetime(s)

0   2019-01-01
1   2019-02-01
2   2019-03-01
dtype: datetime64[ns]

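A datetime Series can also be assembled from a DataFrame, in which case the column names identify the time units:

`year`, `month`, `day` are required column names

`hour`, `minute`, `second`, `millisecond`, `microsecond`, `nanosecond` are optional column names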

In [10]:
# A DataFrame input returns a Series
df = pd.DataFrame({'year': [2018, 2019],'month': [3, 4], 'day': [6, 8],'hour': [3, 1], 'minute': [10, 20]})
pd.to_datetime(df)

pd.to_datetime(df[['year', 'month', 'day']])

0   2018-03-06 03:10:00
1   2019-04-08 01:20:00
dtype: datetime64[ns]

0   2018-03-06
1   2019-04-08
dtype: datetime64[ns]
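
To see why `year`, `month`, and `day` are called required, the sketch below assembles dates from a small made-up frame (`parts` is a hypothetical name) and then drops the `day` column. As far as I know, pandas reports the missing unit with a `ValueError`.

```python
import pandas as pd

parts = pd.DataFrame({'year': [2018, 2019], 'month': [3, 4], 'day': [6, 8]})

# All three required columns are present, so assembly succeeds.
assembled = pd.to_datetime(parts)
print(assembled)

# Without 'day' the required set is incomplete and assembly fails.
try:
    pd.to_datetime(parts[['year', 'month']])
    missing_day_failed = False
except ValueError as exc:
    missing_day_failed = True
    print(exc)
```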

In [29]:
# Date parsing in pandas is very flexible
datestrs = ['2019-07-06 12:00:00', '1/09/2019', '20190101', 'Jul 31, 2019', np.datetime64('2018-01-01'), datetime.datetime.now()]
pd.to_datetime(datestrs)

DatetimeIndex([       '2019-07-06 12:00:00',        '2019-01-09 00:00:00',
                      '2019-01-01 00:00:00',        '2019-07-31 00:00:00',
                      '2018-01-01 00:00:00', '2019-09-08 09:52:13.906080'],
              dtype='datetime64[ns]', freq=None)

---
## 1.2 date_range
`pd.date_range(start=None, end=None, periods=None, freq=None, tz=None, normalize=False, name=None, closed=None, **kwargs)`

Generates a fixed-frequency `DatetimeIndex`.

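- start: string or datetime-like, default None; the first date of the range
- end: string or datetime-like, default None; the last date of the range
- periods: integer or None, default None; the number of index values to generate. If it is None, both start and end must be given
- freq: string or DateOffset, default 'D' (calendar day); the spacing between values, e.g. '5H' means one value every 5 hours. All available aliases are listed [here](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timeseries-offset-aliases).
    - Y: year (year end)
    - M: month (month end)
    - D: calendar day
    - W: week
    - H: hour
    - T: minute
    - S: second
    - B: business day
- tz: string or None; the time zone, e.g. 'Asia/Hong_Kong'
- normalize: bool, default False; if True, start and end are normalized to midnight before the index is generated
- name: str, default None; a name for the resulting index
- closed: string or None, default None; whether the endpoints start and end are included. 'left' gives a left-closed, right-open interval, 'right' gives a left-open, right-closed interval, and None includes both endpoints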


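Of the four parameters `start`, `end`, `periods`, and `freq`, exactly three must be specified. `freq` defaults to 'D', so it can be omitted when two of `start`, `end`, and `periods` are given.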
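
The parameter combinations above can be compared directly. This is a small sketch with made-up dates; it sticks to the 'D' alias and leaves out `closed`, since that argument is not accepted under that name in every pandas release.

```python
import pandas as pd

# start + end, freq omitted: one entry per calendar day, both endpoints included.
daily = pd.date_range('2019-01-01', '2019-01-05')

# start + periods + freq: three entries spaced 3 days apart.
every_third = pd.date_range('2019-01-01', periods=3, freq='3D')

# start + end + periods, freq omitted: evenly spaced between the endpoints.
spaced = pd.date_range('2019-01-01', '2019-01-05', periods=3)

# normalize=True: the time of day in start is reset to midnight.
normalized = pd.date_range('2019-01-01 13:45', periods=2, normalize=True)

for idx in (daily, every_third, spaced, normalized):
    print(idx)
```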

In [11]:
pd.date_range('2019-01-01', periods=3, freq='T') # freq defaults to 'D'; a multiple such as '3D' means a 3-day interval

DatetimeIndex(['2019-01-01 00:00:00', '2019-01-01 00:01:00',
               '2019-01-01 00:02:00'],
              dtype='datetime64[ns]', freq='T')

In [12]:
pd.date_range('20190101', periods=4, freq='10T')

DatetimeIndex(['2019-01-01 00:00:00', '2019-01-01 00:10:00',
               '2019-01-01 00:20:00', '2019-01-01 00:30:00'],
              dtype='datetime64[ns]', freq='10T')

---
## 1.3 bdate_range
`pd.bdate_range(start=None, end=None, periods=None, freq='B', tz=None, normalize=True, name=None, weekmask=None, holidays=None, closed=None, **kwargs)`

Works like `date_range`, but counts business days only (the default `freq` is 'B').

In [30]:
pd.bdate_range(start='2018-01-01', end='2019-01-01')

DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
               '2018-01-05', '2018-01-08', '2018-01-09', '2018-01-10',
               '2018-01-11', '2018-01-12',
               ...
               '2018-12-19', '2018-12-20', '2018-12-21', '2018-12-24',
               '2018-12-25', '2018-12-26', '2018-12-27', '2018-12-28',
               '2018-12-31', '2019-01-01'],
              dtype='datetime64[ns]', length=262, freq='B')

In [31]:
pd.date_range(start='2018-01-01', end='2019-01-01')

DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
               '2018-01-05', '2018-01-06', '2018-01-07', '2018-01-08',
               '2018-01-09', '2018-01-10',
               ...
               '2018-12-23', '2018-12-24', '2018-12-25', '2018-12-26',
               '2018-12-27', '2018-12-28', '2018-12-29', '2018-12-30',
               '2018-12-31', '2019-01-01'],
              dtype='datetime64[ns]', length=366, freq='D')
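
The gap between the two lengths above (262 versus 366) is made up of the weekends in the range. A short check, using the `dayofweek` attribute (Monday=0, Sunday=6) to pick out Saturdays and Sundays:

```python
import pandas as pd

business = pd.bdate_range(start='2018-01-01', end='2019-01-01')
calendar = pd.date_range(start='2018-01-01', end='2019-01-01')

# dayofweek runs from Monday=0 to Sunday=6, so 5 and 6 are the weekend.
weekend_days = calendar[calendar.dayofweek >= 5]

print(len(calendar), len(business), len(weekend_days))
```

Note that 'B' only skips Saturdays and Sundays; public holidays such as 2018-12-25 still appear in the result.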

# 2. Indexing

In [32]:
d = pd.date_range('20180101', '20190601')
ds = pd.Series(np.random.randn(len(d)), index=d)
ds

2018-01-01    0.583192
2018-01-02    0.359886
2018-01-03    0.561006
2018-01-04    1.419344
2018-01-05    1.198088
                ...   
2019-05-28    0.919082
2019-05-29    1.327074
2019-05-30    0.132059
2019-05-31    1.492528
2019-06-01   -0.888241
Freq: D, Length: 517, dtype: float64

In [33]:
ds.index

DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
               '2018-01-05', '2018-01-06', '2018-01-07', '2018-01-08',
               '2018-01-09', '2018-01-10',
               ...
               '2019-05-23', '2019-05-24', '2019-05-25', '2019-05-26',
               '2019-05-27', '2019-05-28', '2019-05-29', '2019-05-30',
               '2019-05-31', '2019-06-01'],
              dtype='datetime64[ns]', length=517, freq='D')

In [34]:
ds['2019-05-01']

-0.17297847005890407

In [35]:
ds['2019-05-01':]

2019-05-01   -0.172978
2019-05-02   -0.208900
2019-05-03   -0.253992
2019-05-04   -1.345923
2019-05-05    0.446289
2019-05-06    1.581078
2019-05-07   -1.675349
2019-05-08    1.167537
2019-05-09    0.091150
2019-05-10    0.540887
2019-05-11   -0.790960
2019-05-12   -0.419958
2019-05-13   -0.942920
2019-05-14    0.266244
2019-05-15    0.266832
2019-05-16    1.761917
2019-05-17   -0.332028
2019-05-18   -0.544633
2019-05-19    0.873388
2019-05-20    0.937147
2019-05-21    0.404847
2019-05-22    0.089006
2019-05-23   -0.622744
2019-05-24    0.602965
2019-05-25   -1.019080
2019-05-26    0.011174
2019-05-27    1.320115
2019-05-28    0.919082
2019-05-29    1.327074
2019-05-30    0.132059
2019-05-31    1.492528
2019-06-01   -0.888241
Freq: D, dtype: float64

In [36]:
ds['2019-05']

2019-05-01   -0.172978
2019-05-02   -0.208900
2019-05-03   -0.253992
2019-05-04   -1.345923
2019-05-05    0.446289
2019-05-06    1.581078
2019-05-07   -1.675349
2019-05-08    1.167537
2019-05-09    0.091150
2019-05-10    0.540887
2019-05-11   -0.790960
2019-05-12   -0.419958
2019-05-13   -0.942920
2019-05-14    0.266244
2019-05-15    0.266832
2019-05-16    1.761917
2019-05-17   -0.332028
2019-05-18   -0.544633
2019-05-19    0.873388
2019-05-20    0.937147
2019-05-21    0.404847
2019-05-22    0.089006
2019-05-23   -0.622744
2019-05-24    0.602965
2019-05-25   -1.019080
2019-05-26    0.011174
2019-05-27    1.320115
2019-05-28    0.919082
2019-05-29    1.327074
2019-05-30    0.132059
2019-05-31    1.492528
Freq: D, dtype: float64

In [37]:
ds['2019-04':'2019-05']

2019-04-01   -1.960343
2019-04-02    0.255718
2019-04-03   -0.339238
2019-04-04    0.319220
2019-04-05    0.646551
                ...   
2019-05-27    1.320115
2019-05-28    0.919082
2019-05-29    1.327074
2019-05-30    0.132059
2019-05-31    1.492528
Freq: D, Length: 61, dtype: float64

In [38]:
ds['2019']

2019-01-01    2.372522
2019-01-02    0.744837
2019-01-03   -0.696702
2019-01-04    1.076816
2019-01-05    0.338946
                ...   
2019-05-28    0.919082
2019-05-29    1.327074
2019-05-30    0.132059
2019-05-31    1.492528
2019-06-01   -0.888241
Freq: D, Length: 152, dtype: float64
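
The cells above rely on partial string indexing: a date string that is less precise than the index selects every entry that falls inside it. The sketch below repeats the selections on a deterministic series (`counter` is a made-up name, filled with 0, 1, 2, ... instead of random values) so the sizes can be checked. It uses `.loc`, which is an assumption on my part about the more explicit spelling; the plain `ds[...]` form is what the cells above use.

```python
import numpy as np
import pandas as pd

days = pd.date_range('20180101', '20190601')
counter = pd.Series(np.arange(len(days)), index=days)

one_day = counter.loc['2019-05-01']            # a full date selects a single value
one_month = counter.loc['2019-05']             # a month string selects the whole month
two_months = counter.loc['2019-04':'2019-05']  # slices include both endpoints
one_year = counter.loc['2019']                 # a year string selects the whole year

print(one_day, len(one_month), len(two_months), len(one_year))
```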

---
# 3. Time/date attributes



| Attribute        | Description                                                  |
| ---------------- | ------------------------------------------------------------ |
| year             | The year of the datetime                                     |
| month            | The month of the datetime                                    |
| day              | The days of the datetime                                     |
| hour             | The hour of the datetime                                     |
| minute           | The minutes of the datetime                                  |
| second           | The seconds of the datetime                                  |
| microsecond      | The microseconds of the datetime                             |
| nanosecond       | The nanoseconds of the datetime                              |
| date             | Returns datetime.date (does not contain timezone information) |
| time             | Returns datetime.time (does not contain timezone information) |
| timetz           | Returns datetime.time as local time with timezone information |
| dayofyear        | The ordinal day of year                                      |
| weekofyear       | The week ordinal of the year                                 |
| week             | The week ordinal of the year                                 |
| dayofweek        | The number of the day of the week with Monday=0, Sunday=6    |
| weekday          | The number of the day of the week with Monday=0, Sunday=6    |
| weekday_name     | The name of the day in a week (ex: Friday)                   |
| quarter          | Quarter of the date: Jan-Mar = 1, Apr-Jun = 2, etc.          |
| days_in_month    | The number of days in the month of the datetime              |
| is_month_start   | Logical indicating if first day of month (defined by frequency) |
| is_month_end     | Logical indicating if last day of month (defined by frequency) |
| is_quarter_start | Logical indicating if first day of quarter (defined by frequency) |
| is_quarter_end   | Logical indicating if last day of quarter (defined by frequency) |
| is_year_start    | Logical indicating if first day of year (defined by frequency) |
| is_year_end      | Logical indicating if last day of year (defined by frequency) |
| is_leap_year     | Logical indicating if the date belongs to a leap year        |



In [41]:
today = pd.to_datetime(datetime.datetime.now())
today

Timestamp('2019-09-08 09:57:05.850356')

In [43]:
today.year
today.month
today.day

2019

9

8
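
A few more attributes from the table, read from a fixed timestamp so the values are reproducible. Two things here are not in the notes above and are my own additions: the `day_name()` method, used because `weekday_name` is, as far as I know, no longer available in recent pandas releases, and the `.dt` accessor, which exposes the same attributes on a whole Series.

```python
import pandas as pd

ts = pd.Timestamp('2019-09-08 09:57:05')
print(ts.year, ts.month, ts.day)
print(ts.dayofweek)  # Monday=0 ... Sunday=6
print(ts.quarter, ts.days_in_month, ts.is_leap_year)
print(ts.day_name())

# On a Series, the same attributes are reached through the .dt accessor.
s = pd.Series(pd.to_datetime(['2019-01-01', '2019-02-15']))
months = s.dt.month.tolist()
starts = s.dt.is_month_start.tolist()
print(months, starts)
```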