# 11.4 Time Zone Handling

In [1]:
import pytz

In [2]:
pytz.common_timezones[-5:]

['US/Eastern', 'US/Hawaii', 'US/Mountain', 'US/Pacific', 'UTC']

To get a time zone object from pytz, use pytz.timezone:

In [3]:
tz = pytz.timezone('America/New_York')
tz

<DstTzInfo 'America/New_York' LMT-1 day, 19:04:00 STD>
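
The LMT shown in the repr above appears to be only the zone's earliest recorded offset; the offset that actually applies depends on the date. As a minimal sketch (the dates are arbitrary and the variable names `winter` and `summer` are illustrative, not from the original), localizing two naive datetimes with the zone object's `localize` method shows the standard and daylight offsets:

```python
from datetime import datetime, timedelta

import pytz

tz = pytz.timezone('America/New_York')

# Attach the zone to naive datetimes; pytz picks the offset valid on each date.
winter = tz.localize(datetime(2012, 1, 15, 9, 30))
summer = tz.localize(datetime(2012, 7, 15, 9, 30))

# Standard time is 5 hours behind UTC, daylight time 4 hours behind.
winter_offset = winter.utcoffset()
summer_offset = summer.utcoffset()
print(winter_offset == timedelta(hours=-5))
print(summer_offset == timedelta(hours=-4))
```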

# 1 Time Zone Localization and Conversion

By default, time series in pandas are time zone naive. For example, consider the following time series:

In [4]:
import pandas as pd
import numpy as np

In [5]:
rng = pd.date_range('3/9/2012 9:30', periods=6, freq='D')

In [6]:
ts = pd.Series(np.random.randn(len(rng)), index=rng)
ts

2012-03-09 09:30:00    1.996308
2012-03-10 09:30:00    1.215591
2012-03-11 09:30:00   -0.471395
2012-03-12 09:30:00    0.565172
2012-03-13 09:30:00   -1.093636
2012-03-14 09:30:00    0.409585
Freq: D, dtype: float64

The index's tz attribute is None:

In [7]:
print(ts.index.tz)

None


Date ranges can also be generated with a time zone set:

In [8]:
pd.date_range('3/9/2012 9:30', periods=10, freq='D', tz='UTC')

DatetimeIndex(['2012-03-09 09:30:00+00:00', '2012-03-10 09:30:00+00:00',
               '2012-03-11 09:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 09:30:00+00:00', '2012-03-14 09:30:00+00:00',
               '2012-03-15 09:30:00+00:00', '2012-03-16 09:30:00+00:00',
               '2012-03-17 09:30:00+00:00', '2012-03-18 09:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='D')

Conversion from naive to localized is handled by the tz_localize method:

In [9]:
ts

2012-03-09 09:30:00    1.996308
2012-03-10 09:30:00    1.215591
2012-03-11 09:30:00   -0.471395
2012-03-12 09:30:00    0.565172
2012-03-13 09:30:00   -1.093636
2012-03-14 09:30:00    0.409585
Freq: D, dtype: float64

In [10]:
ts_utc = ts.tz_localize('UTC')
ts_utc

2012-03-09 09:30:00+00:00    1.996308
2012-03-10 09:30:00+00:00    1.215591
2012-03-11 09:30:00+00:00   -0.471395
2012-03-12 09:30:00+00:00    0.565172
2012-03-13 09:30:00+00:00   -1.093636
2012-03-14 09:30:00+00:00    0.409585
Freq: D, dtype: float64

In [11]:
ts_utc.index

DatetimeIndex(['2012-03-09 09:30:00+00:00', '2012-03-10 09:30:00+00:00',
               '2012-03-11 09:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 09:30:00+00:00', '2012-03-14 09:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='D')

Once a time series has been localized to a particular time zone, it can be converted to any other time zone with tz_convert:

In [12]:
ts_utc.tz_convert('America/New_York')

2012-03-09 04:30:00-05:00    1.996308
2012-03-10 04:30:00-05:00    1.215591
2012-03-11 05:30:00-04:00   -0.471395
2012-03-12 05:30:00-04:00    0.565172
2012-03-13 05:30:00-04:00   -1.093636
2012-03-14 05:30:00-04:00    0.409585
Freq: D, dtype: float64
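
The two methods are not interchangeable: tz_localize attaches a zone to naive data, while tz_convert translates data that already has one. A small sketch of what happens when they are swapped, assuming a recent pandas release (the helper `raises_type_error` is hypothetical, written only for this illustration):

```python
import numpy as np
import pandas as pd

rng = pd.date_range('2012-03-09 09:30', periods=3, freq='D')
naive = pd.Series(np.arange(3.0), index=rng)
aware = naive.tz_localize('UTC')


def raises_type_error(func):
    """Return True if calling func raises TypeError, else False."""
    try:
        func()
    except TypeError:
        return True
    return False


# A naive series has no zone to convert from.
convert_naive_fails = raises_type_error(
    lambda: naive.tz_convert('America/New_York'))

# An aware series already has a zone, so it cannot be localized again.
localize_aware_fails = raises_type_error(
    lambda: aware.tz_localize('America/New_York'))

print(convert_naive_fails, localize_aware_fails)
```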

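The time series above straddles a DST transition in the America/New_York time zone (note the offset change from -05:00 to -04:00), so we could instead localize it to New York time and then convert to, say, UTC or Berlin time: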

In [13]:
ts_eastern = ts.tz_localize('America/New_York')

In [14]:
ts_eastern.tz_convert('UTC')

2012-03-09 14:30:00+00:00    1.996308
2012-03-10 14:30:00+00:00    1.215591
2012-03-11 13:30:00+00:00   -0.471395
2012-03-12 13:30:00+00:00    0.565172
2012-03-13 13:30:00+00:00   -1.093636
2012-03-14 13:30:00+00:00    0.409585
Freq: D, dtype: float64

In [15]:
ts_eastern.tz_convert('Europe/Berlin')

2012-03-09 15:30:00+01:00    1.996308
2012-03-10 15:30:00+01:00    1.215591
2012-03-11 14:30:00+01:00   -0.471395
2012-03-12 14:30:00+01:00    0.565172
2012-03-13 14:30:00+01:00   -1.093636
2012-03-14 14:30:00+01:00    0.409585
Freq: D, dtype: float64
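
Converting between zones only changes how the timestamps are displayed, not the instants they represent. The sketch below rebuilds the series with deterministic values (`np.arange` stands in for the random data, an assumption made so the result is reproducible) and checks that the UTC and Berlin views line up:

```python
import numpy as np
import pandas as pd

rng = pd.date_range('2012-03-09 09:30', periods=6, freq='D')
ts = pd.Series(np.arange(6.0), index=rng)

ts_eastern = ts.tz_localize('America/New_York')
as_utc = ts_eastern.tz_convert('UTC')
as_berlin = ts_eastern.tz_convert('Europe/Berlin')

# Aware indexes compare by instant, so the two views are elementwise equal.
same_instants = bool((as_utc.index == as_berlin.index).all())

# The UTC hour shifts from 14 to 13 once New York enters DST on March 11.
utc_hours = [int(h) for h in as_utc.index.hour]
print(same_instants, utc_hours)
```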

tz_localize and tz_convert are also instance methods on DatetimeIndex:

In [16]:
ts.index.tz_localize('Asia/Shanghai')

DatetimeIndex(['2012-03-09 09:30:00+08:00', '2012-03-10 09:30:00+08:00',
               '2012-03-11 09:30:00+08:00', '2012-03-12 09:30:00+08:00',
               '2012-03-13 09:30:00+08:00', '2012-03-14 09:30:00+08:00'],
              dtype='datetime64[ns, Asia/Shanghai]', freq='D')

# 2 Operations with Time Zone-Aware Timestamp Objects

In [21]:
stamp = pd.Timestamp('2011-03-12 04:00')

In [22]:
stamp_utc = stamp.tz_localize('utc')

In [23]:
stamp_utc.tz_convert('America/New_York')

Timestamp('2011-03-11 23:00:00-0500', tz='America/New_York')

A time zone can also be passed when creating the Timestamp:

In [24]:
stamp_moscow = pd.Timestamp('2011-03-12 04:00', tz='Europe/Moscow')
stamp_moscow

Timestamp('2011-03-12 04:00:00+0300', tz='Europe/Moscow')

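Time zone-aware Timestamp objects internally store a UTC timestamp value: the number of nanoseconds since the Unix epoch (January 1, 1970). This UTC value is invariant between time zone conversions: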

In [25]:
stamp_utc.value

1299902400000000000

In [26]:
stamp_utc.tz_convert('America/New_York').value

1299902400000000000
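
As a quick check of what this integer means, the following sketch recomputes it from the elapsed time since the epoch (the names `epoch` and `elapsed_seconds` are illustrative):

```python
import pandas as pd

stamp_utc = pd.Timestamp('2011-03-12 04:00', tz='UTC')
stamp_ny = stamp_utc.tz_convert('America/New_York')

# The Unix epoch expressed as an aware timestamp.
epoch = pd.Timestamp('1970-01-01', tz='UTC')

# Whole seconds elapsed since the epoch, scaled to nanoseconds.
elapsed_seconds = int((stamp_utc - epoch).total_seconds())
print(elapsed_seconds * 10**9 == stamp_utc.value)

# Both views represent the same instant, so they compare equal.
print(stamp_utc == stamp_ny)
```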

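When performing time arithmetic with pandas's DateOffset objects, pandas respects daylight saving time (DST) transitions where possible. Here we construct timestamps that occur right before DST transitions. First, 30 minutes before transitioning to DST: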

In [27]:
from pandas.tseries.offsets import Hour

In [28]:
stamp = pd.Timestamp('2012-03-11 01:30', tz='US/Eastern')
stamp

Timestamp('2012-03-11 01:30:00-0500', tz='US/Eastern')

In [29]:
stamp + Hour()

Timestamp('2012-03-11 03:30:00-0400', tz='US/Eastern')

Then, 90 minutes before transitioning out of DST:

In [30]:
stamp = pd.Timestamp('2012-11-04 00:30', tz='US/Eastern')
stamp

Timestamp('2012-11-04 00:30:00-0400', tz='US/Eastern')

In [31]:
stamp + 2 * Hour()

Timestamp('2012-11-04 01:30:00-0500', tz='US/Eastern')
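
The Hour offset adds elapsed time, so the wall clock reading can move by more or less than the amount added. The sketch below makes this explicit for both 2012 US transitions (March 11 and November 4); the variable names are illustrative:

```python
import pandas as pd
from pandas.tseries.offsets import Hour

# Spring forward: 02:00 to 03:00 does not exist on the wall clock.
before_spring = pd.Timestamp('2012-03-11 01:30', tz='US/Eastern')
after_spring = before_spring + Hour()

# Fall back: 01:00 to 02:00 occurs twice on the wall clock.
before_fall = pd.Timestamp('2012-11-04 00:30', tz='US/Eastern')
after_fall = before_fall + 2 * Hour()

# Elapsed time matches what was added...
print(after_spring - before_spring)  # 1 hour elapsed
print(after_fall - before_fall)      # 2 hours elapsed

# ...but the wall clock moved 2 hours in spring and only 1 hour in fall.
print(before_spring.hour, after_spring.hour)
print(before_fall.hour, after_fall.hour)
```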

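# 3 Operations Between Different Time Zones

If two time series with different time zones are combined, the result will be UTC. Since the timestamps are stored under the hood in UTC, this is a straightforward operation and requires no manual conversion: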

In [32]:
rng = pd.date_range('3/7/2012 9:30', periods=10, freq='B')

In [33]:
ts = pd.Series(np.random.randn(len(rng)), index=rng)
ts

2012-03-07 09:30:00   -0.383176
2012-03-08 09:30:00    1.960457
2012-03-09 09:30:00   -0.942952
2012-03-12 09:30:00   -0.957582
2012-03-13 09:30:00   -1.667041
2012-03-14 09:30:00    0.060621
2012-03-15 09:30:00    0.715665
2012-03-16 09:30:00   -0.037389
2012-03-19 09:30:00   -0.011725
2012-03-20 09:30:00    0.688127
Freq: B, dtype: float64

In [34]:
ts1 = ts[:7].tz_localize('Europe/London')
ts2 = ts1[2:].tz_convert('Europe/Moscow')
result = ts1 + ts2

In [35]:
result.index

DatetimeIndex(['2012-03-07 09:30:00+00:00', '2012-03-08 09:30:00+00:00',
               '2012-03-09 09:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 09:30:00+00:00', '2012-03-14 09:30:00+00:00',
               '2012-03-15 09:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='B')
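
To see how the values (and not just the index) behave, here is a sketch of the same combination with deterministic data (`np.arange` replaces the random values, and `.iloc` is used for explicit positional slicing; both are choices made for this illustration). Rows are aligned by instant, so the two dates missing from ts2 come out as NaN:

```python
import numpy as np
import pandas as pd

rng = pd.date_range('2012-03-07 09:30', periods=10, freq='B')
ts = pd.Series(np.arange(10.0), index=rng)

ts1 = ts.iloc[:7].tz_localize('Europe/London')
ts2 = ts1.iloc[2:].tz_convert('Europe/Moscow')

# Alignment happens on the underlying UTC instants.
result = ts1 + ts2

print(result.index.tz)            # the combined index is in UTC
print(int(result.isna().sum()))   # the first two rows have no partner in ts2
print(result.iloc[2])             # 2.0 + 2.0 for the first overlapping row
```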