Working with time zones is generally considered one of the most unpleasant parts of
time series manipulation. As a result, many time series users choose to work with
time series in Coordinated Universal Time (UTC), the successor to Greenwich Mean
Time and the current international standard. Time zones are expressed as
offsets from UTC; for example, New York is four hours behind UTC during daylight
saving time and five hours behind the rest of the year.
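The New York offsets mentioned above can be checked directly. The following is a minimal sketch; the two dates are arbitrary picks (one in summer, one in winter) and the variable names are only illustrative.

```python
import pandas as pd

# Two instants in New York: one during daylight saving time, one outside it.
summer = pd.Timestamp('2012-07-01 12:00', tz='America/New_York')
winter = pd.Timestamp('2012-01-01 12:00', tz='America/New_York')

# utcoffset() returns the offset from UTC as a datetime.timedelta.
summer_offset_hours = summer.utcoffset().total_seconds() / 3600
winter_offset_hours = winter.utcoffset().total_seconds() / 3600

print(summer_offset_hours)  # -4.0
print(winter_offset_hours)  # -5.0
```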

In Python, time zone information comes from the third-party pytz library (installable
with pip or conda), which exposes the Olson database, a compilation of world
time zone information. This is especially important for historical data because the
daylight saving time (DST) transition dates (and even UTC offsets) have been
changed numerous times depending on the whims of local governments. In the United
States, the DST transition times have been changed many times since 1900!

For detailed information about the pytz library, you’ll need to look at that library’s
documentation. As far as this book is concerned, pandas wraps pytz’s functionality so
you can ignore its API outside of the time zone names. Time zone names can be
found interactively and in the docs:

In [1]:
import pandas as pd
import numpy as np
import pytz

In [2]:
pytz.common_timezones[-5:]

['US/Eastern', 'US/Hawaii', 'US/Mountain', 'US/Pacific', 'UTC']

In [4]:
# To get a time zone object from pytz, use pytz.timezone:
tz = pytz.timezone('America/New_York')
tz

<DstTzInfo 'America/New_York' LMT-1 day, 19:04:00 STD>

Methods in pandas will accept either time zone names or these objects.

### Time Zone Localization and Conversion

In [5]:
# By default, time series in pandas are time zone naive. For example, consider the following time series:
rng = pd.date_range('3/9/2012 9:30', periods=6, freq='D')
ts = pd.Series(np.random.randn(len(rng)), index=rng)
ts

2012-03-09 09:30:00   -0.554217
2012-03-10 09:30:00    0.634853
2012-03-11 09:30:00    1.008662
2012-03-12 09:30:00   -0.467357
2012-03-13 09:30:00   -1.749476
2012-03-14 09:30:00   -0.391256
Freq: D, dtype: float64

In [6]:
# The index’s tz field is None:
print(ts.index.tz)

None


In [7]:
# Date ranges can be generated with a time zone set:
pd.date_range('3/9/2012 9:30', periods=10, freq='D', tz='UTC')

DatetimeIndex(['2012-03-09 09:30:00+00:00', '2012-03-10 09:30:00+00:00',
               '2012-03-11 09:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 09:30:00+00:00', '2012-03-14 09:30:00+00:00',
               '2012-03-15 09:30:00+00:00', '2012-03-16 09:30:00+00:00',
               '2012-03-17 09:30:00+00:00', '2012-03-18 09:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='D')

Conversion from naive to localized is handled by the tz_localize method:

In [8]:
ts

2012-03-09 09:30:00   -0.554217
2012-03-10 09:30:00    0.634853
2012-03-11 09:30:00    1.008662
2012-03-12 09:30:00   -0.467357
2012-03-13 09:30:00   -1.749476
2012-03-14 09:30:00   -0.391256
Freq: D, dtype: float64

In [10]:
ts_utc = ts.tz_localize('UTC')
ts_utc

2012-03-09 09:30:00+00:00   -0.554217
2012-03-10 09:30:00+00:00    0.634853
2012-03-11 09:30:00+00:00    1.008662
2012-03-12 09:30:00+00:00   -0.467357
2012-03-13 09:30:00+00:00   -1.749476
2012-03-14 09:30:00+00:00   -0.391256
Freq: D, dtype: float64

In [11]:
ts_utc.index

DatetimeIndex(['2012-03-09 09:30:00+00:00', '2012-03-10 09:30:00+00:00',
               '2012-03-11 09:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 09:30:00+00:00', '2012-03-14 09:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='D')

Once a time series has been localized to a particular time zone, it can be converted to
another time zone with tz_convert:

In [12]:
ts_utc.tz_convert('America/New_York')

2012-03-09 04:30:00-05:00   -0.554217
2012-03-10 04:30:00-05:00    0.634853
2012-03-11 05:30:00-04:00    1.008662
2012-03-12 05:30:00-04:00   -0.467357
2012-03-13 05:30:00-04:00   -1.749476
2012-03-14 05:30:00-04:00   -0.391256
Freq: D, dtype: float64
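The order of the two steps matters: tz_localize attaches a time zone to naive data, while tz_convert only applies to data that is already time zone–aware. The sketch below illustrates this with a small made-up series. It assumes the misuse cases raise TypeError, which is the behavior in the pandas versions I am aware of and may differ in others.

```python
import numpy as np
import pandas as pd

rng = pd.date_range('2012-03-09 09:30', periods=3, freq='D')
naive = pd.Series(np.arange(3.0), index=rng)

# A naive series has no time zone to convert from, so tz_convert fails.
try:
    naive.tz_convert('UTC')
    convert_naive_failed = False
except TypeError:  # assumed exception type
    convert_naive_failed = True

# Localize first, then convert.
aware = naive.tz_localize('UTC')
eastern = aware.tz_convert('America/New_York')

# An aware series already has a time zone, so localizing it again fails.
try:
    aware.tz_localize('America/New_York')
    localize_aware_failed = False
except TypeError:  # assumed exception type
    localize_aware_failed = True

print(convert_naive_failed, localize_aware_failed)
print(eastern.index.tz)
```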

In the case of the preceding time series, which straddles a DST transition in the
America/New_York time zone, we could localize to US Eastern time and convert to, say,
UTC or Berlin time:

In [13]:
ts_eastern = ts.tz_localize('America/New_York')
ts_eastern.tz_convert('UTC')

2012-03-09 14:30:00+00:00   -0.554217
2012-03-10 14:30:00+00:00    0.634853
2012-03-11 13:30:00+00:00    1.008662
2012-03-12 13:30:00+00:00   -0.467357
2012-03-13 13:30:00+00:00   -1.749476
2012-03-14 13:30:00+00:00   -0.391256
dtype: float64

In [14]:
ts_eastern.tz_convert('Europe/Berlin')

2012-03-09 15:30:00+01:00   -0.554217
2012-03-10 15:30:00+01:00    0.634853
2012-03-11 14:30:00+01:00    1.008662
2012-03-12 14:30:00+01:00   -0.467357
2012-03-13 14:30:00+01:00   -1.749476
2012-03-14 14:30:00+01:00   -0.391256
dtype: float64

In [15]:
# tz_localize and tz_convert are also instance methods on DatetimeIndex:
ts.index.tz_localize('Asia/Shanghai')

DatetimeIndex(['2012-03-09 09:30:00+08:00', '2012-03-10 09:30:00+08:00',
               '2012-03-11 09:30:00+08:00', '2012-03-12 09:30:00+08:00',
               '2012-03-13 09:30:00+08:00', '2012-03-14 09:30:00+08:00'],
              dtype='datetime64[ns, Asia/Shanghai]', freq=None)

Localizing naive timestamps also checks for ambiguous or nonexistent times around
daylight saving time transitions.
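To make the ambiguous and nonexistent cases concrete, here is a hedged sketch using two wall-clock times around the 2012 DST transitions in New York. The ambiguous and nonexistent arguments of tz_localize are used as I understand them from the pandas documentation. The class of the error raised by default varies between pandas versions, so the example catches a generic Exception.

```python
import pandas as pd

# 02:30 on 2012-03-11 does not exist in New York: clocks jump from 02:00 to 03:00.
missing = pd.Timestamp('2012-03-11 02:30')
# 01:30 on 2012-11-04 occurs twice: once in DST and once in standard time.
repeated = pd.Timestamp('2012-11-04 01:30')

# With the default settings, both localizations raise an error.
errors = []
for stamp in (missing, repeated):
    try:
        stamp.tz_localize('America/New_York')
    except Exception as exc:  # exact exception class depends on the pandas version
        errors.append(type(exc).__name__)

# Explicit policies resolve the two cases.
shifted = missing.tz_localize('America/New_York', nonexistent='shift_forward')
as_dst = repeated.tz_localize('America/New_York', ambiguous=True)
as_standard = repeated.tz_localize('America/New_York', ambiguous=False)

print(shifted)      # 2012-03-11 03:00:00-04:00
print(as_dst)       # 2012-11-04 01:30:00-04:00
print(as_standard)  # 2012-11-04 01:30:00-05:00
```

Which policy is appropriate depends on the data source; silently shifting or guessing can hide data-quality problems, so raising by default is a reasonable starting point.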

### Operations with Time Zone–Aware Timestamp Objects

Like time series and date ranges, individual Timestamp objects can be
localized from naive to time zone–aware and converted from one time zone to
another:

In [16]:
stamp = pd.Timestamp('2011-03-12 04:00')
stamp_utc = stamp.tz_localize('utc')
stamp_utc.tz_convert('America/New_York')

Timestamp('2011-03-11 23:00:00-0500', tz='America/New_York')

In [17]:
# You can also pass a time zone when creating the Timestamp:
stamp_moscow = pd.Timestamp('2011-03-12 04:00', tz='Europe/Moscow')
stamp_moscow

Timestamp('2011-03-12 04:00:00+0300', tz='Europe/Moscow')

Time zone–aware Timestamp objects internally store a UTC timestamp value as
nanoseconds since the Unix epoch (January 1, 1970); this UTC value is invariant between
time zone conversions:

In [18]:
stamp_utc.value

1299902400000000000

In [19]:
stamp_utc.tz_convert('America/New_York').value

1299902400000000000
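The same invariance holds for any target zone. As a small sketch (the list of zones is an arbitrary choice for illustration), the wall-clock hour changes with each conversion while the stored nanosecond value does not:

```python
import pandas as pd

stamp_utc = pd.Timestamp('2011-03-12 04:00', tz='UTC')
zones = ['America/New_York', 'Europe/Berlin', 'Asia/Shanghai']
converted = [stamp_utc.tz_convert(zone) for zone in zones]

# The wall-clock hour differs per zone ...
wall_clock_hours = [stamp.hour for stamp in converted]
# ... but the underlying nanosecond count is identical.
values = {stamp.value for stamp in converted}

print(wall_clock_hours)              # [23, 5, 12]
print(values == {stamp_utc.value})   # True
```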

When performing time arithmetic using pandas’s DateOffset objects, pandas
respects daylight saving time transitions where possible. Here we construct
timestamps that occur right before DST transitions (forward and backward). First, 30
minutes before transitioning to DST:

In [20]:
from pandas.tseries.offsets import Hour
stamp = pd.Timestamp('2012-03-11 01:30', tz='US/Eastern')
stamp

Timestamp('2012-03-11 01:30:00-0500', tz='US/Eastern')

In [21]:
stamp + Hour()

Timestamp('2012-03-11 03:30:00-0400', tz='US/Eastern')

Then, 90 minutes before transitioning out of DST:

In [22]:
stamp = pd.Timestamp('2012-11-04 00:30', tz='US/Eastern')
stamp

Timestamp('2012-11-04 00:30:00-0400', tz='US/Eastern')

In [23]:
stamp + 2 * Hour()

Timestamp('2012-11-04 01:30:00-0500', tz='US/Eastern')
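One way to read these results is that Hour offsets add elapsed time, and the wall-clock reading is then adjusted for the zone's rules. The sketch below checks this for both 2012 transitions in US/Eastern (clocks moved forward on March 11 and back on November 4); the variable names are illustrative only.

```python
import pandas as pd
from pandas.tseries.offsets import Hour

# 30 minutes before clocks spring forward, and 90 minutes before they fall back.
before_spring = pd.Timestamp('2012-03-11 01:30', tz='US/Eastern')
before_fall = pd.Timestamp('2012-11-04 00:30', tz='US/Eastern')

after_spring = before_spring + Hour()
after_fall = before_fall + 2 * Hour()

# Wall-clock readings reflect the transitions ...
print(after_spring)  # 2012-03-11 03:30:00-04:00
print(after_fall)    # 2012-11-04 01:30:00-05:00

# ... while the elapsed time matches the offset that was added.
elapsed_spring = after_spring - before_spring
elapsed_fall = after_fall - before_fall
print(elapsed_spring == pd.Timedelta(hours=1))  # True
print(elapsed_fall == pd.Timedelta(hours=2))    # True
```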

### Operations Between Different Time Zones

If two time series with different time zones are combined, the result will be UTC.
Since the timestamps are stored under the hood in UTC, this is a straightforward
operation and requires no conversion:

In [24]:
rng = pd.date_range('3/7/2012 9:30', periods=10, freq='B')
ts = pd.Series(np.random.randn(len(rng)), index=rng)
ts

2012-03-07 09:30:00   -0.599166
2012-03-08 09:30:00    0.531471
2012-03-09 09:30:00    0.603727
2012-03-12 09:30:00    0.037563
2012-03-13 09:30:00    0.930359
2012-03-14 09:30:00   -0.791696
2012-03-15 09:30:00    1.021734
2012-03-16 09:30:00   -2.682962
2012-03-19 09:30:00    0.756711
2012-03-20 09:30:00   -0.725718
Freq: B, dtype: float64

In [25]:
ts1 = ts[:7].tz_localize('Europe/London')
ts2 = ts1[2:].tz_convert('Europe/Moscow')
result = ts1 + ts2
result.index

DatetimeIndex(['2012-03-07 09:30:00+00:00', '2012-03-08 09:30:00+00:00',
               '2012-03-09 09:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 09:30:00+00:00', '2012-03-14 09:30:00+00:00',
               '2012-03-15 09:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq=None)
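Because the random values above make the alignment hard to see, the following sketch repeats the idea with small deterministic data (the values 0.0 through 4.0 are made up for illustration). Rows are matched by instant rather than by wall-clock label, and instants missing from one operand produce NaN:

```python
import numpy as np
import pandas as pd

rng = pd.date_range('2012-03-07 09:30', periods=5, freq='B')
ts = pd.Series(np.arange(5.0), index=rng)

ts_london = ts.tz_localize('Europe/London')
# Same instants as the last three London rows, displayed in Moscow time.
ts_moscow = ts_london.iloc[2:].tz_convert('Europe/Moscow')

# Addition aligns on the underlying instants; the result index is in UTC.
result = ts_london + ts_moscow

print(result.index.tz)   # UTC
print(result.tolist())   # [nan, nan, 4.0, 6.0, 8.0]
```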