## Time Zone Handling

*[Coding along with Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython, Wes McKinney, O'Reilly, 1st Edition October 2012; the notebook accompanying this chapter can be found on [GitHub](https://github.com/wesm/pydata-book/blob/3rd-edition/ch11.ipynb)]*

In Python, time zone information comes from the third-party `pytz` library. Many time series users prefer to work in coordinated universal time (UTC). Time zones are expressed as offsets from UTC: New York is four hours behind UTC during daylight saving time (DST) and five hours behind for the rest of the year.
<br/>
The `pytz` library exposes the *Olson database*, a compilation of world time zone information.

The __Olson database__, also known as the `tz database` or zoneinfo database, is a public-domain database of time zones for locations around the world. It is maintained by a group of volunteers, with Arthur David Olson and Paul Eggert being the primary contributors.

__Key Features:__

- __Time Zone Names:__ Each location is identified by a unique name, such as “America/New_York” or “Europe/Berlin”, which represents the local time for that region.
- __Historical Data:__ The database records the exact moment of transition for each time zone, allowing for accurate representation of historical time zone changes.
- __Rule Lines:__ The database includes rule lines that define the standard and daylight saving time (DST) rules for each time zone.
- __Zone Lines:__ Zone lines define the time zone boundaries and UTC offsets for each region.
- __ISO 3166 Country Codes:__ The database uses ISO 3166 2-character country codes to identify countries.
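
As a rough illustration of the features listed above, the sketch below asks `pytz` for the zone names registered under an ISO 3166 country code and compares the UTC offset of a single zone at two different dates. The country code and the two Moscow dates are arbitrary choices made for this example only.

```python
from datetime import datetime, timedelta

import pytz

# Zone names registered for an ISO 3166 two-letter country code (here: Germany)
german_zones = pytz.country_timezones('de')
print(german_zones)

# Historical data: the same zone name can map to different UTC offsets over time
moscow = pytz.timezone('Europe/Moscow')
offset_2012 = moscow.localize(datetime(2012, 3, 12, 9, 30)).utcoffset()
offset_2021 = moscow.localize(datetime(2021, 3, 12, 9, 30)).utcoffset()
print(offset_2012, offset_2021)
```

The two Moscow offsets differ, which is why the Moscow timestamps later in this section show `+04:00` for 2012 and `+0300` for 2021.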

In [2]:
import pytz

In [2]:
pytz.common_timezones[-5:]

['US/Eastern', 'US/Hawaii', 'US/Mountain', 'US/Pacific', 'UTC']

In [5]:
# getting a timezone object from pytz
tz = pytz.timezone('America/New_York')
tz

<DstTzInfo 'America/New_York' LMT-1 day, 19:04:00 STD>
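
The `LMT` in the repr above stands for local mean time, the default entry of the zone object before it is tied to a concrete date. The following is a minimal sketch of how the same `tz` object resolves to the five-hour and four-hour offsets mentioned earlier once a specific datetime is localized; the two dates are arbitrary picks for winter and summer.

```python
from datetime import datetime, timedelta

import pytz

tz = pytz.timezone('America/New_York')

# localize() looks up the rule that applies at the given wall-clock time
winter = tz.localize(datetime(2012, 1, 15, 12, 0))
summer = tz.localize(datetime(2012, 7, 15, 12, 0))
print(winter.tzname(), winter.utcoffset())
print(summer.tzname(), summer.utcoffset())

# Pitfall: passing a pytz zone through tzinfo= skips that lookup,
# so the datetime ends up with the zone's default LMT offset instead
attached = datetime(2012, 1, 15, 12, 0, tzinfo=tz)
print(attached.utcoffset())
```

With `pytz`, `localize` is therefore the safer way to attach a zone to a plain `datetime`; pandas handles this lookup itself in `tz_localize`.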

### Time Zone Localization 

In [3]:
import pandas as pd
import numpy as np

In [17]:
# time zone example
rng = pd.date_range('3/9/2012 9:30', periods=6, freq='D')
rng

DatetimeIndex(['2012-03-09 09:30:00', '2012-03-10 09:30:00',
               '2012-03-11 09:30:00', '2012-03-12 09:30:00',
               '2012-03-13 09:30:00', '2012-03-14 09:30:00'],
              dtype='datetime64[ns]', freq='D')

In [12]:
ts = pd.Series(np.random.randn(len(rng)), index=rng)

In [13]:
ts

2012-03-09 09:30:00    0.378185
2012-03-10 09:30:00   -0.390203
2012-03-11 09:30:00   -0.500009
2012-03-12 09:30:00   -0.628508
2012-03-13 09:30:00    0.390594
2012-03-14 09:30:00   -0.871618
Freq: D, dtype: float64

In [15]:
print(ts.index.tz)  # the index of a naive series has its tz field set to None

None


In [16]:
# generating date ranges with a timezone set
pd.date_range('3/9/2012 9:30', periods=10, freq='D', tz="UTC")

DatetimeIndex(['2012-03-09 09:30:00+00:00', '2012-03-10 09:30:00+00:00',
               '2012-03-11 09:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 09:30:00+00:00', '2012-03-14 09:30:00+00:00',
               '2012-03-15 09:30:00+00:00', '2012-03-16 09:30:00+00:00',
               '2012-03-17 09:30:00+00:00', '2012-03-18 09:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='D')

In [18]:
# conversion from naive to localized with the tz_localize method
ts_utc = ts.tz_localize('UTC')

In [19]:
ts_utc

2012-03-09 09:30:00+00:00    0.378185
2012-03-10 09:30:00+00:00   -0.390203
2012-03-11 09:30:00+00:00   -0.500009
2012-03-12 09:30:00+00:00   -0.628508
2012-03-13 09:30:00+00:00    0.390594
2012-03-14 09:30:00+00:00   -0.871618
Freq: D, dtype: float64

In [20]:
ts_utc.index

DatetimeIndex(['2012-03-09 09:30:00+00:00', '2012-03-10 09:30:00+00:00',
               '2012-03-11 09:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 09:30:00+00:00', '2012-03-14 09:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='D')

In [21]:
# converting localized series to another timezone
ts_utc.tz_convert('America/New_York')

2012-03-09 04:30:00-05:00    0.378185
2012-03-10 04:30:00-05:00   -0.390203
2012-03-11 05:30:00-04:00   -0.500009
2012-03-12 05:30:00-04:00   -0.628508
2012-03-13 05:30:00-04:00    0.390594
2012-03-14 05:30:00-04:00   -0.871618
Freq: D, dtype: float64
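
The two methods are not interchangeable: `tz_localize` is meant for naive data and `tz_convert` for data that already carries a time zone. The sketch below rebuilds a small series like the one above (with placeholder values instead of random numbers) and checks that using the wrong method raises a `TypeError`. The helper `raises_type_error` is a hypothetical name introduced only for this example.

```python
import numpy as np
import pandas as pd

rng = pd.date_range('3/9/2012 9:30', periods=6, freq='D')
ts = pd.Series(np.arange(len(rng), dtype=float), index=rng)  # naive
ts_utc = ts.tz_localize('UTC')                               # aware


def raises_type_error(func):
    """Return True if calling func() raises TypeError, False otherwise."""
    try:
        func()
    except TypeError:
        return True
    return False


# Converting a naive series fails: pandas does not know the starting zone
naive_convert_fails = raises_type_error(lambda: ts.tz_convert('America/New_York'))

# Localizing an aware series fails: it already has a zone, use tz_convert
aware_localize_fails = raises_type_error(lambda: ts_utc.tz_localize('America/New_York'))

print(naive_convert_fails, aware_localize_fails)
```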

In [22]:
# another example: localizing the naive series to the America/New_York time zone
ts_eastern = ts.tz_localize('America/New_York')
ts_eastern

2012-03-09 09:30:00-05:00    0.378185
2012-03-10 09:30:00-05:00   -0.390203
2012-03-11 09:30:00-04:00   -0.500009
2012-03-12 09:30:00-04:00   -0.628508
2012-03-13 09:30:00-04:00    0.390594
2012-03-14 09:30:00-04:00   -0.871618
dtype: float64

In [23]:
# converting eastern time to UTC
ts_eastern.tz_convert('UTC')

2012-03-09 14:30:00+00:00    0.378185
2012-03-10 14:30:00+00:00   -0.390203
2012-03-11 13:30:00+00:00   -0.500009
2012-03-12 13:30:00+00:00   -0.628508
2012-03-13 13:30:00+00:00    0.390594
2012-03-14 13:30:00+00:00   -0.871618
dtype: float64

In [25]:
# converting eastern time to European time
ts_eastern.tz_convert('Europe/Berlin')

2012-03-09 15:30:00+01:00    0.378185
2012-03-10 15:30:00+01:00   -0.390203
2012-03-11 14:30:00+01:00   -0.500009
2012-03-12 14:30:00+01:00   -0.628508
2012-03-13 14:30:00+01:00    0.390594
2012-03-14 14:30:00+01:00   -0.871618
dtype: float64
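
Converting between zones changes only the wall-clock labels, not the moments they refer to. The sketch below checks this on a series like `ts_eastern` (built here with placeholder values) and also shows two ways of dropping the time zone again, which give different naive results.

```python
import numpy as np
import pandas as pd

rng = pd.date_range('3/9/2012 9:30', periods=6, freq='D')
ts_eastern = pd.Series(np.arange(len(rng), dtype=float),
                       index=rng).tz_localize('America/New_York')

ts_utc = ts_eastern.tz_convert('UTC')
ts_berlin = ts_eastern.tz_convert('Europe/Berlin')

# Comparing aware indexes compares instants, not the displayed local times
same_instants = bool((ts_utc.index == ts_berlin.index).all())
print(same_instants)

# Two ways back to a naive index:
as_utc_naive = ts_eastern.tz_convert(None)     # convert to UTC, then drop the zone
as_local_naive = ts_eastern.tz_localize(None)  # keep New York wall time, drop the zone
print(as_utc_naive.index[0], as_local_naive.index[0])
```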

In [28]:
# tz_localize and tz_convert are also instance methods on DatetimeIndex
ts.index.tz_localize('Asia/Shanghai')

DatetimeIndex(['2012-03-09 09:30:00+08:00', '2012-03-10 09:30:00+08:00',
               '2012-03-11 09:30:00+08:00', '2012-03-12 09:30:00+08:00',
               '2012-03-13 09:30:00+08:00', '2012-03-14 09:30:00+08:00'],
              dtype='datetime64[ns, Asia/Shanghai]', freq=None)
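
Localization can run into wall-clock times that do not exist or that occur twice around a DST change. The sketch below shows how the `nonexistent` and `ambiguous` parameters of `tz_localize` can handle such cases; the two timestamps are chosen to fall on the 2012 US/Eastern transitions, and the strategies shown are only some of the available options.

```python
import numpy as np
import pandas as pd

# 02:30 did not exist on 2012-03-11 in US/Eastern (clocks jumped from 02:00 to 03:00)
gap = pd.DatetimeIndex(['2012-03-11 02:30'])
shifted = gap.tz_localize('US/Eastern', nonexistent='shift_forward')
print(shifted)

# 01:30 occurred twice on 2012-11-04 (clocks went back from 02:00 to 01:00)
overlap = pd.DatetimeIndex(['2012-11-04 01:30'])
unresolved = overlap.tz_localize('US/Eastern', ambiguous='NaT')
# A boolean array picks one reading per element: True means the DST one
first_pass = overlap.tz_localize('US/Eastern', ambiguous=np.array([True]))
print(unresolved)
print(first_pass)
```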

### Operations with Time Zone-Aware Timestamp Objects

Individual `Timestamp` objects can be localized from naive to time-zone aware and converted from one time zone to another.

In [4]:
stamp = pd.Timestamp('2021-03-12 04:00')
stamp

Timestamp('2021-03-12 04:00:00')

In [5]:
stamp_utc = stamp.tz_localize('utc')
stamp_utc

Timestamp('2021-03-12 04:00:00+0000', tz='UTC')

In [6]:
stamp_utc.tz_convert('America/New_York')

Timestamp('2021-03-11 23:00:00-0500', tz='America/New_York')

In [8]:
stamp_moscow = pd.Timestamp('2021-03-12 04:00', tz='Europe/Moscow')
stamp_moscow

Timestamp('2021-03-12 04:00:00+0300', tz='Europe/Moscow')
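
A time-zone-aware `Timestamp` keeps its moment as a count of nanoseconds since the Unix epoch in UTC, so converting it to another zone changes how it is displayed but not the stored value. A small sketch, reusing the date from the cells above:

```python
import pandas as pd

stamp_utc = pd.Timestamp('2021-03-12 04:00', tz='UTC')
stamp_ny = stamp_utc.tz_convert('America/New_York')
stamp_moscow = stamp_utc.tz_convert('Europe/Moscow')

# .value is the number of nanoseconds since the Unix epoch, measured in UTC
print(stamp_utc.value, stamp_ny.value, stamp_moscow.value)

# The local wall-clock hour differs, the instant does not
print(stamp_ny.hour, stamp_moscow.hour, stamp_ny == stamp_moscow)
```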

When performing time arithmetic with pandas's `DateOffset` objects, pandas respects daylight saving time transitions where possible.

__Examples:__ Timestamps that occur right before the transition to DST, which in the US/Eastern zone took place at 02:00 on 2012-03-11.

In [11]:
from pandas.tseries.offsets import Hour

In [12]:
# 30 minutes before transitioning to DST
stamp = pd.Timestamp('2012-03-11 01:30', tz='US/Eastern')
stamp

Timestamp('2012-03-11 01:30:00-0500', tz='US/Eastern')

In [13]:
stamp + Hour()

Timestamp('2012-03-11 03:30:00-0400', tz='US/Eastern')

In [14]:
# 90 minutes before transitioning to DST
stamp = pd.Timestamp('2012-03-11 00:30', tz='US/Eastern')
stamp

Timestamp('2012-03-11 00:30:00-0500', tz='US/Eastern')

In [15]:
stamp + 2 * Hour()

Timestamp('2012-03-11 03:30:00-0400', tz='US/Eastern')
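
The opposite transition, leaving DST in autumn, can be sketched the same way. In 2012 the US/Eastern clocks went back at 02:00 on November 4, so adding two hours to 00:30 should land on 01:30 with the standard-time offset rather than on 02:30.

```python
from datetime import timedelta

import pandas as pd
from pandas.tseries.offsets import Hour

# 90 minutes before transitioning out of DST
stamp = pd.Timestamp('2012-11-04 00:30', tz='US/Eastern')
later = stamp + 2 * Hour()

print(stamp)  # still on daylight saving time
print(later)  # two elapsed hours later, now on standard time

# Elapsed time is exactly two hours even though the wall clock moved by one
elapsed = later - stamp
print(elapsed)
```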

### Operations Between Different Time Zones

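If two time series with different time zones are combined, the result will be in UTC. Because the timestamps are stored under the hood in UTC, this is a straightforward operation that requires no conversion.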

In [17]:
rng = pd.date_range('3/7/2012 9:30', periods=10, freq='B')
ts = pd.Series(np.random.randn(len(rng)), index=rng)
ts

2012-03-07 09:30:00   -0.890673
2012-03-08 09:30:00   -0.371541
2012-03-09 09:30:00    1.872968
2012-03-12 09:30:00    0.500255
2012-03-13 09:30:00   -0.105224
2012-03-14 09:30:00    0.499948
2012-03-15 09:30:00    0.794980
2012-03-16 09:30:00   -1.128259
2012-03-19 09:30:00    0.497215
2012-03-20 09:30:00   -0.683767
Freq: B, dtype: float64

In [19]:
ts1 = ts[:7].tz_localize('Europe/London')
ts1

2012-03-07 09:30:00+00:00   -0.890673
2012-03-08 09:30:00+00:00   -0.371541
2012-03-09 09:30:00+00:00    1.872968
2012-03-12 09:30:00+00:00    0.500255
2012-03-13 09:30:00+00:00   -0.105224
2012-03-14 09:30:00+00:00    0.499948
2012-03-15 09:30:00+00:00    0.794980
dtype: float64

In [21]:
ts2 = ts[2:].tz_localize('Europe/Moscow')
ts2

2012-03-09 09:30:00+04:00    1.872968
2012-03-12 09:30:00+04:00    0.500255
2012-03-13 09:30:00+04:00   -0.105224
2012-03-14 09:30:00+04:00    0.499948
2012-03-15 09:30:00+04:00    0.794980
2012-03-16 09:30:00+04:00   -1.128259
2012-03-19 09:30:00+04:00    0.497215
2012-03-20 09:30:00+04:00   -0.683767
dtype: float64

In [22]:
result = ts1 + ts2

In [23]:
result.index

DatetimeIndex(['2012-03-07 09:30:00+00:00', '2012-03-08 09:30:00+00:00',
               '2012-03-09 05:30:00+00:00', '2012-03-09 09:30:00+00:00',
               '2012-03-12 05:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 05:30:00+00:00', '2012-03-13 09:30:00+00:00',
               '2012-03-14 05:30:00+00:00', '2012-03-14 09:30:00+00:00',
               '2012-03-15 05:30:00+00:00', '2012-03-15 09:30:00+00:00',
               '2012-03-16 05:30:00+00:00', '2012-03-19 05:30:00+00:00',
               '2012-03-20 05:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq=None)
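
In the example above, 09:30 in London and 09:30 in Moscow are different instants, so each of the 15 labels in `result.index` comes from only one of the two series and the sum has no matching pairs to add. The sketch below uses placeholder values and two views of the same instants, one in each zone, to show the alignment producing actual sums while the result index is still UTC.

```python
import numpy as np
import pandas as pd

rng = pd.date_range('3/7/2012 9:30', periods=5, freq='B', tz='UTC')
base = pd.Series(np.arange(len(rng), dtype=float), index=rng)

# Same instants, displayed in two different time zones
london = base.tz_convert('Europe/London')
moscow = base.tz_convert('Europe/Moscow')

# Alignment happens on the underlying instants; the result index is in UTC
total = london + moscow
print(total.index.tz)
print(total)
```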