# Timezone Handling

Working with timezones can be hard; pandas makes it simpler and more intuitive. Most of the timezone functionality lives in the `pytz` package, which pandas relies on transparently.

In [1]:
import pytz
import numpy as np
import pandas as pd

Timestamps can be "naive" or "aware". A naive timestamp gives a date and time, but it doesn't say where that time applies. For example: "the first match of the World Cup will be on June 14th at 11AM". The obvious question is "11AM where?". This is a naive timestamp: it states the day and time, but carries no information about the timezone (or location) the time refers to:

In [2]:
pd.Timestamp('June 14 2018, 11AM')

Timestamp('2018-06-14 11:00:00')

An "aware" timestamp does contain the timezone information. Following the previous example, we can say "the first match of the World Cup will be on June 14th at 11AM **in New York**". We're "localizing" the timestamp. In pandas, you can create an aware timestamp by passing the timezone:

In [3]:
pd.Timestamp('June 14 2018, 11AM', tz='America/New_York')

Timestamp('2018-06-14 11:00:00-0400', tz='America/New_York')

In [4]:
pd.Timestamp('June 14 2018, 11AM', tz='US/Eastern')

Timestamp('2018-06-14 11:00:00-0400', tz='US/Eastern')
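A quick way to tell the two kinds of timestamp apart is to inspect the `tz` attribute. The snippet below is a minimal sketch; the variable names are illustrative and not part of the original notebook.

```python
import pandas as pd

# A naive timestamp carries no timezone, so its tz attribute is None.
ts_naive = pd.Timestamp('2018-06-14 11:00')
print(ts_naive.tz is None)

# An aware timestamp exposes its timezone through the same attribute.
ts_aware = pd.Timestamp('2018-06-14 11:00', tz='America/New_York')
print(ts_aware.tz is None)

# The UTC offset is only defined for the aware timestamp.
print(ts_aware.utcoffset())
```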

Or by "localizing" a naive timestamp:

In [5]:
ts_naive = pd.Timestamp('June 14 2018, 11AM')

In [6]:
ts_naive.tz_localize('US/Eastern')

Timestamp('2018-06-14 11:00:00-0400', tz='US/Eastern')
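Both routes should produce the same aware timestamp: localizing keeps the wall-clock time and only attaches the timezone. The sketch below checks that, and also shows that `tz_localize` is meant for naive timestamps only. It uses `America/New_York` and illustrative variable names.

```python
import pandas as pd

ts_naive = pd.Timestamp('2018-06-14 11:00')

# Attach a timezone to the naive timestamp...
ts_localized = ts_naive.tz_localize('America/New_York')
# ...or pass the timezone at construction time.
ts_direct = pd.Timestamp('2018-06-14 11:00', tz='America/New_York')

# Both describe the same instant, and the wall-clock hour is unchanged.
print(ts_localized == ts_direct)
print(ts_localized.hour)

# An already-aware timestamp cannot be localized again.
try:
    ts_localized.tz_localize('UTC')
except TypeError as err:
    print('cannot localize an aware timestamp:', err)
```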

Once you have a localized timestamp, you can convert it back and forth between different timezones. For example, let's start with the first match of the World Cup in UTC:

![image](https://user-images.githubusercontent.com/872296/40145954-cddc310a-5931-11e8-9d0c-354de3a1b3df.png)


In [7]:
ts_first_match = pd.Timestamp('June 14 2018, 3PM UTC')
ts_first_match

Timestamp('2018-06-14 15:00:00+0000', tz='UTC')

In [8]:
ts_first_match.tz_convert('US/Eastern')

Timestamp('2018-06-14 11:00:00-0400', tz='US/Eastern')

In [9]:
ts_first_match.tz_convert('America/Buenos_Aires')

Timestamp('2018-06-14 12:00:00-0300', tz='America/Buenos_Aires')

In [10]:
ts_first_match.tz_convert('Asia/Tokyo')

Timestamp('2018-06-15 00:00:00+0900', tz='Asia/Tokyo')

In [11]:
ts_first_match.tz_convert('Europe/Moscow')

Timestamp('2018-06-14 18:00:00+0300', tz='Europe/Moscow')

In [12]:
ts_first_match.tz_convert('Asia/Yekaterinburg')

Timestamp('2018-06-14 20:00:00+0500', tz='Asia/Yekaterinburg')
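All of these conversions describe the same instant; only the wall-clock representation changes. A minimal sketch that makes this explicit (variable names are illustrative):

```python
import pandas as pd

ts_utc = pd.Timestamp('2018-06-14 15:00', tz='UTC')
ts_tokyo = ts_utc.tz_convert('Asia/Tokyo')
ts_moscow = ts_utc.tz_convert('Europe/Moscow')

# Aware timestamps compare by instant, not by wall-clock time.
print(ts_utc == ts_tokyo)
print(ts_tokyo == ts_moscow)

# The wall-clock fields differ from one timezone to the next.
print(ts_tokyo.day, ts_tokyo.hour)
print(ts_moscow.day, ts_moscow.hour)
```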

The match between Uruguay and Russia is expressed in "SAMT", which is the [Samara Timezone](https://en.wikipedia.org/wiki/Samara_Time). `pytz` lists all the available timezones in the `pytz.all_timezones` attribute:

In [13]:
len(pytz.all_timezones)

591

In [14]:
[tz for tz in pytz.all_timezones if 'samara' in tz.lower()]

['Europe/Samara']

In [15]:
ts_first_match.tz_convert('Europe/Samara')

Timestamp('2018-06-14 19:00:00+0400', tz='Europe/Samara')
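The lookup above can be wrapped in a small helper. `find_timezones` is a hypothetical name introduced here for illustration; it is not part of `pytz` or pandas. The sketch also uses `pytz.country_timezones`, which returns the timezone names for an ISO 3166 country code.

```python
import pytz


def find_timezones(fragment):
    """Return the pytz timezone names containing `fragment` (case-insensitive)."""
    fragment = fragment.lower()
    return [tz for tz in pytz.all_timezones if fragment in tz.lower()]


print(find_timezones('samara'))

# Alternatively, list the timezones of a country by its ISO 3166 code.
russian_timezones = pytz.country_timezones('RU')
print('Europe/Samara' in russian_timezones)
```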

## Daylight saving time

![image](https://user-images.githubusercontent.com/872296/40147933-9699cfa6-5939-11e8-855d-e310ebcc6bd9.png)


Daylight saving time (DST) transitions are taken into account, and `pytz` is frequently updated with the latest rules. For example, on March 11th, 2018, New York clocks were turned forward 1 hour (DST began), so we usually say we "lose" an hour. pandas is aware of that:

In [16]:
list(pd.date_range('2018-03-11 00:00:00', '2018-03-11 05:00:00', freq='H', tz='US/Eastern'))

[Timestamp('2018-03-11 00:00:00-0500', tz='US/Eastern', freq='H'),
 Timestamp('2018-03-11 01:00:00-0500', tz='US/Eastern', freq='H'),
 Timestamp('2018-03-11 03:00:00-0400', tz='US/Eastern', freq='H'),
 Timestamp('2018-03-11 04:00:00-0400', tz='US/Eastern', freq='H'),
 Timestamp('2018-03-11 05:00:00-0400', tz='US/Eastern', freq='H')]
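Another way to see the missing hour is to measure the length of that day. Subtracting two aware timestamps gives the real elapsed time, so the sketch below (with illustrative variable names) should report a 23-hour day.

```python
import pandas as pd

# Midnight at the start and at the end of the day DST began in New York.
day_start = pd.Timestamp('2018-03-11 00:00', tz='America/New_York')
day_end = pd.Timestamp('2018-03-12 00:00', tz='America/New_York')

# The UTC offset changes from -05:00 to -04:00 during the day.
print(day_start.utcoffset(), day_end.utcoffset())

# Elapsed time between the two midnights: one hour short of 24.
elapsed = day_end - day_start
print(elapsed)
```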

Again, we say we "lose" that hour because it doesn't exist:

In [17]:
pd.Timestamp('2018-03-11 02:00:00', tz='US/Eastern')  # 2 AM does not exist on this date because of DST

NonExistentTimeError: 2018-03-11 02:00:00
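If raising an error is not the desired behavior, `tz_localize` accepts a `nonexistent` argument that controls what happens inside the gap. The sketch below assumes a pandas version that supports this argument (it is not available in very old releases); the variable names are illustrative.

```python
import pandas as pd

# A naive wall-clock time that falls inside the skipped hour.
ts_gap = pd.Timestamp('2018-03-11 02:30')

# Move forward to the first valid time after the gap.
shifted = ts_gap.tz_localize('America/New_York', nonexistent='shift_forward')
print(shifted)

# Shift the nonexistent time by an explicit amount instead.
moved = ts_gap.tz_localize('America/New_York', nonexistent=pd.Timedelta(hours=1))
print(moved)

# Or mark the value as missing.
missing = ts_gap.tz_localize('America/New_York', nonexistent='NaT')
print(missing)
```

Which option fits depends on the data: shifting keeps every row usable, while `NaT` makes the invalid values easy to find and review.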