<h1 style="color:blue" align="center">Pandas Time Series Analysis: Handling Time Zones</h1>

**We live in a world with different time zones. If you are having your morning coffee in New York at 9 AM, it might be dinner time in Mumbai, India, where it would be 6:30 PM. Handling time zones can therefore become a necessity when doing time series analysis in Pandas.**

<img src="timezones_world_map.png" />

**Read Microsoft's intraday stock price**

In [149]:
import pandas as pd
df = pd.read_csv("msft.csv", header=1, index_col='Date Time', parse_dates=True)
df

Date Time,Price
2017-08-17 09:00:00,72.38
2017-08-17 09:15:00,71.0
2017-08-17 09:30:00,71.67
2017-08-17 10:00:00,72.8
2017-08-17 10:30:00,73.0
2017-08-17 11:00:00,72.5


In [150]:
df.index

DatetimeIndex(['2017-08-17 09:00:00', '2017-08-17 09:15:00',
               '2017-08-17 09:30:00', '2017-08-17 10:00:00',
               '2017-08-17 10:30:00', '2017-08-17 11:00:00'],
              dtype='datetime64[ns]', name='Date Time', freq=None)

<h3>Two types of datetimes in Python</h3>
<ol>
    <li>Naive (no timezone awareness)</li>
    <li>Timezone aware datetime</li>
</ol>
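A quick way to tell the two kinds apart is to look at the `tz` attribute of the index. The snippet below is a minimal sketch on a small hand-built index (the two timestamps are taken from the data above; the variable names are only for illustration).

```python
import pandas as pd

# A naive index carries no timezone information, so .tz is None.
naive = pd.DatetimeIndex(["2017-08-17 09:00:00", "2017-08-17 09:15:00"])

# An aware index has a timezone attached to every timestamp.
aware = naive.tz_localize("US/Eastern")

print(naive.tz is None)   # True: naive
print(aware.tz is None)   # False: timezone aware
print(aware[0])           # the offset is now part of the timestamp
```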

<h3 style="color:purple">Convert naive DatetimeIndex to timezone aware DatetimeIndex using tz_localize</h3>

In [151]:
# tz_localize returns a new DataFrame; without assigning the result, df stays naive
df.tz_localize(tz='US/Eastern')
df

Date Time,Price
2017-08-17 09:00:00,72.38
2017-08-17 09:15:00,71.0
2017-08-17 09:30:00,71.67
2017-08-17 10:00:00,72.8
2017-08-17 10:30:00,73.0
2017-08-17 11:00:00,72.5


In [152]:
df.index = df.index.tz_localize(tz='US/Eastern')
df.index

DatetimeIndex(['2017-08-17 09:00:00-04:00', '2017-08-17 09:15:00-04:00',
               '2017-08-17 09:30:00-04:00', '2017-08-17 10:00:00-04:00',
               '2017-08-17 10:30:00-04:00', '2017-08-17 11:00:00-04:00'],
              dtype='datetime64[ns, US/Eastern]', name='Date Time', freq=None)
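The two cells above suggest two properties of `tz_localize`: it returns a new object rather than modifying the original, and it keeps the wall-clock times as they are while attaching a timezone. A minimal sketch that checks both, using a small hand-built frame in place of `msft.csv` (the frame and variable names are assumptions for illustration):

```python
import pandas as pd

# Small stand-in for the CSV data, with a naive DatetimeIndex.
df = pd.DataFrame(
    {"Price": [72.38, 71.0]},
    index=pd.DatetimeIndex(
        ["2017-08-17 09:00:00", "2017-08-17 09:15:00"], name="Date Time"
    ),
)

# tz_localize returns a new DataFrame; df itself is left untouched.
localized = df.tz_localize("US/Eastern")
original_still_naive = df.index.tz is None

# Dropping the timezone again gives back the same wall-clock times,
# so localizing only labelled the times and did not shift them.
same_wall_clock = (localized.index.tz_localize(None) == df.index).all()

print(original_still_naive, same_wall_clock)
```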

<h3 style="color:purple">Convert to Berlin time using tz_convert</h3>

In [153]:
df = df.tz_convert('Europe/Berlin')
df

Date Time,Price
2017-08-17 15:00:00+02:00,72.38
2017-08-17 15:15:00+02:00,71.0
2017-08-17 15:30:00+02:00,71.67
2017-08-17 16:00:00+02:00,72.8
2017-08-17 16:30:00+02:00,73.0
2017-08-17 17:00:00+02:00,72.5


In [154]:
df.index

DatetimeIndex(['2017-08-17 15:00:00+02:00', '2017-08-17 15:15:00+02:00',
               '2017-08-17 15:30:00+02:00', '2017-08-17 16:00:00+02:00',
               '2017-08-17 16:30:00+02:00', '2017-08-17 17:00:00+02:00'],
              dtype='datetime64[ns, Europe/Berlin]', name='Date Time', freq=None)
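Unlike `tz_localize`, `tz_convert` changes the wall-clock reading but not the moment in time it refers to. The sketch below checks this on two of the timestamps above by comparing both indexes in UTC (variable names are illustrative only):

```python
import pandas as pd

eastern = pd.DatetimeIndex(
    ["2017-08-17 09:00:00", "2017-08-17 11:00:00"]
).tz_localize("US/Eastern")

# Express the same instants as Berlin wall-clock time.
berlin = eastern.tz_convert("Europe/Berlin")

# The displayed hours differ between the two zones...
print(list(eastern.hour), list(berlin.hour))

# ...but both indexes point at the same instants once expressed in UTC.
same_instants = eastern.tz_convert("UTC").equals(berlin.tz_convert("UTC"))
print(same_instants)
```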

In [138]:
from pytz import all_timezones
print(all_timezones)

['Africa/Abidjan', 'Africa/Accra', 'Africa/Addis_Ababa', 'Africa/Algiers', 'Africa/Asmara', 'Africa/Asmera', 'Africa/Bamako', 'Africa/Bangui', 'Africa/Banjul', 'Africa/Bissau', 'Africa/Blantyre', 'Africa/Brazzaville', 'Africa/Bujumbura', 'Africa/Cairo', 'Africa/Casablanca', 'Africa/Ceuta', 'Africa/Conakry', 'Africa/Dakar', 'Africa/Dar_es_Salaam', 'Africa/Djibouti', 'Africa/Douala', 'Africa/El_Aaiun', 'Africa/Freetown', 'Africa/Gaborone', 'Africa/Harare', 'Africa/Johannesburg', 'Africa/Juba', 'Africa/Kampala', 'Africa/Khartoum', 'Africa/Kigali', 'Africa/Kinshasa', 'Africa/Lagos', 'Africa/Libreville', 'Africa/Lome', 'Africa/Luanda', 'Africa/Lubumbashi', 'Africa/Lusaka', 'Africa/Malabo', 'Africa/Maputo', 'Africa/Maseru', 'Africa/Mbabane', 'Africa/Mogadishu', 'Africa/Monrovia', 'Africa/Nairobi', 'Africa/Ndjamena', 'Africa/Niamey', 'Africa/Nouakchott', 'Africa/Ouagadougou', 'Africa/Porto-Novo', 'Africa/Sao_Tome', 'Africa/Timbuktu', 'Africa/Tripoli', 'Africa/Tunis', 'Africa/Windhoek', 'Ameri

<h3 style="color:purple">Convert to Mumbai time</h3>

In [139]:
# The tz database has no Mumbai entry, but Calcutta and Mumbai are in the same timezone, so we use that
df.index = df.index.tz_convert('Asia/Calcutta')
df

Date Time,Price
2017-08-17 18:30:00+05:30,72.38
2017-08-17 18:45:00+05:30,71.0
2017-08-17 19:00:00+05:30,71.67
2017-08-17 19:30:00+05:30,72.8
2017-08-17 20:00:00+05:30,73.0
2017-08-17 20:30:00+05:30,72.5
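In the tz database, `Asia/Calcutta` is an older spelling that is kept as an alias; the current name for the same zone is `Asia/Kolkata`. A minimal sketch, assuming both names are available in your installed timezone data, showing that they give the same result for the first timestamp above:

```python
import pandas as pd

# 9:00 AM in New York on the day covered by the data.
ts = pd.Timestamp("2017-08-17 09:00:00", tz="US/Eastern")

calcutta = ts.tz_convert("Asia/Calcutta")   # older alias
kolkata = ts.tz_convert("Asia/Kolkata")     # current name

print(calcutta)
print(kolkata)
print(calcutta.utcoffset() == kolkata.utcoffset())
```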


<h3 style="color:purple">Using timezones in date_range</h3>

<h4 style="color:green">(1) timezone using pytz</h4>

In [140]:
london = pd.date_range('3/6/2012 00:09:00', periods=10, freq='H',tz='Europe/London')
london

DatetimeIndex(['2012-03-06 00:09:00+00:00', '2012-03-06 01:09:00+00:00',
               '2012-03-06 02:09:00+00:00', '2012-03-06 03:09:00+00:00',
               '2012-03-06 04:09:00+00:00', '2012-03-06 05:09:00+00:00',
               '2012-03-06 06:09:00+00:00', '2012-03-06 07:09:00+00:00',
               '2012-03-06 08:09:00+00:00', '2012-03-06 09:09:00+00:00'],
              dtype='datetime64[ns, Europe/London]', freq='H')
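Passing `tz` to `date_range` should give the same index as building a naive range and localizing it afterwards. A small sketch to check that; it uses `freq="60min"` instead of `'H'` only because the hourly alias is spelled differently across pandas versions:

```python
import pandas as pd

# Timezone passed directly to date_range.
direct = pd.date_range(
    "2012-03-06 00:09:00", periods=3, freq="60min", tz="Europe/London"
)

# Naive range first, then localized in a second step.
two_step = pd.date_range(
    "2012-03-06 00:09:00", periods=3, freq="60min"
).tz_localize("Europe/London")

print(direct.equals(two_step))
```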

<h4 style="color:green">(2) timezone using dateutil</h4>

In [141]:
td = pd.date_range('3/6/2012 00:00', periods=10, freq='H',tz='dateutil/Europe/London')
td

DatetimeIndex(['2012-03-06 00:00:00+00:00', '2012-03-06 01:00:00+00:00',
               '2012-03-06 02:00:00+00:00', '2012-03-06 03:00:00+00:00',
               '2012-03-06 04:00:00+00:00', '2012-03-06 05:00:00+00:00',
               '2012-03-06 06:00:00+00:00', '2012-03-06 07:00:00+00:00',
               '2012-03-06 08:00:00+00:00', '2012-03-06 09:00:00+00:00'],
              dtype='datetime64[ns, tzfile('GB-Eire')]', freq='H')


<h3>According to the Pandas documentation, pytz and dateutil timezones differ as follows</h3>
<ol>
    <li>In pytz, you can get a list of common (and less common) time zones with <code>from pytz import common_timezones, all_timezones</code></li>
    <li>dateutil uses the OS timezones, so there isn’t a fixed list available. For common zones, the names are the same as in pytz</li>
</ol>
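Whichever library supplies the timezone, the resulting timestamps should describe the same instants. The sketch below compares a plain zone name with the `dateutil/` prefixed form by converting both to UTC. It assumes dateutil can find `Europe/London` on your system; which library pandas uses for the plain name depends on the pandas version.

```python
import pandas as pd

# Plain zone name (handled by pandas' default timezone library).
plain = pd.date_range(
    "2012-03-06 00:00", periods=3, freq="60min", tz="Europe/London"
)

# Same zone requested explicitly through dateutil.
via_dateutil = pd.date_range(
    "2012-03-06 00:00", periods=3, freq="60min", tz="dateutil/Europe/London"
)

# Compare the underlying instants by expressing both in UTC.
same_instants = plain.tz_convert("UTC").equals(via_dateutil.tz_convert("UTC"))
print(same_instants)
```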

<h3 style="color:purple">Arithmetic between different timezones</h3>

In [155]:
rng = pd.date_range(start="2017-08-22 09:00:00",periods=10, freq='30min')
s = pd.Series(range(10),index=rng)
s

DatetimeIndex(['2017-08-22 09:00:00', '2017-08-22 09:30:00',
               '2017-08-22 10:00:00', '2017-08-22 10:30:00',
               '2017-08-22 11:00:00', '2017-08-22 11:30:00',
               '2017-08-22 12:00:00', '2017-08-22 12:30:00',
               '2017-08-22 13:00:00', '2017-08-22 13:30:00'],
              dtype='datetime64[ns]', freq='30T')

2017-08-22 09:00:00    0
2017-08-22 09:30:00    1
2017-08-22 10:00:00    2
2017-08-22 10:30:00    3
2017-08-22 11:00:00    4
2017-08-22 11:30:00    5
2017-08-22 12:00:00    6
2017-08-22 12:30:00    7
2017-08-22 13:00:00    8
2017-08-22 13:30:00    9
Freq: 30T, dtype: int32

In [157]:
b = s.tz_localize(tz="Europe/Berlin")
b

2017-08-22 09:00:00+02:00    0
2017-08-22 09:30:00+02:00    1
2017-08-22 10:00:00+02:00    2
2017-08-22 10:30:00+02:00    3
2017-08-22 11:00:00+02:00    4
2017-08-22 11:30:00+02:00    5
2017-08-22 12:00:00+02:00    6
2017-08-22 12:30:00+02:00    7
2017-08-22 13:00:00+02:00    8
2017-08-22 13:30:00+02:00    9
Freq: 30T, dtype: int32

In [158]:
b.index

DatetimeIndex(['2017-08-22 09:00:00+02:00', '2017-08-22 09:30:00+02:00',
               '2017-08-22 10:00:00+02:00', '2017-08-22 10:30:00+02:00',
               '2017-08-22 11:00:00+02:00', '2017-08-22 11:30:00+02:00',
               '2017-08-22 12:00:00+02:00', '2017-08-22 12:30:00+02:00',
               '2017-08-22 13:00:00+02:00', '2017-08-22 13:30:00+02:00'],
              dtype='datetime64[ns, Europe/Berlin]', freq='30T')

In [159]:
m = s.tz_localize(tz="Asia/Calcutta")
m.index

DatetimeIndex(['2017-08-22 09:00:00+05:30', '2017-08-22 09:30:00+05:30',
               '2017-08-22 10:00:00+05:30', '2017-08-22 10:30:00+05:30',
               '2017-08-22 11:00:00+05:30', '2017-08-22 11:30:00+05:30',
               '2017-08-22 12:00:00+05:30', '2017-08-22 12:30:00+05:30',
               '2017-08-22 13:00:00+05:30', '2017-08-22 13:30:00+05:30'],
              dtype='datetime64[ns, Asia/Calcutta]', freq='30T')

In [160]:
m

2017-08-22 09:00:00+05:30    0
2017-08-22 09:30:00+05:30    1
2017-08-22 10:00:00+05:30    2
2017-08-22 10:30:00+05:30    3
2017-08-22 11:00:00+05:30    4
2017-08-22 11:30:00+05:30    5
2017-08-22 12:00:00+05:30    6
2017-08-22 12:30:00+05:30    7
2017-08-22 13:00:00+05:30    8
2017-08-22 13:30:00+05:30    9
Freq: 30T, dtype: int32

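**Pandas first converts each series' timestamps to UTC and then aligns the two series on those UTC datetimes before performing addition, subtraction, and similar operations. Timestamps that appear in only one of the series produce NaN.**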

In [161]:
b + m

2017-08-22 03:30:00+00:00     NaN
2017-08-22 04:00:00+00:00     NaN
2017-08-22 04:30:00+00:00     NaN
2017-08-22 05:00:00+00:00     NaN
2017-08-22 05:30:00+00:00     NaN
2017-08-22 06:00:00+00:00     NaN
2017-08-22 06:30:00+00:00     NaN
2017-08-22 07:00:00+00:00     7.0
2017-08-22 07:30:00+00:00     9.0
2017-08-22 08:00:00+00:00    11.0
2017-08-22 08:30:00+00:00     NaN
2017-08-22 09:00:00+00:00     NaN
2017-08-22 09:30:00+00:00     NaN
2017-08-22 10:00:00+00:00     NaN
2017-08-22 10:30:00+00:00     NaN
2017-08-22 11:00:00+00:00     NaN
2017-08-22 11:30:00+00:00     NaN
Freq: 30T, dtype: float64

**Date alignment is shown below**

<img src="alignment.png" />
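The alignment can also be checked in code. This sketch rebuilds `b` and `m` from the cells above, confirms that the result is indexed in UTC, and uses `dropna()` to keep only the instants present in both series:

```python
import pandas as pd

rng = pd.date_range(start="2017-08-22 09:00:00", periods=10, freq="30min")
s = pd.Series(range(10), index=rng)

b = s.tz_localize("Europe/Berlin")   # 09:00 Berlin is 07:00 UTC
m = s.tz_localize("Asia/Calcutta")   # 09:00 India is 03:30 UTC

# The sum is aligned on UTC; the index is the union of both series.
total = b + m
print(total.index.tz)
print(len(total))

# Only three instants (07:00, 07:30 and 08:00 UTC) exist in both series.
overlap = total.dropna()
print(overlap)
```

Converting `overlap` back with `tz_convert` gives the same values labelled in whichever local time you prefer.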