## Working with timezones in Pandas

In [2]:
import pandas as pd
tindex = pd.date_range('1 October 2021', '31 October 2021', freq = "1D")
tindex

DatetimeIndex(['2021-10-01', '2021-10-02', '2021-10-03', '2021-10-04',
               '2021-10-05', '2021-10-06', '2021-10-07', '2021-10-08',
               '2021-10-09', '2021-10-10', '2021-10-11', '2021-10-12',
               '2021-10-13', '2021-10-14', '2021-10-15', '2021-10-16',
               '2021-10-17', '2021-10-18', '2021-10-19', '2021-10-20',
               '2021-10-21', '2021-10-22', '2021-10-23', '2021-10-24',
               '2021-10-25', '2021-10-26', '2021-10-27', '2021-10-28',
               '2021-10-29', '2021-10-30', '2021-10-31'],
              dtype='datetime64[ns]', freq='D')

By default, the timestamps generated by `date_range` are naive, that is, they carry no timezone information.

In [3]:
tindex.tz

In [4]:
tindex.tz is None

True

Timezone-aware timestamps can be generated by passing the `tz` argument.

In [5]:
tindexIndia = pd.date_range('1 October 2021', '5 October 2021', freq = "1D", tz = 'Asia/Kolkata')
tindexIndia

DatetimeIndex(['2021-10-01 00:00:00+05:30', '2021-10-02 00:00:00+05:30',
               '2021-10-03 00:00:00+05:30', '2021-10-04 00:00:00+05:30',
               '2021-10-05 00:00:00+05:30'],
              dtype='datetime64[ns, Asia/Kolkata]', freq='D')

In [6]:
tindexIndia.tz

<DstTzInfo 'Asia/Kolkata' LMT+5:53:00 STD>

Note that the timezone 'Asia/Kolkata' is displayed as `LMT+5:53:00`, not as the +05:30 offset seen in the timestamps above.

This is due to the design of the `pytz` package, on which this functionality depends. The representation shows the Local Mean Time (LMT) offset, the one in use before timezones were standardized.
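The offset actually applied to a timestamp is looked up for that particular date, whatever the representation of the zone object says. A minimal sketch of how this can be checked, assuming only the 'Asia/Kolkata' zone name used above (the names `ts` and `offset` are illustrative):

```python
from datetime import timedelta

import pandas as pd

# Illustrative timestamp in the same zone as tindexIndia above.
ts = pd.Timestamp('2021-10-01', tz='Asia/Kolkata')

# utcoffset() reports the offset in force at this particular instant,
# not the LMT value shown in the representation of the zone object.
offset = ts.utcoffset()
print(offset == timedelta(hours=5, minutes=30))
```

For a 2021 date the comparison should print `True`, matching the +05:30 suffix in the index output.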

The example below shows how the offset for this timezone has changed over time.

In [8]:
pd.date_range('1 December 1901', '31 December 1901', freq = "1D", tz = 'Asia/Kolkata')

DatetimeIndex(['1901-12-01 00:00:00+05:53', '1901-12-02 00:00:00+05:53',
               '1901-12-03 00:00:00+05:53', '1901-12-04 00:00:00+05:53',
               '1901-12-05 00:00:00+05:53', '1901-12-06 00:00:00+05:53',
               '1901-12-07 00:00:00+05:53', '1901-12-08 00:00:00+05:53',
               '1901-12-09 00:00:00+05:53', '1901-12-10 00:00:00+05:53',
               '1901-12-11 00:00:00+05:53', '1901-12-12 00:00:00+05:53',
               '1901-12-13 00:00:00+05:53', '1901-12-14 00:00:00+05:53',
               '1901-12-15 00:00:00+05:21', '1901-12-16 00:00:00+05:21',
               '1901-12-17 00:00:00+05:21', '1901-12-18 00:00:00+05:21',
               '1901-12-19 00:00:00+05:21', '1901-12-20 00:00:00+05:21',
               '1901-12-21 00:00:00+05:21', '1901-12-22 00:00:00+05:21',
               '1901-12-23 00:00:00+05:21', '1901-12-24 00:00:00+05:21',
               '1901-12-25 00:00:00+05:21', '1901-12-26 00:00:00+05:21',
               '1901-12-27 00:00:00+05:21', '1901-1

The next example shows one more change in the offset, at a much finer resolution.

In [9]:
pd.date_range('31 Dec 1905 23:59', '1 Jan 1906 00:15', freq = "1min", tz = 'Asia/Kolkata')

DatetimeIndex(['1905-12-31 23:59:00+05:21', '1906-01-01 00:09:00+05:30',
               '1906-01-01 00:10:00+05:30', '1906-01-01 00:11:00+05:30',
               '1906-01-01 00:12:00+05:30', '1906-01-01 00:13:00+05:30',
               '1906-01-01 00:14:00+05:30', '1906-01-01 00:15:00+05:30'],
              dtype='datetime64[ns, Asia/Kolkata]', freq='T')

The same change can be seen at an even finer resolution:

In [10]:
pd.date_range('31 Dec 1905 23:59:00', '1 Jan 1906 00:09:00', freq = "1S", tz = 'Asia/Kolkata')

DatetimeIndex(['1905-12-31 23:59:00+05:21', '1905-12-31 23:59:01+05:21',
               '1905-12-31 23:59:02+05:21', '1905-12-31 23:59:03+05:21',
               '1905-12-31 23:59:04+05:21', '1905-12-31 23:59:05+05:21',
               '1905-12-31 23:59:06+05:21', '1905-12-31 23:59:07+05:21',
               '1905-12-31 23:59:08+05:21', '1905-12-31 23:59:09+05:21',
               '1905-12-31 23:59:10+05:21', '1905-12-31 23:59:11+05:21',
               '1905-12-31 23:59:12+05:21', '1905-12-31 23:59:13+05:21',
               '1905-12-31 23:59:14+05:21', '1905-12-31 23:59:15+05:21',
               '1905-12-31 23:59:16+05:21', '1905-12-31 23:59:17+05:21',
               '1905-12-31 23:59:18+05:21', '1905-12-31 23:59:19+05:21',
               '1905-12-31 23:59:20+05:21', '1905-12-31 23:59:21+05:21',
               '1905-12-31 23:59:22+05:21', '1905-12-31 23:59:23+05:21',
               '1905-12-31 23:59:24+05:21', '1905-12-31 23:59:25+05:21',
               '1905-12-31 23:59:26+05:21', '1905-1

Interestingly, the **time between '1905-12-31 23:59:50' and '1906-01-01 00:08:49' doesn't exist** for India.
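One way to look at this jump is to start from UTC, where no instants are skipped, and convert to 'Asia/Kolkata'. The sketch below is only illustrative: the two UTC instants are assumptions chosen to fall on either side of the switch, and the exact offsets printed may differ slightly between timezone database versions.

```python
import pandas as pd

# Two UTC instants on either side of the switch (chosen for illustration).
before = pd.Timestamp('1905-12-31 18:30', tz='UTC').tz_convert('Asia/Kolkata')
after = pd.Timestamp('1905-12-31 18:45', tz='UTC').tz_convert('Asia/Kolkata')

# The UTC offset grows across the switch, so the local wall clock jumps forward.
print(before.utcoffset(), after.utcoffset())

# Only 15 minutes elapse in UTC, but the wall clock advances by more than that.
wall_clock_advance = after.tz_localize(None) - before.tz_localize(None)
print(wall_clock_advance)
```

The extra wall-clock time beyond the 15 elapsed minutes corresponds to the local times that were skipped.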

Timezone-unaware timestamps can be made timezone-aware using the `tz_localize` method, as shown below.

In [11]:
t = pd.date_range('01 Oct 2021', '15 Oct 2021', freq = "1D")
t

DatetimeIndex(['2021-10-01', '2021-10-02', '2021-10-03', '2021-10-04',
               '2021-10-05', '2021-10-06', '2021-10-07', '2021-10-08',
               '2021-10-09', '2021-10-10', '2021-10-11', '2021-10-12',
               '2021-10-13', '2021-10-14', '2021-10-15'],
              dtype='datetime64[ns]', freq='D')

In [12]:
t.tz_localize('Asia/Kolkata')

DatetimeIndex(['2021-10-01 00:00:00+05:30', '2021-10-02 00:00:00+05:30',
               '2021-10-03 00:00:00+05:30', '2021-10-04 00:00:00+05:30',
               '2021-10-05 00:00:00+05:30', '2021-10-06 00:00:00+05:30',
               '2021-10-07 00:00:00+05:30', '2021-10-08 00:00:00+05:30',
               '2021-10-09 00:00:00+05:30', '2021-10-10 00:00:00+05:30',
               '2021-10-11 00:00:00+05:30', '2021-10-12 00:00:00+05:30',
               '2021-10-13 00:00:00+05:30', '2021-10-14 00:00:00+05:30',
               '2021-10-15 00:00:00+05:30'],
              dtype='datetime64[ns, Asia/Kolkata]', freq=None)
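
Localizing attaches a zone to the existing wall-clock values without shifting them. The sketch below checks this. For contrast it also calls `tz_convert`, which is not used in the text above and is included here only as an assumption about what a reader might want next. The names `localized` and `in_utc` are illustrative.

```python
import pandas as pd

t = pd.date_range('01 Oct 2021', '15 Oct 2021', freq='1D')
localized = t.tz_localize('Asia/Kolkata')

# Localizing attaches a zone; stripping it again gives back the same wall-clock values.
print(localized.tz_localize(None).equals(t))

# Converting keeps the instant and changes only the wall-clock representation.
in_utc = localized.tz_convert('UTC')
print(in_utc[0])  # 2021-09-30 18:30:00+00:00
```

Midnight on 1 October in India is 18:30 on 30 September in UTC, which is what the last line prints.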