Time series data is an important form of structured data in many different fields, such
as finance, economics, ecology, neuroscience, and physics. Anything that is observed
or measured at many points in time forms a time series. Many time series are fixed
frequency, which is to say that data points occur at regular intervals according to some
rule, such as every 15 seconds, every 5 minutes, or once per month. Time series can
also be irregular without a fixed unit of time or offset between units.

How you mark and refer to time series data depends on the application, and you may have one of the following:
1. Timestamps, specific instants in time
2. Fixed periods, such as the month January 2007 or the full year 2010
3. Intervals of time, indicated by a start and end timestamp. Periods can be thought of as special cases of intervals.
4. Experiment or elapsed time; each timestamp is a measure of time relative to a particular start time (e.g., the diameter of a cookie baking each second since being placed in the oven)
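
As a rough sketch, the four kinds of time labels listed above can each be represented by a pandas object. The specific dates and variable names below are only illustrative assumptions, not taken from the text.

```python
import pandas as pd

# 1. Timestamp: a specific instant in time
instant = pd.Timestamp('2007-01-15 09:30')

# 2. Fixed period: the whole month of January 2007
month = pd.Period('2007-01', freq='M')

# 3. Interval: an explicit start and end timestamp (start included, end excluded)
interval = pd.Interval(pd.Timestamp('2007-01-01'),
                       pd.Timestamp('2007-02-01'), closed='left')

# 4. Elapsed time: a duration measured relative to some start time
elapsed = pd.Timedelta(seconds=90)

print(month.start_time, month.end_time)  # the span covered by the period
print(instant in interval)               # membership test on the interval
print(instant + elapsed)                 # an instant shifted by a duration
```

A Period behaves like an interval whose endpoints are implied by its frequency, which is why the period can be viewed as a special case of the interval.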


In [1]:
import numpy as np
import pandas as pd

In [2]:
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline

# 11.1 Date and Time Data Types and Tools

In [3]:
from datetime import datetime

In [4]:
now = datetime.now()

In [5]:
print(now)

2020-07-22 10:28:51.471055


In [6]:
now.year

2020

In [7]:
now.day

22

In [8]:
now.month

7

In [9]:
now.year, now.day, now.month

(2020, 22, 7)

In [10]:
delta = datetime(2020, 7, 20) - datetime(2015, 2, 15, 7, 18)
delta

datetime.timedelta(days=1981, seconds=60120)

In [11]:
delta.days

1981

In [12]:
delta.seconds

60120

In [13]:
from datetime import timedelta

In [14]:
start = datetime(1998, 8, 24)

In [15]:
start + timedelta(-2)

datetime.datetime(1998, 8, 22, 0, 0)

In [16]:
start - 2 * timedelta(-2)

datetime.datetime(1998, 8, 28, 0, 0)

*See Table 11-1. Types in datetime module*

![Types in datetime module](Img/11.1.png)

## Converting Between String and Datetime

In [17]:
stamp = datetime(2020, 8, 24)

In [18]:
str(stamp)

'2020-08-24 00:00:00'

In [19]:
# Format the datetime as a string with strftime
stamp.strftime('%Y-%m-%d')

'2020-08-24'

*See Table 11-2 for a complete list of the format codes present in Chp2*

In [20]:
value = '2020-08-24'

In [21]:
stamp = datetime.strptime(value, '%Y-%m-%d')

In [22]:
stamp

datetime.datetime(2020, 8, 24, 0, 0)

In [23]:
datelists = ['2020-02-24', '2019-08-15', '2018-04-18']

In [24]:
[datetime.strptime(x, '%Y-%m-%d') for x in datelists]   

[datetime.datetime(2020, 2, 24, 0, 0),
 datetime.datetime(2019, 8, 15, 0, 0),
 datetime.datetime(2018, 4, 18, 0, 0)]

**datetime.strptime** is a good way to parse a date with a known format. However, it
can be a bit annoying to have to write a format spec each time, especially for common
date formats. In this case, you can use the **parser.parse** method in the third-party
dateutil package (this is installed automatically when you install pandas):

In [25]:
from dateutil.parser import parse

In [26]:
parse('1998-08-24')

datetime.datetime(1998, 8, 24, 0, 0)

In [27]:
parse('August 24 1998 4:03 AM')

datetime.datetime(1998, 8, 24, 4, 3)

In [28]:
parse('26/12/2011')

datetime.datetime(2011, 12, 26, 0, 0)

In [29]:
parse('6/12/2011', dayfirst=True)

datetime.datetime(2011, 12, 6, 0, 0)

In [30]:
datestrs = ['2011-07-06 12:00:00', '2011-08-06 00:00:00']

In [31]:
stamp = pd.to_datetime(datestrs)
stamp

DatetimeIndex(['2011-07-06 12:00:00', '2011-08-06 00:00:00'], dtype='datetime64[ns]', freq=None)

In [32]:
stamp = pd.to_datetime(datestrs + [None])
stamp

DatetimeIndex(['2011-07-06 12:00:00', '2011-08-06 00:00:00', 'NaT'], dtype='datetime64[ns]', freq=None)

In [33]:
stamp[2]

NaT

**NaT** (Not a Time) is pandas’s null value for timestamp data.
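
A minimal sketch of how NaT might be detected in practice. The sample strings are illustrative; the point is that NaT, like NaN, does not compare equal to itself, so `isna` is the safer check.

```python
import pandas as pd

idx = pd.to_datetime(['2011-07-06 12:00:00', None])

# isna() returns a boolean array marking the missing timestamps
mask = idx.isna()
print(mask)

# NaT is not equal to itself, so an == comparison cannot find it
print(pd.NaT == pd.NaT)

# pd.isna works on a single element as well
print(pd.isna(idx[1]))
```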

*See Table 11-3. Locale-specific date formatting*

![Locale-specific date formatting](Img/11.3.png)

# 11.2 Time Series Basics

A basic kind of time series object in pandas is a Series indexed by timestamps, which
are often represented outside of pandas as Python strings or datetime objects:

In [34]:
dates = [datetime(2011, 1, 2), datetime(2011, 1, 5), datetime(2011, 1, 7),
         datetime(2011, 1, 8), datetime(2011, 1, 10), datetime(2011, 1, 12)]

In [35]:
ts = pd.Series(np.random.randn(6), index=dates)

In [36]:
ts

2011-01-02   -1.003231
2011-01-05   -1.496920
2011-01-07    1.048115
2011-01-08   -0.147966
2011-01-10   -1.547044
2011-01-12   -0.609347
dtype: float64

In [37]:
ts.index

DatetimeIndex(['2011-01-02', '2011-01-05', '2011-01-07', '2011-01-08',
               '2011-01-10', '2011-01-12'],
              dtype='datetime64[ns]', freq=None)

In [38]:
ts[::2]

2011-01-02   -1.003231
2011-01-07    1.048115
2011-01-10   -1.547044
dtype: float64

In [39]:
ts + ts[::2]

2011-01-02   -2.006462
2011-01-05         NaN
2011-01-07    2.096229
2011-01-08         NaN
2011-01-10   -3.094089
2011-01-12         NaN
dtype: float64

In [40]:
ts.index.dtype

dtype('<M8[ns]')

In [41]:
stamp = ts.index[0]

In [42]:
stamp

Timestamp('2011-01-02 00:00:00')

## Indexing, Selection, Subsetting

In [43]:
stamp = ts.index[2]

In [44]:
ts[stamp]

1.04811451630194

In [45]:
ts

2011-01-02   -1.003231
2011-01-05   -1.496920
2011-01-07    1.048115
2011-01-08   -0.147966
2011-01-10   -1.547044
2011-01-12   -0.609347
dtype: float64

In [46]:
ts['1/10/2011']  # you can also pass a string that is interpretable as a date

-1.5470443682420807

In [47]:
ts['20110110']  # a compact date string selects the same entry

-1.5470443682420807

In [48]:
lts = pd.Series(np.random.randn(1000), index=pd.date_range('1/1/2010', periods=1000))

In [49]:
lts

2010-01-01   -0.237766
2010-01-02   -0.205763
2010-01-03   -1.325449
2010-01-04   -0.178484
2010-01-05   -1.082852
                ...   
2012-09-22   -2.275846
2012-09-23    0.484376
2012-09-24   -0.224740
2012-09-25    2.769927
2012-09-26   -0.777445
Freq: D, Length: 1000, dtype: float64

In [50]:
lts['2011']

2011-01-01    0.971550
2011-01-02    1.764501
2011-01-03    0.111590
2011-01-04    0.757759
2011-01-05    0.119378
                ...   
2011-12-27   -0.880609
2011-12-28    0.816993
2011-12-29   -1.166560
2011-12-30   -0.654165
2011-12-31   -0.317054
Freq: D, Length: 365, dtype: float64

In [51]:
lts['2011-08']

2011-08-01   -0.072023
2011-08-02    0.260193
2011-08-03   -0.857947
2011-08-04   -0.985321
2011-08-05   -1.035861
2011-08-06    0.947934
2011-08-07    0.322522
2011-08-08   -0.447591
2011-08-09   -0.437198
2011-08-10   -0.130740
2011-08-11    0.285199
2011-08-12    0.893606
2011-08-13    0.296886
2011-08-14   -0.909039
2011-08-15    0.366267
2011-08-16   -1.306282
2011-08-17   -0.314006
2011-08-18   -0.356406
2011-08-19    0.052704
2011-08-20    1.526965
2011-08-21   -2.570644
2011-08-22    0.738963
2011-08-23   -0.926996
2011-08-24    3.470344
2011-08-25   -0.772374
2011-08-26   -0.265936
2011-08-27    0.832243
2011-08-28   -0.863703
2011-08-29    0.772414
2011-08-30   -0.554472
2011-08-31   -1.619850
Freq: D, dtype: float64

In [52]:
lts[datetime(2011, 6, 1):]

2011-06-01   -1.276456
2011-06-02    0.233991
2011-06-03    0.881760
2011-06-04   -0.382703
2011-06-05    1.472987
                ...   
2012-09-22   -2.275846
2012-09-23    0.484376
2012-09-24   -0.224740
2012-09-25    2.769927
2012-09-26   -0.777445
Freq: D, Length: 484, dtype: float64

In [53]:
lts['2011-06-01':]

2011-06-01   -1.276456
2011-06-02    0.233991
2011-06-03    0.881760
2011-06-04   -0.382703
2011-06-05    1.472987
                ...   
2012-09-22   -2.275846
2012-09-23    0.484376
2012-09-24   -0.224740
2012-09-25    2.769927
2012-09-26   -0.777445
Freq: D, Length: 484, dtype: float64

In [54]:
lts['2011-08-25':'2012-08-24']

2011-08-25   -0.772374
2011-08-26   -0.265936
2011-08-27    0.832243
2011-08-28   -0.863703
2011-08-29    0.772414
                ...   
2012-08-20    0.832725
2012-08-21    0.940275
2012-08-22    0.692549
2012-08-23   -0.637598
2012-08-24   -0.407290
Freq: D, Length: 366, dtype: float64

As before, you can pass a string date, a **datetime**, or a Timestamp. Remember that
slicing in this manner produces views on the source time series, like slicing NumPy
arrays. This means that no data is copied and modifications on the slice will be reflected
in the original data. Note that this describes pandas without Copy-on-Write; when
Copy-on-Write is enabled, writing to the slice leaves the original data unchanged.

There is an equivalent instance method, truncate, that slices a Series between two
dates:

In [55]:
lts.truncate(before='2012-08-01')

2012-08-01    0.448356
2012-08-02    0.124522
2012-08-03    1.953035
2012-08-04    0.427994
2012-08-05    0.790907
2012-08-06    1.518593
2012-08-07    0.250939
2012-08-08    1.275271
2012-08-09   -0.494953
2012-08-10    0.622048
2012-08-11    0.596488
2012-08-12   -1.189778
2012-08-13   -0.388949
2012-08-14    2.245498
2012-08-15    1.789944
2012-08-16   -1.180911
2012-08-17   -2.958566
2012-08-18    1.396545
2012-08-19    0.726929
2012-08-20    0.832725
2012-08-21    0.940275
2012-08-22    0.692549
2012-08-23   -0.637598
2012-08-24   -0.407290
2012-08-25    0.838208
2012-08-26   -0.641766
2012-08-27    1.837654
2012-08-28   -0.849455
2012-08-29   -0.246347
2012-08-30   -0.409388
2012-08-31   -0.454661
2012-09-01    0.365459
2012-09-02   -0.759928
2012-09-03   -0.755244
2012-09-04   -0.280597
2012-09-05   -0.068298
2012-09-06   -0.535106
2012-09-07   -0.926304
2012-09-08   -0.324994
2012-09-09   -0.070286
2012-09-10    1.127199
2012-09-11    0.749904
2012-09-12    0.269056
2012-09-13 
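
To illustrate slicing between two dates, here is a small sketch that passes both `before` and `after` to truncate and compares the result with a label slice. The Series `daily` and its values are made up for the example.

```python
import numpy as np
import pandas as pd

daily = pd.Series(np.arange(10.0),
                  index=pd.date_range('2012-08-01', periods=10))

# truncate keeps everything from `before` through `after`, both ends included
clipped = daily.truncate(before='2012-08-03', after='2012-08-05')

# the equivalent label-based slice
sliced = daily['2012-08-03':'2012-08-05']

print(clipped)
print(clipped.equals(sliced))
```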

In [56]:
dates = pd.date_range('2012-08-05', periods=100, freq='W-WED')

In [57]:
long_df = pd.DataFrame(np.random.randn(100,4), index=dates, columns=['NY', 'KHI', 'DEL', 'ANK'])

In [58]:
long_df

                  NY       KHI       DEL       ANK
2012-08-08 -1.036047  2.637510  0.437063  0.046689
2012-08-15  1.782631 -0.375070  0.603835 -0.231214
2012-08-22 -0.027000  0.023360  1.101765  0.762517
2012-08-29  1.418819 -1.207805 -2.119166 -0.079623
2012-09-05  0.800157 -0.456457 -0.531727  0.192759
...              ...       ...       ...       ...
2014-06-04  0.135334  0.512010  0.072383 -0.460687
2014-06-11 -1.189303  1.227403  1.764791 -0.090979
2014-06-18 -1.723334 -0.467451  1.680970 -0.740675
2014-06-25  0.404951 -0.283153 -0.324911 -0.442905


In [59]:
long_df.loc['2013-08']

                  NY       KHI       DEL       ANK
2013-08-07 -0.318618 -0.127524  1.559932  1.673422
2013-08-14 -0.561744  1.461550 -0.994221  0.989295
2013-08-21 -1.277445 -0.206639  1.803574 -1.574962
2013-08-28  0.373524  0.666678  1.312168 -0.391453


## Time Series with Duplicate Indices

In some applications, there may be multiple data observations falling on a particular
timestamp:

In [60]:
dates = pd.DatetimeIndex(['1/1/2000', '1/2/2000', '1/2/2000', '1/2/2000', '1/3/2000'])

In [61]:
dates

DatetimeIndex(['2000-01-01', '2000-01-02', '2000-01-02', '2000-01-02',
               '2000-01-03'],
              dtype='datetime64[ns]', freq=None)

In [62]:
dup_ts = pd.Series(np.random.randn(len(dates)), index=dates)
dup_ts

2000-01-01    1.753368
2000-01-02   -0.354485
2000-01-02    1.441901
2000-01-02   -0.207880
2000-01-03    0.775163
dtype: float64

In [63]:
dup_ts.index.is_unique  # False means the index contains duplicate values

False

In [64]:
dup_ts['2000-01-02']

2000-01-02   -0.354485
2000-01-02    1.441901
2000-01-02   -0.207880
dtype: float64

In [65]:
dup_ts.groupby(level=0).count()

2000-01-01    1
2000-01-02    3
2000-01-03    1
dtype: int64
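
Beyond counting, the same `groupby(level=0)` pattern can collapse duplicate timestamps into a single value. The sketch below uses the mean as the aggregation, which is an assumption; the right choice depends on the data. The values are fixed rather than random so the result is easy to follow.

```python
import pandas as pd

dates = pd.DatetimeIndex(['2000-01-01', '2000-01-02', '2000-01-02',
                          '2000-01-02', '2000-01-03'])
dup = pd.Series([1.0, 2.0, 4.0, 6.0, 5.0], index=dates)

# level=0 groups on the index values, so rows sharing a timestamp are averaged
collapsed = dup.groupby(level=0).mean()

print(collapsed)
print(collapsed.index.is_unique)
```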

# 11.3 Date Ranges, Frequencies, and Shifting

In [66]:
ts = long_df['NY']

In [67]:
ts

2012-08-08   -1.036047
2012-08-15    1.782631
2012-08-22   -0.027000
2012-08-29    1.418819
2012-09-05    0.800157
                ...   
2014-06-04    0.135334
2014-06-11   -1.189303
2014-06-18   -1.723334
2014-06-25    0.404951
2014-07-02    2.029791
Freq: W-WED, Name: NY, Length: 100, dtype: float64

In [68]:
resampler = ts.resample('D') #The string 'D' is interpreted as daily frequency.

## Generating Date Ranges

In [69]:
index = pd.date_range(start='1998-08-24', end='1998-12-24')

In [70]:
index

DatetimeIndex(['1998-08-24', '1998-08-25', '1998-08-26', '1998-08-27',
               '1998-08-28', '1998-08-29', '1998-08-30', '1998-08-31',
               '1998-09-01', '1998-09-02',
               ...
               '1998-12-15', '1998-12-16', '1998-12-17', '1998-12-18',
               '1998-12-19', '1998-12-20', '1998-12-21', '1998-12-22',
               '1998-12-23', '1998-12-24'],
              dtype='datetime64[ns]', length=123, freq='D')

In [71]:
pd.date_range(start='1998-08-24', periods=20)

DatetimeIndex(['1998-08-24', '1998-08-25', '1998-08-26', '1998-08-27',
               '1998-08-28', '1998-08-29', '1998-08-30', '1998-08-31',
               '1998-09-01', '1998-09-02', '1998-09-03', '1998-09-04',
               '1998-09-05', '1998-09-06', '1998-09-07', '1998-09-08',
               '1998-09-09', '1998-09-10', '1998-09-11', '1998-09-12'],
              dtype='datetime64[ns]', freq='D')

In [72]:
pd.date_range(end='2020-07-20', periods=366)

DatetimeIndex(['2019-07-21', '2019-07-22', '2019-07-23', '2019-07-24',
               '2019-07-25', '2019-07-26', '2019-07-27', '2019-07-28',
               '2019-07-29', '2019-07-30',
               ...
               '2020-07-11', '2020-07-12', '2020-07-13', '2020-07-14',
               '2020-07-15', '2020-07-16', '2020-07-17', '2020-07-18',
               '2020-07-19', '2020-07-20'],
              dtype='datetime64[ns]', length=366, freq='D')

In [73]:
# 'BM' means business month end (pandas 2.2 and later spell this alias 'BME')
pd.date_range(end='2020-07-20', periods=15, freq='BM')

DatetimeIndex(['2019-04-30', '2019-05-31', '2019-06-28', '2019-07-31',
               '2019-08-30', '2019-09-30', '2019-10-31', '2019-11-29',
               '2019-12-31', '2020-01-31', '2020-02-28', '2020-03-31',
               '2020-04-30', '2020-05-29', '2020-06-30'],
              dtype='datetime64[ns]', freq='BM')

*See Table 11-4. Base time series frequencies (not comprehensive)*

![Base time series frequencies](Img/11.4.png)

In [74]:
pd.date_range(start='20/07/2020 13:02:54', periods=10)

DatetimeIndex(['2020-07-20 13:02:54', '2020-07-21 13:02:54',
               '2020-07-22 13:02:54', '2020-07-23 13:02:54',
               '2020-07-24 13:02:54', '2020-07-25 13:02:54',
               '2020-07-26 13:02:54', '2020-07-27 13:02:54',
               '2020-07-28 13:02:54', '2020-07-29 13:02:54'],
              dtype='datetime64[ns]', freq='D')

In [75]:
pd.date_range(start='20/07/2020 13:02:54', periods=10, normalize=True)

DatetimeIndex(['2020-07-20', '2020-07-21', '2020-07-22', '2020-07-23',
               '2020-07-24', '2020-07-25', '2020-07-26', '2020-07-27',
               '2020-07-28', '2020-07-29'],
              dtype='datetime64[ns]', freq='D')

## Frequencies and Date Offsets

In [76]:
pd.date_range('20/07/2020', '23/07/2020', freq='4H')

DatetimeIndex(['2020-07-20 00:00:00', '2020-07-20 04:00:00',
               '2020-07-20 08:00:00', '2020-07-20 12:00:00',
               '2020-07-20 16:00:00', '2020-07-20 20:00:00',
               '2020-07-21 00:00:00', '2020-07-21 04:00:00',
               '2020-07-21 08:00:00', '2020-07-21 12:00:00',
               '2020-07-21 16:00:00', '2020-07-21 20:00:00',
               '2020-07-22 00:00:00', '2020-07-22 04:00:00',
               '2020-07-22 08:00:00', '2020-07-22 12:00:00',
               '2020-07-22 16:00:00', '2020-07-22 20:00:00',
               '2020-07-23 00:00:00'],
              dtype='datetime64[ns]', freq='4H')

In [77]:
pd.date_range('20/07/2020', '23/07/2020', freq='1H30Min')

DatetimeIndex(['2020-07-20 00:00:00', '2020-07-20 01:30:00',
               '2020-07-20 03:00:00', '2020-07-20 04:30:00',
               '2020-07-20 06:00:00', '2020-07-20 07:30:00',
               '2020-07-20 09:00:00', '2020-07-20 10:30:00',
               '2020-07-20 12:00:00', '2020-07-20 13:30:00',
               '2020-07-20 15:00:00', '2020-07-20 16:30:00',
               '2020-07-20 18:00:00', '2020-07-20 19:30:00',
               '2020-07-20 21:00:00', '2020-07-20 22:30:00',
               '2020-07-21 00:00:00', '2020-07-21 01:30:00',
               '2020-07-21 03:00:00', '2020-07-21 04:30:00',
               '2020-07-21 06:00:00', '2020-07-21 07:30:00',
               '2020-07-21 09:00:00', '2020-07-21 10:30:00',
               '2020-07-21 12:00:00', '2020-07-21 13:30:00',
               '2020-07-21 15:00:00', '2020-07-21 16:30:00',
               '2020-07-21 18:00:00', '2020-07-21 19:30:00',
               '2020-07-21 21:00:00', '2020-07-21 22:30:00',
               '2020-07-

### Week of month dates

One useful frequency class is “week of month,” starting with WOM. This enables you to
get dates like the third Friday of each month (WOM-3FRI). The example here uses
WOM-1FRI, the first Friday of each month:

In [78]:
pd.date_range('2020-01-21', '2020-12-21', freq='WOM-1FRI')

DatetimeIndex(['2020-02-07', '2020-03-06', '2020-04-03', '2020-05-01',
               '2020-06-05', '2020-07-03', '2020-08-07', '2020-09-04',
               '2020-10-02', '2020-11-06', '2020-12-04'],
              dtype='datetime64[ns]', freq='WOM-1FRI')
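
For the third Friday of each month specifically, the alias would be `WOM-3FRI`. A short sketch over the first half of 2020 (the date range is chosen only for illustration):

```python
import pandas as pd

# 'WOM-3FRI' selects the third Friday of every month in the range
third_fridays = pd.date_range('2020-01-01', '2020-06-30', freq='WOM-3FRI')

print(third_fridays)

# every date is a Friday (dayofweek == 4) falling on the 15th through the 21st
print((third_fridays.dayofweek == 4).all())
print(((third_fridays.day >= 15) & (third_fridays.day <= 21)).all())
```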

## Shifting (Leading and Lagging) Data

“Shifting” refers to moving data backward and forward through time. Both Series and
DataFrame have a shift method for doing naive shifts forward or backward, leaving
the index unmodified:

In [80]:
ts = pd.Series(np.random.randn(6), index=pd.date_range('1/1/2000', periods=6, freq='M'))

In [81]:
ts

2000-01-31   -0.036674
2000-02-29   -1.313310
2000-03-31    0.446429
2000-04-30   -0.690893
2000-05-31   -0.339031
2000-06-30   -0.824358
Freq: M, dtype: float64

In [82]:
ts.shift(2)

2000-01-31         NaN
2000-02-29         NaN
2000-03-31   -0.036674
2000-04-30   -1.313310
2000-05-31    0.446429
2000-06-30   -0.690893
Freq: M, dtype: float64

In [83]:
ts.shift(-2)

2000-01-31    0.446429
2000-02-29   -0.690893
2000-03-31   -0.339031
2000-04-30   -0.824358
2000-05-31         NaN
2000-06-30         NaN
Freq: M, dtype: float64

In [84]:
ts / ts.shift(1) - 1    

2000-01-31          NaN
2000-02-29    34.810729
2000-03-31    -1.339927
2000-04-30    -2.547598
2000-05-31    -0.509286
2000-06-30     1.431513
Freq: M, dtype: float64
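
The expression `ts / ts.shift(1) - 1` computes the percent change from one period to the next. As a sketch, the built-in `pct_change` method should give the same numbers; the `prices` Series here is a made-up example with fixed values.

```python
import numpy as np
import pandas as pd

prices = pd.Series([100.0, 110.0, 99.0, 99.0],
                   index=pd.date_range('2000-01-01', periods=4, freq='D'))

# manual percent change using a one-step shift
manual = prices / prices.shift(1) - 1

# the built-in equivalent
builtin = prices.pct_change()

print(manual)
print(np.allclose(manual, builtin, equal_nan=True))
```

The first entry is NaN in both cases because there is no earlier observation to compare against.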

In [87]:
# If the frequency is known, it can be passed to shift to advance the timestamps instead of the data
ts.shift(2, freq='M')

2000-03-31   -0.036674
2000-04-30   -1.313310
2000-05-31    0.446429
2000-06-30   -0.690893
2000-07-31   -0.339031
2000-08-31   -0.824358
Freq: M, dtype: float64

In [88]:
ts.shift(3, freq='D')

2000-02-03   -0.036674
2000-03-03   -1.313310
2000-04-03    0.446429
2000-05-03   -0.690893
2000-06-03   -0.339031
2000-07-03   -0.824358
dtype: float64

In [89]:
ts.shift(1, freq='90T')  # 'T' is the alias for minutes, so '90T' is 90 minutes (newer pandas prefers 'min')

2000-01-31 01:30:00   -0.036674
2000-02-29 01:30:00   -1.313310
2000-03-31 01:30:00    0.446429
2000-04-30 01:30:00   -0.690893
2000-05-31 01:30:00   -0.339031
2000-06-30 01:30:00   -0.824358
Freq: M, dtype: float64

### Shifting dates with offsets

In [90]:
from pandas.tseries.offsets import MonthEnd, Day

In [99]:
now = datetime(2020, 11, 17)
now

datetime.datetime(2020, 11, 17, 0, 0)

In [94]:
now + 3 * Day()

Timestamp('2020-11-20 00:00:00')

In [98]:
now + MonthEnd()

Timestamp('2020-11-30 00:00:00')

In [102]:
now + MonthEnd(2)

Timestamp('2020-12-31 00:00:00')

In [103]:
now + MonthEnd(3)

Timestamp('2021-01-31 00:00:00')

In [104]:
offset = MonthEnd()

In [105]:
offset.rollforward(now)

Timestamp('2020-11-30 00:00:00')

In [106]:
offset.rollback(now)

Timestamp('2020-10-31 00:00:00')

In [107]:
ts = pd.Series(np.random.randn(10), index=pd.date_range('22/07/2020', periods=10, freq='4d'))

In [108]:
ts

2020-07-22   -0.149694
2020-07-26    0.992339
2020-07-30    0.910226
2020-08-03    0.496291
2020-08-07    0.118034
2020-08-11    0.144852
2020-08-15   -2.548091
2020-08-19    0.114010
2020-08-23    2.199712
2020-08-27   -1.314360
Freq: 4D, dtype: float64

In [110]:
ts.groupby(offset.rollforward).count()

2020-07-31    3
2020-08-31    7
dtype: int64

In [111]:
ts.groupby(offset.rollforward).mean()

2020-07-31    0.584290
2020-08-31   -0.112793
dtype: float64

In [113]:
ts.resample('M').mean()

2020-07-31    0.584290
2020-08-31   -0.112793
Freq: M, dtype: float64

In [114]:
ts.resample('M').count()

2020-07-31    3
2020-08-31    7
Freq: M, dtype: int64

# 11.4 Time Zone Handling

In [115]:
import pytz

In [117]:
pytz.common_timezones[-5:]

['US/Eastern', 'US/Hawaii', 'US/Mountain', 'US/Pacific', 'UTC']

In [121]:
tz = pytz.timezone('America/New_York')
tz

<DstTzInfo 'America/New_York' LMT-1 day, 19:04:00 STD>

## Time Zone Localization and Conversion

In [142]:
ts = pd.Series(np.random.randn(10), index=pd.date_range('22/07/2020 11:45', periods=10, freq='D'))

In [143]:
ts

2020-07-22 11:45:00   -1.361026
2020-07-23 11:45:00    0.213966
2020-07-24 11:45:00    0.920189
2020-07-25 11:45:00    0.511554
2020-07-26 11:45:00   -0.862784
2020-07-27 11:45:00   -0.447586
2020-07-28 11:45:00    0.240678
2020-07-29 11:45:00   -0.772703
2020-07-30 11:45:00   -0.786570
2020-07-31 11:45:00   -0.481367
Freq: D, dtype: float64

In [144]:
print(ts.index.tz)

None


In [145]:
ts_UTC = pd.Series(np.random.randn(10), index=pd.date_range('22/07/2020 11:45', periods=10, freq='D', tz='UTC'))

In [146]:
ts_UTC

2020-07-22 11:45:00+00:00    0.327405
2020-07-23 11:45:00+00:00    1.070646
2020-07-24 11:45:00+00:00    0.691818
2020-07-25 11:45:00+00:00   -0.228348
2020-07-26 11:45:00+00:00   -0.963303
2020-07-27 11:45:00+00:00    0.543780
2020-07-28 11:45:00+00:00   -1.185834
2020-07-29 11:45:00+00:00   -0.997104
2020-07-30 11:45:00+00:00    0.585770
2020-07-31 11:45:00+00:00    0.258532
Freq: D, dtype: float64

Conversion from naive to localized is handled by the tz_localize method:

In [147]:
ts_UTC = ts.tz_localize('UTC')

In [148]:
ts_UTC

2020-07-22 11:45:00+00:00   -1.361026
2020-07-23 11:45:00+00:00    0.213966
2020-07-24 11:45:00+00:00    0.920189
2020-07-25 11:45:00+00:00    0.511554
2020-07-26 11:45:00+00:00   -0.862784
2020-07-27 11:45:00+00:00   -0.447586
2020-07-28 11:45:00+00:00    0.240678
2020-07-29 11:45:00+00:00   -0.772703
2020-07-30 11:45:00+00:00   -0.786570
2020-07-31 11:45:00+00:00   -0.481367
Freq: D, dtype: float64

In [152]:
ts_UTC.index.tz

<UTC>

Once a time series has been localized to a particular time zone, it can be converted to
another time zone with tz_convert:

In [153]:
ts_UTC.tz_convert('America/New_York')

2020-07-22 07:45:00-04:00   -1.361026
2020-07-23 07:45:00-04:00    0.213966
2020-07-24 07:45:00-04:00    0.920189
2020-07-25 07:45:00-04:00    0.511554
2020-07-26 07:45:00-04:00   -0.862784
2020-07-27 07:45:00-04:00   -0.447586
2020-07-28 07:45:00-04:00    0.240678
2020-07-29 07:45:00-04:00   -0.772703
2020-07-30 07:45:00-04:00   -0.786570
2020-07-31 07:45:00-04:00   -0.481367
Freq: D, dtype: float64

In [154]:
ts_eastern = ts.tz_localize('America/New_York')

In [155]:
ts_eastern

2020-07-22 11:45:00-04:00   -1.361026
2020-07-23 11:45:00-04:00    0.213966
2020-07-24 11:45:00-04:00    0.920189
2020-07-25 11:45:00-04:00    0.511554
2020-07-26 11:45:00-04:00   -0.862784
2020-07-27 11:45:00-04:00   -0.447586
2020-07-28 11:45:00-04:00    0.240678
2020-07-29 11:45:00-04:00   -0.772703
2020-07-30 11:45:00-04:00   -0.786570
2020-07-31 11:45:00-04:00   -0.481367
Freq: D, dtype: float64

In [156]:
ts_eastern.tz_convert('UTC')

2020-07-22 15:45:00+00:00   -1.361026
2020-07-23 15:45:00+00:00    0.213966
2020-07-24 15:45:00+00:00    0.920189
2020-07-25 15:45:00+00:00    0.511554
2020-07-26 15:45:00+00:00   -0.862784
2020-07-27 15:45:00+00:00   -0.447586
2020-07-28 15:45:00+00:00    0.240678
2020-07-29 15:45:00+00:00   -0.772703
2020-07-30 15:45:00+00:00   -0.786570
2020-07-31 15:45:00+00:00   -0.481367
Freq: D, dtype: float64

In [157]:
ts_eastern.tz_convert('Europe/Berlin')

2020-07-22 17:45:00+02:00   -1.361026
2020-07-23 17:45:00+02:00    0.213966
2020-07-24 17:45:00+02:00    0.920189
2020-07-25 17:45:00+02:00    0.511554
2020-07-26 17:45:00+02:00   -0.862784
2020-07-27 17:45:00+02:00   -0.447586
2020-07-28 17:45:00+02:00    0.240678
2020-07-29 17:45:00+02:00   -0.772703
2020-07-30 17:45:00+02:00   -0.786570
2020-07-31 17:45:00+02:00   -0.481367
Freq: D, dtype: float64

In [162]:
ts_eastern.tz_convert('Asia/Karachi')

2020-07-22 20:45:00+05:00   -1.361026
2020-07-23 20:45:00+05:00    0.213966
2020-07-24 20:45:00+05:00    0.920189
2020-07-25 20:45:00+05:00    0.511554
2020-07-26 20:45:00+05:00   -0.862784
2020-07-27 20:45:00+05:00   -0.447586
2020-07-28 20:45:00+05:00    0.240678
2020-07-29 20:45:00+05:00   -0.772703
2020-07-30 20:45:00+05:00   -0.786570
2020-07-31 20:45:00+05:00   -0.481367
Freq: D, dtype: float64

In [165]:
ts_eastern.tz_convert('Asia/Shanghai')

2020-07-22 23:45:00+08:00   -1.361026
2020-07-23 23:45:00+08:00    0.213966
2020-07-24 23:45:00+08:00    0.920189
2020-07-25 23:45:00+08:00    0.511554
2020-07-26 23:45:00+08:00   -0.862784
2020-07-27 23:45:00+08:00   -0.447586
2020-07-28 23:45:00+08:00    0.240678
2020-07-29 23:45:00+08:00   -0.772703
2020-07-30 23:45:00+08:00   -0.786570
2020-07-31 23:45:00+08:00   -0.481367
Freq: D, dtype: float64

## Operations with Time Zone-Aware Timestamp Objects

In [166]:
stamp = pd.Timestamp('2020-08-24 04:00')

In [170]:
stamp_utc = stamp.tz_localize('UTC')
stamp_utc

Timestamp('2020-08-24 04:00:00+0000', tz='UTC')

In [171]:
stamp_utc.tz_convert('Asia/Karachi')

Timestamp('2020-08-24 09:00:00+0500', tz='Asia/Karachi')

In [178]:
stamp_KHI = pd.Timestamp('2020-08-24 04:00', tz='Asia/Karachi')
stamp_KHI

Timestamp('2020-08-24 04:00:00+0500', tz='Asia/Karachi')

Time zone–aware Timestamp objects internally store a UTC timestamp value as nanoseconds
since the Unix epoch (January 1, 1970); this UTC value is invariant between
time zone conversions:

In [176]:
stamp_utc.value

1598241600000000000

In [177]:
stamp_utc.tz_convert('America/New_York').value

1598241600000000000
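
One consequence of the shared UTC value, sketched below with illustrative variable names: two Timestamps in different time zones that refer to the same instant compare equal, and their difference is zero.

```python
import pandas as pd

utc = pd.Timestamp('2020-08-24 04:00', tz='UTC')
karachi = utc.tz_convert('Asia/Karachi')

# different wall-clock readings, same underlying instant
print(utc, karachi)
print(utc == karachi)
print(utc.value == karachi.value)

# subtracting the two gives a zero-length Timedelta
print(karachi - utc)
```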

In [179]:
from pandas.tseries.offsets import Hour

In [183]:
t_stamp = pd.Timestamp('2020-08-24 09:00', tz='Asia/Karachi')

In [184]:
t_stamp

Timestamp('2020-08-24 09:00:00+0500', tz='Asia/Karachi')

In [185]:
t_stamp + Hour()

Timestamp('2020-08-24 10:00:00+0500', tz='Asia/Karachi')

In [189]:
t_stamp + Hour(15)

Timestamp('2020-08-25 00:00:00+0500', tz='Asia/Karachi')

In [196]:
stamp = pd.Timestamp('2012-11-04 00:30', tz='US/Eastern')

In [197]:
stamp

Timestamp('2012-11-04 00:30:00-0400', tz='US/Eastern')

In [198]:
stamp + 2 * Hour()

Timestamp('2012-11-04 01:30:00-0500', tz='US/Eastern')

## Operations Between Different Time Zones

In [199]:
ts = pd.Series(np.random.randn(10), index=pd.date_range('22/07/2020 9:30', freq='B', periods=10))

In [200]:
ts

2020-07-22 09:30:00    1.179796
2020-07-23 09:30:00    0.437467
2020-07-24 09:30:00   -0.156221
2020-07-27 09:30:00    0.462077
2020-07-28 09:30:00   -0.831943
2020-07-29 09:30:00   -0.338552
2020-07-30 09:30:00    0.194236
2020-07-31 09:30:00   -0.181359
2020-08-03 09:30:00    0.481387
2020-08-04 09:30:00   -1.762461
Freq: B, dtype: float64

In [203]:
ts1 = ts[:7].tz_localize('US/Eastern')
ts1

2020-07-22 09:30:00-04:00    1.179796
2020-07-23 09:30:00-04:00    0.437467
2020-07-24 09:30:00-04:00   -0.156221
2020-07-27 09:30:00-04:00    0.462077
2020-07-28 09:30:00-04:00   -0.831943
2020-07-29 09:30:00-04:00   -0.338552
2020-07-30 09:30:00-04:00    0.194236
Freq: B, dtype: float64

In [204]:
ts2 = ts[2:].tz_localize('US/Eastern')
ts2

2020-07-24 09:30:00-04:00   -0.156221
2020-07-27 09:30:00-04:00    0.462077
2020-07-28 09:30:00-04:00   -0.831943
2020-07-29 09:30:00-04:00   -0.338552
2020-07-30 09:30:00-04:00    0.194236
2020-07-31 09:30:00-04:00   -0.181359
2020-08-03 09:30:00-04:00    0.481387
2020-08-04 09:30:00-04:00   -1.762461
Freq: B, dtype: float64

In [206]:
result = ts1 + ts2
result.index

DatetimeIndex(['2020-07-22 09:30:00-04:00', '2020-07-23 09:30:00-04:00',
               '2020-07-24 09:30:00-04:00', '2020-07-27 09:30:00-04:00',
               '2020-07-28 09:30:00-04:00', '2020-07-29 09:30:00-04:00',
               '2020-07-30 09:30:00-04:00', '2020-07-31 09:30:00-04:00',
               '2020-08-03 09:30:00-04:00', '2020-08-04 09:30:00-04:00'],
              dtype='datetime64[ns, US/Eastern]', freq='B')
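
In the cells above both series are localized to US/Eastern, so the result keeps that time zone. As a sketch of the mixed case, the example below combines series in two different zones; the zones Europe/London and Europe/Moscow and the values are assumptions chosen for illustration. pandas aligns the data on the underlying instants and returns a UTC index.

```python
import numpy as np
import pandas as pd

idx = pd.date_range('2020-07-22 09:30', periods=10, freq='B')
s = pd.Series(np.arange(10.0), index=idx)

# the first seven observations, localized to London time
london = s[:7].tz_localize('Europe/London')

# an overlapping subset, expressed in Moscow time
moscow = london[2:].tz_convert('Europe/Moscow')

# alignment happens on the instants; the combined index is in UTC
result = london + moscow

print(result.index.tz)
print(result)
```

The first two entries are NaN because those timestamps appear only in `london`.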