# Time Series

__Time series__ data is an important form of structured data in many different fields, such as finance, economics, ecology, neuroscience, and physics. Anything that is observed or measured at many points in time forms a time series. Many time series are _fixed frequency_, which is to say that data points occur at regular intervals according to some rule, such as every 15 seconds, every 5 minutes, or once per month. Time series can also be _irregular_, without a fixed unit of time or offset between units. How you mark and refer to time series data depends on the application, and you may have one of the following:
- __Timestamps__, specific instants in time
- Fixed __periods__, such as the month January 2007 or the full year 2010
- __Intervals__ of time, indicated by a start and end timestamp. Periods can be thought of as special cases of intervals
- Experiment or elapsed time; each timestamp is a measure of time relative to a particular start time (e.g., the diameter of a cookie baking each second since being placed in the oven)
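
The four ways of marking time listed above map fairly directly onto pandas objects. The following is a minimal sketch rather than a canonical mapping; the cookie diameters are made-up values used only for illustration.

```python
import pandas as pd

# Timestamp: one specific instant in time
stamp = pd.Timestamp("2007-01-15 09:30")

# Period: a fixed span, here the whole month of January 2007
month = pd.Period("2007-01", freq="M")

# Interval: an explicit start and end timestamp (closed on the left here)
interval = pd.Interval(pd.Timestamp("2007-01-01"),
                       pd.Timestamp("2007-02-01"), closed="left")

# Elapsed time: offsets measured from the start of an experiment
elapsed = pd.to_timedelta([0, 1, 2, 3], unit="s")
diameter_cm = pd.Series([5.0, 5.1, 5.3, 5.6], index=elapsed)  # made-up values

# Both the period and the interval cover the timestamp
print(month.start_time <= stamp <= month.end_time)
print(stamp in interval)
```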

In this chapter, we are mainly concerned with time series in the first three categories, though many of the techniques can be applied to experimental time series where the index may be an integer or floating-point number indicating elapsed time from the start of the experiment. The simplest and most widely used kind of time series are those indexed by timestamp.

__pandas__ provides many built-in time series tools and data algorithms. You can efficiently work with very large time series and easily slice and dice, aggregate, and resample irregular- and fixed-frequency time series. Some of these tools are especially useful for financial and economics applications, but you could certainly use them to analyze server log data, too.

## Date and Time Data Types and Tools

In [1]:
import pandas as pd
import numpy as np
from datetime import datetime

In [2]:
now = datetime.now()

In [3]:
now

datetime.datetime(2019, 3, 25, 23, 44, 39, 64987)

In [4]:
now.year, now.month, now.day

(2019, 3, 25)

In [5]:
delta = datetime(2011,1,7)-datetime(2008,6,24,8,15)

In [6]:
delta

datetime.timedelta(days=926, seconds=56700)

In [7]:
delta.days

926

In [8]:
delta.seconds

56700

In [9]:
from datetime import timedelta

In [10]:
start = datetime(2011,1,7)

In [11]:
start + timedelta(12)

datetime.datetime(2011, 1, 19, 0, 0)

In [12]:
start - 2*timedelta(12)

datetime.datetime(2010, 12, 14, 0, 0)

### Converting Between String and Datetime

In [13]:
stamp = datetime(2011,1,3)

In [14]:
str(stamp)

'2011-01-03 00:00:00'

In [15]:
stamp.strftime('%Y-%m-%d')

'2011-01-03'

In [16]:
value = '2011-01-03'

In [17]:
datetime.strptime(value, '%Y-%m-%d')

datetime.datetime(2011, 1, 3, 0, 0)

In [18]:
datestrs = ['7/6/2011', '8/6/2011']

In [19]:
[datetime.strptime(x, '%m/%d/%Y') for x in datestrs]

[datetime.datetime(2011, 7, 6, 0, 0), datetime.datetime(2011, 8, 6, 0, 0)]

In [20]:
from dateutil.parser import parse

In [21]:
parse('2011-01-03')

datetime.datetime(2011, 1, 3, 0, 0)

In [22]:
parse('Jan 31, 1997 10:45 PM')

datetime.datetime(1997, 1, 31, 22, 45)

In [23]:
parse('6/12/2011', dayfirst = True)

datetime.datetime(2011, 12, 6, 0, 0)

In [24]:
datestrs = ['2011-07-06 12:00:00', '2011-08-06 00:00:00']

In [25]:
pd.to_datetime(datestrs)

DatetimeIndex(['2011-07-06 12:00:00', '2011-08-06 00:00:00'], dtype='datetime64[ns]', freq=None)

In [26]:
idx = pd.to_datetime(datestrs + [None])

In [27]:
idx

DatetimeIndex(['2011-07-06 12:00:00', '2011-08-06 00:00:00', 'NaT'], dtype='datetime64[ns]', freq=None)

In [28]:
idx[2]

NaT

In [29]:
pd.isnull(idx)

array([False, False,  True])
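
The cells above convert in both directions with format codes and let pandas handle missing values as `NaT`. As a small sketch of how these pieces fit together, the example below round-trips a date through one format string and then parses a list that contains an unparseable entry. The `"not a date"` string is an invented input; `errors="coerce"` is the `pd.to_datetime` option that turns such entries into `NaT` instead of raising.

```python
from datetime import datetime
import pandas as pd

fmt = "%m/%d/%Y"  # month/day/four-digit year

# datetime -> string -> datetime round trip using the same format code
stamp = datetime(2011, 7, 6)
text = stamp.strftime(fmt)
parsed = datetime.strptime(text, fmt)

# Apply one explicit format to a whole list of strings;
# entries that do not match become NaT because of errors="coerce"
raw = ["7/6/2011", "8/6/2011", "not a date"]
idx = pd.to_datetime(raw, format=fmt, errors="coerce")

print(text)
print(idx)
```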

## Time Series Basics

In [30]:
from datetime import datetime

In [31]:
dates = [datetime(2011,1,2), datetime(2011,1,5),
        datetime(2011,1,7), datetime(2011,1,8),
        datetime(2011,1,10),datetime(2011,1,12)]

In [32]:
ts = pd.Series(np.random.randn(6),index=dates)

In [33]:
ts

2011-01-02   -0.297055
2011-01-05    1.830747
2011-01-07   -0.252877
2011-01-08    0.264915
2011-01-10    0.204516
2011-01-12    1.666639
dtype: float64

In [34]:
ts.index

DatetimeIndex(['2011-01-02', '2011-01-05', '2011-01-07', '2011-01-08',
               '2011-01-10', '2011-01-12'],
              dtype='datetime64[ns]', freq=None)

In [35]:
ts + ts[::2]

2011-01-02   -0.594110
2011-01-05         NaN
2011-01-07   -0.505754
2011-01-08         NaN
2011-01-10    0.409032
2011-01-12         NaN
dtype: float64

In [36]:
ts.index.dtype

dtype('<M8[ns]')

In [37]:
stamp = ts.index[0]

In [38]:
stamp

Timestamp('2011-01-02 00:00:00')

### Indexing, Selection, Subsetting

In [39]:
stamp = ts.index[2]

In [40]:
ts[stamp]

-0.25287678976341754

In [41]:
ts['1/10/2011']

0.2045157656794592

In [42]:
ts['20110110']

0.2045157656794592

In [43]:
longer_ts = pd.Series(np.random.randn(1000),
                     index=pd.date_range('1/1/2000',periods=1000))

In [44]:
longer_ts

2000-01-01    0.557597
2000-01-02   -0.940082
2000-01-03   -0.361686
2000-01-04    0.790567
2000-01-05    1.175491
2000-01-06   -0.377479
2000-01-07    2.055836
2000-01-08   -1.150138
2000-01-09    0.315845
2000-01-10    0.528625
2000-01-11    1.140746
2000-01-12   -0.630529
2000-01-13   -0.128040
2000-01-14    0.389982
2000-01-15    1.835513
2000-01-16    0.691312
2000-01-17    0.977816
2000-01-18    1.258009
2000-01-19    0.929387
2000-01-20    0.723458
2000-01-21    0.573009
2000-01-22   -1.068877
2000-01-23    2.941782
2000-01-24    1.027447
2000-01-25   -1.301756
2000-01-26   -1.063147
2000-01-27    1.143060
2000-01-28    1.833751
2000-01-29    0.305895
2000-01-30   -1.598328
                ...   
2002-08-28    0.109681
2002-08-29   -1.385879
2002-08-30    0.596299
2002-08-31   -0.078687
2002-09-01   -1.343206
2002-09-02    2.422075
2002-09-03   -1.643361
2002-09-04   -0.936672
2002-09-05   -0.230465
2002-09-06    0.379373
2002-09-07    0.969384
2002-09-08    0.254101
2002-09-09 

In [45]:
longer_ts['2001']

2001-01-01   -1.771129
2001-01-02   -0.097456
2001-01-03    0.775749
2001-01-04   -0.898293
2001-01-05   -1.278896
2001-01-06    0.022879
2001-01-07   -1.031676
2001-01-08    0.174878
2001-01-09   -0.278619
2001-01-10    0.602066
2001-01-11   -1.284630
2001-01-12    1.193533
2001-01-13    0.059200
2001-01-14   -1.521837
2001-01-15   -0.290541
2001-01-16   -0.097524
2001-01-17    1.461579
2001-01-18   -0.850643
2001-01-19   -0.090471
2001-01-20   -0.041332
2001-01-21   -1.089939
2001-01-22   -1.168262
2001-01-23   -0.604010
2001-01-24   -1.682125
2001-01-25   -0.292231
2001-01-26   -1.777617
2001-01-27    0.520642
2001-01-28   -0.502826
2001-01-29    0.072865
2001-01-30    0.079271
                ...   
2001-12-02    0.937406
2001-12-03   -1.295764
2001-12-04   -0.622708
2001-12-05   -0.510005
2001-12-06    0.207207
2001-12-07    1.576935
2001-12-08    0.637127
2001-12-09    0.326599
2001-12-10   -0.580001
2001-12-11   -1.561567
2001-12-12    1.094133
2001-12-13   -0.263759
2001-12-14 

In [46]:
longer_ts['2001-05']

2001-05-01   -0.326447
2001-05-02    1.098162
2001-05-03    1.128888
2001-05-04    0.891980
2001-05-05    0.845420
2001-05-06   -0.802946
2001-05-07    0.205337
2001-05-08   -0.359893
2001-05-09    0.420847
2001-05-10   -0.598904
2001-05-11   -1.701839
2001-05-12    1.691166
2001-05-13    0.476249
2001-05-14    1.810507
2001-05-15    2.015990
2001-05-16   -1.149038
2001-05-17   -0.194061
2001-05-18    0.936984
2001-05-19    0.629840
2001-05-20   -0.308184
2001-05-21    0.893715
2001-05-22    1.005764
2001-05-23   -1.346155
2001-05-24    0.680705
2001-05-25   -0.138421
2001-05-26   -0.464129
2001-05-27    0.245672
2001-05-28    1.239059
2001-05-29   -0.708089
2001-05-30   -0.252468
2001-05-31    0.893110
Freq: D, dtype: float64

In [47]:
ts[datetime(2011,1,7):]

2011-01-07   -0.252877
2011-01-08    0.264915
2011-01-10    0.204516
2011-01-12    1.666639
dtype: float64

In [48]:
ts

2011-01-02   -0.297055
2011-01-05    1.830747
2011-01-07   -0.252877
2011-01-08    0.264915
2011-01-10    0.204516
2011-01-12    1.666639
dtype: float64

In [49]:
ts['1/6/2011':'1/11/2011']

2011-01-07   -0.252877
2011-01-08    0.264915
2011-01-10    0.204516
dtype: float64

In [50]:
ts.truncate(after='1/9/2011')

2011-01-02   -0.297055
2011-01-05    1.830747
2011-01-07   -0.252877
2011-01-08    0.264915
dtype: float64

In [51]:
dates = pd.date_range('1/1/2000',periods=100,freq='W-WED')

In [52]:
long_df = pd.DataFrame(np.random.randn(100,4),
                      index = dates,
                      columns = ['Colorado','Texas',
                                'New York', 'Ohio'])

In [53]:
long_df.loc['5-2001']

            Colorado     Texas  New York      Ohio
2001-05-02  0.727216  1.538879  0.487911 -2.045737
2001-05-09  1.381156  0.418564 -0.810505 -0.775304
2001-05-16 -1.062244 -0.662822  0.536723  0.132500
2001-05-23  0.098349  0.458414  0.418542  1.482927
2001-05-30  0.789256  1.607389 -1.204012  0.240669


### Time Series with Duplicate Indices

In [54]:
dates = pd.DatetimeIndex(['1/1/2000','1/2/2000','1/2/2000',
                          '1/2/2000','1/3/2000'])

In [55]:
dup_ts = pd.Series(np.arange(5),index=dates)

In [56]:
dup_ts

2000-01-01    0
2000-01-02    1
2000-01-02    2
2000-01-02    3
2000-01-03    4
dtype: int32

In [57]:
dup_ts.index.is_unique

False

In [58]:
dup_ts['1/3/2000'] # not duplicated

4

In [59]:
dup_ts['1/2/2000'] # duplicated

2000-01-02    1
2000-01-02    2
2000-01-02    3
dtype: int32

In [60]:
grouped = dup_ts.groupby(level=0)

In [61]:
grouped.mean()

2000-01-01    0
2000-01-02    2
2000-01-03    4
dtype: int32

In [62]:
grouped.count()

2000-01-01    1
2000-01-02    3
2000-01-03    1
dtype: int64
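
Grouping on `level=0` is one way to aggregate duplicate timestamps. If you would rather keep a single observation per timestamp, a boolean mask from `Index.duplicated` is another option. The sketch below rebuilds the same `dup_ts` series so that it runs on its own; which observation to keep is an application decision, not something the data dictates.

```python
import numpy as np
import pandas as pd

dates = pd.DatetimeIndex(["2000-01-01", "2000-01-02", "2000-01-02",
                          "2000-01-02", "2000-01-03"])
dup_ts = pd.Series(np.arange(5), index=dates)

# True for every repeat of a timestamp after its first occurrence
is_repeat = dup_ts.index.duplicated(keep="first")

# Keep only the first observation for each timestamp
first_only = dup_ts[~is_repeat]

# Or collapse each group of duplicates by taking its last observation
last_per_day = dup_ts.groupby(level=0).last()

print(first_only)
print(last_per_day)
```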

## Date Ranges, Frequencies, and Shifting

In [63]:
ts

2011-01-02   -0.297055
2011-01-05    1.830747
2011-01-07   -0.252877
2011-01-08    0.264915
2011-01-10    0.204516
2011-01-12    1.666639
dtype: float64

In [64]:
resampler = ts.resample('D') # 'D' stands for daily frequency

### Generating Date Ranges

In [65]:
index = pd.date_range('2012-04-01', '2012-06-01')

In [66]:
index

DatetimeIndex(['2012-04-01', '2012-04-02', '2012-04-03', '2012-04-04',
               '2012-04-05', '2012-04-06', '2012-04-07', '2012-04-08',
               '2012-04-09', '2012-04-10', '2012-04-11', '2012-04-12',
               '2012-04-13', '2012-04-14', '2012-04-15', '2012-04-16',
               '2012-04-17', '2012-04-18', '2012-04-19', '2012-04-20',
               '2012-04-21', '2012-04-22', '2012-04-23', '2012-04-24',
               '2012-04-25', '2012-04-26', '2012-04-27', '2012-04-28',
               '2012-04-29', '2012-04-30', '2012-05-01', '2012-05-02',
               '2012-05-03', '2012-05-04', '2012-05-05', '2012-05-06',
               '2012-05-07', '2012-05-08', '2012-05-09', '2012-05-10',
               '2012-05-11', '2012-05-12', '2012-05-13', '2012-05-14',
               '2012-05-15', '2012-05-16', '2012-05-17', '2012-05-18',
               '2012-05-19', '2012-05-20', '2012-05-21', '2012-05-22',
               '2012-05-23', '2012-05-24', '2012-05-25', '2012-05-26',
      

In [67]:
pd.date_range(start='2012-04-01', periods=20)

DatetimeIndex(['2012-04-01', '2012-04-02', '2012-04-03', '2012-04-04',
               '2012-04-05', '2012-04-06', '2012-04-07', '2012-04-08',
               '2012-04-09', '2012-04-10', '2012-04-11', '2012-04-12',
               '2012-04-13', '2012-04-14', '2012-04-15', '2012-04-16',
               '2012-04-17', '2012-04-18', '2012-04-19', '2012-04-20'],
              dtype='datetime64[ns]', freq='D')

In [68]:
pd.date_range(start='2012-06-01', periods=20)

DatetimeIndex(['2012-06-01', '2012-06-02', '2012-06-03', '2012-06-04',
               '2012-06-05', '2012-06-06', '2012-06-07', '2012-06-08',
               '2012-06-09', '2012-06-10', '2012-06-11', '2012-06-12',
               '2012-06-13', '2012-06-14', '2012-06-15', '2012-06-16',
               '2012-06-17', '2012-06-18', '2012-06-19', '2012-06-20'],
              dtype='datetime64[ns]', freq='D')

In [69]:
pd.date_range('2000-01-01','2000-12-01',freq='BM')

DatetimeIndex(['2000-01-31', '2000-02-29', '2000-03-31', '2000-04-28',
               '2000-05-31', '2000-06-30', '2000-07-31', '2000-08-31',
               '2000-09-29', '2000-10-31', '2000-11-30'],
              dtype='datetime64[ns]', freq='BM')

In [70]:
pd.date_range('2012-05-02 12:56:31', periods=5)

DatetimeIndex(['2012-05-02 12:56:31', '2012-05-03 12:56:31',
               '2012-05-04 12:56:31', '2012-05-05 12:56:31',
               '2012-05-06 12:56:31'],
              dtype='datetime64[ns]', freq='D')

In [71]:
pd.date_range('2012-05-02 12:56:31', periods=5, normalize=True)

DatetimeIndex(['2012-05-02', '2012-05-03', '2012-05-04', '2012-05-05',
               '2012-05-06'],
              dtype='datetime64[ns]', freq='D')

### Frequencies and Date Offsets

In [72]:
from pandas.tseries.offsets import Hour, Minute

In [73]:
hour = Hour()

In [74]:
hour

<Hour>

In [75]:
four_hours = Hour(4)

In [76]:
four_hours

<4 * Hours>

In [77]:
pd.date_range('2000-01-01', '2000-01-03 23:59', freq='4h')

DatetimeIndex(['2000-01-01 00:00:00', '2000-01-01 04:00:00',
               '2000-01-01 08:00:00', '2000-01-01 12:00:00',
               '2000-01-01 16:00:00', '2000-01-01 20:00:00',
               '2000-01-02 00:00:00', '2000-01-02 04:00:00',
               '2000-01-02 08:00:00', '2000-01-02 12:00:00',
               '2000-01-02 16:00:00', '2000-01-02 20:00:00',
               '2000-01-03 00:00:00', '2000-01-03 04:00:00',
               '2000-01-03 08:00:00', '2000-01-03 12:00:00',
               '2000-01-03 16:00:00', '2000-01-03 20:00:00'],
              dtype='datetime64[ns]', freq='4H')

In [78]:
Hour(2) + Minute(30)

<150 * Minutes>

In [79]:
pd.date_range('2000-01-01',periods=10,freq='1h30min')

DatetimeIndex(['2000-01-01 00:00:00', '2000-01-01 01:30:00',
               '2000-01-01 03:00:00', '2000-01-01 04:30:00',
               '2000-01-01 06:00:00', '2000-01-01 07:30:00',
               '2000-01-01 09:00:00', '2000-01-01 10:30:00',
               '2000-01-01 12:00:00', '2000-01-01 13:30:00'],
              dtype='datetime64[ns]', freq='90T')

#### Week of month dates

In [80]:
rng = pd.date_range('2012-01-01','2012-09-01',freq='WOM-3FRI')

In [81]:
list(rng)

[Timestamp('2012-01-20 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-02-17 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-03-16 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-04-20 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-05-18 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-06-15 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-07-20 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-08-17 00:00:00', freq='WOM-3FRI')]

### Shifting (Leading and Lagging) Data

In [82]:
ts = pd.Series(np.random.randn(4),
              index=pd.date_range('1/1/2000',periods=4,freq='M'))

In [83]:
ts

2000-01-31   -1.360462
2000-02-29   -0.897685
2000-03-31    0.738264
2000-04-30   -0.442526
Freq: M, dtype: float64

In [84]:
ts.shift(2)

2000-01-31         NaN
2000-02-29         NaN
2000-03-31   -1.360462
2000-04-30   -0.897685
Freq: M, dtype: float64

In [85]:
ts.shift(-2)

2000-01-31    0.738264
2000-02-29   -0.442526
2000-03-31         NaN
2000-04-30         NaN
Freq: M, dtype: float64

In [86]:
ts.shift(2,freq='M')

2000-03-31   -1.360462
2000-04-30   -0.897685
2000-05-31    0.738264
2000-06-30   -0.442526
Freq: M, dtype: float64

In [87]:
ts.shift(3,freq='D')

2000-02-03   -1.360462
2000-03-03   -0.897685
2000-04-03    0.738264
2000-05-03   -0.442526
dtype: float64

In [88]:
ts.shift(1,freq='90T')

2000-01-31 01:30:00   -1.360462
2000-02-29 01:30:00   -0.897685
2000-03-31 01:30:00    0.738264
2000-04-30 01:30:00   -0.442526
Freq: M, dtype: float64
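
A common use of `shift` without a `freq` argument is computing the change from one observation to the next. The sketch below uses a made-up daily price series; `pct_change` is the built-in pandas shortcut for the same calculation.

```python
import numpy as np
import pandas as pd

# Made-up prices on a daily index
prices = pd.Series([100.0, 102.0, 101.0, 105.0],
                   index=pd.date_range("2000-01-01", periods=4, freq="D"))

# Lag the values by one step, then compare each value with its predecessor
pct = prices / prices.shift(1) - 1

# Built-in equivalent
builtin = prices.pct_change()

print(pct)
```

The first result is `NaN` because there is no earlier observation to compare against.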

#### Shifting dates with offsets

In [89]:
from pandas.tseries.offsets import Day, MonthEnd

In [90]:
now = datetime(2011,11,17)

In [91]:
now + 3 * Day()

Timestamp('2011-11-20 00:00:00')

In [92]:
now + MonthEnd()

Timestamp('2011-11-30 00:00:00')

In [93]:
now + MonthEnd(2)

Timestamp('2011-12-31 00:00:00')

In [94]:
offset = MonthEnd()

In [95]:
offset.rollforward(now)

Timestamp('2011-11-30 00:00:00')

In [96]:
offset.rollback(now)

Timestamp('2011-10-31 00:00:00')

In [97]:
ts = pd.Series(np.random.randn(20),
              index=pd.date_range('1/15/2000',periods=20,freq='4d'))

In [98]:
ts

2000-01-15    0.673174
2000-01-19    1.019236
2000-01-23    0.685013
2000-01-27    1.038470
2000-01-31    0.540280
2000-02-04    0.770313
2000-02-08   -1.562074
2000-02-12   -1.487849
2000-02-16    0.332657
2000-02-20    0.382860
2000-02-24   -0.568926
2000-02-28    0.306436
2000-03-03   -0.303773
2000-03-07    0.748852
2000-03-11    0.256492
2000-03-15   -0.077342
2000-03-19    0.950417
2000-03-23   -0.508203
2000-03-27    0.155941
2000-03-31   -1.117897
Freq: 4D, dtype: float64

In [99]:
ts.groupby(offset.rollforward).mean()

2000-01-31    0.791235
2000-02-29   -0.260941
2000-03-31    0.013061
dtype: float64

In [100]:
ts.resample('M').mean()

2000-01-31    0.791235
2000-02-29   -0.260941
2000-03-31    0.013061
Freq: M, dtype: float64

## Time Zone Handling

In [101]:
import pytz

In [102]:
pytz.common_timezones[-5:]

['US/Eastern', 'US/Hawaii', 'US/Mountain', 'US/Pacific', 'UTC']

In [103]:
tz = pytz.timezone('America/New_York')

In [104]:
tz

<DstTzInfo 'America/New_York' LMT-1 day, 19:04:00 STD>

### Time Zone Localization and Conversion

By default, time series in pandas are __time zone naive__. For example, consider the following time series:

In [105]:
rng = pd.date_range('3/9/2012 9:30',periods=6,freq='D')

In [106]:
ts = pd.Series(np.random.randn(len(rng)),index=rng)

In [107]:
ts

2012-03-09 09:30:00    1.106247
2012-03-10 09:30:00   -1.384250
2012-03-11 09:30:00    0.064858
2012-03-12 09:30:00    0.210247
2012-03-13 09:30:00    1.445146
2012-03-14 09:30:00    1.190518
Freq: D, dtype: float64

In [108]:
print(ts.index.tz)

None


In [109]:
pd.date_range('3/9/2012 9:30',periods=10,freq='D',tz='UTC')

DatetimeIndex(['2012-03-09 09:30:00+00:00', '2012-03-10 09:30:00+00:00',
               '2012-03-11 09:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 09:30:00+00:00', '2012-03-14 09:30:00+00:00',
               '2012-03-15 09:30:00+00:00', '2012-03-16 09:30:00+00:00',
               '2012-03-17 09:30:00+00:00', '2012-03-18 09:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='D')

Conversion from naive to __localized__ is handled by the __tz_localize__ method:

In [110]:
ts

2012-03-09 09:30:00    1.106247
2012-03-10 09:30:00   -1.384250
2012-03-11 09:30:00    0.064858
2012-03-12 09:30:00    0.210247
2012-03-13 09:30:00    1.445146
2012-03-14 09:30:00    1.190518
Freq: D, dtype: float64

In [111]:
ts_utc = ts.tz_localize('UTC')

In [112]:
ts_utc

2012-03-09 09:30:00+00:00    1.106247
2012-03-10 09:30:00+00:00   -1.384250
2012-03-11 09:30:00+00:00    0.064858
2012-03-12 09:30:00+00:00    0.210247
2012-03-13 09:30:00+00:00    1.445146
2012-03-14 09:30:00+00:00    1.190518
Freq: D, dtype: float64

In [113]:
ts_utc.index

DatetimeIndex(['2012-03-09 09:30:00+00:00', '2012-03-10 09:30:00+00:00',
               '2012-03-11 09:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 09:30:00+00:00', '2012-03-14 09:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='D')

In [114]:
ts_utc.tz_convert('America/New_York')

2012-03-09 04:30:00-05:00    1.106247
2012-03-10 04:30:00-05:00   -1.384250
2012-03-11 05:30:00-04:00    0.064858
2012-03-12 05:30:00-04:00    0.210247
2012-03-13 05:30:00-04:00    1.445146
2012-03-14 05:30:00-04:00    1.190518
Freq: D, dtype: float64

In [115]:
ts_eastern = ts.tz_localize('America/New_York')

In [116]:
ts_eastern.tz_convert('UTC')

2012-03-09 14:30:00+00:00    1.106247
2012-03-10 14:30:00+00:00   -1.384250
2012-03-11 13:30:00+00:00    0.064858
2012-03-12 13:30:00+00:00    0.210247
2012-03-13 13:30:00+00:00    1.445146
2012-03-14 13:30:00+00:00    1.190518
Freq: D, dtype: float64

In [117]:
ts_eastern.tz_convert('Europe/Berlin')

2012-03-09 15:30:00+01:00    1.106247
2012-03-10 15:30:00+01:00   -1.384250
2012-03-11 14:30:00+01:00    0.064858
2012-03-12 14:30:00+01:00    0.210247
2012-03-13 14:30:00+01:00    1.445146
2012-03-14 14:30:00+01:00    1.190518
Freq: D, dtype: float64

In [118]:
ts.index.tz_localize('Asia/Shanghai')

DatetimeIndex(['2012-03-09 09:30:00+08:00', '2012-03-10 09:30:00+08:00',
               '2012-03-11 09:30:00+08:00', '2012-03-12 09:30:00+08:00',
               '2012-03-13 09:30:00+08:00', '2012-03-14 09:30:00+08:00'],
              dtype='datetime64[ns, Asia/Shanghai]', freq='D')

### Operations with Time Zone-Aware Timestamp Objects

In [119]:
stamp = pd.Timestamp('2011-03-12 04:00')

In [120]:
stamp_utc = stamp.tz_localize('utc')

In [121]:
stamp_utc.tz_convert('America/New_York')

Timestamp('2011-03-11 23:00:00-0500', tz='America/New_York')

In [122]:
stamp_moscow = pd.Timestamp('2011-03-12 04:00',tz='Europe/Moscow')

In [123]:
stamp_moscow

Timestamp('2011-03-12 04:00:00+0300', tz='Europe/Moscow')

In [124]:
stamp_utc.value

1299902400000000000

In [125]:
stamp_utc.tz_convert('America/New_York').value

1299902400000000000

In [126]:
from pandas.tseries.offsets import Hour

In [127]:
stamp = pd.Timestamp('2012-03-12 01:30',tz='US/Eastern')

In [128]:
stamp

Timestamp('2012-03-12 01:30:00-0400', tz='US/Eastern')

In [129]:
stamp + Hour()

Timestamp('2012-03-12 02:30:00-0400', tz='US/Eastern')

In [130]:
stamp = pd.Timestamp('2012-11-04 00:30',tz='US/Eastern')

In [131]:
stamp

Timestamp('2012-11-04 00:30:00-0400', tz='US/Eastern')

In [132]:
stamp + 2*Hour()

Timestamp('2012-11-04 01:30:00-0500', tz='US/Eastern')

### Operations Between Different Time Zones

In [133]:
rng = pd.date_range('3/7/2012 9:30',periods=10,freq='B')

In [134]:
ts = pd.Series(np.random.randn(len(rng)),index=rng)

In [135]:
ts

2012-03-07 09:30:00   -0.044383
2012-03-08 09:30:00    0.057817
2012-03-09 09:30:00   -0.336690
2012-03-12 09:30:00    1.533612
2012-03-13 09:30:00    0.140660
2012-03-14 09:30:00    1.708958
2012-03-15 09:30:00   -0.334757
2012-03-16 09:30:00   -0.224129
2012-03-19 09:30:00    0.509401
2012-03-20 09:30:00   -0.198934
Freq: B, dtype: float64

In [136]:
ts1 = ts[:7].tz_localize('Europe/London')

In [137]:
ts2 = ts[2:].tz_localize('Europe/Moscow')

In [138]:
result = ts1 + ts2

In [139]:
result.index

DatetimeIndex(['2012-03-07 09:30:00+00:00', '2012-03-08 09:30:00+00:00',
               '2012-03-09 05:30:00+00:00', '2012-03-09 09:30:00+00:00',
               '2012-03-12 05:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 05:30:00+00:00', '2012-03-13 09:30:00+00:00',
               '2012-03-14 05:30:00+00:00', '2012-03-14 09:30:00+00:00',
               '2012-03-15 05:30:00+00:00', '2012-03-15 09:30:00+00:00',
               '2012-03-16 05:30:00+00:00', '2012-03-19 05:30:00+00:00',
               '2012-03-20 05:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq=None)

## Periods and Period Arithmetic

In [140]:
p = pd.Period(2007,freq='A-DEC')

In [141]:
p

Period('2007', 'A-DEC')

In [142]:
p + 5

Period('2012', 'A-DEC')

In [143]:
p - 2

Period('2005', 'A-DEC')

In [144]:
pd.Period('2014',freq='A-DEC')-p

7

In [145]:
rng = pd.period_range('2000-01-01','2000-06-30',freq='M')

In [146]:
rng

PeriodIndex(['2000-01', '2000-02', '2000-03', '2000-04', '2000-05', '2000-06'], dtype='period[M]', freq='M')

In [147]:
pd.Series(np.random.randn(6),index=rng)

2000-01   -0.318918
2000-02   -0.827070
2000-03   -0.064498
2000-04    0.264583
2000-05   -1.373782
2000-06   -0.018662
Freq: M, dtype: float64

In [148]:
values = ['2001Q3','2002Q2','2003Q1']

In [149]:
index = pd.PeriodIndex(values,freq='Q-DEC')

In [150]:
index

PeriodIndex(['2001Q3', '2002Q2', '2003Q1'], dtype='period[Q-DEC]', freq='Q-DEC')

### Period Frequency Conversion

In [151]:
p = pd.Period('2007',freq='A-DEC')

In [152]:
p

Period('2007', 'A-DEC')

In [153]:
p.asfreq('M',how='start')

Period('2007-01', 'M')

In [154]:
p.asfreq('M',how='end')

Period('2007-12', 'M')

In [155]:
p = pd.Period('2007',freq='A-JUN')

In [156]:
p

Period('2007', 'A-JUN')

In [157]:
p.asfreq('M','start')

Period('2006-07', 'M')

In [158]:
p.asfreq('M','end')

Period('2007-06', 'M')

In [159]:
p = pd.Period('Aug-2007','M')

In [160]:
p.asfreq('A-JUN')

Period('2008', 'A-JUN')

In [161]:
rng = pd.period_range('2006','2009',freq='A-DEC')

In [162]:
ts = pd.Series(np.random.randn(len(rng)),index=rng)

In [163]:
ts

2006    0.733032
2007   -0.606297
2008   -2.167395
2009   -0.084321
Freq: A-DEC, dtype: float64

In [164]:
ts.asfreq('M',how='start')

2006-01    0.733032
2007-01   -0.606297
2008-01   -2.167395
2009-01   -0.084321
Freq: M, dtype: float64

In [165]:
ts.asfreq('B',how='end')

2006-12-29    0.733032
2007-12-31   -0.606297
2008-12-31   -2.167395
2009-12-31   -0.084321
Freq: B, dtype: float64

### Quarterly Period Frequencies

In [166]:
p = pd.Period('2012Q4',freq='Q-JAN')

In [167]:
p

Period('2012Q4', 'Q-JAN')

In [168]:
p.asfreq('D','start')

Period('2011-11-01', 'D')

In [169]:
p.asfreq('D','end')

Period('2012-01-31', 'D')

In [170]:
p4pm = (p.asfreq('B','e')-1).asfreq('T','s')+16*60

In [171]:
p4pm

Period('2012-01-30 16:00', 'T')

In [172]:
p4pm.to_timestamp()

Timestamp('2012-01-30 16:00:00')

In [173]:
rng = pd.period_range('2011Q3','2012Q4',freq='Q-JAN')

In [174]:
ts = pd.Series(np.arange(len(rng)),index=rng)

In [175]:
ts

2011Q3    0
2011Q4    1
2012Q1    2
2012Q2    3
2012Q3    4
2012Q4    5
Freq: Q-JAN, dtype: int32

In [176]:
new_rng = (rng.asfreq('B','e')-1).asfreq('T','s')+16*60

In [177]:
ts.index = new_rng.to_timestamp()

In [178]:
ts

2010-10-28 16:00:00    0
2011-01-28 16:00:00    1
2011-04-28 16:00:00    2
2011-07-28 16:00:00    3
2011-10-28 16:00:00    4
2012-01-30 16:00:00    5
dtype: int32

### Converting Timestamps to Periods (and Back)
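A minimal sketch of the conversion this heading names, assuming a made-up daily series that straddles a month boundary. `to_period` maps each timestamp to the period that contains it, and `to_timestamp` goes back, returning the start of each period by default.

```python
import numpy as np
import pandas as pd

rng = pd.date_range("2000-01-29", periods=6, freq="D")
ts = pd.Series(np.arange(6), index=rng)

# Several rows can end up with the same monthly period label
pts = ts.to_period("M")

# Going back yields the first instant of each period
back = pts.to_timestamp()

print(pts)
print(back)
```

Note that the round trip does not restore the original dates: the day within the month is lost once the timestamps become monthly periods.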

### Creating a PeriodIndex from Arrays
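Fixed-frequency data is sometimes stored with the time span split across several columns, such as a year and a quarter. One way to combine them, sketched here with hypothetical `years` and `quarters` lists, is to build one `Period` per pair and wrap the result in a `PeriodIndex`.

```python
import pandas as pd

# Hypothetical columns, as they might appear in a quarterly data set
years = [1959, 1959, 1959, 1959, 1960]
quarters = [1, 2, 3, 4, 1]

# One Period per (year, quarter) pair, then a PeriodIndex over all of them
periods = [pd.Period(year=y, quarter=q, freq="Q-DEC")
           for y, q in zip(years, quarters)]
index = pd.PeriodIndex(periods)

print(index)
```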

## Resampling and Frequency Conversion

### Downsampling 
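A minimal sketch of downsampling, assuming a made-up series of one-minute observations. `resample` groups the data into coarser bins and an aggregation such as `sum` reduces each bin to one value.

```python
import numpy as np
import pandas as pd

rng = pd.date_range("2000-01-01", periods=12, freq="min")
ts = pd.Series(np.arange(12), index=rng)

# Five-minute bins; for this frequency each bin is labeled with,
# and closed on, its left edge by default
five_min = ts.resample("5min").sum()

print(five_min)
```

The last bin holds only two observations, since the data stops at 00:11.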

### Upsampling and Interpolation
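Going from a lower to a higher frequency needs no aggregation, but it does leave gaps to fill. The sketch below assumes a made-up two-point weekly series and shows three common choices: leave the new rows as `NaN`, forward-fill them, or interpolate linearly.

```python
import numpy as np
import pandas as pd

weekly = pd.Series([1.0, 2.0],
                   index=pd.date_range("2000-01-05", periods=2, freq="W-WED"))

# asfreq() introduces the new daily rows and leaves them as NaN
daily = weekly.resample("D").asfreq()

# ffill() carries the last observed value forward into the gaps
filled = weekly.resample("D").ffill()

# interpolate() draws a straight line between the known points
linear = daily.interpolate()

print(daily)
print(filled)
print(linear)
```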

### Resampling with Periods

## Moving Window Functions
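A minimal sketch of a moving average, assuming a made-up daily series. `rolling` accepts either a number of observations or a time offset such as `"3D"`. With a fixed count, results are `NaN` until the window is full unless `min_periods` says otherwise.

```python
import numpy as np
import pandas as pd

ts = pd.Series(np.arange(10, dtype=float),
               index=pd.date_range("2000-01-01", periods=10, freq="D"))

# Mean of the current and two previous observations
roll3 = ts.rolling(3).mean()

# Allow partial windows at the start of the series
roll3_partial = ts.rolling(3, min_periods=1).mean()

# Time-based window: everything in the trailing three calendar days
roll3d = ts.rolling("3D").mean()

print(roll3)
```

On a gap-free daily index the `"3D"` window and the three-observation window with `min_periods=1` give the same result. They differ once the series becomes irregular.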

### Exponentially Weighted Functions
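Instead of an equally weighted window, you can give recent observations more weight through a constant decay factor. The sketch below assumes made-up values and uses `adjust=False`, which makes `ewm(...).mean()` follow the simple recursion written out in the loop; `span=3` corresponds to `alpha = 2 / (span + 1) = 0.5`.

```python
import pandas as pd

ts = pd.Series([1.0, 2.0, 3.0, 4.0])

# Exponentially weighted mean with alpha = 2 / (3 + 1) = 0.5
ewma = ts.ewm(span=3, adjust=False).mean()

# The same recursion by hand: y[t] = (1 - alpha) * y[t-1] + alpha * x[t]
alpha = 2 / (3 + 1)
manual = [ts.iloc[0]]
for x in ts.iloc[1:]:
    manual.append((1 - alpha) * manual[-1] + alpha * x)

print(ewma.tolist())
print(manual)
```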

### Binary Moving Window Functions
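Some statistics, such as correlation, need two series. A minimal sketch, assuming two made-up series in which `b` is an exact linear function of `a`, so the rolling correlation should be 1 wherever the window is full:

```python
import pandas as pd

a = pd.Series([1.0, 2.0, 4.0, 7.0, 11.0, 16.0])
b = 2 * a + 1  # exact linear function of a

# Correlation of the two series inside each trailing four-point window
corr_ab = a.rolling(4).corr(b)

print(corr_ab)
```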

### User-Defined Moving Window Functions
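The `apply` method on a rolling window lets you supply your own reduction, as long as it returns a single value per window. The sketch below uses a hypothetical `value_range` helper on made-up data; `raw=True` passes each window to the function as a NumPy array.

```python
import numpy as np
import pandas as pd

ts = pd.Series([1.0, 5.0, 2.0, 8.0, 3.0])

def value_range(window):
    """Spread between the largest and smallest value in the window."""
    return window.max() - window.min()

# One result per full three-observation window
spread = ts.rolling(3).apply(value_range, raw=True)

print(spread)
```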