# Time Series

__Time series__ data is an important form of structured data in many different fields, such as finance, economics, ecology, neuroscience, and physics. Anything that is observed or measured at many points in time forms a time series. Many time series are _fixed frequency_, which is to say that data points occur at regular intervals according to some rule, such as every 15 seconds, every 5 minutes, or once per month. Time series can also be _irregular_, without a fixed unit of time or offset between the units. How you mark and refer to time series data depends on the application, and you may have one of the following:
- __Timestamps__, specific instants in time
- Fixed __periods__, such as the month January 2007 or the full year 2010
- __Intervals__ of time, indicated by a start and end timestamp. Periods can be thought of as special cases of intervals
- Experiment or elapsed time; each timestamp is a measure of time relative to a particular start time (e.g., the diameter of a cookie baking each second since being placed in the oven)
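
The four kinds of time reference listed above can be sketched with pandas objects. This is only an illustration: the specific dates and the 90-second duration are made-up values, and `pd.Interval` and `pd.Timedelta` are one possible way to represent intervals and elapsed time, not the only one.

```python
import pandas as pd

# A timestamp: one specific instant in time.
stamp = pd.Timestamp("2007-01-15 09:30")

# A fixed period: the whole month of January 2007.
period = pd.Period("2007-01", freq="M")

# An interval: an explicit start and end timestamp (end excluded here).
interval = pd.Interval(pd.Timestamp("2007-01-01"),
                       pd.Timestamp("2007-02-01"),
                       closed="left")

# Elapsed time: a duration measured from some start time.
elapsed = pd.Timedelta(seconds=90)

print(stamp in interval)                               # True
print(period.start_time <= stamp <= period.end_time)   # True
print(elapsed.total_seconds())                         # 90.0
```

The period and the interval cover the same month here, which is the sense in which a period is a special case of an interval.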

In this chapter, we are mainly concerned with time series in the first three categories, though many of the techniques can be applied to experimental time series where the index may be an integer or floating-point number indicating elapsed time from the start of the experiment. The simplest and most widely used kind of time series are those indexed by timestamp.

__pandas__ provides many built-in time series tools and data algorithms. You can efficiently work with very large time series and easily slice and dice, aggregate, and resample irregular- and fixed-frequency time series. Some of these tools are especially useful for financial and economics applications, but you could certainly use them to analyze server log data, too.

## Date and Time Data Types and Tools

In [3]:
import pandas as pd
import numpy as np
from datetime import datetime

In [4]:
now = datetime.now()

In [5]:
now

datetime.datetime(2019, 5, 11, 22, 57, 49, 887903)

In [6]:
now.year, now.month, now.day

(2019, 5, 11)

In [7]:
delta = datetime(2011,1,7)-datetime(2008,6,24,8,15)

In [8]:
delta

datetime.timedelta(days=926, seconds=56700)

In [9]:
delta.days

926

In [10]:
delta.seconds

56700

In [11]:
from datetime import timedelta

In [12]:
start = datetime(2011,1,7)

In [13]:
start + timedelta(12)

datetime.datetime(2011, 1, 19, 0, 0)

In [14]:
start - 2*timedelta(12)

datetime.datetime(2010, 12, 14, 0, 0)

### Converting Between String and Datetime

In [15]:
stamp = datetime(2011,1,3)

In [16]:
str(stamp)

'2011-01-03 00:00:00'

In [17]:
stamp.strftime('%Y-%m-%d')

'2011-01-03'

In [18]:
value = '2011-01-03'

In [19]:
datetime.strptime(value, '%Y-%m-%d')

datetime.datetime(2011, 1, 3, 0, 0)

In [20]:
datestrs = ['7/6/2011', '8/6/2011']

In [21]:
[datetime.strptime(x, '%m/%d/%Y') for x in datestrs]

[datetime.datetime(2011, 7, 6, 0, 0), datetime.datetime(2011, 8, 6, 0, 0)]

In [22]:
from dateutil.parser import parse

In [23]:
parse('2011-01-03')

datetime.datetime(2011, 1, 3, 0, 0)

In [24]:
parse('Jan 31, 1997 10:45 PM')

datetime.datetime(1997, 1, 31, 22, 45)

In [25]:
parse('6/12/2011', dayfirst = True)

datetime.datetime(2011, 12, 6, 0, 0)

In [26]:
datestrs = ['2011-07-06 12:00:00', '2011-08-06 00:00:00']

In [27]:
pd.to_datetime(datestrs)

DatetimeIndex(['2011-07-06 12:00:00', '2011-08-06 00:00:00'], dtype='datetime64[ns]', freq=None)

In [28]:
idx = pd.to_datetime(datestrs + [None])

In [29]:
idx

DatetimeIndex(['2011-07-06 12:00:00', '2011-08-06 00:00:00', 'NaT'], dtype='datetime64[ns]', freq=None)

In [30]:
idx[2]

NaT

In [31]:
pd.isnull(idx)

array([False, False,  True])

## Time Series Basics

In [32]:
from datetime import datetime

In [33]:
dates = [datetime(2011,1,2), datetime(2011,1,5),
        datetime(2011,1,7), datetime(2011,1,8),
        datetime(2011,1,10),datetime(2011,1,12)]

In [34]:
ts = pd.Series(np.random.randn(6),index=dates)

In [35]:
ts

2011-01-02    0.836350
2011-01-05   -0.518443
2011-01-07   -1.287090
2011-01-08   -0.400790
2011-01-10   -0.838243
2011-01-12    0.339225
dtype: float64

In [36]:
ts.index

DatetimeIndex(['2011-01-02', '2011-01-05', '2011-01-07', '2011-01-08',
               '2011-01-10', '2011-01-12'],
              dtype='datetime64[ns]', freq=None)

In [37]:
ts + ts[::2]

2011-01-02    1.672700
2011-01-05         NaN
2011-01-07   -2.574179
2011-01-08         NaN
2011-01-10   -1.676487
2011-01-12         NaN
dtype: float64

In [38]:
ts.index.dtype

dtype('<M8[ns]')

In [39]:
stamp = ts.index[0]

In [40]:
stamp

Timestamp('2011-01-02 00:00:00')

### Indexing, Selection, Subsetting

In [41]:
stamp = ts.index[2]

In [42]:
ts[stamp]

-1.2870895008582124

In [43]:
ts['1/10/2011']

-0.8382433746983937

In [44]:
ts['20110110']

-0.8382433746983937

In [45]:
longer_ts = pd.Series(np.random.randn(1000),
                     index=pd.date_range('1/1/2000',periods=1000))

In [46]:
longer_ts

2000-01-01   -0.057960
2000-01-02    1.839364
2000-01-03   -1.082615
2000-01-04    0.858126
2000-01-05   -0.324359
2000-01-06   -0.282429
2000-01-07    0.820937
2000-01-08    0.723753
2000-01-09   -0.554633
2000-01-10    0.528826
2000-01-11    1.764991
2000-01-12   -0.863465
2000-01-13    0.833810
2000-01-14   -1.026756
2000-01-15   -0.289307
2000-01-16   -0.973552
2000-01-17    1.169965
2000-01-18   -0.398558
2000-01-19    1.302605
2000-01-20   -1.086803
2000-01-21   -1.913771
2000-01-22    0.112714
2000-01-23    1.296758
2000-01-24   -0.634312
2000-01-25   -0.544455
2000-01-26    0.408831
2000-01-27   -0.746229
2000-01-28   -0.937396
2000-01-29    0.216362
2000-01-30   -1.913066
                ...   
2002-08-28   -0.247394
2002-08-29   -1.026275
2002-08-30    1.073999
2002-08-31    2.702135
2002-09-01    1.066829
2002-09-02    0.892345
2002-09-03   -0.393247
2002-09-04   -1.172402
2002-09-05   -0.468051
2002-09-06    0.074881
2002-09-07    1.604507
2002-09-08    0.839821
2002-09-09 

In [47]:
longer_ts['2001']

2001-01-01   -2.815611
2001-01-02    0.797183
2001-01-03   -1.108371
2001-01-04    1.019033
2001-01-05   -0.586175
2001-01-06    1.926674
2001-01-07   -1.675723
2001-01-08    0.525380
2001-01-09    0.593603
2001-01-10   -0.421102
2001-01-11    0.130518
2001-01-12    0.730273
2001-01-13   -1.220413
2001-01-14   -1.522305
2001-01-15    0.733579
2001-01-16   -1.125011
2001-01-17   -0.380050
2001-01-18   -0.374420
2001-01-19   -0.192675
2001-01-20   -0.775328
2001-01-21   -1.208672
2001-01-22   -0.360537
2001-01-23    0.373400
2001-01-24   -1.758906
2001-01-25   -0.485013
2001-01-26   -0.598457
2001-01-27    0.294965
2001-01-28    0.372436
2001-01-29   -0.232708
2001-01-30    0.012319
                ...   
2001-12-02   -0.422851
2001-12-03   -0.747949
2001-12-04    1.186955
2001-12-05   -0.196763
2001-12-06   -0.375724
2001-12-07    0.958495
2001-12-08    0.268483
2001-12-09    0.035410
2001-12-10    0.257541
2001-12-11    0.536844
2001-12-12   -0.669767
2001-12-13   -1.357451
2001-12-14 

In [48]:
longer_ts['2001-05']

2001-05-01   -2.047184
2001-05-02    0.091922
2001-05-03    1.134937
2001-05-04    0.761392
2001-05-05   -2.057103
2001-05-06   -0.189897
2001-05-07    0.517808
2001-05-08   -0.609882
2001-05-09    0.916360
2001-05-10   -2.427351
2001-05-11   -0.864877
2001-05-12    0.097935
2001-05-13    0.203179
2001-05-14   -2.018046
2001-05-15   -1.060157
2001-05-16   -0.508739
2001-05-17   -1.332907
2001-05-18   -1.337218
2001-05-19    0.894420
2001-05-20    0.703300
2001-05-21   -0.677841
2001-05-22   -0.935472
2001-05-23    1.837945
2001-05-24    0.997341
2001-05-25    1.200150
2001-05-26    1.256899
2001-05-27    1.294142
2001-05-28    0.817670
2001-05-29   -1.970818
2001-05-30    1.240271
2001-05-31    1.383498
Freq: D, dtype: float64

In [49]:
ts[datetime(2011,1,7):]

2011-01-07   -1.287090
2011-01-08   -0.400790
2011-01-10   -0.838243
2011-01-12    0.339225
dtype: float64

In [50]:
ts

2011-01-02    0.836350
2011-01-05   -0.518443
2011-01-07   -1.287090
2011-01-08   -0.400790
2011-01-10   -0.838243
2011-01-12    0.339225
dtype: float64

In [51]:
ts['1/6/2011':'1/11/2011']

2011-01-07   -1.287090
2011-01-08   -0.400790
2011-01-10   -0.838243
dtype: float64

In [52]:
ts.truncate(after='1/9/2011')

2011-01-02    0.836350
2011-01-05   -0.518443
2011-01-07   -1.287090
2011-01-08   -0.400790
dtype: float64

In [53]:
dates = pd.date_range('1/1/2000',periods=100,freq='W-WED')

In [54]:
long_df = pd.DataFrame(np.random.randn(100,4),
                      index = dates,
                      columns = ['Colorado','Texas',
                                'New York', 'Ohio'])

In [55]:
long_df.loc['5-2001']

            Colorado     Texas  New York      Ohio
2001-05-02  1.237219 -1.217912  0.544024 -2.160989
2001-05-09 -0.379483  1.719977 -0.978744 -0.737738
2001-05-16  0.523928 -0.128459 -0.493878 -0.038976
2001-05-23  0.539219 -0.098069  1.178903 -1.564272
2001-05-30 -0.429948 -0.317419  0.801034  0.957944

### Time Series with Duplicate Indices

In [56]:
dates = pd.DatetimeIndex(['1/1/2000','1/2/2000','1/2/2000',
                          '1/2/2000','1/3/2000'])

In [57]:
dup_ts = pd.Series(np.arange(5),index=dates)

In [58]:
dup_ts

2000-01-01    0
2000-01-02    1
2000-01-02    2
2000-01-02    3
2000-01-03    4
dtype: int32

In [59]:
dup_ts.index.is_unique

False

In [60]:
dup_ts['1/3/2000'] # not duplicated

4

In [61]:
dup_ts['1/2/2000'] # duplicated

2000-01-02    1
2000-01-02    2
2000-01-02    3
dtype: int32

In [62]:
grouped = dup_ts.groupby(level=0)

In [63]:
grouped.mean()

2000-01-01    0
2000-01-02    2
2000-01-03    4
dtype: int32

In [64]:
grouped.count()

2000-01-01    1
2000-01-02    3
2000-01-03    1
dtype: int64

## Date Ranges, Frequencies, and Shifting

In [65]:
ts

2011-01-02    0.836350
2011-01-05   -0.518443
2011-01-07   -1.287090
2011-01-08   -0.400790
2011-01-10   -0.838243
2011-01-12    0.339225
dtype: float64

In [66]:
resampler = ts.resample('D') # 'D' stands for daily frequency
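
Calling `resample` on its own only sets up the grouping; nothing is computed until a method is called on the result. The following is a minimal sketch of that step, using a small hand-made series (the name `irregular` and its values are illustrative, not the `ts` from the cells above).

```python
import pandas as pd

# Three observations on irregularly spaced days.
dates = pd.to_datetime(["2011-01-02", "2011-01-05", "2011-01-07"])
irregular = pd.Series([1.0, 2.0, 3.0], index=dates)

# asfreq() materializes one row per calendar day, leaving NaN
# on the days that had no observation.
daily = irregular.resample("D").asfreq()

print(len(daily))                # 6 days: Jan 2 through Jan 7
print(int(daily.isna().sum()))   # 3 days without an observation
```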

### Generating Date Ranges

In [67]:
index = pd.date_range('2012-04-01', '2012-06-01')

In [68]:
index

DatetimeIndex(['2012-04-01', '2012-04-02', '2012-04-03', '2012-04-04',
               '2012-04-05', '2012-04-06', '2012-04-07', '2012-04-08',
               '2012-04-09', '2012-04-10', '2012-04-11', '2012-04-12',
               '2012-04-13', '2012-04-14', '2012-04-15', '2012-04-16',
               '2012-04-17', '2012-04-18', '2012-04-19', '2012-04-20',
               '2012-04-21', '2012-04-22', '2012-04-23', '2012-04-24',
               '2012-04-25', '2012-04-26', '2012-04-27', '2012-04-28',
               '2012-04-29', '2012-04-30', '2012-05-01', '2012-05-02',
               '2012-05-03', '2012-05-04', '2012-05-05', '2012-05-06',
               '2012-05-07', '2012-05-08', '2012-05-09', '2012-05-10',
               '2012-05-11', '2012-05-12', '2012-05-13', '2012-05-14',
               '2012-05-15', '2012-05-16', '2012-05-17', '2012-05-18',
               '2012-05-19', '2012-05-20', '2012-05-21', '2012-05-22',
               '2012-05-23', '2012-05-24', '2012-05-25', '2012-05-26',
      

In [69]:
pd.date_range(start='2012-04-01', periods=20)

DatetimeIndex(['2012-04-01', '2012-04-02', '2012-04-03', '2012-04-04',
               '2012-04-05', '2012-04-06', '2012-04-07', '2012-04-08',
               '2012-04-09', '2012-04-10', '2012-04-11', '2012-04-12',
               '2012-04-13', '2012-04-14', '2012-04-15', '2012-04-16',
               '2012-04-17', '2012-04-18', '2012-04-19', '2012-04-20'],
              dtype='datetime64[ns]', freq='D')

In [70]:
pd.date_range(start='2012-06-01', periods=20)

DatetimeIndex(['2012-06-01', '2012-06-02', '2012-06-03', '2012-06-04',
               '2012-06-05', '2012-06-06', '2012-06-07', '2012-06-08',
               '2012-06-09', '2012-06-10', '2012-06-11', '2012-06-12',
               '2012-06-13', '2012-06-14', '2012-06-15', '2012-06-16',
               '2012-06-17', '2012-06-18', '2012-06-19', '2012-06-20'],
              dtype='datetime64[ns]', freq='D')

In [71]:
pd.date_range('2000-01-01','2000-12-01',freq='BM')

DatetimeIndex(['2000-01-31', '2000-02-29', '2000-03-31', '2000-04-28',
               '2000-05-31', '2000-06-30', '2000-07-31', '2000-08-31',
               '2000-09-29', '2000-10-31', '2000-11-30'],
              dtype='datetime64[ns]', freq='BM')

In [72]:
pd.date_range('2012-05-02 12:56:31', periods=5)

DatetimeIndex(['2012-05-02 12:56:31', '2012-05-03 12:56:31',
               '2012-05-04 12:56:31', '2012-05-05 12:56:31',
               '2012-05-06 12:56:31'],
              dtype='datetime64[ns]', freq='D')

In [73]:
pd.date_range('2012-05-02 12:56:31', periods=5, normalize=True)

DatetimeIndex(['2012-05-02', '2012-05-03', '2012-05-04', '2012-05-05',
               '2012-05-06'],
              dtype='datetime64[ns]', freq='D')

### Frequencies and Date Offsets

In [74]:
from pandas.tseries.offsets import Hour, Minute

In [75]:
hour = Hour()

In [76]:
hour

<Hour>

In [77]:
four_hours = Hour(4)

In [78]:
four_hours

<4 * Hours>

In [79]:
pd.date_range('2000-01-01', '2000-01-03 23:59', freq='4h')

DatetimeIndex(['2000-01-01 00:00:00', '2000-01-01 04:00:00',
               '2000-01-01 08:00:00', '2000-01-01 12:00:00',
               '2000-01-01 16:00:00', '2000-01-01 20:00:00',
               '2000-01-02 00:00:00', '2000-01-02 04:00:00',
               '2000-01-02 08:00:00', '2000-01-02 12:00:00',
               '2000-01-02 16:00:00', '2000-01-02 20:00:00',
               '2000-01-03 00:00:00', '2000-01-03 04:00:00',
               '2000-01-03 08:00:00', '2000-01-03 12:00:00',
               '2000-01-03 16:00:00', '2000-01-03 20:00:00'],
              dtype='datetime64[ns]', freq='4H')

In [80]:
Hour(2) + Minute(30)

<150 * Minutes>

In [81]:
pd.date_range('2000-01-01',periods=10,freq='1h30min')

DatetimeIndex(['2000-01-01 00:00:00', '2000-01-01 01:30:00',
               '2000-01-01 03:00:00', '2000-01-01 04:30:00',
               '2000-01-01 06:00:00', '2000-01-01 07:30:00',
               '2000-01-01 09:00:00', '2000-01-01 10:30:00',
               '2000-01-01 12:00:00', '2000-01-01 13:30:00'],
              dtype='datetime64[ns]', freq='90T')

#### Week of month dates

In [82]:
rng = pd.date_range('2012-01-01','2012-09-01',freq='WOM-3FRI')

In [83]:
list(rng)

[Timestamp('2012-01-20 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-02-17 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-03-16 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-04-20 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-05-18 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-06-15 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-07-20 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-08-17 00:00:00', freq='WOM-3FRI')]

### Shifting (Leading and Lagging) Data

In [84]:
ts = pd.Series(np.random.randn(4),
              index=pd.date_range('1/1/2000',periods=4,freq='M'))

In [85]:
ts

2000-01-31    0.516980
2000-02-29    1.177724
2000-03-31    0.837472
2000-04-30   -0.366302
Freq: M, dtype: float64

In [86]:
ts.shift(2)

2000-01-31         NaN
2000-02-29         NaN
2000-03-31    0.516980
2000-04-30    1.177724
Freq: M, dtype: float64

In [87]:
ts.shift(-2)

2000-01-31    0.837472
2000-02-29   -0.366302
2000-03-31         NaN
2000-04-30         NaN
Freq: M, dtype: float64

In [88]:
ts.shift(2,freq='M')

2000-03-31    0.516980
2000-04-30    1.177724
2000-05-31    0.837472
2000-06-30   -0.366302
Freq: M, dtype: float64

In [89]:
ts.shift(3,freq='D')

2000-02-03    0.516980
2000-03-03    1.177724
2000-04-03    0.837472
2000-05-03   -0.366302
dtype: float64

In [90]:
ts.shift(1,freq='90T')

2000-01-31 01:30:00    0.516980
2000-02-29 01:30:00    1.177724
2000-03-31 01:30:00    0.837472
2000-04-30 01:30:00   -0.366302
Freq: M, dtype: float64

#### Shifting dates with offsets

In [91]:
from pandas.tseries.offsets import Day, MonthEnd

In [92]:
now = datetime(2011,11,17)

In [93]:
now + 3 * Day()

Timestamp('2011-11-20 00:00:00')

In [94]:
now + MonthEnd()

Timestamp('2011-11-30 00:00:00')

In [95]:
now + MonthEnd(2)

Timestamp('2011-12-31 00:00:00')

In [96]:
offset = MonthEnd()

In [97]:
offset.rollforward(now)

Timestamp('2011-11-30 00:00:00')

In [98]:
offset.rollback(now)

Timestamp('2011-10-31 00:00:00')

In [99]:
ts = pd.Series(np.random.randn(20),
              index=pd.date_range('1/15/2000',periods=20,freq='4d'))

In [100]:
ts

2000-01-15   -0.691127
2000-01-19   -1.922988
2000-01-23   -0.964917
2000-01-27   -0.056399
2000-01-31    0.355981
2000-02-04   -1.125623
2000-02-08    0.766198
2000-02-12   -0.565523
2000-02-16    2.412972
2000-02-20    1.346483
2000-02-24    0.125983
2000-02-28    0.610978
2000-03-03   -1.108058
2000-03-07    1.793717
2000-03-11   -0.850434
2000-03-15   -1.248865
2000-03-19   -0.397808
2000-03-23    0.948636
2000-03-27   -1.792141
2000-03-31    1.314196
Freq: 4D, dtype: float64

In [101]:
ts.groupby(offset.rollforward).mean()

2000-01-31   -0.655890
2000-02-29    0.510210
2000-03-31   -0.167595
dtype: float64

In [102]:
ts.resample('M').mean()

2000-01-31   -0.655890
2000-02-29    0.510210
2000-03-31   -0.167595
Freq: M, dtype: float64

## Time Zone Handling

In [103]:
import pytz

In [104]:
pytz.common_timezones[-5:]

['US/Eastern', 'US/Hawaii', 'US/Mountain', 'US/Pacific', 'UTC']

In [105]:
tz = pytz.timezone('America/New_York')

In [106]:
tz

<DstTzInfo 'America/New_York' LMT-1 day, 19:04:00 STD>

### Time Zone Localization and Conversion

By default, time series in pandas are __time zone naive__. For example, consider the following time series:

In [107]:
rng = pd.date_range('3/9/2012 9:30',periods=6,freq='D')

In [108]:
ts = pd.Series(np.random.randn(len(rng)),index=rng)

In [109]:
ts

2012-03-09 09:30:00    0.356370
2012-03-10 09:30:00    0.971788
2012-03-11 09:30:00   -0.591819
2012-03-12 09:30:00    0.759524
2012-03-13 09:30:00   -0.013817
2012-03-14 09:30:00   -0.039048
Freq: D, dtype: float64

In [110]:
print(ts.index.tz)

None


In [111]:
pd.date_range('3/9/2012 9:30',periods=10,freq='D',tz='UTC')

DatetimeIndex(['2012-03-09 09:30:00+00:00', '2012-03-10 09:30:00+00:00',
               '2012-03-11 09:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 09:30:00+00:00', '2012-03-14 09:30:00+00:00',
               '2012-03-15 09:30:00+00:00', '2012-03-16 09:30:00+00:00',
               '2012-03-17 09:30:00+00:00', '2012-03-18 09:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='D')

Conversion from naive to __localized__ is handled by the __tz_localize__ method:

In [112]:
ts

2012-03-09 09:30:00    0.356370
2012-03-10 09:30:00    0.971788
2012-03-11 09:30:00   -0.591819
2012-03-12 09:30:00    0.759524
2012-03-13 09:30:00   -0.013817
2012-03-14 09:30:00   -0.039048
Freq: D, dtype: float64

In [113]:
ts_utc = ts.tz_localize('UTC')

In [114]:
ts_utc

2012-03-09 09:30:00+00:00    0.356370
2012-03-10 09:30:00+00:00    0.971788
2012-03-11 09:30:00+00:00   -0.591819
2012-03-12 09:30:00+00:00    0.759524
2012-03-13 09:30:00+00:00   -0.013817
2012-03-14 09:30:00+00:00   -0.039048
Freq: D, dtype: float64

In [115]:
ts_utc.index

DatetimeIndex(['2012-03-09 09:30:00+00:00', '2012-03-10 09:30:00+00:00',
               '2012-03-11 09:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 09:30:00+00:00', '2012-03-14 09:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='D')

In [116]:
ts_utc.tz_convert('America/New_York')

2012-03-09 04:30:00-05:00    0.356370
2012-03-10 04:30:00-05:00    0.971788
2012-03-11 05:30:00-04:00   -0.591819
2012-03-12 05:30:00-04:00    0.759524
2012-03-13 05:30:00-04:00   -0.013817
2012-03-14 05:30:00-04:00   -0.039048
Freq: D, dtype: float64

In [117]:
ts_eastern = ts.tz_localize('America/New_York')

In [118]:
ts_eastern.tz_convert('UTC')

2012-03-09 14:30:00+00:00    0.356370
2012-03-10 14:30:00+00:00    0.971788
2012-03-11 13:30:00+00:00   -0.591819
2012-03-12 13:30:00+00:00    0.759524
2012-03-13 13:30:00+00:00   -0.013817
2012-03-14 13:30:00+00:00   -0.039048
Freq: D, dtype: float64

In [119]:
ts_eastern.tz_convert('Europe/Berlin')

2012-03-09 15:30:00+01:00    0.356370
2012-03-10 15:30:00+01:00    0.971788
2012-03-11 14:30:00+01:00   -0.591819
2012-03-12 14:30:00+01:00    0.759524
2012-03-13 14:30:00+01:00   -0.013817
2012-03-14 14:30:00+01:00   -0.039048
Freq: D, dtype: float64

In [120]:
ts.index.tz_localize('Asia/Shanghai')

DatetimeIndex(['2012-03-09 09:30:00+08:00', '2012-03-10 09:30:00+08:00',
               '2012-03-11 09:30:00+08:00', '2012-03-12 09:30:00+08:00',
               '2012-03-13 09:30:00+08:00', '2012-03-14 09:30:00+08:00'],
              dtype='datetime64[ns, Asia/Shanghai]', freq='D')

### Operations with Time Zone-Aware Timestamp Objects

In [121]:
stamp = pd.Timestamp('2011-03-12 04:00')

In [122]:
stamp_utc = stamp.tz_localize('utc')

In [123]:
stamp_utc.tz_convert('America/New_York')

Timestamp('2011-03-11 23:00:00-0500', tz='America/New_York')

In [124]:
stamp_moscow = pd.Timestamp('2011-03-12 04:00',tz='Europe/Moscow')

In [125]:
stamp_moscow

Timestamp('2011-03-12 04:00:00+0300', tz='Europe/Moscow')

In [126]:
stamp_utc.value

1299902400000000000

In [127]:
stamp_utc.tz_convert('America/New_York').value

1299902400000000000

In [128]:
from pandas.tseries.offsets import Hour

In [129]:
stamp = pd.Timestamp('2012-03-12 01:30',tz='US/Eastern')

In [130]:
stamp

Timestamp('2012-03-12 01:30:00-0400', tz='US/Eastern')

In [131]:
stamp + Hour()

Timestamp('2012-03-12 02:30:00-0400', tz='US/Eastern')

In [132]:
stamp = pd.Timestamp('2012-11-04 00:30',tz='US/Eastern')

In [133]:
stamp

Timestamp('2012-11-04 00:30:00-0400', tz='US/Eastern')

In [134]:
stamp + 2*Hour()

Timestamp('2012-11-04 01:30:00-0500', tz='US/Eastern')

### Operations Between Different Time Zones

In [135]:
rng = pd.date_range('3/7/2012 9:30',periods=10,freq='B')

In [136]:
ts = pd.Series(np.random.randn(len(rng)),index=rng)

In [137]:
ts

2012-03-07 09:30:00    0.209237
2012-03-08 09:30:00   -3.278015
2012-03-09 09:30:00   -0.558964
2012-03-12 09:30:00   -0.791721
2012-03-13 09:30:00   -0.329596
2012-03-14 09:30:00    0.795272
2012-03-15 09:30:00   -0.757982
2012-03-16 09:30:00    0.480220
2012-03-19 09:30:00   -1.576634
2012-03-20 09:30:00   -0.807492
Freq: B, dtype: float64

In [138]:
ts1 = ts[:7].tz_localize('Europe/London')

In [139]:
ts2 = ts[2:].tz_localize('Europe/Moscow')

In [140]:
result = ts1 + ts2

In [141]:
result.index

DatetimeIndex(['2012-03-07 09:30:00+00:00', '2012-03-08 09:30:00+00:00',
               '2012-03-09 05:30:00+00:00', '2012-03-09 09:30:00+00:00',
               '2012-03-12 05:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 05:30:00+00:00', '2012-03-13 09:30:00+00:00',
               '2012-03-14 05:30:00+00:00', '2012-03-14 09:30:00+00:00',
               '2012-03-15 05:30:00+00:00', '2012-03-15 09:30:00+00:00',
               '2012-03-16 05:30:00+00:00', '2012-03-19 05:30:00+00:00',
               '2012-03-20 05:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq=None)

## Periods and Period Arithmetic

In [142]:
p = pd.Period(2007,freq='A-DEC')

In [143]:
p

Period('2007', 'A-DEC')

In [144]:
p + 5

Period('2012', 'A-DEC')

In [145]:
p - 2

Period('2005', 'A-DEC')

In [146]:
pd.Period('2014',freq='A-DEC')-p

7

In [147]:
rng = pd.period_range('2000-01-01','2000-06-30',freq='M')

In [148]:
rng

PeriodIndex(['2000-01', '2000-02', '2000-03', '2000-04', '2000-05', '2000-06'], dtype='period[M]', freq='M')

In [149]:
pd.Series(np.random.randn(6),index=rng)

2000-01   -1.257267
2000-02    0.156310
2000-03   -1.523967
2000-04    2.991976
2000-05   -0.780410
2000-06    2.634717
Freq: M, dtype: float64

In [150]:
values = ['2001Q3','2002Q2','2003Q1']

In [151]:
index = pd.PeriodIndex(values,freq='Q-DEC')

In [152]:
index

PeriodIndex(['2001Q3', '2002Q2', '2003Q1'], dtype='period[Q-DEC]', freq='Q-DEC')

### Period Frequency Conversion

In [153]:
p = pd.Period('2007',freq='A-DEC')

In [154]:
p

Period('2007', 'A-DEC')

In [155]:
p.asfreq('M',how='start')

Period('2007-01', 'M')

In [156]:
p.asfreq('M',how='end')

Period('2007-12', 'M')

In [157]:
p = pd.Period('2007',freq='A-JUN')

In [158]:
p

Period('2007', 'A-JUN')

In [159]:
p.asfreq('M','start')

Period('2006-07', 'M')

In [160]:
p.asfreq('M','end')

Period('2007-06', 'M')

In [161]:
p = pd.Period('Aug-2007','M')

In [162]:
p.asfreq('A-JUN')

Period('2008', 'A-JUN')

In [163]:
rng = pd.period_range('2006','2009',freq='A-DEC')

In [164]:
ts = pd.Series(np.random.randn(len(rng)),index=rng)

In [165]:
ts

2006   -0.098351
2007   -0.260384
2008    1.605331
2009   -1.848815
Freq: A-DEC, dtype: float64

In [166]:
ts.asfreq('M',how='start')

2006-01   -0.098351
2007-01   -0.260384
2008-01    1.605331
2009-01   -1.848815
Freq: M, dtype: float64

In [167]:
ts.asfreq('B',how='end')

2006-12-29   -0.098351
2007-12-31   -0.260384
2008-12-31    1.605331
2009-12-31   -1.848815
Freq: B, dtype: float64

### Quarterly Period Frequencies

In [168]:
p = pd.Period('2012Q4',freq='Q-JAN')

In [169]:
p

Period('2012Q4', 'Q-JAN')

In [170]:
p.asfreq('D','start')

Period('2011-11-01', 'D')

In [171]:
p.asfreq('D','end')

Period('2012-01-31', 'D')

In [172]:
p4pm = (p.asfreq('B','e')-1).asfreq('T','s')+16*60

In [173]:
p4pm

Period('2012-01-30 16:00', 'T')

In [174]:
p4pm.to_timestamp()

Timestamp('2012-01-30 16:00:00')

In [175]:
rng = pd.period_range('2011Q3','2012Q4',freq='Q-JAN')

In [176]:
ts = pd.Series(np.arange(len(rng)),index=rng)

In [177]:
ts

2011Q3    0
2011Q4    1
2012Q1    2
2012Q2    3
2012Q3    4
2012Q4    5
Freq: Q-JAN, dtype: int32

In [178]:
new_rng = (rng.asfreq('B','e')-1).asfreq('T','s')+16*60

In [179]:
ts.index = new_rng.to_timestamp()

In [180]:
ts

2010-10-28 16:00:00    0
2011-01-28 16:00:00    1
2011-04-28 16:00:00    2
2011-07-28 16:00:00    3
2011-10-28 16:00:00    4
2012-01-30 16:00:00    5
dtype: int32

### Converting Timestamps to Periods (and Back)
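
A minimal sketch of the round trip named in this heading, assuming the `to_period` and `to_timestamp` methods on a Series. The month-start index and the values are illustrative only.

```python
import pandas as pd

# Month-start timestamps ("MS") stand in for any timestamp-indexed series.
rng = pd.date_range("2000-01-01", periods=3, freq="MS")
ts = pd.Series([1.0, 2.0, 3.0], index=rng)

# Timestamps -> periods: each timestamp maps to the month containing it.
pts = ts.to_period("M")

# Periods -> timestamps: by default each period maps to its start.
back = pts.to_timestamp()

print(pts.index)
print(back.index)
```

Because the original timestamps already sit at month starts, the round trip returns the same index values; timestamps elsewhere in the month would come back as the first of the month.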

### Creating a PeriodIndex from Arrays
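
One way to approach this, sketched under the assumption that the year and quarter arrive as separate arrays: combine them into string labels and pass those to `pd.PeriodIndex`, as was done with quarterly strings earlier in these notes. The `years` and `quarters` lists are hypothetical data. pandas also has dedicated constructors for this case, but their spelling has varied between versions, so they are not used here.

```python
import pandas as pd

# Hypothetical year and quarter columns, as they might appear in a dataset.
years = [2001, 2001, 2002]
quarters = [3, 4, 1]

# Combine the two arrays into quarterly labels such as "2001Q3".
labels = [f"{y}Q{q}" for y, q in zip(years, quarters)]
index = pd.PeriodIndex(labels, freq="Q-DEC")

print(index)
```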

## Resampling and Frequency Conversion

### Downsampling 
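
A small sketch of downsampling, assuming one-minute data aggregated into five-minute bins with `resample(...).sum()`. The series is made up so that the sums are easy to check by hand.

```python
import numpy as np
import pandas as pd

# Twelve one-minute observations with values 0..11.
rng = pd.date_range("2000-01-01", periods=12, freq="min")
ts = pd.Series(np.arange(12), index=rng)

# Aggregate into five-minute bins by summing the values in each bin.
# By default each bin includes its left edge and is labeled by it.
five_min = ts.resample("5min").sum()

print(five_min)
```

The three bins hold the values 0 to 4, 5 to 9, and 10 to 11, so the sums are 10, 35, and 21.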

### Upsampling and Interpolation
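
A minimal sketch of upsampling from weekly to daily frequency, showing three ways to treat the newly created gaps. The two-point weekly series is illustrative only.

```python
import pandas as pd

# Two weekly observations, one week apart (both Wednesdays).
rng = pd.date_range("2000-01-05", periods=2, freq="W-WED")
weekly = pd.Series([1.0, 2.0], index=rng)

# Upsample to daily frequency; days with no observation become NaN.
daily = weekly.resample("D").asfreq()

# Forward-fill carries the last observed value into the gaps.
filled = weekly.resample("D").ffill()

# Linear interpolation draws a straight line between observations.
interpolated = daily.interpolate()

print(daily)
print(filled)
print(interpolated)
```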

### Resampling with Periods

## Moving Window Functions
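
As a starting point, here is a sketch of a moving average computed with `rolling`, using a short made-up series so the results can be verified by hand.

```python
import pandas as pd

ts = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0],
               index=pd.date_range("2000-01-01", periods=5, freq="D"))

# Mean over a sliding window of three observations. The first two
# results are NaN because a full window is not yet available.
rolling_mean = ts.rolling(3).mean()

# min_periods relaxes that requirement at the start of the series.
partial = ts.rolling(3, min_periods=1).mean()

print(rolling_mean)
print(partial)
```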

### Exponentially Weighted Functions
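
A brief sketch of an exponentially weighted mean using `ewm`. The choice of `span=3` and `adjust=False` is an assumption made here so that the arithmetic is easy to follow; other parameterizations are equally valid.

```python
import pandas as pd

ts = pd.Series([1.0, 2.0, 3.0])

# span=3 corresponds to a smoothing factor alpha = 2 / (span + 1) = 0.5.
# With adjust=False the result follows the recursion
#   y[0] = x[0];  y[t] = (1 - alpha) * y[t-1] + alpha * x[t]
ewma = ts.ewm(span=3, adjust=False).mean()

print(ewma)
```

Working through the recursion by hand gives 1.0, then 0.5 * 1.0 + 0.5 * 2.0 = 1.5, then 0.5 * 1.5 + 0.5 * 3.0 = 2.25.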

### Binary Moving Window Functions
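
Some window statistics need two series at once. A sketch with rolling correlation follows; the series `x` and `y` are made up, with `y` constructed as an exact linear function of `x` so the expected correlation is known.

```python
import pandas as pd

idx = pd.date_range("2000-01-01", periods=5, freq="D")
x = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=idx)
y = 2 * x + 1   # perfectly linearly related to x

# Correlation of x and y over a sliding window of three observations.
corr = x.rolling(3).corr(y)

print(corr)
```

The first two entries are NaN for lack of a full window; every later window has a correlation of 1, up to floating-point rounding.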

### User-Defined Moving Window Functions
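
A sketch of applying a custom function over a moving window with `rolling(...).apply`. The helper `value_range` is a hypothetical example written for this note, not a pandas function.

```python
import pandas as pd

ts = pd.Series([1.0, 5.0, 2.0, 8.0, 3.0])

def value_range(window):
    """Spread between the largest and smallest value in the window."""
    return window.max() - window.min()

# raw=True passes each window to the function as a NumPy array.
spread = ts.rolling(3).apply(value_range, raw=True)

print(spread)
```

The function must reduce each window to a single value; here the three full windows give spreads of 4, 6, and 6.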