# Time Series

__Time series__ data is an important form of structured data in many different fields, such as finance, economics, ecology, neuroscience, and physics. Anything that is observed or measured at many points in time forms a time series. Many time series are _fixed frequency_, which is to say that data points occur at regular intervals according to some rule, such as every 15 seconds, every 5 minutes, or once per month. Time series can also be _irregular_, without a fixed unit of time or offset between units. How you mark and refer to time series data depends on the application, and you may have one of the following:
- __Timestamps__, specific instants in time
- Fixed __periods__, such as the month January 2007 or the full year 2010
- __Intervals__ of time, indicated by a start and end timestamp. Periods can be thought of as special cases of intervals
- Experiment or elapsed time; each timestamp is a measure of time relative to a particular start time (e.g., the diameter of a cookie baking each second since being placed in the oven)
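
As a rough sketch, the four ways of marking time listed above map onto pandas objects roughly as follows. The variable names and dates are illustrative assumptions, not part of the original notes.

```python
import pandas as pd

# A timestamp: one specific instant in time
stamp = pd.Timestamp('2007-01-15 09:30')

# A fixed period: the whole month of January 2007
period = pd.Period('2007-01', freq='M')

# An interval: an arbitrary span bounded by a start and an end timestamp
interval = pd.Interval(pd.Timestamp('2007-01-01'),
                       pd.Timestamp('2007-02-01'), closed='left')

# Elapsed time: a duration measured from some start, here 90 seconds
elapsed = pd.Timedelta(seconds=90)

print(period.start_time, period.end_time)  # the span covered by the period
print(stamp in interval)                   # membership test on the interval
print(stamp + elapsed)                     # instant reached after the elapsed time
```

Here the period and the interval cover the same month, which illustrates why a period can be treated as a special case of an interval.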

In this chapter, we are mainly concerned with time series in the first three categories, though many of the techniques can be applied to experimental time series where the index may be an integer or floating-point number indicating elapsed time from the start of the experiment. The simplest and most widely used kind of time series is one indexed by timestamp.

__pandas__ provides many built-in time series tools and data algorithms. You can efficiently work with very large time series and easily slice and dice, aggregate, and resample irregular- and fixed-frequency time series. Some of these tools are especially useful for financial and economics applications, but you could certainly use them to analyze server log data, too.
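
A minimal sketch of what slicing and resampling an irregular series can look like. The data and the names `irregular`, `first_week`, and `daily` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Irregularly spaced observations (illustrative data)
idx = pd.to_datetime(['2011-01-02', '2011-01-05', '2011-01-07',
                      '2011-01-08', '2011-01-10', '2011-01-12'])
irregular = pd.Series(np.arange(6, dtype=float), index=idx)

# Slice by date strings; both endpoints are included
first_week = irregular.loc['2011-01-01':'2011-01-08']

# Resample to a fixed daily frequency; days without an observation become NaN
daily = irregular.resample('D').mean()

print(first_week)
print(daily)
```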

## Date and Time Data Types and Tools

In [3]:
import pandas as pd
import numpy as np
from datetime import datetime

In [4]:
now = datetime.now()

In [5]:
now

datetime.datetime(2019, 3, 30, 20, 43, 0, 445754)

In [6]:
now.year, now.month, now.day

(2019, 3, 30)

In [7]:
delta = datetime(2011,1,7)-datetime(2008,6,24,8,15)

In [8]:
delta

datetime.timedelta(days=926, seconds=56700)

In [9]:
delta.days

926

In [10]:
delta.seconds

56700

In [11]:
from datetime import timedelta

In [12]:
start = datetime(2011,1,7)

In [13]:
start + timedelta(12)

datetime.datetime(2011, 1, 19, 0, 0)

In [14]:
start - 2*timedelta(12)

datetime.datetime(2010, 12, 14, 0, 0)

### Converting Between String and Datetime

In [15]:
stamp = datetime(2011,1,3)

In [16]:
str(stamp)

'2011-01-03 00:00:00'

In [17]:
stamp.strftime('%Y-%m-%d')

'2011-01-03'

In [18]:
value = '2011-01-03'

In [19]:
datetime.strptime(value, '%Y-%m-%d')

datetime.datetime(2011, 1, 3, 0, 0)

In [20]:
datestrs = ['7/6/2011', '8/6/2011']

In [21]:
[datetime.strptime(x, '%m/%d/%Y') for x in datestrs]

[datetime.datetime(2011, 7, 6, 0, 0), datetime.datetime(2011, 8, 6, 0, 0)]

In [22]:
from dateutil.parser import parse

In [23]:
parse('2011-01-03')

datetime.datetime(2011, 1, 3, 0, 0)

In [24]:
parse('Jan 31, 1997 10:45 PM')

datetime.datetime(1997, 1, 31, 22, 45)

In [25]:
parse('6/12/2011', dayfirst = True)

datetime.datetime(2011, 12, 6, 0, 0)

In [26]:
datestrs = ['2011-07-06 12:00:00', '2011-08-06 00:00:00']

In [27]:
pd.to_datetime(datestrs)

DatetimeIndex(['2011-07-06 12:00:00', '2011-08-06 00:00:00'], dtype='datetime64[ns]', freq=None)

In [28]:
idx = pd.to_datetime(datestrs + [None])

In [29]:
idx

DatetimeIndex(['2011-07-06 12:00:00', '2011-08-06 00:00:00', 'NaT'], dtype='datetime64[ns]', freq=None)

In [30]:
idx[2]

NaT

In [31]:
pd.isnull(idx)

array([False, False,  True])

## Time Series Basics

In [32]:
from datetime import datetime

In [33]:
dates = [datetime(2011,1,2), datetime(2011,1,5),
        datetime(2011,1,7), datetime(2011,1,8),
        datetime(2011,1,10),datetime(2011,1,12)]

In [34]:
ts = pd.Series(np.random.randn(6),index=dates)

In [35]:
ts

2011-01-02   -0.495547
2011-01-05    0.440865
2011-01-07    1.034071
2011-01-08   -0.594544
2011-01-10   -0.142050
2011-01-12   -0.152662
dtype: float64

In [36]:
ts.index

DatetimeIndex(['2011-01-02', '2011-01-05', '2011-01-07', '2011-01-08',
               '2011-01-10', '2011-01-12'],
              dtype='datetime64[ns]', freq=None)

In [37]:
ts + ts[::2]

2011-01-02   -0.991093
2011-01-05         NaN
2011-01-07    2.068142
2011-01-08         NaN
2011-01-10   -0.284100
2011-01-12         NaN
dtype: float64

In [38]:
ts.index.dtype

dtype('<M8[ns]')

In [39]:
stamp = ts.index[0]

In [40]:
stamp

Timestamp('2011-01-02 00:00:00')

### Indexing, Selection, Subsetting

In [41]:
stamp = ts.index[2]

In [42]:
ts[stamp]

1.0340707820767607

In [43]:
ts['1/10/2011']

-0.1420501405057779

In [44]:
ts['20110110']

-0.1420501405057779

In [45]:
longer_ts = pd.Series(np.random.randn(1000),
                     index=pd.date_range('1/1/2000',periods=1000))

In [46]:
longer_ts

2000-01-01   -1.182902
2000-01-02    0.535967
2000-01-03    0.236322
2000-01-04   -0.288521
2000-01-05    0.989083
2000-01-06   -1.280072
2000-01-07    1.802239
2000-01-08    0.831788
2000-01-09    0.940819
2000-01-10   -0.608644
2000-01-11    0.645361
2000-01-12    0.183610
2000-01-13   -1.716874
2000-01-14    0.434756
2000-01-15   -1.246211
2000-01-16    0.286218
2000-01-17   -0.679088
2000-01-18   -0.550199
2000-01-19   -0.617910
2000-01-20   -1.088129
2000-01-21   -0.636641
2000-01-22    1.705399
2000-01-23    1.153772
2000-01-24   -0.305257
2000-01-25   -0.939136
2000-01-26   -0.039193
2000-01-27    1.310563
2000-01-28    0.511888
2000-01-29    0.542061
2000-01-30    0.188554
                ...   
2002-08-28    0.358142
2002-08-29    0.573023
2002-08-30    1.367099
2002-08-31   -0.293295
2002-09-01    0.460638
2002-09-02    0.625107
2002-09-03    1.149483
2002-09-04   -2.501985
2002-09-05    0.881956
2002-09-06    1.960733
2002-09-07   -0.171965
2002-09-08   -0.384356
2002-09-09 

In [47]:
longer_ts['2001']

2001-01-01   -0.140844
2001-01-02    0.307244
2001-01-03   -2.048691
2001-01-04    0.613611
2001-01-05   -0.101270
2001-01-06   -0.099019
2001-01-07   -0.672514
2001-01-08    0.851178
2001-01-09   -1.665047
2001-01-10    1.317337
2001-01-11    0.295810
2001-01-12   -0.060146
2001-01-13    0.337425
2001-01-14   -1.271275
2001-01-15   -0.860632
2001-01-16    1.590096
2001-01-17   -1.439017
2001-01-18   -0.040309
2001-01-19    0.316233
2001-01-20    1.859884
2001-01-21    0.219930
2001-01-22   -0.397000
2001-01-23    0.286376
2001-01-24    1.311956
2001-01-25   -0.760508
2001-01-26   -1.316932
2001-01-27    1.120483
2001-01-28    0.010493
2001-01-29   -0.482626
2001-01-30    0.220997
                ...   
2001-12-02    0.904917
2001-12-03   -0.341667
2001-12-04   -1.576387
2001-12-05   -0.387498
2001-12-06   -0.454202
2001-12-07   -0.770937
2001-12-08    0.899037
2001-12-09   -0.707635
2001-12-10   -2.285914
2001-12-11   -2.148048
2001-12-12    0.406252
2001-12-13    1.647802
2001-12-14 

In [48]:
longer_ts['2001-05']

2001-05-01    0.742085
2001-05-02   -0.547352
2001-05-03   -0.562546
2001-05-04   -1.180051
2001-05-05   -0.574218
2001-05-06   -1.108006
2001-05-07   -0.855284
2001-05-08   -1.235347
2001-05-09   -0.511334
2001-05-10   -1.156580
2001-05-11    0.552873
2001-05-12    2.180921
2001-05-13    0.035752
2001-05-14    1.623808
2001-05-15    0.820430
2001-05-16    0.426415
2001-05-17   -0.673053
2001-05-18   -1.244787
2001-05-19    0.395914
2001-05-20   -1.072316
2001-05-21   -0.590269
2001-05-22    0.534624
2001-05-23   -0.661448
2001-05-24    1.520763
2001-05-25    0.075382
2001-05-26    0.895030
2001-05-27   -1.059374
2001-05-28    0.301446
2001-05-29   -1.847398
2001-05-30   -0.711939
2001-05-31   -2.107331
Freq: D, dtype: float64

In [49]:
ts[datetime(2011,1,7):]

2011-01-07    1.034071
2011-01-08   -0.594544
2011-01-10   -0.142050
2011-01-12   -0.152662
dtype: float64

In [50]:
ts

2011-01-02   -0.495547
2011-01-05    0.440865
2011-01-07    1.034071
2011-01-08   -0.594544
2011-01-10   -0.142050
2011-01-12   -0.152662
dtype: float64

In [51]:
ts['1/6/2011':'1/11/2011']

2011-01-07    1.034071
2011-01-08   -0.594544
2011-01-10   -0.142050
dtype: float64

In [52]:
ts.truncate(after='1/9/2011')

2011-01-02   -0.495547
2011-01-05    0.440865
2011-01-07    1.034071
2011-01-08   -0.594544
dtype: float64

In [53]:
dates = pd.date_range('1/1/2000',periods=100,freq='W-WED')

In [54]:
long_df = pd.DataFrame(np.random.randn(100,4),
                      index = dates,
                      columns = ['Colorado','Texas',
                                'New York', 'Ohio'])

In [55]:
long_df.loc['5-2001']

            Colorado     Texas  New York      Ohio
2001-05-02  1.507593  0.593417  1.231232 -1.549055
2001-05-09 -2.249220 -1.363520 -0.098988  2.253689
2001-05-16  0.085051 -0.642108  0.091287 -0.114453
2001-05-23 -1.268298  0.057530  1.453837 -1.790268
2001-05-30  0.270495  0.869350  0.616269 -1.660131


### Time Series with Duplicate Indices

In [56]:
dates = pd.DatetimeIndex(['1/1/2000','1/2/2000','1/2/2000',
                          '1/2/2000','1/3/2000'])

In [57]:
dup_ts = pd.Series(np.arange(5),index=dates)

In [58]:
dup_ts

2000-01-01    0
2000-01-02    1
2000-01-02    2
2000-01-02    3
2000-01-03    4
dtype: int32

In [59]:
dup_ts.index.is_unique

False

In [60]:
dup_ts['1/3/2000'] # not duplicated

4

In [61]:
dup_ts['1/2/2000'] # duplicated

2000-01-02    1
2000-01-02    2
2000-01-02    3
dtype: int32

In [62]:
grouped = dup_ts.groupby(level=0)

In [63]:
grouped.mean()

2000-01-01    0
2000-01-02    2
2000-01-03    4
dtype: int32

In [64]:
grouped.count()

2000-01-01    1
2000-01-02    3
2000-01-03    1
dtype: int64

## Date Ranges, Frequencies, and Shifting

In [65]:
ts

2011-01-02   -0.495547
2011-01-05    0.440865
2011-01-07    1.034071
2011-01-08   -0.594544
2011-01-10   -0.142050
2011-01-12   -0.152662
dtype: float64

In [66]:
resampler = ts.resample('D') # 'D' stands for daily frequency

### Generating Date Ranges

In [67]:
index = pd.date_range('2012-04-01', '2012-06-01')

In [68]:
index

DatetimeIndex(['2012-04-01', '2012-04-02', '2012-04-03', '2012-04-04',
               '2012-04-05', '2012-04-06', '2012-04-07', '2012-04-08',
               '2012-04-09', '2012-04-10', '2012-04-11', '2012-04-12',
               '2012-04-13', '2012-04-14', '2012-04-15', '2012-04-16',
               '2012-04-17', '2012-04-18', '2012-04-19', '2012-04-20',
               '2012-04-21', '2012-04-22', '2012-04-23', '2012-04-24',
               '2012-04-25', '2012-04-26', '2012-04-27', '2012-04-28',
               '2012-04-29', '2012-04-30', '2012-05-01', '2012-05-02',
               '2012-05-03', '2012-05-04', '2012-05-05', '2012-05-06',
               '2012-05-07', '2012-05-08', '2012-05-09', '2012-05-10',
               '2012-05-11', '2012-05-12', '2012-05-13', '2012-05-14',
               '2012-05-15', '2012-05-16', '2012-05-17', '2012-05-18',
               '2012-05-19', '2012-05-20', '2012-05-21', '2012-05-22',
               '2012-05-23', '2012-05-24', '2012-05-25', '2012-05-26',
      

In [69]:
pd.date_range(start='2012-04-01', periods=20)

DatetimeIndex(['2012-04-01', '2012-04-02', '2012-04-03', '2012-04-04',
               '2012-04-05', '2012-04-06', '2012-04-07', '2012-04-08',
               '2012-04-09', '2012-04-10', '2012-04-11', '2012-04-12',
               '2012-04-13', '2012-04-14', '2012-04-15', '2012-04-16',
               '2012-04-17', '2012-04-18', '2012-04-19', '2012-04-20'],
              dtype='datetime64[ns]', freq='D')

In [70]:
pd.date_range(start='2012-06-01', periods=20)

DatetimeIndex(['2012-06-01', '2012-06-02', '2012-06-03', '2012-06-04',
               '2012-06-05', '2012-06-06', '2012-06-07', '2012-06-08',
               '2012-06-09', '2012-06-10', '2012-06-11', '2012-06-12',
               '2012-06-13', '2012-06-14', '2012-06-15', '2012-06-16',
               '2012-06-17', '2012-06-18', '2012-06-19', '2012-06-20'],
              dtype='datetime64[ns]', freq='D')

In [71]:
pd.date_range('2000-01-01','2000-12-01',freq='BM')

DatetimeIndex(['2000-01-31', '2000-02-29', '2000-03-31', '2000-04-28',
               '2000-05-31', '2000-06-30', '2000-07-31', '2000-08-31',
               '2000-09-29', '2000-10-31', '2000-11-30'],
              dtype='datetime64[ns]', freq='BM')

In [72]:
pd.date_range('2012-05-02 12:56:31', periods=5)

DatetimeIndex(['2012-05-02 12:56:31', '2012-05-03 12:56:31',
               '2012-05-04 12:56:31', '2012-05-05 12:56:31',
               '2012-05-06 12:56:31'],
              dtype='datetime64[ns]', freq='D')

In [73]:
pd.date_range('2012-05-02 12:56:31', periods=5, normalize=True)

DatetimeIndex(['2012-05-02', '2012-05-03', '2012-05-04', '2012-05-05',
               '2012-05-06'],
              dtype='datetime64[ns]', freq='D')

### Frequencies and Date Offsets

In [74]:
from pandas.tseries.offsets import Hour, Minute

In [75]:
hour = Hour()

In [76]:
hour

<Hour>

In [77]:
four_hours = Hour(4)

In [78]:
four_hours

<4 * Hours>

In [79]:
pd.date_range('2000-01-01', '2000-01-03 23:59', freq='4h')

DatetimeIndex(['2000-01-01 00:00:00', '2000-01-01 04:00:00',
               '2000-01-01 08:00:00', '2000-01-01 12:00:00',
               '2000-01-01 16:00:00', '2000-01-01 20:00:00',
               '2000-01-02 00:00:00', '2000-01-02 04:00:00',
               '2000-01-02 08:00:00', '2000-01-02 12:00:00',
               '2000-01-02 16:00:00', '2000-01-02 20:00:00',
               '2000-01-03 00:00:00', '2000-01-03 04:00:00',
               '2000-01-03 08:00:00', '2000-01-03 12:00:00',
               '2000-01-03 16:00:00', '2000-01-03 20:00:00'],
              dtype='datetime64[ns]', freq='4H')

In [80]:
Hour(2) + Minute(30)

<150 * Minutes>

In [81]:
pd.date_range('2000-01-01',periods=10,freq='1h30min')

DatetimeIndex(['2000-01-01 00:00:00', '2000-01-01 01:30:00',
               '2000-01-01 03:00:00', '2000-01-01 04:30:00',
               '2000-01-01 06:00:00', '2000-01-01 07:30:00',
               '2000-01-01 09:00:00', '2000-01-01 10:30:00',
               '2000-01-01 12:00:00', '2000-01-01 13:30:00'],
              dtype='datetime64[ns]', freq='90T')

#### Week of month dates

In [82]:
rng = pd.date_range('2012-01-01','2012-09-01',freq='WOM-3FRI')

In [83]:
list(rng)

[Timestamp('2012-01-20 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-02-17 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-03-16 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-04-20 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-05-18 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-06-15 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-07-20 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-08-17 00:00:00', freq='WOM-3FRI')]

### Shifting (Leading and Lagging) Data

In [84]:
ts = pd.Series(np.random.randn(4),
              index=pd.date_range('1/1/2000',periods=4,freq='M'))

In [85]:
ts

2000-01-31    0.235221
2000-02-29    0.091709
2000-03-31    0.073060
2000-04-30   -0.664578
Freq: M, dtype: float64

In [86]:
ts.shift(2)

2000-01-31         NaN
2000-02-29         NaN
2000-03-31    0.235221
2000-04-30    0.091709
Freq: M, dtype: float64

In [87]:
ts.shift(-2)

2000-01-31    0.073060
2000-02-29   -0.664578
2000-03-31         NaN
2000-04-30         NaN
Freq: M, dtype: float64

In [88]:
ts.shift(2,freq='M')

2000-03-31    0.235221
2000-04-30    0.091709
2000-05-31    0.073060
2000-06-30   -0.664578
Freq: M, dtype: float64

In [89]:
ts.shift(3,freq='D')

2000-02-03    0.235221
2000-03-03    0.091709
2000-04-03    0.073060
2000-05-03   -0.664578
dtype: float64

In [90]:
ts.shift(1,freq='90T')

2000-01-31 01:30:00    0.235221
2000-02-29 01:30:00    0.091709
2000-03-31 01:30:00    0.073060
2000-04-30 01:30:00   -0.664578
Freq: M, dtype: float64

#### Shifting dates with offsets

In [91]:
from pandas.tseries.offsets import Day, MonthEnd

In [92]:
now = datetime(2011,11,17)

In [93]:
now + 3 * Day()

Timestamp('2011-11-20 00:00:00')

In [94]:
now + MonthEnd()

Timestamp('2011-11-30 00:00:00')

In [95]:
now + MonthEnd(2)

Timestamp('2011-12-31 00:00:00')

In [96]:
offset = MonthEnd()

In [97]:
offset.rollforward(now)

Timestamp('2011-11-30 00:00:00')

In [98]:
offset.rollback(now)

Timestamp('2011-10-31 00:00:00')

In [99]:
ts = pd.Series(np.random.randn(20),
              index=pd.date_range('1/15/2000',periods=20,freq='4d'))

In [100]:
ts

2000-01-15   -0.783740
2000-01-19    1.076824
2000-01-23    0.275334
2000-01-27    0.062403
2000-01-31   -0.111394
2000-02-04    0.361839
2000-02-08    0.506665
2000-02-12   -1.187364
2000-02-16   -1.437856
2000-02-20   -0.014629
2000-02-24    0.604444
2000-02-28   -2.282254
2000-03-03    0.900529
2000-03-07   -0.607305
2000-03-11   -0.806434
2000-03-15    1.171666
2000-03-19   -1.505092
2000-03-23    0.777798
2000-03-27    2.142731
2000-03-31    0.582145
Freq: 4D, dtype: float64

In [101]:
ts.groupby(offset.rollforward).mean()

2000-01-31    0.103885
2000-02-29   -0.492736
2000-03-31    0.332005
dtype: float64

In [102]:
ts.resample('M').mean()

2000-01-31    0.103885
2000-02-29   -0.492736
2000-03-31    0.332005
Freq: M, dtype: float64

## Time Zone Handling

In [103]:
import pytz

In [104]:
pytz.common_timezones[-5:]

['US/Eastern', 'US/Hawaii', 'US/Mountain', 'US/Pacific', 'UTC']

In [105]:
tz = pytz.timezone('America/New_York')

In [106]:
tz

<DstTzInfo 'America/New_York' LMT-1 day, 19:04:00 STD>

### Time Zone Localization and Conversion

By default, time series in pandas are __time zone naive__. For example, consider the following time series:

In [107]:
rng = pd.date_range('3/9/2012 9:30',periods=6,freq='D')

In [108]:
ts = pd.Series(np.random.randn(len(rng)),index=rng)

In [109]:
ts

2012-03-09 09:30:00    0.037291
2012-03-10 09:30:00    0.145433
2012-03-11 09:30:00    1.068495
2012-03-12 09:30:00    0.929472
2012-03-13 09:30:00   -0.265892
2012-03-14 09:30:00   -0.302873
Freq: D, dtype: float64

In [110]:
print(ts.index.tz)

None


In [111]:
pd.date_range('3/9/2012 9:30',periods=10,freq='D',tz='UTC')

DatetimeIndex(['2012-03-09 09:30:00+00:00', '2012-03-10 09:30:00+00:00',
               '2012-03-11 09:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 09:30:00+00:00', '2012-03-14 09:30:00+00:00',
               '2012-03-15 09:30:00+00:00', '2012-03-16 09:30:00+00:00',
               '2012-03-17 09:30:00+00:00', '2012-03-18 09:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='D')

Conversion from naive to __localized__ is handled by the __tz_localize__ method:

In [112]:
ts

2012-03-09 09:30:00    0.037291
2012-03-10 09:30:00    0.145433
2012-03-11 09:30:00    1.068495
2012-03-12 09:30:00    0.929472
2012-03-13 09:30:00   -0.265892
2012-03-14 09:30:00   -0.302873
Freq: D, dtype: float64

In [113]:
ts_utc = ts.tz_localize('UTC')

In [114]:
ts_utc

2012-03-09 09:30:00+00:00    0.037291
2012-03-10 09:30:00+00:00    0.145433
2012-03-11 09:30:00+00:00    1.068495
2012-03-12 09:30:00+00:00    0.929472
2012-03-13 09:30:00+00:00   -0.265892
2012-03-14 09:30:00+00:00   -0.302873
Freq: D, dtype: float64

In [115]:
ts_utc.index

DatetimeIndex(['2012-03-09 09:30:00+00:00', '2012-03-10 09:30:00+00:00',
               '2012-03-11 09:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 09:30:00+00:00', '2012-03-14 09:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='D')

In [116]:
ts_utc.tz_convert('America/New_York')

2012-03-09 04:30:00-05:00    0.037291
2012-03-10 04:30:00-05:00    0.145433
2012-03-11 05:30:00-04:00    1.068495
2012-03-12 05:30:00-04:00    0.929472
2012-03-13 05:30:00-04:00   -0.265892
2012-03-14 05:30:00-04:00   -0.302873
Freq: D, dtype: float64

In [117]:
ts_eastern = ts.tz_localize('America/New_York')

In [118]:
ts_eastern.tz_convert('UTC')

2012-03-09 14:30:00+00:00    0.037291
2012-03-10 14:30:00+00:00    0.145433
2012-03-11 13:30:00+00:00    1.068495
2012-03-12 13:30:00+00:00    0.929472
2012-03-13 13:30:00+00:00   -0.265892
2012-03-14 13:30:00+00:00   -0.302873
Freq: D, dtype: float64

In [119]:
ts_eastern.tz_convert('Europe/Berlin')

2012-03-09 15:30:00+01:00    0.037291
2012-03-10 15:30:00+01:00    0.145433
2012-03-11 14:30:00+01:00    1.068495
2012-03-12 14:30:00+01:00    0.929472
2012-03-13 14:30:00+01:00   -0.265892
2012-03-14 14:30:00+01:00   -0.302873
Freq: D, dtype: float64

In [120]:
ts.index.tz_localize('Asia/Shanghai')

DatetimeIndex(['2012-03-09 09:30:00+08:00', '2012-03-10 09:30:00+08:00',
               '2012-03-11 09:30:00+08:00', '2012-03-12 09:30:00+08:00',
               '2012-03-13 09:30:00+08:00', '2012-03-14 09:30:00+08:00'],
              dtype='datetime64[ns, Asia/Shanghai]', freq='D')

### Operations with Time Zone-Aware Timestamp Objects

In [121]:
stamp = pd.Timestamp('2011-03-12 04:00')

In [122]:
stamp_utc = stamp.tz_localize('utc')

In [123]:
stamp_utc.tz_convert('America/New_York')

Timestamp('2011-03-11 23:00:00-0500', tz='America/New_York')

In [124]:
stamp_moscow = pd.Timestamp('2011-03-12 04:00',tz='Europe/Moscow')

In [125]:
stamp_moscow

Timestamp('2011-03-12 04:00:00+0300', tz='Europe/Moscow')

In [126]:
stamp_utc.value

1299902400000000000

In [127]:
stamp_utc.tz_convert('America/New_York').value

1299902400000000000

In [128]:
from pandas.tseries.offsets import Hour

In [129]:
stamp = pd.Timestamp('2012-03-12 01:30',tz='US/Eastern')

In [130]:
stamp

Timestamp('2012-03-12 01:30:00-0400', tz='US/Eastern')

In [131]:
stamp + Hour()

Timestamp('2012-03-12 02:30:00-0400', tz='US/Eastern')

In [132]:
stamp = pd.Timestamp('2012-11-04 00:30',tz='US/Eastern')

In [133]:
stamp

Timestamp('2012-11-04 00:30:00-0400', tz='US/Eastern')

In [134]:
stamp + 2*Hour()

Timestamp('2012-11-04 01:30:00-0500', tz='US/Eastern')

### Operations Between Different Time Zones

In [135]:
rng = pd.date_range('3/7/2012 9:30',periods=10,freq='B')

In [136]:
ts = pd.Series(np.random.randn(len(rng)),index=rng)

In [137]:
ts

2012-03-07 09:30:00    0.587444
2012-03-08 09:30:00    1.454459
2012-03-09 09:30:00   -0.018388
2012-03-12 09:30:00   -0.543866
2012-03-13 09:30:00   -0.928811
2012-03-14 09:30:00    0.575187
2012-03-15 09:30:00    1.362172
2012-03-16 09:30:00   -1.515211
2012-03-19 09:30:00   -2.472693
2012-03-20 09:30:00    0.756345
Freq: B, dtype: float64

In [138]:
ts1 = ts[:7].tz_localize('Europe/London')

In [139]:
ts2 = ts[2:].tz_localize('Europe/Moscow')

In [140]:
result = ts1 + ts2

In [141]:
result.index

DatetimeIndex(['2012-03-07 09:30:00+00:00', '2012-03-08 09:30:00+00:00',
               '2012-03-09 05:30:00+00:00', '2012-03-09 09:30:00+00:00',
               '2012-03-12 05:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 05:30:00+00:00', '2012-03-13 09:30:00+00:00',
               '2012-03-14 05:30:00+00:00', '2012-03-14 09:30:00+00:00',
               '2012-03-15 05:30:00+00:00', '2012-03-15 09:30:00+00:00',
               '2012-03-16 05:30:00+00:00', '2012-03-19 05:30:00+00:00',
               '2012-03-20 05:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq=None)

## Periods and Period Arithmetic

In [142]:
p = pd.Period(2007,freq='A-DEC')

In [143]:
p

Period('2007', 'A-DEC')

In [144]:
p + 5

Period('2012', 'A-DEC')

In [145]:
p - 2

Period('2005', 'A-DEC')

In [146]:
pd.Period('2014',freq='A-DEC')-p

7

In [147]:
rng = pd.period_range('2000-01-01','2000-06-30',freq='M')

In [148]:
rng

PeriodIndex(['2000-01', '2000-02', '2000-03', '2000-04', '2000-05', '2000-06'], dtype='period[M]', freq='M')

In [149]:
pd.Series(np.random.randn(6),index=rng)

2000-01   -1.805437
2000-02    0.436405
2000-03   -1.359758
2000-04    0.895169
2000-05    1.442798
2000-06   -1.019792
Freq: M, dtype: float64

In [150]:
values = ['2001Q3','2002Q2','2003Q1']

In [151]:
index = pd.PeriodIndex(values,freq='Q-DEC')

In [152]:
index

PeriodIndex(['2001Q3', '2002Q2', '2003Q1'], dtype='period[Q-DEC]', freq='Q-DEC')

### Period Frequency Conversion

In [153]:
p = pd.Period('2007',freq='A-DEC')

In [154]:
p

Period('2007', 'A-DEC')

In [155]:
p.asfreq('M',how='start')

Period('2007-01', 'M')

In [156]:
p.asfreq('M',how='end')

Period('2007-12', 'M')

In [157]:
p = pd.Period('2007',freq='A-JUN')

In [158]:
p

Period('2007', 'A-JUN')

In [159]:
p.asfreq('M','start')

Period('2006-07', 'M')

In [160]:
p.asfreq('M','end')

Period('2007-06', 'M')

In [161]:
p = pd.Period('Aug-2007','M')

In [162]:
p.asfreq('A-JUN')

Period('2008', 'A-JUN')

In [163]:
rng = pd.period_range('2006','2009',freq='A-DEC')

In [164]:
ts = pd.Series(np.random.randn(len(rng)),index=rng)

In [165]:
ts

2006    1.993567
2007   -1.643409
2008   -0.794396
2009   -0.833388
Freq: A-DEC, dtype: float64

In [166]:
ts.asfreq('M',how='start')

2006-01    1.993567
2007-01   -1.643409
2008-01   -0.794396
2009-01   -0.833388
Freq: M, dtype: float64

In [167]:
ts.asfreq('B',how='end')

2006-12-29    1.993567
2007-12-31   -1.643409
2008-12-31   -0.794396
2009-12-31   -0.833388
Freq: B, dtype: float64

### Quarterly Period Frequencies

In [168]:
p = pd.Period('2012Q4',freq='Q-JAN')

In [169]:
p

Period('2012Q4', 'Q-JAN')

In [170]:
p.asfreq('D','start')

Period('2011-11-01', 'D')

In [171]:
p.asfreq('D','end')

Period('2012-01-31', 'D')

In [172]:
p4pm = (p.asfreq('B','e')-1).asfreq('T','s')+16*60

In [173]:
p4pm

Period('2012-01-30 16:00', 'T')

In [174]:
p4pm.to_timestamp()

Timestamp('2012-01-30 16:00:00')

In [175]:
rng = pd.period_range('2011Q3','2012Q4',freq='Q-JAN')

In [176]:
ts = pd.Series(np.arange(len(rng)),index=rng)

In [177]:
ts

2011Q3    0
2011Q4    1
2012Q1    2
2012Q2    3
2012Q3    4
2012Q4    5
Freq: Q-JAN, dtype: int32

In [178]:
new_rng = (rng.asfreq('B','e')-1).asfreq('T','s')+16*60

In [179]:
ts.index = new_rng.to_timestamp()

In [180]:
ts

2010-10-28 16:00:00    0
2011-01-28 16:00:00    1
2011-04-28 16:00:00    2
2011-07-28 16:00:00    3
2011-10-28 16:00:00    4
2012-01-30 16:00:00    5
dtype: int32

### Converting Timestamps to Periods (and Back)
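A minimal sketch of converting between timestamps and periods with `to_period` and `to_timestamp`, assuming a small daily series. The names `rng_demo`, `ts_demo`, `pts`, and `back` are illustrative only.

```python
import numpy as np
import pandas as pd

# Illustrative daily series that straddles a month boundary
rng_demo = pd.date_range('2000-01-29', periods=6, freq='D')
ts_demo = pd.Series(np.arange(6), index=rng_demo)

# Timestamps -> monthly periods; several days can map to the same period,
# so the resulting index may contain duplicates
pts = ts_demo.to_period('M')

# Periods -> timestamps; by default each period maps to its start
back = pts.to_timestamp()

print(pts)
print(back)
```

Note that the round trip is lossy: once the days have been collapsed into months, `to_timestamp` can only return the start of each month, not the original dates.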

### Creating a PeriodIndex from Arrays

## Resampling and Frequency Conversion

### Downsampling 
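A minimal sketch of downsampling one-minute data into five-minute bins with `resample`. The data is illustrative, and the `closed` and `label` arguments choose which bin edge is inclusive and which edge names the bin.

```python
import numpy as np
import pandas as pd

# Twelve one-minute observations with values 0..11 (illustrative data)
rng_demo = pd.date_range('2000-01-01', periods=12, freq='min')
ts_demo = pd.Series(np.arange(12), index=rng_demo)

# Sum into five-minute bins; for this frequency each bin is closed and
# labeled on its left edge by default
five_min = ts_demo.resample('5min').sum()

# Close and label each bin on its right edge instead
five_min_right = ts_demo.resample('5min', closed='right', label='right').sum()

print(five_min)
print(five_min_right)
```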

### Upsampling and Interpolation
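A minimal sketch of upsampling from weekly to daily frequency. No aggregation is needed in this direction, so the choice is how to fill the new gaps. The series `weekly` is made-up data.

```python
import pandas as pd

# Two observations one week apart (illustrative data)
weekly = pd.Series([1.0, 2.0],
                   index=pd.to_datetime(['2000-01-05', '2000-01-12']))

# Upsample to daily frequency without filling: the new days are NaN
daily = weekly.resample('D').asfreq()

# Forward-fill the gaps, carrying each value forward at most two days
filled = weekly.resample('D').ffill(limit=2)

print(daily)
print(filled)
```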

### Resampling with Periods

## Moving Window Functions
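A minimal sketch of moving window statistics with `rolling`, assuming a short daily series of made-up values. It shows a fixed-size window, a window that tolerates missing leading values, and a time-based window.

```python
import numpy as np
import pandas as pd

# Ten daily observations with values 0..9 (illustrative data)
ts_demo = pd.Series(np.arange(10, dtype=float),
                    index=pd.date_range('2000-01-01', periods=10, freq='D'))

# Three-observation moving average; the first two results are NaN
roll3 = ts_demo.rolling(3).mean()

# Allow partial windows at the start of the series
roll3_partial = ts_demo.rolling(3, min_periods=1).mean()

# Time-based window: all observations in the trailing three days
roll3d = ts_demo.rolling('3D').mean()

print(roll3)
print(roll3_partial)
print(roll3d)
```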

### Exponentially Weighted Functions
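A minimal sketch of an exponentially weighted moving average with `ewm`, using made-up values. With `span=3` the decay factor is `alpha = 2 / (span + 1) = 0.5`, so recent observations get more weight than older ones.

```python
import pandas as pd

values = pd.Series([1.0, 2.0, 3.0, 4.0])

# Default (adjust=True): weights are normalized over the observations seen so far
ewma = values.ewm(span=3).mean()

# adjust=False uses the recursive form y_t = (1 - alpha) * y_{t-1} + alpha * x_t
ewma_recursive = values.ewm(span=3, adjust=False).mean()

print(ewma)
print(ewma_recursive)
```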

### Binary Moving Window Functions
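A minimal sketch of a statistic that operates on two series at once, here a rolling correlation. The series `x` and `y` are made up, and `y` is a linear function of `x`, so every complete window has a correlation of about 1.

```python
import numpy as np
import pandas as pd

idx = pd.date_range('2000-01-01', periods=6, freq='D')
x = pd.Series([1.0, 2.0, 4.0, 7.0, 11.0, 16.0], index=idx)
y = 2 * x + 1  # perfectly positively correlated with x

# Correlation between x and y over a trailing three-observation window
rolling_corr = x.rolling(3).corr(y)

print(rolling_corr)
```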

### User-Defined Moving Window Functions
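A minimal sketch of applying a custom function over a moving window with `rolling(...).apply`. The helper `value_range` is a hypothetical example function; the only requirement is that it reduces each window to a single value.

```python
import numpy as np
import pandas as pd

def value_range(window):
    """Spread between the largest and smallest value in the window."""
    return window.max() - window.min()

values = pd.Series([3.0, 1.0, 4.0, 1.0, 5.0, 9.0])

# raw=True passes each window to the function as a NumPy array
spread = values.rolling(3).apply(value_range, raw=True)

print(spread)
```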