# Time Series

__Time series__ data is an important form of structured data in many different fields, such as finance, economics, ecology, neuroscience, and physics. Anything that is observed or measured at many points in time forms a time series. Many time series are _fixed frequency_, which is to say that data points occur at regular intervals according to some rule, such as every 15 seconds, every 5 minutes, or once per month. Time series can also be _irregular_, without a fixed unit of time or offset between units. How you mark and refer to time series data depends on the application, and you may have one of the following:
- __Timestamps__, specific instants in time
- Fixed __periods__, such as the month January 2007 or the full year 2010
- __Intervals__ of time, indicated by a start and end timestamp. Periods can be thought of as special cases of intervals
- Experiment or elapsed time; each timestamp is a measure of time relative to a particular start time (e.g., the diameter of a cookie baking each second since being placed in the oven)
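
As a rough illustration, the four kinds of time labels listed above could be represented with pandas objects along these lines. The dates and the 90-second duration are made up for the example, and `pd.Interval` is only one possible way to hold a start/end pair.

```python
import pandas as pd

# A timestamp: one specific instant.
stamp = pd.Timestamp("2007-01-15 09:30")

# A fixed period: the whole month of January 2007.
period = pd.Period("2007-01", freq="M")

# An interval: an explicit start and end timestamp (end excluded here).
interval = pd.Interval(pd.Timestamp("2007-01-01"),
                       pd.Timestamp("2007-02-01"),
                       closed="left")

# Elapsed time: a duration relative to some start time.
elapsed = pd.Timedelta(seconds=90)

print(period.start_time, period.end_time)
print(stamp in interval)
print(elapsed.total_seconds())
```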

In this chapter, we are mainly concerned with time series in the first three categories, though many of the techniques can be applied to experimental time series where the index may be an integer or floating-point number indicating elapsed time from the start of the experiment. The simplest and most widely used kind of time series are those indexed by timestamp.
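
A minimal sketch of an experimental time series indexed by elapsed time, assuming hypothetical cookie-diameter measurements taken once per second:

```python
import pandas as pd

# Hypothetical measurements: cookie diameter (cm) recorded at elapsed
# seconds since the cookie went into the oven.
elapsed_seconds = [0.0, 1.0, 2.0, 3.0]
diameter = pd.Series([5.0, 5.1, 5.3, 5.6], index=elapsed_seconds)

# Label-based slicing on the elapsed-time index (both endpoints included).
first_two_seconds = diameter.loc[0.0:2.0]

# The same index can be converted to timedeltas if durations are preferred.
as_durations = diameter.copy()
as_durations.index = pd.to_timedelta(elapsed_seconds, unit="s")

print(first_two_seconds)
print(as_durations.index[-1])
```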

__pandas__ provides many built-in time series tools and data algorithms. You can efficiently work with very large time series and easily slice and dice, aggregate, and resample irregular- and fixed-frequency time series. Some of these tools are especially useful for financial and economics applications, but you could certainly use them to analyze server log data, too.

## Date and Time Data Types and Tools

In [1]:
import pandas as pd
import numpy as np
from datetime import datetime

In [2]:
now = datetime.now()

In [3]:
now

datetime.datetime(2019, 3, 14, 11, 6, 40, 392936)

In [4]:
now.year, now.month, now.day

(2019, 3, 14)

In [5]:
delta = datetime(2011,1,7)-datetime(2008,6,24,8,15)

In [6]:
delta

datetime.timedelta(days=926, seconds=56700)

In [7]:
delta.days

926

In [8]:
delta.seconds

56700

In [9]:
from datetime import timedelta

In [10]:
start = datetime(2011,1,7)

In [11]:
start + timedelta(12)

datetime.datetime(2011, 1, 19, 0, 0)

In [12]:
start - 2*timedelta(12)

datetime.datetime(2010, 12, 14, 0, 0)
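
The `days` and `seconds` attributes shown above each hold only part of a `timedelta`. The following sketch, reusing the same two dates, shows how they relate to `total_seconds()`; the 12-day, 6-hour offset at the end is an arbitrary value chosen for illustration.

```python
from datetime import datetime, timedelta

delta = datetime(2011, 1, 7) - datetime(2008, 6, 24, 8, 15)

# `days` and `seconds` are separate components; `seconds` is only the
# remainder within the final day, not the whole duration.
whole_duration = delta.total_seconds()
rebuilt = delta.days * 86400 + delta.seconds

print(delta.days, delta.seconds, whole_duration)

# timedelta also accepts keyword units, which can read more clearly than
# a bare positional number of days.
later = datetime(2011, 1, 7) + timedelta(days=12, hours=6)
print(later)
```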

### Converting Between String and Datetime

In [13]:
stamp = datetime(2011,1,3)

In [14]:
str(stamp)

'2011-01-03 00:00:00'

In [15]:
stamp.strftime('%Y-%m-%d')

'2011-01-03'

In [16]:
value = '2011-01-03'

In [17]:
datetime.strptime(value, '%Y-%m-%d')

datetime.datetime(2011, 1, 3, 0, 0)

In [18]:
datestrs = ['7/6/2011', '8/6/2011']

In [19]:
[datetime.strptime(x, '%m/%d/%Y') for x in datestrs]

[datetime.datetime(2011, 7, 6, 0, 0), datetime.datetime(2011, 8, 6, 0, 0)]

In [20]:
from dateutil.parser import parse

In [21]:
parse('2011-01-03')

datetime.datetime(2011, 1, 3, 0, 0)

In [22]:
parse('Jan 31, 1997 10:45 PM')

datetime.datetime(1997, 1, 31, 22, 45)

In [23]:
parse('6/12/2011', dayfirst = True)

datetime.datetime(2011, 12, 6, 0, 0)

In [24]:
datestrs = ['2011-07-06 12:00:00', '2011-08-06 00:00:00']

In [25]:
pd.to_datetime(datestrs)

DatetimeIndex(['2011-07-06 12:00:00', '2011-08-06 00:00:00'], dtype='datetime64[ns]', freq=None)

In [26]:
idx = pd.to_datetime(datestrs + [None])

In [27]:
idx

DatetimeIndex(['2011-07-06 12:00:00', '2011-08-06 00:00:00', 'NaT'], dtype='datetime64[ns]', freq=None)

In [28]:
idx[2]

NaT

In [29]:
pd.isnull(idx)

array([False, False,  True])
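
When the input may contain malformed entries, one option is to pass an explicit format and ask pandas to coerce failures to `NaT`. This is a small sketch with made-up input strings, not the only way to handle bad data.

```python
import pandas as pd

raw = ["2011-07-06", "2011-08-06", "not a date"]

# An explicit format avoids ambiguity such as day-first versus month-first;
# errors="coerce" turns unparseable entries into NaT instead of raising.
parsed = pd.to_datetime(raw, format="%Y-%m-%d", errors="coerce")

print(parsed)
print(parsed.isna())
```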

## Time Series Basics

In [30]:
from datetime import datetime

In [31]:
dates = [datetime(2011,1,2), datetime(2011,1,5),
        datetime(2011,1,7), datetime(2011,1,8),
        datetime(2011,1,10),datetime(2011,1,12)]

In [32]:
ts = pd.Series(np.random.randn(6),index=dates)

In [33]:
ts

2011-01-02    1.364785
2011-01-05   -0.321324
2011-01-07    1.326197
2011-01-08   -0.228927
2011-01-10   -1.332210
2011-01-12   -0.001322
dtype: float64

In [34]:
ts.index

DatetimeIndex(['2011-01-02', '2011-01-05', '2011-01-07', '2011-01-08',
               '2011-01-10', '2011-01-12'],
              dtype='datetime64[ns]', freq=None)

In [35]:
ts + ts[::2]

2011-01-02    2.729569
2011-01-05         NaN
2011-01-07    2.652393
2011-01-08         NaN
2011-01-10   -2.664419
2011-01-12         NaN
dtype: float64

In [36]:
ts.index.dtype

dtype('<M8[ns]')

In [37]:
stamp = ts.index[0]

In [38]:
stamp

Timestamp('2011-01-02 00:00:00')

### Indexing, Selection, Subsetting

In [39]:
stamp = ts.index[2]

In [40]:
ts[stamp]

1.3261965231860695

In [41]:
ts['1/10/2011']

-1.332209624556109

In [42]:
ts['20110110']

-1.332209624556109

In [43]:
longer_ts = pd.Series(np.random.randn(1000),
                     index=pd.date_range('1/1/2000',periods=1000))

In [44]:
longer_ts

2000-01-01   -0.375957
2000-01-02    0.414984
2000-01-03    0.418597
2000-01-04   -1.093165
2000-01-05    0.019254
2000-01-06   -0.224017
2000-01-07    0.071741
2000-01-08    0.811030
2000-01-09   -0.935908
2000-01-10   -1.046649
2000-01-11   -1.679804
2000-01-12   -0.197962
2000-01-13    0.358677
2000-01-14    1.655436
2000-01-15    0.640914
2000-01-16   -1.499013
2000-01-17   -0.266807
2000-01-18    0.081125
2000-01-19   -0.564087
2000-01-20   -0.339505
2000-01-21   -0.061176
2000-01-22   -0.711080
2000-01-23    0.463092
2000-01-24   -0.120552
2000-01-25    1.451260
2000-01-26   -1.437088
2000-01-27   -0.215969
2000-01-28    0.679506
2000-01-29   -0.708180
2000-01-30   -1.174636
                ...   
2002-08-28   -1.573798
2002-08-29   -0.783258
2002-08-30   -1.542008
2002-08-31    2.240985
2002-09-01    0.168050
2002-09-02    0.709718
2002-09-03    1.374457
2002-09-04   -0.100368
2002-09-05   -0.411572
2002-09-06   -0.884620
2002-09-07    0.545929
2002-09-08    0.224830
2002-09-09 

In [45]:
longer_ts['2001']

2001-01-01   -1.321103
2001-01-02    0.995566
2001-01-03    0.498953
2001-01-04    0.670315
2001-01-05   -1.078907
2001-01-06   -2.598969
2001-01-07    1.070729
2001-01-08    1.010985
2001-01-09    1.409113
2001-01-10   -0.210245
2001-01-11   -1.109031
2001-01-12   -0.671163
2001-01-13   -0.508732
2001-01-14    0.760339
2001-01-15    0.152113
2001-01-16   -0.589778
2001-01-17   -0.272398
2001-01-18   -0.998765
2001-01-19    0.143124
2001-01-20   -0.721952
2001-01-21   -0.274681
2001-01-22   -0.180408
2001-01-23   -1.062802
2001-01-24    0.656307
2001-01-25   -0.829971
2001-01-26   -0.603053
2001-01-27    1.032417
2001-01-28    0.699085
2001-01-29    0.237735
2001-01-30   -0.468142
                ...   
2001-12-02   -0.608240
2001-12-03   -1.540225
2001-12-04    1.444776
2001-12-05   -0.309013
2001-12-06    1.083322
2001-12-07   -1.336315
2001-12-08    1.004830
2001-12-09    1.698817
2001-12-10   -0.091802
2001-12-11   -1.261593
2001-12-12   -0.047029
2001-12-13   -1.350778
2001-12-14 

In [46]:
longer_ts['2001-05']

2001-05-01    0.797053
2001-05-02   -0.020976
2001-05-03   -0.577981
2001-05-04   -1.149508
2001-05-05    1.298019
2001-05-06   -0.746170
2001-05-07    0.881667
2001-05-08   -0.145607
2001-05-09   -0.346363
2001-05-10    0.171650
2001-05-11    0.216135
2001-05-12    1.346533
2001-05-13   -0.620877
2001-05-14   -0.482117
2001-05-15   -1.181718
2001-05-16    0.018737
2001-05-17   -0.266644
2001-05-18    0.491837
2001-05-19   -0.034675
2001-05-20   -1.883232
2001-05-21   -0.898560
2001-05-22   -2.867016
2001-05-23   -1.502722
2001-05-24    0.174801
2001-05-25   -0.110023
2001-05-26    0.006042
2001-05-27   -0.863010
2001-05-28   -0.987402
2001-05-29    0.295883
2001-05-30    1.407031
2001-05-31   -0.721295
Freq: D, dtype: float64

In [47]:
ts[datetime(2011,1,7):]

2011-01-07    1.326197
2011-01-08   -0.228927
2011-01-10   -1.332210
2011-01-12   -0.001322
dtype: float64

In [48]:
ts

2011-01-02    1.364785
2011-01-05   -0.321324
2011-01-07    1.326197
2011-01-08   -0.228927
2011-01-10   -1.332210
2011-01-12   -0.001322
dtype: float64

In [49]:
ts['1/6/2011':'1/11/2011']

2011-01-07    1.326197
2011-01-08   -0.228927
2011-01-10   -1.332210
dtype: float64

In [50]:
ts.truncate(after='1/9/2011')

2011-01-02    1.364785
2011-01-05   -0.321324
2011-01-07    1.326197
2011-01-08   -0.228927
dtype: float64

In [51]:
dates = pd.date_range('1/1/2000',periods=100,freq='W-WED')

In [52]:
long_df = pd.DataFrame(np.random.randn(100,4),
                      index = dates,
                      columns = ['Colorado','Texas',
                                'New York', 'Ohio'])

In [53]:
long_df.loc['5-2001']

            Colorado     Texas  New York      Ohio
2001-05-02 -0.633942  0.514568  0.571904  0.419835
2001-05-09 -2.427609  1.159674  0.169423  0.333543
2001-05-16  1.527574  1.551149  0.225164 -0.163856
2001-05-23 -0.320643  0.982505  0.173152 -1.902307
2001-05-30 -1.493774 -0.156464  0.262588 -0.916553


### Time Series with Duplicate Indices

In [54]:
dates = pd.DatetimeIndex(['1/1/2000','1/2/2000','1/2/2000',
                          '1/2/2000','1/3/2000'])

In [55]:
dup_ts = pd.Series(np.arange(5),index=dates)

In [56]:
dup_ts

2000-01-01    0
2000-01-02    1
2000-01-02    2
2000-01-02    3
2000-01-03    4
dtype: int32

In [57]:
dup_ts.index.is_unique

False

In [58]:
dup_ts['1/3/2000'] # not duplicated

4

In [59]:
dup_ts['1/2/2000'] # duplicated

2000-01-02    1
2000-01-02    2
2000-01-02    3
dtype: int32

In [60]:
grouped = dup_ts.groupby(level=0)

In [61]:
grouped.mean()

2000-01-01    0
2000-01-02    2
2000-01-03    4
dtype: int32

In [62]:
grouped.count()

2000-01-01    1
2000-01-02    3
2000-01-03    1
dtype: int64
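
Aggregating with `groupby(level=0)` is one way to deal with repeated timestamps. If you would rather keep a single observation per timestamp, a sketch using `Index.duplicated` might look like this (keeping the first occurrence is an arbitrary choice for the example):

```python
import numpy as np
import pandas as pd

dates = pd.DatetimeIndex(["2000-01-01", "2000-01-02", "2000-01-02",
                          "2000-01-02", "2000-01-03"])
dup_ts = pd.Series(np.arange(5), index=dates)

# Boolean mask that is True for every repeat after the first occurrence.
is_repeat = dup_ts.index.duplicated(keep="first")
first_only = dup_ts[~is_repeat]

print(first_only)
```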

## Date Ranges, Frequencies, and Shifting

In [63]:
ts

2011-01-02    1.364785
2011-01-05   -0.321324
2011-01-07    1.326197
2011-01-08   -0.228927
2011-01-10   -1.332210
2011-01-12   -0.001322
dtype: float64

In [64]:
resampler = ts.resample('D') # 'D' stands for daily frequency

### Generating Date Ranges

In [65]:
index = pd.date_range('2012-04-01', '2012-06-01')

In [66]:
index

DatetimeIndex(['2012-04-01', '2012-04-02', '2012-04-03', '2012-04-04',
               '2012-04-05', '2012-04-06', '2012-04-07', '2012-04-08',
               '2012-04-09', '2012-04-10', '2012-04-11', '2012-04-12',
               '2012-04-13', '2012-04-14', '2012-04-15', '2012-04-16',
               '2012-04-17', '2012-04-18', '2012-04-19', '2012-04-20',
               '2012-04-21', '2012-04-22', '2012-04-23', '2012-04-24',
               '2012-04-25', '2012-04-26', '2012-04-27', '2012-04-28',
               '2012-04-29', '2012-04-30', '2012-05-01', '2012-05-02',
               '2012-05-03', '2012-05-04', '2012-05-05', '2012-05-06',
               '2012-05-07', '2012-05-08', '2012-05-09', '2012-05-10',
               '2012-05-11', '2012-05-12', '2012-05-13', '2012-05-14',
               '2012-05-15', '2012-05-16', '2012-05-17', '2012-05-18',
               '2012-05-19', '2012-05-20', '2012-05-21', '2012-05-22',
               '2012-05-23', '2012-05-24', '2012-05-25', '2012-05-26',
      

In [67]:
pd.date_range(start='2012-04-01', periods=20)

DatetimeIndex(['2012-04-01', '2012-04-02', '2012-04-03', '2012-04-04',
               '2012-04-05', '2012-04-06', '2012-04-07', '2012-04-08',
               '2012-04-09', '2012-04-10', '2012-04-11', '2012-04-12',
               '2012-04-13', '2012-04-14', '2012-04-15', '2012-04-16',
               '2012-04-17', '2012-04-18', '2012-04-19', '2012-04-20'],
              dtype='datetime64[ns]', freq='D')

In [68]:
pd.date_range(start='2012-06-01', periods=20)

DatetimeIndex(['2012-06-01', '2012-06-02', '2012-06-03', '2012-06-04',
               '2012-06-05', '2012-06-06', '2012-06-07', '2012-06-08',
               '2012-06-09', '2012-06-10', '2012-06-11', '2012-06-12',
               '2012-06-13', '2012-06-14', '2012-06-15', '2012-06-16',
               '2012-06-17', '2012-06-18', '2012-06-19', '2012-06-20'],
              dtype='datetime64[ns]', freq='D')

In [69]:
pd.date_range('2000-01-01','2000-12-01',freq='BM')

DatetimeIndex(['2000-01-31', '2000-02-29', '2000-03-31', '2000-04-28',
               '2000-05-31', '2000-06-30', '2000-07-31', '2000-08-31',
               '2000-09-29', '2000-10-31', '2000-11-30'],
              dtype='datetime64[ns]', freq='BM')

In [70]:
pd.date_range('2012-05-02 12:56:31', periods=5)

DatetimeIndex(['2012-05-02 12:56:31', '2012-05-03 12:56:31',
               '2012-05-04 12:56:31', '2012-05-05 12:56:31',
               '2012-05-06 12:56:31'],
              dtype='datetime64[ns]', freq='D')

In [71]:
pd.date_range('2012-05-02 12:56:31', periods=5, normalize=True)

DatetimeIndex(['2012-05-02', '2012-05-03', '2012-05-04', '2012-05-05',
               '2012-05-06'],
              dtype='datetime64[ns]', freq='D')

### Frequencies and Date Offsets

In [72]:
from pandas.tseries.offsets import Hour, Minute

In [73]:
hour = Hour()

In [74]:
hour

<Hour>

In [75]:
four_hours = Hour(4)

In [76]:
four_hours

<4 * Hours>

In [77]:
pd.date_range('2000-01-01', '2000-01-03 23:59', freq='4h')

DatetimeIndex(['2000-01-01 00:00:00', '2000-01-01 04:00:00',
               '2000-01-01 08:00:00', '2000-01-01 12:00:00',
               '2000-01-01 16:00:00', '2000-01-01 20:00:00',
               '2000-01-02 00:00:00', '2000-01-02 04:00:00',
               '2000-01-02 08:00:00', '2000-01-02 12:00:00',
               '2000-01-02 16:00:00', '2000-01-02 20:00:00',
               '2000-01-03 00:00:00', '2000-01-03 04:00:00',
               '2000-01-03 08:00:00', '2000-01-03 12:00:00',
               '2000-01-03 16:00:00', '2000-01-03 20:00:00'],
              dtype='datetime64[ns]', freq='4H')

In [78]:
Hour(2) + Minute(30)

<150 * Minutes>

In [79]:
pd.date_range('2000-01-01',periods=10,freq='1h30min')

DatetimeIndex(['2000-01-01 00:00:00', '2000-01-01 01:30:00',
               '2000-01-01 03:00:00', '2000-01-01 04:30:00',
               '2000-01-01 06:00:00', '2000-01-01 07:30:00',
               '2000-01-01 09:00:00', '2000-01-01 10:30:00',
               '2000-01-01 12:00:00', '2000-01-01 13:30:00'],
              dtype='datetime64[ns]', freq='90T')

#### Week of month dates

In [80]:
rng = pd.date_range('2012-01-01','2012-09-01',freq='WOM-3FRI')

In [81]:
list(rng)

[Timestamp('2012-01-20 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-02-17 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-03-16 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-04-20 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-05-18 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-06-15 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-07-20 00:00:00', freq='WOM-3FRI'),
 Timestamp('2012-08-17 00:00:00', freq='WOM-3FRI')]

### Shifting (Leading and Lagging) Data

In [82]:
ts = pd.Series(np.random.randn(4),
              index=pd.date_range('1/1/2000',periods=4,freq='M'))

In [83]:
ts

2000-01-31    0.492233
2000-02-29   -1.475239
2000-03-31    0.305259
2000-04-30   -1.020428
Freq: M, dtype: float64

In [84]:
ts.shift(2)

2000-01-31         NaN
2000-02-29         NaN
2000-03-31    0.492233
2000-04-30   -1.475239
Freq: M, dtype: float64

In [85]:
ts.shift(-2)

2000-01-31    0.305259
2000-02-29   -1.020428
2000-03-31         NaN
2000-04-30         NaN
Freq: M, dtype: float64

In [86]:
ts.shift(2,freq='M')

2000-03-31    0.492233
2000-04-30   -1.475239
2000-05-31    0.305259
2000-06-30   -1.020428
Freq: M, dtype: float64

In [87]:
ts.shift(3,freq='D')

2000-02-03    0.492233
2000-03-03   -1.475239
2000-04-03    0.305259
2000-05-03   -1.020428
dtype: float64

In [88]:
ts.shift(1,freq='90T')

2000-01-31 01:30:00    0.492233
2000-02-29 01:30:00   -1.475239
2000-03-31 01:30:00    0.305259
2000-04-30 01:30:00   -1.020428
Freq: M, dtype: float64
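
A common use of `shift` is computing the change from one observation to the next. A minimal sketch, using made-up daily prices:

```python
import pandas as pd

# Hypothetical prices on four consecutive days.
prices = pd.Series([100.0, 110.0, 99.0, 99.0],
                   index=pd.date_range("2000-01-01", periods=4, freq="D"))

# Compare each value with the previous one; the first result is NaN
# because there is nothing to compare against.
pct = prices / prices.shift(1) - 1

print(pct)
```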

#### Shifting dates with offsets

In [89]:
from pandas.tseries.offsets import Day, MonthEnd

In [90]:
now = datetime(2011,11,17)

In [91]:
now + 3 * Day()

Timestamp('2011-11-20 00:00:00')

In [92]:
now + MonthEnd()

Timestamp('2011-11-30 00:00:00')

In [93]:
now + MonthEnd(2)

Timestamp('2011-12-31 00:00:00')

In [94]:
offset = MonthEnd()

In [95]:
offset.rollforward(now)

Timestamp('2011-11-30 00:00:00')

In [96]:
offset.rollback(now)

Timestamp('2011-10-31 00:00:00')

In [97]:
ts = pd.Series(np.random.randn(20),
              index=pd.date_range('1/15/2000',periods=20,freq='4d'))

In [98]:
ts

2000-01-15   -0.493803
2000-01-19    0.868855
2000-01-23   -0.605482
2000-01-27   -0.722976
2000-01-31   -0.710845
2000-02-04    0.699962
2000-02-08    1.551186
2000-02-12   -0.690805
2000-02-16    0.258408
2000-02-20   -1.498373
2000-02-24   -0.299396
2000-02-28   -0.030276
2000-03-03   -0.130639
2000-03-07    1.236913
2000-03-11    0.192483
2000-03-15   -0.640080
2000-03-19   -0.483829
2000-03-23    0.211362
2000-03-27   -1.517312
2000-03-31   -1.056424
Freq: 4D, dtype: float64

In [99]:
ts.groupby(offset.rollforward).mean()

2000-01-31   -0.332850
2000-02-29   -0.001328
2000-03-31   -0.273441
dtype: float64

In [100]:
ts.resample('M').mean()

2000-01-31   -0.332850
2000-02-29   -0.001328
2000-03-31   -0.273441
Freq: M, dtype: float64

## Time Zone Handling

In [101]:
import pytz

In [102]:
pytz.common_timezones[-5:]

['US/Eastern', 'US/Hawaii', 'US/Mountain', 'US/Pacific', 'UTC']

In [103]:
tz = pytz.timezone('America/New_York')

In [104]:
tz

<DstTzInfo 'America/New_York' LMT-1 day, 19:04:00 STD>

### Time Zone Localization and Conversion

By default, time series in pandas are __time zone naive__. For example, consider the following time series:

In [105]:
rng = pd.date_range('3/9/2012 9:30',periods=6,freq='D')

In [106]:
ts = pd.Series(np.random.randn(len(rng)),index=rng)

In [107]:
ts

2012-03-09 09:30:00   -1.944320
2012-03-10 09:30:00   -0.101766
2012-03-11 09:30:00   -0.104661
2012-03-12 09:30:00   -0.097068
2012-03-13 09:30:00    0.750070
2012-03-14 09:30:00    0.002955
Freq: D, dtype: float64

In [108]:
print(ts.index.tz)

None


In [109]:
pd.date_range('3/9/2012 9:30',periods=10,freq='D',tz='UTC')

DatetimeIndex(['2012-03-09 09:30:00+00:00', '2012-03-10 09:30:00+00:00',
               '2012-03-11 09:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 09:30:00+00:00', '2012-03-14 09:30:00+00:00',
               '2012-03-15 09:30:00+00:00', '2012-03-16 09:30:00+00:00',
               '2012-03-17 09:30:00+00:00', '2012-03-18 09:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='D')

Conversion from naive to __localized__ is handled by the __tz_localize__ method:

In [110]:
ts

2012-03-09 09:30:00   -1.944320
2012-03-10 09:30:00   -0.101766
2012-03-11 09:30:00   -0.104661
2012-03-12 09:30:00   -0.097068
2012-03-13 09:30:00    0.750070
2012-03-14 09:30:00    0.002955
Freq: D, dtype: float64

In [111]:
ts_utc = ts.tz_localize('UTC')

In [112]:
ts_utc

2012-03-09 09:30:00+00:00   -1.944320
2012-03-10 09:30:00+00:00   -0.101766
2012-03-11 09:30:00+00:00   -0.104661
2012-03-12 09:30:00+00:00   -0.097068
2012-03-13 09:30:00+00:00    0.750070
2012-03-14 09:30:00+00:00    0.002955
Freq: D, dtype: float64

In [113]:
ts_utc.index

DatetimeIndex(['2012-03-09 09:30:00+00:00', '2012-03-10 09:30:00+00:00',
               '2012-03-11 09:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 09:30:00+00:00', '2012-03-14 09:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq='D')

In [114]:
ts_utc.tz_convert('America/New_York')

2012-03-09 04:30:00-05:00   -1.944320
2012-03-10 04:30:00-05:00   -0.101766
2012-03-11 05:30:00-04:00   -0.104661
2012-03-12 05:30:00-04:00   -0.097068
2012-03-13 05:30:00-04:00    0.750070
2012-03-14 05:30:00-04:00    0.002955
Freq: D, dtype: float64

In [115]:
ts_eastern = ts.tz_localize('America/New_York')

In [116]:
ts_eastern.tz_convert('UTC')

2012-03-09 14:30:00+00:00   -1.944320
2012-03-10 14:30:00+00:00   -0.101766
2012-03-11 13:30:00+00:00   -0.104661
2012-03-12 13:30:00+00:00   -0.097068
2012-03-13 13:30:00+00:00    0.750070
2012-03-14 13:30:00+00:00    0.002955
Freq: D, dtype: float64

In [117]:
ts_eastern.tz_convert('Europe/Berlin')

2012-03-09 15:30:00+01:00   -1.944320
2012-03-10 15:30:00+01:00   -0.101766
2012-03-11 14:30:00+01:00   -0.104661
2012-03-12 14:30:00+01:00   -0.097068
2012-03-13 14:30:00+01:00    0.750070
2012-03-14 14:30:00+01:00    0.002955
Freq: D, dtype: float64

In [118]:
ts.index.tz_localize('Asia/Shanghai')

DatetimeIndex(['2012-03-09 09:30:00+08:00', '2012-03-10 09:30:00+08:00',
               '2012-03-11 09:30:00+08:00', '2012-03-12 09:30:00+08:00',
               '2012-03-13 09:30:00+08:00', '2012-03-14 09:30:00+08:00'],
              dtype='datetime64[ns, Asia/Shanghai]', freq='D')

### Operations with Time Zone-Aware Timestamp Objects

In [119]:
stamp = pd.Timestamp('2011-03-12 04:00')

In [120]:
stamp_utc = stamp.tz_localize('utc')

In [121]:
stamp_utc.tz_convert('America/New_York')

Timestamp('2011-03-11 23:00:00-0500', tz='America/New_York')

In [123]:
stamp_moscow = pd.Timestamp('2011-03-12 04:00',tz='Europe/Moscow')

In [124]:
stamp_moscow

Timestamp('2011-03-12 04:00:00+0300', tz='Europe/Moscow')

In [125]:
stamp_utc.value

1299902400000000000

In [126]:
stamp_utc.tz_convert('America/New_York').value

1299902400000000000

In [127]:
from pandas.tseries.offsets import Hour

In [128]:
stamp = pd.Timestamp('2012-03-12 01:30',tz='US/Eastern')

In [129]:
stamp

Timestamp('2012-03-12 01:30:00-0400', tz='US/Eastern')

In [130]:
stamp + Hour()

Timestamp('2012-03-12 02:30:00-0400', tz='US/Eastern')

In [131]:
stamp = pd.Timestamp('2012-11-04 00:30',tz='US/Eastern')

In [132]:
stamp

Timestamp('2012-11-04 00:30:00-0400', tz='US/Eastern')

In [133]:
stamp + 2*Hour()

Timestamp('2012-11-04 01:30:00-0500', tz='US/Eastern')

### Operations Between Different Time Zones

In [134]:
rng = pd.date_range('3/7/2012 9:30',periods=10,freq='B')

In [135]:
ts = pd.Series(np.random.randn(len(rng)),index=rng)

In [136]:
ts

2012-03-07 09:30:00    1.306208
2012-03-08 09:30:00   -1.268594
2012-03-09 09:30:00    0.450115
2012-03-12 09:30:00   -0.934062
2012-03-13 09:30:00    1.079659
2012-03-14 09:30:00   -1.277743
2012-03-15 09:30:00   -1.213397
2012-03-16 09:30:00   -1.253986
2012-03-19 09:30:00   -0.717441
2012-03-20 09:30:00   -1.150115
Freq: B, dtype: float64

In [137]:
ts1 = ts[:7].tz_localize('Europe/London')

In [138]:
ts2 = ts[2:].tz_localize('Europe/Moscow')

In [139]:
result = ts1 + ts2

In [140]:
result.index

DatetimeIndex(['2012-03-07 09:30:00+00:00', '2012-03-08 09:30:00+00:00',
               '2012-03-09 05:30:00+00:00', '2012-03-09 09:30:00+00:00',
               '2012-03-12 05:30:00+00:00', '2012-03-12 09:30:00+00:00',
               '2012-03-13 05:30:00+00:00', '2012-03-13 09:30:00+00:00',
               '2012-03-14 05:30:00+00:00', '2012-03-14 09:30:00+00:00',
               '2012-03-15 05:30:00+00:00', '2012-03-15 09:30:00+00:00',
               '2012-03-16 05:30:00+00:00', '2012-03-19 05:30:00+00:00',
               '2012-03-20 05:30:00+00:00'],
              dtype='datetime64[ns, UTC]', freq=None)

## Periods and Period Arithmetic

In [141]:
p = pd.Period(2007,freq='A-DEC')

In [142]:
p

Period('2007', 'A-DEC')

In [143]:
p + 5

Period('2012', 'A-DEC')

In [145]:
p - 2

Period('2005', 'A-DEC')

In [146]:
pd.Period('2014',freq='A-DEC')-p

7

In [147]:
rng = pd.period_range('2000-01-01','2000-06-30',freq='M')

In [148]:
rng

PeriodIndex(['2000-01', '2000-02', '2000-03', '2000-04', '2000-05', '2000-06'], dtype='period[M]', freq='M')

In [149]:
pd.Series(np.random.randn(6),index=rng)

2000-01   -0.781614
2000-02   -0.756710
2000-03   -0.781056
2000-04    0.234712
2000-05    0.475815
2000-06    0.012533
Freq: M, dtype: float64

In [150]:
values = ['2001Q3','2002Q2','2003Q1']

In [151]:
index = pd.PeriodIndex(values,freq='Q-DEC')

In [152]:
index

PeriodIndex(['2001Q3', '2002Q2', '2003Q1'], dtype='period[Q-DEC]', freq='Q-DEC')

### Period Frequency Conversion

In [153]:
p = pd.Period('2007',freq='A-DEC')

In [154]:
p

Period('2007', 'A-DEC')

In [155]:
p.asfreq('M',how='start')

Period('2007-01', 'M')

In [156]:
p.asfreq('M',how='end')

Period('2007-12', 'M')

In [157]:
p = pd.Period('2007',freq='A-JUN')

In [158]:
p

Period('2007', 'A-JUN')

In [159]:
p.asfreq('M','start')

Period('2006-07', 'M')

In [160]:
p.asfreq('M','end')

Period('2007-06', 'M')

In [161]:
p = pd.Period('Aug-2007','M')

In [162]:
p.asfreq('A-JUN')

Period('2008', 'A-JUN')

In [163]:
rng = pd.period_range('2006','2009',freq='A-DEC')

In [164]:
ts = pd.Series(np.random.randn(len(rng)),index=rng)

In [165]:
ts

2006   -1.346903
2007   -2.330645
2008    0.349118
2009   -0.030657
Freq: A-DEC, dtype: float64

In [166]:
ts.asfreq('M',how='start')

2006-01   -1.346903
2007-01   -2.330645
2008-01    0.349118
2009-01   -0.030657
Freq: M, dtype: float64

In [167]:
ts.asfreq('B',how='end')

2006-12-29   -1.346903
2007-12-31   -2.330645
2008-12-31    0.349118
2009-12-31   -0.030657
Freq: B, dtype: float64

### Quarterly Period Frequencies
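
A minimal sketch of a quarterly period, assuming a fiscal year that ends in January (the `Q-JAN` frequency); the choice of 2012Q4 is arbitrary.

```python
import pandas as pd

# Fourth quarter of a fiscal year that ends in January.
p = pd.Period("2012Q4", freq="Q-JAN")

# The quarter runs from November 2011 through January 2012.
first_day = p.asfreq("D", how="start")
last_day = p.asfreq("D", how="end")

print(first_day, last_day)
```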

### Converting Timestamps to Periods (and Back)
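
A short sketch of the round trip, using six made-up daily observations that straddle a month boundary:

```python
import numpy as np
import pandas as pd

# Six consecutive days spanning the end of January.
rng = pd.date_range("2000-01-29", periods=6, freq="D")
ts = pd.Series(np.arange(6), index=rng)

# Each timestamp falls in exactly one monthly period, so the new
# index may contain repeated periods.
pts = ts.to_period("M")
print(pts)

# Converting back uses the start of each period by default.
back = pts.to_timestamp()
print(back.index[0])
```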

### Creating a PeriodIndex from Arrays

## Resampling and Frequency Conversion

### Downsampling 
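
As a sketch of downsampling, the following aggregates made-up one-minute data into five-minute bins and shows how the `closed` and `label` arguments change which points land in each bin:

```python
import numpy as np
import pandas as pd

# Twelve one-minute observations with values 0..11.
rng = pd.date_range("2000-01-01", periods=12, freq="min")
ts = pd.Series(np.arange(12), index=rng)

# By default, five-minute bins include their left edge and are labeled by it.
left = ts.resample("5min").sum()

# Closing and labeling on the right edge shifts which points fall in each bin.
right = ts.resample("5min", closed="right", label="right").sum()

print(left)
print(right)
```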

### Upsampling and Interpolation

### Resampling with Periods

## Moving Window Functions
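
A minimal sketch of a moving-window calculation with `rolling`, using a made-up five-day series and a window of three observations:

```python
import pandas as pd

ts = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0],
               index=pd.date_range("2000-01-01", periods=5, freq="D"))

# A 3-observation window needs three values, so the first two results are NaN.
strict = ts.rolling(3).mean()

# min_periods relaxes that requirement at the start of the series.
relaxed = ts.rolling(3, min_periods=1).mean()

print(strict)
print(relaxed)
```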

### Exponentially Weighted Functions

### Binary Moving Window Functions

### User-Defined Moving Window Functions