# Time Series
---
DAT 512 Canisius College <br>
Professor Paul Lambson<br>
<br>
### Learning Objectives<br>
- Understand Python date and time types
- Work with time series
- Become familiar with ranges, frequencies, and shifting
<br>


### Sections
- [Date and Time Data Types and Tools](#date_and_time_data_types_and_tools)
- [Converting Between String and Datetime](#converting_between_string_and_datetime)
- [Time Series Basics](#timer_series_basics)
- [Indexing, Selection, Subsetting](#indexing_selection_subsetting)
- [Time Series with Duplicate Indices](#time_series_with_duplicate_indices)
- [Date Ranges, Frequencies, and Shifting](#date_ranges_frequencies_and_shifting)
- [Generating Date Ranges](#generating_date_ranges)
- [Frequencies and Date Offsets](#frequencies_and_date_offsets)
- [Shifting (Leading and Lagging) Data](#shifting)
- [Time Zone Handling](#time_zone_handling)
- [Time Zone Localization and Conversion](#tz_localization)
- [Operations with Time Zone-Aware Timestamp Objects](#operations_with_tz)
- [Operations Between Different Time Zones](#operations_between_tz)

In [None]:
import numpy as np
import pandas as pd
np.random.seed(12345)
import matplotlib.pyplot as plt
plt.rc("figure", figsize=(10, 6))


# Date and Time Data Types and Tools
<a id='date_and_time_data_types_and_tools'></a>

In [None]:
# import most common date library
from datetime import datetime
now = datetime.now()
now
now.year, now.month, now.day

In [None]:
# time math!
delta = datetime(2011, 1, 7) - datetime(2008, 6, 24, 8, 15)
delta
delta.days
delta.seconds
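
One detail about the cell above is easy to miss: `delta.seconds` holds only the seconds left over after whole days are removed, not the length of the whole interval. The following is a minimal sketch using the same two dates. The variable names are illustrative, and `total_seconds()` is the standard-library method that combines both components.

```python
from datetime import datetime

delta = datetime(2011, 1, 7) - datetime(2008, 6, 24, 8, 15)

# days and seconds are separate components of the same timedelta
whole_days = delta.days            # 926
leftover_seconds = delta.seconds   # 56700, i.e. 15 hours 45 minutes

# total_seconds() folds both components into a single number
total = delta.total_seconds()
print(whole_days, leftover_seconds, total)
```

Reading `delta.seconds` as "the interval in seconds" would understate this interval by 926 days.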

In [None]:
# add and subtract time
from datetime import timedelta
start = datetime(2011, 1, 7)
start + timedelta(12)
start - 2 * timedelta(12)

# Converting Between String and Datetime
<a id='converting_between_string_and_datetime'></a>

In [None]:
# convert to a string, using a prescribed format
stamp = datetime(2011, 1, 3)
str(stamp)
stamp.strftime("%Y-%m-%d")

## datetime specifications
| Type | Description|
|:---|:---|
|`%Y`| Four-digit year|
|`%y`| Two-digit year|
|`%m`| Two-digit month [01,12]|
|`%d`|Two-digit day [01,31]|
|`%H`| Hour (24-hour clock) [00,23]|
|`%I`| Hour (12-hour clock) [01,12]|
|`%M`| Two-digit minute [00,59]|
|`%S`| Second [00, 61] (seconds 60, 61 account for leap seconds)|
|`%f`| Microsecond as an integer, zero-padded (from 000000 to 999999)|
|`%j`| Day of the year as a zero-padded integer (from 001 to 366)|
|`%w`| Weekday as an integer [0 (Sunday), 6]|
|`%u`| Weekday as an integer starting from 1, where 1 is Monday.|
|`%U`| Week number of the year [00, 53]; Sunday is considered the first day of the week, and days before the first Sunday of the year are “week 0”|
|`%W`|Week number of the year [00, 53]; Monday is considered the first day of the week, and days before the first Monday of the year are “week 0”|
|`%z`|UTC time zone offset as +HHMM or -HHMM; empty if time zone naive|
|`%Z`|Time zone name as a string, or empty string if no time zone|
|`%F`|Shortcut for %Y-%m-%d (e.g., 2012-04-18)|
|`%D`|Shortcut for %m/%d/%y (e.g., 04/18/12)|
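
To make a few of the codes in the table concrete, here is a small sketch that formats one fixed datetime several ways and then parses it back. The timestamp and variable names are chosen only for illustration, and the sketch sticks to codes that behave the same across platforms.

```python
from datetime import datetime

stamp = datetime(2012, 4, 18, 16, 20, 57)

iso_like = stamp.strftime("%Y-%m-%d %H:%M:%S")  # four-digit year, 24-hour clock
short_year = stamp.strftime("%y")               # two-digit year
twelve_hour = stamp.strftime("%I")              # 12-hour clock, zero-padded
day_of_year = stamp.strftime("%j")              # zero-padded day of the year
weekday = stamp.strftime("%w")                  # 0 is Sunday

# parsing with the same codes recovers the original value
round_trip = datetime.strptime(iso_like, "%Y-%m-%d %H:%M:%S")
print(iso_like, short_year, twelve_hour, day_of_year, weekday)
```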

In [None]:
# the same format codes convert strings to dates
# using datetime.strptime (but some codes, like %F, cannot be used)
value = "2011-01-03"
datetime.strptime(value, "%Y-%m-%d")
datestrs = ["7/6/2011", "8/6/2011"]
[datetime.strptime(x, "%m/%d/%Y") for x in datestrs]

In [None]:
# convert a list of date strings to a pandas DatetimeIndex
datestrs = ["2011-07-06 12:00:00", "2011-08-06 00:00:00"]
pd.to_datetime(datestrs)

In [None]:
# missing values such as None become NaT (Not a Time)
idx = pd.to_datetime(datestrs + [None])
idx
idx[2]
pd.isna(idx)
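
Besides `None`, real data often contains strings that are not dates at all. A hedged sketch of one way to handle that is the `errors="coerce"` option of `pd.to_datetime`, which converts anything unparseable to `NaT` instead of raising. The input list here is made up for illustration.

```python
import pandas as pd

raw = ["2011-07-06 12:00:00", "not a date", None]

# errors="coerce" turns unparseable entries into NaT instead of raising
parsed = pd.to_datetime(raw, errors="coerce")

# NaT counts as missing for pd.isna
missing_mask = pd.isna(parsed)
print(parsed)
print(missing_mask)
```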

### Locale-specific Date Formatting

|Type|Description|
|:---|:---|
|`%a`|Abbreviated weekday name|
|`%A`|Full weekday name|
|`%b`|Abbreviated month name|
|`%B`|Full month name|
|`%c`|Full date and time (e.g., ‘Tue 01 May 2012 04:20:57 PM’)|
|`%p`|Locale equivalent of AM or PM|
|`%x`|Locale-appropriate formatted date (e.g., in the United States, May 1, 2012 yields ’05/01/2012’)|
|`%X`|Locale-appropriate time (e.g., ’04:24:12 PM’)|

# Time Series Basics
<a id='timer_series_basics'></a>

In [None]:
# a series indexed by dates, pretty basic
dates = [datetime(2011, 1, 2), datetime(2011, 1, 5),
         datetime(2011, 1, 7), datetime(2011, 1, 8),
         datetime(2011, 1, 10), datetime(2011, 1, 12)]
ts = pd.Series(np.random.standard_normal(6), index=dates)
ts

In [None]:
# the index was converted to a DatetimeIndex
ts.index

In [None]:
# Like other Series, arithmetic operations between differently 
# indexed time series automatically align on the dates
ts + ts[::2]
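
Because `ts` above holds random values, the effect of alignment can be hard to see. This small sketch repeats the idea with fixed numbers (the series `s` is invented for the example): dates present in only one operand come out as `NaN`.

```python
import pandas as pd

idx = pd.date_range("2011-01-01", periods=4)   # four consecutive days
s = pd.Series([1.0, 2.0, 3.0, 4.0], index=idx)

# s[::2] keeps only the 1st and 3rd dates; addition aligns on the index
total = s + s[::2]
print(total)
```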

In [None]:
# pandas stores timestamps using NumPy’s datetime64 data type at nanosecond resolution
ts.index.dtype

In [None]:
# Scalar values from a DatetimeIndex are pandas Timestamp objects
stamp = ts.index[0]
stamp

# Indexing, Selection, Subsetting
<a id='indexing_selection_subsetting'></a>

In [None]:
# use index notation to select
stamp = ts.index[2]
ts[stamp]

In [None]:
# a string that can be interpreted as a date works as well (no time component needed)
ts["2011-01-10"]

In [None]:
# pd.date_range can be used to make an extended index
longer_ts = pd.Series(np.random.standard_normal(1000),
                      index=pd.date_range("2000-01-01", periods=1000))
longer_ts

In [None]:
# "2001" is interpreted as a year and selects all rows in that year
longer_ts["2001"]

In [None]:
# "2001-05" is interpreted as a year and month
longer_ts["2001-05"]

In [None]:
# slice with datetime objects
ts[datetime(2011, 1, 7):]


In [None]:
ts[datetime(2011, 1, 7):datetime(2011, 1, 10)]

In [None]:
ts

In [None]:
# slice with date strings; the endpoints need not appear in the index
ts["2011-01-06":"2011-01-11"]

In [None]:
# use truncate method
ts.truncate(after="2011-01-09")
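
The slicing cells above rely on two behaviors worth spelling out: both endpoints of a date slice are inclusive, and an endpoint does not have to be an existing index label. Here is a minimal sketch with the same six dates but integer values (chosen so the result is easy to read).

```python
from datetime import datetime
import pandas as pd

dates = [datetime(2011, 1, 2), datetime(2011, 1, 5), datetime(2011, 1, 7),
         datetime(2011, 1, 8), datetime(2011, 1, 10), datetime(2011, 1, 12)]
s = pd.Series(range(6), index=dates)

# neither endpoint is in the index; the slice acts as a range query
window = s.loc["2011-01-06":"2011-01-11"]

# endpoints that are in the index are included on both sides
inclusive = s.loc["2011-01-05":"2011-01-08"]
print(window)
print(inclusive)
```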

In [None]:
# this all holds true with a dataframe
dates = pd.date_range("2000-01-01", periods=100, freq="W-WED")
long_df = pd.DataFrame(np.random.standard_normal((100, 4)),
                       index=dates,
                       columns=["Colorado", "Texas",
                                "New York", "Ohio"])
long_df

In [None]:
long_df.loc["2001-05"]

# Time Series with Duplicate Indices
<a id='time_series_with_duplicate_indices'></a>

In [None]:
# there may be multiple data observations falling on a particular timestamp
dates = pd.DatetimeIndex(["2000-01-01", "2000-01-02", "2000-01-02",
                          "2000-01-02", "2000-01-03"])
dup_ts = pd.Series(np.arange(5), index=dates)
dup_ts

In [None]:
# logic check on index
dup_ts.index.is_unique

In [None]:
dup_ts["2000-01-03"]  # not duplicated

In [None]:
dup_ts["2000-01-02"]  # duplicated

In [None]:
# aggregate to a unique level
grouped = dup_ts.groupby(level=0)
grouped.mean()

In [None]:
grouped.count()
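
Aggregating is one way to get back to a unique index. Another option, shown here as a sketch rather than a recommendation, is to keep only the first observation for each timestamp using `Index.duplicated`.

```python
import numpy as np
import pandas as pd

dates = pd.DatetimeIndex(["2000-01-01", "2000-01-02", "2000-01-02",
                          "2000-01-02", "2000-01-03"])
dup_ts = pd.Series(np.arange(5), index=dates)

# duplicated() marks every repeat of a label after its first occurrence
is_repeat = dup_ts.index.duplicated(keep="first")
first_only = dup_ts[~is_repeat]
print(first_only)
```

Whether to aggregate or to drop repeats depends on what the duplicate rows mean in the data at hand.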

# Date Ranges, Frequencies, and Shifting
<a id='date_ranges_frequencies_and_shifting'></a>

In [None]:
ts

In [None]:
resampler = ts.resample("D")
resampler

# Generating Date Ranges
<a id='generating_date_ranges'></a>

In [None]:
# By default, pandas.date_range generates daily timestamps
index = pd.date_range("2012-04-01", "2012-06-01")
index

In [None]:
# other options for creating date indexes: a start or end date plus a number of periods
pd.date_range(start="2012-04-01", periods=20)

In [None]:
pd.date_range(end="2012-06-01", periods=20)

In [None]:
# BM is the last business day of a month
pd.date_range("2000-01-01", "2000-12-01", freq="BM")

### Base time series frequencies

|Alias|Offset type|Description|
|:---:|:---:|:---:|
|`D`|`Day`|Calendar daily|
|`B`|`BusinessDay`|Business daily|
|`H`|`Hour`|Hourly|
|`T` or `min`|`Minute`|Once a minute|
|`S`|`Second`|Once a second|
|`L` or `ms`|`Milli`|Millisecond (1/1,000 of 1 second)|
|`U`|`Micro`|Microsecond (1/1,000,000 of 1 second)|
|`M`|`MonthEnd`|Last calendar day of month|
|`BM`|`BusinessMonthEnd`|Last business day (weekday) of month|
|`MS`|`MonthBegin`|First calendar day of month|
|`BMS`|`BusinessMonthBegin`|First weekday of month|
|`W-MON`, `W-TUE`, `...`|`Week`|Weekly on given day of week (MON, TUE, WED, THU, FRI, SAT, or SUN)|
|`WOM-1MON`, `WOM-2MON`, `...`|`WeekOfMonth`|Generate weekly dates in the first, second, third, or fourth week of the month (e.g., WOM-3FRI for the third Friday of each month)|
|`Q-JAN`, `Q-FEB`, `...`|`QuarterEnd`|Quarterly dates anchored on last calendar day of each month, for year ending in indicated month (JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, or DEC)|
|`BQ-JAN`, `BQ-FEB`, `...`|`BusinessQuarterEnd`|Quarterly dates anchored on last weekday of each month, for year ending in indicated month|
|`QS-JAN`, `QS-FEB`, `...`|`QuarterBegin`|Quarterly dates anchored on first calendar day of each month, for year ending in indicated month|
|`BQS-JAN`, `BQS-FEB`, `...`|`BusinessQuarterBegin`|Quarterly dates anchored on first weekday of each month, for year ending in indicated month|
|`A-JAN`, `A-FEB`, `...`|`YearEnd`|Annual dates anchored on last calendar day of given month (JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, or DEC)|
|`BA-JAN`, `BA-FEB`, `...`|`BusinessYearEnd`|Annual dates anchored on last weekday of given month|
|`AS-JAN`, `AS-FEB`, `...`|`YearBegin`|Annual dates anchored on first day of given month|
|`BAS-JAN`, `BAS-FEB`, `...`|`BusinessYearBegin`|Annual dates anchored on first weekday of given month|

In [None]:
# time is preserved
pd.date_range("2012-05-02 12:56:31", periods=5)

In [None]:
# can be normalized to midnight
pd.date_range("2012-05-02 12:56:31", periods=5, normalize=True)

# Frequencies and Date Offsets
<a id='frequencies_and_date_offsets'></a>

In [None]:
# a frequency or offset can be created
from pandas.tseries.offsets import Hour, Minute
hour = Hour()
hour

In [None]:
# now four hours exists and can be used as a freq
four_hours = Hour(4)
four_hours

In [None]:
# putting an integer before a frequency alias creates a multiple of that frequency
pd.date_range("2000-01-01", "2000-01-03 23:59", freq="4H")

In [None]:
# Many offsets can be combined by addition
Hour(2) + Minute(30)

In [None]:
# pass frequency strings, like "1h30min", 
# that will effectively be parsed to the same expression
pd.date_range("2000-01-01", periods=10, freq="1h30min")

In [None]:
# Week of Month Dates
monthly_dates = pd.date_range("2012-01-01", "2012-09-01", freq="WOM-3FRI")
list(monthly_dates)
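
As a quick sanity check on what `WOM-3FRI` produces, the sketch below confirms that every generated date is a Friday and falls on day 15 through 21 of its month, which is where a third Friday must land. The variable names are illustrative.

```python
import pandas as pd

third_fridays = pd.date_range("2012-01-01", "2012-09-01", freq="WOM-3FRI")

# dayofweek counts from Monday=0, so Friday is 4
all_fridays = bool((third_fridays.dayofweek == 4).all())

# the third occurrence of any weekday falls on day 15 through 21
in_third_week = bool(((third_fridays.day >= 15) & (third_fridays.day <= 21)).all())
print(len(third_fridays), all_fridays, in_third_week)
```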

# Shifting (Leading and Lagging) Data
<a id='shifting'></a>

In [None]:
# create a monthly series
ts = pd.Series(np.random.standard_normal(4),
               index=pd.date_range("2000-01-01", periods=4, freq="M"))
ts

In [None]:
# shift forward
ts.shift(2)

In [None]:
# shift backwards
ts.shift(-2)

In [None]:
# specify the freq
ts.shift(2, freq="M")

In [None]:
# shift on date
ts.shift(3, freq="D")

In [None]:
# shift on 90 minutes
ts.shift(1, freq="90T")
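
The two kinds of shift above differ in what moves. Without `freq`, the values slide along a fixed index; with `freq`, the index itself is advanced. A small sketch with made-up prices shows both, along with the common use of a naive shift to compute period-over-period returns.

```python
import pandas as pd

prices = pd.Series([100.0, 110.0, 121.0],
                   index=pd.date_range("2000-01-01", periods=3, freq="D"))

# naive shift: values move down one row, the index is unchanged,
# and the first value becomes NaN
lagged = prices.shift(1)
returns = prices / lagged - 1

# shift with freq: the index moves and no data is discarded
moved = prices.shift(1, freq="D")
print(returns)
print(moved)
```

The `returns` series here is the same quantity that `pct_change` computes.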

In [None]:
# Shifting dates with offsets

In [None]:
# create an offset to shift with
from pandas.tseries.offsets import Day, MonthEnd
now = datetime(2011, 11, 17)
now + 3 * Day()

In [None]:
# adding an anchored offset such as MonthEnd rolls the date forward to the next month end
now + MonthEnd()

In [None]:
now + MonthEnd(2)

In [None]:
# Anchored offsets can explicitly "roll" dates forward or backward
# by using their rollforward and rollback methods
offset = MonthEnd()

In [None]:
offset.rollforward(now)

In [None]:
offset.rollback(now)
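
Rolling and adding an offset behave differently when the date already sits on the anchor. The sketch below, using dates picked for illustration, compares `rollforward` and `+ MonthEnd()` for a mid-month date and for a month-end date.

```python
import pandas as pd
from pandas.tseries.offsets import MonthEnd

mid_month = pd.Timestamp("2011-11-17")
month_end = pd.Timestamp("2011-11-30")
offset = MonthEnd()

forward = offset.rollforward(mid_month)   # month end on or after the date
back = offset.rollback(mid_month)         # month end on or before the date

# a date already on the anchor is left alone by rollforward ...
unchanged = offset.rollforward(month_end)
# ... but adding the offset always advances to the following month end
advanced = month_end + offset
print(forward, back, unchanged, advanced)
```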

In [None]:
# a series sampled every four days, to be grouped by month end
ts = pd.Series(np.random.standard_normal(20),
               index=pd.date_range("2000-01-15", periods=20, freq="4D"))
ts

In [None]:
ts.groupby(MonthEnd().rollforward).mean()

In [None]:
# resample gives the same monthly means more directly
ts.resample("M").mean()

# Time Zone Handling
<a id='time_zone_handling'></a>

In [None]:
# import a timezone handler
import pytz
pytz.common_timezones[-5:]

In [None]:
# get a time zone object from pytz, use pytz.timezone
tz = pytz.timezone("America/New_York")
tz

# Time Zone Localization and Conversion
<a id='tz_localization'></a>

In [None]:
# by default, time series in pandas are time zone naive
dates = pd.date_range("2012-03-09 09:30", periods=6)
ts = pd.Series(np.random.standard_normal(len(dates)), index=dates)
ts

In [None]:
print(ts.index.tz)

In [None]:
# a tz can be set
pd.date_range("2012-03-09 09:30", periods=10, tz="UTC")

In [None]:
# set a tz by localizing it
ts
ts_utc = ts.tz_localize("UTC")
ts_utc
ts_utc.index

In [None]:
# once a time zone is set, the series can be converted to another time zone
ts_utc.tz_convert("America/New_York")

In [None]:
# localize to US Eastern time, then convert to UTC
ts_eastern = ts.tz_localize("America/New_York")
ts_eastern.tz_convert("UTC")

In [None]:
ts_eastern.tz_convert("Europe/Berlin")

In [None]:
# tz_localize and tz_convert also work directly on a DatetimeIndex
ts.index.tz_localize("Asia/Shanghai")
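
The difference between localizing and converting is easy to blur. As a sketch on a single timestamp: `tz_localize` attaches a zone and leaves the wall-clock reading alone, while `tz_convert` keeps the instant fixed and changes the wall-clock reading. The date is chosen before the 2012 US daylight saving change, so New York is five hours behind UTC.

```python
import pandas as pd

naive = pd.Timestamp("2012-03-09 09:30")

# tz_localize attaches a zone; the clock still reads 09:30
utc = naive.tz_localize("UTC")

# tz_convert keeps the same instant and changes the clock reading
new_york = utc.tz_convert("America/New_York")
print(utc, new_york, utc == new_york)
```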

# Operations with Time Zone-Aware Timestamp Objects
<a id='operations_with_tz'></a>

In [None]:
# individual timestamps can be unaware, localized and converted
stamp = pd.Timestamp("2011-03-12 04:00")
stamp_utc = stamp.tz_localize("utc")
stamp_utc

In [None]:
stamp_utc.tz_convert("America/New_York")


In [None]:
# timezone can be passed when creating the object
stamp_moscow = pd.Timestamp("2011-03-12 04:00", tz="Europe/Moscow")
stamp_moscow

In [None]:
'''
Time zone-aware Timestamp objects internally store a UTC timestamp value
as nanoseconds since the Unix epoch (January 1, 1970), 
so changing the time zone does not alter the internal UTC value:

'''
stamp_utc.value

In [None]:
stamp_utc.tz_convert("America/New_York").value

In [None]:
# pandas respects daylight saving time transitions where possible
# first, 30 minutes before transitioning into DST
stamp = pd.Timestamp("2012-03-11 01:30", tz="US/Eastern")
stamp

In [None]:
stamp + Hour()

In [None]:
# then, 90 minutes before transitioning out of DST
stamp = pd.Timestamp("2012-11-04 00:30", tz="US/Eastern")
stamp


In [None]:
stamp + 2 * Hour()

# Operations Between Different Time Zones
<a id='operations_between_tz'></a>

In [None]:
# If two time series with different time zones are combined, the result will be UTC
dates = pd.date_range("2012-03-07 09:30", periods=10, freq="B")
ts = pd.Series(np.random.standard_normal(len(dates)), index=dates)
ts
ts1 = ts[:7].tz_localize("Europe/London")
ts2 = ts1[2:].tz_convert("Europe/Moscow")
result = ts1 + ts2
result.index

# Period and Period Arithmetic

In [None]:
# Periods represent time spans, like days, months, quarters, or years
p = pd.Period("2011", freq="A-DEC")
p

In [None]:
# adding or subtracting an integer shifts the period by its own frequency
p + 5

In [None]:
p - 2

In [None]:
# the difference of two periods with the same frequency is the number of units between them
pd.Period("2014", freq="A-DEC") - p
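
A short sketch of the same arithmetic with a monthly period, where the results are easy to check by eye. It assumes a reasonably recent pandas, in which subtracting two periods returns a date offset object whose integer multiple is available as `.n`; the variable names are illustrative.

```python
import pandas as pd

p = pd.Period("2011-01", freq="M")

later = p + 5          # five months later
span = later - p       # a date offset object in recent pandas
months_apart = span.n  # the integer multiple carried by the offset

# a period covers a span of time with a definite start and end
starts = p.start_time
ends = p.end_time
print(later, months_apart, starts, ends)
```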

In [None]:
# period_range builds a regular range of periods
periods = pd.period_range("2000-01-01", "2000-06-30", freq="M")
periods

In [None]:
# a PeriodIndex can serve as the index of a Series
pd.Series(np.random.standard_normal(6), index=periods)

In [None]:
# an array of strings can be converted with the PeriodIndex class
values = ["2001Q3", "2002Q2", "2003Q1"]
index = pd.PeriodIndex(values, freq="Q-DEC")
index

In [None]:
# convert an annual period to a monthly one at the start or end of the year
p = pd.Period("2011", freq="A-DEC")
p
p.asfreq("M", how="start")
p.asfreq("M", how="end")
p.asfreq("M")

In [None]:
# for a fiscal year ending in June, the monthly subperiods are different
p = pd.Period("2011", freq="A-JUN")
p
p.asfreq("M", how="start")
p.asfreq("M", how="end")

In [None]:
# converting to a lower frequency: Aug-2011 belongs to the fiscal year ending June 2012
p = pd.Period("Aug-2011", "M")
p.asfreq("A-JUN")

In [None]:
# a whole PeriodIndex or time series can be converted in the same way
periods = pd.period_range("2006", "2009", freq="A-DEC")
ts = pd.Series(np.random.standard_normal(len(periods)), index=periods)
ts
ts.asfreq("M", how="start")

In [None]:
# last business day of each year
ts.asfreq("B", how="end")

In [None]:
# a quarterly period for a fiscal year ending in January
p = pd.Period("2012Q4", freq="Q-JAN")
p

In [None]:
# with Q-JAN, 2012Q4 runs from November 2011 through January 2012
p.asfreq("D", how="start")
p.asfreq("D", how="end")

In [None]:
# timestamp at 4 P.M. on the second-to-last business day of the quarter
p4pm = (p.asfreq("B", how="end") - 1).asfreq("T", how="start") + 16 * 60
p4pm
p4pm.to_timestamp()

In [None]:
# the same arithmetic applied to a whole range of quarters
periods = pd.period_range("2011Q3", "2012Q4", freq="Q-JAN")
ts = pd.Series(np.arange(len(periods)), index=periods)
ts
new_periods = (periods.asfreq("B", "end") - 1).asfreq("H", "start") + 16
ts.index = new_periods.to_timestamp()
ts

In [None]:
# convert timestamps to periods with to_period
dates = pd.date_range("2000-01-01", periods=3, freq="M")
ts = pd.Series(np.random.standard_normal(3), index=dates)
ts
pts = ts.to_period()
pts

In [None]:
# a different frequency can be given; duplicate periods in the result are allowed
dates = pd.date_range("2000-01-29", periods=6)
ts2 = pd.Series(np.random.standard_normal(6), index=dates)
ts2
ts2.to_period("M")

In [None]:
# convert back to timestamps with to_timestamp
pts = ts2.to_period()
pts
pts.to_timestamp(how="end")

In [None]:
# the year and quarter are stored in separate columns
data = pd.read_csv("examples/macrodata.csv")
data.head(5)
data["year"]
data["quarter"]

In [None]:
# combine the year and quarter columns into a PeriodIndex
index = pd.PeriodIndex(year=data["year"], quarter=data["quarter"],
                       freq="Q-DEC")
index
data.index = index
data["infl"]

In [None]:
# downsample daily data to monthly means, labeled by timestamp or by period
dates = pd.date_range("2000-01-01", periods=100)
ts = pd.Series(np.random.standard_normal(len(dates)), index=dates)
ts
ts.resample("M").mean()
ts.resample("M", kind="period").mean()

In [None]:
# one-minute data to aggregate into five-minute bins
dates = pd.date_range("2000-01-01", periods=12, freq="T")
ts = pd.Series(np.arange(len(dates)), index=dates)
ts

In [None]:
# by default the left bin edge is inclusive and is used as the label
ts.resample("5min").sum()

In [None]:
# make the right bin edge inclusive instead
ts.resample("5min", closed="right").sum()

In [None]:
# also label each bin with its right edge
ts.resample("5min", closed="right", label="right").sum()

In [None]:
# shift the result index by subtracting one second from each right edge
from pandas.tseries.frequencies import to_offset
result = ts.resample("5min", closed="right", label="right").sum()
result.index = result.index + to_offset("-1s")
result
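
To see exactly which rows land in which bin, it helps to work the sums by hand. This sketch rebuilds the same twelve one-minute values (0 through 11) and compares the default left-closed bins with right-closed ones.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2000-01-01", periods=12, freq="min")
s = pd.Series(np.arange(12), index=idx)

# default: left edge inclusive, so 00:00 through 00:04 form the first bin
left_closed = s.resample("5min").sum()

# right edge inclusive: 00:00 alone closes the bin that started at 23:55
right_closed = s.resample("5min", closed="right").sum()
print(left_closed)
print(right_closed)
```

With left-closed bins the sums are 0+1+2+3+4, 5+...+9, and 10+11. With right-closed bins the value at 00:00 sits alone in a bin labeled 23:55 of the previous day, which is why an extra row appears.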

In [None]:
# open-high-low-close (OHLC) resampling: first, maximum, minimum, and last value per bin
ts = pd.Series(np.random.permutation(np.arange(len(dates))), index=dates)
ts.resample("5min").ohlc()

In [None]:
# weekly data to upsample
frame = pd.DataFrame(np.random.standard_normal((2, 4)),
                     index=pd.date_range("2000-01-01", periods=2,
                                         freq="W-WED"),
                     columns=["Colorado", "Texas", "New York", "Ohio"])
frame

In [None]:
# asfreq converts to daily frequency without aggregation, leaving gaps as NaN
df_daily = frame.resample("D").asfreq()
df_daily

In [None]:
# forward-fill the gaps
frame.resample("D").ffill()

In [None]:
# fill forward only a limited number of periods
frame.resample("D").ffill(limit=2)

In [None]:
# the new date index need not coincide with the old one
frame.resample("W-THU").ffill()

In [None]:
# resampling with periods: monthly data aggregated to annual means
frame = pd.DataFrame(np.random.standard_normal((24, 4)),
                     index=pd.period_range("1-2000", "12-2001",
                                           freq="M"),
                     columns=["Colorado", "Texas", "New York", "Ohio"])
frame.head()
annual_frame = frame.resample("A-DEC").mean()
annual_frame

In [None]:
# Q-DEC: Quarterly, year ending in December
annual_frame.resample("Q-DEC").ffill()
annual_frame.resample("Q-DEC", convention="end").asfreq()

In [None]:
# Q-MAR: Quarterly, year ending in March
annual_frame.resample("Q-MAR").ffill()

In [None]:
# grouped time resampling: a small table of one-minute data
N = 15
times = pd.date_range("2017-05-20 00:00", freq="1min", periods=N)
df = pd.DataFrame({"time": times,
                   "value": np.arange(N)})
df

In [None]:
# index by time, then resample
df.set_index("time").resample("5min").count()

In [None]:
# multiple time series marked by an additional group key column
df2 = pd.DataFrame({"time": times.repeat(3),
                    "key": np.tile(["a", "b", "c"], N),
                    "value": np.arange(N * 3.)})
df2.head(7)

In [None]:
# pd.Grouper describes the time-based grouping
time_key = pd.Grouper(freq="5min")

In [None]:
# group by key and five-minute bin, then aggregate (the time must be the index)
resampled = (df2.set_index("time")
             .groupby(["key", time_key])
             .sum())
resampled
resampled.reset_index()

# Moving Window Functions

In [None]:
# load daily closing prices and resample to business-day frequency
close_px_all = pd.read_csv("examples/stock_px.csv",
                           parse_dates=True, index_col=0)
close_px = close_px_all[["AAPL", "MSFT", "XOM"]]
close_px = close_px.resample("B").ffill()

In [None]:
# plot the price together with its 250-day moving average
close_px["AAPL"].plot()
# Figure: Apple price with 250-day moving average
close_px["AAPL"].rolling(250).mean().plot()

In [None]:
# rolling standard deviation of daily returns; min_periods=10 allows values early in the series
plt.figure()
std250 = close_px["AAPL"].pct_change().rolling(250, min_periods=10).std()
std250[5:12]
# Figure: Apple 250-day daily return standard deviation
std250.plot()
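
The role of `min_periods` is easier to see on a tiny series than on price data. In this sketch (values invented for illustration), a window of 3 yields no result until three observations are available, unless `min_periods` relaxes that requirement.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# a window of 3 needs 3 observations by default, so the first two results are NaN
strict = s.rolling(3).mean()

# min_periods=1 lets the window produce values from the first row
relaxed = s.rolling(3, min_periods=1).mean()
print(strict)
print(relaxed)
```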

In [None]:
# expanding-window mean of the rolling standard deviation
expanding_mean = std250.expanding().mean()

In [None]:
# start a new figure
plt.figure()

In [None]:
# 60-day moving average of all three stocks on a log y-axis
plt.style.use('grayscale')
# Figure: Stock prices 60-day moving average (log y-axis)
close_px.rolling(60).mean().plot(logy=True)

In [None]:
# a time offset string gives a variable-length window, here 20 calendar days
close_px.rolling("20D").mean()

In [None]:
# start a new figure
plt.figure()

In [None]:
# compare a simple moving average with an exponentially weighted one
aapl_px = close_px["AAPL"]["2006":"2007"]

ma30 = aapl_px.rolling(30, min_periods=20).mean()
ewma30 = aapl_px.ewm(span=30).mean()

aapl_px.plot(style="k-", label="Price")
ma30.plot(style="k--", label="Simple Moving Avg")
ewma30.plot(style="k-", label="EW MA")
# Figure: Simple moving average versus exponentially weighted
plt.legend()
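
To give a sense of what `ewm(span=...)` computes, here is a hedged sketch on three values. It uses `adjust=False`, which is not the default used in the cell above, because that setting reduces to a simple recursion that can be reproduced by hand. The smoothing factor for a given span is `alpha = 2 / (span + 1)`.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0])

# span=3 corresponds to alpha = 2 / (3 + 1) = 0.5
# adjust=False uses the recursion y[t] = (1 - alpha) * y[t-1] + alpha * x[t]
ewma = s.ewm(span=3, adjust=False).mean()

# the same recursion written out by hand
alpha = 2 / (3 + 1)
manual = [s.iloc[0]]
for x in s.iloc[1:]:
    manual.append((1 - alpha) * manual[-1] + alpha * x)
print(ewma.tolist(), manual)
```

With the default `adjust=True`, pandas normalizes the weights differently in the early rows, so the first few values will not match this recursion exactly.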

In [None]:
# start a new figure
plt.figure()

In [None]:
# percent changes for the S&P 500 index and for the three stocks
spx_px = close_px_all["SPX"]
spx_rets = spx_px.pct_change()
returns = close_px.pct_change()

In [None]:
# rolling correlation of AAPL returns with the S&P 500 over a 125-day window
corr = returns["AAPL"].rolling(125, min_periods=100).corr(spx_rets)
# Figure: Six-month AAPL return correlation to S&P 500
corr.plot()

In [None]:
# start a new figure
plt.figure()

In [None]:
# rolling correlation of each stock's returns with the S&P 500
corr = returns.rolling(125, min_periods=100).corr(spx_rets)
# Figure: Six-month return correlations to S&P 500
corr.plot()

In [None]:
# start a new figure
plt.figure()

In [None]:
# percentile rank of a 2% return within each one-year window
from scipy.stats import percentileofscore
def score_at_2percent(x):
    return percentileofscore(x, 0.02)

result = returns["AAPL"].rolling(250).apply(score_at_2percent)
# Figure: Percentile rank of 2% AAPL return over one-year window
result.plot()
