# 2.4 - Time Series

*   Time series data is an important form of structured data in many different fields, such as finance, economics, ecology, neuroscience, and physics. Anything that is recorded repeatedly at many points in time forms a time series.

*   Many time series are fixed frequency, which is to say that data points occur at regular intervals according to some rule, such as every 15 seconds, every 5 minutes, or once per month. 

*   Time series can also be irregular, with no fixed unit of time or offset between observations.

*   How time series data is marked and referred to depends on the application. Common categories are:

    *    **Timestamps:** Specific instants in time.

    *    **Fixed periods:** Such as the whole month of January 2017, or the whole year 2020.

    *    **Intervals of time:** Indicated by a start and end timestamp. Periods can be thought of as special cases of intervals.

    *    **Experiment or elapsed time:** Each timestamp is a measure of time relative to a particular start time, starting from 0.

Source: [McKinney, Wes. Python for Data Analysis. O'Reilly Media. 3rd Edition](https://wesmckinney.com/book/time-series)
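
The four categories listed above each have a natural counterpart among pandas objects. The following is a minimal sketch of that mapping; the variable names and the specific dates are illustrative assumptions, not taken from the source.

```python
import pandas as pd

# Timestamp: a specific instant in time
stamp = pd.Timestamp("2017-01-15 09:30")

# Fixed period: the whole month of January 2017
period = pd.Period("2017-01", freq="M")

# Interval: explicit start and end timestamps (left-closed here by choice)
interval = pd.Interval(pd.Timestamp("2017-01-01"),
                       pd.Timestamp("2017-02-01"), closed="left")

# Elapsed time: offsets measured from a start time of 0
elapsed = pd.to_timedelta([0, 15, 30], unit="s")

print(period.start_time)   # first instant covered by the period
print(stamp in interval)   # membership test against the interval
print(elapsed)
```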

In [1]:
import numpy as np
import pandas as pd

## 2.4.1 Date and Time Data Types and Tools

In [2]:
from datetime import datetime
now01 = datetime.now()
now01

datetime.datetime(2023, 9, 27, 11, 48, 27, 44235)

In [3]:
now01.year, now01.month, now01.day

(2023, 9, 27)

In [4]:
now01.hour, now01.minute, now01.second, now01.microsecond

(11, 48, 27, 44235)

`datetime` stores both the date and time down to the microsecond. 

`timedelta` represents the temporal difference between two datetime objects.

In [5]:
delta01 = datetime(2023, 9, 28) - datetime(2023, 8, 15, 9, 30)
delta01


datetime.timedelta(days=43, seconds=52200)

In [6]:
delta01.days

43

In [7]:
delta01.seconds

52200
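
Note that `seconds` holds only the leftover seconds within the final partial day, not the whole duration. A small sketch using the standard-library `total_seconds()` method makes the difference visible (the variable names here are illustrative):

```python
from datetime import datetime

delta = datetime(2023, 9, 28) - datetime(2023, 8, 15, 9, 30)

# days and seconds are separate components of the same duration
component_seconds = delta.seconds

# total_seconds() folds days, seconds, and microseconds into one float
total = delta.total_seconds()

print(component_seconds, total)
```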

In [8]:
from datetime import timedelta
start01 = datetime(2023, 9, 28)
start01 + timedelta(3)

datetime.datetime(2023, 10, 1, 0, 0)

In [9]:
datetime_module = {"date": "Store calendar date (year, month, day) using the Gregorian calendar",
                   "time": "Store time of day as hours, minutes, seconds, and microseconds",
                   "datetime": "Store both date and time",
                   "timedelta": "The difference between two datetime values (as days, seconds, and microseconds)",
                   "tzinfo": "Base type for storing time zone information"
                  }
datetime_module

{'date': 'Store calendar date (year, month, day) using the Gregorian calendar',
 'time': 'Store time of day as hours, minutes, seconds, and microseconds',
 'datetime': 'Store both date and time',
 'timedelta': 'The difference between two datetime values (as days, seconds, and microseconds)',
 'tzinfo': 'Base type for storing time zone information'}

In [10]:
table2_4_01 = pd.DataFrame(datetime_module.items(), columns = ["Type", "Description"])
table2_4_01

|   | Type | Description |
|---|------|-------------|
| 0 | date | Store calendar date (year, month, day) using t... |
| 1 | time | Store time of day as hours, minutes, seconds, ... |
| 2 | datetime | Store both date and time |
| 3 | timedelta | The difference between two datetime values (as... |
| 4 | tzinfo | Base type for storing time zone information |


**Converting Between String and Datetime**


In [11]:
datetime_format = {"%Y": "Four-digit year",
                   "%y": "Two-digit year",
                   "%m": "Two-digit month [01, 12]",
                   "%d": "Two-digit day [01, 31]",
                   "%H": "Hour (24-hour clock) [00, 23]",
                   "%I": "Hour (12-hour clock) [01, 12]",
                   "%M": "Two-digit minute [00, 59]",
                   "%S": "Second [00, 61] (seconds 60, 61 account for leap seconds)",
                   "%f": "Microsecond as an integer, zero-padded (from 000000 to 999999)",
                   "%j": "Day of the year as a zero-padded integer (from 001 to 366)",
                   "%w": "Weekday as an integer [0 (Sunday), 6]",
                   "%u": "Weekday as an integer starting from 1, where 1 is Monday",
                   "%U": "Week number of the year [00, 53]; Sunday is considered the first day of the week, and days before the first Sunday of the year are 'week 0'",
                   "%W": "Week number of the year [00, 53]; Monday is considered the first day of the week, and days before the first Monday of the year are 'week 0'",
                   "%z": "UTC time zone offset as +HHMM or -HHMM; empty if time zone naive",
                   "%Z": "Time zone name as a string, or empty string if no time zone",
                   "%F": "Shortcut for %Y-%m-%d (e.g., 2012-04-18)",
                   "%D": "Shortcut for %m/%d/%y (e.g., 04/18/12)"
                  }

In [12]:
table2_4_02 = pd.DataFrame(datetime_format.items(), columns = ["Type", "Description"])
table2_4_02

|   | Type | Description |
|---|------|-------------|
| 0 | %Y | Four-digit year |
| 1 | %y | Two-digit year |
| 2 | %m | Two-digit month [01, 12] |
| 3 | %d | Two-digit day [01, 31] |
| 4 | %H | Hour (24-hour clock) [00, 23] |
| 5 | %I | Hour (12-hour clock) [01, 12] |
| 6 | %M | Two-digit minute [00, 59] |
| 7 | %S | Second [00, 61] (seconds 60, 61 account for le... |
| 8 | %f | Microsecond as an integer, zero-padded (from 0... |
| 9 | %j | Day of the year as a zero-padded integer (from... |


In [13]:
stamp01 = datetime(2023, 9, 27)
str(stamp01)

'2023-09-27 00:00:00'

In [14]:
stamp01.strftime("%Y-%m-%d")

'2023-09-27'

In [15]:
stamp01.strftime("%F")

'2023-09-27'

In [16]:
stamp01.strftime("%D")

'09/27/23'

In [17]:
value01 = "2023-10-02"
datetime.strptime(value01, "%Y-%m-%d")

datetime.datetime(2023, 10, 2, 0, 0)

In [18]:
datestrs01 = ["7/6/2011", "8/6/2011"]
[datetime.strptime(x, "%m/%d/%Y") for x in datestrs01]

[datetime.datetime(2011, 7, 6, 0, 0), datetime.datetime(2011, 8, 6, 0, 0)]

In [19]:
datestrs02 = ["2011-07-06 12:00:00", "2011-08-06 15:00:00"]
pd.to_datetime(datestrs02)

DatetimeIndex(['2011-07-06 12:00:00', '2011-08-06 15:00:00'], dtype='datetime64[ns]', freq=None)

**Handling missing values**

In [80]:
index01 = pd.to_datetime(datestrs01 + [None])
index01

DatetimeIndex(['2011-07-06', '2011-08-06', 'NaT'], dtype='datetime64[ns]', freq=None)

In [81]:
pd.isna(index01)

array([False, False,  True])

`NaT` (**Not a Time**) is pandas’s null value for timestamp data.
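
`NaT` also appears when parsing fails, provided pandas is told to tolerate bad input. The sketch below assumes a hypothetical list `raw` containing one unparseable string; `errors="coerce"` is an existing `pd.to_datetime` option, while the sample values are made up for illustration.

```python
import pandas as pd

raw = ["2011-07-06", "not a date", None]

# errors="coerce" replaces unparseable entries with NaT instead of raising
parsed = pd.to_datetime(raw, format="%Y-%m-%d", errors="coerce")

print(parsed)
print(parsed.isna())
```

Without `errors="coerce"`, the same call would raise an exception on the second entry.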

In [24]:
### locale-specific date formatting
locale_specifics = {
    "%a":  "Abbreviated weekday name",
    "%A":  "Full weekday name",
    "%b":  "Abbreviated month name",
    "%B":  "Full month name",
    "%c":  "Full date and time (e.g., ‘Tue 01 May 2012 04:20:57 PM’)",
    "%p":  "Locale equivalent of AM or PM",
    "%x":  "Locale-appropriate formatted date (e.g., in the United States, May 1, 2012 yields ‘05/01/2012’)",
    "%X":  "Locale-appropriate time (e.g., ‘04:24:12 PM’)"
}

In [25]:
table2_4_03 = pd.DataFrame(locale_specifics.items(), 
                          columns = ["Type", "Description"])
table2_4_03

|   | Type | Description |
|---|------|-------------|
| 0 | %a | Abbreviated weekday name |
| 1 | %A | Full weekday name |
| 2 | %b | Abbreviated month name |
| 3 | %B | Full month name |
| 4 | %c | Full date and time (e.g., ‘Tue 01 May 2012 04:... |
| 5 | %p | Locale equivalent of AM or PM |
| 6 | %x | Locale-appropriate formatted date (e.g., in th... |
| 7 | %X | Locale-appropriate time (e.g., ‘04:24:12 PM’) |
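
As a quick illustration of the locale-specific codes, the sketch below formats a made-up timestamp with `%A`, `%B`, and `%p`. The exact strings depend on the locale active in the Python process, so no output is shown here.

```python
from datetime import datetime

stamp = datetime(2023, 9, 27, 16, 24, 12)

# Output depends on the active locale; an English locale gives English names
weekday_and_month = stamp.strftime("%A, %d %B %Y")
clock_12h = stamp.strftime("%I:%M:%S %p")

print(weekday_and_month)
print(clock_12h)
```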


## 2.4.2 Time Series Basics

In [26]:
###  datetime objects in a DatetimeIndex
np.random.seed(12345)
dates01 = [datetime(2011, 1, 2), datetime(2011, 1, 5),
         datetime(2011, 1, 7), datetime(2011, 1, 8),
         datetime(2011, 1, 10), datetime(2011, 1, 12)]
ts01 = pd.Series(np.random.standard_normal(6), index=dates01)
ts01

2011-01-02   -0.204708
2011-01-05    0.478943
2011-01-07   -0.519439
2011-01-08   -0.555730
2011-01-10    1.965781
2011-01-12    1.393406
dtype: float64

In [27]:
ts01.index

DatetimeIndex(['2011-01-02', '2011-01-05', '2011-01-07', '2011-01-08',
               '2011-01-10', '2011-01-12'],
              dtype='datetime64[ns]', freq=None)

**Arithmetic operations between differently indexed time series automatically align on the dates**

In [28]:
### selecting every second element in `ts01`.
ts01[::2]

2011-01-02   -0.204708
2011-01-07   -0.519439
2011-01-10    1.965781
dtype: float64

In [29]:
ts01 + ts01[::2]

2011-01-02   -0.409415
2011-01-05         NaN
2011-01-07   -1.038877
2011-01-08         NaN
2011-01-10    3.931561
2011-01-12         NaN
dtype: float64
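
The `NaN` entries come from dates that exist in only one operand. If treating the missing side as zero is acceptable for the analysis, `Series.add` with `fill_value` is one option. A minimal sketch with made-up values (not the random data used above):

```python
import pandas as pd

dates = pd.to_datetime(["2011-01-02", "2011-01-05", "2011-01-07"])
ts = pd.Series([1.0, 2.0, 3.0], index=dates)

# Plain addition leaves NaN where a date is missing from either operand
plain = ts + ts[::2]

# Series.add with fill_value treats the missing side as 0 instead
filled = ts.add(ts[::2], fill_value=0)

print(plain)
print(filled)
```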

**`pandas` stores timestamps using NumPy’s datetime64 data type at the nanosecond resolution**.

In [30]:
ts01.index.dtype

dtype('<M8[ns]')
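
One consequence of nanosecond storage is that each timestamp is a 64-bit integer count of nanoseconds since the Unix epoch, which bounds the representable range. The sketch below assumes the nanosecond resolution described above; the variable names are illustrative.

```python
import pandas as pd

# .value exposes the underlying integer: nanoseconds since 1970-01-01
one_day_after_epoch = pd.Timestamp("1970-01-02").value

# 64-bit nanosecond counts limit the representable range
earliest, latest = pd.Timestamp.min, pd.Timestamp.max

print(one_day_after_epoch)
print(earliest, latest)
```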

### Indexing, Selection, Subsetting

In [31]:
stamp02 = ts01.index[2]
stamp02

Timestamp('2011-01-07 00:00:00')

In [32]:
ts01[stamp02]

-0.5194387150567381

In [33]:
### positional access; `.iloc` avoids the deprecated integer fallback of `ts01[2]`
ts01.iloc[2]

-0.5194387150567381

In [34]:
### passing a string that is interpretable as a date
ts01["2011-01-10"]

1.9657805725027142

In [35]:
np.random.seed(12345)
longer_ts = pd.Series(np.random.standard_normal(2000),
                      index=pd.date_range("2005-01-01", periods=2000))
longer_ts

2005-01-01   -0.204708
2005-01-02    0.478943
2005-01-03   -0.519439
2005-01-04   -0.555730
2005-01-05    1.965781
                ...   
2010-06-19   -1.341493
2010-06-20   -0.293333
2010-06-21   -0.242459
2010-06-22   -3.056990
2010-06-23    1.918403
Freq: D, Length: 2000, dtype: float64

> For longer time series, a year or only a year and month can be passed to easily select slices of data.

In [36]:
longer_ts["2005"]

2005-01-01   -0.204708
2005-01-02    0.478943
2005-01-03   -0.519439
2005-01-04   -0.555730
2005-01-05    1.965781
                ...   
2005-12-27    0.457279
2005-12-28    0.115699
2005-12-29    1.014042
2005-12-30   -1.135008
2005-12-31   -0.263371
Freq: D, Length: 365, dtype: float64

In [37]:
### specifying the month
longer_ts["2005-08"]

2005-08-01   -0.493779
2005-08-02    1.239980
2005-08-03   -0.135722
2005-08-04    1.430042
2005-08-05   -0.846852
2005-08-06    0.603282
2005-08-07    1.263572
2005-08-08   -0.255491
2005-08-09   -0.445688
2005-08-10    0.468367
2005-08-11   -0.961604
2005-08-12   -1.824505
2005-08-13    0.625428
2005-08-14    1.022872
2005-08-15    1.107425
2005-08-16    0.090937
2005-08-17   -0.350109
2005-08-18    0.217957
2005-08-19   -0.894813
2005-08-20   -1.741494
2005-08-21   -1.052256
2005-08-22    1.436603
2005-08-23   -0.576207
2005-08-24   -2.420294
2005-08-25   -1.062330
2005-08-26    0.237372
2005-08-27    0.000957
2005-08-28    0.065253
2005-08-29   -1.367524
2005-08-30   -0.030280
2005-08-31    0.940489
Freq: D, dtype: float64

In [40]:
### Slicing with datetime objects
longer_ts[datetime(2010, 1, 1):]

2010-01-01   -1.408339
2010-01-02   -0.006248
2010-01-03   -0.558359
2010-01-04   -0.458344
2010-01-05   -1.944667
                ...   
2010-06-19   -1.341493
2010-06-20   -0.293333
2010-06-21   -0.242459
2010-06-22   -3.056990
2010-06-23    1.918403
Freq: D, Length: 174, dtype: float64

In [44]:
### Slicing with datetime objects on both ends (both endpoints are included)
longer_ts[datetime(2005, 7, 1):datetime(2005, 9, 30)]

2005-07-01   -0.372642
2005-07-02   -0.926557
2005-07-03    1.755108
2005-07-04    1.209810
2005-07-05    1.270025
                ...   
2005-09-26   -0.400333
2005-09-27    0.449880
2005-09-28    0.399594
2005-09-29   -0.151575
2005-09-30   -2.557934
Freq: D, Length: 92, dtype: float64

In [45]:
### Slicing with timestamps not contained in a time series to perform a range query
longer_ts["2010-01-01":"2010-12-31"]

2010-01-01   -1.408339
2010-01-02   -0.006248
2010-01-03   -0.558359
2010-01-04   -0.458344
2010-01-05   -1.944667
                ...   
2010-06-19   -1.341493
2010-06-20   -0.293333
2010-06-21   -0.242459
2010-06-22   -3.056990
2010-06-23    1.918403
Freq: D, Length: 174, dtype: float64

In [57]:
longer_ts.truncate(after="2009-12-31")

2005-01-01   -0.204708
2005-01-02    0.478943
2005-01-03   -0.519439
2005-01-04   -0.555730
2005-01-05    1.965781
                ...   
2009-12-27   -0.550110
2009-12-28    1.234193
2009-12-29   -0.235629
2009-12-30    1.132309
2009-12-31   -0.228211
Freq: D, Length: 1826, dtype: float64

In [59]:
longer_ts.truncate(before="2010-01-01")

2010-01-01   -1.408339
2010-01-02   -0.006248
2010-01-03   -0.558359
2010-01-04   -0.458344
2010-01-05   -1.944667
                ...   
2010-06-19   -1.341493
2010-06-20   -0.293333
2010-06-21   -0.242459
2010-06-22   -3.056990
2010-06-23    1.918403
Freq: D, Length: 174, dtype: float64

In [62]:
dates01 = pd.date_range("2000-01-01", periods=100, freq="W-WED")
long_df = pd.DataFrame(np.random.standard_normal((100, 4)),
                       index=dates01,
                       columns=["Colorado", "Texas",
                                "New York", "Ohio"])
long_df.loc["2001-05"]

|   | Colorado | Texas | New York | Ohio |
|---|----------|-------|----------|------|
| 2001-05-02 | -0.34742 | -0.22257 | -1.414702 | 0.48957 |
| 2001-05-09 | -0.908653 | 0.296821 | -0.049986 | -1.313125 |
| 2001-05-16 | 0.546045 | 0.70487 | 1.527648 | -0.083958 |
| 2001-05-23 | 0.066763 | 0.702206 | -0.458129 | -1.114003 |
| 2001-05-30 | 0.829523 | 1.870004 | 0.894471 | 0.29225 |


### Time Series with Duplicate Indices

In [69]:
dates02 = pd.DatetimeIndex(["2000-01-01", "2000-01-02", "2000-01-02",
                          "2000-01-02", "2000-01-03"])
dup_ts = pd.Series(np.arange(5), index=dates02)
dup_ts

2000-01-01    0
2000-01-02    1
2000-01-02    2
2000-01-02    3
2000-01-03    4
dtype: int64

In [70]:
### Check whether the index is unique (False means duplicate timestamps exist)
dup_ts.index.is_unique

False

In [76]:
### Aggregating the data having nonunique timestamps
grouped = dup_ts.groupby(level=0)
grouped.count()

2000-01-01    1
2000-01-02    3
2000-01-03    1
dtype: int64
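
Indexing into a series with duplicate timestamps returns either a scalar or a slice, depending on whether the label is repeated. The sketch below rebuilds the same small series under hypothetical names (`dup`, `collapsed`) and shows one way to collapse duplicates with `mean`; the choice of `mean` is an assumption, and `count`, `first`, or `sum` may suit other data better.

```python
import numpy as np
import pandas as pd

dates = pd.DatetimeIndex(["2000-01-01", "2000-01-02", "2000-01-02",
                          "2000-01-02", "2000-01-03"])
dup = pd.Series(np.arange(5), index=dates)

# A duplicated label returns a Series; a unique label returns a scalar
repeated = dup.loc[pd.Timestamp("2000-01-02")]
single = dup.loc[pd.Timestamp("2000-01-03")]

# Collapse duplicates to one row per timestamp by averaging
collapsed = dup.groupby(level=0).mean()

print(repeated)
print(single)
print(collapsed)
```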

## 2.4.3 Date Ranges, Frequencies, and Shifting

**Convert the sample time series to fixed daily frequency by calling `resample`**

In [73]:
ts01

2011-01-02   -0.204708
2011-01-05    0.478943
2011-01-07   -0.519439
2011-01-08   -0.555730
2011-01-10    1.965781
2011-01-12    1.393406
dtype: float64

In [78]:
resampler = ts01.resample("D")
resampler

<pandas.core.resample.DatetimeIndexResampler object at 0x7f1a69200b50>

In [79]:
pd.DataFrame(resampler, columns = ["Datetime","Value"])

|   | Datetime | Value |
|---|----------|-------|
| 0 | 2011-01-02 | 2011-01-02 -0.204708 dtype: float64 |
| 1 | 2011-01-03 | Series([], dtype: float64) |
| 2 | 2011-01-04 | Series([], dtype: float64) |
| 3 | 2011-01-05 | 2011-01-05 0.478943 dtype: float64 |
| 4 | 2011-01-06 | Series([], dtype: float64) |
| 5 | 2011-01-07 | 2011-01-07 -0.519439 dtype: float64 |
| 6 | 2011-01-08 | 2011-01-08 -0.55573 dtype: float64 |
| 7 | 2011-01-09 | Series([], dtype: float64) |
| 8 | 2011-01-10 | 2011-01-10 1.965781 dtype: float64 |
| 9 | 2011-01-11 | Series([], dtype: float64) |
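
The resampler is lazy: it only groups the data into daily bins and produces values once a method is applied to it. A minimal sketch, assuming a small made-up series rather than `ts01`, showing `asfreq()` (gaps become `NaN`) and `ffill()` (gaps take the previous observation):

```python
import pandas as pd

dates = pd.to_datetime(["2011-01-02", "2011-01-05", "2011-01-07"])
ts = pd.Series([1.0, 2.0, 3.0], index=dates)

# asfreq() materializes the daily grid; days without data become NaN
daily = ts.resample("D").asfreq()

# ffill() carries the last observation forward into the gaps
daily_filled = ts.resample("D").ffill()

print(daily)
print(daily_filled)
```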


### Generating Date Ranges

In [87]:
index02 = pd.date_range("2012-04-01", "2012-06-30")
index02

DatetimeIndex(['2012-04-01', '2012-04-02', '2012-04-03', '2012-04-04',
               '2012-04-05', '2012-04-06', '2012-04-07', '2012-04-08',
               '2012-04-09', '2012-04-10', '2012-04-11', '2012-04-12',
               '2012-04-13', '2012-04-14', '2012-04-15', '2012-04-16',
               '2012-04-17', '2012-04-18', '2012-04-19', '2012-04-20',
               '2012-04-21', '2012-04-22', '2012-04-23', '2012-04-24',
               '2012-04-25', '2012-04-26', '2012-04-27', '2012-04-28',
               '2012-04-29', '2012-04-30', '2012-05-01', '2012-05-02',
               '2012-05-03', '2012-05-04', '2012-05-05', '2012-05-06',
               '2012-05-07', '2012-05-08', '2012-05-09', '2012-05-10',
               '2012-05-11', '2012-05-12', '2012-05-13', '2012-05-14',
               '2012-05-15', '2012-05-16', '2012-05-17', '2012-05-18',
               '2012-05-19', '2012-05-20', '2012-05-21', '2012-05-22',
               '2012-05-23', '2012-05-24', '2012-05-25', '2012-05-26',
      

In [89]:
index02.shape

(91,)

In [90]:
pd.date_range(start="2012-04-01", periods=28)

DatetimeIndex(['2012-04-01', '2012-04-02', '2012-04-03', '2012-04-04',
               '2012-04-05', '2012-04-06', '2012-04-07', '2012-04-08',
               '2012-04-09', '2012-04-10', '2012-04-11', '2012-04-12',
               '2012-04-13', '2012-04-14', '2012-04-15', '2012-04-16',
               '2012-04-17', '2012-04-18', '2012-04-19', '2012-04-20',
               '2012-04-21', '2012-04-22', '2012-04-23', '2012-04-24',
               '2012-04-25', '2012-04-26', '2012-04-27', '2012-04-28'],
              dtype='datetime64[ns]', freq='D')

In [92]:
pd.date_range(end="2012-06-30", periods=20)

DatetimeIndex(['2012-06-11', '2012-06-12', '2012-06-13', '2012-06-14',
               '2012-06-15', '2012-06-16', '2012-06-17', '2012-06-18',
               '2012-06-19', '2012-06-20', '2012-06-21', '2012-06-22',
               '2012-06-23', '2012-06-24', '2012-06-25', '2012-06-26',
               '2012-06-27', '2012-06-28', '2012-06-29', '2012-06-30'],
              dtype='datetime64[ns]', freq='D')

In [142]:
### "BM" frequency means business end of month (pandas 2.2+ deprecates this alias in favor of "BME")
pd.Series(pd.date_range("2000-01-01", "2001-01-01", freq="BM"))

0    2000-01-31
1    2000-02-29
2    2000-03-31
3    2000-04-28
4    2000-05-31
5    2000-06-30
6    2000-07-31
7    2000-08-31
8    2000-09-29
9    2000-10-31
10   2000-11-30
11   2000-12-29
dtype: datetime64[ns]

In [175]:
import pandas as pd

bm2000 = pd.date_range("2000-01-01", "2001-01-01", freq="BM")
bm2000_weekday = bm2000.day_name()  
df = pd.DataFrame({"Business End of Month": bm2000.strftime("%Y-%m-%d"), 
                    "Weekday": bm2000_weekday})
df

|   | Business End of Month | Weekday |
|---|-----------------------|---------|
| 0 | 2000-01-31 | Monday |
| 1 | 2000-02-29 | Tuesday |
| 2 | 2000-03-31 | Friday |
| 3 | 2000-04-28 | Friday |
| 4 | 2000-05-31 | Wednesday |
| 5 | 2000-06-30 | Friday |
| 6 | 2000-07-31 | Monday |
| 7 | 2000-08-31 | Thursday |
| 8 | 2000-09-29 | Friday |
| 9 | 2000-10-31 | Tuesday |


Please refer to [Table 11.4](https://wesmckinney.com/book/time-series#tseries_frequency) for the base time series frequencies.

> `pandas.date_range` by default preserves the time (if any) of the start or end timestamp.

In [176]:
pd.Series(pd.date_range("2012-05-02 12:56:31", periods=5))

0   2012-05-02 12:56:31
1   2012-05-03 12:56:31
2   2012-05-04 12:56:31
3   2012-05-05 12:56:31
4   2012-05-06 12:56:31
dtype: datetime64[ns]

In [177]:
pd.Series(pd.date_range("2012-05-02 12:56:31", periods=5,  normalize=True))

0   2012-05-02
1   2012-05-03
2   2012-05-04
3   2012-05-05
4   2012-05-06
dtype: datetime64[ns]

### Frequencies and Date Offsets

In [179]:
pd.Series(pd.date_range("2000-01-01", "2000-01-03 23:59", freq="4h"))

0    2000-01-01 00:00:00
1    2000-01-01 04:00:00
2    2000-01-01 08:00:00
3    2000-01-01 12:00:00
4    2000-01-01 16:00:00
5    2000-01-01 20:00:00
6    2000-01-02 00:00:00
7    2000-01-02 04:00:00
8    2000-01-02 08:00:00
9    2000-01-02 12:00:00
10   2000-01-02 16:00:00
11   2000-01-02 20:00:00
12   2000-01-03 00:00:00
13   2000-01-03 04:00:00
14   2000-01-03 08:00:00
15   2000-01-03 12:00:00
16   2000-01-03 16:00:00
17   2000-01-03 20:00:00
dtype: datetime64[ns]

In [180]:
pd.Series(pd.date_range("2000-01-01", periods=10, freq="1h30min"))

0   2000-01-01 00:00:00
1   2000-01-01 01:30:00
2   2000-01-01 03:00:00
3   2000-01-01 04:30:00
4   2000-01-01 06:00:00
5   2000-01-01 07:30:00
6   2000-01-01 09:00:00
7   2000-01-01 10:30:00
8   2000-01-01 12:00:00
9   2000-01-01 13:30:00
dtype: datetime64[ns]
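
Frequency strings such as `"1h30min"` are shorthand for offset objects. The sketch below uses `to_offset` from `pandas.tseries.frequencies` to show the parsed form, and passes an `Hour` offset to `date_range` directly; the variable names are illustrative.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset
from pandas.tseries.offsets import Hour

# A frequency string is parsed into an offset object
offset = to_offset("1h30min")

# Offset objects can be passed to date_range in place of strings
four_hourly = pd.date_range("2000-01-01", periods=3, freq=Hour(4))

print(offset)
print(four_hourly)
```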

In [186]:
### 'WOM-3FRI': 3rd Friday of the month
monthly_dates = pd.date_range("2023-01-01", "2023-12-31", freq="WOM-3FRI")
pd.DataFrame({"Date":monthly_dates.strftime("%F"),
              "Weekday":monthly_dates.day_name()})

|   | Date | Weekday |
|---|------|---------|
| 0 | 2023-01-20 | Friday |
| 1 | 2023-02-17 | Friday |
| 2 | 2023-03-17 | Friday |
| 3 | 2023-04-21 | Friday |
| 4 | 2023-05-19 | Friday |
| 5 | 2023-06-16 | Friday |
| 6 | 2023-07-21 | Friday |
| 7 | 2023-08-18 | Friday |
| 8 | 2023-09-15 | Friday |
| 9 | 2023-10-20 | Friday |


### Shifting dates with offset

In [187]:
from pandas.tseries.offsets import Day, MonthEnd
now02 = datetime(2023, 9, 28)
now02 + 3 * Day()

Timestamp('2023-10-01 00:00:00')

In [188]:
now02 + MonthEnd()

Timestamp('2023-09-30 00:00:00')

In [189]:
offset01 = MonthEnd()
offset01.rollforward(now02)

Timestamp('2023-09-30 00:00:00')
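
Adding an anchored offset and calling `rollforward` give the same result above only because the date is not already a month end. The following sketch, with illustrative variable names, contrasts the two and also shows `rollback`:

```python
from datetime import datetime
from pandas import Timestamp
from pandas.tseries.offsets import MonthEnd

day = datetime(2023, 9, 28)
offset = MonthEnd()

forward = offset.rollforward(day)   # next month end on or after the date
backward = offset.rollback(day)     # previous month end on or before the date

# A date already on the anchor is left alone by rollforward,
# whereas addition always moves to the following month end
on_anchor = Timestamp("2023-09-30")
kept = offset.rollforward(on_anchor)
moved = on_anchor + offset

print(forward, backward, kept, moved)
```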

In [196]:
np.random.seed(12345)
ts02 = pd.Series(np.random.standard_normal(53),
               index=pd.date_range("2023-01-01", periods=53, freq="7D"))
ts02

2023-01-01   -0.204708
2023-01-08    0.478943
2023-01-15   -0.519439
2023-01-22   -0.555730
2023-01-29    1.965781
2023-02-05    1.393406
2023-02-12    0.092908
2023-02-19    0.281746
2023-02-26    0.769023
2023-03-05    1.246435
2023-03-12    1.007189
2023-03-19   -1.296221
2023-03-26    0.274992
2023-04-02    0.228913
2023-04-09    1.352917
2023-04-16    0.886429
2023-04-23   -2.001637
2023-04-30   -0.371843
2023-05-07    1.669025
2023-05-14   -0.438570
2023-05-21   -0.539741
2023-05-28    0.476985
2023-06-04    3.248944
2023-06-11   -1.021228
2023-06-18   -0.577087
2023-06-25    0.124121
2023-07-02    0.302614
2023-07-09    0.523772
2023-07-16    0.000940
2023-07-23    1.343810
2023-07-30   -0.713544
2023-08-06   -0.831154
2023-08-13   -2.370232
2023-08-20   -1.860761
2023-08-27   -0.860757
2023-09-03    0.560145
2023-09-10   -1.265934
2023-09-17    0.119827
2023-09-24   -1.063512
2023-10-01    0.332883
2023-10-08   -2.359419
2023-10-15   -0.199543
2023-10-22   -1.541996
2023-10-29 

In [199]:
ts02.shape

(53,)

In [202]:
ts02.groupby(MonthEnd().rollforward).count()

2023-01-31    5
2023-02-28    4
2023-03-31    4
2023-04-30    5
2023-05-31    4
2023-06-30    4
2023-07-31    5
2023-08-31    4
2023-09-30    4
2023-10-31    5
2023-11-30    4
2023-12-31    5
dtype: int64
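
Grouping by `MonthEnd().rollforward` labels each group with a month-end timestamp. An alternative sketch, assuming period labels are acceptable for the result, groups on the calendar month via `to_period("M")`; the series here uses integer values instead of random draws so the counts are easy to check.

```python
import numpy as np
import pandas as pd

ts = pd.Series(np.arange(53),
               index=pd.date_range("2023-01-01", periods=53, freq="7D"))

# Map each timestamp to its calendar month, then count per month
by_month = ts.groupby(ts.index.to_period("M")).count()

print(by_month)
```

The counts match the `rollforward` version; only the index type differs (a `PeriodIndex` rather than a `DatetimeIndex`).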

## 2.4.4 Time Zone Handling

In Python, time zone information comes from the third-party `pytz` library (installable with pip or conda), which exposes the *Olson database*, a compilation of world time zone information. This matters especially for historical data, because daylight saving time (DST) transition dates, and even UTC offsets, have changed many times as regional laws changed. In the United States alone, DST transition times have been changed many times since 1900.



In [204]:
# !pip install pytz
import pytz
pytz.common_timezones[-1:]

['UTC']

In [205]:
pytz.timezone("Asia/Manila")

<DstTzInfo 'Asia/Manila' LMT-1 day, 8:04:00 STD>
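
Time zone names like the one above can be used directly with pandas objects. The sketch below is a minimal illustration using `tz_localize` and `tz_convert` on a `DatetimeIndex`; the dates and variable names are assumptions chosen for the example.

```python
import pandas as pd

# Naive timestamps carry no zone; tz_localize attaches one
naive = pd.date_range("2012-03-09 09:30", periods=3, freq="D")
utc = naive.tz_localize("UTC")

# tz_convert re-expresses the same instants in another zone
manila = utc.tz_convert("Asia/Manila")

print(naive.tz)
print(utc)
print(manila)
```

Converting does not change the underlying instants, only how they are displayed, so `utc` and `manila` compare equal element by element.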

## 2.4.5 Periods and Period Arithmetic

**Reading Assignment**

## 2.4.6 Resampling and Frequency Conversion

**Reading Assignment**

## 2.4.7 Moving Window Functions

**Reading Assignment**