# Resampling
Resampling converts time-indexed observations from one frequency to
another. Downsampling moves to a coarser frequency (for instance, monthly
data aggregated into yearly values), while upsampling moves to a finer one
(for instance, weekly data expanded into hours). In pandas, the object
being resampled must have a datetime-like index (DatetimeIndex,
PeriodIndex, or TimedeltaIndex). The following example illustrates the
result of resampling operations in pandas:

In [18]:
import pandas as pd
# The original call had a stray trailing space inside the URL string.
url = 'https://raw.githubusercontent.com/FuTSA23/time-series-analysis-datasets/main/daily-total-female-births-CA-with_nulls.csv'
df = pd.read_csv(url, index_col=0, parse_dates=['date'])
df.head(5)

            births
date              
1959-01-01    35.0
1959-01-02    32.0
1959-01-03    30.0
1959-01-04    31.0
1959-01-05    44.0


In [17]:
df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 365 entries, 1959-01-01 to 1959-12-31
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   births  349 non-null    float64
dtypes: float64(1)
memory usage: 5.7 KB
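
Before resampling the births data, it may help to see both directions on a tiny synthetic series. The following is only a sketch: the series `daily` and its values are made up for illustration and are not part of the dataset, and forward-filling is just one possible way to fill the slots that upsampling creates.

```python
import pandas as pd

# Hypothetical data: 14 daily values starting on Monday, 2024-01-01.
idx = pd.date_range("2024-01-01", periods=14, freq="D")
daily = pd.Series(range(14), index=idx)

# Downsample: aggregate days into weeks (weekly bins end on Sunday by default).
weekly = daily.resample("W").sum()

# Upsample: move from daily to 12-hour steps. The new slots have no data,
# so each one is forward-filled from the previous observation here.
half_daily = daily.resample("12h").ffill()

print(weekly)
print(half_daily.head())
```

Downsampling always needs an aggregation (sum, mean, and so on), whereas upsampling needs a fill or interpolation rule, because it creates timestamps that were never observed.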


## Resampling by Month
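To downsample the daily series to monthly means, pass the month-end alias 'M' to resample and aggregate with mean, which skips missing values. In pandas 2.2 and later, 'M' is deprecated in favor of 'ME'.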

In [12]:
df.births.resample('M').mean()


date
1959-01-31    38.592593
1959-02-28    40.692308
1959-03-31    39.571429
1959-04-30    40.103448
1959-05-31    38.833333
1959-06-30    40.241379
1959-07-31    41.935484
1959-08-31    43.580645
1959-09-30    48.551724
1959-10-31    44.129032
1959-11-30    45.000000
1959-12-31    42.758621
Freq: M, Name: births, dtype: float64
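
The `df.info()` output reports 349 non-null values out of 365 entries, so some of the monthly means are computed from fewer days than the month contains. The sketch below shows one way to check this, using a made-up series rather than the births data. The alias 'MS' (month start) is an assumption chosen so the example runs regardless of the 'M'/'ME' alias change; it labels each bin by the first day of the month instead of the last.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 10.0 for every day of January 2024, 20.0 for February.
idx = pd.date_range("2024-01-01", "2024-02-29", freq="D")
s = pd.Series(np.where(idx.month == 1, 10.0, 20.0), index=idx)

# Knock out three January observations to mimic missing values.
s.iloc[[2, 10, 20]] = np.nan

monthly = s.resample("MS")
summary = pd.DataFrame({
    "mean": monthly.mean(),       # NaN values are skipped
    "non_null": monthly.count(),  # observations that went into the mean
    "rows": monthly.size(),       # calendar days in the bin
})
print(summary)
```

Comparing `non_null` with `rows` makes it visible which bins were affected by gaps.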

## Resampling by Quarter
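To aggregate by calendar quarter, use the quarter-end alias 'Q' (deprecated in favor of 'QE' in pandas 2.2 and later). Each bin is labeled with the last day of its quarter: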

In [13]:
df.births.resample('Q').mean()


date
1959-03-31    39.604938
1959-06-30    39.715909
1959-09-30    44.604396
1959-12-31    43.966292
Freq: Q-DEC, Name: births, dtype: float64

## Resampling by Year
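To aggregate by calendar year, use the year-end alias 'Y' (deprecated in favor of 'YE' in pandas 2.2 and later). Because the data covers only 1959, the result is a single value: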

In [14]:
df.births.resample('Y').mean()

date
1959-12-31    42.048711
Freq: A-DEC, Name: births, dtype: float64
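
A yearly resample is essentially a group-by on the calendar year that keeps a datetime index. The sketch below compares the two on a synthetic two-year series (not the births data). The alias 'YS' (year start) is an assumption made to avoid the 'Y'/'YE' alias change, so the labels are the first day of each year rather than the last.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one increasing value per day across 2023 and 2024.
idx = pd.date_range("2023-01-01", "2024-12-31", freq="D")
values = pd.Series(np.arange(len(idx), dtype=float), index=idx)

# Resample keeps a DatetimeIndex with one label per year.
yearly = values.resample("YS").mean()

# Grouping by the year attribute yields the same means with integer labels.
by_year = values.groupby(values.index.year).mean()

print(yearly)
print(by_year)
```

The resample form is usually more convenient when the result feeds into further time series operations, since the index remains datetime-like.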

## Resampling by Week
To aggregate by week, use the alias 'W'. Weekly bins end on Sunday by default, so each label below is a Sunday:

In [15]:
df.births.resample('W').mean()

date
1959-01-04    32.000000
1959-01-11    36.833333
1959-01-18    42.500000
1959-01-25    43.000000
1959-02-01    35.142857
1959-02-08    39.833333
1959-02-15    42.857143
1959-02-22    41.833333
1959-03-01    40.000000
1959-03-08    40.000000
1959-03-15    36.571429
1959-03-22    40.666667
1959-03-29    40.333333
1959-04-05    41.142857
1959-04-12    37.857143
1959-04-19    38.166667
1959-04-26    40.142857
1959-05-03    42.285714
1959-05-10    39.000000
1959-05-17    37.428571
1959-05-24    42.000000
1959-05-31    39.000000
1959-06-07    42.142857
1959-06-14    39.857143
1959-06-21    36.500000
1959-06-28    39.142857
1959-07-05    44.000000
1959-07-12    44.285714
1959-07-19    41.142857
1959-07-26    40.285714
1959-08-02    43.000000
1959-08-09    44.571429
1959-08-16    43.285714
1959-08-23    40.857143
1959-08-30    45.285714
1959-09-06    48.166667
1959-09-13    45.571429
1959-09-20    45.857143
1959-09-27    52.714286
1959-10-04    51.428571
1959-10-11    45.000000
1959-10-18 
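
The day on which weekly bins end can be changed with an anchored alias such as 'W-MON'. The sketch below uses a made-up series of ones (not the births data) so that each sum is simply the number of days that fell into the bin.

```python
import pandas as pd

# Hypothetical data: two full Monday-to-Sunday weeks of ones.
idx = pd.date_range("2024-01-01", "2024-01-14", freq="D")
ones = pd.Series(1, index=idx)

# Default weekly bins end on Sunday ("W" is shorthand for "W-SUN").
by_sunday = ones.resample("W").sum()

# Anchoring on Monday shifts the bin edges, so the same 14 days are
# split across three bins, each labeled by the Monday that closes it.
by_monday = ones.resample("W-MON").sum()

print(by_sunday)
print(by_monday)
```

Choosing the anchor matters when weekly totals need to line up with a reporting convention, such as weeks that close on a particular weekday.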

## Resampling on a Semimonthly Basis
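To aggregate twice a month, use the semi-month-end alias 'SM' (deprecated in favor of 'SME' in pandas 2.2 and later), which places bin edges on the 15th and on the last day of each month. By default these bins are labeled by their left edge, which is why the first label, 1958-12-31, precedes the first observation: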

In [16]:
df.births.resample('SM').mean()

date
1958-12-31    35.750000
1959-01-15    42.071429
1959-01-31    38.142857
1959-02-15    43.166667
1959-02-28    38.142857
1959-03-15    40.214286
1959-03-31    38.333333
1959-04-15    41.285714
1959-04-30    39.133333
1959-05-15    39.400000
1959-05-31    40.800000
1959-06-15    38.142857
1959-06-30    43.733333
1959-07-15    41.375000
1959-07-31    43.733333
1959-08-15    43.250000
1959-08-31    45.857143
1959-09-15    50.266667
1959-09-30    47.800000
1959-10-15    41.312500
1959-10-31    44.428571
1959-11-15    45.133333
1959-11-30    41.357143
1959-12-15    44.200000
1959-12-31    50.000000
Freq: SM-15, Name: births, dtype: float64
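
The labels above can look surprising, since the first one predates the data and the last bin holds only a single day. The sketch below reproduces that pattern on a made-up month of ones (not the births data). Passing the `pd.offsets.SemiMonthEnd()` object instead of a string is an assumption made here to sidestep the 'SM'/'SME' alias difference between pandas releases.

```python
import pandas as pd

# Hypothetical data: one observation of 1 for each day of January 2024.
idx = pd.date_range("2024-01-01", "2024-01-31", freq="D")
ones = pd.Series(1, index=idx)

# Bin edges fall on the 15th and on month end. With the defaults, each bin
# includes its left edge and is labeled by it.
counts = ones.resample(pd.offsets.SemiMonthEnd()).sum()
print(counts)
```

The sums show how many days landed in each bin: January 1 to 14 fall under the 2023-12-31 label, January 15 to 30 under 2024-01-15, and January 31 alone under 2024-01-31.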