# stf-decomposition demo
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/thodson-usgs/stf-decomposition/blob/main/notebooks/stf-decomposition-demo.ipynb)  
This notebook demonstrates STF: seasonal-seasonal trend decomposition using the Fast Fourier transform.

TODO add short summary

In [5]:
%%capture
# Add package to Colab environment
#!pip install stf-decomposition
!pip install git+https://github.com/thodson-usgs/stf-decomposition.git
!pip install intake requests aiohttp
!pip install statsmodels==0.11.0

In [6]:
# load the data catalog
import intake
url = 'https://raw.githubusercontent.com/thodson-usgs/stf-decomposition/main/data/stf_data_catalog.yml'
cat = intake.open_catalog(url)
list(cat)

['air_passengers', 'co2', 'covid_usa', 'retail_sales']

In [7]:
# load a dataset
df = cat['co2'].read()

df.head()

ModuleNotFoundError: No module named 'msgpack'

## Basic usage
TODO add short summary

In [None]:
# TODO demonstrate basic usage on co2 dataset
from stf_decomposition import STF

## Comparison with STL
The functionality of stf_decomposition was modeled after the 
[statsmodels.tsa.seasonal.STL](https://www.statsmodels.org/devel/examples/notebooks/generated/stl_decomposition.html) library. 
The motivation in modeling STF after STL was to create a decomposition framework
that is already familiar to users for continuity.
We will show how the trend, seasonal, and residual components compare between STL and STF on three datasets.

### Seasonality of Retail Sales
This dataset contains advanced estimates of monthly sales in millions of dollars for retail trade in the US.

In [None]:
# TODO make a multi-panel that compares STF and STL on several datasets
# you can add more datasets to the data/stf_data_catalog.yml either through our github repo or any URL
from stf_decomposition import STF
from statsmodels.tsa.seasonal import STL
import statsmodels
import pandas as pd

retail = cat['retail_sales'].read()

retail.index = pd.to_datetime(retail["DATE"])

stl = STL(retail["RSXFSN"], seasonal = 13)
res = stl.fit()
fig = res.plot()

# Perform STF with user input seasonal window 
stf = STF(retail["RSXFSN"], "blackman", seasonal = 13)
res = stf.fit()
fig = res.plot()

ModuleNotFoundError: No module named 'stf_decomposition'

### Seasonality of United States Air Passenger Totals: 1949-1960
This dataset contains monthly totals of air passengers in the US from 1949 to 1960.

In [None]:
air = cat['air_passengers'].read()

air.index = pd.to_datetime(air["Month"])

stl = STL(air["#Passengers"], seasonal = 13)
res = stl.fit()
fig = res.plot()

# Perform STF with user input seasonal window 
stf = STF(air["#Passengers"], "blackman", seasonal = 13)
res = stf.fit()
fig = res.plot()

NameError: name 'cat' is not defined

### Seasonality of Covid-19 Case Totals
This dataset contains covid-19 daily case totals. The original dataset contains daily case totals in wide format so cleaning code is included to sum the totals and change to long format. 


In [None]:
# Read in data
covid = cat['covid_usa'].read()

# Clean data to be in long format instead of wide
covid_dates = covid.iloc[:, 11:]
covid_dates = covid_dates.T
covid_dates["total"] = covid_dates.sum(axis=1)
covid_dates.index = pd.to_datetime(covid_dates.index, format = '%m/%d/%y')

stl = STL(covid_dates["total"])
res = stl.fit()
fig = res.plot()

# Perform STF with user input seasonal window 
stf = STF(covid_dates["total"], "blackman")
res = stf.fit()
fig = res.plot()

NameError: name 'cat' is not defined

## Window Optimization
Seasonal window optimization is performed if a user does not give a seasonal input. 
The optimization method used in seasonal window selection is 
[scipy.optimize.brute] (https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.brute.html).
This optimization method was chosen as it returns the global minimum 
of a given function over a given range of values, in this case a range of 3 to 100.


In [None]:
# Window Optimization
co2 = cat['co2'].read()

co2.index = pd.to_datetime(co2["date"])

stf = STF(co2["CO2"], "hanning")
res = stf.fit()
fig = res.plot()

print(stf.seasonal)

In [8]:
import time
air = cat['air_passengers'].read()

air.index = pd.to_datetime(air["Month"])

start_time = time.time()
stf = STF(air["#Passengers"], "hanning")
res = stf.fit()
fig = res.plot()
print(res.seasonal)
print("--- %s seconds ---" % (time.time() - start_time))

start_time = time.time()
stf = STF(air["#Passengers"], "hanning", seasonal = 13)
res = stf.fit()
fig = res.plot()
print("--- %s seconds ---" % (time.time() - start_time))


ModuleNotFoundError: No module named 'msgpack'