### What is a time series?

It is a series of observations at discrete points in time.

We usually assume:
* The observations are equally spaced in time.
* There is only one variable (univariate).

Types of patterns: ① Trend, ② Seasonality, ③ Cycle

### Smoothing
**Rolling Average Smoothing**:

$$s_t = \frac{1}{2m+1}\sum_{k=-m}^{m}x_{t+k} \tag{1}$$

where $m$ is the number of points on each side of $x_t$, so the window size is $w = 2m + 1$.

Advantages: ① simple, ② fast/time-efficient<br/>
Disadvantage: unexpected behaviour at some frequencies; for example, a fast regular oscillation can come out of the filter with its sign flipped.
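A minimal NumPy sketch of the rolling average, dividing by the window size $w = 2m + 1$. The function name `rolling_average` and the test signals are illustrative assumptions, not part of the original notes. The second signal alternates between $+1$ and $-1$ and shows the flipping behaviour: every smoothed value has the opposite sign of the point it is centred on.

```python
import numpy as np

def rolling_average(x, m):
    """Centred rolling average with m points on each side (window w = 2m + 1)."""
    w = 2 * m + 1
    kernel = np.ones(w) / w  # box kernel: equal weights that sum to 1
    # mode="valid" keeps only positions where the full window fits,
    # so the output has len(x) - 2m points.
    return np.convolve(x, kernel, mode="valid")

# A linear ramp is left unchanged (apart from the trimmed edges).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
s = rolling_average(x, m=1)

# A signal that alternates every sample comes out with its sign flipped.
alt = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0])
flipped = rolling_average(alt, m=1)
print(alt[1:-1])  # values at the window centres
print(flipped)    # smoothed values, opposite in sign
```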

**Weighted Average Smoothing**:

$$s_t = \sum^m_{k=-m}x_{t+k}\,g(k), \qquad \sum^m_{k=-m}g(k)=1 \tag{2}$$



$g(k)$ is the kernel; it can be a box, a Gaussian, ...

With a box kernel, $g(k)=\frac{1}{2m+1}$, this becomes rolling average smoothing.

With a Gaussian kernel, the flipping behaviour goes away, but the computation is less time-efficient.
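A sketch of weighted average smoothing with a Gaussian kernel, under the assumption that the kernel is truncated to $k \in [-m, m]$ and renormalized to sum to 1. The names `gaussian_kernel` and `weighted_smooth` and the parameter `sigma` are hypothetical. With $m = 3$ and `sigma = 1.0`, an alternating signal is strongly damped but keeps its sign, unlike with a box kernel of width 3.

```python
import numpy as np

def gaussian_kernel(m, sigma):
    """Gaussian weights g(k) for k = -m..m, normalized to sum to 1."""
    k = np.arange(-m, m + 1)
    g = np.exp(-0.5 * (k / sigma) ** 2)
    return g / g.sum()

def weighted_smooth(x, g):
    """s_t = sum_k x[t+k] * g(k) at every position where the window fits."""
    # np.correlate does not flip the kernel, so it matches the
    # weighted-average sum term by term.
    return np.correlate(x, g, mode="valid")

g = gaussian_kernel(m=3, sigma=1.0)
alt = (-1.0) ** np.arange(11)   # +1, -1, +1, ...
smooth = weighted_smooth(alt, g)
print(alt[3:-3])  # values at the window centres
print(smooth)     # much smaller, but with the same signs
```

How wide the truncation has to be depends on `sigma`: a very short truncated Gaussian can still behave like a box for the fastest oscillations.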

**Weighted average smoothing can be seen as a convolution**

Definition of discrete convolution:

$$(f*g)(t) = \sum^{\infty}_{k=-\infty} f(t-k)g(k) \tag{3}$$

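To apply this to formula $(2)$, set $g(k)=0$ outside the limits $[-m,m]$. The convolution uses $f(t-k)$ where formula $(2)$ uses $x_{t+k}$, which amounts to flipping the kernel $g(k)$; if $g$ is a symmetric function, $g(-k)=g(k)$, the flip changes nothing and the two sums agree.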

Why view weighted average smoothing as a convolution?
Convolution has properties that make weighted smoothing easier to compute and to reason about:<br/>
1. Commutativity: $f*g = g*f$
2. Associativity: $(f*g)*h = f*(g*h)$, and for a scalar $a$: $a(f*g)=(af)*g$
3. Distributivity: $(f+g)*h = (f*h) + (g*h)$
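These properties can be checked numerically with `np.convolve`. The sketch below uses random test arrays (an assumption made purely for illustration); `g` and `h` have the same length so that `g + h` is defined.

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.normal(size=8)
g = rng.normal(size=5)
h = rng.normal(size=5)
a = 2.5

conv = np.convolve  # full discrete convolution of two finite sequences

commutative = np.allclose(conv(f, g), conv(g, f))
associative = np.allclose(conv(conv(f, g), h), conv(f, conv(g, h)))
scalar = np.allclose(a * conv(f, g), conv(a * f, g))
distributive = np.allclose(conv(g + h, f), conv(g, f) + conv(h, f))

print(commutative, associative, scalar, distributive)
```

Associativity is the practically useful one for smoothing: applying two kernels one after the other gives the same result as smoothing once with the single kernel `conv(g, h)`.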

### Fourier Transform

Fourier Transform: a mathematical mapping from the time domain to the frequency domain.

Discrete Fourier Transform (DFT): maps an evenly spaced time series to a discretized frequency domain.

Fast Fourier Transform (FFT): a highly efficient algorithm for computing the DFT in $O(n \log n)$ time.

Period: $P = 0.25\,\mathrm{s}$ $\rightleftharpoons$ Frequency: $f = \frac{1}{P} = 4\,\mathrm{s}^{-1} = 4\,\mathrm{Hz}$

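After the **Discrete Fourier Transform**, a real-valued time series is described by two functions of frequency: the real part (an even function) and the imaginary part (an odd function).

For each frequency index $k$, the coordinate (Real, Imaginary) gives the amplitude $r_k$ and the phase $\varphi_k$ of that frequency as its polar coordinates. Each frequency component can then be drawn in the time domain using the formula $f_k(t) = 2r_k \cos(\frac{2\pi k}{N}t + \varphi_k)$, where $N$ is the number of samples and the factor 2 accounts for the matching negative frequency.

Summing these components over all frequencies reconstructs the original time series.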
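A sketch of this decomposition and reconstruction with `np.fft.rfft`. The test signal is hypothetical. One convention is assumed here: NumPy's forward transform is unnormalized, so the amplitude is taken as $|X_k| / N$ to make $2r_k$ equal to the cosine amplitude. The $k = 0$ term (the mean) and, for even $N$, the highest frequency $k = N/2$ have no negative-frequency partner, so they enter once without the factor 2.

```python
import numpy as np

N = 64
t = np.arange(N)
# Hypothetical test signal: two cosines with known amplitude and phase.
x = (1.5 * np.cos(2 * np.pi * 4 * t / N + 0.3)
     + 0.5 * np.cos(2 * np.pi * 10 * t / N - 1.0))

X = np.fft.rfft(x)   # DFT at the non-negative frequencies k = 0..N/2
r = np.abs(X) / N    # amplitude per frequency (normalized by N)
phi = np.angle(X)    # phase per frequency

# Rebuild the series as a sum of cosines.
recon = np.full(N, X[0].real / N)                  # k = 0: the mean
for k in range(1, N // 2):
    recon += 2 * r[k] * np.cos(2 * np.pi * k * t / N + phi[k])
recon += (X[N // 2].real / N) * np.cos(np.pi * t)  # k = N/2, counted once

print(2 * r[4], phi[4])    # amplitude and phase of the k = 4 component
print(np.allclose(recon, x))
```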

**Spectrum**: plot only the amplitude $r=|F(x)|$, or the spectral density $r^2 = |F(x)|^2$, against frequency.

In the spectrum plot we can identify the fundamental frequency (the first peak) and its harmonics (peaks at integer multiples of it); together they indicate a periodic pattern.

**DFT properties**

|Time domain|Spectrum|Frequency space|
|--|--|--|
|$x \cdot a$, $a$ is constant|$\|F(x \cdot a)\| = \|F(x)\| \cdot \|a\|$| $F(x\cdot a) = F(x) \cdot a$|
|$x+y$|$\|F(x+y)\| \le \|F(x)\| + \|F(y)\|$|$F(x+y) = F(x) + F(y)$|

**Spectrograms**

A spectrogram tracks how the frequency content changes over time, by computing the spectrum over successive short windows.
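A minimal sketch of the idea using only NumPy: cut the series into overlapping frames, taper each frame, and take the DFT magnitude of every frame. The function name `spectrogram`, the Hann taper, the frame length, and the test signal (50 Hz for the first second, 120 Hz for the second) are all assumptions chosen for illustration; dedicated library routines offer more options.

```python
import numpy as np

def spectrogram(x, frame_len, hop):
    """DFT magnitude of successive frames (rows = time, columns = frequency)."""
    window = np.hanning(frame_len)  # taper to reduce leakage at frame edges
    starts = range(0, len(x) - frame_len + 1, hop)
    frames = np.array([x[s:s + frame_len] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1))

fs = 1000                        # sampling rate in Hz (assumed)
t = np.arange(2 * fs) / fs       # two seconds of samples
x = np.where(t < 1.0,
             np.sin(2 * np.pi * 50 * t),
             np.sin(2 * np.pi * 120 * t))

S = spectrogram(x, frame_len=200, hop=100)
freqs = np.fft.rfftfreq(200, d=1 / fs)  # frequency of each column in Hz
dominant = freqs[S.argmax(axis=1)]      # strongest frequency per frame
print(dominant[0], dominant[-1])
```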

### Smoothing in Frequency space
Smoothing can be done in frequency space by using the convolution theorem:

$$x*y = F^{-1}(F(x) \cdot F(y)) \tag{4}$$

That is, convolution in the time domain corresponds to multiplication in the frequency domain.

Three smoothing/filtering kernels:

|Filter|Convolution in time domain|Multiplication in frequency domain|
|--|--|--|
|Rolling average|box|"wavy" weights|
|Gaussian|thin Gaussian (and vice versa)|fat Gaussian|
|Brickwall filter|"wavy"|brickwall|
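A numerical check of the convolution theorem, as a sketch with a random test series (an assumption for illustration). For the DFT the theorem holds for circular convolution, where indices wrap around the ends of the series, so the direct sum below wraps indices with `% n`. The box kernel is centred on index 0, and its transform shows the "wavy" weights from the table, including negative values.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 32
x = rng.normal(size=n)

# Box kernel of width 5, centred on index 0 with wrap-around.
y = np.zeros(n)
y[[0, 1, 2, -1, -2]] = 1 / 5

# Circular convolution computed directly from the definition.
direct = np.array([sum(x[(t - k) % n] * y[k] for k in range(n))
                   for t in range(n)])

# The same result via multiplication in the frequency domain.
via_fft = np.fft.ifft(np.fft.fft(x) * np.fft.fft(y)).real

print(np.allclose(direct, via_fft))

# Frequency-domain weights of the box kernel: real, "wavy", partly negative.
weights = np.fft.fft(y).real
print(weights.min())
```

The negative weights are the frequencies at which the rolling average flips the sign of an oscillation.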

### Missing Fundamentals

The fundamental frequency is determined by the greatest common divisor of the harmonic frequencies, even when the fundamental itself is absent from the spectrum. For example, the harmonics (220 Hz, 440 Hz, 660 Hz) imply a fundamental of 220 Hz; adding a higher harmonic at 770 Hz changes the fundamental to 110 Hz. This [video](https://www.youtube.com/watch?v=AZ8qZCGg4Bk) gives a more detailed and vivid explanation.
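The greatest-common-divisor rule can be written in a few lines, assuming the harmonic frequencies are given as whole numbers of Hz. The function name `implied_fundamental` is hypothetical.

```python
from functools import reduce
from math import gcd

def implied_fundamental(harmonics_hz):
    """Greatest common divisor of integer harmonic frequencies (in Hz)."""
    return reduce(gcd, harmonics_hz)

print(implied_fundamental([220, 440, 660]))       # 220
print(implied_fundamental([220, 440, 660, 770]))  # 110
print(implied_fundamental([440, 660, 880]))       # 220, although 220 Hz is missing
```

Measured peak frequencies are rarely exact integers, so real data would need some tolerance before this rule applies.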


### Autocorrelation

Autocorrelation is the correlation of a time series with lagged versions of itself.

Advantages:<br/>
* Periodic patterns show up as the first peak, so it can handle the missing fundamental problem.
* It handles low frequencies better.

Disadvantages:<br/>
* Not so great at disentangling multiple frequencies.
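A sketch of how autocorrelation recovers a missing fundamental. The test signal is hypothetical: it contains only 20 Hz and 30 Hz, the second and third harmonics of a 10 Hz fundamental that is itself absent. The peak-picking rule used here (skip the initial lobe around lag 0 until the autocorrelation first turns negative, then take the highest value) is one simple assumption among several possible ones.

```python
import numpy as np

def autocorrelation(x):
    """Normalized autocorrelation for lags 0..len(x)-1, with acf[0] == 1."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    full = np.correlate(x, x, mode="full")
    acf = full[len(x) - 1:]  # keep the non-negative lags only
    return acf / acf[0]

fs = 1000                    # sampling rate in Hz (assumed)
t = np.arange(fs) / fs       # one second of samples
x = np.sin(2 * np.pi * 20 * t) + np.sin(2 * np.pi * 30 * t)

acf = autocorrelation(x)
start = np.argmax(acf < 0)            # first lag where acf turns negative
lag = start + np.argmax(acf[start:])  # highest peak after the initial lobe
f0 = fs / lag
print(lag, f0)                        # period in samples, frequency in Hz

# The spectrum, by contrast, has essentially nothing at 10 Hz.
spectrum = np.abs(np.fft.rfft(x))     # bins are 1 Hz apart here
print(spectrum[10], spectrum[20], spectrum[30])
```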
