---
title: "1 Characteristics of Time Series 1.6 Vector-Valued and Multidimensional Series"
author: "Aaron Smith"
date: '2022-11-20'
output: html_document
---

This code is modified from Time Series Analysis and Its Applications, by Robert H. Shumway and David S. Stoffer:
https://github.com/nickpoison/tsa4

The most recent version of the package can be found at
https://github.com/nickpoison/astsa/

You can find demonstrations of astsa capabilities at
https://github.com/nickpoison/astsa/blob/master/fun_with_astsa/fun_with_astsa.md

In addition, the News and ChangeLog files are at
https://github.com/nickpoison/astsa/blob/master/NEWS.md.

The webpages for the texts and some help on using R for time series analysis can be found at 
https://nickpoison.github.io/.

UCF students can download the textbook for free through the library.

```{r,eval = FALSE}
#install.packages(
#  pkgs = "remotes"
#)
#remotes::install_github(
#  repo = "nickpoison/astsa/astsa_build"
#)
```

```{r}
options(
  digits = 3,
  scipen = 99
)
rm(
  list = ls()
)
```

We frequently encounter situations in which the relationships between a number of jointly measured time series are of interest. 

Frequently, we will be working with multiple time series in time and space.

For example, in the previous sections, we considered discovering the relationships between the SOI and Recruitment series. Hence, it will be useful to consider the notion of a vector time series

$$
x_t = \begin{pmatrix}
x_{t1} \\ x_{t2} \\ \vdots \\ x_{tp}
\end{pmatrix},
$$

which contains as its components $p$ univariate time series.

We denote the $p \times 1$ column vector of the observed series as $x_t$. The row vector $x'_t$ is its transpose. For the stationary case, the $p \times 1$ mean vector is

$$\mu = E(x_t) = \begin{pmatrix}
\mu_1 \\ \mu_2 \\ \vdots \\ \mu_p
\end{pmatrix},$$

which does not depend on $t$.

Notice that we use $x_t$ for both univariate and vector time series (notation overlap); context tells us which version is being discussed.
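As a small illustration of the notation, a vector time series can be stored in R as a multivariate `ts` object with one column per component series. The sketch below uses simulated data only; the object name `x_vec` and the column names are hypothetical and are not part of astsa.

```{r}
# Simulated bivariate (p = 2) monthly series; values are illustrative only
set.seed(3)
x_vec <- ts(
  data = cbind(
    series1 = rnorm(n = 24),
    series2 = rnorm(n = 24, mean = 5)
  ),
  start = c(2020, 1),
  frequency = 12
)
dim(x_vec)       # n time points (rows) by p component series (columns)
x_vec[5, ]       # the observation vector x_t at t = 5
colMeans(x_vec)  # sample estimate of the mean vector mu
```

Each row of `x_vec` plays the role of $x'_t$, and `colMeans()` gives the sample analogue of $\mu$.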

## autocovariance matrix

The autocovariance matrix is the $p \times p$ matrix

$$
\Gamma(h) = E[(x_{t+h} - \mu)(x_t - \mu)']
$$

## cross-covariance matrix

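The elements of $\Gamma(h)$ are the cross-covariance functions. Each factor below is a scalar, so no transpose is needed.

$$
\gamma_{ij}(h) = E[(x_{t+h,i} - \mu_i)(x_{t,j} - \mu_j)], \ i,j = 1,\ldots,p \\
\gamma_{ij}(h) = \gamma_{ji}(-h) \\
\Gamma(h) = \Gamma'(-h)
$$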

## sample autocovariance matrix

The sample autocovariance matrix of the vector series $x_t$ is the $p \times p$ matrix of sample cross-covariances

$$
\widehat{\Gamma}(h) = \dfrac{1}{n} \sum_{t = 1}^{n-h}(x_{t+h} - \bar{x})(x_t - \bar{x})' \\
\bar{x} = \dfrac{1}{n}\sum_{t = 1}^{n} x_t \text{ (the } p \times 1 \text{ sample mean vector)} \\
\widehat{\Gamma}(h) = \widehat{\Gamma}'(-h)
$$
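The sample autocovariance matrix can be computed directly from its definition. The following is a minimal sketch on simulated data; the helper `sample_autocov()` is hypothetical (it is not an astsa function), and the comparison with `stats::acf()` assumes that `acf(...)$acf[h + 1, i, j]` estimates the covariance between component $i$ at time $t+h$ and component $j$ at time $t$.

```{r}
# Hypothetical helper: sample autocovariance matrix at lag h (h may be negative)
sample_autocov <- function(x, h) {
  x <- as.matrix(x)
  n <- nrow(x)
  x_centered <- sweep(x = x, MARGIN = 2, STATS = colMeans(x))
  # values of t for which both t and t + h lie in 1..n
  t_index <- max(1, 1 - h):min(n, n - h)
  # crossprod(a, b) is t(a) %*% b, the sum over t of (x_{t+h} - xbar)(x_t - xbar)'
  crossprod(
    x = x_centered[t_index + h, , drop = FALSE],
    y = x_centered[t_index, , drop = FALSE]
  ) / n
}

# Simulated data: the second series is a noisy copy of the first, delayed one step
set.seed(1)
n_sim <- 200
e_sim <- rnorm(n = n_sim + 1)
x_sim <- cbind(
  e_sim[-1],
  e_sim[-(n_sim + 1)] + rnorm(n = n_sim, sd = 0.5)
)

gamma_hat_2 <- sample_autocov(x = x_sim, h = 2)
gamma_hat_m2 <- sample_autocov(x = x_sim, h = -2)
all.equal(gamma_hat_2, t(gamma_hat_m2))  # checks Gamma-hat(h) = Gamma-hat'(-h)

# Comparison with the built-in estimator
acf_builtin <- acf(x = x_sim, lag.max = 2, type = "covariance", plot = FALSE)$acf
all.equal(gamma_hat_2, acf_builtin[3, , ], check.attributes = FALSE)
```

Dividing by $n$ rather than $n - h$ matches the definition above.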

# Example 1.30 Soil Surface Temperatures

The two-dimensional temperature series in the `soiltemp` data set is indexed by a row and a column that represent positions on a $64 \times 36$ spatial grid on an agricultural field.

The temperature measured at row $s_1$ and column $s_2$ is denoted by $x_{s_1,s_2}$, with the row subscript first and the column subscript second.

The two-dimensional plot shows that a distinct change occurs around row 40, where the oscillations along the row axis become fairly stable and periodic. 

Averaging over the 36 columns, we may compute an average value for each row. The noise present in the first part of the two-dimensional series is nicely averaged out, and we see a clear and consistent temperature signal.

Spatial Grid of Surface Soil Temperatures

A 64 by 36 matrix of surface soil temperatures.

The format is: num [1:64, 1:36] 6.7 8.9 5 6.6 6.1 7 6.5 8.2 6.7 6.6 ...

3-d plot

```{r}
data(
  list = "soiltemp",
  package = "astsa"
)
par(
  mar = c(1,1,1,1)
)
persp(
  x = 1:64,
  y = 1:36,
  z = soiltemp,
  phi = 30,
  theta = 30,
  scale = FALSE,
  expand = 4,
  ticktype = "detailed",
  xlab = "rows",
  ylab = "cols",
  zlab = "temperature"
)
#dev.new()
```

time series plot of the row means

```{r}
astsa::tsplot(
  x = rowMeans(
    x = soiltemp
  ),
  xlab = "row",
  ylab = "Average Temperature"
)
```

Use ggplot to visualize the data

```{r}
M_soiltemp <- as.data.frame(
  x = soiltemp
)
rownames(M_soiltemp) <- 1:nrow(M_soiltemp)
colnames(M_soiltemp) <- 1:ncol(M_soiltemp)
M_soiltemp$row <- rownames(M_soiltemp)
M_soiltemp <- as.data.frame(
  x = tidyr::gather(
    data = M_soiltemp,
    key = "column",
    value = "soil_temperature",
    -row
  )
)
M_soiltemp$row <- factor(
  x = M_soiltemp$row,
  levels = sort(
    x = as.integer(unique(M_soiltemp$row)),
    decreasing = TRUE
  )
)
M_soiltemp$column <- factor(
  x = M_soiltemp$column,
  levels = sort(
    x = as.integer(unique(M_soiltemp$column)),
    decreasing = FALSE
  )
)
library(ggplot2)
ggplot(M_soiltemp) + 
  aes(x = column,y = row,fill = soil_temperature) + 
  geom_tile() + 
  scale_fill_gradient(low = "#0000FFFF",high = "#FF0000FF") +
  theme_bw() + 
  theme(legend.position = "none")
ggplot(M_soiltemp) +
  aes(x = row,y = soil_temperature) + 
  geom_boxplot() + 
  theme_bw()
ggplot(M_soiltemp) +
  aes(x = column,y = soil_temperature) + 
  geom_boxplot() + 
  theme_bw()
```

## autocovariance function of a stationary multidimensional process

The autocovariance function of a stationary multidimensional process, $x_s$, can be defined as a function of the multidimensional lag vector,

$$
\gamma(h) = E[(x_{s+h} - \mu)(x_s - \mu)] \\
h = \begin{pmatrix} h_1 \\ h_2 \\ \vdots \\ h_r \end{pmatrix} \\
\mu = E(x_s) \ (x_s \text{ is multidimensional, the mean does not depend on spatial coordinate})
$$

In two dimensions, this becomes

$$
\gamma(h_1,h_2) = E[(x_{s_1+h_1,s_2 + h_2} - \mu)(x_{s_1,s_2} - \mu)]
$$

which is a function of lag, both in the row ($h_1$) and column ($h_2$) directions.

## multidimensional sample autocovariance

The multidimensional sample autocovariance function is defined as

$$
\widehat{\gamma}(h) = (S_1S_2\dots S_r)^{-1}\sum_{s_1}\sum_{s_2}\dots\sum_{s_r}(x_{s+h} - \bar{x})(x_{s} - \bar{x}) \\
s = \begin{pmatrix} s_1 \\ s_2 \\ \vdots \\ s_r \end{pmatrix} \\
1 \leq s_i \leq S_i - h_i \\
i = 1,2,\ldots ,r
$$

The mean is computed over the $r$-dimensional array:

$$
\bar{x} = (S_1S_2\dots S_r)^{-1}\sum_{s_1}\sum_{s_2}\dots\sum_{s_r} x_{s_1,s_2,\dots , s_r} \\
1 \leq s_i \leq S_i
$$

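## multidimensional sample autocorrelation

The multidimensional sample autocorrelation function follows by dividing the sample autocovariance by its value at lag zero: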

$$
\widehat{\rho}(h) = \dfrac{\widehat{\gamma}(h)}{\widehat{\gamma}(0)}
$$
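For a two-dimensional array, the sample autocovariance and autocorrelation can be evaluated straight from the definitions above. This is a minimal sketch for non-negative lags on a small simulated grid; the helper `acov2d_direct()` and the object `grid_sim` are hypothetical names introduced only for illustration.

```{r}
# Hypothetical helper: 2d sample autocovariance at lag (h1, h2), with h1, h2 >= 0
acov2d_direct <- function(x, h1, h2) {
  S1 <- nrow(x)
  S2 <- ncol(x)
  stopifnot(h1 >= 0, h1 < S1, h2 >= 0, h2 < S2)
  x_centered <- x - mean(x)
  # x_{s1 + h1, s2 + h2} paired with x_{s1, s2} for 1 <= s_i <= S_i - h_i
  shifted <- x_centered[(1 + h1):S1, (1 + h2):S2, drop = FALSE]
  base <- x_centered[1:(S1 - h1), 1:(S2 - h2), drop = FALSE]
  sum(shifted * base) / (S1 * S2)
}

# Simulated 8 x 6 grid: a period-4 signal along the rows plus noise
set.seed(1)
grid_sim <- matrix(rnorm(n = 8 * 6), nrow = 8, ncol = 6) +
  outer(X = sin(2 * pi * (1:8) / 4), Y = rep(1, 6))

gamma_00 <- acov2d_direct(x = grid_sim, h1 = 0, h2 = 0)
rho_10 <- acov2d_direct(x = grid_sim, h1 = 1, h2 = 0) / gamma_00  # one row apart
rho_01 <- acov2d_direct(x = grid_sim, h1 = 0, h2 = 1) / gamma_00  # one column apart
c(gamma_00 = gamma_00, rho_10 = rho_10, rho_01 = rho_01)
```

The double loop implied by the definition is replaced by elementwise multiplication of two shifted sub-blocks of the centered grid.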

# Example 1.31 Sample ACF of the Soil Temperature Series

The autocorrelation function of the two-dimensional temperature process can be written as

$$
\widehat{\gamma}(h_1,h_2) = (S_1S_2)^{-1}\sum_{s_1}\sum_{s_2} (x_{s_1 + h_1,s_2 + h_2} - \bar{x})(x_{s_1,s_2} - \bar{x}) \\
\widehat{\rho}(h_1,h_2) = \dfrac{\widehat{\gamma}(h_1,h_2)}{\widehat{\gamma}(0,0)}
$$

The autocorrelation plot for the soil temperature shows a systematic periodic variation that appears along the rows.

The autocovariance over columns seems to be strongest for $h_1 = 0$, implying columns may form replicates of some underlying process that has a periodicity over the rows. 

This idea can be investigated by examining the row means series.

One way to calculate a 2d ACF in R is by using the fast Fourier transform (FFT).

The 2d autocovariance function is obtained in two steps and is stored in `Re_fft_inverse` below; $\widehat{\gamma}(0,0)$ is the (1,1) element, so $\widehat{\rho}(h_1,h_2)$ is obtained by dividing each element by that value.

The 2d ACF is contained in `rs` below, and the rest of the code simply arranges the results to center lag zero for a nice display.

```{r}
fft_soiltemp <- fft(
  z = soiltemp-mean(
    x = soiltemp
    )
)
abs_fft_soiltemp <- abs(
  x = fft_soiltemp
)^2/(nrow(soiltemp)*ncol(soiltemp)) # see Ch 4 for info on FFT
# R's inverse fft() is unnormalized, so divide by the number of grid points
# (this only affects the scale of the ACovF; the ACF below is unchanged)
Re_fft_inverse <- Re(
  z = fft(
    z = abs_fft_soiltemp,
    inverse = TRUE
  )/(nrow(soiltemp)*ncol(soiltemp))
)  # ACovF
rs <- Re_fft_inverse/Re_fft_inverse[1,1] # ACF
rs2 <- cbind(
  rs[1:41,21:2],
  rs[1:41,1:21]
)   #  these lines are just to center
rs3 <- rbind(
  rs2[41:2,],
  rs2
)                #  the 0 lag  
par(
  mar = c(1.1,2.6,0.1,0.1)
)
persp(
  x = -40:40,
  y = -20:20,
  z = rs3,
  phi = 30,
  theta = 30,
  expand = 30,
  scale = FALSE,
  ticktype = "detailed",
  xlab = "row lags",
  ylab = "column lags",
  zlab = "ACF"
)
```
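One caveat about the FFT calculation above: without zero-padding, the FFT treats the grid as periodic, so the result is a circular (wrap-around) version of the sample autocovariance. At lag $(0,0)$ the two agree, but at other lags the circular version includes products that wrap around the edges of the grid. The sketch below, on a small simulated grid with hypothetical object names, zero-pads the centered data so that the FFT result can be compared with the definition of $\widehat{\gamma}(h_1,h_2)$.

```{r}
# Simulated grid; all object names here are illustrative only
set.seed(2)
S1_sim <- 8
S2_sim <- 6
y_centered <- matrix(rnorm(n = S1_sim * S2_sim), nrow = S1_sim, ncol = S2_sim)
y_centered <- y_centered - mean(y_centered)

# Circular version (no padding). R's inverse fft() is unnormalized, so one
# division by S1 * S2 undoes the transform and a second gives the 1 / (S1 * S2)
# factor in the definition.
acov_circular <- Re(
  fft(z = Mod(fft(z = y_centered))^2, inverse = TRUE)
) / (S1_sim * S2_sim)^2

# Zero-padded version: pad to twice the size in each direction so that
# shifted products never wrap around onto real data
y_padded <- matrix(0, nrow = 2 * S1_sim, ncol = 2 * S2_sim)
y_padded[1:S1_sim, 1:S2_sim] <- y_centered
acov_padded <- Re(
  fft(z = Mod(fft(z = y_padded))^2, inverse = TRUE)
) / length(y_padded) / (S1_sim * S2_sim)

# Direct evaluation of the definition at lag (h1, h2) = (2, 1)
h1_sim <- 2
h2_sim <- 1
acov_direct <- sum(
  y_centered[(1 + h1_sim):S1_sim, (1 + h2_sim):S2_sim] *
    y_centered[1:(S1_sim - h1_sim), 1:(S2_sim - h2_sim)]
) / (S1_sim * S2_sim)

# The padded FFT value is expected to agree with the definition
# up to floating-point error
all.equal(acov_padded[h1_sim + 1, h2_sim + 1], acov_direct)
```

For a grid as large as `soiltemp` and the modest lags displayed, the circular and padded versions may look similar, but only the padded one corresponds term by term to the sums in the definition.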
