
4 Estimating the Model
We are now ready to fit the spatio-temporal model to data. Since the estimation
is described in Section of vignette("ST_intro", package="SpatioTemporal"),
we focus here on the details of the output from the estimation functions.
4.1 Parameter Estimation
Before estimating the parameters, we can look at the important dimensions
of the model:
> model.dim <- loglikeSTdim(mesa.model)
> str(model.dim)
List of 12
$ T : int 280
$ m : int 3
$ n : int 25
$ n.obs : int 25
$ p : Named int [1:3] 4 2 2
..- attr(*, "names")= chr [1:3] "const" "V1" "V2"
$ L : int 1
$ npars.beta.covf: Named int [1:3] 2 2 2
..- attr(*, "names")= chr [1:3] "exp" "exp" "exp"
$ npars.beta.tot : Named int [1:3] 2 2 2
..- attr(*, "names")= chr [1:3] "exp" "exp" "exp"
$ npars.nu.covf : int 2
$ npars.nu.tot : int 4
$ nparam.cov : int 10
$ nparam : int 19
T gives us the number of time points; m the number of β-fields; n the number
of locations; n.obs the number of observed locations (equal to n in this case,
since we have no unobserved locations); p a vector giving the number of
regression coefficients for each of the β-fields; L the number of spatio-temporal
covariates. The rest of the output gives numbers of parameters by field, and
total number of parameters. An important dimension is nparam.cov, which
gives us the total number of covariance parameters. Since the regression
coefficients are estimated by profile likelihood, in essence only the covariance
parameters need starting values specified. We do this accordingly:
> x.init <- cbind(c( rep(2, model.dim$nparam.cov-1), 0),
c( rep(c(1,-3), model.dim$m+1), -3, 0))
Here each column of x.init contains one set of starting values for the optimisation
process used to find the MLEs of the 10 covariance parameters. Note
that these are starting values only for the optimisation of the covariance
parameters; once those have been optimised, the maximum-likelihood estimate
of the regression coefficients can be inferred using generalised least
squares (see Lindström et al., 2013, for details). In general x.init should
be a (nparam.cov)-by-(number of starting points) matrix, or just a vector of
length nparam.cov if only one starting point is desired.
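As a quick sanity check, the construction of the starting values can be reproduced with the dimensions hard-coded from the str(model.dim) output (nparam.cov = 10, m = 3), so that the snippet runs without the model object. This is only a sketch of the shape check; the local names nparam.cov and m are stand-ins for model.dim$nparam.cov and model.dim$m.

```r
# Dimensions copied from the str(model.dim) output shown earlier
nparam.cov <- 10   # total number of covariance parameters
m <- 3             # number of beta-fields

# Two starting points, one per column, built as in the text
x.init <- cbind(c(rep(2, nparam.cov - 1), 0),
                c(rep(c(1, -3), m + 1), -3, 0))

# One row per covariance parameter, one column per starting point
stopifnot(nrow(x.init) == nparam.cov, ncol(x.init) == 2)
dim(x.init)
```

If only a single starting point is wanted, either column on its own (a vector of length nparam.cov) would have the required shape.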
What parameters are we specifying the starting points for? We can verify
this using loglikeSTnames, which gives the order of variables in x.init and
also tells us which of the parameters are logged. Specifying all=FALSE gives
us only the covariance parameters.


> rownames(x.init) <- loglikeSTnames(mesa.model, all=FALSE)
> x.init
[,1] [,2]
log.range.const.exp 2 1
log.sill.const.exp 2 -3
log.range.V1.exp 2 1
log.sill.V1.exp 2 -3
log.range.V2.exp 2 1
log.sill.V2.exp 2 -3
nu.log.range.exp 2 1
nu.log.sill.exp 2 -3
nu.log.nugget.(Intercept).exp 2 -3
nu.log.nugget.typeFIXED.exp 0 0
We are now ready to estimate the model parameters!
WARNING: The following steps are time-consuming.
> est.mesa.model <- estimate(mesa.model, x.init,
type="p", hessian.all=TRUE)
Tutorial for SpatioTemporal
ALTERNATIVE: Load pre-computed results.
> data(est.mesa.model, package="SpatioTemporal")
End of alternative
The function estimate() (strictly estimate.STmodel) estimates all the model
parameters. Specifying type="p" indicates we want to maximize the profile
likelihood. hessian.all=TRUE indicates we want the Hessian for all model
parameters; if we leave this entry blank, the default will compute the Hessian
for only the log-covariance parameters.
From the output, which is mainly due to the R function optim, we see that the
two optimisations consumed 93 and 95 function evaluations, respectively, and ended
at essentially the same value (5748.562 and 5748.563). The exact behaviour of
optim, including the amount of progress information, is controlled by the
pass-through argument control = list(trace=3, maxit=1000).
The log-likelihood function called by estimate() is included in the package
as loglikeST, with loglikeSTgrad and loglikeSTHessian computing the
(finite difference) gradient and Hessian of the log-likelihood function. In
case of trouble with the optimisation the user is recommended to study the
behaviour of the log-likelihood at the troublesome parameter values.
Here we just verify that the log-likelihood value given parameters from the
optimisation actually equals the maximum reported from the optimisation.
> loglikeST(est.mesa.model$res.best$par, mesa.model)
[1] 5748.563
> est.mesa.model$res.best$value
[1] 5748.563
4.2 Evaluating the Results
The first step in evaluating the optimisation results is to study the message
included in the output from estimate(), as well as the converged parameter
values from the two starting points:
> print(est.mesa.model)

Optimisation for STmodel with 2 starting points.
Results: 2 converged, 0 not converged, 0 failed.
Best result for starting point 2, optimisation has converged
No fixed parameters.
Estimated parameters for all starting point(s):
[,1] [,2]
gamma.lax.conc.1500 0.0008974712 0.0008978266
alpha.const.(Intercept) 3.7403623155 3.7401860388
alpha.const.log10.m.to.a1 -0.2021666282 -0.2021109943
alpha.const.s2000.pop.div.10000 0.0402093462 0.0402186866
alpha.const.km.to.coast 0.0374438274 0.0374357454
alpha.V1.(Intercept) -0.7424941400 -0.7427147900
alpha.V1.km.to.coast 0.0173683410 0.0173855016
alpha.V2.(Intercept) -0.1290170229 -0.1291748864
alpha.V2.km.to.coast 0.0155335639 0.0155418776
log.range.const.exp 2.4236209882 2.4252079596
log.sill.const.exp -2.7543738952 -2.7530652909
log.range.V1.exp 2.9250935694 2.9212909392
log.sill.V1.exp -3.5194123375 -3.5230779066
log.range.V2.exp 1.7870147389 1.7833703171
log.sill.V2.exp -4.6784625732 -4.6776455920
nu.log.range.exp 4.3830937438 4.3833567348
nu.log.sill.exp -3.2127991617 -3.2127208289
nu.log.nugget.(Intercept).exp -4.4124997680 -4.4118811769
nu.log.nugget.typeFIXED.exp 0.6769586148 0.6750069008
Function value(s):
[1] 5748.562 5748.563
The message at the top of the output indicates that of our 2 starting points
both converged, and the best overall result was found for the second starting
value.
The function estimate() determines convergence for a given optimisation
by first studying the convergence field in the output from optim, where 0 indicates
successful completion. It then evaluates the eigenvalues of
the Hessian (the second derivative of the log-likelihood) to determine whether the
matrix is negative definite, which indicates that the optimisation has found a (local)
maximum.
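The two-part convergence check can be illustrated in base R on a toy problem. The sketch below is not the code used inside estimate(); the objective toy.loglike is a hypothetical concave function standing in for the log-likelihood, and the Hessian is obtained with optimHess rather than with the package's own functions.

```r
# Toy concave objective standing in for a log-likelihood (not from the package)
toy.loglike <- function(p) -(p[1] - 1)^2 - 2 * (p[2] + 0.5)^2

# fnscale = -1 turns optim's default minimisation into a maximisation
res <- optim(c(0, 0), toy.loglike, method = "BFGS",
             control = list(fnscale = -1, maxit = 1000))

# Finite-difference Hessian of the objective at the optimum
H <- optimHess(res$par, toy.loglike)

# Converged if optim reports code 0 and all Hessian eigenvalues are negative
ev <- eigen(H, symmetric = TRUE)$values
conv <- (res$convergence == 0) && all(ev < 0)
conv
```

A positive or near-zero eigenvalue would indicate a saddle point or a flat direction, in which case the reported optimum should be treated with caution even if optim itself returns code 0.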

In [None]:

23
Tutorial for SpatioTemporal
Included in the output from estimate()
> names(est.mesa.model)
[1] "res.best" "res.all" "summary"
are the results from all the optimisations and the best result among them. Here
res.all is a list with the optimisation results for each starting point, and
res.best contains the “best” optimisation result.
Examining the optimisation results
> names(est.mesa.model$res.best)
[1] "par" "value" "counts" "convergence"
[5] "message" "hessian" "conv" "par.cov"
[9] "par.all" "hessian.all"
> names(est.mesa.model$res.all[[1]])
[1] "par" "value" "counts" "convergence"
[5] "message" "hessian" "conv" "par.cov"
[9] "par.all"
> names(est.mesa.model$res.all[[2]])
[1] "par" "value" "counts" "convergence"
[5] "message" "hessian" "conv" "par.cov"
[9] "par.all"
we see that the results include several different fields, several of which are
taken directly from the output of the optim function:
par The estimated log-covariance parameters.
value The value of the log-likelihood.
counts The number of function evaluations.
convergence and message Convergence information from optim.
conv An indicator of convergence that combines convergence with a check
if the Hessian is negative definite
hessian The Hessian of the profile log-likelihood, from optim
par.cov A data frame containing estimates, estimated standard errors, initial
or fixed values depending on whether we estimated or fixed the
various parameters (in this case, all were estimated), and t-statistics
for the log-covariance parameters
par.all The same summary as par.cov, but for all the parameters of the
model. The regression coefficients are computed using generalised least
squares (see Lindström et al., 2013, for details).
hessian.all The Hessian of the full log-likelihood (computed by
loglikeSTHessian). This is only computed for the best result,
est.mesa.model$res.best.
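The generalised least squares step mentioned under par.all can be sketched in base R. The example below is an illustration on simulated data, not the package's implementation: the locations, the exponential covariance with a nugget, and all names (coords, Sigma, alpha.hat, alpha.cov) are assumptions made for the sketch. Given a covariance matrix Sigma, the coefficients are (X' Sigma^-1 X)^-1 X' Sigma^-1 y.

```r
set.seed(1)
n <- 25                                       # number of locations, as in the example data
coords <- cbind(runif(n), runif(n))           # hypothetical site coordinates
D <- as.matrix(dist(coords))                  # pairwise distances

# Assumed covariance: exponential with sill 0.5, range 0.3, plus a nugget of 0.1
Sigma <- 0.5 * exp(-D / 0.3) + diag(0.1, n)

# Design matrix (intercept + one covariate) and simulated correlated response
X <- cbind(1, rnorm(n))
y <- drop(X %*% c(3, -0.2) + t(chol(Sigma)) %*% rnorm(n))

# GLS estimate: solve with Sigma instead of forming its inverse explicitly
Si.X <- solve(Sigma, X)                       # Sigma^-1 X
alpha.hat <- solve(t(X) %*% Si.X, t(Si.X) %*% y)
alpha.cov <- solve(t(X) %*% Si.X)             # covariance of the estimate

cbind(estimate = drop(alpha.hat), sd = sqrt(diag(alpha.cov)))
```

The standard deviations here treat Sigma as known, which mirrors the idea that the regression coefficients follow once the covariance parameters have been fixed.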
Refer back to the output from print(est.mesa.model); we consider now the
two columns of parameter estimates resulting from the two starting values.
The parameters are similar but not identical, with the biggest difference
being for log.range.V1. The differences have to do with where and how the
numerical optimisation stopped/converged. Due to the few locations (only
25) the log-likelihood is flat, implying that even with some variability in the
parameter values we will still obtain very similar log-likelihood values.
The flat log-likelihood implies that some parameter estimates will be rather
uncertain. Extracting the estimated parameters and parameter uncertainties,
we note large standard deviations for the β-field covariance parameters.
> coef(est.mesa.model, pars="cov")[,1:2]
par sd
log.range.const.exp 2.4252080 0.59198470
log.sill.const.exp -2.7530653 0.39329269
log.range.V1.exp 2.9212909 0.72546209
log.sill.V1.exp -3.5230779 0.54042577
log.range.V2.exp 1.7833703 0.56569386
log.sill.V2.exp -4.6776456 0.33558690
nu.log.range.exp 4.3833567 0.09639940
nu.log.sill.exp -3.2127208 0.06161702
nu.log.nugget.(Intercept).exp -4.4118812 0.05634108
nu.log.nugget.typeFIXED.exp 0.6750069 0.11763780
This is due to the number of “observations” that go into estimating the β-
field covariance parameters; there are only 25 locations. On the other hand,
the entire contingent of observations (4577 in this data set) can be used to
estimate the covariance parameters of the spatio-temporal residual field.
Another way of seeing this is that we have only one replicate of each β-field
(given by the regression of observations on the smooth temporal basis functions)
but T = 280 replicates of the residual field, one for each time point
(given by the residuals from the regression). Either way, the larger sample
size for the residual field makes the standard errors for those covariance
parameters smaller, leading to tighter confidence intervals.
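To make the difference in uncertainty concrete, approximate 95% intervals can be computed from the estimates and standard deviations printed by coef(). The sketch below copies two rows of that output by hand and assumes a normal approximation on the log scale (estimate ± 1.96 sd); that approximation is an assumption of this example, not something stated by the package.

```r
# Estimates and standard deviations copied from the coef() output (log scale)
est <- c(log.range.const.exp = 2.4252080, nu.log.range.exp = 4.3833567)
se  <- c(log.range.const.exp = 0.59198470, nu.log.range.exp = 0.09639940)

# Approximate 95% intervals on the log scale (normal approximation assumed)
ci.log <- cbind(lower = est - 1.96 * se, upper = est + 1.96 * se)

# Back-transform to the natural scale of the range parameters
ci <- exp(ci.log)

# Ratio upper/lower: a scale-free measure of how wide each interval is
ci.ratio <- ci[, "upper"] / ci[, "lower"]
print(ci)
print(ci.ratio)
```

The ratio for the β-field range is far larger than for the residual-field range, which is the numerical counterpart of the sample-size argument above.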



4.3 Predictions
Having estimated the model parameters we use predict.STmodel to compute
the conditional expectations for different parts of the model.
WARNING: The following steps are time-consuming.
> pred.mesa.model <- predict(mesa.model, est.mesa.model,
pred.var=TRUE)
ALTERNATIVE: Load pre-computed results.
> data(pred.mesa.model, package="SpatioTemporal")
End of alternative
The result from predict contains the following elements
> names(pred.mesa.model)
[1] "opts" "pars" "beta" "EX.mu"
[5] "EX.mu.beta" "EX" "VX" "VX.pred"
[9] "I"
described in more detail by the print function
> print(pred.mesa.model)
Prediction for STmodel.
Regression parameters:
0 Spatio-temporal covariate(s).
8 beta-fields regression parameters in x$pars.
Regression parameters are assumed to be known and
prediction variances do NOT include
uncertainties in regression parameters.
Prediction of beta-fields, (x$beta):
List of 3
$ mu: num [1:25, 1:3] 3.79 3.67 4.03 3.23 3.69 ...
..- attr(*, "dimnames")=List of 2
$ EX: num [1:25, 1:3] 3.7 3.37 4.19 3.67 3.52 ...
..- attr(*, "dimnames")=List of 2
$ VX: num [1:25, 1:3] 0.000186 0.000186 0.001112 0.002198 0.000186 ...
..- attr(*, "dimnames")=List of 2
Predictions for 280 times at 25 locations.
List of 3
$ EX.mu : num [1:280, 1:25] 4.82 4.62 4.43 4.25 4.08 ...
..- attr(*, "dimnames")=List of 2
$ EX.mu.beta: num [1:280, 1:25] 4.39 4.24 4.1 3.97 3.86 ...
..- attr(*, "dimnames")=List of 2
$ EX : num [1:280, 1:25] 4.55 4 4.03 4.18 3.72 ...
..- attr(*, "dimnames")=List of 2
Variances have been computed.
List of 2
$ VX : num [1:280, 1:25] 0.00509 0.00505 0.00501 0.00499 0.00497 ...
..- attr(*, "dimnames")=List of 2
$ VX.pred: num [1:280, 1:25] 0.0172 0.0172 0.0171 0.0171 0.0171 ...
..- attr(*, "dimnames")=List of 2
The most important components of these results are the estimated β-fields
and their variances (beta$EX and beta$VX), as well as the conditional expectations
and variances at all the 280 × 25 space-time locations (EX and
VX).
All the components are computed conditional on the estimated parameters
and observed data. The components are:
opts Options used in the call to predict, or implicitly assumed.
pars The regression parameters in (2) and (3), computed using generalised
least squares (see Lindström et al., 2013, for details).
EX The expected spatio-temporal process (1), or
$$E(y(s,t) \mid \Psi, \text{observations}).$$
EX.mu The regression component of the spatio-temporal process,
$$\mu(s,t) = \sum_{l=1}^{L} \gamma_l M_l(s,t) + \sum_{i=1}^{m} X_i \alpha_i f_i(t).$$
Note that this differs from (2).
EX.mu.beta The mean part (2) of the spatio-temporal process (1); this
includes the conditional expectations of the β-fields,
$$\mu_\beta(s,t) = \sum_{l=1}^{L} \gamma_l M_l(s,t) + \sum_{i=1}^{m} f_i(t)\, E(\beta_i \mid \Psi, \text{observations}).$$
VX The conditional variance of the spatio-temporal process in EX.
VX.pred The predictive conditional variance for the spatio-temporal process
in EX (essentially VX, plus the nugget in the ν-field).
beta A structure containing reconstructions and uncertainties for the latent
β-fields.
I An index vector that can be used to extract the observed spatio-temporal
locations from EX, EX.mu, EX.mu.beta, etc.
First we compare the β-fields computed by fitting each of the time series of
observations to the smooth trends,
> beta <- estimateBetaFields(mesa.model)
with the β-fields obtained from the full model, see Figure 3.
> par(mfrow=c(2,2), mar=c(3.3,3.3,1.5,1), mgp=c(2,1,0), pty="s")
> for(i in 1:3){
plotCI(x=beta$beta[,i], y=pred.mesa.model$beta$EX[,i],
uiw=1.96*beta$beta.sd[,i], err="x",
main=paste("Beta-field for f", i, "(t)", sep=""),
xlab="Empirical estimate",
ylab="Spatio-Temporal Model",
pch=NA, sfrac=0.005, asp=1)
plotCI(x=beta$beta[,i], y=pred.mesa.model$beta$EX[,i],
uiw=1.96*sqrt(pred.mesa.model$beta$VX[,i]),
add=TRUE, pch=NA, sfrac=0.005)
abline(0, 1, col="grey")
}
We can see from Figure 3 that the two ways of computing the β-fields lead to
very comparable results. The largest discrepancies lie with the coefficient for
the second temporal trend, where it appears the coefficients calculated via
conditional expectation are larger than those calculated by fitting the time
series to the temporal trend. However, the uncertainty in these coefficients
is large.
[Figure 3 panels: "Beta-field for f1(t)", "Beta-field for f2(t)", "Beta-field for f3(t)"; x-axis: Empirical estimate; y-axis: Spatio-Temporal Model.]

Figure 3: Comparing the two estimates of the β-field for the constant temporal
trend and for the two smooth temporal trends.

In addition to predictions of the β-fields, predict also computes the conditional
expectation at all the 280 × 25 space-time locations. As an example
we study 4 of these locations, see Figure 4.

par(mfrow=c(4,1),mar=c(2.5,2.5,2,.5))
for(i in c(1,10,17,22)){
 plot(pred.mesa.model, ID=i, STmodel=mesa.model,
 col=c("black","red","grey"), lwd=1)
 plot(pred.mesa.model, ID=i, pred.type="EX.mu",
 col="green", lwd=1, add=TRUE)
 plot(pred.mesa.model, ID=i, pred.type="EX.mu.beta",
 col="blue", lwd=1, add=TRUE)
}


Plotting these predictions along with 95% confidence intervals, the components
of the predictions, and the observations at 4 different locations indicates
that the predictions capture the seasonal variations in the data, see
Figure 4. The important thing to note here is that the predictions are computed
as the conditional expectation of a latent field given observations. For
unobserved locations this distinction does not matter, but for observed locations
this implies smoothing over the nugget in the ν-field, resulting in
E(x(s, t)|y(s, t)) ≠ y(s, t), where y(s, t) is an observation of the latent field
x(s, t) at time t and location s. Thus predictions do not coincide with observations.
Adding the components of the predictions that are due to only
the regression (green) and to both regression and β-fields (blue) allows us to
investigate how the different parts of the model capture the observations.
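Why the prediction differs from the observation at an observed location can be seen in a one-site simplification. The sketch below ignores all spatial and temporal correlation and uses made-up values (mu, sigma2, tau2 and y are hypothetical); it only illustrates the shrinkage that a non-zero nugget causes in a Gaussian model.

```r
# Hypothetical values, chosen only for illustration
mu     <- 4.0    # mean of the latent field x at this site and time
sigma2 <- 0.04   # variance of the latent field
tau2   <- 0.01   # nugget (measurement-error) variance
y      <- 4.5    # one observation, y = x + error

# Gaussian conditional expectation of x given y:
# the observation is shrunk towards the mean by sigma2 / (sigma2 + tau2)
shrink <- sigma2 / (sigma2 + tau2)
x.hat  <- mu + shrink * (y - mu)

# With a zero nugget the prediction would reproduce the observation
x.hat.no.nugget <- mu + sigma2 / (sigma2 + 0) * (y - mu)

c(x.hat = x.hat, x.hat.no.nugget = x.hat.no.nugget)
```

The larger the nugget relative to the latent variance, the further the prediction is pulled away from the observation and towards the mean component.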