## Forecasting Retail Turnover

In this notebook I look for a forecasting model for the Quarterly Retail Turnover time series.  Three candidates are compared: an ARIMA model, an exponential smoothing model (ESM), and a regression-based model.  To decide which is best, we set aside the last observations as a holdout sample and fit each model to the remaining observations.  The best model is the one that accounts for the autocorrelation in the errors, passes the white noise test, and gives the most accurate predictions for the holdout sample.  We end the project by describing the benefits and limitations of the different models.


### Data 

The data used in this project comes from [here](https://datamarket.com/data/set/22n7/quarterly-retail-turnover-m-current-jun-1982-dec-1991#!ds=22n7&display=line).


In [145]:
title 'Retail Turnover from 1982 to 1992';
ods noproctitle;
options validvarname=v7;
%let path = /folders/myfolders/sasuser.v94/quarterly-retail-turnover.csv;

proc import datafile="&path" dbms=csv out=retail replace;
run;

proc print data=retail;
run;

| Obs | Quarter | Quarterly_Retail_Turnover |
|---|---|---|
| 1 | 1982Q3 | 13423.2 |
| 2 | 1982Q4 | 13128.8 |
| 3 | 1983Q1 | 15398.8 |
| 4 | 1983Q2 | 12964.2 |
| 5 | 1983Q3 | 13133.5 |
| 6 | 1983Q4 | 13271.7 |
| 7 | 1984Q1 | 15596.3 |
| 8 | 1984Q2 | 13018.0 |
| 9 | 1984Q3 | 13409.3 |
| 10 | 1984Q4 | 13304.2 |


The dataset has 39 observations, so we can hold back the last 12 observations as our holdout sample when estimating the models.  A quick note: this RetailTurnover.ipynb is a symbolic link pointing to a file in a shared folder used with VirtualBox (I'm using SAS University Edition).  That's why the file path here seems off.  I did this so I could keep all of my data analysis notebooks in a single repository.

In [146]:
proc sgplot data=retail;
    series x=quarter y=quarterly_retail_turnover /markers;
run;

Here we plot the data to see if anything stands out.  The first thing we notice is that turnover fluctuates around an average that increases roughly linearly over time.  This suggests including a linear trend component in a time series regression model.  The seasonal variation also grows over time, so we apply a log transformation to the response variable, and we use 3 seasonal dummy variables to represent the 4 quarters (the fourth quarter is the baseline).
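As a sketch of what that description amounts to, the regression fitted in the next cell can be written as follows.  The $\beta$ symbols and $\varepsilon_t$ are my own notation, not something SAS prints, and $Quarter_t$ is the date variable used as the trend term.

$$
\log(Turnover_t) = \beta_0 + \beta_1\,Quarter_t + \beta_2\,Q1_t + \beta_3\,Q2_t + \beta_4\,Q3_t + \varepsilon_t
$$

Taking logs turns seasonal swings that grow in proportion to the level into swings of roughly constant size.  With only three dummies, $\beta_0$ absorbs the fourth-quarter level, and each dummy coefficient measures a quarter's difference from the fourth quarter.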

In [147]:
data retail;
    set retail;

    /* Seasonal dummy variables; the fourth quarter is the baseline */
    Q1=(qtr(Quarter)=1);
    Q2=(qtr(Quarter)=2);
    Q3=(qtr(Quarter)=3);
    log_turnover=log(quarterly_retail_turnover);
run;

title2 "Regression Model (Attempt 1)";

proc arima data=retail plots (only)=(residual(corr) forecast(forecastonly));
    identify var=log_turnover crosscorr=(Quarter Q1 Q2 Q3);
    estimate input=(Quarter Q1 Q2 Q3);
    forecast lead=12 back=12 id=Quarter interval=qtr;
run;

| Name of Variable = log_turnover | |
|---|---|
| Mean of Working Series | 9.586662 |
| Standard Deviation | 0.085236 |
| Number of Observations | 39 |

Autocorrelation Check for White Noise

| To Lag | Chi-Square | DF | Pr > ChiSq | Autocorrelations | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| 6 | 34.12 | 6 | <.0001 | -0.057 | 0.04 | -0.057 | 0.851 | -0.118 | -0.011 |

| Correlation of log_turnover and Quarter | |
|---|---|
| Variance of input | 1056154 |
| Number of Observations | 39 |

| Correlation of log_turnover and Q1 | |
|---|---|
| Variance of input | 0.190664 |
| Number of Observations | 39 |

| Correlation of log_turnover and Q2 | |
|---|---|
| Variance of input | 0.177515 |
| Number of Observations | 39 |

| Correlation of log_turnover and Q3 | |
|---|---|
| Variance of input | 0.190664 |
| Number of Observations | 39 |

Conditional Least Squares Estimation

| Parameter | Estimate | Standard Error | t Value | Approx Pr > \|t\| | Lag | Variable | Shift |
|---|---|---|---|---|---|---|---|
| MU | 9.171 | 0.02259 | 405.98 | <.0001 | 0 | log_turnover | 0 |
| NUM1 | 0.00003829 | 0.00000222413 | 17.22 | <.0001 | 0 | Quarter | 0 |
| NUM2 | 0.15681 | 0.0063738 | 24.6 | <.0001 | 0 | Q1 | 0 |
| NUM3 | -0.02541 | 0.0065451 | -3.88 | 0.0005 | 0 | Q2 | 0 |
| NUM4 | 0.000952 | 0.0063738 | 0.15 | 0.8822 | 0 | Q3 | 0 |

| | |
|---|---|
| Constant Estimate | 9.170998 |
| Variance Estimate | 0.000203 |
| Std Error Estimate | 0.014245 |
| AIC | -216.279 |
| SBC | -207.961 |
| Number of Residuals | 39 |

Correlations of Parameter Estimates

| Variable Parameter | log_turnover MU | Quarter NUM1 | Q1 NUM2 | Q2 NUM3 | Q3 NUM4 |
|---|---|---|---|---|---|
| log_turnover MU | 1.0 | -0.98 | -0.109 | -0.137 | -0.172 |
| Quarter NUM1 | -0.98 | 1.0 | -0.032 | 0.0 | 0.032 |
| Q1 NUM2 | -0.109 | -0.032 | 1.0 | 0.486 | 0.498 |
| Q2 NUM3 | -0.137 | 0.0 | 0.486 | 1.0 | 0.486 |
| Q3 NUM4 | -0.172 | 0.032 | 0.498 | 0.486 | 1.0 |

Autocorrelation Check of Residuals

| To Lag | Chi-Square | DF | Pr > ChiSq | Autocorrelations | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| 6 | 31.05 | 6 | <.0001 | 0.611 | 0.399 | 0.082 | -0.11 | -0.249 | -0.311 |
| 12 | 49.78 | 12 | <.0001 | -0.399 | -0.341 | -0.249 | -0.135 | -0.071 | -0.059 |
| 18 | 71.73 | 18 | <.0001 | 0.043 | 0.219 | 0.283 | 0.33 | 0.267 | 0.095 |
| 24 | 107.73 | 24 | <.0001 | -0.066 | -0.199 | -0.244 | -0.356 | -0.255 | -0.285 |

| Model for variable log_turnover | |
|---|---|
| Estimated Intercept | 9.170998 |

| Input Number 1 | |
|---|---|
| Input Variable | Quarter |
| Overall Regression Factor | 0.000038 |

| Input Number 2 | |
|---|---|
| Input Variable | Q1 |
| Overall Regression Factor | 0.156811 |

| Input Number 3 | |
|---|---|
| Input Variable | Q2 |
| Overall Regression Factor | -0.02541 |

| Input Number 4 | |
|---|---|
| Input Variable | Q3 |
| Overall Regression Factor | 0.000952 |

Forecasts for variable log_turnover

| Obs | Forecast | Std Error | 95% Confidence Limits (lower) | 95% Confidence Limits (upper) | Actual | Residual |
|---|---|---|---|---|---|---|
| 28 | 9.5546 | 0.0142 | 9.5267 | 9.5825 | 9.5644 | 0.0098 |
| 29 | 9.5845 | 0.0142 | 9.5566 | 9.6124 | 9.5969 | 0.0125 |
| 30 | 9.587 | 0.0142 | 9.5591 | 9.615 | 9.6061 | 0.0191 |
| 31 | 9.7474 | 0.0142 | 9.7195 | 9.7753 | 9.762 | 0.0146 |
| 32 | 9.5686 | 0.0142 | 9.5407 | 9.5965 | 9.5878 | 0.0192 |
| 33 | 9.5985 | 0.0142 | 9.5705 | 9.6264 | 9.6073 | 0.0089 |
| 34 | 9.601 | 0.0142 | 9.5731 | 9.6289 | 9.6023 | 0.0013 |
| 35 | 9.7614 | 0.0142 | 9.7334 | 9.7893 | 9.7477 | -0.0136 |
| 36 | 9.5826 | 0.0142 | 9.5547 | 9.6105 | 9.567 | -0.0156 |
| 37 | 9.6124 | 0.0142 | 9.5845 | 9.6403 | 9.586 | -0.0265 |

The output shows that the intercept, linear trend, Q1, and Q2 terms are all significant.  We can see this from the following hypothesis test, applied to each coefficient in turn:


  * $H_0: b_i = 0$
  * $H_a: b_i \neq 0$
 

Our null hypothesis states that the given independent variable doesn't contribute any explanatory power to our model, whereas the alternative hypothesis states that it does.  The p-value is the probability of observing a $t$-statistic at least as extreme as the one we computed, assuming that the null hypothesis is true (the two-tailed area under the $t$ distribution with $39 - 5 = 34$ degrees of freedom).  Our significance level, $\alpha$, is set at 0.05.  It is the probability of a Type I error (false positive) that we are willing to accept.  When the p-value is lower than $\alpha$, we reject the null hypothesis in favor of the alternative, which means the coefficient has some explanatory power.  The only coefficient that isn't significant is Q3, but it needs to stay in the model if we choose to use seasonal components.
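To make the test concrete, here is a minimal sketch that recomputes the $t$-statistics and two-sided p-values from the Attempt 1 estimates.  The estimates and standard errors are copied from the output table.  The data set and variable names (`ttest`, `t_crit`, `reject`) are my own, and the 34 degrees of freedom assume 39 residuals minus 5 estimated parameters.

```sas
/* Sketch only: recompute t = estimate / standard error and its
   two-sided p-value for each Attempt 1 parameter. */
data ttest;
    input Parameter $ Estimate StdErr;
    df     = 39 - 5;                        /* residuals minus parameters */
    t      = Estimate / StdErr;
    p      = 2 * (1 - probt(abs(t), df));   /* two-tailed area */
    t_crit = tinv(0.975, df);               /* cutoff for alpha = 0.05 */
    reject = (abs(t) > t_crit);             /* 1 = reject H0 */
    datalines;
MU   9.171      0.02259
NUM1 0.00003829 0.00000222413
NUM2 0.15681    0.0063738
NUM3 -0.02541   0.0065451
NUM4 0.000952   0.0063738
;
run;

proc print data=ttest noobs;
    format p pvalue6.4;
run;
```

Comparing `abs(t)` with `t_crit` is equivalent to comparing the p-value with $\alpha$, so `reject` flags the same terms either way.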


In the autocorrelation plots we see a significant spike at lag 1, which indicates that each error is correlated with the one before it.  In the white noise probability plot, all of the lags are significant, so we reject the null hypothesis:

  * $H_0:$ Prediction errors represent white noise.
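
The chi-square statistic behind this test can be reproduced by hand.  The sketch below assumes the statistic is the Ljung-Box form $Q = n(n+2)\sum_{k=1}^{m} r_k^2/(n-k)$ and uses the six rounded residual autocorrelations from the "To Lag 6" row of Attempt 1.  The names `ljung_box`, `s`, and `Q` are my own.

```sas
/* Sketch only: Ljung-Box Q statistic from the first six residual
   autocorrelations of Attempt 1 (rounded values from the output). */
data ljung_box;
    array r[6] _temporary_ (0.611 0.399 0.082 -0.110 -0.249 -0.311);
    n = 39;                          /* number of residuals */
    s = 0;
    do k = 1 to dim(r);
        s = s + r[k]**2 / (n - k);
    end;
    Q = n * (n + 2) * s;
    p = 1 - probchi(Q, dim(r));      /* no ARMA terms fitted, so DF = 6 */
    keep n Q p;
run;

proc print data=ljung_box noobs;
run;
```

Because the autocorrelations are rounded to three decimals, `Q` should land close to, but not exactly on, the 31.05 that SAS reports.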


Since the white noise test failed and autocorrelation is present, we can try a multiple linear regression model with a trend variable, seasonal dummy variables, and autoregressive errors.  The ARIMA procedure in SAS allows us to obtain point estimates for our intercept and independent variables as well as model the error terms with a first-order autoregressive process.

In [148]:
title2 "Regression Model (Attempt 2)";

proc arima data=retail plots (only)=(residual(corr) forecast(forecast));
    identify var=log_turnover crosscorr=(Quarter Q1 Q2 Q3);
    estimate p=(1) input=(Quarter Q1 Q2 Q3);
    forecast back=12 lead=12 id=Quarter interval=qtr out=forecast1;
run;

| Name of Variable = log_turnover | |
|---|---|
| Mean of Working Series | 9.586662 |
| Standard Deviation | 0.085236 |
| Number of Observations | 39 |

Autocorrelation Check for White Noise

| To Lag | Chi-Square | DF | Pr > ChiSq | Autocorrelations | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| 6 | 34.12 | 6 | <.0001 | -0.057 | 0.04 | -0.057 | 0.851 | -0.118 | -0.011 |

| Correlation of log_turnover and Quarter | |
|---|---|
| Variance of input | 1056154 |
| Number of Observations | 39 |

| Correlation of log_turnover and Q1 | |
|---|---|
| Variance of input | 0.190664 |
| Number of Observations | 39 |

| Correlation of log_turnover and Q2 | |
|---|---|
| Variance of input | 0.177515 |
| Number of Observations | 39 |

| Correlation of log_turnover and Q3 | |
|---|---|
| Variance of input | 0.190664 |
| Number of Observations | 39 |

Conditional Least Squares Estimation

| Parameter | Estimate | Standard Error | t Value | Approx Pr > \|t\| | Lag | Variable | Shift |
|---|---|---|---|---|---|---|---|
| MU | 9.21278 | 0.04357 | 211.47 | <.0001 | 0 | log_turnover | 0 |
| AR1,1 | 0.68195 | 0.13255 | 5.14 | <.0001 | 1 | log_turnover | 0 |
| NUM1 | 0.00003421 | 0.00000439999 | 7.78 | <.0001 | 0 | Quarter | 0 |
| NUM2 | 0.15742 | 0.0036228 | 43.45 | <.0001 | 0 | Q1 | 0 |
| NUM3 | -0.02522 | 0.0042464 | -5.94 | <.0001 | 0 | Q2 | 0 |
| NUM4 | 0.0005199 | 0.0036144 | 0.14 | 0.8865 | 0 | Q3 | 0 |

| | |
|---|---|
| Constant Estimate | 2.930107 |
| Variance Estimate | 0.000125 |
| Std Error Estimate | 0.011187 |
| AIC | -234.293 |
| SBC | -224.311 |
| Number of Residuals | 39 |

Correlations of Parameter Estimates

| Variable Parameter | log_turnover MU | log_turnover AR1,1 | Quarter NUM1 | Q1 NUM2 | Q2 NUM3 | Q3 NUM4 |
|---|---|---|---|---|---|---|
| log_turnover MU | 1.0 | 0.238 | -0.992 | 0.022 | -0.05 | -0.137 |
| log_turnover AR1,1 | 0.238 | 1.0 | -0.231 | 0.02 | -0.015 | -0.028 |
| Quarter NUM1 | -0.992 | -0.231 | 1.0 | -0.064 | 0.004 | 0.092 |
| Q1 NUM2 | 0.022 | 0.02 | -0.064 | 1.0 | 0.555 | 0.324 |
| Q2 NUM3 | -0.05 | -0.015 | 0.004 | 0.555 | 1.0 | 0.557 |
| Q3 NUM4 | -0.137 | -0.028 | 0.092 | 0.324 | 0.557 | 1.0 |

Autocorrelation Check of Residuals

| To Lag | Chi-Square | DF | Pr > ChiSq | Autocorrelations | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| 6 | 1.99 | 5 | 0.8503 | -0.02 | 0.172 | -0.091 | -0.025 | -0.08 | 0.009 |
| 12 | 6.61 | 11 | 0.8296 | -0.206 | -0.109 | -0.043 | -0.013 | 0.005 | -0.174 |
| 18 | 13.89 | 17 | 0.6747 | -0.114 | 0.148 | 0.135 | 0.158 | 0.171 | -0.003 |
| 24 | 26.7 | 23 | 0.2689 | -0.037 | -0.109 | 0.043 | -0.299 | 0.067 | -0.159 |

| Model for variable log_turnover | |
|---|---|
| Estimated Intercept | 9.212785 |

| Autoregressive Factors | |
|---|---|
| Factor 1: | 1 - 0.68195 B**(1) |

| Input Number 1 | |
|---|---|
| Input Variable | Quarter |
| Overall Regression Factor | 0.000034 |

| Input Number 2 | |
|---|---|
| Input Variable | Q1 |
| Overall Regression Factor | 0.15742 |

| Input Number 3 | |
|---|---|
| Input Variable | Q2 |
| Overall Regression Factor | -0.02522 |

| Input Number 4 | |
|---|---|
| Input Variable | Q3 |
| Overall Regression Factor | 0.00052 |

Forecasts for variable log_turnover

| Obs | Forecast | Std Error | 95% Confidence Limits (lower) | 95% Confidence Limits (upper) | Actual | Residual |
|---|---|---|---|---|---|---|
| 28 | 9.5545 | 0.0112 | 9.5326 | 9.5765 | 9.5644 | 0.0099 |
| 29 | 9.5829 | 0.0135 | 9.5564 | 9.6094 | 9.5969 | 0.014 |
| 30 | 9.5852 | 0.0145 | 9.5568 | 9.6136 | 9.6061 | 0.0209 |
| 31 | 9.7456 | 0.0149 | 9.7163 | 9.7748 | 9.762 | 0.0165 |
| 32 | 9.5658 | 0.0151 | 9.5362 | 9.5955 | 9.5878 | 0.0219 |
| 33 | 9.5946 | 0.0152 | 9.5648 | 9.6244 | 9.6073 | 0.0127 |
| 34 | 9.5972 | 0.0153 | 9.5672 | 9.6271 | 9.6023 | 0.0051 |
| 35 | 9.7577 | 0.0153 | 9.7277 | 9.7876 | 9.7477 | -0.01 |
| 36 | 9.5781 | 0.0153 | 9.5481 | 9.608 | 9.567 | -0.0111 |
| 37 | 9.6069 | 0.0153 | 9.5769 | 9.6369 | 9.586 | -0.021 |


After we add AR(1) to this model we see no more significant spikes in the autocorrelation plots for any of the lags.  We can also see that this model passes the white noise test.  Looking at the forecasts of the holdout samples we can see that the model follows the data fairly well.  The model can be written as:

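$\log(Turnover_t) = 9.21278 + 0.0000342(Quarter_t) + 0.15742(Q1_t) - 0.02522(Q2_t) + 0.00052(Q3_t) + n_t, \qquad n_t = 0.68195\,n_{t-1} + a_t$

Here $Quarter_t$ is the number of days from January 1, 1960 to the beginning of the quarter, $n_t$ is the regression error, and $a_t$ is white noise.  Note that PROC ARIMA applies the AR(1) factor to the regression errors rather than to the lagged value of log turnover.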
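
Two quick checks help tie the printed estimates together.  This is a sketch using the rounded estimates from the output, with $\mu$ and $\phi$ as my own labels for the MU and AR1,1 estimates.  The "Constant Estimate" line is the intercept scaled by the autoregressive factor, and, because the response is logged, exponentiating a dummy coefficient gives a multiplicative effect relative to the fourth quarter.

$$
\begin{aligned}
\text{Constant Estimate} &= \mu\,(1 - \phi) = 9.21278 \times (1 - 0.68195) \approx 2.9301 \\
e^{0.15742} &\approx 1.170 \quad \text{(Q1 about 17\% above the fourth quarter, other terms equal)} \\
e^{-0.02522} &\approx 0.975 \quad \text{(Q2 about 2.5\% below the fourth quarter)}
\end{aligned}
$$

The first line matches the 2.930107 in the output up to rounding.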

We can now build an exponential smoothing model and see how it compares with this regression-based one.

In [149]:
title2 "ESM Model";

proc esm data=retail seasonality=4 back=12 lead=12 outfor=forecast2
    plot=(corr modelforecasts) print=(estimates);
    id Quarter interval=qtr;
    forecast log_turnover / model=seasonal;
run;

| Input Data Set | |
|---|---|
| Name | WORK.RETAIL |
| Label | |
| Time ID Variable | Quarter |
| Time Interval | QTR |
| Length of Seasonal Cycle | 4 |
| Forecast Horizon | 12 |
| Forecast Back | 12 |

| Variable Information | |
|---|---|
| Name | log_turnover |
| Label | |
| First | 1982:3 |
| Last | 1992:1 |
| Number of Observations Read | 39 |

Seasonal Exponential Smoothing Parameter Estimates

| Parameter | Estimate | Standard Error | t Value | Approx Pr > \|t\| |
|---|---|---|---|---|
| Level Weight | 0.94111 | 0.13165 | 7.15 | <.0001 |
| Seasonal Weight | 0.2643 | 0.66935 | 0.39 | 0.6963 |


In an exponential smoothing model we use a smoothing constant ($\alpha$) that gives more weight to the most recent observations.  The weight declines exponentially as we go back in time, so older observations are given successively lower weights.  $\alpha$ is a number between 0 and 1.  The larger $\alpha$ is, the more our predicted value depends on the most recent values, and vice versa.  $\alpha$ is typically chosen to minimize the sum of squared prediction errors (SSE).  Gamma ($\gamma$) is another smoothing constant used in the forecast and is also between 0 and 1.  $\alpha$ is the smoothing constant for the level of our series and $\gamma$ is the smoothing constant for the seasonal factors.  In the output we see that SAS has estimated $\alpha$ = 0.94111 and $\gamma$ = 0.26430.  The general equation for this model can be found [here](http://support.sas.com/documentation/cdl/en/etsug/67525/HTML/default/viewer.htm#etsug_tffordet_sect021.htm).
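
For reference, the textbook additive seasonal smoothing recursions for quarterly data look like the sketch below.  The symbols $L_t$ (level), $S_t$ (seasonal factor), and $Y_t$ (log turnover) are my own labels, and the linked SAS documentation is the authority on the exact parameterization and initialization that PROC ESM uses.

$$
\begin{aligned}
L_t &= \alpha\,(Y_t - S_{t-4}) + (1 - \alpha)\,L_{t-1} \\
S_t &= \gamma\,(Y_t - L_t) + (1 - \gamma)\,S_{t-4} \\
\hat{Y}_{t+k} &= L_t + S_{t+k-4}, \qquad k = 1, \dots, 4
\end{aligned}
$$

Unrolling the level recursion puts weight $\alpha(1-\alpha)^j$ on the deseasonalized observation $j$ quarters back.  With $\alpha = 0.94111$ those weights are roughly 0.941, 0.055, and 0.003 for $j = 0, 1, 2$, so the level is driven almost entirely by the latest quarter.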

In the output we don't see any significant lags in the autocorrelation plots.  This model also passes the white noise test, so we can say that the autocorrelation in the errors has been dealt with.  In the forecast plot the prediction interval gradually widens as we forecast further ahead, which is to be expected: the forecasts become less certain as the horizon grows.

Next we look at an ARIMA model.

In [150]:
title2 "ARIMA Model";

proc arima data=retail plots (only)=(residual(corr) forecast(forecast));
    identify var=log_turnover(1, 4);
    estimate q=(4) noint;
    forecast back=12 lead=12 id=Quarter interval=qtr out=forecast3;
run;

| Name of Variable = log_turnover | |
|---|---|
| Period(s) of Differencing | 1,4 |
| Mean of Working Series | 0.001029 |
| Standard Deviation | 0.015395 |
| Number of Observations | 34 |
| Observation(s) eliminated by differencing | 5 |

Autocorrelation Check for White Noise

| To Lag | Chi-Square | DF | Pr > ChiSq | Autocorrelations | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| 6 | 6.83 | 6 | 0.3367 | 0.133 | -0.001 | -0.118 | -0.348 | -0.121 | 0.041 |

Conditional Least Squares Estimation

| Parameter | Estimate | Standard Error | t Value | Approx Pr > \|t\| | Lag |
|---|---|---|---|---|---|
| MA1,1 | 0.5377 | 0.15517 | 3.47 | 0.0015 | 4 |

| | |
|---|---|
| Variance Estimate | 0.0002 |
| Std Error Estimate | 0.014138 |
| AIC | -192.131 |
| SBC | -190.605 |
| Number of Residuals | 34 |

Autocorrelation Check of Residuals

| To Lag | Chi-Square | DF | Pr > ChiSq | Autocorrelations | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| 6 | 0.85 | 5 | 0.9735 | 0.053 | 0.016 | -0.124 | 0.034 | -0.041 | 0.017 |
| 12 | 3.62 | 11 | 0.9797 | -0.081 | -0.016 | 0.118 | -0.072 | 0.065 | -0.152 |
| 18 | 12.25 | 17 | 0.7847 | -0.202 | 0.005 | 0.196 | 0.174 | 0.122 | 0.092 |
| 24 | 17.19 | 23 | 0.7998 | -0.115 | -0.075 | 0.061 | -0.168 | 0.02 | -0.031 |

| Model for variable log_turnover | |
|---|---|
| Period(s) of Differencing | 1,4 |

| Moving Average Factors | |
|---|---|
| Factor 1: | 1 - 0.5377 B**(4) |

Forecasts for variable log_turnover

| Obs | Forecast | Std Error | 95% Confidence Limits (lower) | 95% Confidence Limits (upper) | Actual | Residual |
|---|---|---|---|---|---|---|
| 28 | 9.5504 | 0.0141 | 9.5227 | 9.5781 | 9.5644 | 0.014 |
| 29 | 9.5756 | 0.02 | 9.5364 | 9.6147 | 9.5969 | 0.0214 |
| 30 | 9.5796 | 0.0245 | 9.5316 | 9.6276 | 9.6061 | 0.0266 |
| 31 | 9.7461 | 0.0283 | 9.6907 | 9.8016 | 9.762 | 0.0159 |
| 32 | 9.5618 | 0.035 | 9.4931 | 9.6304 | 9.5878 | 0.026 |
| 33 | 9.5869 | 0.0407 | 9.5072 | 9.6666 | 9.6073 | 0.0204 |
| 34 | 9.591 | 0.0456 | 9.5015 | 9.6804 | 9.6023 | 0.0113 |
| 35 | 9.7575 | 0.0501 | 9.6593 | 9.8557 | 9.7477 | -0.0098 |
| 36 | 9.5731 | 0.057 | 9.4614 | 9.6849 | 9.567 | -0.0062 |
| 37 | 9.5983 | 0.0632 | 9.4745 | 9.7221 | 9.586 | -0.0123 |


In this Autoregressive Integrated Moving Average (ARIMA) model we use differencing instead of trend and seasonal components/dummy variables.  Differencing transforms our data from a nonstationary time series (non-constant mean or variance) to a stationary one.  Here we take a first difference to account for the linear trend and a fourth difference to account for the seasonal pattern.  After some experimenting, I found this ARIMA model to perform the best.  In addition to the differencing, we include a moving average term at lag 4 (`q=(4)`) in the model.
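
The differencing step can be illustrated on the first ten quarters printed earlier.  This is a sketch, not part of the original analysis: the data set name `diffdemo` and the variables `d1` and `d1_4` are my own, and the combined difference is $(1 - B)(1 - B^4)\,y_t = y_t - y_{t-1} - y_{t-4} + y_{t-5}$.

```sas
/* Sketch only: first and fourth differences of log turnover,
   using the ten observations shown in the PROC PRINT output. */
data diffdemo;
    input Quarter :yyq6. Turnover;
    format Quarter yyq6.;
    log_turnover = log(Turnover);
    d1   = dif(log_turnover);   /* (1 - B) y_t: removes the trend   */
    d1_4 = dif4(d1);            /* (1 - B**4)(1 - B) y_t: removes   */
                                /* the quarterly seasonal pattern   */
    datalines;
1982Q3 13423.2
1982Q4 13128.8
1983Q1 15398.8
1983Q2 12964.2
1983Q3 13133.5
1983Q4 13271.7
1984Q1 15596.3
1984Q2 13018.0
1984Q3 13409.3
1984Q4 13304.2
;
run;

proc print data=diffdemo noobs;
run;
```

The first five rows of `d1_4` are missing, which is why the output reports 5 observations eliminated by differencing and 34 residuals instead of 39.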


In the autocorrelation function plots we see that there are no significant lags.  We can also see that this model passes the white noise test.  Just like the ESM, the prediction intervals of the forecasted values continue to widen as we move through our holdout data, because this model also becomes less precise as the forecast horizon grows.

Next we move to compare the three models and see which performed best.

In [151]:
title2 "Model Forecast Comparison";

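data forecast;
    /* Each SET reads the same row from one forecast data set.  Values are
       copied to new names before the next SET overwrites FORECAST. */
    set forecast1;
    Time=Quarter;
    Regression=exp(forecast);
    format Time yyq6.;

    set forecast2;
    Actual=exp(actual);
    ESM=exp(predict);

    set forecast3;
    ARIMA=exp(forecast);

    /* Errors on the original scale: actual minus forecast */
    Regression_Error=Actual - Regression;
    ESM_Error=Actual - ESM;
    ARIMA_Error=Actual - ARIMA;

    keep Time Actual Regression ESM ARIMA Regression_Error ESM_Error ARIMA_Error;
run;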


proc sgplot data=forecast;
    scatter x=Time y=Actual/ markerattrs=(color=Black);
    series x=Time y=Regression/ lineattrs=(color=Red);
    series x=Time y=ESM/ lineattrs=(color=Blue);
    series x=Time y=ARIMA/ lineattrs=(color=Green);
    refline "01Apr1989"d /axis=x lineattrs=(pattern=dash);
    xaxis grid;
run;

From the plot alone it isn't immediately obvious which model performed best.  The ESM seems to have the worst performance of the three: its forecasts are almost always the furthest from the actual values in our holdout sample.  The regression and ARIMA models seem to perform fairly similarly, but I would say that the ARIMA model performs slightly better.  To me it seems to follow both the holdout sample and the fitted data best.
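
A numerical comparison can back up the visual impression.  The sketch below is not part of the original analysis and uses only the ten holdout rows (observations 28 to 37) printed in the forecast tables for the second regression model and the ARIMA model.  The ESM forecasts and the last two holdout quarters are not shown in the output, so they are left out.  The data set name `holdout` and the metric names are my own.

```sas
/* Sketch only: holdout accuracy on the original scale for the
   printed regression (Attempt 2) and ARIMA forecasts. */
data holdout;
    input Obs LogActual LogRegression LogARIMA;
    Actual = exp(LogActual);             /* back-transform from logs */
    length Model $10;
    Model = "Regression";
    Forecast = exp(LogRegression);
    Error = Actual - Forecast;
    output;
    Model = "ARIMA";
    Forecast = exp(LogARIMA);
    Error = Actual - Forecast;
    output;
    keep Obs Model Actual Forecast Error;
    datalines;
28 9.5644 9.5545 9.5504
29 9.5969 9.5829 9.5756
30 9.6061 9.5852 9.5796
31 9.7620 9.7456 9.7461
32 9.5878 9.5658 9.5618
33 9.6073 9.5946 9.5869
34 9.6023 9.5972 9.5910
35 9.7477 9.7577 9.7575
36 9.5670 9.5781 9.5731
37 9.5860 9.6069 9.5983
;
run;

proc sql;
    select Model,
           mean(abs(Error))                as MAE,
           sqrt(mean(Error**2))            as RMSE,
           100 * mean(abs(Error) / Actual) as MAPE
    from holdout
    group by Model;
quit;
```

Running the same query against the full `forecast` data set, restricted to the holdout quarters, would cover all three models and all 12 holdout observations.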


### Discussion
---


In [152]:
ods proctitle;
title;
title2;