# Chapter 10. Basic Regression Analysis with Time Series Data

In this chapter, we begin to study the properties of OLS for estimating linear regression models using time series data. In Section 10-1, we discuss some conceptual differences between time series and cross-sectional data. Section 10-2 provides some examples of time series regressions that are often estimated in the empirical social sciences. We then turn our attention to the finite sample properties of the OLS estimators and state the Gauss-Markov assumptions and the classical linear model assumptions for time series regression. Although these assumptions have features in common with those for the cross-sectional case, they also have some significant differences that we will need to highlight.

In addition, we return to some issues that we treated in regression with cross-sectional data, such as how to use and interpret the logarithmic functional form and dummy variables. The important topics of how to incorporate trends and account for seasonality in multiple regression are taken up in Section 10-5.

## 10-1 The Nature of Time Series Data

An obvious characteristic of time series data that distinguishes them from cross-sectional data is temporal ordering. For example, in Chapter 1, we briefly discussed a time series data set on employment, the minimum wage, and other economic variables for Puerto Rico. In this data set, we must know that the data for 1970 immediately precede the data for 1971. For analyzing time series data in the social sciences, we must recognize that the past can affect the future, but not vice versa.

Another difference between cross-sectional and time series data is more subtle. In Chapters 3 and 4, we studied statistical properties of the OLS estimators based on the notion that samples were randomly drawn from the appropriate population. Understanding why cross-sectional data should be viewed as random outcomes is fairly straightforward: a different sample drawn from the population will generally yield different values of the independent and dependent variables (such as education, experience, wage, and so on). Therefore, the OLS estimates computed from different random samples will generally differ, and this is why we consider the OLS estimators to be random variables.

How should we think about randomness in time series data? Certainly, economic time series satisfy the intuitive requirements for being outcomes of random variables. For example, today we do not know what the Dow Jones Industrial Average will be at the close of the next trading day. We do not know what the annual growth in output will be in Canada during the coming year. Since the outcomes of these variables are not foreknown, they should clearly be viewed as random variables.

Formally, a sequence of random variables indexed by time is called a stochastic process or a time series process. ("Stochastic" is a synonym for random.) When we collect a time series data set, we obtain one possible outcome, or realization, of the stochastic process. We can only see a single realization because we cannot go back in time and start the process over again. (This is analogous to cross-sectional analysis where we can collect only one random sample.) However, if certain conditions in history had been different, we would generally obtain a different realization for the stochastic process, and this is why we think of time series data as the outcome of random variables. The set of all possible realizations of a time series process plays the role of the population in cross-sectional analysis. The sample size for a time series data set is the number of time periods over which we observe the variables of interest.
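The idea of a realization can be made concrete with a small simulation. The sketch below is only an illustration: the process (a first-order autoregression with coefficient 0.5), the function name `simulate_path`, and the sample size are assumptions chosen for the example, not something taken from the text. Each column of `paths` is one possible realization of the same process; with real data we would observe only one of them.

```r
# Hypothetical stochastic process (an assumption for illustration only):
#   y_t = 0.5 * y_{t-1} + e_t,  e_t ~ Normal(0, 1),  y_0 = 0
simulate_path <- function(n) {
  e <- rnorm(n)
  y <- numeric(n)
  y[1] <- e[1]                               # y_1 = 0.5 * y_0 + e_1 with y_0 = 0
  for (t in 2:n) y[t] <- 0.5 * y[t - 1] + e[t]
  y
}

set.seed(123)   # fixes the random draws so the sketch is reproducible
n <- 50         # sample size = number of time periods observed

# n x 3 matrix: three realizations of the same process
paths <- replicate(3, simulate_path(n))

# The first few periods of each realization differ, although the
# data-generating process is identical in every column.
print(round(head(paths), 2))
```

The three columns play the role of three different "histories"; the population is the set of all such columns that the process could generate.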

## 10-2 Examples of Time Series Regression Models

In this section, we discuss two examples of time series models that have been useful in empirical time series analysis and that are easily estimated by ordinary least squares. We will study additional models in Chapter 11.

## 10-2a Static Models

Suppose that we have time series data available on two variables, say y and z, where $y_t$ and $z_t$ are dated contemporaneously. A static model relating y to z is

\begin{equation}
y_t=\beta_0+\beta_1 z_t+u_t, \quad t=1,2,\ldots,n \tag{10.1}
\end{equation}

The name "static model" comes from the fact that we are modeling a contemporaneous relationship between y and z. Usually, a static model is postulated when a change in z at time t is believed to have an immediate effect on y: $\Delta y_t=\beta_1 \Delta z_t$ when $\Delta u_t=0$. Static regression models are also used when we are interested in knowing the tradeoff between y and z.

Naturally, we can have several explanatory variables in a static regression model. Let $mrdrte_t$ denote the murders per 10,000 people in a particular city during year t, let $convrte_t$ denote the murder conviction rate, let $unem_t$ be the local unemployment rate, and let $yngmle_t$ be the fraction of the population consisting of males between the ages of 18 and 25. Then, a static multiple regression model explaining murder rates is

\begin{equation}
mrdrte_t=\beta_0+\beta_1 convrte_t+\beta_2 unem_t+\beta_3 yngmle_t+u_t \tag{10.3}
\end{equation}

Using a model such as this, we can hope to estimate, for example, the ceteris paribus effect of an increase in the conviction rate on a particular criminal activity.

## 10-2b Finite Distributed Lag Models

In a finite distributed lag (FDL) model, we allow one or more variables to affect y with a lag. For example, for annual observations, consider the model

\begin{equation}
gfr_t=\alpha_0+\delta_0 pe_t+\delta_1 pe_{t-1}+\delta_2 pe_{t-2}+u_t \tag{10.4}
\end{equation}

where $gfr_t$ is the general fertility rate (children born per 1,000 women of childbearing age) and $pe_t$ is the real dollar value of the personal tax exemption. The idea is to see whether, in the aggregate, the decision to have children is linked to the tax value of having a child. Equation (10.4) recognizes that, for both biological and behavioral reasons, decisions to have children would not immediately result from changes in the personal exemption.

Equation (10.4) is an example of an FDL model of order two. To interpret the coefficients in general terms, write the dependent variable as y and the explanatory variable as z (in (10.4), y is gfr and z is pe), and suppose that z is constant before time t, increases by one unit at time t, and returns to its previous level at time t+1, with the errors set to zero throughout. Then $\delta_0=y_t-y_{t-1}$ is the immediate change in y due to the one-unit increase in z at time t. $\delta_0$ is usually called the impact propensity or impact multiplier.

Similarly, $\delta_1=y_{t+1}-y_{t-1}$ is the change in y one period after the temporary change and $\delta_2=y_{t+2}-y_{t-1}$ is the change in y two periods after the change. At time t+3, y has reverted back to its initial level: $y_{t+3}=y_{t-1}$. This is because only two lags of z appear in the model. When we graph the $\delta_j$ as a function of j, we obtain the lag distribution, which summarizes the dynamic effect that a temporary increase in z has on y.

The sum of the coefficients on current and lagged z, $\delta_0+\delta_1+\delta_2$, is the long-run change in y given a permanent increase in z and is called the long-run propensity (LRP) or long-run multiplier. The LRP is often of interest in distributed lag models.

As an example, in equation (10.4), $\delta_0$ measures the immediate change in fertility due to a one-dollar increase in pe. As we mentioned earlier, there are reasons to believe that $\delta_0$ is small, if not zero. But $\delta_1$ or $\delta_2$, or both, might be positive. If pe permanently increases by one dollar, then, after two years, gfr will have changed by $\delta_0+\delta_1+\delta_2$. This model assumes that there are no further changes after two years. Whether this is actually the case is an empirical matter.
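The difference between a temporary and a permanent change in z can be traced out numerically. The following sketch assumes hypothetical coefficient values (`alpha0`, `delta`) and a helper function `fdl_response` that are not part of the text; it sets the error to zero and compares the path of y under each scenario with a baseline in which z never changes.

```r
# Hypothetical FDL(2) coefficients, chosen only for illustration
alpha0 <- 10
delta  <- c(0.5, 0.3, 0.1)   # delta_0, delta_1, delta_2

# Response of y to a path of z, with the error u_t set to zero.
# The first two elements of z serve as pre-sample values so lags are defined.
fdl_response <- function(z) {
  n <- length(z)
  sapply(3:n, function(t) alpha0 + sum(delta * z[t:(t - 2)]))
}

base      <- rep(1, 8)                   # z constant at 1
temporary <- base; temporary[4]   <- 2   # one-unit increase in a single period
permanent <- base; permanent[4:8] <- 2   # one-unit increase that persists

y_base           <- fdl_response(base)
effect_temporary <- fdl_response(temporary) - y_base
effect_permanent <- fdl_response(permanent) - y_base

effect_temporary   # traces out delta_0, delta_1, delta_2, then returns to 0
effect_permanent   # accumulates to the LRP = delta_0 + delta_1 + delta_2
```

Under these assumed values the temporary change reproduces the lag distribution, while the permanent change settles at the sum of the three coefficients after two periods.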

## 10-3 Finite Sample Properties of OLS under Classical Assumptions

In this section, we give a complete listing of the finite sample, or small sample, properties of OLS under standard assumptions. We pay particular attention to how the assumptions must be altered from our cross-sectional analysis to cover time series regressions. Refer to Wooldridge 2016 for a detailed development.

## 10-3a Unbiasedness of OLS

### Assumption TS.1 Linear in Parameters

The first assumption simply states that the time series process follows a model that is linear in its parameters.
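For reference, a model that is linear in its parameters can be written out as follows. This is a sketch in the notation used later in the chapter (for example, $x_{t1}$ and $x_{t2}$ in equation (10.31)); the number of explanatory variables, k, and the sample size, n, are generic.

\begin{equation}
y_t=\beta_0+\beta_1 x_{t1}+\beta_2 x_{t2}+\cdots+\beta_k x_{tk}+u_t, \quad t=1,2,\ldots,n
\end{equation}

Here $\{u_t\}$ is the sequence of errors. In the assumptions that follow, X denotes the collection of all explanatory variables for all time periods, so that conditioning on X means conditioning on the entire observed history of the regressors.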

We should think of Assumption TS.1 as being essentially the same as Assumption MLR.1 (the first cross-sectional assumption), but we are now specifying a linear model for time series data.

### Assumption TS.2 No Perfect Collinearity

In the sample (and therefore in the underlying time series process), no independent variable is constant nor a perfect linear combination of the others.

We discussed this assumption at length in the context of cross-sectional data in Chapter 3. The issues are essentially the same with time series data. Remember, Assumption TS.2 does allow the explanatory variables to be correlated, but it rules out perfect correlation in the sample.

### Assumption TS.3 Zero Conditional Mean

For each t, the expected value of the error $u_t$, given the explanatory variables for all time periods, is zero. Mathematically,

\begin{equation}
E(u_t|X)=0, \quad t=1,2,\ldots,n
\end{equation}

This is a crucial assumption, and we need to have an intuitive grasp of its meaning. As in the cross-sectional case, it is easiest to view this assumption in terms of uncorrelatedness: Assumption TS.3 implies that the error at time t, $u_t$, is uncorrelated with each explanatory variable in every time period. The fact that this is stated in terms of the conditional expectation means that we must also correctly specify the functional relationship between $y_t$ and the explanatory variables. If $u_t$ is independent of X and $E(u_t)=0$, then Assumption TS.3 automatically holds.

It is important to see that Assumption TS.3 puts no restriction on correlation in the independent variables or in the $u_t$ across time. Assumption TS.3 only says that the average value of $u_t$ is unrelated to the independent variables in all time periods.

Anything that causes the unobservables at time t to be correlated with any of the explanatory variables in any time period causes Assumption TS.3 to fail. Two leading candidates for failure are omitted variables and measurement error in some of the regressors. But Assumption TS.3, which is often called the strict exogeneity assumption, can also fail for other, less obvious reasons.

In the static model (10.1), Assumption TS.3 requires not only that $u_t$ and $z_t$ are uncorrelated, but that $u_t$ is also uncorrelated with past and future values of z. One implication is that z can have no lagged effect on y. If z does have a lagged effect on y, then we should estimate a distributed lag model.

Explanatory variables that are strictly exogenous cannot react to what has happened to y in the past. A factor such as the amount of rainfall in an agricultural production function satisfies this requirement: rainfall in any future year is not influenced by the output during the current or past years. But something like the amount of labor input might not be strictly exogenous, as it is chosen by the farmer, and the farmer may adjust the amount of labor based on last year's yield. Policy variables, such as growth in the money supply, expenditures on welfare, and highway speed limits, are often influenced by what has happened to the outcome variable in the past. In the social sciences, many explanatory variables may very well violate the strict exogeneity assumption.

### Theorem 10.1 Unbiasedness of OLS

Under Assumptions TS.1, TS.2, and TS.3, the OLS estimators are unbiased conditional on X, and therefore unconditionally as well when the expectations exist: $E(\hat \beta_j) = \beta_j,\ j=0,1,\ldots,k$.

## 10-3b The Variances of the OLS Estimators and the Gauss-Markov Theorem

We need to add two assumptions to round out the Gauss-Markov assumptions for time series regressions. The first one is familiar from cross-sectional analysis.

### Assumption TS.4 Homoskedasticity

Conditional on X, the variance of $u_t$ is the same for all t: $Var(u_t|X)=Var(u_t)=\sigma^2$, $t=1,2,\ldots,n$.

This assumption means that $Var(u_t|X)$ cannot depend on X (it is sufficient that $u_t$ and X are independent) and that $Var(u_t)$ is constant over time. When TS.4 does not hold, we say that the errors are heteroskedastic, just as in the cross-sectional case.

When $Var(u_t|X)$ does depend on X, it often depends on the explanatory variables at time t, $x_t$. In Chapter 12, we will see that the tests for heteroskedasticity from Chapter 8 can also be used for time series regressions, at least under certain assumptions.

### Assumption TS.5 No Serial Correlation

Conditional on X, the errors in two different time periods are uncorrelated: $Corr(u_t,u_s|X)=0$ for all $t \neq s$.

The easiest way to think of this assumption is to ignore the conditioning on X. Then, Assumption TS.5 is simply

\begin{equation}
Corr(u_t,u_s)=0, \quad \text{for all } t \neq s \tag{10.12}
\end{equation}

When (10.12) is false, we say that the errors suffer from serial correlation, or autocorrelation, because they are correlated across time.

Importantly, Assumption TS.5 assumes nothing about temporal correlation in the independent variables.
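To see what a violation of (10.12) looks like, the sketch below compares an i.i.d. error sequence with a serially correlated one. The AR(1) form $u_t=\rho u_{t-1}+e_t$, the value `rho = 0.7`, and the helper `lag1_corr` are assumptions made for illustration; the text does not specify a particular form of serial correlation at this point.

```r
set.seed(7)
n   <- 500
rho <- 0.7          # hypothetical AR(1) coefficient
e   <- rnorm(n)

u_iid <- e          # errors that satisfy (10.12)

# Errors that violate (10.12): u_t = rho * u_{t-1} + e_t
u_ar1 <- as.numeric(stats::filter(e, rho, method = "recursive"))

# Sample correlation between u_t and u_{t-1}
lag1_corr <- function(u) cor(u[-1], u[-length(u)])

lag1_corr(u_iid)    # close to 0
lag1_corr(u_ar1)    # close to rho
```

Only the correlation of the errors across time matters for Assumption TS.5; the regressors themselves may be strongly correlated over time without violating it.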

### Theorem 10.2 OLS Sampling Variances

Under the time series Gauss-Markov Assumptions TS.1 through TS.5, the variance of $\hat \beta_j$, conditional on X, is

\begin{equation}
Var(\hat \beta_j|X)= \sigma^2/[SST_j(1-R_j^2)], \quad j=1,\ldots,k \tag{10.13}
\end{equation}

Here $SST_j$ is the total sum of squares of $x_{tj}$ and $R_j^2$ is the R-squared from the regression of $x_j$ on the other independent variables. Equation (10.13) is the same variance we derived in Chapter 3 under the cross-sectional Gauss-Markov assumptions.
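Formula (10.13) can be checked numerically against the variance that `lm()` reports. The sketch below uses simulated data with made-up coefficients, and it replaces the unknown $\sigma^2$ with the estimate $SSR/df$, so it illustrates the algebra of the formula rather than any result from the text.

```r
set.seed(42)
n  <- 60
x1 <- rnorm(n)
x2 <- 0.6 * x1 + rnorm(n)              # correlated with x1, but not perfectly
y  <- 1 + 2 * x1 - 1 * x2 + rnorm(n)   # hypothetical coefficients

fit <- lm(y ~ x1 + x2)

# Ingredients of (10.13) for j = 1
sigma2_hat <- sum(resid(fit)^2) / df.residual(fit)   # SSR / (n - k - 1)
SST_1      <- sum((x1 - mean(x1))^2)                 # total variation in x1
R2_1       <- summary(lm(x1 ~ x2))$r.squared         # R-squared of x1 on the other regressor

var_formula <- sigma2_hat / (SST_1 * (1 - R2_1))
var_lm      <- vcov(fit)["x1", "x1"]

c(formula = var_formula, lm = var_lm)   # the two numbers should agree
```

The agreement shows that the usual OLS standard errors are simply the square roots of (10.13) with $\sigma^2$ replaced by its estimate.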

### Theorem 10.3 Unbiased Estimation of $\sigma^2$

Under Assumptions TS.1 through TS.5, the estimator $\hat \sigma^2=SSR/df$ is an unbiased estimator of $\sigma^2$, where $df=n-k-1$.

### Theorem 10.4 Gauss-Markov Theorem

Under Assumptions TS.1 through TS.5, the OLS estimators are the best linear unbiased estimators conditional on X.

The bottom line here is that OLS has the same desirable finite sample properties under TS.1 through TS.5 that it has under MLR.1 through MLR.5.

### Assumption TS.6 Normality

The errors $u_t$ are independent of X and are independently and identically distributed as $Normal(0,\sigma^2)$.

Assumption TS.6 implies TS.3, TS.4, and TS.5, but it is stronger because of the independence and normality assumptions.

### Theorem 10.5 Normal Sampling Distributions

Under Assumptions TS.1 through TS.6, the CLM assumptions for time series, the OLS estimators are normally distributed, conditional on X. Further, under the null hypothesis, each t statistic has a t distribution, and each F statistic has an F distribution. The usual construction of confidence intervals is also valid.

The implications of Theorem 10.5 are of utmost importance. The theorem implies that, when Assumptions TS.1 through TS.6 hold, everything we have learned about estimation and inference for cross-sectional regressions applies directly to time series regressions. Thus, t statistics can be used for testing statistical significance of individual explanatory variables, and F statistics can be used to test for joint significance.

Just as in the cross-sectional case, the usual inference procedures are only as good as the underlying assumptions. The classical linear model assumptions for time series data are much more restrictive than those for cross-sectional data; in particular, the strict exogeneity and no serial correlation assumptions can be unrealistic. Nevertheless, the CLM framework is a good starting point for many applications.

### Wooldridge Example 10.2 Effects of Inflation and Deficits on Interest Rates

The data in INTDEF come from the 2004 Economic Report of the President (Tables B-73 and B-79) and span the years 1948 through 2003. The variable i3 is the three-month T-bill rate, inf is the annual inflation rate based on the consumer price index (CPI), and def is the federal budget deficit as a percentage of GDP. The estimated model is

In [3]:
library(foreign)
intdef <- read.dta("https://github.com/thousandoaks/Wooldridge/blob/master/intdef.dta?raw=true")

# Linear regression of static model:
summary( lm(i3~inf+def,data=intdef)  )



Call:
lm(formula = i3 ~ inf + def, data = intdef)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.9948 -1.1694  0.1959  0.9602  4.7224 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.73327    0.43197   4.012  0.00019 ***
inf          0.60587    0.08213   7.376 1.12e-09 ***
def          0.51306    0.11838   4.334 6.57e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.843 on 53 degrees of freedom
Multiple R-squared:  0.6021,	Adjusted R-squared:  0.5871 
F-statistic: 40.09 on 2 and 53 DF,  p-value: 2.483e-11


These estimates show that increases in inflation or the relative size of the deficit increase short-term interest rates, both of which are expected from basic economics. For example, a ceteris paribus one percentage point increase in the inflation rate increases i3 by .606 points. Both inf and def are very statistically significant, assuming, of course, that the CLM assumptions hold.

## 10-4 Functional Form, Dummy Variables and Index Numbers

All of the functional forms we learned about in earlier chapters can be used in time series regressions. The most important of these is the natural logarithm: time series regressions with constant percentage effects appear often in applied work.

### Wooldridge Example 10.3. Puerto Rican Employment and the Minimum Wage

Annual data on the Puerto Rican employment rate, minimum wage, and other variables are used by Castillo-Freeman and Freeman (1992) to study the effects of the U.S. minimum wage on employment in Puerto Rico. A simplified version of their model gives

In [8]:
library(foreign)
prminwge <- read.dta("https://github.com/thousandoaks/Wooldridge/blob/master/prminwge.dta?raw=true")

# Linear regression of static model:
summary( lm(log(prepop)~log(mincov)+log(usgnp),data=prminwge)  )


Call:
lm(formula = log(prepop) ~ log(mincov) + log(usgnp), data = prminwge)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.117133 -0.036998 -0.005943  0.028182  0.113938 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept) -1.05442    0.76541  -1.378   0.1771  
log(mincov) -0.15444    0.06490  -2.380   0.0229 *
log(usgnp)  -0.01219    0.08851  -0.138   0.8913  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.0557 on 35 degrees of freedom
Multiple R-squared:  0.6605,	Adjusted R-squared:  0.6411 
F-statistic: 34.04 on 2 and 35 DF,  p-value: 6.17e-09


The estimated elasticity of prepop with respect to mincov is -.154, and it is statistically significant with $t=-2.38$. Therefore, a higher minimum wage lowers the employment rate, something that classical economics predicts. The GNP variable is not statistically significant, but this changes when we account for a time trend in the next section.

We can use logarithmic functional forms in distributed lag models, too. For example, for quarterly data, suppose that money demand $(M_t)$ and gross domestic product $(GDP_t)$ are related by

\begin{equation}
log(M_t)=\alpha_0+\delta_0 log(GDP_t)+\delta_1 log(GDP_{t-1})+\delta_2 log(GDP_{t-2})+\delta_3 log(GDP_{t-3})+\delta_4 log(GDP_{t-4})+u_t 
\end{equation}

The impact propensity in this equation, $\delta_0$, is also called the short-run elasticity: it measures the immediate percentage change in money demand given a 1% increase in GDP. The LRP, $\delta_0+\delta_1+\ldots+\delta_4$, is sometimes called the long-run elasticity: it measures the percentage increase in money demand after four quarters given a permanent 1% increase in GDP.

Binary or dummy independent variables are also quite useful in time series applications. Since the unit of observation is time, a dummy variable represents whether, in each time period, a certain event has occurred. For example, for annual data, we can indicate in each year whether a Democrat or a Republican is president of the United States by defining a variable $democ_t$, which is unity if the president is a Democrat, and zero otherwise. Or, in looking at the effects of capital punishment on murder rates in Texas, we can define a dummy variable for each year equal to one if Texas had capital punishment during that year, and zero otherwise.

Often, dummy variables are used to isolate certain periods that may be systematically different from other periods covered by a data set.

### Wooldridge Example 10.4. Effects of Personal Exemption on Fertility Rates

The general fertility rate (gfr) is the number of children born to every 1,000 women of childbearing age. For the years 1913 through 1984, the equation

\begin{equation}
gfr_t=\beta_0+\beta_1 pe_t+\beta_2 ww2_t+\beta_3 pill_t +u_t
\end{equation}

explains gfr in terms of the average real dollar value of the personal tax exemption (pe) and two binary variables. The variable ww2 takes on the value unity during the years 1941 through 1945, when the United States was involved in World War II. The variable pill is unity from 1963 onward, when the birth control pill was made available for contraception.

Using the data in FERTIL3, which were taken from the article by Whittington, Alm, and Peters (1990), we compute:

In [13]:
# Libraries for dynamic lm, regression table and F tests
library(foreign);library(dynlm);library(lmtest);library(car)
fertil3<-read.dta("https://github.com/thousandoaks/Wooldridge/blob/master/fertil3.dta?raw=true")

# Define Yearly time series beginning in 1913
tsdata <- ts(fertil3, start=1913)

# Linear regression of model using dynlm:
res <- dynlm(gfr ~ pe + ww2 + pill, data=tsdata)
coeftest(res)




t test of coefficients:

              Estimate Std. Error t value  Pr(>|t|)    
(Intercept)  98.681755   3.208129 30.7599 < 2.2e-16 ***
pe            0.082540   0.029646  2.7842  0.006944 ** 
ww2         -24.238395   7.458253 -3.2499  0.001797 ** 
pill        -31.594034   4.081068 -7.7416 6.455e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


  Res.Df      RSS Df Sum of Sq        F      Pr(>F)
1     69 16335.91
2     68 14664.27  1  1671.636 7.751578 0.006943926


Each variable is statistically significant at the 1% level against a two-sided alternative. We see that the fertility rate was lower during World War II: given pe, there were about 24 fewer births for every 1,000 women of childbearing age, which is a large reduction. (From 1913 through 1984, gfr ranged from about 65 to 127.) Similarly, the fertility rate has been substantially lower since the introduction of the birth control pill.

The variable of economic interest is pe. The average pe over this time period is \$100.40, ranging from zero to \$243.83. The coefficient on pe implies that a \$12.00 increase in pe increases gfr by about one birth per 1,000 women of childbearing age. This effect is hardly trivial.

In Section 10-2, we noted that the fertility rate may react to changes in pe with a lag. Estimating a distributed lag model with two lags gives

In [18]:
# Linear regression of model with lags:
res2 <- dynlm(gfr ~ pe + L(pe) + L(pe,2) + ww2 + pill, data=tsdata)
coeftest(res2)


t test of coefficients:

               Estimate  Std. Error t value  Pr(>|t|)    
(Intercept)  95.8704975   3.2819571 29.2114 < 2.2e-16 ***
pe            0.0726718   0.1255331  0.5789    0.5647    
L(pe)        -0.0057796   0.1556629 -0.0371    0.9705    
L(pe, 2)      0.0338268   0.1262574  0.2679    0.7896    
ww2         -22.1264975  10.7319716 -2.0617    0.0433 *  
pill        -31.3049888   3.9815591 -7.8625 5.634e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


The coefficients on the pe variables are estimated very imprecisely, and each one is individually insignificant. It turns out that there is substantial correlation between $pe_t$, $pe_{t-1}$, and $pe_{t-2}$, and this multicollinearity makes it difficult to estimate the effect at each lag. However, $pe_t$, $pe_{t-1}$, and $pe_{t-2}$ are jointly significantly different from zero: the F statistic has a p-value of about .012, computed as follows:

In [21]:
# F test. H0: all pe coefficients are=0
linearHypothesis(res2, matchCoefs(res2,"pe"))


  Res.Df      RSS Df Sum of Sq        F     Pr(>F)
1     67 15459.75
2     64 13032.64  3  2427.104 3.972964 0.01165201


Thus, pe does have an effect on gfr (as we already saw in the regression without lags), but we do not have good enough estimates to determine whether it is contemporaneous or with a one- or two-year lag (or some of each).

The estimated LRP, and whether it is different from zero, is computed as follows:

In [22]:
# Calculating the LRP
b<-coef(res2)
b["pe"]+b["L(pe)"]+b["L(pe, 2)"]

# F test. H0: LRP=0
linearHypothesis(res2,"pe + L(pe) + L(pe, 2) = 0")

  Res.Df      RSS Df Sum of Sq        F      Pr(>F)
1     65 15358.41
2     64 13032.64  1  2325.765 11.42124 0.001240844


According to this F test, the LRP is significantly different from zero, with a p-value of about 0.0012.

Binary explanatory variables are the key component in what is called an event study. In an event study, the goal is to see whether a particular event influences some outcome.

Before we give an example of an event study, we need to discuss the notion of an index number and the difference between nominal and real economic variables. An index number typically aggregates a vast amount of information into a single quantity. Index numbers are used regularly in time series analysis, especially in macroeconomic applications. An example of an index number is the index of industrial production (IIP), computed monthly by the Board of Governors of the Federal Reserve. The IIP is a measure of production across a broad range of industries, and, as such, its magnitude in a particular year has no quantitative meaning. In order to interpret the magnitude of the IIP, we must know the base period and the base value. In the 1997 Economic Report of the President (ERP), the base year is 1987, and the base value is 100. (Setting IIP to 100 in the base period is just a convention; it makes just as much sense to set IIP = 1 in 1987, and some indexes are defined with 1 as the base value.) Because the IIP was 107.7 in 1992, we can say that industrial production was 7.7% higher in 1992 than in 1987. We can use the IIP in any two years to compute the percentage difference in industrial output during those two years. For example, because IIP = 61.4 in 1970 and IIP = 85.7 in 1979, industrial production grew by about 39.6% during the 1970s.

Another important example of an index number is a price index, such as the CPI. We already used the CPI to compute annual inflation rates in Example 10.1. As with the industrial production index, the CPI is only meaningful when we compare it across different years (or months, if we are using monthly data). In the 1997 ERP, CPI = 38.8 in 1970 and CPI = 130.7 in 1990. Thus, the general price level grew by almost 237% over this 20-year period. (In 1997, the CPI is defined so that its average in 1982, 1983, and 1984 equals 100; thus, the base period is listed as 1982–1984.)
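The percentage changes quoted for the IIP and the CPI follow from simple arithmetic, which the sketch below reproduces. The function name `pct_change` and the rebasing step are illustrative assumptions; the index values are the ones given in the text.

```r
# Percentage change between two values of an index
pct_change <- function(old, new) 100 * (new - old) / old

# Figures quoted in the text (1997 ERP)
pct_change(61.4, 85.7)     # IIP, 1970 to 1979: about 39.6
pct_change(38.8, 130.7)    # CPI, 1970 to 1990: about 236.9

# Changing the base period rescales an index but leaves percentage changes intact.
# Hypothetical rebasing: express the two IIP values with 1970 as the base (= 100).
iip         <- c("1970" = 61.4, "1979" = 85.7)
iip_rebased <- 100 * iip / iip["1970"]
iip_rebased
pct_change(iip_rebased["1970"], iip_rebased["1979"])   # still about 39.6
```

Because only ratios of index values matter, the choice of base value (100, 1, or anything else) has no effect on comparisons across periods.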

## 10-5 Trends and Seasonality

### 10-5a Characterizing Trending Time Series

Many economic time series have a common tendency of growing over time. We must recognize that some series contain a time trend in order to draw causal inference using time series data. Ignoring the fact that two sequences are trending in the same or opposite directions can lead us to falsely conclude that changes in one variable are actually caused by changes in another variable. In many cases, two time series processes appear to be correlated only because they are both trending over time for reasons related to other unobserved factors.

What kind of statistical models adequately capture trending behavior? One popular formulation is to write the series $\{y_t\}$ as

\begin{equation}
y_t=\alpha_0+\alpha_1 t+ e_t, t=1,2,\ldots,  \tag{10.24}
\end{equation}

where, in the simplest case, $\{e_t\}$ is an independent, identically distributed (i.i.d.) sequence with $E(e_t)=0$ and $Var(e_t)=\sigma^2_e$. Note how the parameter $\alpha_1$ multiplies time, t, resulting in a linear time trend.

Interpreting $\alpha_1$ in (10.24) is simple: holding all other factors (those in $e_t$) fixed, $\alpha_1$ measures the change in $y_t$ from one period to the next due to the passage of time.

Another way to think about a sequence that has a linear time trend is that its average value is a linear function of time:

\begin{equation}
E(y_t)=\alpha_0+\alpha_1 t   \tag{10.25}
\end{equation}

Many economic time series are better approximated by an exponential trend, which follows when a series has the same average growth rate from period to period. In practice, an exponential trend in a time series is captured by modeling the natural logarithm of the series as a linear trend (assuming that $y_t>0$):

\begin{equation}
log(y_t)=\beta_0+\beta_1 t+ e_t, t=1,2,\ldots,  \tag{10.26}
\end{equation}

How do we interpret $\beta_1$ in (10.26)? $\beta_1$ is approximately the average per period growth rate in $y_t$. For example, if t denotes year and $\beta_1=.027$, then $y_t$ grows about 2.7% per year on average.
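The growth-rate interpretation of $\beta_1$ can be checked on simulated data. In the sketch below, the intercept, the noise level, the sample size, and the variable name `period` are assumptions made for illustration; the trend coefficient is set to .027 to match the example in the text.

```r
set.seed(1)
n      <- 40
period <- 1:n
b0 <- 2
b1 <- 0.027     # matches the 2.7% example; b0 and the noise level are hypothetical

# Series with an exponential trend; y_t > 0 by construction
y <- exp(b0 + b1 * period + rnorm(n, sd = 0.02))

# Equation (10.26): regress log(y_t) on a linear time trend
fit    <- lm(log(y) ~ period)
b1_hat <- unname(coef(fit)["period"])

b1_hat            # approximate average growth rate per period
exp(b1_hat) - 1   # exact per-period growth rate implied by the fitted trend
```

For small values of $\beta_1$ the two numbers are nearly identical, which is why the coefficient itself is usually read as the growth rate.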

### 10-5b Using Trending Variables in Regression Analysis

Accounting for explained or explanatory variables that are trending is fairly straightforward in regression analysis. First, nothing about trending variables necessarily violates the classical linear model Assumptions TS.1 through TS.6. However, we must be careful to allow for the fact that unobserved, trending factors that affect $y_t$ might also be correlated with the explanatory variables. If we ignore this possibility, we may find a spurious relationship between $y_t$ and one or more explanatory variables. The phenomenon of finding a relationship between two or more trending variables simply because each is growing over time is an example of a spurious regression problem. Fortunately, adding a time trend eliminates this problem.

For concreteness, consider a model where two observed factors, $x_{t1}$ and $x_{t2}$ , affect $y_t$ . In addition, there are unobserved factors that are systematically growing or shrinking over time. A model that captures this is

\begin{equation}
y_t=\beta_0+\beta_1 x_{t1}+\beta_2 x_{t2}+\beta_3 t+ u_t  \tag{10.31}
\end{equation}

This fits into the multiple linear regression framework with $x_{t3}=t$. Allowing for the trend in this equation explicitly recognizes that $y_t$ may be growing $(\beta_3 >0)$ or shrinking $(\beta_3 <0)$ over time for reasons essentially unrelated to $x_{t1}$ and $x_{t2}$. If (10.31) satisfies Assumptions TS.1, TS.2, and TS.3, then omitting t from the regression and regressing $y_t$ on $x_{t1}, x_{t2}$ will generally yield biased estimators of $\beta_1, \beta_2$: we have effectively omitted an important variable, t, from the regression. This is especially true if $x_{t1},x_{t2}$ are themselves trending, because they can then be highly correlated with t. The next example shows how omitting a time trend can result in spurious regression.
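A small simulation illustrates the omitted-trend problem before turning to real data. In this sketch the two series are generated so that x has no effect on y at all; both simply trend upward. The trend slopes, the sample size, and the variable names are assumptions chosen for the example.

```r
set.seed(2024)
n      <- 100
period <- 1:n
x <- 0.5 * period + rnorm(n)   # trending regressor
y <- 0.3 * period + rnorm(n)   # trending outcome; x has NO effect on y by construction

no_trend   <- lm(y ~ x)            # omits the time trend
with_trend <- lm(y ~ x + period)   # includes the trend, as in (10.31) with one regressor

coef(summary(no_trend))["x", ]     # large and "significant" slope: spurious
coef(summary(with_trend))["x", ]   # slope much closer to zero once t is included
```

Without the trend, the coefficient on x picks up the common upward drift in the two series; including `period` removes that channel.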

### Wooldridge Example 10.7. Housing Investment and Prices

The data in HSEINV are annual observations on housing investment and a housing price index in the United States for 1947 through 1988. Let invpc denote real per capita housing investment (in thousands of dollars) and let price denote a housing price index (equal to 1 in 1982). A simple regression in constant elasticity form, which can be thought of as a supply equation for housing stock, gives

In [4]:
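# Install stargazer only if it is not already available
if (!requireNamespace("stargazer", quietly = TRUE)) install.packages("stargazer")
library(foreign); library(dynlm); library(stargazer)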


In [16]:
hseinv <- read.dta("https://github.com/thousandoaks/Wooldridge/blob/master/hseinv.dta?raw=true")
# Define Yearly time series beginning in 1947
tsdata <- ts(hseinv, start=1947)

# Regression of log(invpc) on log(price), without a time trend:
res1 <- dynlm(log(invpc) ~ log(price), data=tsdata)

# Pretty regression table
stargazer(res1,type="text")




                        Dependent variable:    
                    ---------------------------
                            log(invpc)         
-----------------------------------------------
log(price)                   1.241***          
                              (0.382)          
                                               
Constant                     -0.550***         
                              (0.043)          
                                               
-----------------------------------------------
Observations                    42             
R2                             0.208           
Adjusted R2                    0.189           
Residual Std. Error       0.155 (df = 40)      
F Statistic           10.530*** (df = 1; 40)   
Note:               *p<0.1; **p<0.05; ***p<0.01


The elasticity of per capita investment with respect to price is very large and statistically significant; it is not statistically different from one. We must be careful though. To account for the trending behaviour of the variables, we add a time trend into the model:

In [17]:
res2 <- dynlm(log(invpc) ~ log(price) + trend(tsdata), data=tsdata)

# Pretty regression table
stargazer(res2, type="text")



                        Dependent variable:    
                    ---------------------------
                            log(invpc)         
-----------------------------------------------
log(price)                    -0.381           
                              (0.679)          
                                               
trend(tsdata)                0.010***          
                              (0.004)          
                                               
Constant                     -0.913***         
                              (0.136)          
                                               
-----------------------------------------------
Observations                    42             
R2                             0.341           
Adjusted R2                    0.307           
Residual Std. Error       0.144 (df = 39)      
F Statistic           10.080*** (df = 2; 39)   
Note:               *p<0.1; **p<0.05; ***p<0.01


The story is much different now: the estimated price elasticity is negative and not statistically different from zero. The time trend is statistically significant, and its coefficient implies an approximate 1% increase in invpc per year, on average. From this analysis, we cannot conclude that real per capita housing investment is influenced at all by price. There are other factors, captured in the time trend, that affect invpc, but we have not modeled these.
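
As a rough check of these statements, the relevant t statistics can be recomputed from the rounded coefficients and standard errors printed in the two tables above, so the figures are approximate. The first tests whether the elasticity in the model without a trend equals one; the second tests whether the elasticity in the model with a trend equals zero:

\begin{equation}
t_{\beta_1 = 1} = \frac{1.241 - 1}{0.382} \approx 0.63, \qquad
t_{\beta_1 = 0} = \frac{-0.381}{0.679} \approx -0.56
\end{equation}

Both are well below the 5% two-sided critical value of about 2.02 for roughly 40 degrees of freedom. For the trend, the rounded coefficient of 0.010 on a log dependent variable corresponds to an annual growth rate of

\begin{equation}
100\,[\exp(0.010) - 1] \approx 1.005\%
\end{equation}

which is why the approximation of 1% per year is adequate here.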

The results in the first model show a spurious relationship between invpc and price because both invpc and price are trending upward over time. In particular, if we regress log(invpc) on t (see the following code), we obtain a coefficient on the trend equal to .0081; the regression of log(price) on t yields a trend coefficient equal to .0044.

In [20]:
resinvpct <- dynlm(log(invpc) ~ t, data=tsdata)
respricet <-dynlm(log(price)~ t, data=tsdata)
stargazer(resinvpct,respricet,type="text")



                                  Dependent variable:     
                              ----------------------------
                                log(invpc)    log(price)  
                                   (1)            (2)     
----------------------------------------------------------
t                                0.008***      0.004***   
                                 (0.002)       (0.0004)   
                                                          
Constant                        -0.841***      -0.188***  
                                 (0.045)        (0.011)   
                                                          
----------------------------------------------------------
Observations                        42            42      
R2                                0.335          0.729    
Adjusted R2                       0.319          0.722    
Residual Std. Error (df = 40)     0.142          0.033    
F Statistic (df = 1; 40)        20.190***     107.566**

In some cases, adding a time trend can make a key explanatory variable more significant. This can happen if the dependent and independent variables have different kinds of trends (say, one upward and one downward), but movement in the independent variable about its trend line causes movement in the dependent variable away from its trend line.
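
The following sketch illustrates this possibility with simulated data. Everything in it is an assumption made for illustration: `x` is given a downward trend, `y` an upward trend, and the true effect of `x` on `y` is set to 1.

```r
# Simulated illustration: opposite trends mask a genuine positive effect of x on y.
set.seed(2024)
n <- 100
trend <- 1:n
x <- -0.05 * trend + rnorm(n)            # regressor trends downward
y <- 1.0 * x + 0.10 * trend + rnorm(n)   # outcome trends upward; true slope on x is 1

fit_no_trend <- lm(y ~ x)           # trend omitted
fit_trend    <- lm(y ~ x + trend)   # trend included

# Coefficient table rows for x in each regression
coef(summary(fit_no_trend))["x", ]
coef(summary(fit_trend))["x", ]
```

Without the trend, the opposing trends in `x` and `y` should pull the estimated slope well away from 1; once `trend` is included, the estimate should be close to 1 and strongly significant, because only the movement of `x` about its trend line is used to identify the effect.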

### Wooldridge Example 10.8. Fertility Equation

If we add a linear time trend to the fertility equation (10.18), we obtain

In [23]:
# Libraries for dynamic lm, regression tables and F tests
library(foreign);library(dynlm);library(lmtest);library(car);library(stargazer)
fertil3<-read.dta("https://github.com/thousandoaks/Wooldridge/blob/master/fertil3.dta?raw=true")

# Define Yearly time series beginning in 1913
tsdata <- ts(fertil3, start=1913)

# Linear regression of model using dynlm:
res3 <- dynlm(gfr ~ pe + ww2 + pill+ trend(tsdata), data=tsdata)
stargazer(res3,type="text")


                        Dependent variable:    
                    ---------------------------
                                gfr            
-----------------------------------------------
pe                           0.279***          
                              (0.040)          
                                               
ww2                         -35.592***         
                              (6.297)          
                                               
pill                           0.997           
                              (6.262)          
                                               
trend(tsdata)                -1.150***         
                              (0.188)          
                                               
Constant                    111.769***         
                              (3.358)          
                                               
-----------------------------------------------
Observations                    72     

The coefficient on pe is more than triple the estimate from (10.18), and it is much more statistically significant. Interestingly, pill is not significant once an allowance is made for a linear trend. As can be seen by the estimate, gfr was falling, on average, over this period, other factors being equal.

### 10-5e Seasonality

If a time series is observed at monthly or quarterly intervals (or even weekly or daily), it may exhibit seasonality. For example, monthly housing starts in the Midwest are strongly influenced by weather. Although weather patterns are somewhat random, we can be sure that the weather during January will usually be more inclement than in June, and so housing starts are generally higher in June than in January. One way to model this phenomenon is to allow the expected value of the series, $y_t$, to be different in each month. As another example, retail sales in the fourth quarter are typically higher than in the previous three quarters because of the Christmas holiday. Again, this can be captured by allowing the average retail sales to differ over the course of a year. This is in addition to possibly allowing for a trending mean. For example, retail sales in the most recent first quarter were higher than retail sales in the fourth quarter from 30 years ago, because retail sales have been steadily growing. Nevertheless, if we compare average sales within a typical year, the seasonal holiday factor tends to make sales larger in the fourth quarter.

Even though many monthly and quarterly data series display seasonal patterns, not all of them do. For example, there is no noticeable seasonal pattern in monthly interest or inflation rates. In addition, series that do display seasonal patterns are often seasonally adjusted before they are reported for public use. A seasonally adjusted series is one that, in principle, has had the seasonal factors removed from it. Seasonal adjustment can be done in a variety of ways, and a careful discussion is beyond the scope of this text. [See Harvey (1990) and Hylleberg (1992) for detailed treatments.]

Seasonal adjustment has become so common that it is not possible to get seasonally unadjusted data in many cases. Quarterly U.S. GDP is a leading example. In the annual Economic Report of the President, many macroeconomic data sets reported at monthly frequencies (at least for the most recent years) and those that display seasonal patterns are all seasonally adjusted. The major sources for macroeconomic time series, including Citibase, also seasonally adjust many of the series. Thus, the scope for using our own seasonal adjustment is often limited.

Sometimes, we do work with seasonally unadjusted data, and it is useful to know that simple methods are available for dealing with seasonality in regression models. Generally, we can include a set of seasonal dummy variables to account for seasonality in the dependent variable, the independent variables, or both.

The approach is simple. Suppose that we have monthly data, and we think that seasonal patterns within a year are roughly constant across time. For example, since Christmas always comes at the same time of year, we can expect retail sales to be, on average, higher in months late in the year than in earlier months. Or, since weather patterns are broadly similar across years, housing starts in the Midwest will be higher on average during the summer months than the winter months. A general model for monthly data that captures these phenomena is

\begin{equation}
y_t=\beta_0+\delta_1 feb_t+\delta_2 mar_t+\delta_3 apr_t+\ldots+\delta_{11} dec_t+\beta_1 x_{t1}+\ldots+\beta_k x_{tk}+ u_t  \tag{10.41}
\end{equation}

where $feb_t, mar_t, \dots, dec_t$ are dummy variables indicating whether time period $t$ corresponds to the appropriate month. In this formulation, January is the base month, and $\beta_0$ is the intercept for January. If there is no seasonality in $y_t$, once the $x_{tj}$ have been controlled for, then $\delta_1$ through $\delta_{11}$ are all zero. This is easily tested via an F test.
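
The F test for the joint significance of the monthly dummies can be sketched by comparing a restricted model (no dummies) with an unrestricted one. The example below uses simulated monthly data and base R only; the variable names and the December effect of 3 are assumptions made for illustration, not estimates from any data set in this chapter.

```r
# Simulated monthly data: 10 years, with a hypothetical December effect.
set.seed(42)
n_years <- 10
month <- factor(rep(month.abb, times = n_years), levels = month.abb)  # Jan is the base level
n <- length(month)
x <- rnorm(n)
y <- 1 + 0.5 * x + 3 * (month == "Dec") + rnorm(n)

# Restricted model: delta_1 = ... = delta_11 = 0
restricted <- lm(y ~ x)

# Unrestricted model: the factor expands into 11 monthly dummies
unrestricted <- lm(y ~ x + month)

# F test of the 11 exclusion restrictions
f_test <- anova(restricted, unrestricted)
print(f_test)
```

The second row of the `anova` table reports the F statistic and its p-value for the 11 restrictions; a small p-value is evidence of seasonality after controlling for `x`.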

### Wooldridge Example 10.11. Effects of Antidumping Filings

In Example 10.5 (refer to Wooldridge 2016), we used monthly data (in the file BARIUM) that have not been seasonally adjusted. Therefore, we should add seasonal dummy variables to make sure none of the important conclusions change. It could be that the months just before the suit was filed are months where imports are higher or lower, on average, than in other months. When we add the 11 monthly dummy variables as in (10.41), we find that the seasonal dummies are jointly insignificant. In addition, nothing important changes in the estimates once statistical significance is taken into account.

In [33]:
library(foreign);library(dynlm);library(lmtest)

In [35]:

barium <- read.dta("https://github.com/thousandoaks/Wooldridge/blob/master/barium.dta?raw=true")

# Define monthly time series beginning in Feb. 1978
tsdata <- ts(barium, start=c(1978,2), frequency=12)


## dynlm automatically creates and adds the appropriate monthly dummies when the formula includes season(tsdata)

res <- dynlm(log(chnimp) ~ log(chempi)+log(gas)+log(rtwex)+befile6+
                          affile6+afdec6+ season(tsdata) , data=tsdata )
summary(res)


Time series regression with "ts" data:
Start = 1978(2), End = 1988(12)

Call:
dynlm(formula = log(chnimp) ~ log(chempi) + log(gas) + log(rtwex) + 
    befile6 + affile6 + afdec6 + season(tsdata), data = tsdata)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.98535 -0.36207  0.07366  0.41786  1.37734 

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)    
(Intercept)       16.779215  32.428645   0.517   0.6059    
log(chempi)        3.265062   0.492930   6.624 1.24e-09 ***
log(gas)          -1.278140   1.389008  -0.920   0.3594    
log(rtwex)         0.663045   0.471304   1.407   0.1622    
befile6            0.139703   0.266808   0.524   0.6016    
affile6            0.012632   0.278687   0.045   0.9639    
afdec6            -0.521300   0.301950  -1.726   0.0870 .  
season(tsdata)Feb -0.417711   0.304444  -1.372   0.1728    
season(tsdata)Mar  0.059052   0.264731   0.223   0.8239    
season(tsdata)Apr -0.451483   0.268386  -1.682   0.0953 .  
season(ts

If the data are quarterly, then we would include dummy variables for three of the four quarters, with the omitted category being the base quarter. Sometimes, it is useful to interact seasonal dummies with some of the $x_{tj}$ to allow the effect of $x_{tj}$ on $y_t$ to differ across the year. Just as including a time trend in a regression has the interpretation of initially detrending the data, including seasonal dummies in a regression can be interpreted as deseasonalizing the data.
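
The deseasonalizing interpretation can be checked numerically. The sketch below uses simulated quarterly data (all names and parameter values are assumptions for illustration): it compares the slope on `x` from a regression that includes quarterly dummies with the slope obtained by first removing quarterly means from `y` and `x` and then regressing the residuals on each other.

```r
# Simulated quarterly data with a hypothetical seasonal pattern in both x and y.
set.seed(7)
n_years <- 8
quarter <- factor(rep(paste0("Q", 1:4), times = n_years))  # Q1 is the base quarter
n <- length(quarter)
season_effect <- unname(c(Q1 = 0, Q2 = 1, Q3 = 2, Q4 = 4)[as.character(quarter)])
x <- 0.5 * season_effect + rnorm(n)
y <- 2 + 1.5 * x + season_effect + rnorm(n)

# One step: include three quarterly dummies alongside x
fit_dummies <- lm(y ~ x + quarter)

# Two steps: deseasonalize y and x, then regress residuals on residuals
y_ds <- resid(lm(y ~ quarter))
x_ds <- resid(lm(x ~ quarter))
fit_ds <- lm(y_ds ~ x_ds)

# The two slope estimates agree up to numerical precision
slope_dummies <- unname(coef(fit_dummies)["x"])
slope_ds <- unname(coef(fit_ds)["x_ds"])
all.equal(slope_dummies, slope_ds)
```

The slopes coincide, but the standard errors reported by the two-step regression differ slightly because it does not account for the degrees of freedom used in the first step.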