# Longley's Economic Regression Data

To demonstrate multiple linear regression, we'll use the [`longley`](https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/longley.html) dataset from the R [`datasets`](https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/00Index.html) package. It is a small macroeconomic dataset (seven economic variables observed yearly from 1947 to 1962) and a well-known example of a highly collinear regression. For convenience, a copy of this dataset is provided at http://cobweb.cs.uga.edu/~mec/longley.csv. First, let's load the data into a [`Relation`](http://cobweb.cs.uga.edu/~jam/scalation_1.3/scalation_mathstat/target/scala-2.12/api/scalation/relalgebra/Relation$.html) to see what's available:

In [None]:
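import scalation.relalgebra.Relation
val url = "http://cobweb.cs.uga.edu/~mec/longley.csv"
// Domain string: one character per column, S for the string row label and D for each double-valued column
val rel = Relation(url, "longley", "SDDDDDDD", 0, ",")
rel.show()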

Suppose we want to model `Employed` using the other variables in a multiple linear regression. We first need to create the design matrix `x` and response vector `y` from the `Relation`. Then we create and train a `Regression` model.

In [None]:
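import scalation.analytics.Regression
// Columns 1 to 6 form the design matrix x; column 7 (Employed) is the response y
val (x, y) = rel.toMatriDD((1 to 6).toSeq, 7)
val rg = new Regression(x, y)
rg.train()    // fit the model
rg.report()   // show the fit summary, including the coefficient table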

The predictors in this model are known to be highly collinear. Collinearity inflates the standard errors of the coefficient estimates, which is consistent with the large p-values in the table.
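
Collinearity can also be checked directly on the predictors, without fitting the regression. The sketch below uses plain Scala (no ScalaTion) to compute the Pearson correlation between two predictors, `GNP` and `Year`, and the variance inflation factor (VIF) that this pair alone implies. The values are the `GNP` column as it appears in R's `longley` data, which the CSV copy is assumed to match. The helper `pearson` and the names `gnp`, `year`, `r` and `vifPair` are illustrative and not part of any library. With only two predictors, the R² of regressing one on the other equals r², so the VIF reduces to 1 / (1 - r²). The VIF of either predictor in the full six-predictor model can only be at least as large.

```scala
// Plain-Scala sketch; GNP values are taken from R's `longley` data (1947 to 1962).
val gnp = Vector(234.289, 259.426, 258.054, 284.599, 328.975, 346.999, 365.385, 363.112,
                 397.469, 419.180, 442.769, 444.546, 482.704, 502.601, 518.173, 554.894)
val year = (1947 to 1962).map(_.toDouble).toVector

// Pearson correlation coefficient between two equal-length samples.
def pearson(a: Vector[Double], b: Vector[Double]): Double = {
  require(a.length == b.length && a.nonEmpty, "samples must be non-empty and of equal length")
  val ma  = a.sum / a.length
  val mb  = b.sum / b.length
  val cov = a.zip(b).map { case (u, v) => (u - ma) * (v - mb) }.sum
  val va  = a.map(u => (u - ma) * (u - ma)).sum
  val vb  = b.map(v => (v - mb) * (v - mb)).sum
  cov / math.sqrt(va * vb)
}

val r = pearson(gnp, year)

// Two-predictor case: R^2 of one predictor regressed on the other is r^2,
// so the variance inflation factor is 1 / (1 - r^2).
val vifPair = 1.0 / (1.0 - r * r)

println(f"corr(GNP, Year) = $r%.4f")
println(f"pairwise VIF    = $vifPair%.1f")
```

A common rule of thumb treats a VIF above 10 as a sign of problematic collinearity. A correlation close to 1 between two predictors pushes the pairwise VIF far beyond that threshold.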

## References

* Longley, J. W. (1967) An appraisal of least-squares programs from the point of view of the user. *Journal of the American Statistical Association* 62, 819–841.
* Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) *The New S Language.* Wadsworth & Brooks/Cole.