# 07 Multiple Regression

## 01 Introduction

### 1. Introduction

- This lecture covers regression analysis with multiple predictors. The main new element is model search: choosing, from a set of candidate explanatory variables, a subset that describes the response sufficiently well.

- The basic model for multiple regression analysis is 

<img src="https://latex.codecogs.com/svg.image?y=\beta_0&plus;\beta_1&space;x_1&plus;\cdots&plus;\beta_k&space;x_k&plus;\epsilon">

where <img src="https://latex.codecogs.com/svg.image?x_1,\cdots,x_k"> are the explanatory variables (also called predictors), <img src="https://latex.codecogs.com/svg.image?\epsilon"> is the error term, and the parameters <img src="https://latex.codecogs.com/svg.image?\beta_0,\beta_1,\cdots,\beta_k"> can be estimated by the method of least squares.


## 02 Model and Estimation

### 1. Linear Model

- One very general form for the model:  
<img src="https://latex.codecogs.com/svg.image?Y=f(X_1,X_2,X_3)&plus;\epsilon">

where f is some unknown function and ε is the error term.

- Since we usually don't have enough data to estimate f directly, we usually assume that it has a more restricted form, for example a linear one:  
<img src="https://latex.codecogs.com/svg.image?Y=\beta_0&plus;\beta_1&space;X_1&plus;\beta_2&space;X_2&plus;\beta_3&space;X_3&plus;\epsilon">

- In a linear model the parameters enter linearly; the predictors themselves do not have to be linear. For example, a predictor may be a transformed variable such as the logarithm or the square of an observed quantity.
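
The point about linearity in the parameters can be illustrated with a small sketch. The data below are simulated for illustration only (they are not from the lecture), and the names `x`, `y`, and `fit_quad` are hypothetical. The fitted curve is quadratic in `x`, yet the model is still linear because each coefficient multiplies a known column.

```R
# Simulated data (hypothetical; not part of the lecture)
set.seed(1)
x <- seq(1, 10, length.out = 50)
y <- 2 + 0.5 * x - 0.3 * x^2 + rnorm(50, sd = 0.5)

# Quadratic in x, but linear in beta_0, beta_1, beta_2.
# I() protects x^2, since ^ has a special meaning inside a formula.
fit_quad <- lm(y ~ x + I(x^2))
coef(fit_quad)
```

By contrast, a model in which an unknown parameter appears as an exponent of a predictor is not linear in the parameters and cannot be fitted with `lm` in this way.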

### 2. Matrix Representation

- Given the actual data, we may write, for each observation:  
<img src="https://latex.codecogs.com/svg.image?y_i=\beta_0&plus;\beta_1&space;x_{i1}&plus;\beta_2&space;x_{i2}&plus;\beta_3&space;x_{i3}&plus;\epsilon_i,\;\;\;i=1,\cdots,n">

- Let

<img src="https://latex.codecogs.com/svg.image?Y=\begin{pmatrix}y_1\\y_2\\\vdots\\y_n\end{pmatrix}">
<img src="https://latex.codecogs.com/svg.image?X=\begin{pmatrix}1&x_{11}&x_{12}&x_{13}\\1&x_{21}&x_{22}&x_{23}\\\vdots&\vdots&\vdots&\vdots\\1&x_{n1}&x_{n2}&x_{n3}\\\end{pmatrix}">
<img src="https://latex.codecogs.com/svg.image?\beta=\begin{pmatrix}\beta_0\\\beta_1\\\beta_2\\\beta_3\end{pmatrix}">
<img src="https://latex.codecogs.com/svg.image?\epsilon=\begin{pmatrix}\epsilon_1\\\epsilon_2\\\vdots\\\epsilon_n\end{pmatrix}"> <br>

- Then the model can be written compactly as

<img src="https://latex.codecogs.com/svg.image?Y=X\beta&plus;\epsilon">
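
As a rough illustration of the design matrix X, the sketch below builds it in R with `model.matrix`. The data frame `dat` and its values are made up for the example; only the structure (a leading column of ones followed by one column per predictor) matters.

```R
# Hypothetical toy data with n = 5 observations and three predictors
dat <- data.frame(
  y  = c(3.1, 4.0, 5.2, 6.1, 7.3),
  x1 = c(1, 2, 3, 4, 5),
  x2 = c(2, 1, 4, 3, 5),
  x3 = c(0, 1, 0, 1, 1)
)

# model.matrix() adds the intercept column of ones automatically
X <- model.matrix(y ~ x1 + x2 + x3, data = dat)
X
dim(X)  # n rows, one column per parameter (intercept + 3 predictors)
```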

### 3. Least squares estimation

- The least squares estimate of β, denoted <img src="https://latex.codecogs.com/svg.image?\hat\beta">, minimizes the sum of squared errors (SSE):

<img src="https://latex.codecogs.com/svg.image?\sum{\epsilon_i^2}=\epsilon^T\epsilon=(Y-X\beta)^T(Y-X\beta)"> <br>

- Differentiating with respect to β and setting the result to zero gives the normal equations:

<img src="https://latex.codecogs.com/svg.image?\frac{\partial}{\partial\beta}(Y-X\beta)^T(Y-X\beta)"> <br>
<img src="https://latex.codecogs.com/svg.image?=-2X^TY&plus;2X^TX\beta=0"> <br>

- Provided <img src="https://latex.codecogs.com/svg.image?X^TX"> is invertible, solving for β gives

<img src="https://latex.codecogs.com/svg.image?\hat{\beta}=(X^T&space;X)^{-1}X^T&space;Y">
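
The closed-form estimate can be checked numerically. The following is a minimal sketch on simulated data (the variable names and true coefficients are assumptions chosen for the example): it solves the normal equations directly and compares the result with `lm`.

```R
# Simulated data (hypothetical; coefficients chosen for illustration)
set.seed(42)
n  <- 30
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)
y  <- 1 + 2 * x1 - 1 * x2 + 0.5 * x3 + rnorm(n)

# Design matrix with a leading column of ones
X <- cbind(1, x1, x2, x3)

# Solve (X'X) beta = X'y; crossprod(X) is X'X, crossprod(X, y) is X'y
beta_hat <- solve(crossprod(X), crossprod(X, y))

# lm() should give the same coefficients up to rounding error
fit <- lm(y ~ x1 + x2 + x3)
all.equal(as.vector(beta_hat), unname(coef(fit)))
```

Solving the linear system with `solve(A, b)` avoids forming the inverse explicitly. `lm` itself uses a QR decomposition, which is numerically more stable when predictors are nearly collinear.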

- Predicted values: <img src="https://latex.codecogs.com/svg.image?\hat{Y}=X\hat{\beta}=X(X^T&space;X)^{-1}X^T&space;Y=HY,\;\;\;H=X(X^T&space;X)^{-1}X^T">
- Residuals: <img src="https://latex.codecogs.com/svg.image?\hat{\epsilon}=Y-X\hat{\beta}=Y-\hat{Y}=(I-H)Y">
- Residual sum of squares: <img src="https://latex.codecogs.com/svg.image?\hat{\epsilon}^T\hat{\epsilon}=Y^T(I-H)(I-H)Y=Y^T(I-H)Y">, because I - H is symmetric and idempotent.
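
The properties of the hat matrix H used above can be verified on a small example. This sketch uses simulated data with two predictors (all names and values are assumptions for illustration) and checks that H is idempotent, that its trace equals the number of columns of X, and that both expressions for the residual sum of squares agree.

```R
# Simulated data (hypothetical)
set.seed(7)
n  <- 20
x1 <- rnorm(n); x2 <- rnorm(n)
y  <- 1 + x1 + 2 * x2 + rnorm(n)
X  <- cbind(1, x1, x2)

# Hat matrix H = X (X'X)^{-1} X'
H <- X %*% solve(crossprod(X)) %*% t(X)

y_hat <- H %*% y                 # predicted values
res   <- (diag(n) - H) %*% y     # residuals
rss   <- drop(t(y) %*% (diag(n) - H) %*% y)

all.equal(H %*% H, H)            # idempotent: HH = H
sum(diag(H))                     # trace equals the number of columns of X
all.equal(rss, sum(res^2))       # both forms of the RSS agree

# Fitted values from lm() match H y
fit <- lm(y ~ x1 + x2)
all.equal(as.vector(y_hat), unname(fitted(fit)))
```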

- Assume the errors are uncorrelated and have equal variance: <img src="https://latex.codecogs.com/svg.image?Var(\epsilon)=I\sigma^2">

### 4. Mean and variance of <img src="https://latex.codecogs.com/svg.image?\hat{\beta}">

<img src="https://latex.codecogs.com/svg.image?\hat{\beta}=(X^T&space;X)^{-1}X^T&space;Y">

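- Mean: <img src="https://latex.codecogs.com/svg.image?E(\hat{\beta})=(X^T&space;X)^{-1}X^T&space;E(Y)=(X^T&space;X)^{-1}X^T&space;X\beta=\beta">, so <img src="https://latex.codecogs.com/svg.image?\hat{\beta}"> is unbiased.
- Variance: writing <img src="https://latex.codecogs.com/svg.image?\hat{\beta}=AY"> with <img src="https://latex.codecogs.com/svg.image?A=(X^T&space;X)^{-1}X^T">,  
<img src="https://latex.codecogs.com/svg.image?Var(\hat{\beta})=Var(AY)=A\cdot&space;Var(Y)A^T">
<img src="https://latex.codecogs.com/svg.image?=(X^T&space;X)^{-1}X^T\sigma^2&space;I&space;X(X^T&space;X)^{-1}=(X^T&space;X)^{-1}\sigma^2">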
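
The mean and variance formulas can be checked by simulation. The sketch below is a small Monte Carlo experiment under assumed values (a fixed random design, true coefficients `beta`, and error standard deviation `sigma` are all chosen for illustration): it repeatedly draws new errors, recomputes the estimate, and compares the empirical mean and covariance with the theoretical ones.

```R
# Monte Carlo check (all settings are hypothetical)
set.seed(99)
n     <- 25
X     <- cbind(1, rnorm(n), rnorm(n))  # design held fixed across replications
beta  <- c(1, 2, -1)
sigma <- 1

# beta_hat = A y with A = (X'X)^{-1} X'
A <- solve(crossprod(X)) %*% t(X)

# Each replication draws fresh errors; sims is a 3 x 2000 matrix
sims <- replicate(2000, drop(A %*% (X %*% beta + rnorm(n, sd = sigma))))

rowMeans(sims)                         # should be close to beta
cov(t(sims))                           # empirical covariance of beta_hat
solve(crossprod(X)) * sigma^2          # theoretical covariance
```

The agreement is only approximate, since the empirical quantities carry simulation error that shrinks as the number of replications grows.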

- Standard error of <img src="https://latex.codecogs.com/svg.image?\hat{\beta_i}:\;\;se(\hat{\beta_i})=\sqrt{(X^T&space;X)_{ii}^{-1}}\;\hat{\sigma}">
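
The standard error formula can be reproduced by hand and compared with the values reported by `summary`. This is a minimal sketch on simulated data (names and values are assumptions for the example); the estimate of σ uses the residual degrees of freedom, n minus the number of columns of X.

```R
# Simulated data (hypothetical)
set.seed(123)
n  <- 40
x1 <- rnorm(n); x2 <- rnorm(n)
y  <- 1 + 2 * x1 - x2 + rnorm(n, sd = 1.5)

fit <- lm(y ~ x1 + x2)
X   <- model.matrix(fit)

# sigma_hat^2 = SSE / (n - number of estimated coefficients)
sigma_hat <- sqrt(sum(resid(fit)^2) / (n - ncol(X)))

# se(beta_hat_i) = sigma_hat * sqrt of the i-th diagonal entry of (X'X)^{-1}
se_manual <- sigma_hat * sqrt(diag(solve(crossprod(X))))
se_lm     <- summary(fit)$coefficients[, "Std. Error"]

all.equal(unname(se_manual), unname(se_lm))
```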

### 5. Estimating <img src="https://latex.codecogs.com/svg.image?\sigma^2">

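- ANOVA table, where n is the number of observations and p is the number of predictors (written k in the introduction):

|Source|SS|Df|MS|F-value|
|-|-|-|-|-|
|Regression|SSR|p|MSR = SSR/p|MSR/MSE|
|Error|SSE|n-p-1|MSE = SSE/(n-p-1)||
|Total|SST|n-1|||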

<br>
<img src="https://latex.codecogs.com/svg.image?\hat{\sigma}^2=\frac{SSE}{n-p-1}=MSE">  

- Coefficient of determination: <img src="https://latex.codecogs.com/svg.image?R^2=\frac{SSR}{SST}=1-\frac{SSE}{SST}">
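
The entries of the ANOVA table can be computed directly from a fitted model. The sketch below uses simulated data with p = 3 predictors (all names and values are assumptions for illustration) and checks the hand-computed MSE, R² and F-value against `summary`.

```R
# Simulated data (hypothetical)
set.seed(2024)
n  <- 50
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)
y  <- 2 + x1 + 0.5 * x2 + rnorm(n)

fit <- lm(y ~ x1 + x2 + x3)
p   <- 3                          # number of predictors

sst <- sum((y - mean(y))^2)       # total sum of squares,      df = n - 1
sse <- sum(resid(fit)^2)          # error sum of squares,      df = n - p - 1
ssr <- sst - sse                  # regression sum of squares, df = p

mse     <- sse / (n - p - 1)      # estimate of sigma^2
msr     <- ssr / p
f_value <- msr / mse
r2      <- ssr / sst

s <- summary(fit)
all.equal(mse, s$sigma^2)
all.equal(r2, s$r.squared)
all.equal(f_value, unname(s$fstatistic["value"]))
```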


### 6. Example

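```R
# Fit a multiple regression of the response y on four predictors a, b, c, d
# (assumes y, a, b, c, d already exist as numeric vectors in the workspace)
result <- lm(y ~ a + b + c + d)
summary(result)  # coefficient estimates, standard errors, t-tests, R^2, overall F-test
anova(result)    # sequential ANOVA table, one row per term plus residuals
```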

## 03 Inference: Example

### 1. Recall: The model

- Model

<img src="https://latex.codecogs.com/svg.image?y=X\beta&plus;\epsilon">

- We assume that the errors are independent and identically normally distributed with mean 0 and variance <img src="https://latex.codecogs.com/svg.image?\sigma^2">, i.e.

<img src="https://latex.codecogs.com/svg.image?\epsilon\sim&space;N(0,\sigma^2&space;I)"> <br>
<img src="https://latex.codecogs.com/svg.image?y\sim&space;N(X\beta,\sigma^2&space;I)">

### 2. Examples

- We illustrate inference for this model using an old economic dataset on 50 countries. The values are averages over 1960-1970, which removes business-cycle and other short-term fluctuations.

- The variables are: dpi, per-capita disposable income in U.S. dollars; ddpi, the percentage rate of change in per-capita disposable income; and sr, aggregate personal saving divided by disposable income. The percentages of the population under 15 (pop15) and over 75 (pop75) are also recorded. The data come from Belsley, Kuh, and Welsch (1980).
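
A dataset matching this description ships with base R as `LifeCycleSavings` (package `datasets`). The sketch below assumes that this built-in copy is the same data the lecture refers to; the lecture may have loaded it from a different source. The object name `fit_sav` is hypothetical.

```R
# LifeCycleSavings is assumed here to be the dataset described above
data(LifeCycleSavings)
str(LifeCycleSavings)   # 50 countries; columns sr, pop15, pop75, dpi, ddpi

# Regress the savings rate on the four predictors
fit_sav <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
summary(fit_sav)        # estimates, standard errors, t-tests, overall F-test
```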