
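## Ordinary least squares (OLS) regression
### Derivation
Some of the formulas and theorems used below can be found in [Random Vectors (随机向量)](https://kkdominant.github.io/posts/652.html).

Let $Y$ be the dependent variable, and suppose $p - 1$ independent variables ${X}_{1},\cdots ,{X}_{p - 1}$ affect $Y$ through the linear relationship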
$$
	Y = {\beta }_{0} + {\beta }_{1}{X}_{1} + \cdots  + {\beta }_{p - 1}{X}_{p - 1} + e,
$$


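where $e$ is the random error and ${\beta }_{0},{\beta }_{1},\cdots ,{\beta }_{p - 1}$ are unknown regression parameters: ${\beta }_{0}$ is called the constant term (intercept), and ${\beta }_{1},\cdots ,{\beta }_{p - 1}$ are called the regression coefficients.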

Given $n$ observations $\left( {x}_{i1},\cdots ,{x}_{i,p - 1},{y}_{i}\right)$, $i = 1,2,\cdots ,n$, of $\left( {X}_{1},\cdots ,{X}_{p - 1},Y\right)$, we have

$${y}_{i} = {\beta }_{0} + {\beta }_{1}{x}_{i1} + \cdots  + {\beta }_{p - 1}{x}_{i,p - 1} + {e}_{i},\;i = 1,2,\cdots ,n.$$



The errors ${e}_{1},\cdots ,{e}_{n}$ satisfy the Gauss-Markov (G-M) assumptions:

$$\left\{  \begin{array}{lll} \left( a\right) & E\left( {e}_{i}\right)  = 0; \\  \left( b\right) & \operatorname{Var}\left( {e}_{i}\right)  = {\sigma }^{2}; \\  \left( c\right) & \operatorname{Cov}\left( { {e}_{i},{e}_{j} }\right)  = 0,i \neq  j. \end{array}\right.$$



In matrix form, the model reads:

$$\left( \begin{matrix} {y}_{1} \\  {y}_{2} \\  \vdots \\  {y}_{n} \end{matrix}\right)  = \left( \begin{matrix} 1 & {x}_{11} & {x}_{12} & \cdots & {x}_{1,p - 1} \\  1 & {x}_{21} & {x}_{22} & \cdots & {x}_{2,p - 1} \\  \vdots & \vdots & \vdots & \vdots & \vdots \\  1 & {x}_{n1} & {x}_{n2} & \cdots & {x}_{n,p - 1} \end{matrix}\right) \left( \begin{matrix} {\beta }_{0} \\  {\beta }_{1} \\  \vdots \\  {\beta }_{p - 1} \end{matrix}\right)  + \left( \begin{matrix} {e}_{1} \\  {e}_{2} \\  \vdots \\  {e}_{n} \end{matrix}\right)$$

that is,

$${y}_{n \times  1} = {X}_{n \times  p}{\beta }_{p \times  1} + {e}_{n \times  1}.$$



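In vector form, the G-M assumptions on $e$ become $E\left( e\right)  = 0,\;\operatorname{Cov}\left( e\right)  = {\sigma }^{2}I.$

Here $y$ is the $n \times  1$ observation vector, $X$ is the $n \times  p$ design matrix, $\beta $ is the $p \times  1$ vector of regression parameters, and $e$ is the $n \times  1$ random error vector. Both $\beta $ and ${\sigma }^{2}$ are unknown, and the goal is to estimate them.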
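To make the shape of the design matrix concrete, here is a minimal sketch in R. The data frame `dat` and its values are made up purely for illustration; `model.matrix()` is the base R function that expands a formula into $X$.

```r
# Hypothetical data: n = 5 observations, p - 1 = 2 predictors
dat <- data.frame(
  y  = c(3.1, 4.0, 5.2, 6.1, 6.9),
  x1 = c(1, 2, 3, 4, 5),
  x2 = c(2.0, 1.5, 3.5, 3.0, 4.5)
)

# model.matrix() builds the n x p design matrix X from the formula;
# the "(Intercept)" column of ones is the column that multiplies beta_0
X <- model.matrix(y ~ x1 + x2, data = dat)
y <- dat$y

dim(X)       # number of rows n and number of columns p
colnames(X)  # names of the columns of X
X
```

With two predictors, $p = 3$: one column of ones plus one column per predictor.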

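Define the residual sum of squares

$$Q\left( \beta \right) =\left( y-X\beta \right) ^{\prime}\left( y-X\beta \right) =y^{\prime}y-2\beta ^{\prime}X^{\prime}y+\beta ^{\prime}X^{\prime}X\beta .$$

Setting its gradient with respect to $\beta$ to zero gives the normal equations:

$$
\frac{\partial Q\left( \beta \right)}{\partial \beta}=-2X^{\prime}y+2X^{\prime}X\beta =0\Longleftrightarrow -X^{\prime}y+X^{\prime}X\beta =0
$$

If $X$ has full column rank, so that $X^{\prime}X$ is invertible, the solution is

$$\hat{\beta}^{ols}=\left( X^{\prime}X \right) ^{-1}X^{\prime}y$$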
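The closed-form estimator can be checked numerically. The sketch below uses simulated data (the seed, sample size, and "true" coefficients are arbitrary assumptions) and compares the solution of the normal equations with what `lm()` returns.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n, sd = 0.3)  # hypothetical true model

X <- cbind(1, x1, x2)            # design matrix with a leading column of ones

# Solve the normal equations X'X beta = X'y.
# crossprod(X) is X'X and crossprod(X, y) is X'y; solve(A, b) avoids
# forming the inverse explicitly.
beta_hat <- solve(crossprod(X), crossprod(X, y))

# lm() fits the same model through a QR decomposition
fit <- lm(y ~ x1 + x2)

# The two sets of estimates should agree up to floating-point error
agree <- all.equal(as.numeric(beta_hat), as.numeric(coef(fit)))
agree
```

In practice `lm()` is preferable to the explicit formula, since QR is numerically more stable when the columns of $X$ are nearly collinear.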

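### Moments of the least squares (LS) estimator $\hat{\beta}$ in multiple linear regression

Let $\widehat{\beta }$ be the LS estimator of $\beta $. Then

$$E\left( \widehat{\beta }\right)  = \beta ,\operatorname{Cov}\left( \widehat{\beta }\right)  = {\sigma }^{2}{\left( {X}^{\prime }X\right) }^{-1}.$$

In the multiple linear regression model, $\hat{\beta}=(X^\prime X)^{-1}X'Y$, where $Y$ denotes the observation vector $y$. Assume the $y_i$ are mutually uncorrelated with constant variance $\sigma^2$, so that $\operatorname{Cov}\left( Y \right) =\sigma^2 I$, and $E\left( Y \right) =E\left( X\beta +e \right) =X\beta$.

Proof: start from $\hat{\beta}=(X^\prime X)^{-1}X'Y$.

For constant matrices $A,B$ and random vectors $U,V$, the theorem $\operatorname{Cov}\left( AU,BV \right) =A\operatorname{Cov}\left( U,V \right) B^{\prime}$ gives: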

$$
\begin{aligned}
E\left( \hat{\beta} \right) &=(X^{\prime}X)^{-1}X^{\prime}E\left( Y \right) =(X^{\prime}X)^{-1}X^{\prime}X\beta =\beta \\
\operatorname{Cov}\left( \hat{\beta} \right) &=\operatorname{Cov}\left( \hat{\beta},\hat{\beta} \right) \\
&=\operatorname{Cov}\left( (X^{\prime}X)^{-1}X^{\prime}Y,(X^{\prime}X)^{-1}X^{\prime}Y \right) \\
&=(X^{\prime}X)^{-1}X^{\prime}\operatorname{Cov}\left( Y,Y \right) \left( (X^{\prime}X)^{-1}X^{\prime} \right) ^{\prime} \\
&=(X^{\prime}X)^{-1}X^{\prime}\left( \sigma ^2 I \right) X(X^{\prime}X)^{-1} \\
&=\sigma ^2(X^{\prime}X)^{-1}X^{\prime}X(X^{\prime}X)^{-1} \\
&=\sigma ^2(X^{\prime}X)^{-1}
\end{aligned}
$$
		
Hence $\hat{\beta}$ is an unbiased estimator of $\beta$, with covariance matrix $\sigma ^2(X^{\prime}X)^{-1}$.
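The covariance formula can be compared with what R reports. Since $\sigma^2$ is unknown, `vcov()` plugs in the usual unbiased estimate $\hat{\sigma}^2 = \mathrm{RSS}/(n-p)$; that estimator is not derived in this post and is taken as given here. The simulated data below are an assumption made only for the check.

```r
set.seed(2)                      # arbitrary seed, for reproducibility only
n  <- 40
x1 <- runif(n)
x2 <- runif(n)
y  <- 0.5 + 1.5 * x1 + 2 * x2 + rnorm(n)  # hypothetical true model

fit <- lm(y ~ x1 + x2)
X   <- model.matrix(fit)         # design matrix used by the fit
p   <- ncol(X)

# Unbiased estimate of sigma^2: residual sum of squares over n - p
sigma2_hat <- sum(residuals(fit)^2) / (n - p)

# Plug-in version of sigma^2 (X'X)^{-1}
cov_manual <- sigma2_hat * solve(crossprod(X))

# Should match the covariance matrix reported by vcov()
same_cov <- all.equal(cov_manual, vcov(fit), check.attributes = FALSE)
same_cov
```

The square roots of the diagonal of this matrix are the `Std. Error` column printed by `summary()`.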



## Fitting regression models
### The `lm` function
1. Usage

```r
lm(formula, data, subset, weights, na.action,
   method = "qr", model = TRUE, x = FALSE, y = FALSE, qr = TRUE,
   singular.ok = TRUE, contrasts = NULL, offset, ...)
```
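`formula` specifies the model to fit, for example $y\sim x_1+x_2$ (written `y ~ x1 + x2` in R).

The book *R in Action* (R语言实战)[^1] explains how each of these arguments is used.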
![](https://kkdominant.oss-cn-shanghai.aliyuncs.com/img/202406192148257.png)
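A few of the arguments and formula operators can be tried on `mtcars`, a data set that ships with base R. The choice of columns is arbitrary and only meant as a sketch of the syntax.

```r
# mtcars is part of the 'datasets' package that is loaded by default
fit_all <- lm(mpg ~ wt + hp, data = mtcars)

# subset: fit only on the rows that satisfy a condition
fit_sub <- lm(mpg ~ wt + hp, data = mtcars, subset = cyl > 4)

# Common formula operators
fit_int  <- lm(mpg ~ wt * hp, data = mtcars)  # expands to wt + hp + wt:hp
fit_noic <- lm(mpg ~ wt - 1, data = mtcars)   # -1 drops the intercept
fit_dot  <- lm(mpg ~ ., data = mtcars)        # . means all other columns

nobs(fit_all)         # number of observations used in the full fit
nobs(fit_sub)         # fewer rows after subsetting
names(coef(fit_int))  # includes the interaction term wt:hp
```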

```r
x <- c(4.17, 5.58, 5.18, 6.11, 4.50, 4.61, 5.17, 4.53, 5.33, 5.14)
y <- c(4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69)
# Note: the formula x ~ y makes x the response and y the predictor
fit <- lm(x ~ y)
summary(fit)
```

```
Call:
lm(formula = x ~ y)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.81190 -0.35012  0.09068  0.33302  0.71789 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   6.5992     1.0904   6.052 0.000305 ***
y            -0.3362     0.2309  -1.456 0.183512    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.5499 on 8 degrees of freedom
Multiple R-squared:  0.2095,	Adjusted R-squared:  0.1106 
F-statistic:  2.12 on 1 and 8 DF,  p-value: 0.1835
```


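* `Residuals`: the minimum, quartiles, and maximum of the residuals.

* `Coefficients`: the estimated coefficients with their standard errors, t values, and p-values.

* `Residual standard error`: the estimate of the error standard deviation $\sigma$.

* `Multiple R-squared`: the proportion of the variance of the response explained by the model, a measure of goodness of fit.
* `F-statistic`: the statistic for testing the overall significance of the model.

Here the p-value of the F-statistic (0.1835) is greater than 0.05, so the model as a whole is not significant at the 5% level.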
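The quantities in the printed summary can also be pulled out programmatically. The sketch below refits the same `x ~ y` model and reads the components of the object returned by `summary()`; the p-value of the F test is recomputed with `pf()` because the summary object stores only the statistic and its degrees of freedom.

```r
x <- c(4.17, 5.58, 5.18, 6.11, 4.50, 4.61, 5.17, 4.53, 5.33, 5.14)
y <- c(4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69)
fit <- lm(x ~ y)                 # x is the response, as in the fit above
s   <- summary(fit)

s$coefficients    # matrix with Estimate, Std. Error, t value, Pr(>|t|)
s$sigma           # residual standard error
s$r.squared       # multiple R-squared
s$adj.r.squared   # adjusted R-squared

# F statistic: named vector with elements value, numdf, dendf
f <- s$fstatistic
p_value <- as.numeric(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
p_value           # p-value of the overall F test
```

With a single predictor, the F statistic is the square of that predictor's t value, which is why both tests report the same p-value.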


## Polynomial regression
Fit the model $y=\hat{\beta}_0+\hat{\beta}_1x+\hat{\beta}_2x^2$, this time with $y$ as the response and $x$ as the predictor.

```r
# I() protects x^2 so that ^ is treated as arithmetic, not as a formula operator
fit1 <- lm(y ~ x + I(x^2))
summary(fit1)
```

```
Call:
lm(formula = y ~ x + I(x^2))

Residuals:
     Min       1Q   Median       3Q      Max 
-1.13550 -0.29425 -0.08672  0.02931  1.25722 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -8.7690    18.1200  -0.484    0.643
x             5.9427     7.1427   0.832    0.433
I(x^2)       -0.6428     0.6980  -0.921    0.388

Residual standard error: 0.7557 on 7 degrees of freedom
Multiple R-squared:  0.2949,	Adjusted R-squared:  0.09342 
F-statistic: 1.464 on 2 and 7 DF,  p-value: 0.2944
```


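> $y=\hat{\beta}_0+\hat{\beta}_1x+\hat{\beta}_2x^2$ is still a linear model, because it is linear in the coefficients: setting $t=x^2$ gives $y=\hat{\beta}_0+\hat{\beta}_1x+\hat{\beta}_2t$.
> By contrast, $y=\hat{\beta}_0+\hat{\beta}_1x+\sin(\hat{\beta}_2x)$ is not a linear model, because the unknown coefficient $\hat{\beta}_2$ enters nonlinearly, inside the sine.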
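The substitution $t=x^2$ can be checked directly: fitting on an explicit squared column should give the same coefficients as the `I(x^2)` formula. The variable name `x_sq` is introduced here only for illustration.

```r
x <- c(4.17, 5.58, 5.18, 6.11, 4.50, 4.61, 5.17, 4.53, 5.33, 5.14)
y <- c(4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69)

# Quadratic term written inside the formula
fit_I <- lm(y ~ x + I(x^2))

# Same model after the substitution t = x^2 (here called x_sq)
x_sq  <- x^2
fit_t <- lm(y ~ x + x_sq)

# Same model again using raw (non-orthogonal) polynomial terms
fit_p <- lm(y ~ poly(x, 2, raw = TRUE))

# All three parameterizations should give identical coefficients
same_It <- all.equal(unname(coef(fit_I)), unname(coef(fit_t)))
same_Ip <- all.equal(unname(coef(fit_I)), unname(coef(fit_p)))
c(same_It, same_Ip)
```

Note that `poly(x, 2)` without `raw = TRUE` uses orthogonal polynomials: the fitted values are the same, but the coefficients are not directly comparable.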