# Регуляризация 

Отстраненные штуки:

In [None]:
def f(x):
    """
        Документации не будет, ахахах
    """
    return x**2

f()

Основные штуки: 

In [1]:
import pandas as pd
import numpy as np

# ok, here we go again
data = pd.read_csv('Advertising.csv', index_col=0)
data.head()

Unnamed: 0,TV,Radio,Newspaper,Sales
1,230.1,37.8,69.2,22.1
2,44.5,39.3,45.1,10.4
3,17.2,45.9,69.3,9.3
4,151.5,41.3,58.5,18.5
5,180.8,10.8,58.4,12.9


$$
y = x^{\alpha} z^{\beta}
$$

$$
\ln y = \alpha \ln x + \beta \ln z
$$

In [2]:
data['ln_Sales'] = data['Sales'].apply(np.log)
data['ln_TV'] = data['TV'].apply(np.log)

In [4]:
y = data['ln_Sales'].values
y[:10]

array([3.09557761, 2.34180581, 2.2300144 , 2.91777073, 2.55722731,
       1.97408103, 2.46809953, 2.58021683, 1.56861592, 2.360854  ])

In [5]:
X = data[['ln_TV', 'Newspaper', 'Radio']].values
X[:10]

array([[ 5.438514  , 69.2       , 37.8       ],
       [ 3.79548919, 45.1       , 39.3       ],
       [ 2.84490938, 69.3       , 45.9       ],
       [ 5.02058562, 58.5       , 41.3       ],
       [ 5.19739145, 58.4       , 10.8       ],
       [ 2.16332303, 75.        , 48.9       ],
       [ 4.05178495, 23.5       , 32.8       ],
       [ 4.78915702, 11.6       , 19.6       ],
       [ 2.1517622 ,  1.        ,  2.1       ],
       [ 5.29731687, 21.2       ,  2.6       ]])

## Разбиение на трейн и тест 

![](https://miro.medium.com/max/875/1*_7OPgojau8hkiPUiHoGK_w.png)


Что делает мето `.fit`?


__Модель:__ 

$$
y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i}
$$

__Обычная линейная регресиия:___

$$
MSE = \frac{1}{n} \sum_{i=1}^n [y_i - (\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i})]^2 \to \min_{\beta_0, \beta_1, \beta_2,\beta_3}
$$

In [6]:
from sklearn.model_selection import train_test_split

# Случайно разбиваем выборку на две части
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
X_train.shape, X_test.shape

((160, 3), (40, 3))

In [13]:
from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(X_train, y_train)
model.coef_

array([3.63127292e-01, 1.69236765e-04, 1.34080604e-02])

$$
MAE(y, \hat{y}) = \frac{1}{n} \sum_{i=1}^n |y_i - \hat{y}_i|
$$

In [21]:
import numpy as np
from sklearn.metrics import r2_score, mean_absolute_error

y_pred = model.predict(X_train)

np.mean(np.abs(y_pred - y_train)) # mae своими руками

0.0455921938257903

In [25]:
y_pred = model.predict(X_train)

print("Метрики для тренировочной выборки:")
print('mae:', mean_absolute_error(y_pred, y_train))
print('r2:', r2_score(y_pred, y_train))
print('-'*50)

y_pred = model.predict(X_test)

print("Метрики для тестовой выборки:")
print('mae:', mean_absolute_error(y_pred, y_test))
print('r2:', r2_score(y_pred, y_test))

Метрики для тренировочной выборки:
mae: 0.0455921938257903
r2: 0.9713949509987423
--------------------------------------------------
Метрики для тестовой выборки:
mae: 0.04794655974947582
r2: 0.9779578696898982


## Кросс-валидация 

![](https://long-short.pro/wp-content/uploads/sites/3/2013/06/crossvalidation.png)

In [35]:
from sklearn.model_selection import cross_val_score

model = LinearRegression()

metrics = cross_val_score(model, X_train, y_train, cv=5, n_jobs=-1, 
                          #scoring = 'r2',
                          scoring='neg_mean_absolute_error')
metrics

array([-0.04257416, -0.06009321, -0.06129719, -0.03917744, -0.03570213])

In [33]:
-1*np.mean(metrics)

0.04776882662500724

# Регуляризация 

In [51]:
from sklearn.linear_model import Ridge, Lasso

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# model = Ridge(alpha=20)
model = Lasso(alpha=1)

model.fit(X_train, y_train)
model.coef_

array([0.        , 0.        , 0.00842513])

In [52]:
y_pred = model.predict(X_train)

print("Метрики для тренировочной выборки:")
print('mae:', mean_absolute_error(y_pred, y_train))
print('r2:', r2_score(y_pred, y_train))
print('-'*50)

y_pred = model.predict(X_test)

print("Метрики для тестовой выборки:")
print('mae:', mean_absolute_error(y_pred, y_test))
print('r2:', r2_score(y_pred, y_test))

Метрики для тренировочной выборки:
mae: 0.29441840396369623
r2: -9.064091240049798
--------------------------------------------------
Метрики для тестовой выборки:
mae: 0.21892110883539564
r2: -3.587081455163105


## Ручные задачи


__Модель:__ 

$$
y_i = \beta x_i
$$

__Обычная линейная регресиия:__

$$
MSE = \frac{1}{n} \sum_{i=1}^n [y_i - \beta x_i]^2 \to \min_{\beta}
$$

__Решение:__

$$
\hat{\beta} = \frac{ \sum x_i y_i }{ \sum x^2_i}
$$

__Ridge регресиия:__

$$
L =  \frac{1}{n} \sum_{i=1}^n [y_i - (\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i})]^2 + \alpha \sum \beta_k^2 \to \min_{\beta_0, \beta_1, \beta_2,\beta_3}
$$

$$
L = \frac{1}{n} \sum_{i=1}^n [y_i - \beta x_i]^2 + \alpha \cdot \beta^2 \to \min_{\beta}
$$

__Решение модели:__

$$
L' = \frac{1}{n} \cdot \sum -2x_i(y_i - \beta x_i) + 2 \alpha \beta = 0
$$


$$
\sum x_i y_i -  \beta \sum x^2_i - n \alpha \beta = 0
$$

$$
\hat{\beta} = \frac{ \sum x_i y_i }{ \sum x^2_i + n \alpha}
$$

__LASSO-регресия__

$$
L =  \frac{1}{n} \sum_{i=1}^n [y_i - (\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i})]^2 + \alpha \sum |\beta_k|\to \min_{\beta_0, \beta_1, \beta_2,\beta_3}
$$

### Упражнение

Переписать Ridge-регрессию в виде задачи оптимизации с ограничениями


$$
L =  \frac{1}{n} \sum_{i=1}^n [y_i - (\beta_1 x_{1i} + \beta_2 x_{2i})]^2 + \alpha \cdot (\beta_1^2 + \beta_2^2) \to \min_{\beta_1, \beta_2}
$$

Батюшки! Да это же лагранджиан!

\begin{equation*}
\begin{cases}
& \frac{1}{n} \sum_{i=1}^n [y_i - ( \beta_1 x_{1i} + \beta_2 x_{2i})]^2 \to \min_{\beta_1, \beta_2}  \\
& \beta_1^2+ \beta^2_2  \le C
\end{cases}
\end{equation*}

А Lasso-регрессию? 

\begin{equation*}
\begin{cases}
& \frac{1}{n} \sum_{i=1}^n [y_i - ( \beta_1 x_{1i} + \beta_2 x_{2i})]^2 \to \min_{\beta_1, \beta_2}  \\
& |\beta_1|+ |\beta_2|  \le C
\end{cases}
\end{equation*}

![](https://i.stack.imgur.com/U13c4.png)

![](https://miro.medium.com/max/761/1*nrWncnoJ4V_BkzEf1pd4MA.png)