# Week01: Regression & Optimization

# Linear Regression

$$
Y \sim X_1, X_2, \dots, X_D
$$

where $Y$ is the dependent variable and $X_1, X_2, \dots, X_D$ are the independent variables. Linear regression models $Y$ as a weighted sum of the independent variables plus an intercept $w_0$:

$$
Y = w_0 + w_1X_1 + w_2X_2 + \dots + w_DX_D = w_0 + \sum_{i = 1}^{D} w_iX_i
$$
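
As a minimal sketch of how this equation can be evaluated for several samples at once, the snippet below stores the samples as rows of a matrix and the weights $w_1, \dots, w_D$ as a vector. The function name `predict` and all numbers are made up for illustration; they do not come from the course material.

```python
import numpy as np

def predict(X, w0, w):
    """Return w0 + sum_i w[i] * X[:, i] for every row of X."""
    X = np.asarray(X, dtype=float)
    w = np.asarray(w, dtype=float)
    return w0 + X @ w

# Illustrative values only: 3 samples, D = 2 independent variables.
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
y_hat = predict(X, w0=0.5, w=[2.0, -1.0])
print(y_hat)  # [0.5 2.5 4.5]
```

With $D = 1$ this reduces to the single-variable case used in the example that follows.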

### Example 7.2 (Mr. Len Bui's book)

In [1]:
import pandas as pd
import numpy as np

In [2]:
income = np.array([31, 50, 47, 45, 39, 50, 35, 40, 45, 50])
expenses = np.array([29, 42, 38, 30, 29, 41, 23, 36, 42, 48])

First case: $w_0 = 0, w_1 = 0.8$

In [3]:
w0 = 0
w1 = 0.8

In [4]:
expenses_hat = income * w1 + w0

In [5]:
expenses_hat

array([24.8, 40. , 37.6, 36. , 31.2, 40. , 28. , 32. , 36. , 40. ])

In [6]:
res = np.mean((expenses - expenses_hat) ** 2)

In [7]:
res

20.464

In [8]:
(expenses - expenses_hat) ** 2

array([17.64,  4.  ,  0.16, 36.  ,  4.84,  1.  , 25.  , 16.  , 36.  ,
       64.  ])

In [9]:
def mse_error(w0, w1):
    """Mean squared error of the line w0 + w1 * income against expenses."""
    expenses_hat = income * w1 + w0
    return np.mean((expenses - expenses_hat) ** 2)

Second case: $w_0 = 0, w_1 = 0.9$

In [10]:
mse_error(w0=0, w1=0.9)

27.56600000000001

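The least-squares estimates of $w_0$ and $w_1$, i.e. the values that minimize the mean squared error, have a closed form (a bar denotes a sample mean, with $x$ the income and $y$ the expenses):
$$
\begin{aligned}
\hat{w}_1 &= \frac{\overline{xy} - \bar{x}\,\bar{y}}{\overline{x^2} - \bar{x}^2} \\
\hat{w}_0 &= \bar{y} - \hat{w}_1\bar{x}
\end{aligned}
$$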
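
For reference, here is a short sketch of where these formulas come from. The sample count $N$ and the index $n$ are notation introduced here, and $\overline{xy}$, $\overline{x^2}$ denote the means of $x_n y_n$ and $x_n^2$. Setting both partial derivatives of the mean squared error to zero gives:

$$
\begin{aligned}
\mathrm{MSE}(w_0, w_1) &= \frac{1}{N}\sum_{n=1}^{N} (y_n - w_0 - w_1 x_n)^2 \\
\frac{\partial\,\mathrm{MSE}}{\partial w_0} = 0 \;&\Rightarrow\; \bar{y} - w_0 - w_1\bar{x} = 0 \;\Rightarrow\; w_0 = \bar{y} - w_1\bar{x} \\
\frac{\partial\,\mathrm{MSE}}{\partial w_1} = 0 \;&\Rightarrow\; \overline{xy} - w_0\bar{x} - w_1\overline{x^2} = 0 \\
\text{substituting } w_0 \;&\Rightarrow\; \overline{xy} - \bar{x}\,\bar{y} = w_1\left(\overline{x^2} - \bar{x}^2\right)
\end{aligned}
$$

Dividing the last line by $\overline{x^2} - \bar{x}^2$ (the variance of $x$, which is nonzero unless all $x_n$ are equal) yields the expression for $\hat{w}_1$.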

In [11]:
mean_xy = np.mean(income * expenses)
mean_x = np.mean(income)
mean_x_squared = np.mean(income ** 2)
mean_y = np.mean(expenses)

In [12]:
mean_xy

1585.1

In [13]:
mean_x

43.2

In [14]:
mean_x_squared

1906.6

In [15]:
mean_y

35.8

In [16]:
w1_hat = (mean_xy - mean_x * mean_y) / (mean_x_squared - mean_x ** 2)

In [17]:
w1_hat

0.9549058473736441

In [18]:
w0_hat = mean_y - w1_hat * mean_x

In [19]:
w0_hat

-5.451932606541433

In [20]:
mse_error(w0=w0_hat, w1=w1_hat)

17.957928642220025
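
This error of about 17.96 is lower than both earlier guesses (20.464 for $w_1 = 0.8$ and 27.566 for $w_1 = 0.9$). As a sanity check that is not part of the original example, the same coefficients can be obtained with `np.polyfit`, which fits a least-squares polynomial; for degree 1 it returns the slope followed by the intercept.

```python
import numpy as np

income = np.array([31, 50, 47, 45, 39, 50, 35, 40, 45, 50])
expenses = np.array([29, 42, 38, 30, 29, 41, 23, 36, 42, 48])

# Coefficients come back from the highest power down: [slope, intercept].
slope, intercept = np.polyfit(income, expenses, deg=1)
print(slope, intercept)
```

The printed values should agree with `w1_hat` and `w0_hat` above up to floating-point rounding.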

# Optimization

Run `simann_bing.py` and `simann_chatgpt.py` to see the simulated annealing algorithm in action.
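
The two scripts are not reproduced here, so the following is only a rough sketch of the general idea, applied to the regression example above; it is not the contents of either file. The function name `simulated_annealing`, the geometric cooling schedule, the step sizes, and the seed are all assumptions chosen for illustration.

```python
import math
import random

import numpy as np

income = np.array([31, 50, 47, 45, 39, 50, 35, 40, 45, 50])
expenses = np.array([29, 42, 38, 30, 29, 41, 23, 36, 42, 48])

def mse_error(w0, w1):
    return float(np.mean((expenses - (income * w1 + w0)) ** 2))

def simulated_annealing(start, n_steps=20000, t_start=1.0, t_end=1e-3, seed=0):
    """Hypothetical helper: minimize mse_error over (w0, w1)."""
    rng = random.Random(seed)
    current = start
    current_cost = mse_error(*current)
    best, best_cost = current, current_cost
    for step in range(n_steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (step / (n_steps - 1))
        # Propose a random neighbour; the step sizes are arbitrary choices.
        candidate = (current[0] + rng.gauss(0, 0.5),
                     current[1] + rng.gauss(0, 0.01))
        candidate_cost = mse_error(*candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept a worse move with
        # probability exp(-delta / t), which shrinks as t cools.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best, best_cost

# Start from the first guess used earlier: w0 = 0, w1 = 0.8.
best, best_cost = simulated_annealing(start=(0.0, 0.8))
print(best, best_cost)
```

Because the best point seen so far is tracked, the returned error can never exceed that of the starting guess, and it cannot go below the least-squares minimum of about 17.958. For this convex problem the closed-form estimate is the better tool; simulated annealing becomes useful when the objective has many local minima.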