# 3. Linear Models

## Logistic Regression

### Binary Classification Problems

Setting:
* $ X $ is a multiset of feature vectors from an inner product space $ \mathbf{X} $, $ \mathbf{X} \subseteq \mathbb{R}^p $
* $ C = \{0, 1\} $ is a set of two classes
* $ D = \{(\mathbf{x}_1, c_1), \dots, (\mathbf{x}_n, c_n)\} \subseteq X \times C $ is a multiset of examples

Learning task:
* Fit $ D $ using a logistic function $ y() $.

Examples for binary classification problems:
* E-Mail is spam or ham?
* Patient infected or healthy?
* Customer creditworthy or not?

### Linear Regression

![img1](img/topic3img1.png)

* Linear Regression: $ y(\mathbf{x}) = \mathbf{w}^T \mathbf{x} $
* Classification: Predict "spam" if $ y(\mathbf{x}) \geq 0 $ else "ham"

Restrict the range of $ y(\mathbf{x}) $ to reflect the two-class classification semantics:

$ -1 \leq y(\mathbf{x}) \leq 1 $ or $ 0 \leq y(\mathbf{x}) \leq 1 $ 

### Sigmoid (Logistic) Function

$ \sigma(z) = \frac{1}{1 + e^{-z}} $

Linear Regression $ \circ $ Sigmoid Function $ \rightarrow $ Logistic Model Function

$ \mathbf{w}^T \mathbf{x} \circ \frac{1}{1 + e^{-z}} \rightarrow y(\mathbf{x}) \equiv \sigma(\mathbf{w}^T \mathbf{x}) = \frac{1}{1 + e^{-\mathbf{w}^T \mathbf{x}}} $

$ y: \mathbb{R}^{p + 1} \rightarrow (0; 1) $

This is interpreted as the estimated probability for the event $ \boldsymbol{\mathsf{C}} = 1 $:
* $ y(\mathbf{x}) = P(\boldsymbol{\mathsf{C}}=1 \mid \boldsymbol{\mathsf{X}}=\mathbf{x}; \mathbf{w}) =: p(1 \mid \mathbf{x}; \mathbf{w}) $ "Probability for C=1 given x, parameterized by w"
* $ 1 - y(\mathbf{x}) = P(\boldsymbol{\mathsf{C}}=0 \mid \boldsymbol{\mathsf{X}}=\mathbf{x}; \mathbf{w}) =: p(0 \mid \mathbf{x}; \mathbf{w}) $ "Probability for C=0 given x, parameterized by w"
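The model function above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the weight vector `w` and the feature vector `x` are hypothetical values chosen only to show how the two class probabilities are obtained.

```python
import math

def sigmoid(z):
    """Logistic function sigma(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def y(w, x):
    """Logistic model function sigma(w^T x); x[0] = 1 is the bias feature."""
    z = sum(w_j * x_j for w_j, x_j in zip(w, x))
    return sigmoid(z)

# Hypothetical weights and features, for illustration only.
w = [-1.0, 0.5]
x = [1.0, 4.0]            # x_0 = 1, x_1 = 4

p_class_1 = y(w, x)       # estimate of p(1 | x; w)
p_class_0 = 1.0 - p_class_1   # estimate of p(0 | x; w)
print(p_class_1, p_class_0)
```

Since $ \sigma(0) = 0.5 $, the model assigns equal probability to both classes exactly when $ \mathbf{w}^T \mathbf{x} = 0 $.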

Example (email spam classification):

\begin{equation*}
\begin{split}
\mathbf{x} = 
\begin{pmatrix}
x_0 \\ 
x_1
\end{pmatrix}
=
\begin{pmatrix}
1 \\ 
|\text{obscene words}|
\end{pmatrix},
\quad
\mathbf{x}_1 = 
\begin{pmatrix}
1 \\
5
\end{pmatrix}
\text{ and }
y(\mathbf{x}_1) = 0.67
\end{split}
\end{equation*}

$ \Rightarrow $ 67% chance that this email is spam.


Recap: **Linear Regression for classification**
![img2](img/topic3img2.png)

Recap: **Logistic Regression for classification**
![img3](img/topic3img3.png)

### The BGD Algorithm

Algorithm: Batch Gradient Descent

Input: 
- $ D $ (multiset of examples $ (\mathbf{x}, c) $ with $ \mathbf{x} \in \mathbb{R}^p, c \in \{0, 1\} $)
- $ \eta $ Learning rate, small positive constant

Output:

$ \mathbf{w} $ weight vector from $ \mathbb{R}^{p + 1} $ (= hypothesis)

![img4](img/topic3img4.png)


(Repeat until convergence):

`FOREACH (x, c) in D DO:`
- [Model Function evaluation]
- [Calculation of residual]
- [Calculation of derivative of the loss, accumulate for D]
`ENDDO`
- Parameter Vector update = one gradient step down
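The loop outlined above can be sketched with NumPy as follows. This is one possible reading of the steps, under several assumptions that are not stated in these notes: the weights start at zero, each $ \mathbf{x} $ is extended with $ x_0 = 1 $, the accumulated quantity is the negative gradient of the logistic loss, $ (c - y(\mathbf{x})) \cdot \mathbf{x} $, and "convergence" is approximated by a small update norm `tol` or an iteration cap `max_iter`. The function name `bgd_logistic` and the toy data are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bgd_logistic(D, eta=0.05, max_iter=5000, tol=1e-6):
    """Batch gradient descent sketch for logistic regression.

    D: list of (x, c) with x in R^p and c in {0, 1}.
    Returns w in R^(p+1); w[0] is the bias weight.
    """
    p = len(D[0][0])
    w = np.zeros(p + 1)                         # assumption: zero initialization
    for _ in range(max_iter):                   # "repeat until convergence"
        delta_w = np.zeros(p + 1)
        for x, c in D:
            x_ext = np.concatenate(([1.0], x))  # prepend x_0 = 1
            y_x = sigmoid(w @ x_ext)            # model function evaluation
            residual = c - y_x                  # calculation of residual
            delta_w += eta * residual * x_ext   # loss derivative, accumulated over D
        w += delta_w                            # one gradient step down
        if np.linalg.norm(delta_w) < tol:       # assumption: stopping criterion
            break
    return w

# Hypothetical one-dimensional toy data (not linearly separable).
D = [(np.array([0.0]), 0), (np.array([1.0]), 0), (np.array([2.0]), 1),
     (np.array([3.0]), 0), (np.array([4.0]), 1), (np.array([5.0]), 1)]
w = bgd_logistic(D)
print(w)
```

The weights are updated only once per pass over $ D $, which is what distinguishes the batch variant from an update after every single example.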

![img5](img/topic3img5.png)

More complex polynomials will entail more complex decision boundaries (see lecture notes).

...

## Loss Computation in Detail

2nd part of ML stack: "Optimization Objective"
* Objective: Minimize Loss
* Regularization: None
* Loss: 0/1 loss, squared loss, logistic loss, cross-entropy loss, hinge loss

* The pointwise loss $ l(c, y(\mathbf{x})) $ quantifies the error introduced by some $ \mathbf{x} $. The loss depends on the hypothesis $ y() $ and the true class $ c $ of $ \mathbf{x} $.
* For $ y(\mathbf{x}) = \mathbf{w}^T \mathbf{x} $ we define the following pointwise loss functions
  - 0/1 loss : $ l_{0/1}(c, y(\mathbf{x})) = I_{\neq}(c, \text{sign}(y(\mathbf{x}))) $ which is zero if $ c = \text{sign}(y(\mathbf{x})) $ and 1 otherwise
  - Squared loss : $ l_2(c, y(\mathbf{x})) = (c - y(\mathbf{x}))^2 $
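A small sketch of these two pointwise losses for the linear model follows. It assumes the classes are encoded as $ -1 $ and $ +1 $ so that they can be compared with the sign of $ y(\mathbf{x}) $, and it sends $ y(\mathbf{x}) = 0 $ to $ +1 $; both choices are assumptions made for this illustration, as are the function names.

```python
def sign(v):
    # Assumption: a value of exactly 0 is mapped to +1.
    return 1 if v >= 0 else -1

def zero_one_loss_linear(c, y_x):
    """0/1 loss: 0 if the sign of y(x) matches the class c, else 1."""
    return 0 if c == sign(y_x) else 1

def squared_loss(c, y_x):
    """Squared loss (c - y(x))^2."""
    return (c - y_x) ** 2

# Assumption: classes encoded as -1 / +1 for the linear model.
print(zero_one_loss_linear(1, 3.0), squared_loss(1, 3.0))    # correct side, far from the boundary
print(zero_one_loss_linear(-1, 0.5), squared_loss(-1, 0.5))  # wrong side of the boundary
```

In the first call the example is classified correctly, yet the squared loss is large because $ y(\mathbf{x}) $ lies far from the class value. This is one reason an unbounded linear output fits classification poorly.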

![img6](img/topic3img6.png)

* For $ y(\mathbf{x}) = \sigma(\mathbf{w}^T \mathbf{x}) = \frac{1}{1 + e^{-\mathbf{w}^T \mathbf{x}}} $ we define the following pointwise loss functions:
  - 0/1 loss : $ l_{0/1}(c, y(\mathbf{x})) = I_{\neq}(c, \lfloor y(\mathbf{x}) + 0.5 \rfloor) $
  - Logistic loss : $ l_{\sigma}(c, y(\mathbf{x})) = -\log(y(\mathbf{x})) $ if $ c = 1 $, and $ -\log(1 - y(\mathbf{x})) $ if $ c = 0 $
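The two losses for the logistic model can be sketched as below, with classes in $ \{0, 1\} $ and $ y(\mathbf{x}) \in (0, 1) $. The function names are hypothetical; the value 0.67 reuses the output from the spam example earlier in this section.

```python
import math

def zero_one_loss_logistic(c, y_x):
    """0/1 loss: round y(x) to the nearest class via floor(y(x) + 0.5)."""
    return 0 if c == math.floor(y_x + 0.5) else 1

def logistic_loss(c, y_x):
    """Logistic loss: -log(y(x)) if c = 1, -log(1 - y(x)) if c = 0."""
    return -math.log(y_x) if c == 1 else -math.log(1.0 - y_x)

y_x = 0.67  # model output taken from the spam example
print(zero_one_loss_logistic(1, y_x), logistic_loss(1, y_x))  # true class is 1
print(zero_one_loss_logistic(0, y_x), logistic_loss(0, y_x))  # true class is 0
```

Unlike the 0/1 loss, the logistic loss grows without bound as the predicted probability of the true class approaches 0, so confident wrong predictions are penalized most heavily.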

![img7](img/topic3img7.png)