**1.1**

Logistic regression models the probability of a binary outcome using the sigmoid function applied to a linear model:

$\hat{y}_i = \sigma(w^\top x_i + b)$

$\sigma(z) = \frac{1}{1 + e^{-z}}$
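The two formulas above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names `sigmoid` and `predict_proba`, and the weights and features in the example, are assumptions made here and do not come from the assignment. The sigmoid is written in a branch-wise form to avoid overflow for large negative $z$.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + exp(-z)), evaluated without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # safe: z < 0, so exp(z) is in (0, 1)
    return ez / (1.0 + ez)


def predict_proba(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Predicted probability y_hat = sigmoid(w^T x + b) for one feature vector."""
    z = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
    return sigmoid(z)


# Hypothetical weights and features, chosen only for illustration.
w = [0.5, -1.0]
b = 0.0
x = [2.0, 1.0]
p = predict_proba(w, b, x)  # z = 0.5*2 - 1.0*1 + 0 = 0, so p = sigmoid(0)
print(p)
```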

Because the target variable satisfies
$y_i \in \{0,1\}$, each label is modeled as a Bernoulli random variable with parameter
$\hat{y}_i$.

Likelihood of a single observation:

$p(y_i \mid x_i; w, b) = \hat{y}_i^{y_i}(1 - \hat{y}_i)^{1 - y_i}$

Assuming the data points are conditionally independent, the likelihood of the full dataset
$D = \{(x_i, y_i)\}_{i=1}^{n}$ is:

$p(D \mid \theta) = \prod_{i=1}^{n} \hat{y}_i^{y_i}(1 - \hat{y}_i)^{1 - y_i}$

where $\theta = (w, b)$ collects the model parameters.

**MLE**

Maximum Likelihood Estimation chooses parameters that maximize the probability of observing the data:

$\hat{\theta}_{\text{MLE}} = \arg\max_{\theta} \; p(D \mid \theta)$

Because the logarithm is monotonically increasing, maximizing the likelihood is equivalent to maximizing the log-likelihood:

$\ell(\theta) = \sum_{i=1}^{n} \left[ y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i)
\right]$

Maximizing $\ell(\theta)$ is equivalent to minimizing the negative log-likelihood, which is the binary cross-entropy loss:

$J_{\text{MLE}}(\theta) = -\ell(\theta)$
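As a rough numerical check of the relationship between the likelihood and the cross-entropy loss, the sketch below evaluates $J_{\text{MLE}}$ for three made-up predictions and recovers the product-form likelihood from it. The function name, the clipping constant `eps`, and the example values are assumptions for illustration only.

```python
import math


def negative_log_likelihood(y_true, y_prob, eps=1e-12):
    """Binary cross-entropy: -sum_i [y_i log p_i + (1 - y_i) log(1 - p_i)]."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total


# Hypothetical labels and predicted probabilities.
y_true = [1, 0, 1]
y_prob = [0.9, 0.2, 0.6]

nll = negative_log_likelihood(y_true, y_prob)
# exp(-NLL) recovers the product form: 0.9 * (1 - 0.2) * 0.6
likelihood = math.exp(-nll)
print(nll, likelihood)
```

Working with the sum of logs rather than the product avoids numerical underflow when $n$ is large, which is one practical reason the log-likelihood is preferred.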

**MAP**

Maximum A Posteriori estimation includes a prior distribution over the parameters:

$\hat{\theta}_{\text{MAP}} = \arg\max_{\theta} \; p(\theta \mid D)$

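Using Bayes’ rule, $p(\theta \mid D) \propto p(D \mid \theta)\, p(\theta)$. Since the evidence $p(D)$ does not depend on $\theta$, taking logarithms gives: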

$\hat{\theta}_{\text{MAP}}
= \arg\max_{\theta}
\left[
\log p(D \mid \theta) + \log p(\theta)
\right]$

Equivalently, MAP minimizes the negative log-posterior, up to an additive constant:

$J_{\text{MAP}}(\theta)
=
-\ell(\theta) - \log p(\theta)$


**The prior term $\log p(\theta)$ acts as a regularization term. Therefore, MAP estimation corresponds to regularized logistic regression, while MLE corresponds to the unregularized case.**
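To make the link between the prior and regularization concrete, consider one common choice that is assumed here and not specified in the task: a zero-mean Gaussian prior $w \sim \mathcal{N}(0, \tau^2 I)$ on the weights, with a flat prior on $b$. Then $-\log p(w) = \frac{1}{2\tau^2}\lVert w \rVert_2^2 + \text{const}$, so $J_{\text{MAP}}$ becomes the cross-entropy loss plus an L2 penalty with strength $\lambda = \frac{1}{2\tau^2}$. The sketch below evaluates this objective; the function name, the symbol `tau`, and the data are hypothetical.

```python
import math


def map_objective(w, b, X, y, tau=1.0):
    """J_MAP = -log-likelihood + ||w||^2 / (2 tau^2), dropping the prior's constant.

    Assumes a Gaussian prior N(0, tau^2 I) on w and a flat prior on b.
    """
    nll = 0.0
    for x_i, y_i in zip(X, y):
        z = sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b
        # Per-example negative log-likelihood is log(1 + exp(z)) - y*z,
        # with log(1 + exp(z)) computed in an overflow-safe form.
        softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
        nll += softplus - y_i * z
    penalty = sum(w_j * w_j for w_j in w) / (2.0 * tau ** 2)
    return nll + penalty


# Hypothetical data: two examples with two features each.
X = [[1.0, 2.0], [-1.0, 0.5]]
y = [1, 0]

j_zero = map_objective([0.0, 0.0], 0.0, X, y)            # no penalty at w = 0
j_tight = map_objective([1.0, 0.0], 0.0, X, y, tau=1.0)  # strong prior
j_loose = map_objective([1.0, 0.0], 0.0, X, y, tau=1e6)  # nearly flat prior
print(j_zero, j_tight, j_loose)
```

As `tau` grows the prior flattens, the penalty vanishes, and the objective approaches the unregularized MLE loss.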

https://www.geeksforgeeks.org/data-science/mle-vs-map/

**1.2**

The machine learning problem considered for this assignment is binary loan default prediction. Given borrower and loan features
$x \in \mathbb{R}^d$, such as income, debt-to-income ratio, credit utilization, and prior delinquencies, the target variable is defined as:

$y = 1$ if the borrower defaults, and $y = 0$ otherwise.

Logistic regression is an appropriate model for this task because it directly estimates the probability $p(y = 1 \mid x)$, which is needed for risk-based decision making. In addition, its coefficients are easily interpretable as changes in the log-odds of default, making it suitable for financial applications where model interpretability is important.

An alternative linear classifier is the linear Support Vector Machine (SVM). While both models learn linear decision boundaries, logistic regression optimizes a likelihood-based loss and outputs probabilities, whereas a linear SVM minimizes the hinge loss to maximize the margin and does not directly produce probability estimates.
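The difference between the two objectives can be seen by comparing their per-example losses as a function of the margin $m = \tilde{y}\,(w^\top x + b)$. The sketch below assumes the label convention $\tilde{y} \in \{-1, +1\}$ (that is, $\tilde{y} = 2y - 1$), which is customary for SVMs but differs from the $\{0, 1\}$ coding used elsewhere in this document; the function names are illustrative.

```python
import math


def logistic_loss(margin):
    """Logistic loss log(1 + exp(-m)), computed in an overflow-safe form."""
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def hinge_loss(margin):
    """Hinge loss max(0, 1 - m) used by the linear SVM."""
    return max(0.0, 1.0 - margin)


# Compare both losses on a few margins (negative = misclassified).
for m in [-2.0, 0.0, 1.0, 2.0]:
    print(f"margin={m:+.1f}  logistic={logistic_loss(m):.4f}  hinge={hinge_loss(m):.4f}")
```

The hinge loss is exactly zero once the margin reaches 1, so confidently classified points stop influencing the SVM solution, while the logistic loss stays positive for every finite margin and keeps pushing predicted probabilities toward 0 or 1.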

https://www.geeksforgeeks.org/machine-learning/support-vector-machine-algorithm/

**1.3**



For this task, each data point consists of a feature vector $x_i \in \mathbb{R}^d$ representing borrower and loan characteristics, such as income, debt-to-income ratio, credit utilization, and prior delinquencies. The corresponding target variable $y_i \in \{0,1\}$ indicates whether the borrower defaults on the loan.

The logistic regression model estimates the probability of default as:

$p(y_i = 1 \mid x_i) = \sigma(w^\top x_i + b)$.

Here, the parameter vector $w$ captures the influence of each feature on the log-odds of default, while the bias term $b$ represents the baseline log-odds of default when all features are zero.
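Because the model is linear in the log-odds, $\exp(w_j)$ can be read as the multiplicative change in the odds of default for a one-unit increase in feature $j$, holding the other features fixed. The sketch below illustrates this reading with entirely hypothetical coefficients; the feature names and values are assumptions and not fitted results.

```python
import math

# Hypothetical fitted coefficients, chosen only to illustrate the arithmetic.
feature_names = ["income", "debt_to_income", "credit_utilization", "prior_delinquencies"]
w = [-0.4, 0.7, 0.5, 0.9]
b = -2.0

# exp(w_j): multiplicative change in the odds of default per unit of feature j.
odds_ratios = {name: math.exp(w_j) for name, w_j in zip(feature_names, w)}

# sigmoid(b): predicted default probability when every feature equals zero.
baseline_probability = 1.0 / (1.0 + math.exp(-b))

for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio {ratio:.2f}")
print(f"baseline probability: {baseline_probability:.3f}")
```

An odds ratio below 1 (a negative coefficient) lowers the odds of default, and a ratio above 1 raises them. If the features are standardized, "one unit" means one standard deviation.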

The derivation in Task 1.1 assumes that observations are independent and that the binary outcomes follow a Bernoulli distribution. Additionally, logistic regression assumes a linear relationship between the features and the log-odds of the outcome. In practice, features should be appropriately scaled, particularly when regularization is used, and strong multicollinearity should be addressed so that the coefficient estimates remain stable and interpretable.
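The two practical checks mentioned above can be sketched with the standard library alone: z-score standardization for scaling, and a pairwise Pearson correlation as a simple first screen for multicollinearity. The helper names and the toy columns are assumptions for illustration. A high pairwise correlation is only a rough indicator and does not catch every form of collinearity.

```python
import statistics


def standardize(column):
    """Z-score a feature column: subtract the mean, divide by the population std."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0.0:
        return [0.0 for _ in column]  # constant column carries no information
    return [(v - mean) / std for v in column]


def pearson_correlation(a, b):
    """Pearson correlation as the mean product of two standardized columns."""
    za, zb = standardize(a), standardize(b)
    return statistics.fmean([p * q for p, q in zip(za, zb)])


# Hypothetical feature columns (income in thousands, debt-to-income ratio).
income = [30.0, 45.0, 60.0, 90.0]
debt_to_income = [0.5, 0.4, 0.3, 0.1]

income_scaled = standardize(income)
corr = pearson_correlation(income, debt_to_income)
print(income_scaled)
print(f"correlation(income, debt_to_income) = {corr:.3f}")
```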