# Examples of MapReduce in machine learning algorithms
This document covers selected examples of MapReduce usage when applying machine learning algorithms.

## Linear regression
Fitting a linear regression:

$\mathbf{y}_{n\times 1} = \mathbf{X}_{n\times p}~\beta_{p\times 1} + \boldsymbol{\varepsilon}_{n\times 1}$

via least-squares involves 

$\min_\beta \ell(\beta) = \|\mathbf{y} - \mathbf{X} \beta \|^2.$

The corresponding solution is 

$\widehat{\beta} = (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1} \mathbf{X}^\mathsf{T}\mathbf{y}.$

**Observations:**
- What are the components required to compute the estimated coefficients?
- Which step is the slowest?

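Typically, when linear regression makes sense, $n \gg p$, so the $p \times p$ matrix $\mathbf{X}^\mathsf{T}\mathbf{X}$ is small and cheap to invert. The expensive part is forming $\mathbf{X}^\mathsf{T}\mathbf{X}$ and $\mathbf{X}^\mathsf{T}\mathbf{y}$, which requires a pass over all $n$ observations.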

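*Note:* In practice, the system $\mathbf{X}^\mathsf{T}\mathbf{X}\,\beta = \mathbf{X}^\mathsf{T}\mathbf{y}$ is solved with a Cholesky decomposition rather than by forming the inverse explicitly, for numerical stability.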
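
As an illustration of the note above, here is a minimal pure-Python sketch of solving $\mathbf{X}^\mathsf{T}\mathbf{X}\,\beta = \mathbf{X}^\mathsf{T}\mathbf{y}$ through a Cholesky factor. The function names `cholesky` and `cholesky_solve` and the toy values of `xtx` and `xty` are assumptions made for this example; a real pipeline would more likely call a linear algebra library.

```python
import math

def cholesky(a):
    """Return lower-triangular L with L L^T = a (a symmetric positive definite)."""
    p = len(a)
    low = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def cholesky_solve(a, b):
    """Solve a @ beta = b by forward, then backward substitution."""
    p = len(b)
    low = cholesky(a)
    z = [0.0] * p
    for i in range(p):  # forward substitution: L z = b
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    beta = [0.0] * p
    for i in reversed(range(p)):  # backward substitution: L^T beta = z
        beta[i] = (z[i] - sum(low[k][i] * beta[k] for k in range(i + 1, p))) / low[i][i]
    return beta

# Hypothetical 2 x 2 example: X^T X and X^T y for a small data set.
xtx = [[4.0, 6.0], [6.0, 14.0]]
xty = [16.0, 34.0]
beta_hat = cholesky_solve(xtx, xty)
print(beta_hat)
```

Because the factorization works on the $p \times p$ matrix only, its cost does not depend on $n$.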

Notice that both required components are sums over the observations, so they can be accumulated one record at a time:

$\mathbf{X}^\mathsf{T}\mathbf{X} = \sum_{i=1}^n \mathbf{x}_i \mathbf{x}_i^\mathsf{T}, \quad \mathbf{X}^\mathsf{T}\mathbf{y} = \sum_{i=1}^n \mathbf{x}_i y_i.$

`map`:

`reduce`:
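
One possible way to fill in the two steps above is sketched below in plain Python, with the built-in `map` and `functools.reduce` standing in for a real MapReduce framework. The names `map_record`, `reduce_pair`, and `solve`, as well as the toy data, are assumptions made for this sketch and are not taken from the in-class notebook.

```python
from functools import reduce

def map_record(record):
    """Map step: turn one observation (x_i, y_i) into its partial sums."""
    x, y = record
    xxt = [[a * b for b in x] for a in x]  # x_i x_i^T, a p x p matrix
    xy = [a * y for a in x]                # x_i y_i, a length-p vector
    return xxt, xy

def reduce_pair(left, right):
    """Reduce step: add two partial (X^T X, X^T y) pairs elementwise."""
    (a_xxt, a_xy), (b_xxt, b_xy) = left, right
    xxt = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a_xxt, b_xxt)]
    xy = [u + v for u, v in zip(a_xy, b_xy)]
    return xxt, xy

def solve(a, b):
    """Solve the small p x p system a @ beta = b by Gauss-Jordan elimination."""
    p = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(p):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, p), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(p):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][p] / m[i][i] for i in range(p)]

# Toy data generated exactly from y = 1 + 2 * x; the leading 1.0 is the intercept.
records = [([1.0, 0.0], 1.0), ([1.0, 1.0], 3.0), ([1.0, 2.0], 5.0), ([1.0, 3.0], 7.0)]

xtx, xty = reduce(reduce_pair, map(map_record, records))
beta_hat = solve(xtx, xty)
print(beta_hat)
```

Since addition is associative and commutative, the partial sums can be combined in any order, which is what lets the reduce step run across partitions. Only the final $p \times p$ solve happens on a single machine.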

### [Colab in-class example](https://github.com/mosesyhc/de300-wn2024-notes/blob/main/examples/ex-linear-mr-class.ipynb)

## Logistic regression
Fitting a logistic regression typically maximizes the log-likelihood function:

$\ell(\beta) = \sum_{i=1}^{n} \left[ y_i \log(p(\mathbf{x}_i)) + (1 - y_i) \log(1 - p(\mathbf{x}_i)) \right], \quad \log \frac{p(\mathbf{x}_i)}{1 - p(\mathbf{x}_i)} = \beta^\mathsf{T}\mathbf{x}_i.$

Or equivalently,

$\ell(\beta) = \sum_{i=1}^n \left[ y_i (\beta^\mathsf{T}\mathbf{x}_i) - \log (1 + \exp\{\beta^\mathsf{T}\mathbf{x}_i\}) \right].$
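
To see why the two forms agree, write $\eta_i = \beta^\mathsf{T}\mathbf{x}_i$ (a shorthand introduced here for the derivation only). Inverting the log-odds relation gives

$p(\mathbf{x}_i) = \frac{e^{\eta_i}}{1 + e^{\eta_i}}, \quad \log p(\mathbf{x}_i) = \eta_i - \log(1 + e^{\eta_i}), \quad \log(1 - p(\mathbf{x}_i)) = -\log(1 + e^{\eta_i}).$

Substituting into the $i$-th term of the log-likelihood:

$y_i \log p(\mathbf{x}_i) + (1 - y_i) \log(1 - p(\mathbf{x}_i)) = y_i \eta_i - \log(1 + e^{\eta_i}).$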

How do we design the MapReduce parts for logistic regression?

`map`:

`reduce`:
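
One possible design, sketched under stated assumptions: the gradient of the log-likelihood is $\nabla \ell(\beta) = \sum_{i=1}^n (y_i - p(\mathbf{x}_i))\,\mathbf{x}_i$, again a sum over observations, so the map step can emit each observation's gradient term and the reduce step can add them. The sketch below uses full-batch gradient ascent with a fixed step size rather than SGD, to keep the example short. The names `make_mapper`, `reduce_grad`, and `log_likelihood`, the step size, the iteration count, and the toy data are all assumptions made for illustration.

```python
import math
from functools import reduce

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def make_mapper(beta):
    """Build the map function for the current beta (broadcast to all workers)."""
    def map_record(record):
        x, y = record
        p = sigmoid(sum(b * a for b, a in zip(beta, x)))
        return [(y - p) * a for a in x]  # gradient term (y_i - p(x_i)) x_i
    return map_record

def reduce_grad(left, right):
    """Reduce step: add two partial gradients elementwise."""
    return [u + v for u, v in zip(left, right)]

def log_likelihood(beta, records):
    """Evaluate the log-likelihood, used here only to monitor progress."""
    total = 0.0
    for x, y in records:
        t = sum(b * a for b, a in zip(beta, x))
        total += y * t - math.log(1.0 + math.exp(t))
    return total

# Toy, non-separable data; the leading 1.0 in each x_i is the intercept.
records = [
    ([1.0, -2.0], 0), ([1.0, -1.0], 0), ([1.0, 0.0], 1),
    ([1.0, 0.5], 0), ([1.0, 1.0], 1), ([1.0, 2.0], 1),
]

beta = [0.0, 0.0]
step = 0.1  # assumed fixed step size
for _ in range(200):
    # One MapReduce pass per iteration: map with the current beta, then sum.
    grad = reduce(reduce_grad, map(make_mapper(beta), records))
    beta = [b + step * g for b, g in zip(beta, grad)]

print(beta, log_likelihood(beta, records))
```

Unlike the linear regression case, the map function depends on the current $\beta$, so every iteration needs a fresh pass over the data.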

### Difference between logistic and linear regression
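The maximum likelihood estimator $\widehat{\beta}$ does not admit a closed-form solution. Unlike linear regression, where a single MapReduce pass yields everything needed to solve for $\widehat{\beta}$, logistic regression must be fitted iteratively, for example with stochastic gradient descent (SGD), and each update requires new map and reduce computations with the current value of $\beta$.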

```{figure} ../img/logreg-withSGD.png
---
width: 80%
name: logreg-SGD
---
MapReduce implementation snippet of logistic regression with SGD.
```