# L6a: The Multiplicative Weights Algorithm (MWA)
In this lecture, we will discuss the Multiplicative Weights Algorithm (MWA) and its applications in online learning. The key ideas discussed in this lecture are as follows:
* [Online Learning](https://en.wikipedia.org/wiki/Online_machine_learning) is a type of machine learning where the model learns from a stream of data. Thus, unlike traditional supervised machine learning, the model is not trained on a fixed dataset. Instead, it learns from new data as it arrives. This idea dates back to the 1950s and has been studied extensively in the context of game theory, optimization, and machine learning.
* [The Multiplicative Weights Algorithm (MWA)](https://en.wikipedia.org/wiki/Multiplicative_weight_update_method) is a simple and powerful algorithm for online learning. The MWA is a type of sequential prediction algorithm that updates the weights of a set of experts based on their performance on past predictions. The key idea is to assign higher weights to experts that perform well and lower weights to experts that perform poorly. This allows the algorithm to adapt to changing data distributions and learn from its mistakes.

Today's notes draw on several sources: [Arora et al., The Multiplicative Weights Update Method: A Meta-Algorithm and Applications, Theory of Computing, Volume 8 (2012), pp. 121–164](https://theoryofcomputing.org/articles/v008a006/v008a006.pdf), the [15-859 CMU Lecture 16](https://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15859-f11/www/notes/lecture16.pdf), and the [15-859 CMU Lecture 17](https://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15859-f11/www/notes/lecture17.pdf).

## Weighted Majority Algorithm
The weighted majority algorithm is one implementation of the multiplicative weights algorithm. It addresses what is known as the Prediction from Expert Advice problem.

Suppose we play a game between an omniscient Adversary and an Aggregator, e.g., a decision-making agent advised by $n$ experts. The game proceeds in rounds $t = 1, 2, \ldots, T$. In each round, the Aggregator makes a _binary_ decision $\hat{y}_t \in \{-1, 1\}$, and the Adversary then reveals the true outcome $y_t \in \{-1, 1\}$; the Aggregator makes a mistake when $\hat{y}_t \neq y_t$. The Aggregator's goal is to minimize its regret, i.e., the difference between its own loss (number of mistakes) and the loss of the best expert in hindsight.

Select a parameter $0 < \eta \leq 1/2$. Initially, the Aggregator assigns a weight of $w_1^i = 1$ to each expert $i$. In each round $t$, the following steps occur:
1. The Aggregator receives the experts' predictions and predicts their weighted majority, i.e., the outcome backed by the larger total weight under the current weights $\left\{w_t^i \mid i = 1,2,\dots,n\right\}$.
2. The Adversary reveals the true outcome $y_t\in\left\{-1,1\right\}$. 
3. For every expert $i$ which predicts _incorrectly_, the Aggregator updates the weight as $w_{t+1}^i = w_t^i\cdot(1-\eta)$. The weights of experts that predict correctly are left unchanged, $w_{t+1}^i = w_t^i$.
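
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `weighted_majority`, its argument layout, and the rule of breaking ties toward $+1$ are all assumptions made for this example (the notes do not specify a tie-breaking rule).

```python
def weighted_majority(expert_predictions, outcomes, eta=0.5):
    """Run the weighted majority algorithm on a recorded game.

    expert_predictions[t][i] is expert i's prediction in round t (-1 or +1).
    outcomes[t] is the true outcome revealed by the Adversary (-1 or +1).
    Returns (aggregator_mistakes, expert_mistakes, final_weights).
    """
    if not 0 < eta <= 0.5:
        raise ValueError("eta must satisfy 0 < eta <= 1/2")

    n = len(expert_predictions[0])
    weights = [1.0] * n          # w_1^i = 1 for every expert
    aggregator_mistakes = 0
    expert_mistakes = [0] * n

    for preds, y in zip(expert_predictions, outcomes):
        # Step 1: predict the weighted majority of the experts' predictions.
        vote = sum(w * p for w, p in zip(weights, preds))
        y_hat = 1 if vote >= 0 else -1   # tie-break toward +1 (assumption)

        # Step 2: the true outcome y is revealed.
        if y_hat != y:
            aggregator_mistakes += 1

        # Step 3: penalize every expert that predicted incorrectly.
        for i, p in enumerate(preds):
            if p != y:
                expert_mistakes[i] += 1
                weights[i] *= 1 - eta

    return aggregator_mistakes, expert_mistakes, weights


# Toy game with three hypothetical experts:
# expert 0 is always right, expert 1 is always wrong, expert 2 always says +1.
outcomes = [1, -1, -1, 1, -1]
expert_predictions = [[y, -y, 1] for y in outcomes]

M, m, w = weighted_majority(expert_predictions, outcomes, eta=0.5)
print("aggregator mistakes:", M)
print("expert mistakes:", m)
print("final weights:", w)
```

In this toy game the Aggregator errs only once, in the second round, before the weights of the two unreliable experts have decayed enough for the always-correct expert to dominate the vote.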

After $T$ rounds, let $m_i^{(T)}$ be the number of mistakes made by expert $i$ and $M^{(T)}$ be the number of mistakes made by the Aggregator. Then the following bound holds for every expert $i$:
$$
M^{(T)} \leq 2\left(1+\eta\right)m_i^{(T)} + \frac{2\ln{n}}{\eta}
$$
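
A short sketch of why the bound holds may help; it follows the standard potential-function argument (see Arora et al.). The potential $\Phi_t$ is notation introduced here for the sketch. Define the total weight

$$
\Phi_t = \sum_{i=1}^{n} w_t^i, \qquad \Phi_1 = n.
$$

Whenever the Aggregator makes a mistake in round $t$, experts holding at least half of the total weight predicted incorrectly, and their weights shrink by a factor of $(1-\eta)$. Hence

$$
\Phi_{t+1} \leq \Phi_t\left(\frac{1}{2} + \frac{1}{2}(1-\eta)\right) = \Phi_t\left(1 - \frac{\eta}{2}\right)
\quad\Longrightarrow\quad
\Phi_{T+1} \leq n\left(1 - \frac{\eta}{2}\right)^{M^{(T)}}.
$$

On the other hand, the total weight is at least the weight of any single expert $i$:

$$
\Phi_{T+1} \geq w_{T+1}^i = (1-\eta)^{m_i^{(T)}}.
$$

Combining the two inequalities and taking logarithms gives

$$
m_i^{(T)} \ln(1-\eta) \leq \ln{n} + M^{(T)} \ln\left(1 - \frac{\eta}{2}\right).
$$

Using $\ln(1 - \eta/2) \leq -\eta/2$ and, for $\eta \leq 1/2$, $-\ln(1-\eta) \leq \eta + \eta^2$, we get

$$
\frac{\eta}{2} M^{(T)} \leq \ln{n} + \left(\eta + \eta^2\right) m_i^{(T)},
$$

and dividing by $\eta/2$ yields the stated bound.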