# Notes

> The goal of meta-learning is to train a model on a variety of learning tasks, such that it can solve new learning tasks using only a small number of training samples.

> In effect, our method trains the model to be easy to fine-tune.

> The primary contribution of this work is a simple model- and task-agnostic algorithm for meta-learning that trains a model’s parameters such that a small number of gradient updates will lead to fast learning on a new task.

### Model-Agnostic Meta-Learning

**1. Meta-Learning Problem Set-Up**

> The model or learner is trained during a meta-learning phase on a set
> of tasks, such that the trained model can quickly adapt to new tasks using only a small number of examples or trials.

> In effect, the meta-learning problem treats entire tasks as training examples.

We specifically deal with a model $f$ that maps inputs $x$ to outputs $a$. We consider an individual task:

$$
\mathcal{T} = \{ \mathcal{L}(x_1, a_1, \dots, x_H, a_H), q(x_1), q(x_{t+1}|x_t, a_t), H \}
$$

Here $\mathcal{L}$ is the loss function, $q(x_1)$ is the distribution over initial observations, $q(x_{t+1}|x_t,a_t)$ is the transition distribution, and $H$ is the episode length. For i.i.d. supervised learning problems, $H = 1$.

Given a distribution over tasks $p(\mathcal{T})$, we want our model to be able to learn a new task $\mathcal{T}_i$ drawn from $p(\mathcal{T})$ using only $K$ samples.

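At each meta-training step, a task $\mathcal{T}_i$ is sampled from $p(\mathcal{T})$ and the model is trained on $K$ samples from it using gradient descent. The adapted model is then evaluated on new samples from the same task, and that test loss is used to improve $f$.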

> In effect, the test error on sample tasks $\mathcal{T}_i$ serves as the training error of the meta-learning process.

**2. A Model-Agnostic Meta-Learning Algorithm**

> The intuition behind this approach is that some internal representations are more transferrable than others.

Some internal representations are easier to fine-tune to new tasks than others. We want to push the model toward parameters that are sensitive to the task loss: a small change to the parameters, taken along the gradient of a new task's loss, should produce a large improvement in that loss.

![Screenshot 2024-11-06 at 10.03.33 PM.png](../../../images/Screenshot_2024-11-06_at_10.03.33_PM.png)

The algorithm performs one or more gradient descent steps on each sampled task to obtain task-specific parameters. It then sums the validation losses of those adapted parameters across tasks and uses the gradient of that sum, taken with respect to the original parameters, to update them with the meta step size $\beta$.
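
Written out for a single inner gradient step, the two updates look roughly like this. The inner step size $\alpha$ and the adapted parameters $\theta_i'$ are symbols taken from the paper's notation rather than from these notes:

$$
\theta_i' = \theta - \alpha \nabla_\theta \mathcal{L}_{\mathcal{T}_i}(f_\theta)
$$

$$
\theta \leftarrow \theta - \beta \nabla_\theta \sum_{\mathcal{T}_i \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_i}(f_{\theta_i'})
$$

The first line is ordinary fine-tuning on task $\mathcal{T}_i$. The second line differentiates the post-adaptation loss with respect to the pre-adaptation parameters $\theta$, which is what makes the meta-update different from plain multi-task training.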

![Screenshot 2024-11-06 at 10.05.40 PM.png](../../../images/Screenshot_2024-11-06_at_10.05.40_PM.png)
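
To make the two-level update concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: it assumes a toy one-parameter model $f(x) = wx$, a made-up task family $y = ax$ with a task-specific slope $a$, and a mean-squared-error loss, so that the gradient and second derivative can be written by hand. All function names (`sample_task`, `maml_step`, `train`) and hyperparameter values are hypothetical.

```python
import random


def mse_grad(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to w."""
    return sum(2.0 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)


def mse_hessian(xs):
    """Second derivative of the same loss; constant in w for a linear model."""
    return sum(2.0 * x * x for x in xs) / len(xs)


def sample_task(rng, k):
    """Hypothetical task family: y = a*x with a task-specific slope a."""
    a = rng.uniform(0.0, 4.0)

    def draw():
        xs = [rng.uniform(-1.0, 1.0) for _ in range(k)]
        return xs, [a * x for x in xs]

    return draw(), draw()  # (support set, query set)


def maml_step(w, tasks, alpha, beta):
    """One meta-update over a batch of tasks with a single inner step."""
    meta_grad = 0.0
    for (xs_s, ys_s), (xs_q, ys_q) in tasks:
        # Inner loop: adapt to the task using its support set.
        w_adapted = w - alpha * mse_grad(w, xs_s, ys_s)
        # Derivative of the adapted parameter w.r.t. the initial one.
        # This is where the second derivative of the inner loss enters.
        d_adapted_d_w = 1.0 - alpha * mse_hessian(xs_s)
        # Outer loop: query loss at the adapted parameter, chained back to w.
        meta_grad += mse_grad(w_adapted, xs_q, ys_q) * d_adapted_d_w
    return w - beta * meta_grad


def train(steps=1000, tasks_per_step=4, k=5, alpha=0.1, beta=0.02, seed=0):
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        tasks = [sample_task(rng, k) for _ in range(tasks_per_step)]
        w = maml_step(w, tasks, alpha, beta)
    return w
```

In this toy setting the learned initialization should settle near the middle of the slope range, since that is the starting point from which one gradient step helps most on average. With a neural network the hand-written derivatives would be replaced by automatic differentiation through the inner update.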

### Experimental Evaluation

**1. Regression**

The first task is training a model to fit a new sine curve, where the amplitude and phase of the curve vary between tasks.

They test the model by fine-tuning it on $K \in \{5, 10, 25\}$ samples from a new task.
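
A sketch of how such regression tasks could be generated. The ranges below (amplitude in $[0.1, 5.0]$, phase in $[0, \pi]$, inputs in $[-5, 5]$) are the ones I recall from the paper's experimental setup, and the exact form $A \sin(x - \varphi)$ is an assumption. The function names are hypothetical.

```python
import math
import random


def sample_sine_task(rng):
    """Draw one task: a sine curve with its own amplitude and phase."""
    amplitude = rng.uniform(0.1, 5.0)
    phase = rng.uniform(0.0, math.pi)

    def target(x):
        return amplitude * math.sin(x - phase)

    return target


def sample_points(task, k, rng):
    """Draw K input/output pairs from a task for fine-tuning or evaluation."""
    xs = [rng.uniform(-5.0, 5.0) for _ in range(k)]
    return xs, [task(x) for x in xs]


rng = random.Random(0)
task = sample_sine_task(rng)
xs, ys = sample_points(task, 5, rng)  # a K = 5 fine-tuning set
```

Each call to `sample_sine_task` plays the role of drawing $\mathcal{T}_i$ from $p(\mathcal{T})$, and `sample_points` draws the $K$ examples the model is allowed to adapt on.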

![Screenshot 2024-11-06 at 10.12.15 PM.png](../../../images/Screenshot_2024-11-06_at_10.12.15_PM.png)

MAML adapts to a new sine curve far better than a model that was pre-trained on all tasks and then fine-tuned, even when only a handful of points are available.

**2. Classification**

They also evaluate MAML on few-shot $N$-way classification using the Omniglot and MiniImagenet datasets. Each task asks the model to classify among $N$ randomly selected classes, given only a few labeled examples of each.
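
As an illustration of what one such classification task looks like, the sketch below builds a single $N$-way, $K$-shot episode from a dictionary mapping class names to lists of examples. The function name, the `n_query` parameter, and the data layout are assumptions for this example and not the paper's data pipeline.

```python
import random


def sample_episode(data_by_class, n_way, k_shot, n_query, rng):
    """Build one N-way K-shot task with disjoint support and query sets."""
    # Pick N classes at random and relabel them 0..N-1 for this task only.
    classes = rng.sample(sorted(data_by_class), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        examples = rng.sample(data_by_class[cls], k_shot + n_query)
        support += [(x, label) for x in examples[:k_shot]]
        query += [(x, label) for x in examples[k_shot:]]
    return support, query


# Toy stand-in for a dataset: 7 classes with 10 string "images" each.
toy_data = {c: [f"{c}{i}" for i in range(10)] for c in "abcdefg"}
support, query = sample_episode(toy_data, n_way=5, k_shot=1, n_query=3,
                                rng=random.Random(0))
```

The support set is what the model fine-tunes on, and the query set supplies the validation loss used in the meta-update. Because labels are reassigned per episode, the model cannot memorize a fixed class-to-output mapping and has to rely on adaptation.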

> MAML compares well to the state-of-the-art results on this task, narrowly outperforming the prior methods.

> A significant computational expense in MAML comes from the use of second derivatives when back-propagating the meta-gradient through the gradient operator in the meta-objective.
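
To see where the second derivatives come from, apply the chain rule to the post-adaptation loss for a single inner step $\theta_i' = \theta - \alpha \nabla_\theta \mathcal{L}_{\mathcal{T}_i}(f_\theta)$, where $\alpha$ is the inner step size (a symbol from the paper, not defined elsewhere in these notes):

$$
\nabla_\theta \mathcal{L}_{\mathcal{T}_i}(f_{\theta_i'}) = \left( I - \alpha \nabla_\theta^2 \mathcal{L}_{\mathcal{T}_i}(f_\theta) \right) \nabla_{\theta_i'} \mathcal{L}_{\mathcal{T}_i}(f_{\theta_i'})
$$

The Hessian term $\nabla_\theta^2 \mathcal{L}_{\mathcal{T}_i}(f_\theta)$ is the expensive part. A first-order approximation drops it and treats the factor in parentheses as the identity, so the meta-gradient reduces to the ordinary gradient evaluated at the adapted parameters.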

**3. Reinforcement Learning**

They use REINFORCE (vanilla policy gradient) for the inner-loop updates and TRPO as the meta-optimizer.
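
As I recall the paper's RL formulation, the task loss is the negative expected return of the policy $f_\phi$ over an episode of length $H$. The reward function $R_i$ and the parameter symbol $\phi$ come from the paper's notation and do not appear elsewhere in these notes:

$$
\mathcal{L}_{\mathcal{T}_i}(f_\phi) = -\mathbb{E}_{x_t, a_t \sim f_\phi, q_{\mathcal{T}_i}} \left[ \sum_{t=1}^{H} R_i(x_t, a_t) \right]
$$

Because the expectation depends on the environment dynamics, this loss is not directly differentiable, which is why policy gradient estimators are needed for both the inner update and the meta-update.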

They first test MAML on a simple 2D target navigation problem.

> The results show that MAML can learn a model that adapts much more quickly in a single gradient update, and furthermore continues to improve with additional updates

It also performs well on learning locomotion.

### Discussion

> Our approach has a number of benefits. It is simple and does not introduce any learned parameters for meta-learning.

> It can be combined with any model representation that is amenable to gradient-based training, and any differentiable objective, including classification, regression, and reinforcement learning.

MAML is highly flexible.

> Reusing knowledge from past tasks may be a crucial ingredient in making high-capacity scalable models, such as deep neural networks, amenable to fast training with small datasets.
