<img
     align="left"
     src="src/uni_logo_black.png"
     alt="Universität Bielefeld"
     width="20%"
/>

# Miniproject 10: Meta-Learning with Reptile
#### József Lurvig
19. July 2022

*A. Nichol, J. Achiam, J. Schulman (2018): On First-Order Meta-Learning Algorithms*

This paper considers meta-learning problems, where there is a distribution of tasks, and we would like to obtain an agent that performs well (i.e., learns quickly) when presented with a previously unseen task sampled from this distribution.

## Problem
There are many tasks, and training a separate model from scratch for each of them takes a long time.

## Content
- First order MAML
- Reptile

### Human vs ML
ML systems have surpassed humans at many tasks, but they need far more data to reach the same level of performance. This comparison is not entirely fair, however: these algorithms have to start from scratch, whereas humans enter a task with a large amount of prior knowledge, encoded in their brains and DNA. Humans do not learn from scratch every time; they fine-tune and recombine a set of pre-existing skills.

Tenenbaum et al. argue that this fast learning can be explained as Bayesian inference, so the key would be to make our algorithms more Bayesian. This is difficult in practice, because such algorithms should make use of deep neural networks and still be computationally feasible.

# Model-agnostic meta-learning (MAML)

Model-Agnostic Meta-Learning (MAML) was introduced in 2017 by Chelsea Finn et al. Given a sequence of tasks, the parameters of a given model are trained such that a few iterations of gradient descent with little training data from a new task lead to good generalization performance on that task. MAML "trains the model to be easy to fine-tune." MAML was successfully applied to few-shot image classification benchmarks and to policy-gradient-based reinforcement learning. **[[1](https://en.wikipedia.org/wiki/Meta_learning_(computer_science))]**

MAML is compatible with any model trained with gradient descent and is applicable to a variety of learning problems. The goal of meta-learning is to train a model on a variety of learning tasks, such that it can solve new learning tasks using only a small number of training samples. **[[2](https://arxiv.org/abs/1703.03400)]**

1. MAML learns an initialization from which a model can be adapted to a new task with only a few gradient steps.
2. The model is meta-trained on a variety of tasks and then fine-tuned for a specific task.
3. Full MAML differentiates through the inner-loop updates, which requires second derivatives; first-order MAML (FOMAML) ignores these terms and is cheaper to compute.
4. MAML is model-agnostic: it does not depend on a particular architecture, only on the model being trained with gradient descent.
5. Reference code for MAML has been released publicly by its authors.
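
To make the difference between MAML and its first-order variant concrete, the objective and gradients can be written in the notation of Nichol et al. This is a sketch: $U_{\tau,A}$ denotes the inner-loop update operator on a training mini-batch $A$ of task $\tau$, $L_{\tau,B}$ the loss on a held-out mini-batch $B$, and $\tilde{\phi} = U_{\tau,A}(\phi)$ the adapted parameters.

$$
\min_{\phi}\; \mathbb{E}_{\tau}\left[L_{\tau,B}\left(U_{\tau,A}(\phi)\right)\right]
$$

$$
g_{\text{MAML}} = U_{\tau,A}'(\phi)\, L_{\tau,B}'(\tilde{\phi}),
\qquad
g_{\text{FOMAML}} = L_{\tau,B}'(\tilde{\phi})
$$

FOMAML treats the Jacobian $U_{\tau,A}'(\phi)$ as the identity, which is what removes the second derivatives from the computation.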

## Test description
We assume access to a distribution over tasks, from which a training set of tasks and a test set of tasks are sampled.

The algorithm is given the training set of tasks and produces an agent that has good average performance on the test set of tasks. Since each task is itself a learning problem, performing well on a task means learning quickly.

# Reptile

- a new first-order gradient-based meta-learning algorithm

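The Reptile algorithm is as follows:
> Initialize $\phi$, the vector of initial parameters\
> **for** iteration $= 1, 2, \dots$ **do**
> > Sample task $\tau$, corresponding to loss $L_{\tau}$ on weight vectors $\tilde{\phi}$\
> > Compute $\tilde{\phi} = U_{\tau}^{k}(\phi)$, denoting $k$ steps of SGD or Adam\
> > Update $\phi \leftarrow \phi + \epsilon(\tilde{\phi} - \phi)$
>
> **end for**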
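
The loop above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the inner optimizer is plain gradient descent with $k$ steps, the outer step size is called `epsilon`, and the task family (quadratic losses $L_{\tau}(\phi) = \tfrac{1}{2}\lVert\phi - c_{\tau}\rVert^{2}$ with centers $c_{\tau}$ scattered around a fixed point) is a hypothetical toy example chosen so that the result is easy to check. All function names and hyperparameter values are assumptions.

```python
import numpy as np

def inner_sgd(phi, grad_fn, k, inner_lr):
    """Compute phi_tilde = U_tau^k(phi): k gradient-descent steps starting at phi."""
    phi_tilde = phi.copy()
    for _ in range(k):
        phi_tilde = phi_tilde - inner_lr * grad_fn(phi_tilde)
    return phi_tilde

def reptile(phi, sample_task, n_iterations, k=5, inner_lr=0.1, epsilon=0.1):
    """Run the Reptile outer loop and return the meta-learned initialization."""
    for _ in range(n_iterations):
        grad_fn = sample_task()                           # sample task tau
        phi_tilde = inner_sgd(phi, grad_fn, k, inner_lr)  # adapt to tau
        phi = phi + epsilon * (phi_tilde - phi)           # move phi toward phi_tilde
    return phi

# Hypothetical toy task family: L_tau(phi) = 0.5 * ||phi - c_tau||^2,
# with task centers c_tau scattered around a common point `center`.
rng = np.random.default_rng(0)
center = np.array([1.0, -2.0])

def sample_task():
    c_tau = center + 0.1 * rng.standard_normal(2)
    return lambda phi: phi - c_tau  # gradient of L_tau at phi

phi_final = reptile(np.zeros(2), sample_task, n_iterations=500)
print(phi_final)  # should end up close to `center`
```

In this toy setting every task optimum lies near `center`, so an initialization that adapts quickly to all tasks is one close to `center`, which is where the outer loop drifts.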

## Case Study: One-Dimensional Sine Wave Regression

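- Consider 1D sine wave regression, where each task $\tau = (a, b)$ is one sine wave.

1. $f_{\tau}(x) = a\cdot{\sin(x+b)}$, where amplitude $a \sim U([0.1, 5])$ and phase $b \sim U([0, 2\pi])$
1. Sample $p$ points $x_{1}, x_{2}, \dots, x_{p} \sim U([-5, 5])$
1. Learner sees $(x_{1}, y_{1}), (x_{2}, y_{2}), \dots, (x_{p}, y_{p})$ with $y_{i} = f_{\tau}(x_{i})$ and predicts the whole function $f(x)$
1. Loss is the $L^{2}$ error on the whole interval $[-5, 5]$: $L_{\tau}(f) = \int_{-5}^{5}dx||f(x)-f_{\tau}(x)||^{2}$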

Note that the average function is zero everywhere, $\mathbb{E}_{\tau}[f_{\tau}(x)]=0$, due to the random phase $b$.
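
The task distribution described above can be sketched as follows. The helper names (`sample_task`, `f_tau`, `sample_points`, `l2_loss`) are hypothetical, the integral is approximated by an average over a uniform grid, and the zero-mean property is checked empirically by averaging many sampled sine waves.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task(rng):
    """Draw a task tau = (a, b): amplitude a ~ U([0.1, 5]), phase b ~ U([0, 2*pi])."""
    return rng.uniform(0.1, 5.0), rng.uniform(0.0, 2.0 * np.pi)

def f_tau(x, a, b):
    """Target function of the task."""
    return a * np.sin(x + b)

def sample_points(a, b, p, rng):
    """Draw p training inputs from U([-5, 5]) together with their targets."""
    x = rng.uniform(-5.0, 5.0, size=p)
    return x, f_tau(x, a, b)

def l2_loss(f, a, b, n_grid=2001):
    """Approximate the integral of (f - f_tau)^2 over [-5, 5] on a uniform grid."""
    xs = np.linspace(-5.0, 5.0, n_grid)
    return 10.0 * np.mean((f(xs) - f_tau(xs, a, b)) ** 2)

a, b = sample_task(rng)
x_train, y_train = sample_points(a, b, p=10, rng=rng)
zero_loss = l2_loss(np.zeros_like, a, b)  # loss of the constant-zero predictor

# Empirical check that the average function over tasks is close to zero.
n_tasks = 20000
a_all = rng.uniform(0.1, 5.0, size=n_tasks)
b_all = rng.uniform(0.0, 2.0 * np.pi, size=n_tasks)
xs = np.linspace(-5.0, 5.0, 11)
mean_f = np.mean(a_all[:, None] * np.sin(xs[None, :] + b_all[:, None]), axis=0)
print(np.max(np.abs(mean_f)))
```

Although the average function is zero, the constant-zero predictor still has a strictly positive loss on every individual task, which `zero_loss` illustrates for one sampled task.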

## Example

- few-shot regression
- training on 10 sampled points
- 32 gradient steps
- MLP with layers $1 \to 64 \to 64 \to 1$

| ![](src/sine_maml.png) | ![](src/sine_reptile.png) |
| :----: | :----: |
| After MAML training | After Reptile training |
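
The fine-tuning step of this example (32 gradient steps on 10 sampled points with an MLP of layers $1 \to 64 \to 64 \to 1$) can be sketched in NumPy. Several details are assumptions not stated above: tanh activations, full-batch gradient descent on the mean squared error of the sampled points, a conservative step size of 0.001, a fixed hypothetical task ($a = 2$, $b = 0.5$), and a random initialization instead of one meta-learned by MAML or Reptile.

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [1, 64, 64, 1]  # layer widths of the MLP

def init_params(rng):
    """Random (not meta-learned) weights and zero biases for each layer."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Return both hidden activations and the network output for inputs x of shape (N, 1)."""
    (W1, b1), (W2, b2), (W3, b3) = params
    h1 = np.tanh(x @ W1 + b1)
    h2 = np.tanh(h1 @ W2 + b2)
    return h1, h2, h2 @ W3 + b3

def loss_and_grads(params, x, y):
    """Mean squared error on (x, y) and its gradients via manual backpropagation."""
    (W1, b1), (W2, b2), (W3, b3) = params
    h1, h2, out = forward(params, x)
    loss = np.mean((out - y) ** 2)
    d_out = 2.0 * (out - y) / len(x)
    d_z2 = (d_out @ W3.T) * (1.0 - h2 ** 2)  # through the second tanh
    d_z1 = (d_z2 @ W2.T) * (1.0 - h1 ** 2)   # through the first tanh
    grads = [(x.T @ d_z1, d_z1.sum(axis=0)),
             (h1.T @ d_z2, d_z2.sum(axis=0)),
             (h2.T @ d_out, d_out.sum(axis=0))]
    return loss, grads

def fine_tune(params, x, y, n_steps=32, lr=0.001):
    """Take n_steps full-batch gradient-descent steps on the sampled points."""
    for _ in range(n_steps):
        _, grads = loss_and_grads(params, x, y)
        params = [(W - lr * gW, b - lr * gb)
                  for (W, b), (gW, gb) in zip(params, grads)]
    return params

# Hypothetical task: one sine wave observed at 10 random points in [-5, 5].
x = rng.uniform(-5.0, 5.0, size=(10, 1))
y = 2.0 * np.sin(x + 0.5)

params = init_params(rng)
n_params = sum(W.size + b.size for W, b in params)
loss_before, _ = loss_and_grads(params, x, y)
loss_after, _ = loss_and_grads(fine_tune(params, x, y), x, y)
print(n_params, loss_before, loss_after)
```

From a random initialization, 32 small steps only reduce the training loss slightly; the point of meta-learning is to find an initialization from which the same 32 steps fit the sine wave well.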

### Analysis

- two alternative explanations of why Reptile works

#### 1. Leading Order Expansion of the Update

#### 2. Finding a Point Near All Solution Manifolds