## 08. PyTorch Paper Replicating

### Milestone Project 2: PyTorch Paper Replicating

=> Replicating a machine learning research paper and creating a Vision Transformer (ViT) from scratch using PyTorch.

#### • What is paper replicating?

=> Many advances in machine learning get published in research papers.

**=> Goal of paper replicating:** replicate these advances with code -> use the techniques for your own problem

**=> Involves:** *turning a machine learning paper made up of images/diagrams, math and text into usable code, in this case usable PyTorch code. The diagrams, math equations and text here come from the `ViT paper`.*

#### • What is a machine learning research paper?
> (1) **Abstract** -> An overview/summary of the paper's main findings/contributions
>
> (2) **Introduction** -> What is the paper's main problem? What previous methods were used to try and solve it?
>
> (3) **Method** -> How did the researchers go about conducting their research? -> what model(s), data sources, training setups were used?
>
> (4) **Results** -> outcomes -> If a new type of model or training setup was used, how did the results compare to previous works?
>
> (5) **Conclusion** -> limitations of the suggested methods? next steps for the research community?
>
> (6) **References** -> what resources/other papers did the researchers look at to build their own body of work?
>
> (7) **Appendix** -> any extra resources/findings to look at
>


#### • Where to find code examples for ML research papers?
> (1) **arXiv** -> a free and open resource for reading technical articles on everything from physics to computer science
>
> (2) **AK Twitter** -> The AK Twitter account publishes machine learning research highlights almost every day, often with live demos
>
> (3) **Papers with Code** -> collection of trending, active and greatest machine learning papers, many of which include code resources attached. Also includes a collection of common machine learning datasets, benchmarks and current state-of-the-art models.
>
> (4) **lucidrains' `vit-pytorch` GitHub repository** -> Less of a place to find research papers and more of an example of what paper replicating with code looks like on a larger scale and with a specific focus.
>
> ...

### 0. Get Setup

=> replicate the machine learning research paper `An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale` (ViT paper) with PyTorch - https://arxiv.org/abs/2010.11929

=> The `Transformer neural network architecture` was originally introduced in the machine learning research paper `Attention Is All You Need` - https://arxiv.org/abs/1706.03762

=> A `Transformer architecture` is generally considered to be any neural network that uses the `attention mechanism` as its **primary learning layer**, similar to how a convolutional neural network (CNN) uses convolutions as its primary learning layer.
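
=> A minimal sketch of attention acting as a learning layer, using PyTorch's built-in `nn.MultiheadAttention`. The sizes below (`seq_len`, `embed_dim`, `num_heads`) are arbitrary values picked for illustration and are not settings from either paper.

```python
import torch
from torch import nn

torch.manual_seed(42)

# Hypothetical sizes, chosen only to keep the example small
batch_size, seq_len, embed_dim, num_heads = 1, 4, 8, 2

# A batch of 1 sequence with 4 tokens, each an 8-dimensional embedding
x = torch.randn(batch_size, seq_len, embed_dim)

# batch_first=True -> inputs are (batch, sequence, embedding)
attention = nn.MultiheadAttention(embed_dim=embed_dim,
                                  num_heads=num_heads,
                                  batch_first=True)

# Self-attention: query, key and value all come from the same sequence
attn_output, attn_weights = attention(query=x, key=x, value=x)

print(attn_output.shape)   # same shape as the input: (1, 4, 8)
print(attn_weights.shape)  # one weight per token pair: (1, 4, 4)
```

Each row of `attn_weights` sums to 1: every token's output is a weighted mix of all tokens in the sequence, whereas a convolution only mixes values inside a local window.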

=> The `Vision Transformer (ViT) architecture` was designed to adapt the original Transformer architecture to vision problems.
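
=> The "16x16 words" in the paper title refers to image patches that play the role of tokens. A rough worked example of the arithmetic, assuming a 224x224 RGB input image (the image size is an assumption for illustration, only the 16x16 patch size comes from the title):

```python
# Assumed input image size (illustrative) and patch size from the paper title
image_height, image_width, color_channels = 224, 224, 3
patch_size = 16

# Number of non-overlapping patches = length of the "sentence" of image "words"
num_patches = (image_height // patch_size) * (image_width // patch_size)

# Number of values in one flattened patch
values_per_patch = patch_size * patch_size * color_channels

print(num_patches)       # 196
print(values_per_patch)  # 768
```

Under these assumptions a single image becomes a sequence of 196 patches, each holding 768 values once flattened.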

In [1]:
import matplotlib.pyplot as plt
import torch
import torchvision

from torch import nn
from torchvision import transforms
from torchinfo import summary

In [2]:
from go_modular import data_setup, engine
from helper_functions import download_data, set_seeds, plot_loss_curves

# Device-agnostic setup: prefer CUDA, then Apple Silicon (MPS), then CPU
device = "cuda" if torch.cuda.is_available() \
    else "mps" if torch.backends.mps.is_available() else "cpu"
device

'mps'
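
=> A quick sanity check of how a device string like this is typically used, as a sketch. The block picks the device again so it runs on its own, and the layer and tensor sizes are hypothetical.

```python
import torch
from torch import nn

# Pick the best available device: CUDA GPU, then Apple Silicon (MPS), then CPU
device = "cuda" if torch.cuda.is_available() \
    else "mps" if torch.backends.mps.is_available() else "cpu"

# Hypothetical tiny model and input, both moved to the chosen device
model = nn.Linear(in_features=3, out_features=1).to(device)
x = torch.rand(2, 3).to(device)

# Model and data must live on the same device for the forward pass to work
output = model(x)
print(output.shape, output.device)
```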