# Comparing CNN to Transformer
## Chapter 2 Module 4

In our first code example, we install the key libraries that we will need for our later code exercises. We will also run a very basic comparison of two famous models, AlexNet and ViT!

## Installation
Before we can get started, pip install the libraries below. `torch`, `torchvision`, `transformers`, and `pillow` all help with our model inference. `fiftyone` is an open source library for managing computer vision datasets, along with evaluation, visualization, and more, that we will leverage in our examples.


In [None]:
pip install torch torchvision transformers pillow fiftyone
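
After the install finishes, it can be handy to confirm that each package is importable before moving on. The snippet below is a minimal sketch using only the Python standard library. The helper name `check_installed` is hypothetical and not part of any of these libraries. One detail worth knowing: the `pillow` package is imported under the name `PIL`.

```python
import importlib.util

# Import names for the packages installed above. The "pillow" distribution
# is imported as "PIL", so that is the name we look up here.
REQUIRED_MODULES = ["torch", "torchvision", "transformers", "PIL", "fiftyone"]


def check_installed(module_names):
    """Return a dict mapping each top-level module name to True if it can be found."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in module_names
    }


if __name__ == "__main__":
    # Print a simple report; anything marked MISSING needs another pip install
    for name, found in check_installed(REQUIRED_MODULES).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

This only checks that a module can be located, not that it imports cleanly, so treat it as a quick sanity check rather than a full environment test.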

## Loading a Subset of ImageNet

The first part of FiftyOne we will leverage is the dataset zoo. Instead of manually collecting all 1 TB of [ImageNet](https://www.image-net.org/index.php), we will use a subset from the [FiftyOne Dataset Zoo](https://docs.voxel51.com/dataset_zoo/index.html) called [ImageNet-Sample](https://docs.voxel51.com/dataset_zoo/datasets.html#imagenet-sample). The sample contains 1,000 images, but we will load just 50 of them for our example.

After the dataset is loaded, we launch the [FiftyOne App](https://docs.voxel51.com/user_guide/app.html) to visualize our dataset. You can view the App in your notebook below the cell. If you are running locally, you can also open the App in your browser at `localhost:5151`. Try exploring ImageNet for yourself!

In [3]:
import fiftyone as fo
import fiftyone.zoo as foz 

dataset = foz.load_zoo_dataset(
    "imagenet-sample",
    dataset_name="Imagenet-Sample",
    max_samples=50,
    shuffle=True,
    overwrite=True,
)



Downloading dataset to '/fiftyone/zoo/datasets/imagenet-sample'
Downloading dataset...
 100% |████|  762.4Mb/762.4Mb [1.2s elapsed, 0s remaining, 664.3Mb/s]         
Extracting dataset...
Parsing dataset metadata
Found 1000 samples
Dataset info written to '/fiftyone/zoo/datasets/imagenet-sample/info.json'
Loading 'imagenet-sample'
 100% |███████████████████| 50/50 [77.8ms elapsed, 0s remaining, 642.4 samples/s]  
Dataset 'Imagenet-Sample' created
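
To build intuition for what `max_samples=50` combined with `shuffle=True` does, here is a conceptual sketch in plain Python: pick 50 items at random, without replacement, from a pool of 1,000. This is an illustration only, not FiftyOne's actual implementation. The function `shuffled_subset`, the placeholder file names, and the seed value are all hypothetical.

```python
import random


def shuffled_subset(items, max_samples, seed=None):
    """Pick up to max_samples items at random, without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    k = min(max_samples, len(items))  # never ask for more than we have
    return rng.sample(items, k)


# Hypothetical stand-ins for the 1,000 images in the sample dataset
all_images = [f"image_{i:04d}.jpg" for i in range(1000)]

subset = shuffled_subset(all_images, max_samples=50, seed=51)
print(len(subset))  # 50
```

Because the subset is drawn at random, two runs without a fixed seed will generally produce different images, so your App view may not match the screenshots exactly.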


In [6]:
session = fo.launch_app(dataset)

![imagenet](./assets/imagenet.png)

Try filtering your dataset on its labels by using the sidebar! You can select any of the classes in your dataset to see those samples!

![labels](./assets/labels.png)

## Model Showdown

Now that we have our dataset loaded and ready to go, it's time to test our two models against each other!

We will be using the [Google ViT](https://huggingface.co/google/vit-base-patch16-224) from the [Hugging Face](https://huggingface.co/) [`transformers` library](https://huggingface.co/docs/transformers/en/index). We will be using this library often so feel free to get familiar!
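
The model name `vit-base-patch16-224` encodes two numbers worth understanding: the model expects 224x224 pixel images and cuts them into 16x16 pixel patches. The short sketch below works out how many tokens the transformer sees as a result. The function names are hypothetical helpers written for this illustration, and the extra classification (CLS) token reflects the standard ViT design.

```python
def vit_sequence_length(image_size, patch_size, add_cls_token=True):
    """Number of tokens a ViT sees for a square image split into square patches."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2
    # The standard ViT prepends one classification token to the patch tokens
    return num_patches + (1 if add_cls_token else 0)


def patch_vector_size(patch_size, channels=3):
    """Number of raw pixel values in one flattened RGB patch."""
    return patch_size * patch_size * channels


print(vit_sequence_length(224, 16))  # 197 (14 x 14 patches + 1 CLS token)
print(patch_vector_size(16))         # 768 values per flattened patch
```

So each image becomes a short sequence of 197 tokens, which is what lets a transformer process it much like a sentence.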

The AlexNet model in the [FiftyOne Model Zoo](https://docs.voxel51.com/model_zoo/models.html) comes from the [torchvision models](https://pytorch.org/vision/main/models.html) collection in PyTorch and is based on the original AlexNet paper.

We use [FiftyOne's Hugging Face integration](https://docs.voxel51.com/integrations/huggingface.html) to take any `transformers` model and apply it instantly to our dataset. We can do the same with any model from the FiftyOne Model Zoo. After inference is done, check back in your app and compare performance!

In [7]:
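from transformers import ViTForImageClassification

# Load the pretrained ViT classifier from the Hugging Face Hub
vit_model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224"
)
# Run ViT on every sample and store its predictions in their own field
dataset.apply_model(vit_model, label_field="ViT_predictions")

# Load AlexNet from the FiftyOne Model Zoo and store its predictions separately
alexnet_model = foz.load_zoo_model("alexnet-imagenet-torch")
dataset.apply_model(alexnet_model, label_field="AlexNet-predictions")

# Redisplay the App so both prediction fields can be compared
session.show()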

Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
Fast image processor class <class 'transformers.models.vit.image_processing_vit_fast.ViTImageProcessorFast'> is available for this model. Using slow image processor class. To use the fast image processor class set `use_fast=True`.


 100% |███████████████████| 50/50 [28.1s elapsed, 0s remaining, 2.0 samples/s]      
Downloading model from 'https://download.pytorch.org/models/alexnet-owt-4df8aa71.pth'...
 100% |██████|    1.8Gb/1.8Gb [7.0s elapsed, 0s remaining, 120.2Mb/s]      
Downloading: "https://download.pytorch.org/models/alexnet-owt-7be5be79.pth" to /root/.cache/torch/hub/checkpoints/alexnet-owt-7be5be79.pth


100%|██████████| 233M/233M [00:01<00:00, 177MB/s] 


 100% |███████████████████| 50/50 [18.7s elapsed, 0s remaining, 4.4 samples/s]      


![compare](./assets/compare.png)
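
Beyond eyeballing the App, comparing performance comes down to checking each model's predicted label against the ground truth. Below is a minimal sketch of top-1 accuracy in plain Python. The label lists are made-up toy values for illustration, not results from the run above, and `top1_accuracy` and `disagreements` are hypothetical helpers. Exact string matching also assumes both models and the ground truth use identical label names, which may not hold in practice.

```python
def top1_accuracy(predictions, ground_truth):
    """Fraction of samples whose predicted label exactly matches the true label."""
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have the same length")
    if not ground_truth:
        raise ValueError("need at least one sample")
    correct = sum(p == g for p, g in zip(predictions, ground_truth))
    return correct / len(ground_truth)


def disagreements(preds_a, preds_b):
    """Indices of samples where two models predict different labels."""
    return [i for i, (a, b) in enumerate(zip(preds_a, preds_b)) if a != b]


# Toy labels for illustration only (NOT real model output)
truth = ["tabby", "goldfish", "teapot", "tabby"]
vit_labels = ["tabby", "goldfish", "teapot", "tiger cat"]
alexnet_labels = ["tabby", "goldfish", "cup", "tiger cat"]

print(top1_accuracy(vit_labels, truth))           # 0.75
print(top1_accuracy(alexnet_labels, truth))       # 0.5
print(disagreements(vit_labels, alexnet_labels))  # [2]
```

The disagreement indices are often the most interesting samples to inspect in the App, since they show where the two architectures behave differently.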

## Looking Forward

In our next chapter, we will learn how to begin using transformers right away, including how to load, run inference with, and evaluate transformers yourself. Hop over to Chapter 3 to learn more!