# "What are ResNets & Why use them for computer vision tasks"
> "Insights from [\"Deep Learning for Coders with fastai & PyTorch\"](https://github.com/fastai/fastbook) and from around the world"

- toc: true
- branch: master
- badges: true
- hide_binder_badge: true
- comments: true
- author: Wayde Gilliam
- categories: [fastai, fastbook, fastbook chapter 1, what is, computer vision, resnet]
- image: images/articles/resnet.png
- hide: false
- search_exclude: false
- permalink: /what-is/resnets

In [None]:
#hide
! pip install fastai -Uqq

Here we take a look at **ResNet**, arguably the best architecture for most computer vision tasks, and how it can be used in fastai for a variety of them.

---
## What is a ResNet & Why use it for computer vision tasks?

A **ResNet** is a model architecture that has proven to work well in computer vision (CV) tasks. Several variants exist with different numbers of layers; the larger ones take longer to train and are more prone to overfitting, especially with smaller datasets.

The number in a variant's name (for example, the 34 in `resnet34`) is the number of layers in that particular ResNet variant ... "(other options are 18, 50, 101, and 152) ... model architectures with more layers take longer to train and are more prone to overfitting ... on the other hand, when using more data, they can be quite a bit more accurate." {% fn 2 %}
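
To make the naming concrete, here is a small sketch of where those layer counts come from. The per-stage block counts are not from this post; they are assumptions taken from the original ResNet paper. The count covers the first convolution, the convolutions inside every residual block, and the final fully connected layer, and it ignores the 1x1 convolutions on the shortcut paths.

```python
# Residual blocks per stage and convolutions per block for each variant.
# Assumption: these values follow the original ResNet paper, not this post.
RESNET_CONFIGS = {
    "resnet18":  ([2, 2, 2, 2], 2),    # basic blocks: 2 convolutions each
    "resnet34":  ([3, 4, 6, 3], 2),
    "resnet50":  ([3, 4, 6, 3], 3),    # bottleneck blocks: 3 convolutions each
    "resnet101": ([3, 4, 23, 3], 3),
    "resnet152": ([3, 8, 36, 3], 3),
}

def count_layers(blocks_per_stage, convs_per_block):
    """First conv + convs in all residual blocks + final linear layer."""
    return 1 + sum(blocks_per_stage) * convs_per_block + 1

for name, (blocks, convs) in RESNET_CONFIGS.items():
    print(f"{name}: {count_layers(blocks, convs)} layers")
```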

### What else can image recognizers be used for besides image tasks?

Sound, time series, malware classification ... "a good rule of thumb for converting a dataset into an image representation: if the human eye can recognize categories from the images, then a deep learning model should be able to do so too." {% fn 1 %}
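
As a rough illustration of the malware case, the sketch here turns a raw byte string into a 2D grid of grayscale pixel values (0 to 255), one byte per pixel. The function name `bytes_to_grid`, the fixed row width, and the zero padding are my assumptions for illustration, not the method used in the book's examples.

```python
def bytes_to_grid(data: bytes, width: int = 16) -> list[list[int]]:
    """Lay bytes out row by row as grayscale pixel values (0-255)."""
    if width <= 0:
        raise ValueError("width must be positive")
    # Pad the last row with zeros (black pixels) so every row has `width` pixels.
    padding = (-len(data)) % width
    padded = data + bytes(padding)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]

# Hypothetical input: the first few bytes of a made-up binary file.
grid = bytes_to_grid(b"MZ\x90\x00\x03\x00\x00\x00\x04\x00", width=4)
for row in grid:
    print(row)
```

A grid like this could then be saved as a grayscale image and fed to an image classifier like any other picture.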

### How does it fare against more recent architectures like vision transformers?

Pretty well apparently (at least at the time this post was written) ...

> twitter: https://twitter.com/wightmanr/status/1444852719773122565


---
## ResNet best practices


> Tip: Start with a smaller ResNet (like 18 or 34) and move up as needed.


> Note: If you have a lot of data, the bigger ResNets will likely give you better results.

---
## An example using the high-level API

### Step 1: Build our DataLoaders

In [None]:
#hide_output
from fastai.vision.all import *

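# Download and extract the Oxford-IIIT Pet dataset, then point at its images folder
path = untar_data(URLs.PETS)/'images'

# Cat images have filenames that start with an uppercase letter
def is_cat(x): return x[0].isupper()

# Hold out 20% of the images for validation and resize every image to 224x224
dls = ImageDataLoaders.from_name_func(path, get_image_files(path), valid_pct=0.2, seed=42, label_func=is_cat, item_tfms=Resize(224))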
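
The labeling function relies on a naming convention in the Pets dataset: files for cat breeds start with an uppercase letter, files for dog breeds with a lowercase one. A quick check on two made-up filenames (hypothetical, written in the dataset's naming style) shows what `is_cat` returns:

```python
def is_cat(x):
    # True when the filename starts with an uppercase letter (a cat breed).
    return x[0].isupper()

# Hypothetical filenames in the style of the Pets dataset.
for fname in ["Birman_12.jpg", "beagle_7.jpg"]:
    print(fname, "->", "cat" if is_cat(fname) else "dog")
```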

**Why do we make images 224x224 pixels?**

"This is the standard size for historical reasons (old pretrained models require this size exactly) ... If you increase the size, you'll often get a model with better results since it will be able to focus on more details." {% fn 3 %}


> Tip: Train on progressively larger image sizes using the weights trained on smaller sizes as a kind of pretrained model.
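
One way to act on this tip, sketched under a few assumptions: train at a small size first, then swap in `DataLoaders` built at a larger size and fine-tune the same `Learner` again. The helper names `get_dls` and `train_progressively` and the 128-then-224 schedule are mine, chosen for illustration; the fastai calls are the ones used elsewhere in this post. The imports sit inside the functions only so the snippet can be loaded without fastai installed.

```python
def is_cat(x):
    return x[0].isupper()

def get_dls(path, size):
    """Build Pets DataLoaders with every image resized to `size` x `size`."""
    from fastai.vision.all import ImageDataLoaders, Resize, get_image_files
    return ImageDataLoaders.from_name_func(
        path, get_image_files(path), valid_pct=0.2, seed=42,
        label_func=is_cat, item_tfms=Resize(size))

def train_progressively(path, sizes=(128, 224), epochs=1):
    """Fine-tune at each size in turn, reusing the weights from the previous size."""
    from fastai.vision.all import cnn_learner, resnet18, error_rate
    learn = cnn_learner(get_dls(path, sizes[0]), resnet18, metrics=error_rate)
    learn.fine_tune(epochs)
    for size in sizes[1:]:
        learn.dls = get_dls(path, size)  # same model, larger images
        learn.fine_tune(epochs)
    return learn
```

Calling `train_progressively(untar_data(URLs.PETS)/'images')` would run the whole schedule. Keeping `seed=42` fixed should keep the validation set the same at every size, so the error rates stay comparable.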

### Step 2: Build our `cnn_learner`

In [None]:
#hide_output
learn = cnn_learner(dls, resnet18, metrics=error_rate)

Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to /root/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth




As you can see above, the architecture being used is a resnet with 18 layers.

### Step 3: Train

In [None]:
learn.fine_tune(1)

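| epoch | train_loss | valid_loss | error_rate | time |
|-------|------------|------------|------------|-------|
| 0 | 0.161614 | 0.04067 | 0.013532 | 01:03 |

| epoch | train_loss | valid_loss | error_rate | time |
|-------|------------|------------|------------|-------|
| 0 | 0.062475 | 0.020072 | 0.006766 | 01:04 |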
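
The output above has two sets of results because `fine_tune` trains in two phases: first with the pretrained body frozen so that only the new head learns, then with every layer unfrozen. The function here is a simplified sketch of that sequence, based on my reading of fastai's source. The name `fine_tune_by_hand` is hypothetical, and the real method also sets further schedule details that are left out here.

```python
def fine_tune_by_hand(learn, epochs, base_lr=2e-3, freeze_epochs=1, lr_mult=100):
    """Approximate what `learn.fine_tune(epochs)` does, step by step."""
    # Phase 1: freeze the pretrained body and train only the new head.
    learn.freeze()
    learn.fit_one_cycle(freeze_epochs, slice(base_lr))
    # Phase 2: unfreeze everything and train at a lower learning rate,
    # with smaller rates for the early layers than for the head.
    base_lr /= 2
    learn.unfreeze()
    learn.fit_one_cycle(epochs, slice(base_lr / lr_mult, base_lr))
```

With `epochs=1` and the default `freeze_epochs=1`, this gives one frozen epoch followed by one unfrozen epoch, which matches the two result rows shown.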


For more information on how transfer learning works, and the `fine_tune` method in particular, see this section in my ["What is machine learning" post](https://ohmeow.com/what-is/machine-learning#Transfer-learning).

For more metrics like `error_rate`, see my ["What is a metric" post](https://ohmeow.com/what-is/a-metric).
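
As a reminder of what the `error_rate` column reports: it is the fraction of validation items the model classified incorrectly, or 1 minus accuracy. The plain-Python function here (named `simple_error_rate` to avoid clashing with fastai's metric) is an illustrative stand-in, not fastai's implementation, which works on tensors.

```python
def simple_error_rate(preds, targets):
    """Fraction of predictions that differ from their targets."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    wrong = sum(p != t for p, t in zip(preds, targets))
    return wrong / len(targets)

# Toy example: one mistake out of four predictions.
print(simple_error_rate([True, False, True, True], [True, True, True, True]))  # 0.25

# Converting the final error rate from the training output into accuracy.
final_error = 0.006766
print(f"accuracy: {1 - final_error:.2%}")  # accuracy: 99.32%
```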

---
{{ '"Chapter 1: Your Deep Learning Journey". In *[The Fastbook](https://www.amazon.com/Deep-Learning-Coders-fastai-PyTorch/dp/1492045527)* pp.30-31.' | fndetail: 2 }}

{{ 'Ibid., p.39. Pages 36-39 provide several examples of how non-image data can be converted to an image for such a purpose.' | fndetail: 1 }}

{{ 'Ibid., p.28' | fndetail: 3 }}
