Visual Transformers: Cats vs. Dogs Classification

A machine learning project utilizing Visual Transformers (ViTs) to classify images from the Cats vs. Dogs dataset.

[Example image]

Table of Contents

  - Introduction
  - Installation
  - Usage
  - Results
  - Acknowledgements

Introduction

The Cats vs. Dogs dataset is a standard computer vision benchmark containing roughly 25K annotated images of cats and dogs. In this project, instead of using conventional CNNs, we use Visual Transformers (ViTs). Because the dataset is small, the purpose of this project is to see whether pretraining the model with a masked autoencoder (MAE) achieves a better result than simply training from random initialisation.
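To make the idea concrete, below is a minimal, self-contained sketch of MAE-style pretraining for a tiny ViT in PyTorch: mask most patches, encode only the visible ones, and reconstruct the masked patch pixels. It is illustrative only; the module names, dimensions, and hyperparameters (patch size 16, 192-dim tokens, 75% mask ratio) are assumptions and do not reflect the actual code in tinyVIT.py.

    # Illustrative MAE-style pretraining sketch (assumed names/sizes, not the repo's code).
    import torch
    import torch.nn as nn

    class PatchEmbed(nn.Module):
        def __init__(self, img_size=224, patch=16, dim=192):
            super().__init__()
            self.patch = patch
            self.num_patches = (img_size // patch) ** 2
            self.proj = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)

        def forward(self, x):                        # (B, 3, H, W) -> (B, N, D)
            return self.proj(x).flatten(2).transpose(1, 2)

    def transformer(dim, depth, heads):
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4,
                                           batch_first=True, norm_first=True)
        return nn.TransformerEncoder(layer, depth)

    class TinyMAE(nn.Module):
        def __init__(self, dim=192, mask_ratio=0.75):
            super().__init__()
            self.embed = PatchEmbed(dim=dim)
            self.pos = nn.Parameter(torch.zeros(1, self.embed.num_patches, dim))
            self.encoder = transformer(dim, depth=6, heads=3)
            self.decoder = transformer(dim, depth=2, heads=3)
            self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
            self.to_pixels = nn.Linear(dim, 3 * self.embed.patch ** 2)
            self.mask_ratio = mask_ratio

        def forward(self, imgs):
            tokens = self.embed(imgs) + self.pos     # (B, N, D)
            B, N, D = tokens.shape
            keep = int(N * (1 - self.mask_ratio))
            shuffle = torch.rand(B, N, device=imgs.device).argsort(1)
            visible_idx, masked_idx = shuffle[:, :keep], shuffle[:, keep:]
            visible = torch.gather(tokens, 1, visible_idx[..., None].expand(-1, -1, D))
            latent = self.encoder(visible)           # encode visible patches only

            # Learnable mask tokens (plus positions) stand in for masked patches.
            mask_tok = self.mask_token.expand(B, N - keep, D) + torch.gather(
                self.pos.expand(B, -1, -1), 1, masked_idx[..., None].expand(-1, -1, D))
            decoded = self.decoder(torch.cat([latent, mask_tok], dim=1))
            pred = self.to_pixels(decoded[:, keep:])  # pixels for masked patches only

            # Target: the original pixels of the masked patches.
            p = self.embed.patch
            patches = imgs.unfold(2, p, p).unfold(3, p, p)      # (B, 3, h, w, p, p)
            patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(B, N, -1)
            target = torch.gather(
                patches, 1, masked_idx[..., None].expand(-1, -1, patches.shape[-1]))
            return nn.functional.mse_loss(pred, target)

    # One pretraining step on a random batch (stand-in for real Cats vs. Dogs images).
    model = TinyMAE()
    loss = model(torch.randn(4, 3, 224, 224))
    loss.backward()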

[Output GIF]

Installation

Requirements: Python 3.8+

  1. Clone the repository:

    git clone https://github.com/yourusername/cats-vs-dogs-vit.git
    cd cats-vs-dogs-vit
  2. Install the required packages:

    pip install -r requirements.txt

Usage

To prepare the data:

python tinyVIT.py prepare-data

To train the model using MAE:

python tinyVIT.py train-mae

To train the model using supervision:

python tinyVIT.py train
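For intuition on what the supervised stage does relative to the MAE stage, here is a hedged sketch (building on the TinyMAE sketch in the Introduction) of how a pretrained encoder could be reused with a two-class head for cats vs. dogs. The class name, mean pooling, and weight handling are assumptions, not the repository's actual implementation.

    # Illustrative fine-tuning sketch reusing the TinyMAE encoder defined above.
    import torch
    import torch.nn as nn

    class ViTClassifier(nn.Module):
        def __init__(self, mae: TinyMAE, num_classes=2):
            super().__init__()
            # Reuse the (ideally MAE-pretrained) patch embedding, positions and encoder.
            self.embed, self.pos, self.encoder = mae.embed, mae.pos, mae.encoder
            self.head = nn.Linear(self.pos.shape[-1], num_classes)

        def forward(self, imgs):
            tokens = self.embed(imgs) + self.pos     # no masking at fine-tune time
            latent = self.encoder(tokens)
            return self.head(latent.mean(dim=1))     # mean-pool patch tokens -> logits

    clf = ViTClassifier(TinyMAE())                   # load MAE-pretrained weights here instead
    logits = clf(torch.randn(4, 3, 224, 224))        # (4, 2)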

Results

We achieved an accuracy of 83.46% on a randomly sampled validation set of 2,500 images using a ViT with random weight initialisation. With a network pretrained using MAE, the final accuracy on the same validation set was 93.29%. It is also worth noting that the MAE-pretrained model reached this accuracy after around 80 epochs, whereas the best performance from random initialisation was reached only after 158 epochs.

Model                      Accuracy (%)
Visual Transformer         83.46
Visual Transformer + MAE   93.29

Acknowledgements


© 2023 Mathew Salvaris. All Rights Reserved.
