MJAHMADEE/Vision_Transformers

Vision Transformers with PyTorch 🖼️🤖

This project uses PyTorch to fine-tune pre-trained Vision Transformer (ViT) and Class-Attention in Image Transformers (CaiT) models for image classification on the CIFAR-10 dataset, demonstrating how transformers can be adapted to this task.

Features 🌟

  • Utilizes pre-trained Vision Transformer (ViT) and Class-Attention in Image Transformers (CaiT) models.
  • Supports fine-tuning of transformer models on the CIFAR-10 dataset.
  • Visualizes training and validation loss, accuracy, and confusion matrices.
  • Demonstrates data preprocessing and augmentation techniques for image data.
  • Evaluates model performance with metrics such as F1-score, recall, accuracy, and precision.

Setup and Installation 🛠️

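  1. Clone the repository from GitHub (https://github.com/MJAHMADEE/Vision_Transformers).
  2. Navigate to the project directory.
  3. Install the required dependencies listed in the requirements.txt file, for example with pip install -r requirements.txt.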

Dataset 📁

The project uses the CIFAR-10 dataset, which consists of 60,000 32x32 color images in 10 classes, with 6,000 images per class, split into 50,000 training images and 10,000 test images. The dataset is automatically downloaded and pre-processed for training and testing.
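
A common pre-processing step for image data is per-channel normalization. The following is a minimal, dependency-free sketch of that idea and is not taken from the repository: the helper names `channel_stats` and `normalize` are hypothetical, the toy batch stands in for real CIFAR-10 images, and the exact pre-processing used by the project may differ.

```python
import math

def channel_stats(images):
    """Return per-channel (means, stds) for images shaped [N][C][H][W]."""
    num_channels = len(images[0])
    means, stds = [], []
    for c in range(num_channels):
        # Flatten every pixel of channel c across the whole batch.
        values = [px for img in images for row in img[c] for px in row]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        means.append(mean)
        stds.append(math.sqrt(var))
    return means, stds

def normalize(image, means, stds, eps=1e-8):
    """Shift and scale each channel; eps guards against a zero std."""
    return [
        [[(px - means[c]) / (stds[c] + eps) for px in row] for row in channel]
        for c, channel in enumerate(image)
    ]

# Two toy 3-channel 2x2 "images" with pixel values already scaled to [0, 1].
# Real CIFAR-10 images are 3x32x32; the small size keeps the example readable.
batch = [
    [[[0.0, 0.2], [0.4, 0.6]], [[0.1, 0.1], [0.1, 0.1]], [[1.0, 0.0], [1.0, 0.0]]],
    [[[0.8, 1.0], [0.2, 0.0]], [[0.3, 0.3], [0.3, 0.3]], [[0.0, 1.0], [0.0, 1.0]]],
]
means, stds = channel_stats(batch)
normalized = [normalize(img, means, stds) for img in batch]
```

In practice the statistics are computed once on the training split and reused unchanged for the test split, so that both are transformed identically.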

Training the Model 🚀

Training consists of fine-tuning the pre-trained Vision Transformer models on the CIFAR-10 dataset. The models are adapted to CIFAR-10's 32x32 input images and its 10 output classes.
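
To illustrate why image size and class count matter when adapting a pre-trained transformer, the sketch below works through the arithmetic. All specific numbers are assumptions chosen for illustration, not values confirmed by this project: a patch size of 16, a pre-training resolution of 224, an embedding width of 768, a single class token, and a 1000-class pre-training head. The function names are hypothetical.

```python
def num_tokens(image_size, patch_size, extra_tokens=1):
    """Sequence length for a square image cut into non-overlapping square patches.

    extra_tokens=1 models a single class token (an assumption).
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side ** 2 + extra_tokens

def linear_head_params(embed_dim, num_classes):
    """Weights plus biases of a single linear classification layer."""
    return embed_dim * num_classes + num_classes

# Option 1: feed 32x32 images directly with 16x16 patches (very few tokens).
native = num_tokens(32, 16)        # 2*2 patches + 1 = 5
# Option 2: upsample to the assumed pre-training resolution.
upsampled = num_tokens(224, 16)    # 14*14 patches + 1 = 197
# Option 3: keep 32x32 inputs but use smaller 4x4 patches.
small_patch = num_tokens(32, 4)    # 8*8 patches + 1 = 65

# Swapping a 1000-class head for a 10-class head shrinks the final layer.
pretrain_head = linear_head_params(768, 1000)  # 769000
cifar_head = linear_head_params(768, 10)       # 7690
```

Whichever option is used, the classification head has to be replaced, because the pre-trained output layer does not match CIFAR-10's 10 classes.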

Testing the Model 🧪

After training, the model's performance is evaluated on the test set of CIFAR-10. Metrics like accuracy, F1-score, recall, and precision are computed to assess the model.
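
As a rough illustration of how these metrics relate to each other, here is a dependency-free sketch that computes them from true and predicted labels. It assumes macro averaging (an unweighted mean over classes), which the project does not specify, and the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(y_true, y_pred, num_classes):
    """Return accuracy plus per-class precision, recall, and F1, and macro F1."""
    tp = [0] * num_classes  # true positives per class
    fp = [0] * num_classes  # false positives per class
    fn = [0] * num_classes  # false negatives per class
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    def safe_div(a, b):
        # Define the ratio as 0.0 when a class has no predictions or samples.
        return a / b if b else 0.0

    precision = [safe_div(tp[c], tp[c] + fp[c]) for c in range(num_classes)]
    recall = [safe_div(tp[c], tp[c] + fn[c]) for c in range(num_classes)]
    f1 = [
        safe_div(2 * precision[c] * recall[c], precision[c] + recall[c])
        for c in range(num_classes)
    ]
    return {
        "accuracy": safe_div(sum(tp), len(y_true)),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "macro_f1": sum(f1) / num_classes,
    }

# Toy example with 3 classes instead of CIFAR-10's 10.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
metrics = classification_metrics(y_true, y_pred, num_classes=3)
```

Because CIFAR-10's test set is class-balanced, macro and sample-weighted averages coincide there for recall, though precision and F1 can still differ slightly between the two.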

Results and Evaluation 📊

Results are documented through confusion matrices and plots of loss and accuracy. These visualizations show how the model performs and where it can be improved.
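
One way a confusion matrix points to areas for improvement is by exposing the most frequently confused class pair. The sketch below is illustrative only: the helper names are hypothetical, the labels are made-up toy data rather than results from this project, and only three of the CIFAR-10 class names are used.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def most_confused_pair(matrix):
    """Return (true_class, predicted_class, count) for the largest off-diagonal cell."""
    best = None
    for t, row in enumerate(matrix):
        for p, count in enumerate(row):
            if t != p and (best is None or count > best[2]):
                best = (t, p, count)
    return best

class_names = ["cat", "dog", "ship"]  # subset of CIFAR-10 classes, for brevity
y_true = [0, 0, 0, 1, 1, 1, 2, 2]     # toy labels, not real results
y_pred = [0, 1, 1, 1, 1, 0, 2, 2]

matrix = confusion_matrix(y_true, y_pred, num_classes=3)
true_idx, pred_idx, count = most_confused_pair(matrix)
print(f"{class_names[true_idx]} predicted as {class_names[pred_idx]}: {count} times")
```

The diagonal holds correct predictions, so large off-diagonal cells identify the class pairs that deserve a closer look.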

License 📜

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgements 🙌

  • Thanks to the creators of the CIFAR-10 dataset for providing the resources necessary for training and testing the model.
  • Thanks to the authors of the PyTorch and timm library documentation for their comprehensive guides and tutorials.

Notebook and Copyright

@misc{MJVisionTransformers2023,
  author = {Mohammad Javad (MJ) Ahmadi},
  title = {Vision Transformers},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/MJAHMADEE/Vision_Transformers}}
}


For more information, please refer to the official repository.