# Practice with Image Classification

This is a small practice exercise in image classification. As in the previous module, you only need to set the correct paths for loading the data and for saving the model, then test your saved model in the application notebook.

The task here is to classify images of dogs and cats.

# Load data

Please set `data_path` to the location of the `animals.zip` file in your Google Drive. The curly brackets `{}` let us use a Python variable inside a shell command (`!unzip`) in Google Colab.
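
For reference, the `{data_path}` substitution is IPython syntax: the expression in braces is evaluated in Python and its value is pasted into the shell command before it runs. The same extraction can also be done without the shell. The snippet below is a minimal sketch using Python's standard `zipfile` module; `extract_archive` and the throwaway `demo.zip` are hypothetical names used only for illustration and are not part of the assignment.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir="."):
    """Extract every member of zip_path into dest_dir and return the member names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


# Demo on a throwaway archive so the sketch runs anywhere, not only in Colab.
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "demo.zip"  # hypothetical stand-in for animals.zip
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("animals/cat/1.jpeg", b"placeholder bytes")
    members = extract_archive(demo_zip, Path(tmp) / "out")
    extracted_ok = (Path(tmp) / "out" / "animals" / "cat" / "1.jpeg").is_file()

print(members, extracted_ok)
```

In Colab, calling `extract_archive(data_path)` should have roughly the same effect as the `!unzip` line, assuming your Drive is already mounted.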

In [2]:
data_path = '/content/drive/MyDrive/Enterprise AI/Assignment3/animals.zip'

In [3]:
from google.colab import drive
drive.mount('/content/drive')
!unzip '{data_path}'

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Archive:  /content/drive/MyDrive/Enterprise AI/Assignment3/animals.zip
   creating: animals/cat/
  inflating: animals/cat/1.jpeg      
  inflating: animals/cat/10.jpeg     
  inflating: animals/cat/100.jpeg    
  inflating: animals/cat/1001.jpeg   
  inflating: animals/cat/1002.jpeg   
  inflating: animals/cat/1004.jpeg   
  inflating: animals/cat/1005.jpeg   
  inflating: animals/cat/1006.jpeg   
  inflating: animals/cat/1007.jpeg   
  inflating: animals/cat/1008.jpeg   
  inflating: animals/cat/101.jpeg    
  inflating: animals/cat/1011.jpeg   
  inflating: animals/cat/1012.jpeg   
  inflating: animals/cat/1013.jpeg   
  inflating: animals/cat/1014.jpeg   
  inflating: animals/cat/1015.jpeg   
  inflating: animals/cat/1016.jpeg   
  inflating: animals/cat/1017.jpeg   
  inflating: animals/cat/1018.jpeg   
  inflating: animals/cat/1019.jpeg   
  inflating: a

# Process data

This part can be run as is.

In [4]:
!pip install datasets evaluate transformers

# Only load_dataset is needed here; the remaining imports are in the block below.
from datasets import load_dataset

dataset = load_dataset("imagefolder", data_dir="animals/")
dataset = dataset['train'].train_test_split(test_size=0.3)
labels = dataset["train"].features["label"].names
label2id, id2label = dict(), dict()
for i, label in enumerate(labels):
    label2id[label] = str(i)
    id2label[str(i)] = label

from transformers import AutoImageProcessor
checkpoint = "google/vit-base-patch16-224-in21k"
image_processor = AutoImageProcessor.from_pretrained(checkpoint)

from tensorflow import keras
from keras import layers
import numpy as np
import tensorflow as tf
from PIL import Image
from transformers import DefaultDataCollator
import evaluate

size = (image_processor.size["height"], image_processor.size["width"])
train_data_augmentation = keras.Sequential(
    [
        layers.RandomCrop(size[0], size[1]),
        layers.Rescaling(scale=1.0 / 127.5, offset=-1),
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(factor=0.02),
        layers.RandomZoom(height_factor=0.2, width_factor=0.2),
    ],
    name="train_data_augmentation",
)
val_data_augmentation = keras.Sequential(
    [
        layers.CenterCrop(size[0], size[1]),
        layers.Rescaling(scale=1.0 / 127.5, offset=-1),
    ],
    name="val_data_augmentation",
)

def convert_to_tf_tensor(image: Image.Image):
    # Convert a PIL image to a tensor and add a leading batch dimension.
    np_image = np.array(image)
    tf_image = tf.convert_to_tensor(np_image)
    return tf.expand_dims(tf_image, 0)

def preprocess_train(example_batch):
    images = [
        train_data_augmentation(convert_to_tf_tensor(image.convert("RGB"))) for image in example_batch["image"]
    ]
    example_batch["pixel_values"] = [tf.transpose(tf.squeeze(image)) for image in images]
    return example_batch

def preprocess_val(example_batch):
    images = [
        val_data_augmentation(convert_to_tf_tensor(image.convert("RGB"))) for image in example_batch["image"]
    ]
    example_batch["pixel_values"] = [tf.transpose(tf.squeeze(image)) for image in images]
    return example_batch

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # Turn logits into class ids, then score them against the true labels.
    predictions, labels = eval_pred
    predictions = np.argmax(predictions, axis=1)
    return accuracy.compute(predictions=predictions, references=labels)

dataset["train"].set_transform(preprocess_train)
dataset["test"].set_transform(preprocess_val)
data_collator = DefaultDataCollator(return_tensors="tf")
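
To see what the preprocessing above does to the numbers, here is a small sketch in plain Python. It assumes the class folders are named `cat` and `dog` (only `cat` is visible in the unzip log), and the helper names `rescale_pixel` and `build_label_maps` are made up for this illustration. The first helper applies the same affine map as `layers.Rescaling(scale=1.0 / 127.5, offset=-1)`, which sends pixel values from the range 0 to 255 into roughly -1 to 1. The second mirrors the loop that builds `label2id` and `id2label`.

```python
def rescale_pixel(value, scale=1.0 / 127.5, offset=-1.0):
    """Apply value * scale + offset, the map used by the Rescaling layer."""
    return value * scale + offset


def build_label_maps(labels):
    """Map class names to string ids and string ids back to class names."""
    label2id = {label: str(i) for i, label in enumerate(labels)}
    id2label = {str(i): label for i, label in enumerate(labels)}
    return label2id, id2label


# Assumed class names; the real ones come from the folder names in animals/.
label2id, id2label = build_label_maps(["cat", "dog"])

# Darkest, middle, and brightest pixel values after rescaling.
print(rescale_pixel(0), rescale_pixel(127.5), rescale_pixel(255))
print(label2id, id2label)
```

The ids are stored as strings because that is how the loop in the cell above builds them.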

# Modeling

You can try different values for the two parameters below. Save the model once you are happy with its performance.
- `num_epochs`: as in the previous module, the number of complete passes over the training data
- `learning_rate`: the step size, which controls how much the model weights change at each update
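
To build some intuition for how these two parameters interact, the sketch below assumes a learning rate that decays linearly from its initial value to zero over all training steps, with one step per batch. This should match the default behaviour of `create_optimizer` when `num_warmup_steps=0`, but it is an assumption here rather than something this notebook verifies. The training-set size of 1000 and the helper names are hypothetical.

```python
import math


def steps_per_epoch(num_examples, batch_size):
    """Number of batches needed to see every example once."""
    return math.ceil(num_examples / batch_size)


def linear_decay_lr(step, init_lr, total_steps):
    """Learning rate after `step` updates, decaying linearly from init_lr to 0."""
    progress = min(step, total_steps) / total_steps
    return init_lr * (1.0 - progress)


num_examples = 1000  # hypothetical training-set size
total_steps = steps_per_epoch(num_examples, batch_size=32) * 4  # 4 epochs

# Learning rate at the start, midpoint, and end of training.
for step in (0, total_steps // 2, total_steps):
    print(step, linear_decay_lr(step, init_lr=0.0003, total_steps=total_steps))
```

Under this assumption, more epochs means more steps and a slower decay, while a larger `learning_rate` raises the whole curve.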

In [5]:
num_epochs = 4
learning_rate = 0.0003

In [6]:
from transformers import create_optimizer, TFAutoModelForImageClassification
from transformers.keras_callbacks import KerasMetricCallback

batch_size = 32
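# The schedule counts optimizer steps (one per batch), not individual images.
steps_per_epoch = (len(dataset["train"]) + batch_size - 1) // batch_size
num_train_steps = steps_per_epoch * num_epochs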
weight_decay_rate = 0.01

optimizer, lr_schedule = create_optimizer(
    init_lr=learning_rate,
    num_train_steps=num_train_steps,
    weight_decay_rate=weight_decay_rate,
    num_warmup_steps=0,
)

model = TFAutoModelForImageClassification.from_pretrained(
    checkpoint,
    id2label=id2label,
    label2id=label2id,
)

tf_train_dataset = dataset["train"].to_tf_dataset(
    columns="pixel_values", label_cols="label", shuffle=True, batch_size=batch_size, collate_fn=data_collator
)

# Evaluation does not need shuffling; a fixed order makes the results easier to compare.
tf_eval_dataset = dataset["test"].to_tf_dataset(
    columns="pixel_values", label_cols="label", shuffle=False, batch_size=batch_size, collate_fn=data_collator
)

loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer=optimizer, loss=loss)
metric_callback = KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=tf_eval_dataset)
callbacks = [metric_callback]
model.fit(tf_train_dataset, validation_data=tf_eval_dataset, epochs=num_epochs, callbacks=callbacks)

Some weights of the PyTorch model were not used when initializing the TF 2.0 model TFViTForImageClassification: ['pooler.dense.weight', 'pooler.dense.bias']
- This IS expected if you are initializing TFViTForImageClassification from a PyTorch model trained on another task or with another architecture (e.g. initializing a TFBertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFViTForImageClassification from a PyTorch model that you expect to be exactly identical (e.g. initializing a TFBertForSequenceClassification model from a BertForSequenceClassification model).
Some weights or buffers of the TF 2.0 model TFViTForImageClassification were not initialized from the PyTorch model and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/4


<tf_keras.src.callbacks.History at 0x7bd0d57e1270>
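
The accuracy reported by the metric callback comes from `compute_metrics`, which picks the highest-scoring class for each image and compares it with the true label. Below is a minimal pure-Python sketch of that calculation. The scores and labels are made up, and `argmax` and `accuracy_from_logits` are hypothetical helper names, not functions from the notebook.

```python
def argmax(values):
    """Index of the largest value (the first one if there is a tie)."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy_from_logits(logits, labels):
    """Fraction of rows whose highest-scoring class equals the true label."""
    predictions = [argmax(row) for row in logits]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


# Made-up scores for four images and two classes.
logits = [[2.1, -0.3], [0.2, 1.5], [0.9, 0.4], [-1.0, 0.1]]
labels = [0, 1, 1, 1]
print(accuracy_from_logits(logits, labels))  # 3 of 4 correct -> 0.75
```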

# Save the Model

Set `model_path` to the location where you want to save your model. After this cell, you are done with this notebook.

In [7]:
model_path = '/content/drive/MyDrive/Enterprise AI/Assignment3/image_classification_model'
model.save_pretrained(model_path)
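
Before switching to the application notebook, you may want to check that the save worked. The sketch below assumes that `save_pretrained` writes a `config.json` file into the target folder and that this file contains the `id2label` mapping; `read_saved_labels` is a hypothetical helper, and the demo uses a temporary folder with an assumed `cat`/`dog` mapping rather than your real model.

```python
import json
import tempfile
from pathlib import Path


def read_saved_labels(model_dir):
    """Return the id2label mapping from model_dir/config.json, or None if the file is missing."""
    config_file = Path(model_dir) / "config.json"
    if not config_file.is_file():
        return None
    with config_file.open(encoding="utf-8") as handle:
        return json.load(handle).get("id2label")


# Demo with a stand-in folder so the sketch runs without a trained model.
with tempfile.TemporaryDirectory() as tmp:
    fake_config = {"id2label": {"0": "cat", "1": "dog"}}  # assumed labels
    (Path(tmp) / "config.json").write_text(json.dumps(fake_config), encoding="utf-8")
    saved_labels = read_saved_labels(tmp)
    missing = read_saved_labels(Path(tmp) / "does_not_exist")

print(saved_labels, missing)
```

In Colab, `read_saved_labels(model_path)` returning your class names would suggest the model folder is ready to be loaded elsewhere.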