# Image classification with ConvMixer

**Author:** [Sayak Paul](https://twitter.com/RisingSayak)<br>
**Date created:** 2021/10/12<br>
**Last modified:** 2021/10/12<br>
**Description:** An all-convolutional network applied to patches of images.

## Introduction

Vision Transformers (ViT; [Dosovitskiy et al.](https://arxiv.org/abs/2010.11929)) extract
small patches from the input images, linearly project them, and then apply the
Transformer ([Vaswani et al.](https://arxiv.org/abs/1706.03762)) blocks. The application
of ViTs to image recognition tasks is quickly becoming a promising area of research,
because ViTs eliminate the need to have strong inductive biases (such as convolutions) for
modeling locality. This presents them as a general computation primitive capable of
learning just from the training data with as few inductive priors as possible. ViTs
yield great downstream performance when trained with proper regularization, data
augmentation, and relatively large datasets.
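
To make the "extract patches, then linearly project" step concrete, here is a minimal NumPy sketch. The helper name `extract_patches`, the 8x8 patch size, the 256-dimensional projection, and the random projection matrix are illustrative assumptions, not values taken from the ViT paper.

```python
import numpy as np


def extract_patches(image, patch_size):
    """Split an (H, W, C) image into non-overlapping, flattened patches."""
    height, width, channels = image.shape
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    rows, cols = height // patch_size, width // patch_size
    # (rows, patch, cols, patch, C) -> (rows, cols, patch, patch, C)
    patches = image.reshape(rows, patch_size, cols, patch_size, channels)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(rows * cols, patch_size * patch_size * channels)


rng = np.random.default_rng(0)
image = rng.random((128, 128, 3))  # stand-in for one input image
patches = extract_patches(image, patch_size=8)

# Linear projection of every patch to an embedding (random weights here;
# in a real model this matrix is learned).
projection = rng.normal(size=(patches.shape[1], 256))
tokens = patches @ projection

print(patches.shape, tokens.shape)  # (256, 192) (256, 256)
```

Each row of `tokens` is one patch embedding. A convolution whose stride equals its kernel size computes the same kind of per-patch projection, which is how ConvMixer-style models typically implement this step.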

In the [Patches Are All You Need](https://openreview.net/pdf?id=TVHS5Y4dNvM) paper (note: at
the time of writing, it is a submission to the ICLR 2022 conference), the authors extend
the idea of using patches to train an all-convolutional network and demonstrate
competitive results. Their architecture, **ConvMixer**, uses recipes from recent
isotropic architectures such as ViT and MLP-Mixer
([Tolstikhin et al.](https://arxiv.org/abs/2105.01601)), for example using the same
depth and resolution across different layers in the network, residual connections,
and so on.
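
The isotropic recipe can be sketched in NumPy: every block keeps the feature map at the same resolution and width, with a residual connection around the spatial-mixing step. This is a simplified illustration under stated assumptions: batch normalization is omitted, the weights are random, and the function names (`gelu`, `depthwise_conv_same`, `mixer_block`) and sizes are hypothetical.

```python
import numpy as np


def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def depthwise_conv_same(x, kernels):
    """Filter each channel separately with zero padding.

    x has shape (H, W, C); kernels has shape (k, k, C) with odd k.
    """
    k = kernels.shape[0]
    pad = k // 2
    height, width, _ = x.shape
    padded = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    out = np.zeros_like(x)
    for i in range(k):
        for j in range(k):
            out += padded[i:i + height, j:j + width, :] * kernels[i, j, :]
    return out


def mixer_block(x, dw_kernels, pw_weights):
    # Spatial mixing (depthwise convolution) with a residual connection
    x = x + gelu(depthwise_conv_same(x, dw_kernels))
    # Channel mixing: a 1x1 convolution is a matrix product over channels
    return gelu(x @ pw_weights)


rng = np.random.default_rng(0)
features = rng.normal(size=(16, 16, 32))  # e.g. the output of a patch embedding
for _ in range(4):
    dw = rng.normal(scale=0.1, size=(5, 5, 32))
    pw = rng.normal(scale=0.1, size=(32, 32))
    features = mixer_block(features, dw, pw)

print(features.shape)  # (16, 16, 32)
```

The shape is unchanged after every block, unlike a pyramid-style CNN that downsamples and widens as depth increases.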

In this example, we will implement the ConvMixer model and train it on a four-class
image dataset loaded from a folder on Google Drive.

To use the AdamW optimizer, we need to install TensorFlow Addons:

```shell
pip install -U -q tensorflow-addons
```

In [15]:
pip install -U -q tensorflow-addons
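
For readers unfamiliar with AdamW, the sketch below shows a single update step in NumPy. The point is that weight decay is applied directly to the weights rather than being added to the gradient. The function name `adamw_step` is hypothetical, and implementations differ on whether the decay term is additionally scaled by the learning rate; this sketch applies it unscaled, which is an assumption.

```python
import numpy as np


def adamw_step(w, grad, m, v, step, lr=0.001, weight_decay=0.0001,
               beta1=0.9, beta2=0.999, eps=1e-7):
    """One AdamW update; returns the new weights and moment estimates."""
    # Standard Adam moment estimates with bias correction
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad**2
    m_hat = m / (1.0 - beta1**step)
    v_hat = v / (1.0 - beta2**step)
    adam_update = lr * m_hat / (np.sqrt(v_hat) + eps)
    # Decoupled weight decay: shrink the weights directly, independent of
    # the gradient statistics (assumed here to be unscaled by lr).
    decay = weight_decay * w
    return w - adam_update - decay, m, v


w = np.array([1.0, -2.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
w, m, v = adamw_step(w, grad=np.array([0.5, -0.5]), m=m, v=v, step=1)
print(w)
```

With plain Adam plus L2 regularization, the decay would instead be folded into `grad` and rescaled by the adaptive denominator.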

In [16]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## Imports

In [8]:
from tensorflow.keras import layers
from tensorflow import keras

import matplotlib.pyplot as plt
import tensorflow_addons as tfa
import tensorflow as tf
import numpy as np


TensorFlow Addons (TFA) has ended development and introduction of new features.
TFA has entered a minimal maintenance and release mode until a planned end of life in May 2024.
Please modify downstream libraries to take dependencies from other repositories in our TensorFlow community (e.g. Keras, Keras-CV, and Keras-NLP). 

For more information see: https://github.com/tensorflow/addons/issues/2807 



## Hyperparameters

To keep run time short, we will train the model for only 10 epochs. To focus on
the core ideas of ConvMixer, we will not use other training-specific elements like
RandAugment ([Cubuk et al.](https://arxiv.org/abs/1909.13719)). If you are interested in
learning more about those details, please refer to the
[original paper](https://openreview.net/pdf?id=TVHS5Y4dNvM).

In [3]:
learning_rate = 0.001
weight_decay = 0.0001
batch_size = 32
num_epochs = 10

In [10]:
import os

import cv2
from tensorflow.keras.utils import to_categorical

In [5]:
def load_images_from_folder(folder_path):
    """Load images from folder_path/<class_name>/ and return (images, integer labels)."""
    images = []
    labels = []
    # Each sub-folder is one class; its sorted position is the class index
    class_names = sorted(os.listdir(folder_path))
    for class_index, class_name in enumerate(class_names):
        class_path = os.path.join(folder_path, class_name)
        for filename in sorted(os.listdir(class_path)):
            img = cv2.imread(os.path.join(class_path, filename))
            if img is None:  # skip files that OpenCV cannot decode
                continue
            img = cv2.resize(img, (128, 128))  # resize images to 128x128
            images.append(img)
            labels.append(class_index)
    return np.array(images), np.array(labels)


In [11]:
data, labels = load_images_from_folder('/content/drive/MyDrive/Azymer/AlzimerSVMModel/PreprocessSeg')
num_classes = len(set(labels))
labels = to_categorical(labels, num_classes)


In [12]:
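split_ratio = 0.8
# The loader returns images grouped by class, so shuffle before splitting;
# otherwise the test split would contain only the last classes (seed is arbitrary).
shuffled = np.random.default_rng(42).permutation(len(data))
data, labels = data[shuffled], labels[shuffled]
split_index = int(split_ratio * len(data))
train_data, train_labels = data[:split_index], labels[:split_index]
test_data, test_labels = data[split_index:], labels[split_index:]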
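
A split that also preserves the class proportions on both sides can be sketched as follows. The helper name `stratified_split` and the toy label array are hypothetical; the function expects integer class labels.

```python
import numpy as np


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Return (train_idx, test_idx), splitting every class at train_fraction."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        # Shuffle the positions of this class, then cut them once
        idx = rng.permutation(np.flatnonzero(labels == cls))
        cut = int(train_fraction * len(idx))
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    # Shuffle again so the classes are interleaved within each split
    return rng.permutation(train_idx), rng.permutation(test_idx)


# Toy example: four imbalanced classes stored in class order
toy_labels = np.repeat([0, 1, 2, 3], [100, 50, 30, 20])
train_idx, test_idx = stratified_split(toy_labels)
print(np.bincount(toy_labels[train_idx]), np.bincount(toy_labels[test_idx]))
```

If the labels are already one-hot encoded, as they are after `to_categorical`, pass `labels.argmax(axis=1)` and index `data` and `labels` with the returned arrays.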


## Create the validation split

In [13]:
val_split = 0.1

val_indices = int(len(train_data) * val_split)
new_x_train, new_y_train = train_data[val_indices:], train_labels[val_indices:]
x_val, y_val = train_data[:val_indices], train_labels[:val_indices]
# Keep the whole held-out split for the final evaluation
x_test, y_test = test_data, test_labels


print(f"Training data samples: {len(new_x_train)}")
print(f"Validation data samples: {len(x_val)}") 

Training data samples: 4608
Validation data samples: 512


In [14]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Define a small baseline CNN
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(128, 128, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train on the training split and validate on the validation split,
# keeping the test split untouched for the final evaluation
model.fit(new_x_train, new_y_train, epochs=num_epochs, batch_size=batch_size, validation_data=(x_val, y_val))


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7fae1867ec70>

In [18]:
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions

# Load the ImageNet-pretrained ResNet50 under its own name so that the CNN
# trained above (`model`) is not overwritten
resnet_model = keras.applications.ResNet50(weights='imagenet')

# Load and preprocess image
img_path = '/content/drive/MyDrive/Azymer/AlzimerSVMModel/Preprocessd/Mild_Demented/0.png'
img = keras.preprocessing.image.load_img(img_path, target_size=(224,224))
x = keras.preprocessing.image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

# Make prediction
preds = resnet_model.predict(x)

# Decode the predictions and display the top 4 ImageNet classes
decoded_preds = decode_predictions(preds, top=4)[0]
for pred in decoded_preds:
    print(pred[1], ':', pred[2])


Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/imagenet_class_index.json
stopwatch : 0.1569954
pick : 0.119139
chambered_nautilus : 0.060526945
shield : 0.05736747


In [19]:
# Evaluate the trained baseline CNN on the held-out test split
test_loss, test_acc = model.evaluate(x_test, y_test)

In [20]:
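import tensorflow as tf
from tensorflow.keras import layers

# Input shape of the resized images
input_shape = (128, 128, 3)

def activation_block(x):
    # GELU followed by batch normalization, applied after every convolution
    x = layers.Activation('gelu')(x)
    return layers.BatchNormalization()(x)

def convmixer_block(x, filters, kernel_size):
    # Depthwise convolution mixes spatial locations, with a residual connection
    residual = x
    x = layers.DepthwiseConv2D(kernel_size=kernel_size, padding='same')(x)
    x = layers.Add()([activation_block(x), residual])
    # Pointwise (1x1) convolution mixes channels
    x = layers.Conv2D(filters, kernel_size=1)(x)
    return activation_block(x)

# Define the ConvMixer architecture. filters=256 and depth=8 give a ConvMixer-256/8;
# kernel_size=5 and patch_size=8 are choices made here for 128x128 inputs.
def convmixer_model(input_shape, num_classes, filters=256, depth=8, kernel_size=5, patch_size=8):
    inputs = layers.Input(shape=input_shape)
    x = layers.Rescaling(scale=1.0 / 255)(inputs)
    # Patch embedding: a convolution whose stride equals its kernel size
    x = layers.Conv2D(filters, kernel_size=patch_size, strides=patch_size)(x)
    x = activation_block(x)
    # Isotropic stack: every block keeps the same resolution and width
    for _ in range(depth):
        x = convmixer_block(x, filters, kernel_size)
    # Classification head: one prediction per image
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(units=num_classes, activation='softmax')(x)
    return tf.keras.Model(inputs=inputs, outputs=outputs)

# Create the model
model = convmixer_model(input_shape, num_classes=num_classes)

# Compile the model; one-hot labels with a softmax output call for categorical cross-entropy
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Train on the training split only, so the validation samples stay unseen
model.fit(
    new_x_train, new_y_train,
    batch_size=batch_size,
    epochs=num_epochs,
    validation_data=(x_val, y_val)
)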