# **Introduction**
We are going to build a classifier for rock, paper, and scissors images using a Convolutional Neural Network (CNN) with the help of TensorFlow and Keras. We are also going to use hyperparameter tuning to help find the optimal model. Happy exploring!

# **Library**
## Import Libraries and Packages
The main libraries for this project are TensorFlow and its Keras package. So, the first thing you need to do is import TensorFlow (make sure it is already installed); Keras then becomes available as `tf.keras`. Then, to create new data from our dataset, we use ImageDataGenerator from Keras for the image augmentation step.

Note: Some libraries are imported in the code cells below this section rather than here, because I want to show what those libraries do where they are used.

In [1]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# **Data Preparation**
## Download and Extract The Dataset
Next, we are going to download the dataset using the wget command from the link provided by my learning platform. You may use it as well if you run this notebook in Google Colab.

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


Extract the zip file

In [4]:
import zipfile
# Open the zip file
with zipfile.ZipFile('/content/drive/MyDrive/FER13_4_cleaner.zip', 'r') as zip_ref:
    # Extract all the files to the current directory
    zip_ref.extractall()
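
`extractall()` with no argument unpacks into the current working directory, and the folder names that appear depend on what is stored inside the archive, not on the archive's file name. A minimal sketch, using a throwaway archive built in a temporary directory (all file and folder names here are made up for illustration), shows how to extract to an explicit target and list the top-level entries before hard-coding dataset paths:

```python
import tempfile
import zipfile
from pathlib import Path


def extract_and_list(zip_path, target_dir):
    """Extract zip_path into target_dir and return the sorted top-level entry names."""
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        zip_ref.extractall(target_dir)
    return sorted(p.name for p in Path(target_dir).iterdir())


# Build a tiny stand-in archive so the sketch runs anywhere (hypothetical names).
work_dir = Path(tempfile.mkdtemp())
demo_zip = work_dir / "demo_dataset.zip"
with zipfile.ZipFile(demo_zip, "w") as zf:
    zf.writestr("dataset/train/happy/img0.txt", "placeholder")
    zf.writestr("dataset/test/happy/img1.txt", "placeholder")

top_level = extract_and_list(demo_zip, work_dir / "extracted")
print(top_level)
```

Checking the listed folder name against the `root=` paths used later is a cheap way to catch a mismatch early.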

In [6]:
import torch
import torchvision
import torchvision.transforms as transforms
from transformers import AutoImageProcessor, AutoModelForImageClassification

# Define the pre-trained model and processor
model_name = "trpakov/vit-face-expression"
processor = AutoImageProcessor.from_pretrained(model_name)
model = AutoModelForImageClassification.from_pretrained(model_name)

# Replace the classification head with a new 5-class linear layer
model.classifier = torch.nn.Linear(model.config.hidden_size, 5)

# Define the data transforms using the processor
def preprocess_image(image):
    inputs = processor(image, return_tensors="pt")
    return inputs["pixel_values"].squeeze(0)

# Load the data using ImageFolder
train_dataset = torchvision.datasets.ImageFolder(root='/content/FER13_cleaner/train', transform=preprocess_image)
test_dataset = torchvision.datasets.ImageFolder(root='/content/FER13_cleaner/test', transform=preprocess_image)
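
`ImageFolder` derives labels from the directory layout: each subfolder of `root` is a class, and class indices follow the sorted order of the folder names. The sketch below reproduces that convention with the standard library only, so it is an illustration rather than torchvision's own code; the five class names are hypothetical placeholders, not the actual folders in this dataset.

```python
import os
import tempfile


def class_to_index(root):
    """Map each class subfolder of root to an index, in sorted name order."""
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    return {name: idx for idx, name in enumerate(classes)}


# Hypothetical class folders, created in a temporary directory.
root = tempfile.mkdtemp()
for name in ["sad", "angry", "neutral", "happy", "surprise"]:
    os.makedirs(os.path.join(root, name))

mapping = class_to_index(root)
print(mapping)
```

The number of folders should match the output size of the replaced classifier (5 here). In torchvision the corresponding mapping is exposed as `train_dataset.class_to_idx`.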


In [7]:
# Define the data loaders
batch_size = 32
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

# Set the device (GPU or CPU)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Define the optimizer and loss function
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
criterion = torch.nn.CrossEntropyLoss()
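
`CrossEntropyLoss` takes raw logits and integer class labels. For a single example it is the negative log of the softmax probability assigned to the true class, and by default the per-example losses are averaged over the batch. A small standard-library sketch of that formula, with made-up logits and labels, may help when reading the loss values in the training log:

```python
import math


def cross_entropy(logits, target):
    """Loss for one example: log-sum-exp of the logits minus the target logit."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target]


def batch_cross_entropy(batch_logits, targets):
    """Mean loss over a batch (the default 'mean' reduction)."""
    losses = [cross_entropy(row, t) for row, t in zip(batch_logits, targets)]
    return sum(losses) / len(losses)


# Illustrative values only: two examples, five classes each.
batch_logits = [[2.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0]]
targets = [0, 3]
loss = batch_cross_entropy(batch_logits, targets)
print(f"{loss:.4f}")
```

With all-zero logits the loss equals `log(5)`, which is the value to expect from an untrained 5-class head that guesses uniformly.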


In [8]:
# Train the model
for epoch in range(5):
    model.train()
    total_correct = 0
    total_loss = 0
    for batch in train_loader:
        images, labels = batch
        images = images.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs.logits, labels)
        loss.backward()
        optimizer.step()
        _, predicted = torch.max(outputs.logits, 1)
        total_correct += (predicted == labels).sum().item()
        total_loss += loss.item()
    accuracy = total_correct / len(train_loader.dataset)
    print(f"Epoch {epoch+1}, Train Accuracy: {accuracy:.4f}, Train Loss: {total_loss / len(train_loader)}")

    model.eval()
    # Evaluate the model on the test set
    with torch.no_grad():
        total_correct = 0
        for batch in test_loader:
            images, labels = batch
            images = images.to(device)
            labels = labels.to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs.logits, 1)
            total_correct += (predicted == labels).sum().item()
        accuracy = total_correct / len(test_loader.dataset)
        print(f"Epoch {epoch+1}, Test Accuracy: {accuracy:.4f}")

Epoch 1, Train Accuracy: 0.9643, Train Loss: 0.19622654114781174
Epoch 1, Test Accuracy: 0.7651
Epoch 2, Train Accuracy: 0.9785, Train Loss: 0.08206585861940582
Epoch 2, Test Accuracy: 0.7434
Epoch 3, Train Accuracy: 0.9827, Train Loss: 0.06336910536474394
Epoch 3, Test Accuracy: 0.7634
Epoch 4, Train Accuracy: 0.9862, Train Loss: 0.049064329871668214
Epoch 4, Test Accuracy: 0.7571
Epoch 5, Train Accuracy: 0.9890, Train Loss: 0.039747174446252184
Epoch 5, Test Accuracy: 0.7628
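
In the log above, training accuracy rises to 0.9890 while test accuracy stays between 0.74 and 0.77, which suggests the model is overfitting from the first epoch onward. One common response, not used in this notebook, is to keep the checkpoint from the epoch with the best held-out accuracy. The sketch below applies that selection to the numbers printed above; in practice the selection would normally use a separate validation split rather than the test set.

```python
# Accuracies copied from the training log above (epochs 1 to 5).
train_acc = [0.9643, 0.9785, 0.9827, 0.9862, 0.9890]
test_acc = [0.7651, 0.7434, 0.7634, 0.7571, 0.7628]


def best_epoch(scores):
    """Return (1-based epoch, score) for the highest score; the earliest epoch wins ties."""
    index = max(range(len(scores)), key=lambda i: scores[i])
    return index + 1, scores[index]


epoch, score = best_epoch(test_acc)
gaps = [round(tr - te, 4) for tr, te in zip(train_acc, test_acc)]

print(f"best epoch: {epoch} (test accuracy {score:.4f})")
print("train/test gap per epoch:", gaps)
```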


In [9]:
# Save the model
torch.save(model.state_dict(), 'swin_transformer_model.pth')

In [10]:
model.load_state_dict(torch.load('/content/swin_transformer_model.pth', map_location='cpu'))
model.eval()

ViTForImageClassification(
  (vit): ViTModel(
    (embeddings): ViTEmbeddings(
      (patch_embeddings): ViTPatchEmbeddings(
        (projection): Conv2d(3, 768, kernel_size=(16, 16), stride=(16, 16))
      )
      (dropout): Dropout(p=0.0, inplace=False)
    )
    (encoder): ViTEncoder(
      (layer): ModuleList(
        (0-11): 12 x ViTLayer(
          (attention): ViTSdpaAttention(
            (attention): ViTSdpaSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.0, inplace=False)
            )
            (output): ViTSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.0, inplace=False)
            )
          )
          (intermediate): ViTIntermediate(
            (dense): Linear(in_fe

In [11]:
sample_input = torch.rand((1, 3, 224, 224))

In [12]:
# Move the model to the GPU if one is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)


In [13]:
!pip install onnx
torch.onnx.export(
    model,                    # PyTorch model
    sample_input.to(device),  # Example input tensor
    'odel-swin.onnx',         # Output file (e.g. 'output_model.onnx')
    opset_version=14,         # ONNX operator set version
    export_params=True,       # Store the trained weights in the file
    verbose=True              # Show debug messages
)

Collecting onnx
  Downloading onnx-1.16.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (15.9 MB)

  if num_channels != self.num_channels:
  if height != self.image_size[0] or width != self.image_size[1]:


In [14]:
import onnx

# Load the ONNX model
model = onnx.load("odel-swin.onnx")

# Check that the IR is well formed
onnx.checker.check_model(model)

# Print a Human readable representation of the graph
onnx.helper.printable_graph(model.graph)

'graph main_graph (\n  %input.1[FLOAT, 1x3x224x224]\n) initializers (\n  %vit.embeddings.cls_token[FLOAT, 1x1x768]\n  %vit.embeddings.position_embeddings[FLOAT, 1x197x768]\n  %vit.embeddings.patch_embeddings.projection.weight[FLOAT, 768x3x16x16]\n  %vit.embeddings.patch_embeddings.projection.bias[FLOAT, 768]\n  %vit.encoder.layer.0.attention.attention.query.bias[FLOAT, 768]\n  %vit.encoder.layer.0.attention.attention.key.bias[FLOAT, 768]\n  %vit.encoder.layer.0.attention.attention.value.bias[FLOAT, 768]\n  %vit.encoder.layer.0.attention.output.dense.bias[FLOAT, 768]\n  %vit.encoder.layer.0.intermediate.dense.bias[FLOAT, 3072]\n  %vit.encoder.layer.0.output.dense.bias[FLOAT, 768]\n  %vit.encoder.layer.0.layernorm_before.weight[FLOAT, 768]\n  %vit.encoder.layer.0.layernorm_before.bias[FLOAT, 768]\n  %vit.encoder.layer.0.layernorm_after.weight[FLOAT, 768]\n  %vit.encoder.layer.0.layernorm_after.bias[FLOAT, 768]\n  %vit.encoder.layer.1.attention.attention.query.bias[FLOAT, 768]\n  %vit.enc
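
The shapes in this graph dump can be checked by hand. The patch projection is a 16x16 convolution with stride 16, so a 224x224 input is cut into 14 x 14 = 196 patches; adding the class token gives the 197 positions seen in `position_embeddings[FLOAT, 1x197x768]`. A short sketch of that arithmetic (the function name is only for illustration):

```python
def vit_sequence_length(image_size, patch_size, extra_tokens=1):
    """Number of tokens a ViT sees: one per non-overlapping patch plus extra tokens."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side ** 2 + extra_tokens


# Values taken from the model printout: 224x224 input, 16x16 patches, one class token.
seq_len = vit_sequence_length(224, 16)
print(seq_len)  # 197
```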

In [15]:
!pip install onnxruntime

Collecting onnxruntime
  Downloading onnxruntime-1.18.0-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (6.8 MB)
Collecting coloredlogs (from onnxruntime)
  Downloading coloredlogs-15.0.1-py2.py3-none-any.whl (46 kB)
Collecting humanfriendly>=9.1 (from coloredlogs->onnxruntime)
  Downloading humanfriendly-10.0-py2.py3-none-any.whl (86 kB)
Installing collected packages: humanfriendly, coloredlogs, onnxruntime
Successfully installed coloredlogs-15.0.1 humanfriendly-10.0 onnxruntime-1.18.0


In [16]:
import onnxruntime as ort
import numpy as np
ort_session = ort.InferenceSession('odel-swin.onnx')
outputs = ort_session.run(
    None,
    {'input.1': np.random.randn(1, 3, 224, 224).astype(np.float32)}
)

In [17]:
# A `cd` on one `!` line does not carry over to the next `!` line,
# so point pip at the cloned directory explicitly.
!git clone https://github.com/onnx/onnx-tensorflow.git
!pip install -e ./onnx-tensorflow

Cloning into 'onnx-tensorflow'...
remote: Enumerating objects: 6516, done.
remote: Counting objects: 100% (465/465), done.
remote: Compressing objects: 100% (200/200), done.
remote: Total 6516 (delta 326), reused 383 (delta 261), pack-reused 6051
Receiving objects: 100% (6516/6516), 1.98 MiB | 13.90 MiB/s, done.
Resolving deltas: 100% (5051/5051), done.

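In a notebook, every `!` line is executed in its own subshell, so a directory change made with `cd` on one `!` line is gone by the time the next `!` line runs. The sketch below reproduces this with `subprocess` (assuming a POSIX shell is available): the child shell changes directory, but the parent process stays where it was.

```python
import os
import subprocess
import tempfile

start_dir = os.getcwd()
other_dir = tempfile.mkdtemp()

# The child shell changes directory, reports where it is, then exits.
result = subprocess.run(
    f'cd "{other_dir}" && pwd',
    shell=True, capture_output=True, text=True, check=True,
)
child_dir = result.stdout.strip()

# The parent process (the notebook kernel, in the Colab case) has not moved.
print("child shell was in:", child_dir)
print("parent is still in:", os.getcwd())
```

To change the kernel's working directory persistently, IPython offers the `%cd` magic; alternatively, pass the path to the command directly, as in `pip install -e ./onnx-tensorflow`.
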
In [18]:
import onnx

onnx_model = onnx.load('odel-swin.onnx')

In [19]:
!pip install onnx-tf
from onnx_tf.backend import prepare

Collecting onnx-tf
  Downloading onnx_tf-1.10.0-py3-none-any.whl (226 kB)
Collecting tensorflow-addons (from onnx-tf)
  Downloading tensorflow_addons-0.23.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (611 kB)
Collecting typeguard<3.0.0,>=2.7 (from tensorflow-addons->onnx-tf)
  Downloading typeguard-2.13.3-py3-none-any.whl (17 kB)
Installing collected packages: typeguard, tensorflow-addons, onnx-tf
Successfully installed onnx-tf-1.10.0 tensorflow-addons-0.23.0 typeguard-2.13.3



TensorFlow Addons (TFA) has ended development and introduction of new features.
TFA has entered a minimal maintenance and release mode until a planned end of life in May 2024.
Please modify downstream libraries to take dependencies from other repositories in our TensorFlow community (e.g. Keras, Keras-CV, and Keras-NLP). 

For more information see: https://github.com/tensorflow/addons/issues/2807 



In [2]:
import onnx
from onnx import helper

onnx_model = onnx.load('odel-swin.onnx')

# Define a mapping from old names to new names
name_map = {"input.1": "input_1"}

# Initialize a list to hold the new inputs
new_inputs = []

# Iterate over the inputs and change their names if needed
for inp in onnx_model.graph.input:
    if inp.name in name_map:
        # Create a new ValueInfoProto with the new name
        new_inp = helper.make_tensor_value_info(name_map[inp.name],
                                                inp.type.tensor_type.elem_type,
                                                [dim.dim_value for dim in inp.type.tensor_type.shape.dim])
        new_inputs.append(new_inp)
    else:
        new_inputs.append(inp)

# Clear the old inputs and add the new ones
onnx_model.graph.ClearField("input")
onnx_model.graph.input.extend(new_inputs)

# Go through all nodes in the model and replace the old input name with the new one
for node in onnx_model.graph.node:
    for i, input_name in enumerate(node.input):
        if input_name in name_map:
            node.input[i] = name_map[input_name]

# Save the renamed ONNX model
onnx.save(onnx_model, 'model-swin.onnx')

ModuleNotFoundError: No module named 'onnx'

In [21]:
import onnx

# Load the ONNX model
model = onnx.load('model-swin.onnx')

# Print the names of the inputs
for input in model.graph.input:
    print("Input name:", input.name)


Input name: input_1


In [22]:
# Create the input tensor
input_tensor = np.random.randn(1, 3, 224, 224).astype(np.float32)

# Prepare the TensorFlow representation
tf_rep = prepare(onnx_model, input_shapes={'input.1': [1, 3, 224, 224]})

In [23]:
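# export_graph writes a TensorFlow SavedModel directory; despite the .h5 suffix,
# 'modelbismillah.h5' is a folder, not an HDF5 file.
tf_rep.export_graph('modelbismillah.h5')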

In [24]:
import tensorflow as tf

model = tf.saved_model.load('modelbismillah.h5')
model.trainable = False
input_tensor = tf.random.uniform([1, 3, 224, 224])
out = model(input_1=input_tensor)

In [1]:
import tensorflow as tf

try:
    # Convert the TensorFlow model to TensorFlow Lite
    converter = tf.lite.TFLiteConverter.from_saved_model("modelbismillah.h5")

    # Enable TensorFlow Select ops
    converter.target_spec.supported_ops = [
        tf.lite.OpsSet.TFLITE_BUILTINS,  # Enable TensorFlow Lite ops.
        tf.lite.OpsSet.SELECT_TF_OPS     # Enable TensorFlow Select ops.
    ]

    tflite_model = converter.convert()

    # Save the TensorFlow Lite model
    with open("simple_model.tflite", "wb") as f:
        f.write(tflite_model)

    print("Model conversion successful and saved as 'simple_model.tflite'.")
except Exception as e:
    print("Error during model conversion:", str(e))

Error during model conversion: SavedModel file does not exist at: modelbismillah.h5/{saved_model.pbtxt|saved_model.pb}
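
The error message above names what `from_saved_model` looks for: a directory containing `saved_model.pb` or `saved_model.pbtxt`. Because the cell is numbered `In [1]`, it was probably run in a fresh session where the directory exported earlier no longer existed. A small pre-flight check, sketched here with the standard library only (the helper name is hypothetical), can make that failure easier to diagnose:

```python
import os
import tempfile


def looks_like_saved_model(path):
    """True if path is a directory holding saved_model.pb or saved_model.pbtxt."""
    if not os.path.isdir(path):
        return False
    return any(
        os.path.isfile(os.path.join(path, name))
        for name in ("saved_model.pb", "saved_model.pbtxt")
    )


# Demonstration with throwaway directories (no TensorFlow needed).
empty_dir = tempfile.mkdtemp()
fake_export = tempfile.mkdtemp()
open(os.path.join(fake_export, "saved_model.pb"), "wb").close()

print(looks_like_saved_model(empty_dir))    # False
print(looks_like_saved_model(fake_export))  # True
```

Calling such a check on `"modelbismillah.h5"` before building the converter would report a missing export directly instead of surfacing it as a conversion error.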


In [26]:
import numpy as np
import tensorflow as tf

# Load the TFLite model and allocate tensors
interpreter = tf.lite.Interpreter(model_path="simple_model.tflite")
interpreter.allocate_tensors()

# Get input and output tensors
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Test the model on random input data
input_shape = input_details[0]['shape']
input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)

interpreter.invoke()

# get_tensor() returns a copy of the tensor data
# use tensor() in order to get a pointer to the tensor
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)

[[ 1.2821207   0.77802527  0.40348542 -0.60586274 -2.4083736 ]]
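
The five numbers printed above are raw logits, one per class, produced here from random input. To read such an output as a prediction, apply a softmax and take the index of the largest value. A minimal sketch using the printed values (mapping the index back to a class name depends on the training folders, which are not shown here):

```python
import math

# Logits copied from the interpreter output above.
logits = [1.2821207, 0.77802527, 0.40348542, -0.60586274, -2.4083736]


def softmax(values):
    """Convert logits to probabilities, subtracting the max for stability."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax(logits)
predicted_index = max(range(len(probs)), key=lambda i: probs[i])

print([round(p, 3) for p in probs])
print("predicted class index:", predicted_index)
```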
