## Shape Image Classification

Given *images of shapes*, let's try to predict which **shape** is present in a given image.

We will use a TensorFlow/Keras convolutional neural network to make our predictions.

Data Source: https://www.kaggle.com/datasets/cactus3/basicshapes

### Getting Started 

In [1]:
import numpy as np
import pandas as pd
from pathlib import Path
import os.path

from sklearn.model_selection import train_test_split

import tensorflow as tf



In [2]:
image_dir = Path('shapes')

### Creating File DataFrame

In [5]:
filepaths = list(image_dir.glob(r'**/*.png'))
filepaths

[PosixPath('shapes/circles/drawing(78).png'),
 PosixPath('shapes/circles/drawing(2).png'),
 PosixPath('shapes/circles/drawing(92).png'),
 PosixPath('shapes/circles/drawing(58).png'),
 PosixPath('shapes/circles/drawing(10).png'),
 PosixPath('shapes/circles/drawing(31).png'),
 PosixPath('shapes/circles/drawing(13).png'),
 PosixPath('shapes/circles/drawing(55).png'),
 PosixPath('shapes/circles/drawing(27).png'),
 PosixPath('shapes/circles/drawing(24).png'),
 PosixPath('shapes/circles/drawing(74).png'),
 PosixPath('shapes/circles/drawing(59).png'),
 PosixPath('shapes/circles/drawing(67).png'),
 PosixPath('shapes/circles/drawing(39).png'),
 PosixPath('shapes/circles/drawing(3).png'),
 PosixPath('shapes/circles/drawing(30).png'),
 PosixPath('shapes/circles/drawing(18).png'),
 PosixPath('shapes/circles/drawing(42).png'),
 PosixPath('shapes/circles/drawing(45).png'),
 PosixPath('shapes/circles/drawing(7).png'),
 PosixPath('shapes/circles/drawing(56).png'),
 ...

In [8]:
os.path.split(os.path.split('shapes/circles/drawing(39).png')[0])[1]

'circles'
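The nested `os.path.split` call above takes the name of the file's parent directory as its label. As a minimal sketch of the same idea, the helper below (`label_from_path` is a hypothetical name, not part of the notebook) uses `pathlib` instead; `PurePosixPath` is used only so the example behaves the same on any operating system.

```python
from pathlib import PurePosixPath


def label_from_path(path):
    """Return the name of the directory that directly contains the file."""
    return PurePosixPath(path).parent.name


example_label = label_from_path('shapes/circles/drawing(39).png')
print(example_label)  # circles
```

With real `Path` objects from `image_dir.glob(...)`, the equivalent expression would be `p.parent.name`.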

In [11]:
# The label of each image is the name of its parent directory
labels = list(map(lambda x: os.path.split(os.path.split(x)[0])[1], filepaths))
labels

['circles',
 'circles',
 'circles',
 ...

In [12]:
filepaths = pd.Series(filepaths, name='Filepath').astype(str)
labels = pd.Series(labels, name='Label')

image_df = pd.concat([filepaths, labels], axis=1)
image_df

| | Filepath | Label |
|---|---|---|
| 0 | shapes/circles/drawing(78).png | circles |
| 1 | shapes/circles/drawing(2).png | circles |
| 2 | shapes/circles/drawing(92).png | circles |
| 3 | shapes/circles/drawing(58).png | circles |
| 4 | shapes/circles/drawing(10).png | circles |
| ... | ... | ... |
| 295 | shapes/squares/drawing(96).png | squares |
| 296 | shapes/squares/drawing(11).png | squares |
| 297 | shapes/squares/drawing(37).png | squares |
| 298 | shapes/squares/drawing(81).png | squares |


In [13]:
# 70/30 train/test split; random_state makes the shuffle reproducible
train_df, test_df = train_test_split(image_df, train_size=0.7, shuffle=True, random_state=1)

In [14]:
train_df

| | Filepath | Label |
|---|---|---|
| 253 | shapes/squares/drawing(1).png | squares |
| 19 | shapes/circles/drawing(7).png | circles |
| 14 | shapes/circles/drawing(3).png | circles |
| 91 | shapes/circles/drawing(48).png | circles |
| 296 | shapes/squares/drawing(11).png | squares |
| ... | ... | ... |
| 203 | shapes/squares/drawing(58).png | squares |
| 255 | shapes/squares/drawing(80).png | squares |
| 72 | shapes/circles/drawing(53).png | circles |
| 235 | shapes/squares/drawing(97).png | squares |


In [15]:
test_df

| | Filepath | Label |
|---|---|---|
| 189 | shapes/triangles/drawing(14).png | triangles |
| 123 | shapes/triangles/drawing(4).png | triangles |
| 185 | shapes/triangles/drawing(44).png | triangles |
| 213 | shapes/squares/drawing(39).png | squares |
| 106 | shapes/triangles/drawing(13).png | triangles |
| ... | ... | ... |
| 181 | shapes/triangles/drawing(93).png | triangles |
| 290 | shapes/squares/drawing(61).png | squares |
| 244 | shapes/squares/drawing(60).png | squares |
| 197 | shapes/triangles/drawing(37).png | triangles |


### Loading Image Data

In [16]:
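# Scale pixel values from [0, 255] to [0, 1] and hold out 20% of the
# training rows for validation.
train_generator = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale = 1./255,
    validation_split = 0.2,
)

# The test generator only rescales; it needs no validation split.
test_generator = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale = 1./255
)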

In [20]:
train_images = train_generator.flow_from_dataframe(
    dataframe = train_df,
    x_col = 'Filepath',
    y_col = 'Label',
    target_size = (28, 28),
    color_mode = 'grayscale',
    class_mode = 'categorical',
    batch_size = 32,
    shuffle=True,
    seed = 42,
    subset = 'training'
)

Found 168 validated image filenames belonging to 3 classes.


In [21]:
val_images = train_generator.flow_from_dataframe(
    dataframe = train_df,
    x_col = 'Filepath',
    y_col = 'Label',
    target_size = (28, 28),
    color_mode = 'grayscale',
    class_mode = 'categorical',
    batch_size = 32,
    shuffle=True,
    seed = 42,
    subset = 'validation'
)

Found 42 validated image filenames belonging to 3 classes.


In [22]:
test_images = test_generator.flow_from_dataframe(
    dataframe = test_df,
    x_col = 'Filepath',
    y_col = 'Label',
    target_size = (28, 28),
    color_mode = 'grayscale',
    class_mode = 'categorical',
    batch_size = 32,
    shuffle=False
)

Found 90 validated image filenames belonging to 3 classes.
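The three "Found ..." counts follow from the two splits applied so far. The sketch below reproduces that arithmetic, assuming the dataset holds 300 images in total (the sum of the three reported counts); the variable names are illustrative only.

```python
# Assumed total: 300 images (168 + 42 + 90 from the "Found ..." lines).
n_total = 300

# train_test_split(train_size=0.7): 70% of the rows go to train_df.
n_train_df = n_total * 7 // 10
n_test = n_total - n_train_df

# validation_split=0.2: 20% of train_df is held out for validation.
n_val = n_train_df * 2 // 10
n_train = n_train_df - n_val

print(n_train, n_val, n_test)  # 168 42 90
```

Note that the validation images come out of the 70% training portion, so the model is actually fit on 56% of the full dataset.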


### Training

In [33]:
inputs = tf.keras.Input(shape=(28, 28, 1))
x = tf.keras.layers.Conv2D(filters=6, kernel_size=(3,3), activation='relu')(inputs)
inputs

<KerasTensor: shape=(None, 28, 28, 1) dtype=float32 (created by layer 'input_4')>

In [34]:
x

<KerasTensor: shape=(None, 26, 26, 6) dtype=float32 (created by layer 'conv2d_6')>

In [35]:
x = tf.keras.layers.MaxPool2D()(x)
x

<KerasTensor: shape=(None, 13, 13, 6) dtype=float32 (created by layer 'max_pooling2d_5')>

In [36]:
x = tf.keras.layers.Conv2D(filters=16, kernel_size=(3,3), activation='relu')(x)
x = tf.keras.layers.MaxPool2D()(x)
x

<KerasTensor: shape=(None, 5, 5, 16) dtype=float32 (created by layer 'max_pooling2d_6')>

In [37]:
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x

<KerasTensor: shape=(None, 16) dtype=float32 (created by layer 'global_average_pooling2d')>
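The spatial sizes reported above (28, 26, 13, 11 is skipped in the output, then 5) can be checked by hand. This is a small sketch of that arithmetic, assuming the Keras defaults used in the cells: `'valid'` padding, stride 1 for `Conv2D`, and a 2x2 window with stride 2 for `MaxPool2D`. The helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1):
    """Output length of a 'valid' (unpadded) convolution along one axis."""
    return (size - kernel) // stride + 1


def pool_out(size, pool=2):
    """Output length of a 'valid' pooling layer whose stride equals its window."""
    return (size - pool) // pool + 1


size = 28
trace = [size]
for layer in ("conv3", "pool2", "conv3", "pool2"):
    size = conv_out(size, 3) if layer == "conv3" else pool_out(size)
    trace.append(size)

print(trace)  # [28, 26, 13, 11, 5]
```

`GlobalAveragePooling2D` then averages each of the 16 feature maps over its 5x5 grid, which is why the final shape is `(None, 16)`.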

In [38]:
x = tf.keras.layers.Dense(64, activation='relu')(x)
x = tf.keras.layers.Dense(64, activation='relu')(x)
outputs = tf.keras.layers.Dense(3, activation='softmax')(x)
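To get a feel for how small this network is, the trainable parameters can be counted by hand. The sketch below assumes the standard formulas (one bias per filter or unit) and the layer sizes from the cells above; in the notebook itself, `model.summary()` would report these numbers once the model is built.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return (in_units + 1) * out_units


layer_params = [
    conv2d_params(3, 3, 1, 6),    # first Conv2D: 1 input channel, 6 filters
    conv2d_params(3, 3, 6, 16),   # second Conv2D: 6 input channels, 16 filters
    dense_params(16, 64),         # Dense after global average pooling
    dense_params(64, 64),         # second hidden Dense
    dense_params(64, 3),          # softmax output over 3 classes
]
total_params = sum(layer_params)

print(layer_params)   # [60, 880, 1088, 4160, 195]
print(total_params)   # 6383
```

Pooling layers contribute no parameters, so they are omitted from the list.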

In [42]:
model = tf.keras.Model(inputs=inputs, outputs=outputs)

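# Labels are one-hot encoded (class_mode='categorical'), so the matching
# loss is categorical cross-entropy.
model.compile(
    optimizer = 'adam',
    loss = 'categorical_crossentropy',
    metrics=['accuracy']
)

history = model.fit(
    train_images,
    validation_data = val_images,
    epochs=100,
    callbacks=[
        # Keep only the weights from the epoch with the best val_loss
        # (the default monitored quantity).
        tf.keras.callbacks.ModelCheckpoint('./model.h5', save_best_only=True, save_weights_only=True)
    ]
)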

Epoch 1/100




Epoch 2/100
Epoch 3/100
Epoch 4/100
...

In [43]:
# Restore the best weights saved by ModelCheckpoint during training
model.load_weights('./model.h5')
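Reloading the checkpoint matters because the weights in memory after `fit` are those of the last epoch, not necessarily the best one. The sketch below imitates what `save_best_only=True` does with its default monitor, `val_loss`: it remembers the epoch with the lowest value seen so far. The loss values are made up for illustration and are not taken from this training run.

```python
def best_epoch(val_losses):
    """Return (epoch, loss) for the lowest validation loss, with epochs counted from 1."""
    best_idx, best_loss = None, float("inf")
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:  # a checkpoint would be written at this point
            best_idx, best_loss = epoch, loss
    return best_idx, best_loss


# Hypothetical validation losses for five epochs.
example_losses = [0.9, 0.6, 0.7, 0.5, 0.55]
print(best_epoch(example_losses))  # (4, 0.5)
```

In this made-up run the last epoch is slightly worse than epoch 4, so loading the checkpoint recovers the better weights.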

### Results

In [46]:
results = model.evaluate(test_images, verbose=0)

print("    Loss: {:.4f}".format(results[0]))
print("Accuracy: {:.2f}%".format(results[1] * 100))

    Loss: 0.4033
Accuracy: 80.00%
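Overall accuracy does not show which shapes get confused with each other. A confusion matrix would; the sketch below builds one in plain Python from hypothetical integer labels (not from this model's predictions). In the notebook, the true labels would presumably come from `test_images.classes` and the predictions from `model.predict(test_images).argmax(axis=1)`, which line up because the test generator uses `shuffle=False`.

```python
def confusion_matrix(true_labels, pred_labels, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true, pred in zip(true_labels, pred_labels):
        matrix[true][pred] += 1
    return matrix


# Hypothetical labels for six images across three classes.
true_labels = [0, 0, 1, 1, 2, 2]
pred_labels = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(true_labels, pred_labels, n_classes=3)
correct = sum(cm[i][i] for i in range(3))

print(cm)                           # [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
print(correct / len(true_labels))   # accuracy is the diagonal sum over the total
```

Off-diagonal entries point to the class pairs the model mixes up.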


