# Classical image recognition using a convolutional neural network (CNN)

In [1]:
import numpy as np

# Step 1: Load training data and test data
train_dataset = "./dataset/recycled_32_train.npz"
train_data = np.load(train_dataset)

test_dataset = "./dataset/recycled_32_test.npz"
test_data = np.load(test_dataset)

The "x" entry of the dataset is a two-dimensional NumPy array of uint8 in which each
row holds one 32x32 colour image, stored in channel-first order.

The "y" entry of the dataset is a one-dimensional NumPy array of uint8 in which each
value is the label of the corresponding row of "x".
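
To make the layout concrete, here is a minimal sketch of how one flattened channel-first row maps to an image array. It assumes each row has 3 * 32 * 32 = 3072 values and is a flattened (3, 32, 32) array, which is the usual meaning of "channel first". The array `rows` is synthetic stand-in data, not the real dataset.

```python
import numpy as np

# Synthetic stand-in for the "x" array: 4 rows of 3 * 32 * 32 = 3072 uint8 values.
rng = np.random.default_rng(0)
rows = rng.integers(0, 256, size=(4, 3 * 32 * 32), dtype=np.uint8)

# Assumed layout: channel-first, i.e. each row is a flattened (3, 32, 32) array.
channel_first = rows.reshape(-1, 3, 32, 32)

# Move the channel axis to the end to get the (32, 32, 3) layout Keras uses by default.
channel_last = channel_first.transpose(0, 2, 3, 1)

# Under this layout, the value of channel c at pixel (row r, column k) sits at
# flat index c * 32 * 32 + r * 32 + k of the original row.
c, r, k = 2, 5, 7
flat_index = c * 32 * 32 + r * 32 + k
print(channel_last.shape)
print(channel_last[0, r, k, c] == rows[0, flat_index])
```

If the real file uses a different ordering, the reshape and transpose arguments would need to change accordingly.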


In [2]:
# Step 2: Extract data from npz file

# Train data
train_images = train_data["x"]
train_labels = train_data["y"]

# Test data
test_images = test_data["x"]
test_labels = test_data["y"]

In [3]:
# Step 3: Preprocess the data

# Normalize the data
train_images = train_images / 255.0
test_images = test_images / 255.0

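# Reshape the data: each row is stored channel-first (3, 32, 32),
# so restore that shape, then move the channel axis last to get (32, 32, 3)
train_images = train_images.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)
test_images = test_images.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)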

We use only three layers (a flatten layer followed by two dense layers) to keep the model simple. This keeps the comparison with the other models fair and makes the model easy to test and run in different scenarios.

In [4]:
import tensorflow as tf

# Build the model

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(32, 32, 3)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(5, activation="softmax")
])
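
As a rough size check, the number of trainable parameters follows from the layer shapes: a dense layer with n inputs and m units has n * m weights plus m biases. The sketch below does that arithmetic in plain Python; the helper name `dense_params` is made up for illustration.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_units + n_units

flat_size = 32 * 32 * 3                  # Flatten output: 3072 values, no parameters
hidden = dense_params(flat_size, 256)    # Dense(256)
output = dense_params(256, 5)            # Dense(5)
total = hidden + output

print(hidden, output, total)
```

Calling `model.summary()` on the model above should report the same total.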


We still use the Adam optimizer, and the loss function is still cross-entropy, because we want to achieve the best performance with the simplest set of layers.

In [5]:
# Compile the model

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
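
For reference, sparse categorical cross-entropy takes integer class labels and the predicted probability vectors, and averages the negative log-probability assigned to the true class. Below is a minimal NumPy sketch of that idea. The function and the toy inputs are illustrative only, and the Keras implementation handles more cases than this.

```python
import numpy as np

def sparse_categorical_crossentropy(labels, probs, eps=1e-7):
    """Mean negative log-probability of the true class."""
    labels = np.asarray(labels)
    # Clip to avoid log(0) when a probability is exactly zero.
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0)
    true_class_probs = probs[np.arange(len(labels)), labels]
    return float(-np.log(true_class_probs).mean())

# Toy example with 5 classes, matching the 5-unit softmax output.
probs = np.array([
    [0.7, 0.1, 0.1, 0.05, 0.05],   # true class 0: confident and correct
    [0.2, 0.2, 0.2, 0.2, 0.2],     # true class 3: uniform guess
])
labels = np.array([0, 3])
loss = sparse_categorical_crossentropy(labels, probs)
print(loss)
```

A uniform guess over 5 classes gives a loss of ln 5, roughly 1.609, which is a useful baseline when reading the training log.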

In [6]:

import time

start = time.time()

# Train the model
model.fit(train_images, train_labels, epochs=50, verbose=2)

# Elapsed wall-clock time of the training run
elapsed = time.time() - start

print("Training time:", elapsed, "seconds")

Epoch 1/50
313/313 - 1s - loss: 1.3604 - accuracy: 0.4414 - 997ms/epoch - 3ms/step
Epoch 2/50
313/313 - 1s - loss: 1.1259 - accuracy: 0.5455 - 1s/epoch - 4ms/step
Epoch 3/50
313/313 - 1s - loss: 1.0731 - accuracy: 0.5781 - 1s/epoch - 4ms/step
Epoch 4/50
313/313 - 1s - loss: 1.0428 - accuracy: 0.5872 - 738ms/epoch - 2ms/step
Epoch 5/50
313/313 - 1s - loss: 1.0015 - accuracy: 0.6066 - 1s/epoch - 3ms/step
Epoch 6/50
313/313 - 1s - loss: 0.9572 - accuracy: 0.6145 - 729ms/epoch - 2ms/step
Epoch 7/50
313/313 - 1s - loss: 0.9228 - accuracy: 0.6357 - 710ms/epoch - 2ms/step
Epoch 8/50
313/313 - 1s - loss: 0.8851 - accuracy: 0.6508 - 710ms/epoch - 2ms/step
Epoch 9/50
313/313 - 1s - loss: 0.8567 - accuracy: 0.6596 - 706ms/epoch - 2ms/step
Epoch 10/50
313/313 - 1s - loss: 0.8321 - accuracy: 0.6789 - 732ms/epoch - 2ms/step
Epoch 11/50
313/313 - 1s - loss: 0.8057 - accuracy: 0.6882 - 709ms/epoch - 2ms/step
Epoch 12/50
313/313 - 1s - loss: 0.7858 - accuracy: 0.6965 - 723ms/epoch - 2ms/step
Epoch 13/5

We time the training and testing processes and record the accuracy of the model, so that we can compare these figures with the other models and then present the conclusions and analysis visually.

In [7]:
# Evaluate the model

test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)

print("\nTest accuracy:", test_acc)

47/47 - 0s - loss: 0.6886 - accuracy: 0.7347 - 152ms/epoch - 3ms/step

Test accuracy: 0.734666645526886
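
One way to collect the timing and accuracy figures in a form that is easy to compare across models is a small timing helper. This is only a sketch: `timed`, `results`, and the dummy workload are hypothetical names, and in the notebook the timed call would be `model.fit(...)` or `model.evaluate(...)`.

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Dummy workload standing in for model.evaluate.
def dummy_evaluate():
    return 0.5, 0.5   # (loss, accuracy) placeholders, not real results

(loss, acc), seconds = timed(dummy_evaluate)

# One record per model makes later comparison and plotting straightforward.
results = {"classical": {"test_accuracy": acc, "eval_seconds": seconds}}
print(results)
```

`time.perf_counter` is used here because it is a monotonic clock intended for measuring short durations.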


In [8]:
# Save the model

model.save("./model/classical_model.h5")


Once the model is saved, we can load it to predict the class of new images. We can also run it on the images in the test dataset and compare the predictions with the true labels to check the accuracy of the model.
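
The comparison between predictions and true labels can be sketched as follows. To keep the example self-contained, `probs` is a hand-written stand-in for the output of `model.predict(test_images)` and `labels` is a stand-in for `test_labels`. With the real model, the probabilities would come from the network loaded via `tf.keras.models.load_model`.

```python
import numpy as np

# Stand-in for model.predict(test_images): one row of 5 class probabilities per image.
probs = np.array([
    [0.1, 0.6, 0.1, 0.1, 0.1],
    [0.3, 0.2, 0.2, 0.2, 0.1],
    [0.05, 0.05, 0.1, 0.1, 0.7],
    [0.2, 0.2, 0.4, 0.1, 0.1],
])
# Stand-in for test_labels.
labels = np.array([1, 0, 4, 3], dtype=np.uint8)

predicted = probs.argmax(axis=1)   # most probable class per image
correct = predicted == labels      # element-wise comparison with the true labels
accuracy = correct.mean()          # fraction of correct predictions

print("Predicted:", predicted)
print("Accuracy:", accuracy)
print("Misclassified indices:", np.flatnonzero(~correct))
```

Listing the misclassified indices makes it easy to inspect the images the model gets wrong.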

The final model will be published in the releases of the repository, so it can be freely accessed and tested.

*All training and testing was done on a 2020 MacBook Air (M1, 16 GB RAM).*