# CIFAR-10 Classification Project (FCN)


#### Data preparation

- Fetching data
- Reshaping data
- Splitting the dataset into training, validation and test sets

In [15]:
from keras.utils import to_categorical
import keras
from sklearn.model_selection import train_test_split
from matplotlib import pyplot as plt



(X_train, y_train), (X_test, y_test) = keras.datasets.cifar10.load_data()

X_train = X_train.reshape(-1, 32 * 32 * 3)
X_test = X_test.reshape(-1, 32 * 32 * 3)

X_train = X_train.astype('float32') / 255
X_test = X_test.astype('float32') / 255

# Standardize each feature using the training-set mean and standard deviation
import numpy as np
mean = np.mean(X_train, axis=0)
std = np.std(X_train, axis=0) + 1e-7
X_train = (X_train - mean) / std
X_test = (X_test - mean) / std

# Hold out 20% of the training data for validation (the training part feeds the generator)
X_train_split, X_val, y_train_split, y_val = train_test_split(
    X_train, y_train, test_size=0.2, random_state=42
)

# Convert the labels to one-hot (categorical) format
y_train_split = to_categorical(y_train_split, 10)
y_val = to_categorical(y_val, 10)
y_test = to_categorical(y_test, 10)



#y_train = to_categorical(y_train, 10)
#y_test = to_categorical(y_test, 10)

print(f"Training data shape: {X_train.shape}")
print(f"Test data shape: {X_test.shape}")

Training data shape: (50000, 3072)
Test data shape: (10000, 3072)
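
The standardization step can be illustrated on a small synthetic array. This is a minimal sketch, not the project's data: the arrays `train` and `test` are hypothetical stand-ins for the flattened images. It shows that the mean and standard deviation are taken from the training data only and then reused for the test data. Note that in the cell above the statistics are computed before the validation split, so the validation rows also contribute to them; computing them after the split would avoid that.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for flattened images with values in [0, 1]
train = rng.random((1000, 12)).astype("float32")
test = rng.random((200, 12)).astype("float32")

# Per-feature statistics come from the training data only
mean = train.mean(axis=0)
std = train.std(axis=0) + 1e-7  # small epsilon guards against division by zero

train_standardized = (train - mean) / std
test_standardized = (test - mean) / std  # reuses the training statistics

# Largest deviation of the training features from mean 0 and std 1
print(np.abs(train_standardized.mean(axis=0)).max())
print(np.abs(train_standardized.std(axis=0) - 1).max())
```

Both printed values should be close to zero. The test features will be near, but not exactly at, mean 0 and standard deviation 1, because they are scaled with the training statistics.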


#### Modeling

We build the FCN (fully connected network) model here. A simple model, which reached around 50% accuracy, is kept as commented-out code. The wider and deeper model reached an accuracy of about 62%.

In [16]:
from keras import backend as K
from keras import layers
from keras import regularizers

print(K.backend())

# Simple model ~50% accuracy
'''
inputs = keras.Input(shape=(3072,))
x = layers.Dense(256, activation="relu")(inputs)
x = layers.Dense(128, activation="relu")(x)
outputs = layers.Dense(10, activation="softmax")(x)
model = keras.Model(inputs=inputs, outputs=outputs, name="cifar10_model")
'''

# Wider and deeper FCN
inputs = keras.Input(shape=(3072,))
x = layers.Dense(1024, activation="relu", kernel_regularizer=regularizers.l2(0.0005))(inputs)
x = layers.BatchNormalization()(x)
x = layers.Dropout(0.3)(x)

x = layers.Dense(512, activation="relu", kernel_regularizer=regularizers.l2(0.0005))(x)
x = layers.BatchNormalization()(x)
x = layers.Dropout(0.4)(x)

x = layers.Dense(256, activation="relu", kernel_regularizer=regularizers.l2(0.0005))(x)
x = layers.BatchNormalization()(x)
x = layers.Dropout(0.4)(x)

x = layers.Dense(128, activation="relu", kernel_regularizer=regularizers.l2(0.0005))(x)
x = layers.BatchNormalization()(x)
x = layers.Dropout(0.3)(x)

outputs = layers.Dense(10, activation="softmax")(x)
improved_model = keras.Model(inputs=inputs, outputs=outputs)

tensorflow


In [17]:
improved_model.summary()
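
The summary output is not shown here, but the parameter count of the wider model can be checked by hand. The sketch below assumes the layer widths from the model cell and the usual Keras conventions: a `Dense` layer has one weight per input-output pair plus one bias per output, and `BatchNormalization` keeps four values per feature (scale, offset, moving mean, moving variance). The helper names are illustrative.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def batchnorm_params(n_features):
    """Scale, offset, moving mean and moving variance per feature."""
    return 4 * n_features


# Input size followed by the hidden layer widths of the wider model
widths = [3072, 1024, 512, 256, 128]

total = 0
for n_in, n_out in zip(widths, widths[1:]):
    total += dense_params(n_in, n_out) + batchnorm_params(n_out)

# Final softmax layer with 10 classes
total += dense_params(widths[-1], 10)

print(total)  # 3844746
```

Almost all of the parameters (about 3.1 million) sit in the first layer, which connects the 3072 input values to 1024 units.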

In [None]:
from keras_preprocessing.image import ImageDataGenerator

# Create data generator
datagen = ImageDataGenerator(
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True
)

# Reshape to images for augmentation, then flatten again for the model.
# This creates random variations of the training images by rotating, shifting and flipping them.
# Augmentation is mostly used with CNN models, but it might also help the FCN model perform better.
def generate_augmented_batches(X, y, batch_size=128):
    X_reshaped = X.reshape(-1, 32, 32, 3)
    gen = datagen.flow(X_reshaped, y, batch_size=batch_size)
    while True:
        X_batch, y_batch = next(gen)
        yield X_batch.reshape(-1, 3072), y_batch
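
The reshape, augment, flatten round trip used by the generator can be shown with NumPy alone. This is a minimal sketch that uses a deterministic horizontal flip in place of the random transformations of `ImageDataGenerator`; the function name `flip_flat_batch` and the random test batch are illustrative assumptions.

```python
import numpy as np


def flip_flat_batch(X_flat):
    """Horizontally flip a batch of flattened 32x32x3 images."""
    images = X_flat.reshape(-1, 32, 32, 3)  # (batch, height, width, channels)
    flipped = images[:, :, ::-1, :]         # reverse the width axis
    return flipped.reshape(-1, 3072)        # back to the shape the FCN expects


rng = np.random.default_rng(0)
batch = rng.random((4, 3072)).astype("float32")

flipped = flip_flat_batch(batch)
print(flipped.shape)
# Flipping twice restores the original batch, so the round trip loses nothing
print(np.array_equal(flip_flat_batch(flipped), batch))
```

Because the images were flattened from a channels-last layout, reshaping back to `(-1, 32, 32, 3)` recovers the spatial structure that the augmentation needs.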

In [None]:
# Learning rate schedule: halve the rate when val_loss has not improved for 3 epochs
lr_scheduler = keras.callbacks.ReduceLROnPlateau(
    monitor='val_loss',
    factor=0.5,
    patience=3,
    min_lr=1e-6
)

# Different optimizer configuration
improved_model.compile(
    loss=keras.losses.CategoricalCrossentropy(),
    optimizer=keras.optimizers.SGD(learning_rate=0.01, momentum=0.9, nesterov=True),
    metrics=["accuracy"],
)

# Early stopping: stop when val_loss has not improved for 10 epochs and restore the best weights
early_stop = keras.callbacks.EarlyStopping(
    monitor='val_loss',
    patience=10,
    min_delta=0.0001,
    restore_best_weights=True
)


# Generator that yields augmented, flattened training batches
train_generator = generate_augmented_batches(X_train_split, y_train_split)

# Train with the early stopping and learning rate callbacks
history = improved_model.fit(
    train_generator,
    steps_per_epoch=len(X_train_split) // 128,
    epochs=100,
    validation_data=(X_val, y_val),
    callbacks=[early_stop, lr_scheduler]
)
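
To illustrate how the `ReduceLROnPlateau` settings above play out, here is a simplified pure-Python re-implementation of the rule. It is a sketch, not Keras's own code: the function name `simulate_reduce_lr`, the example loss values, and the `min_delta=1e-4` improvement threshold (assumed to match the callback's default) are illustrative assumptions.

```python
def simulate_reduce_lr(val_losses, lr=0.01, factor=0.5, patience=3,
                       min_lr=1e-6, min_delta=1e-4):
    """Return the learning rate in effect after each epoch."""
    best = float("inf")
    wait = 0
    rates = []
    for loss in val_losses:
        if loss < best - min_delta:
            # Validation loss improved: remember it and reset the counter
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                # Plateau detected: scale the rate, but never below min_lr
                lr = max(lr * factor, min_lr)
                wait = 0
        rates.append(lr)
    return rates


# Made-up validation losses with two plateaus
losses = [1.0, 0.9, 0.9, 0.9, 0.9, 0.8, 0.8, 0.8, 0.8]
print(simulate_reduce_lr(losses))
```

With these made-up losses the rate is halved after the fifth epoch and again after the ninth, each time following three epochs without improvement.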


#### Evaluation

The cells below plot the training and validation loss, evaluate the model on the test set, and print the test loss and accuracy. They also visualize some predictions by displaying test images next to their predicted class probabilities.

In [None]:

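# Training and validation loss per epoch
plt.plot(history.history["loss"], label="training loss")
plt.plot(history.history["val_loss"], label="validation loss")
plt.xlabel("epoch")
plt.legend()
plt.grid()
plt.show()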

In [None]:
test_scores = improved_model.evaluate(X_test, y_test, verbose=2)
print("Test loss:", test_scores[0])
print("Test accuracy:", test_scores[1])
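
The accuracy that `evaluate` reports for one-hot labels is the share of samples whose highest-probability class matches the true class. A minimal NumPy sketch of that calculation follows; the function name `accuracy` and the small three-class example are made up for illustration.

```python
import numpy as np


def accuracy(probabilities, one_hot_labels):
    """Fraction of rows where the most probable class is the true class."""
    predicted = probabilities.argmax(axis=1)
    actual = one_hot_labels.argmax(axis=1)
    return float((predicted == actual).mean())


# Made-up softmax outputs for four samples and three classes
probabilities = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.3, 0.4, 0.3],
    [0.2, 0.2, 0.6],
])
labels = np.eye(3)[[0, 1, 2, 2]]  # one-hot rows for classes 0, 1, 2, 2

print(accuracy(probabilities, labels))  # 0.75
```

The third sample is predicted as class 1 but belongs to class 2, so three of the four predictions are correct.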

In [None]:
import numpy as np

start = 2000

for k in range(10):
    plt.figure(figsize=(8, 2))

    # Get one image
    x = np.expand_dims(X_test[start + k], axis=0)

    # Predict
    y = improved_model.predict(x)[0]

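    # Show the image (undo the standardization so pixel values are back in [0, 1])
    plt.subplot(1, 2, 1)
    image = np.clip(X_test[start + k] * std + mean, 0, 1)
    plt.imshow(image.reshape(32, 32, 3))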
    plt.axis("off")

    # Show the prediction probabilities
    plt.subplot(1, 2, 2)
    plt.bar(np.arange(10), y)
    plt.xticks(range(10))

    plt.show()


#### Summary

An FCN model can be made to perform nearly as well as a simple and much lighter CNN model, but for datasets consisting of images there is no advantage to using an FCN model over a CNN model. The heavy model has about 3.8 million parameters, reached around 62% accuracy, and takes about 2 hours to fit on a CPU. The simpler model (commented out in the model code block) reached approximately 50% accuracy with a fitting time of about 2 minutes on an average CPU.

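Training is capped at 100 epochs. The early stopping callback ends the fitting process once the validation loss has not improved by at least 0.0001 for 10 consecutive epochs, which is a sign that the model has started to overfit, and it restores the weights from the best epoch.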
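
The early stopping rule can be sketched in a few lines of plain Python. This is a simplified illustration of the `patience=10`, `min_delta=0.0001` configuration used above, not Keras's implementation; the function name `early_stopping_epoch` and the loss values are made up.

```python
def early_stopping_epoch(val_losses, patience=10, min_delta=1e-4):
    """Return (epoch at which training stops, epoch with the best loss)."""
    best = float("inf")
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            # Improvement by more than min_delta: reset the patience counter
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Patience never ran out: training used all available epochs
    return len(val_losses), best_epoch


# Made-up curve: the loss improves for 5 epochs, then stays slightly worse
losses = [1.0, 0.8, 0.7, 0.65, 0.64] + [0.66] * 12
print(early_stopping_epoch(losses))  # (15, 5)
```

Training stops at epoch 15, ten epochs after the best validation loss at epoch 5. With `restore_best_weights=True`, the weights from that best epoch are the ones kept.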