The following code uses a 2D Convolutional Neural Network (CNN) to classify a single image as showing a page being flipped or not. The model consists of two convolutional layers, each with three 3×3 filters and ReLU activation, and each followed by a 2×2 max pooling layer that reduces the spatial dimensions while retaining the strongest activations. The pooled feature maps are then flattened and passed through a 32-unit dense layer to a single sigmoid output.
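To make the architecture concrete, the sketch below traces the feature-map size and parameter count through the layers by hand. It assumes the 128×128 RGB input set in the code further down and Keras' default `"valid"` padding and stride of 1 for `Conv2D`; the helper names are illustrative and not part of the notebook.

```python
# Hand calculation of layer output sizes and trainable parameters.
# Assumes a 128x128x3 input, "valid" padding, stride 1, and 2x2 pooling.

def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool

def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * out_channels

def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return (n_in + 1) * n_out

side = 128      # input height and width
channels = 3    # RGB input
total = 0

# Two Conv2D(3 filters, 3x3) + MaxPooling2D(2x2) stages
for _ in range(2):
    total += conv_params(channels, 3)
    channels = 3
    side = pool_out(conv_out(side))

flat = side * side * channels                        # size after Flatten
total += dense_params(flat, 32) + dense_params(32, 1)

print(f"Final feature map: {side}x{side}x{channels}")
print(f"Flattened length:  {flat}")
print(f"Trainable params:  {total}")
```

Under these assumptions almost all of the parameters sit in the first dense layer rather than in the convolutional filters, and the totals can be checked against the output of `model.summary()`.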

During training, the 2,392 training images are processed in batches of 32, resulting in 75 batches per epoch. For each batch, the model performs forward propagation, computes the binary cross-entropy loss, and updates the filter and dense layer weights using the Adam optimizer, a variant of gradient descent. This process is repeated for 10 epochs, with the weights learned in each epoch carried forward to the next.
After each epoch, the model is evaluated on the separate testing dataset, which is passed to `fit` as `validation_data`, to compute validation accuracy and validation loss. These testing images are not used for learning and do not affect the model’s weights. After training is complete, the final learned weights are used to generate predictions on the full test dataset.
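The batch arithmetic can be checked with a few lines of Python. This is a small sketch that uses the image counts reported in the output below (2392 training and 597 testing images); the variable names are illustrative.

```python
import math

train_images = 2392   # from "Found 2392 images belonging to 2 classes."
test_images = 597     # from "Found 597 images belonging to 2 classes."
batch_size = 32
epochs = 10

# The last batch is smaller when the count is not a multiple of batch_size,
# so the number of steps is rounded up.
train_steps = math.ceil(train_images / batch_size)
last_batch = train_images - (train_steps - 1) * batch_size
test_steps = math.ceil(test_images / batch_size)

# One weight update per training batch
total_updates = train_steps * epochs

print(f"Training steps per epoch: {train_steps} (last batch has {last_batch} images)")
print(f"Evaluation steps:         {test_steps}")
print(f"Weight updates in total:  {total_updates}")
```

The 75 steps match the `75/75` shown on each epoch line of the training log.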

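The model achieves a Test F1 score of 97.42%, indicating strong performance. The F1 score is an appropriate evaluation metric for this problem because it is the harmonic mean of precision and recall, so it penalizes both false positives and false negatives, and both kinds of error matter in page flip detection. Note that the generator assigns `flip` to label 0 and `notflip` to label 1, and `f1_score` treats label 1 as the positive class by default, so the reported score is computed with `notflip` as the positive class: a false positive here is an actual flip classified as `notflip` (a missed page flip), and a false negative is a non-flip classified as `flip`.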
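As an illustration of how the metric is built, the sketch below computes F1 from true positives, false positives, and false negatives for a chosen positive class. The labels are a made-up toy example, not the notebook's predictions, and `f1_for_class` is a hypothetical helper rather than the scikit-learn function used in the code.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score treating `positive` as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)

# Toy labels only: 0 = flip, 1 = notflip (same mapping as class_indices)
y_true = [0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 1, 1, 0]

f1_notflip = f1_for_class(y_true, y_pred, positive=1)
f1_flip = f1_for_class(y_true, y_pred, positive=0)

print(f"F1 with notflip as positive: {f1_notflip:.4f}")
print(f"F1 with flip as positive:    {f1_flip:.4f}")
```

The two values differ on the same predictions, which is why it helps to state which class is treated as positive. If the score for the `flip` class is the one of interest, `sklearn.metrics.f1_score` accepts a `pos_label` argument for that purpose.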

In [None]:
# Hide TensorFlow and Keras low-level logs (GPU, device, paths)
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

# Hide Python warnings (Keras input shape warning, etc.)
import warnings
warnings.filterwarnings("ignore")

# Import required packages after setting log level
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from sklearn.metrics import f1_score


def clean_path(path_input):
    """
    Remove unwanted quotes from copied macOS folder paths.
    """
    return path_input.replace("'", "").replace('"', "").strip()


def create_generators(train_dir, test_dir, img_size, batch_size):
    """
    Create training and testing image generators for loading data.
    """

    # Normalize pixel values from [0,255] to [0,1]
    train_datagen = ImageDataGenerator(rescale=1.0 / 255.0)
    test_datagen = ImageDataGenerator(rescale=1.0 / 255.0)

    # Generator for training images (shuffled)
    train_gen = train_datagen.flow_from_directory(
        directory=train_dir,
        target_size=img_size,
        batch_size=batch_size,
        class_mode="binary",
        shuffle=True
    )

    # Generator for test images (not shuffled)
    test_gen = test_datagen.flow_from_directory(
        directory=test_dir,
        target_size=img_size,
        batch_size=batch_size,
        class_mode="binary",
        shuffle=False
    )

    return train_gen, test_gen


def build_cnn_model(img_height, img_width):
    """
    Build a simple CNN with two convolutional layers of 3 filters each,
    followed by a 32-unit dense layer and a sigmoid output.
    """

    # Use an explicit Input layer to avoid warnings
    model = models.Sequential([
        layers.Input(shape=(img_height, img_width, 3)),

        layers.Conv2D(filters=3, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),

        layers.Conv2D(filters=3, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),

        layers.Flatten(),
        layers.Dense(32, activation="relu"),
        layers.Dense(1, activation="sigmoid")
    ])

    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"]
    )

    return model


def evaluate_f1(model, test_gen):
    """
    Compute F1 score on the test set predictions.
    """

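    # Restart the generator so predictions line up with test_gen.classes
    # (this relies on the test generator being created with shuffle=False)
    test_gen.reset()
    y_prob = model.predict(test_gen, verbose=0)
    # Threshold the sigmoid output at 0.5 to get hard 0/1 labels
    y_pred = (y_prob > 0.5).astype("int32").ravel()
    y_true = test_gen.classes

    # f1_score uses pos_label=1 by default, i.e. the class mapped to index 1
    return f1_score(y_true, y_pred)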


# ===============================
# MAIN EXECUTION
# ===============================

img_height = 128
img_width = 128
img_size = (img_height, img_width)

batch_size = 32
epochs = 10

raw_train_dir = input("Paste training folder path: ")
raw_test_dir = input("Paste testing folder path: ")

train_dir = clean_path(raw_train_dir)
test_dir = clean_path(raw_test_dir)

train_gen, test_gen = create_generators(
    train_dir, test_dir, img_size, batch_size
)

# Print only non-sensitive info
print(f"Training images: {train_gen.samples}")
print(f"Testing images: {test_gen.samples}")
print("Class indices:", train_gen.class_indices)

model = build_cnn_model(img_height, img_width)

# Print the model summary (comment this out for less output)
model.summary()

# Train with less verbose output to reduce printed logs
model.fit(
    train_gen,
    epochs=epochs,
    validation_data=test_gen,
    verbose=2
)

f1 = evaluate_f1(model, test_gen)
print(f"Test F1 score: {f1:.4f}")


Found 2392 images belonging to 2 classes.
Found 597 images belonging to 2 classes.
Training images: 2392
Testing images: 597
Class indices: {'flip': 0, 'notflip': 1}


Epoch 1/10
75/75 - 20s - 263ms/step - accuracy: 0.7115 - loss: 0.5412 - val_accuracy: 0.8124 - val_loss: 0.4180
Epoch 2/10
75/75 - 18s - 246ms/step - accuracy: 0.8997 - loss: 0.2841 - val_accuracy: 0.9179 - val_loss: 0.2368
Epoch 3/10
75/75 - 18s - 240ms/step - accuracy: 0.9323 - loss: 0.1715 - val_accuracy: 0.9012 - val_loss: 0.2244
Epoch 4/10
75/75 - 18s - 240ms/step - accuracy: 0.9640 - loss: 0.1142 - val_accuracy: 0.9246 - val_loss: 0.1766
Epoch 5/10
75/75 - 18s - 236ms/step - accuracy: 0.9749 - loss: 0.0771 - val_accuracy: 0.9631 - val_loss: 0.1141
Epoch 6/10
75/75 - 18s - 235ms/step - accuracy: 0.9841 - loss: 0.0579 - val_accuracy: 0.9414 - val_loss: 0.1536
Epoch 7/10
75/75 - 18s - 236ms/step - accuracy: 0.9816 - loss: 0.0539 - val_accuracy: 0.9682 - val_loss: 0.1017
Epoch 8/10
75/75 - 18s - 244ms/step - accuracy: 0.9783 - loss: 0.0627 - val_accuracy: 0.9447 - val_loss: 0.1321
Epoch 9/10
75/75 - 19s - 248ms/step - accuracy: 0.9895 - loss: 0.0399 - val_accuracy: 0.9765 - val_loss: