
In [1]:
%matplotlib inline

# Introduction to Deep Learning with Keras and TensorFlow


# Ryan's Work for Week 2 Assignment

# Turn Class Code to Binary!

**Daniel Moser (UT Southwestern Medical Center)**

**Resources: [Xavier Snelgrove](https://github.com/wxs/keras-mnist-tutorial), [Yash Katariya](https://github.com/yashk2810/MNIST-Keras)**

To help you understand the fundamentals of deep learning, this demo walks through the basic steps of building two toy models that classify handwritten digits with accuracies surpassing 95%. The first model is a basic fully connected neural network; the second is a deeper network that introduces convolution and pooling.

## The Task for the AI

Our goal is to construct and train an artificial neural network on thousands of images of handwritten digits so that it can correctly identify digits it has not seen before. The data come from the MNIST database, which contains 60,000 training images and 10,000 test images. We will use the Keras Python API with TensorFlow as the backend.

<img src="https://github.com/AviatorMoser/keras-mnist-tutorial/blob/master/mnist.png?raw=1" >

## Prerequisite Python Modules

First, some software needs to be loaded into the Python environment.

In [2]:
import numpy as np                   # advanced math library
import matplotlib.pyplot as plt      # plotting routines
import random                        # generating random numbers

from tensorflow.keras.datasets import mnist     # MNIST dataset
from tensorflow.keras.models import Sequential  # Model type
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D, BatchNormalization
from tensorflow.keras.utils import to_categorical

# Load the data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

print("X_train shape", X_train.shape)
print("y_train shape", y_train.shape)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
X_train shape (60000, 28, 28)
y_train shape (60000,)


## Modify Data to Binary

In [3]:
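# 1. PREPROCESSING THE IMAGES (X)
# Add a channel axis so each image is (28, 28, 1), as Conv2D expects,
# and scale pixel values from [0, 255] to [0, 1].
# Using -1 lets NumPy infer the number of images instead of hard-coding it.
X_train = X_train.reshape(-1, 28, 28, 1).astype('float32') / 255
X_test = X_test.reshape(-1, 28, 28, 1).astype('float32') / 255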

# 2. PREPROCESSING THE LABELS (y) - CHANGE TO BINARY
# If the digit is odd (1, 3, 5, 7, 9), set label to 1. Else 0.
y_train_binary = (y_train % 2 != 0).astype(int)
y_test_binary = (y_test % 2 != 0).astype(int)

print("Original Label Example:", y_train[0]) # Likely '5'
print("New Binary Label Example:", y_train_binary[0]) # Should be '1' (True)

Original Label Example: 5
New Binary Label Example: 1
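
The relabeling rule above can be checked on all ten digit classes without loading the dataset. The following is a minimal pure-Python sketch; the helper name `odd_label` and the dictionary `digit_to_binary` are illustrative and do not appear in the notebook.

```python
def odd_label(digit: int) -> int:
    """Return 1 for an odd digit and 0 for an even digit."""
    return int(digit % 2 != 0)

# Map every MNIST class (0-9) to its binary label.
digit_to_binary = {digit: odd_label(digit) for digit in range(10)}
print(digit_to_binary)

# Five of the ten classes are odd. The real class balance still depends on
# how many images of each digit the dataset contains.
odd_classes = [d for d, label in digit_to_binary.items() if label == 1]
print(odd_classes)
```

Because half of the classes map to each label, the binary task should be roughly balanced, but it is worth confirming on the actual label array (for example with `y_train_binary.mean()`) before reading too much into accuracy.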


## Building Binary Model

In [4]:
model = Sequential()

# Convolution Layer 1
model.add(Conv2D(32, (3, 3), input_shape=(28,28,1)))
model.add(BatchNormalization())
model.add(Activation('relu'))

# Convolution Layer 2
model.add(Conv2D(32, (3, 3)))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))

# Flattening
model.add(Flatten())

# Fully Connected Layer
model.add(Dense(128))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.2))

# --- THE CRITICAL CHANGE: OUTPUT LAYER ---
# Old: model.add(Dense(10)) + Activation('softmax')
# New: Dense(1) because we have 1 binary outcome.
# New: Activation('sigmoid') squashes output between 0 and 1.
model.add(Dense(1))
model.add(Activation('sigmoid'))

model.summary()

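
Since the table printed by `model.summary()` is not reproduced here, the layer sizes can be worked out by hand. The sketch below assumes the Keras defaults for the layers as written above (stride 1, no padding, one bias per filter or unit, and four parameters per channel in each batch normalization layer); the helper function names are illustrative only.

```python
def conv_output_size(size: int, kernel: int) -> int:
    """Spatial size after a stride-1 convolution with no padding."""
    return size - kernel + 1

def conv_params(kernel: int, in_channels: int, filters: int) -> int:
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs: int, units: int) -> int:
    """Weight matrix plus one bias per unit."""
    return inputs * units + units

def batchnorm_params(channels: int) -> int:
    """gamma and beta (trainable) plus moving mean and variance (non-trainable)."""
    return 4 * channels

side = 28
side = conv_output_size(side, 3)   # after Convolution Layer 1
side = conv_output_size(side, 3)   # after Convolution Layer 2
side = side // 2                   # after 2x2 max pooling
flat = side * side * 32            # length of the flattened vector

total = (conv_params(3, 1, 32) + batchnorm_params(32)
         + conv_params(3, 32, 32) + batchnorm_params(32)
         + dense_params(flat, 128) + batchnorm_params(128)
         + dense_params(128, 1))
non_trainable = 2 * (32 + 32 + 128)  # moving statistics of the three BN layers
trainable = total - non_trainable

print("flattened length:", flat)
print("total parameters:", total)
print("trainable parameters:", trainable)
```

Under these assumptions the flattened vector has 4,608 entries and the model has 600,417 parameters, almost all of them in the `Dense(128)` layer. If `model.summary()` reports different numbers, one of the assumed defaults does not hold in the installed Keras version.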


## Compiling and Training

In [5]:
# Compile
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# Train
model.fit(X_train, y_train_binary,
          batch_size=128,
          epochs=5,
          verbose=1,
          validation_data=(X_test, y_test_binary))

Epoch 1/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 122s 253ms/step - accuracy: 0.9613 - loss: 0.1029 - val_accuracy: 0.8604 - val_loss: 0.2926
Epoch 2/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 115s 245ms/step - accuracy: 0.9909 - loss: 0.0262 - val_accuracy: 0.9877 - val_loss: 0.0363
Epoch 3/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 115s 244ms/step - accuracy: 0.9932 - loss: 0.0198 - val_accuracy: 0.9930 - val_loss: 0.0212
Epoch 4/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 115s 246ms/step - accuracy: 0.9953 - loss: 0.0146 - val_accuracy: 0.9928 - val_loss: 0.0220
Epoch 5/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 145s 253ms/step - accuracy: 0.9968 - loss: 0.0103 - val_accuracy: 0.9936 - val_loss: 0.0188


<keras.src.callbacks.history.History at 0x7c6971c6c800>
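
To make the `loss` values in the log above easier to interpret, here is a minimal pure-Python sketch of binary cross-entropy, the quantity that `loss='binary_crossentropy'` asks Keras to minimize. The function below is an illustration, not the Keras implementation; the clipping constant `eps` and the example probabilities are assumptions chosen for the demonstration.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over paired labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

labels = [1, 0, 1, 0]                        # 1 = odd, 0 = even
confident_right = [0.99, 0.01, 0.98, 0.02]   # high probability on the true class
confident_wrong = [0.01, 0.99, 0.02, 0.98]   # high probability on the wrong class

loss_right = binary_crossentropy(labels, confident_right)
loss_wrong = binary_crossentropy(labels, confident_wrong)
loss_coin_flip = binary_crossentropy([1], [0.5])  # equals ln(2)

print(f"confident and right: {loss_right:.4f}")
print(f"confident and wrong: {loss_wrong:.4f}")
print(f"always predicting 0.5: {loss_coin_flip:.4f}")
```

A model that always outputs 0.5 scores ln(2), about 0.693, so the validation losses near 0.02 in the log indicate predictions that are both correct and confident on most test images.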

## Testing Output

In [6]:
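# Evaluate
# model.evaluate returns [loss, accuracy], in the order given to model.compile.
score = model.evaluate(X_test, y_test_binary)
print('Test score:', score[0])     # binary cross-entropy loss on the test set
print('Test accuracy:', score[1])  # fraction of test images classified correctly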

# Check a single example
predictions = model.predict(X_test)
index = 0
print(f"Actual Digit: {y_test[index]}")
print(f"Binary Label (Odd?): {y_test_binary[index]}")
print(f"Model Prediction Probability: {predictions[index][0]:.4f}")
print(f"Predicted Class: {round(predictions[index][0])}")

313/313 ━━━━━━━━━━━━━━━━━━━━ 5s 15ms/step - accuracy: 0.9916 - loss: 0.0233
Test score: 0.018758676946163177
Test accuracy: 0.9936000108718872
313/313 ━━━━━━━━━━━━━━━━━━━━ 5s 14ms/step
Actual Digit: 7
Binary Label (Odd?): 1
Model Prediction Probability: 1.0000
Predicted Class: 1
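
The step from a raw network output to a predicted class can be sketched in plain Python. This is an illustration under stated assumptions: `sigmoid` and `predict_class` are hypothetical helpers, the input values `z` are made up, and the 0.5 threshold is chosen to mirror the rounding used in the cell above.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function: maps a real number to a value between 0 and 1."""
    # Branch on the sign so that math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_class(probability: float, threshold: float = 0.5) -> int:
    """Return 1 (odd) when the probability exceeds the threshold, else 0 (even)."""
    return int(probability > threshold)

for z in (-4.0, 0.0, 4.0):
    p = sigmoid(z)
    print(f"z={z:+.1f}  p={p:.4f}  class={predict_class(p)}")
```

A probability that prints as `1.0000`, as for the test image above, corresponds to a large positive pre-activation; raising or lowering `threshold` trades false "odd" predictions against false "even" ones.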


#### For a 3D visualization of a very similar network, visit http://scs.ryerson.ca/~aharley/vis/conv/