You can just click **Runtime → Run all** to try the demo.

# Module 2 - Simple Neural Network: Identify Handwritten Digits

In this notebook, we examine how a multilayer perceptron (MLP) identifies handwritten digits.

An MLP is a simple neural network with:
- An **input layer** that receives the raw data
- One or more **hidden layers** that learn patterns (edges, curves, etc.)
- An **output layer** that gives a prediction (here: digit 0–9)
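
To make the three layers concrete, here is a minimal NumPy sketch of a single forward pass. The layer sizes (784 inputs, 64 hidden neurons, 10 outputs) match the model built later in this notebook, but the weights here are random, untrained stand-ins, and the name `mlp_forward` is a hypothetical helper used only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical, untrained parameters for a 784 -> 64 -> 10 network
W1 = rng.normal(scale=0.05, size=(784, 64))   # input -> hidden weights
b1 = np.zeros(64)                             # hidden biases
W2 = rng.normal(scale=0.05, size=(64, 10))    # hidden -> output weights
b2 = np.zeros(10)                             # output biases

def mlp_forward(x):
    """Return (hidden activations, class probabilities) for one flattened image."""
    hidden = np.maximum(0.0, x @ W1 + b1)     # hidden layer with ReLU
    logits = hidden @ W2 + b2                 # one raw score per digit
    exp = np.exp(logits - logits.max())       # numerically stable softmax
    probs = exp / exp.sum()
    return hidden, probs

x = rng.random(784)   # stand-in for a flattened 28x28 image with pixels in [0, 1)
hidden, probs = mlp_forward(x)
print(hidden.shape, probs.shape)
print("predicted digit:", int(np.argmax(probs)))
```

Because the weights are random, the predicted digit is meaningless here; training is what turns these numbers into useful pattern detectors.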

We use:

* The Modified National Institute of Standards and Technology (MNIST) dataset, which contains handwritten digits labeled 0–9

The MLP demonstrated that machines can learn patterns from data, not by following hand-written rules, but by adjusting internal connections based on examples. This idea laid the foundation for most modern AI systems.
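
As a rough illustration of "adjusting internal connections based on examples", the sketch below trains a single linear neuron with plain gradient descent on made-up data. The data, the learning rate `lr`, and the squared-error loss are assumptions chosen for brevity. The notebook's actual model uses the Adam optimizer and a cross-entropy loss instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up examples: 100 inputs with 3 features each, and the target outputs
X = rng.random((100, 3))
true_w = np.array([1.0, -2.0, 0.5])   # the "rule" the neuron should discover
y = X @ true_w

w = np.zeros(3)    # connections start with no knowledge
lr = 0.1           # assumed learning rate
losses = []

for step in range(200):
    error = X @ w - y                       # how wrong the current predictions are
    losses.append(float(np.mean(error ** 2)))
    grad = 2 * X.T @ error / len(X)         # gradient of the mean squared error
    w -= lr * grad                          # nudge each connection to reduce the error

print(f"loss before training: {losses[0]:.4f}")
print(f"loss after training:  {losses[-1]:.6f}")
print("learned weights:", np.round(w, 2))
```

No rule is ever written into the code. The weights drift toward values that reproduce the examples.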


In [None]:
%%capture
#@title System set-up, install and load packages {display-mode: "form"}

!pip install ipywidgets

import numpy as np                     # for numerical operations
import matplotlib.pyplot as plt        # for plotting digits and visualizations
import seaborn as sns
import tensorflow as tf                # for building and training the neural network

from tensorflow.keras import layers    # for concise model building
import requests # for url request

# Optional: for interactive visualizations (if used later)
import ipywidgets as widgets
from IPython.display import display

# suppress warnings
import warnings
warnings.filterwarnings('ignore')

## Load and Visualize MNIST
The digit dataset we use here, **MNIST**, was created in the 1990s from scanned handwriting samples collected in controlled settings. Each image is paired with a known label (0–9), and the dataset was designed for machine learning research. It is now available through the TensorFlow Python package and can be loaded directly.


### Where Does Labeled Data Come From?

Unlike MNIST, labeled data in many modern systems comes from **users in the wild**.

You’ve likely encountered challenges like this while browsing the web:

> "Select all images with traffic lights"  
> "Click every square that contains a bus"

These tasks come from **reCAPTCHA**, a system designed to verify that users are human.  
But they also serve another purpose: helping improve machine learning models by collecting **labeled data**.

This raises important legal and ethical questions:
- Are users aware they are contributing to machine learning datasets?
- Do they consent to how their input is used?
- What are the implications for data ownership?

As we train a simple digit classifier in this notebook, it's worth reflecting on how the **source and intent of labeled data** might affect how we interpret the role of machine learning in society.

In [None]:
#@title Load MNIST from TensorFlow's built-in datasets {display-mode: "form"}
from tensorflow.keras.datasets import mnist

# Load training and test data
# x: images (28x28 pixels), y: labels (0–9 digits)
(x_train, y_train), (x_test, y_test) = mnist.load_data()

print(f"Training samples: {x_train.shape[0]}")
print(f"Test samples: {x_test.shape[0]}")

In [None]:
#@title Now we can take a look at the first 10 images in the training data set from MNIST {display-mode: "form"}
# Plot the first 10 images from the training set
fig, axes = plt.subplots(1, 10, figsize=(12, 2))

for i in range(10):
    axes[i].imshow(x_train[i], cmap='gray')
    axes[i].set_title(f"Label: {y_train[i]}")
    axes[i].axis('off')

plt.suptitle("Examples of Handwritten Digits (MNIST)", fontsize=14)
plt.tight_layout()
plt.show()


# A few small adjustments are still needed to make the images compatible with model training
# Normalize pixel values to range [0, 1] for neural network input
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

#print("Pixel values normalized to [0, 1]")

# flatten the image
# Each image is 28x28 → flatten to 784-dimensional vector for MLP
x_train_flat = x_train.reshape(-1, 28 * 28)
x_test_flat = x_test.reshape(-1, 28 * 28)

#print(f"Shape of flattened input: {x_train_flat.shape[1]} features per image, ready to feed into the model.")

## Build/Load an MLP

Now that the training data is ready, we can start exploring the MLP. To save time, we provide a pre-trained model in our GitHub repository. We also provide the code below, which trains an equivalent model from scratch if the download fails.



In [None]:
#@title code to construct and train a simple MLP {display-mode: "form"}
# build and define the model

def build_and_train_model(x_train, y_train, epochs=5):

  # Functional API: define model with named intermediate layers
  inputs = tf.keras.Input(shape=(784,), name="input_layer")
  x = layers.Dense(64, activation='relu', name="hidden_layer")(inputs)
  outputs = layers.Dense(10, activation='softmax', name="output_layer")(x)

  # Build the full model
  model = tf.keras.Model(inputs=inputs, outputs=outputs, name="mnist_mlp")
  model.summary()

  # Compile the model: define loss function, optimizer, and evaluation metric
  model.compile(
      optimizer='adam',
      loss='sparse_categorical_crossentropy',
      metrics=['accuracy']
      )

  # Train the model on training data (this runs fast in Colab)
  history = model.fit(
      x_train, y_train,       # use the function arguments, not the global x_train_flat
      validation_split=0.1,   # Use 10% of training set for validation
      epochs=epochs,
      batch_size=32,
      verbose=1
      )

  # Save to disk for later download
  model.save("mlp_model.h5")
  print("Model trained and saved as 'mlp_model.h5'.")

  return model

In [None]:
#@title Load the MLP, or train from scratch {display-mode: "form"}

#model_url = "https://github.com/WeihaoGe1009/ml-demos-temp-inputs/raw/main/mlp_model.h5"

url_header = "https://github.com/WeihaoGe1009/ml-demos-temp-inputs/raw/main/"
module_name = ""
model_dir = url_header+module_name+''

#url_header = https://github.com/WeihaoGe1009/ai-history-for-ip-scholars/faw/main/
#module_name = "02_neural_networks/"
#model_dir = url_header + module_name + "models/"

model_file = model_dir + 'mlp_model.h5'

try:

    print(" Attempting to download model from GitHub...")

    response = requests.get(model_file, allow_redirects=True)
    response.raise_for_status()

    if response.status_code == 200:
      print("Pretrained model found. Loading from GitHub...")


      # Check if GitHub served us HTML instead of a binary file
      if "text/html" in response.headers.get("Content-Type", ""):
        raise ValueError("Received HTML instead of binary model file.")

      # Save binary content
      local_path = '/tmp/mlp_model.h5'
      with open(local_path, "wb") as f:
        f.write(response.content)

      # Load model using tf.keras
      model = tf.keras.models.load_model(local_path)
      print("Model successfully loaded from GitHub.")

    else:
        raise ValueError("Model not found or unavailable.")
except Exception as e:
    print(f"⚠️ Could not load model: {e}")
    print("🔁 Training a new model instead...")
    model = build_and_train_model(x_train_flat, y_train)

# Evaluate how well the model performs on unseen test data
test_loss, test_accuracy = model.evaluate(x_test_flat, y_test, verbose=0)
print(f"\nTest accuracy: {test_accuracy:.4f}")

## Visualizing how hidden layers respond to digit images

What do we see in the hidden layer? We won't see digits. Instead, the hidden layer learns to detect patterns such as edges, curves, or blobs.

In [None]:
#@title Plot the first 10 hidden neuron weight patterns as 28x28 images {display-mode: "form"}
# Get the weights from the hidden layer, which captures the underlying patterns
hidden_layer = model.get_layer("hidden_layer")  # Dense layer with 64 neurons
weights, biases = hidden_layer.get_weights()  # weights shape: (784, 64)

print(f"Shape of weights matrix: {weights.shape}")

fig, axes = plt.subplots(2, 5, figsize=(6, 4))

for i, ax in enumerate(axes.flat):
    # Extract weights for the i-th neuron and reshape
    neuron_weights = weights[:, i].reshape(28, 28)
    ax.imshow(neuron_weights, cmap='gray', interpolation='nearest')
    ax.set_title(f"Neuron {i}")
    ax.axis('off')

plt.suptitle("Visualizing Weights of Hidden Neurons (First Layer)", fontsize=14)
plt.tight_layout()
plt.show()


## What Do These Neurons "See"?

Each hidden neuron has learned to respond to certain **patterns** in the input images.  
These patterns are encoded in the neuron's weights — shown here as 28×28 grayscale images.

From these images, we can see that:
- Some neurons focus on strokes in specific **regions** of the digit
- Others respond to **curved or rounded shapes**
- These learned patterns can be **combined and reused** by the network to recognize full digits

These are not full digits — they are **building blocks**, like loops, arcs, or partial strokes.  
By layering and combining them, the network learns to tell apart digits like `3`, `6`, and `8`.

> What the network learns reflects the **statistical patterns** in the training data —  
> in this case, handwritten digits composed of many curved elements.
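
To see why a weight map acts as a pattern detector, consider the minimal sketch below. It uses a hand-made weight map and two toy stroke images, not weights from the trained model. The bias value and the helper name `neuron_response` are assumptions for illustration. A hidden neuron computes a weighted sum of the pixels, adds a bias, and applies ReLU.

```python
import numpy as np

# Hypothetical weight map: positive weights along a central vertical bar
weights = np.zeros((28, 28))
weights[4:24, 13:15] = 1.0
bias = -5.0   # assumed threshold the weighted sum must exceed

def neuron_response(image, weights, bias):
    """ReLU(w . x + b) for one hidden neuron and one 28x28 image."""
    return max(0.0, float(np.sum(image * weights)) + bias)

# Two toy "images": a vertical stroke and a horizontal stroke
vertical = np.zeros((28, 28))
vertical[4:24, 13:15] = 1.0
horizontal = np.zeros((28, 28))
horizontal[13:15, 4:24] = 1.0

resp_vertical = neuron_response(vertical, weights, bias)
resp_horizontal = neuron_response(horizontal, weights, bias)
print("vertical stroke response:  ", resp_vertical)
print("horizontal stroke response:", resp_horizontal)
```

The vertical stroke overlaps the weight map almost everywhere, so the neuron responds strongly. The horizontal stroke overlaps it in only a few pixels, so the weighted sum stays below the threshold and the response is zero.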


In [None]:
#@title View Hidden-layer Neuron Activation Pattern {display-mode: "form"}

# Create a small model that gives access to the hidden layer's output
activation_model = tf.keras.Model(
    inputs=model.input,
    outputs=model.get_layer("hidden_layer").output
)

# Select a few test samples (e.g., digits 2, 3, 7, 8)
sample_indices = [np.where(y_test == d)[0][0] for d in [2, 3, 7, 8]]
sample_images = x_test_flat[sample_indices]
sample_labels = y_test[sample_indices]

# Get activations from the hidden layer
activations = activation_model.predict(sample_images)

activation_matrix = activations  # shape: (4 digits, 64 neurons)

plt.figure(figsize=(10, 1.5))  # short height, wide width
sns.heatmap(
    activation_matrix,
    cmap='YlGn_r',
    cbar=False,                  # remove color bar
    yticklabels=sample_labels,   # show digits on the left
    xticklabels=False,           # hide neuron labels
    linewidths=0, linecolor='none'
)

plt.title("Neurons Activated by Each Digit", fontsize=12)
plt.xlabel("Hidden Neurons", fontsize=10)
plt.ylabel("Digit", fontsize=10)
plt.yticks(rotation=0)
plt.tick_params(axis='both', which='major', labelsize=10)
plt.tight_layout()
plt.show()


## Generating Digit-Like Images

This model learns how digits look — not just to recognize them, but to redraw them from memory.
It studies many examples, compresses the patterns into a smaller form, and then reconstructs what it "thinks" the digit should look like.

When we give it the number 5, it doesn’t copy a specific image of hand-written 5 — it generates a new one using the features it has learned.

If we give it the same number multiple times, it can draw slightly different versions — just like people never write the exact same digit twice.
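
One way such variation could be produced is to add random noise to a digit's average hidden activation before projecting it back to pixels. The sketch below is an illustrative assumption, not what the cells in this notebook do, since they use only the mean activation. The activations and weights are random stand-ins, and `sample_digit_image` and `noise_scale` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-ins (assumptions): random hidden activations for 500 images of one digit,
# and a random (784, 64) matrix in place of the trained hidden-layer weights
activations = np.maximum(0.0, rng.normal(size=(500, 64)))
weights = rng.normal(scale=0.05, size=(784, 64))

mean_act = activations.mean(axis=0)   # average activation per neuron
std_act = activations.std(axis=0)     # spread of activation per neuron

def sample_digit_image(noise_scale=0.5):
    """Draw one activation vector near the mean and project it to 28x28 pixels."""
    noisy = mean_act + noise_scale * std_act * rng.standard_normal(64)
    noisy = np.maximum(0.0, noisy)            # ReLU outputs are never negative
    image = (weights @ noisy).reshape(28, 28)
    image -= image.min()                      # shift so the darkest pixel is 0
    span = image.max()
    return image / span if span > 0 else image

img_a = sample_digit_image()
img_b = sample_digit_image()
print(img_a.shape, "identical:", np.allclose(img_a, img_b))
```

Each call draws fresh noise, so two images generated for the same digit differ slightly while sharing the same underlying average pattern.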

### "Prototype" digit
For each digit, we calculate an average activation pattern. We assume that handwritten samples of the same digit produce similar activation patterns, so the average activation pattern will show a "prototype" digit. The generated "prototype" is not one of the existing images in the dataset. It is created by recombining the patterns learned by each neuron.
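
The recombination described here is a weighted sum of the per-neuron weight maps, which is the same as one matrix-vector product. The sketch below checks that equivalence with random stand-in values. The `weights` and `mean_activation` arrays are assumptions with the same shapes as the notebook's hidden-layer weights and one digit's average activation.

```python
import numpy as np

rng = np.random.default_rng(1)
weights = rng.normal(size=(784, 64))   # stand-in for hidden-layer weights
mean_activation = rng.random(64)       # stand-in for one digit's average activation

# Weighted sum of per-neuron weight maps, written as an explicit loop
looped = np.zeros((28, 28))
for i in range(64):
    looped += mean_activation[i] * weights[:, i].reshape(28, 28)

# The same projection as a single matrix-vector product
vectorized = (weights @ mean_activation).reshape(28, 28)

print("loop and matrix product agree:", np.allclose(looped, vectorized))
```

The loop form makes the idea of combining building blocks explicit, and the matrix form is what one would typically use for speed.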

In [None]:
#@title Step 1 - Precompute the Average Activation for Each Digit {display-mode: "form"}
# Dictionary to hold the mean hidden-layer activation for each digit 0–9
activation_stats = {}

for label in range(10):
    indices = np.where(y_train == label)[0]
    images = x_train_flat[indices]
    activations = activation_model.predict(images, verbose=0)

    mean_act = np.mean(activations, axis=0)

    activation_stats[label] = mean_act


In [None]:
#@title Step 2 - Define the Function that Combines the Patterns by Activation Stats {display-mode: "form"}
def visualize_activation_projection(label):
    """
    Generate a synthetic activation pattern for a given digit label
    and project it back to pixel space using hidden layer weights.
    """
    # Retrieve the mean activation from the precomputed stats
    synthetic_activation = activation_stats[label]

    # Get hidden layer weights
    weights = hidden_layer.get_weights()[0]  # shape (784, 64)


    # Combine each neuron's weight map with its activation
    combined_image = np.zeros((28, 28))
    for i in range(64):
        neuron_weights_2d = weights[:, i].reshape(28, 28)
        combined_image += synthetic_activation[i] * neuron_weights_2d

    # Normalize image for display
    combined_image -= combined_image.min()
    combined_image /= combined_image.max()


    return combined_image

    # Plot
    #plt.figure(figsize=(2.5, 2.5))
    #plt.imshow(combined_image, cmap='gray')
    #plt.title(f"Projection of Synthetic Activation for Digit {label}")
    #plt.axis('off')
    #plt.tight_layout()
    #plt.show()


In [None]:
#@title Step 3 - Show the Prototype Digits {display-mode: "form"}
fig, axes = plt.subplots(2, 5, figsize=(6, 3))
for label in range(10):
  ax = axes[label//5, label % 5]
  image = visualize_activation_projection(label)
  ax.imshow(image, cmap='gray')
  ax.set_title(f"Digit {label}")
  ax.axis('off')
plt.tight_layout()
plt.show()

The prototype digits are very blurry, but some numbers are discernible in the center. The projection also produces artifacts near the image edges, which appear as a halo around the digits. These prototype digits are generated by the model rather than extracted from the dataset.

### Take-Home Message: What We Learned
In this notebook, we explored how a simple neural network can learn to recognize hand-written digits.

Along the way, we introduced:

* Categorization — how the network classifies images into labels (0–9)

* Pattern Learning — how it identifies shared strokes and shapes across digits

* Neuron Activation — how internal layers respond differently depending on what digit is shown

* Prototype Digits — how we can reconstruct what the network “expects” a digit to look like using only activation patterns

We’ve shown that the model isn’t memorizing specific examples. Instead, it’s learning generalizable visual patterns, which can even be used to generate new digit-like images without copying.

In the next notebook, we’ll go further:

We’ll explore convolutional neural networks (CNNs) and show how they can process images in a more advanced way — identifying edges, corners, and more.

We’ll also revisit image generation — this time starting from pure noise — and demonstrate how a well-trained model can turn randomness into some scribbles.