You can just click **Runtime → Run all** to try the demo.

## Module 2 cont. - Seeing Like a Machine: Patterns, Vision, and Imagination
In the first notebook, we worked with a simple type of neural network — enough to recognize hand-written digits and generate the "average" or "prototype" digits using learned patterns.

In real-world applications — including many where copyright concerns arise — machines often rely on more complex neural structures to process images. One of the most widely used is the **convolutional neural network**, or CNN. This architecture is more powerful, and it's also more common in systems that create or transform visual content.

In this notebook, we’ll explore:

* How these more advanced networks “see” patterns in images

* How a trained model can go beyond recognition and start creating new digit-like images from what it has learned

In [None]:
#@title System setup {display-mode: "form"}
!pip install -q tensorflow matplotlib

#Import required packages
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras import Model, Input
from tensorflow.keras.layers import Conv2DTranspose, Reshape

import requests

# suppress warnings

import warnings
warnings.filterwarnings("ignore")

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

import absl.logging
absl.logging.set_verbosity(absl.logging.ERROR)

We use the MNIST data again, this time to see how a slightly more complex neural network learns features of handwritten digits.

In [None]:
#@title Load and prepare MNIST data {display-mode: "form"}
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Normalize and reshape
x_train = x_train.astype("float32") / 255.
x_test = x_test.astype("float32") / 255.

x_train = np.expand_dims(x_train, -1)  # shape: (60000, 28, 28, 1)
x_test = np.expand_dims(x_test, -1)


### CNN Classifier
Now that the training and test sets are ready, we can look at a very simple CNN classifier that, like the simple neural network (MLP) from the last notebook, "recognizes" which digit appears in each handwritten image.

In [None]:
#@title code to construct and train a CNN Classifier {display-mode: "form"}
def build_and_train_cnn_classifier(x_train, y_train, epochs=5):

  input_img = tf.keras.Input(shape=(28, 28, 1), name="input")
  x = layers.Conv2D(32, (3, 3), activation='relu', name="conv1")(input_img)
  x = layers.MaxPooling2D((2, 2), name="pool1")(x)
  x = layers.Conv2D(64, (3, 3), activation='relu', name="conv2")(x)
  x = layers.MaxPooling2D((2, 2), name="pool2")(x)
  x = layers.Flatten(name="flatten")(x)
  x = layers.Dense(64, activation='relu', name="dense1")(x)
  output = layers.Dense(10, activation='softmax', name="output")(x)

  cnn_classifier = tf.keras.Model(inputs=input_img, outputs=output)
  cnn_classifier.compile(optimizer='adam',
                       loss='sparse_categorical_crossentropy',
                       metrics=['accuracy'])
  # Train the CNN classifier on MNIST, holding out 10% of the training data for validation
  cnn_classifier.fit(x_train, y_train, epochs=epochs, batch_size=128, validation_split=0.1)

  # Save to disk for later download
  cnn_classifier.save("cnn_classifier.h5")
  print("Model trained and saved as 'cnn_classifier.h5'.")


  return cnn_classifier
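
To see how the image shrinks as it passes through the layers above, the following sketch recomputes the layer output sizes by hand. It assumes the Keras defaults used in the function (no padding and stride 1 for `Conv2D`, non-overlapping windows for `MaxPooling2D`). The helper names `conv_valid` and `max_pool` are made up for this illustration and are not part of TensorFlow.

```python
def conv_valid(size, kernel):
    """Output width/height of a convolution with no padding and stride 1."""
    return size - kernel + 1


def max_pool(size, pool):
    """Output width/height of non-overlapping max pooling (rounds down)."""
    return size // pool


size = 28
trace = [("input", size, 1)]           # (layer name, width/height, channels)
size = conv_valid(size, 3)
trace.append(("conv1", size, 32))
size = max_pool(size, 2)
trace.append(("pool1", size, 32))
size = conv_valid(size, 3)
trace.append(("conv2", size, 64))
size = max_pool(size, 2)
trace.append(("pool2", size, 64))

# Flatten turns the final block into one long vector
flat_units = size * size * 64

for name, s, channels in trace:
    print(f"{name:6s} -> {s} x {s} x {channels}")
print("flatten ->", flat_units)
```

Under these assumptions the 28×28 input reaches `flatten` as a 5×5×64 block, i.e. 1600 numbers, which `dense1` then compresses to 64.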

In [None]:
#@title Load the cnn_classifier, or train from scratch {display-mode: "form"}

#model_url = "https://github.com/WeihaoGe1009/ml-demos-temp-inputs/raw/main/mlp_model.h5"

url_header = "https://github.com/WeihaoGe1009/ml-demos-temp-inputs/raw/main/"
module_name = ""
model_dir = url_header+module_name+''

#url_header = https://github.com/WeihaoGe1009/ai-history-for-ip-scholars/raw/main/
#module_name = "02_neural_networks/"
#model_dir = url_header + module_name + "models/"

model_file = model_dir + 'cnn_model.h5'

try:

    print(" Attempting to download model from GitHub...")

    response = requests.get(model_file, allow_redirects=True)
    response.raise_for_status()

    if response.status_code == 200:
      print("Pretrained model found. Loading from GitHub...")


      # Check if GitHub served us HTML instead of a binary file
      if "text/html" in response.headers.get("Content-Type", ""):
        raise ValueError("Received HTML instead of binary model file.")

      # Save binary content
      local_path = '/tmp/cnn_classifier.h5'
      with open(local_path, "wb") as f:
        f.write(response.content)

      # Load model using tf.keras
      cnn_classifier = tf.keras.models.load_model(local_path)
      print("Model successfully loaded from GitHub.")

    else:
        raise ValueError("Model not found or unavailable.")
except Exception as e:
    print(f"⚠️ Could not load model: {e}")
    print("🔁 Training a new model instead...")
    cnn_classifier = build_and_train_cnn_classifier(x_train, y_train)

# Evaluate how well the model performs on unseen test data
test_loss, test_accuracy = cnn_classifier.evaluate(x_test, y_test, verbose=0)
print(f"\nTest accuracy: {test_accuracy:.4f}")

In [None]:
#@title Visualize filters from first Conv2D layer {display-mode: "form"}
first_conv = cnn_classifier.get_layer("conv1")
filters, biases = first_conv.get_weights()  # filters shape: (3, 3, 1, 32)

# Normalize for display
f_min, f_max = filters.min(), filters.max()
filters = (filters - f_min) / (f_max - f_min)

fig, axes = plt.subplots(4, 8, figsize=(10, 5))
for i, ax in enumerate(axes.flat):
    ax.imshow(filters[:, :, 0, i], cmap='gray')
    ax.axis('off')
fig.suptitle("First-layer filters learned by CNN")
plt.show()


The tiny 3×3 filters shown above don’t look like digits — and they’re not supposed to.
Instead, each filter acts like a small lens that picks up local visual signals, such as slight changes in brightness, orientation, or texture.
These filters are the first layer of processing in a convolutional neural network (CNN), a more advanced type of neural network used widely in real-world image recognition systems.
While each filter detects only a very basic aspect, the model builds on these pieces — layer by layer — to recognize or even generate complete images like digits.
This modular structure is a key idea in understanding how machines learn visual patterns without memorizing images.
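
As a rough illustration of what a single 3×3 filter does, the sketch below slides a hand-written filter over a tiny toy image using plain Python lists. The filter values and the 5×5 image are invented for this example and are not taken from the trained model; `apply_filter` is a hypothetical helper that mimics the sliding-window sum a `Conv2D` layer computes, without padding, bias, or activation.

```python
def apply_filter(image, kernel):
    """Slide a square kernel over a 2-D image (no padding, stride 1)."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Multiply the kernel with the patch under it and add everything up
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# A 5x5 toy image: dark on the left (0), bright on the right (1)
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Responds when brightness increases from left to right
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

response = apply_filter(image, vertical_edge)
for row in response:
    print(row)
```

Each output row is `[3, 3, 0]`: the filter responds strongly where its window covers the dark-to-bright boundary and gives zero over the uniformly bright region. The learned filters work the same way, except that their values come from training rather than being written by hand.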


### From Pure Noise to a Scribble Image
In this section, the model starts with **pure noise** — a random 28×28 image — and tries to "imagine" a digit using only the patterns it learned during training.

This shift — from recognizing patterns to generating new images — raises thoughtful questions about how machines learn, and what it means for a result to be “new” or “inspired” in the context of intellectual work.

This is not a memory recall or a copy-paste. It's a **synthesis**: the model combines patterns like loops, strokes, and shapes that it believes are typical of hand-written digits.

This simplified process is related to the way **modern Generative AI** works:
- Start from noise
- Refine it using learned structure
- Generate something *plausible*, but not retrieved from storage

Each run generates a different result — a different "interpretation" of what a digit might look like.
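
The run-to-run variation comes from the random input rather than from the model, whose weights stay fixed once it is trained. The toy sketch below illustrates this with the standard library only: `toy_generator` is a hypothetical stand-in for a trained model (just a fixed function of its input), and `sample_noise` is an invented helper that draws Gaussian noise, optionally from a fixed seed.

```python
import random


def toy_generator(noise):
    """Stand-in for a trained model: a fixed, deterministic function of its input."""
    return [0.5 * value for value in noise]


def sample_noise(n, seed=None):
    """Draw n Gaussian noise values; a fixed seed makes the draw repeatable."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


# Unseeded: every call starts from different noise, so the outputs differ
run_a = toy_generator(sample_noise(4))
run_b = toy_generator(sample_noise(4))

# Seeded: same noise in, same output out
seeded_1 = toy_generator(sample_noise(4, seed=0))
seeded_2 = toy_generator(sample_noise(4, seed=0))

print("unseeded runs:", run_a, run_b)
print("seeded runs match:", seeded_1 == seeded_2)
```

If a reproducible result is needed in the notebook itself, fixing the seed of the noise source would play the same role as `seed=0` here.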


In [None]:
#@title code to construct and train a CNN generator based on CNN Classifier
def build_and_train_cnn_generator(x_train, y_train, cnn_classifier, epochs=5):

  # Freeze encoder layers to retain pretrained weights from cnn_classifier
  for layer in cnn_classifier.layers:
    if layer.name != "output":  # skip output layer, since we're reusing everything up to dense1
        layer.trainable = False

  # Reuse encoder layers from the trained cnn_classifier
  encoder_input = cnn_classifier.input
  encoder_output = cnn_classifier.get_layer("dense1").output  # shape: (None, 64)

  # Decoder: mirror of encoder layers in reverse
  x = layers.Dense(7 * 7 * 64, activation='relu', name="dense_decode")(encoder_output)
  x = Reshape((7, 7, 64), name="reshape_decode")(x)
  x = Conv2DTranspose(64, (3, 3), activation='relu', strides=2, padding='same', name="deconv1")(x)
  x = Conv2DTranspose(32, (3, 3), activation='relu', strides=2, padding='same', name="deconv2")(x)
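  # sigmoid keeps outputs in [0, 1], matching the normalized pixels and the binary cross-entropy loss
  decoded_output = Conv2DTranspose(1, (3, 3), activation='sigmoid', padding='same', name="decoder_output")(x)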

  cnn_generator = Model(inputs=encoder_input, outputs=decoded_output, name="cnn_generator")
  cnn_generator.compile(optimizer='adam', loss='binary_crossentropy')

  # Train using the original images as both input and output (autoencoder-style training)
  cnn_generator.fit(x_train, x_train, epochs=epochs, batch_size=128, validation_split=0.1)


  # Save to disk for later download
  cnn_generator.save("cnn_generator.h5")
  print("Model trained and saved as 'cnn_generator.h5'.")

  return cnn_generator
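
The decoder has to grow the 64-number code back into a 28×28 image. The sketch below retraces its layer sizes by hand, assuming the usual rule that a transposed convolution with `padding='same'` multiplies width and height by its stride. The helper name `deconv_same` is invented for this illustration.

```python
def deconv_same(size, stride=1):
    """Output width/height of a transposed convolution with padding='same'."""
    return size * stride


# dense_decode produces 7 * 7 * 64 numbers, which reshape_decode arranges as a block
dense_units = 7 * 7 * 64

size = 7
steps = [("reshape_decode", size, 64)]   # (layer name, width/height, channels)
size = deconv_same(size, stride=2)
steps.append(("deconv1", size, 64))
size = deconv_same(size, stride=2)
steps.append(("deconv2", size, 32))
size = deconv_same(size)                 # stride 1: size unchanged
steps.append(("decoder_output", size, 1))

print("dense_decode units:", dense_units)
for name, s, channels in steps:
    print(f"{name:15s} -> {s} x {s} x {channels}")
```

Starting from 7×7, the two stride-2 layers double the size twice (7 → 14 → 28), so the final layer outputs a single-channel 28×28 image with the same shape as the MNIST inputs.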

In [None]:
#@title Load the cnn_generator, or train from scratch {display-mode: "form"}

#model_url = "https://github.com/WeihaoGe1009/ml-demos-temp-inputs/raw/main/mlp_model.h5"

url_header = "https://github.com/WeihaoGe1009/ml-demos-temp-inputs/raw/main/"
module_name = ""
model_dir = url_header+module_name+''

#url_header = https://github.com/WeihaoGe1009/ai-history-for-ip-scholars/raw/main/
#module_name = "02_neural_networks/"
#model_dir = url_header + module_name + "models/"

model_file = model_dir + 'cnn_generator.h5'

try:

    print(" Attempting to download model from GitHub...")

    response = requests.get(model_file, allow_redirects=True)
    response.raise_for_status()

    if response.status_code == 200:
      print("Pretrained model found. Loading from GitHub...")


      # Check if GitHub served us HTML instead of a binary file
      if "text/html" in response.headers.get("Content-Type", ""):
        raise ValueError("Received HTML instead of binary model file.")

      # Save binary content
      local_path = '/tmp/cnn_generator.h5'
      with open(local_path, "wb") as f:
        f.write(response.content)

      # Load model using tf.keras
      cnn_generator = tf.keras.models.load_model(local_path)
      print("Model successfully loaded from GitHub.")

    else:
        raise ValueError("Model not found or unavailable.")
except Exception as e:
    print(f"⚠️ Could not load model: {e}")
    print("🔁 Training a new model instead...")
    cnn_generator = build_and_train_cnn_generator(x_train, y_train, cnn_classifier)

# Evaluate how well the model performs on unseen test data
#cnn_generator.compile(optimizer='adam', loss='binary_crossentropy')
#test_loss = cnn_generator.evaluate(x_test, x_test)

In [None]:
#@title generate image from random noise {display-mode: "form"}
num_samples = 4
random_images = np.random.normal(loc=0.0, scale=1.0, size=(num_samples, 28, 28, 1))

# Feed the noise images through the full CNN generator model
generated_images = cnn_generator.predict(random_images)

# Plot input noise and output digit side-by-side
fig, axes = plt.subplots(2, num_samples, figsize=(num_samples * 1.5, 3))
for i in range(num_samples):
    # Top row: input noise
    axes[0, i].imshow(random_images[i].squeeze(), cmap='gray')
    axes[0, i].axis('off')
    axes[0, i].set_title("Noise", fontsize=8)

    # Bottom row: generated digit-like image
    axes[1, i].imshow(generated_images[i].squeeze(), cmap='gray')
    axes[1, i].axis('off')
    axes[1, i].set_title("Generated", fontsize=8)

fig.suptitle("CNN Generator: Creating Digits from Noise", fontsize=12)
plt.tight_layout()
plt.show()


The generated images above look like hand-written scribbles, mostly resembling 8, 3, 6, or 2. With this very simple CNN architecture, curves are the easiest patterns to capture, or "learn", and the model generates images from the patterns it has learned.

### Take-home message: What we've learned

In this notebook, we explored how a convolutional neural network (CNN) can learn to recognize hand-written digits and generate digit-like scribbles from pure random noise.

We have found:

* The model learns filters that detect small visual features common in many digits.

* These filters shape how the model interprets or constructs digit-like forms.

* The scribble outputs come from applying learned filters to random input.
