In this file, I explore Neural Nets, DNNs, CNNs, DNN/CNN combinations, dropout regularization, image augmentation, and transfer learning. 


In [21]:
import tensorflow as tf
import numpy as np 
from tensorflow.keras import Sequential 
from tensorflow.keras.layers import Dense

# Shallow Neural Net

The cell above imports the necessary `numpy` and `TensorFlow` libraries.  

The following code performs these operations:  
- Constructs a layer `l0` with one neuron, taking a 1-dimensional array as input, with the activation function defaulted to linear (identity map).  
- Sets up a sequential (feedforward) neural network with just the layer `l0`.  
- Calls `model.compile()` to define the optimizer and loss function for training. (SGD = Stochastic Gradient Descent, which updates the weights from gradients computed on batches of the data; the noise this introduces can help escape local minima.)  
- Defines the training data.  
- Fits the model to the training data, specifying the number of epochs (full passes over the training data).  
- Predicts the output for the input 10.0, which should be close to 19, since the data follow `y = 2x - 1`.  
- Prints the learned model weights. Notice they are not exactly 2 and -1, because the network approaches them via gradient descent rather than "knowing" the solution directly.  
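
To see why the learned weights land near, but not exactly on, 2 and -1, here is a minimal sketch of the same fit written as plain gradient descent. It is not what Keras runs internally. The starting values of 0.0 and the learning rate of 0.01 are assumptions made for this illustration, and the function name `fit_line` is hypothetical.

```python
# Full-batch gradient descent on y = w*x + b with mean squared error.
# Assumptions (not from the notebook): w and b start at 0.0 and the
# learning rate is 0.01.
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]


def fit_line(xs, ys, epochs=500, learning_rate=0.01):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every training point
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = fit_line(xs, ys)
prediction = w * 10.0 + b
print(w, b, prediction)
```

Each step only shrinks the remaining error by a constant factor, so after a finite number of epochs the weights are close to the exact solution without reaching it.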

In [24]:
l0 = Dense(units = 1, input_shape=[1]) 
model = Sequential([l0]) 
model.compile(optimizer = "sgd", loss = "mean_squared_error") 
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype = float) 
model.fit(xs, ys, epochs = 500) 
print(model.predict(np.array([10.0])))  # pass a NumPy array; newer Keras versions reject a plain list
print("Here is what I learned: {}".format(l0.get_weights())) 

Epoch 1/500
Epoch 2/500
Epoch 3/500
...
Epoch 499/500
Epoch 500/500
[[18.978535]]
Here is what I learned: [array([[1.9968889]], dtype=float32), array([-0.99035394], dtype=float32)]


# Deep Neural Network

This is an example of a Deep Neural Network (DNN): a network with several stacked layers of neurons.  

This code does the following: 
- Load the Fashion MNIST dataset, which contains 28 x 28 grayscale images of clothing items in 10 classes (labels 0-9). It comes already split into training and test sets.  
- Rescale the pixel values from 0–255 to 0–1 and flatten each 28 x 28 image into a 784-entry vector. Keeping the inputs in a small range helps manage exploding gradients and speeds up training.  
- Set up a 3-layer model:  
  - First hidden layer: 128 neurons with ReLU activation  
  - Second hidden layer: 128 neurons with ReLU activation  
  - Output layer: 10 neurons with softmax activation (multi-class output). This outputs a 10-entry vector of probabilities, one per class 0-9, which sum to 1.  
- Compile the model by specifying the optimizer, loss function, and accuracy metric. The model is not yet trained.  
- Start a timer, train the model for 10 epochs, evaluate it on the test set, and report the elapsed time.  
- Report the number of model parameters. Calculation: `(784+1)*128 + (128+1)*128 + (128+1)*10 = 118,282`, since each neuron has one weight per input plus 1 bias.  
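
The parameter calculation above can be checked with a few lines of arithmetic. This is only a sketch: the helper name `dense_params` is made up for illustration and is not part of Keras.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit in a fully connected layer."""
    return (n_inputs + 1) * n_units


# Input size followed by the size of each Dense layer in the model
layer_sizes = [784, 128, 128, 10]

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)
print(per_layer, total)
```

The per-layer values line up with the `Param #` column that `model.summary()` prints for this model.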


In [29]:
import tensorflow as tf
import time


data = tf.keras.datasets.fashion_mnist

(training_images, training_labels), (test_images, test_labels) = data.load_data()

training_images = training_images.reshape(60000, 784) / 255.0
test_images = test_images.reshape(10000, 784) / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

start_time = time.time()

model.fit(training_images, training_labels, epochs=10)
model.evaluate(test_images, test_labels)

end_time = time.time()
training_time = end_time - start_time

print(f"Training Time: {training_time} seconds")
model.summary()

Epoch 1/10
...
Epoch 10/10
Training Time: 109.65528678894043 seconds
Model: "sequential_7"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_17 (Dense)            (None, 128)               100480    
                                                                 
 dense_18 (Dense)            (None, 128)               16512     
                                                                 
 dense_19 (Dense)            (None, 10)                1290      
                                                                 
Total params: 118,282
Trainable params: 118,282
Non-trainable params: 0
_________________________________________________________________


# Convolutional Neural Network (CNN)

This is an example of a convolutional neural network (CNN).  

A **convolution** layer slides a small n×n filter (kernel) of weights, plus a bias term, over the input. For example, a 3×3 filter on a single-channel image has 10 parameters (9 weights + 1 bias). At each position, the filter is multiplied element-wise with the pixels underneath it and the results are summed. An activation function is then applied to this linear combination. The weights of the filter are learned during training and capture local features (edges, vertical/horizontal lines, color patterns, etc.). CNNs are essentially the image-oriented analogue of DNNs: the same layered idea, but with weights shared across image positions.  
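
As a rough illustration of the sliding-filter idea, here is a pure-Python sketch of one 3×3 filter applied to a tiny single-channel image, followed by ReLU. The image and the vertical-edge kernel are hand-picked for the example (in a CNN the kernel values would be learned), and the function names are hypothetical. Like Keras, it does not flip the kernel.

```python
def relu(value):
    return max(0.0, value)


def conv2d_valid(image, kernel, bias=0.0):
    """Slide `kernel` over `image` (no padding, stride 1), then apply ReLU."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = bias
            # Element-wise multiply the kernel with the window and sum
            for i in range(k):
                for j in range(k):
                    total += kernel[i][j] * image[r + i][c + j]
            row.append(relu(total))
        output.append(row)
    return output


# 4x5 image: dark on the left, bright on the right
image = [
    [0.0, 0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0],
]

# Hand-picked kernel that responds to dark-to-bright vertical edges
kernel = [
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
]

feature_map = conv2d_valid(image, kernel)
mirrored_map = conv2d_valid([list(reversed(row)) for row in image], kernel)
print(feature_map)
print(mirrored_map)
```

The filter responds where the image goes from dark to bright. On the mirrored image the raw responses are negative, so ReLU sets them all to zero.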

This code does the following: 
- Data preparation.  
- Defining the feedforward model.  
- First convolution: slide 64 different 3×3 filters over the 28×28 input. Without padding, each filter fits at 26×26 positions (edges excluded). Each filter has 3×3 weights + 1 bias = 10 parameters, so with 64 filters we have 640 trainable parameters here. The output is a 26×26×64 array (height × width × filter index), where entry (x, y, n) is the response of filter n at position (x, y). ReLU is applied after the convolution, replacing negative values with zero to add non-linearity: `ReLU(W_3x3 * pixels + bias)` for each of the 64 filters.  
- Max pooling (no trainable parameters): reduce the 26×26×64 output to 13×13×64 by taking the max value in each 2×2 region for each filter.  
- Second convolution: apply another set of 64 new 3×3 filters to the 13×13×64 output of the previous step. Each new filter spans all 64 input channels, so the parameter count is `(3×3×64 + 1) × 64 = 36,928`. Output: 11×11×64.  
- Max pooling again: 11×11×64 → 5×5×64 (the leftover last row and column are dropped).  
- Flatten the 5×5×64 array into a 1D vector of 1600 values, ready to feed into a fully connected neural network.  
- Connect to a softmax output layer with 10 neurons (1601 parameters each, 16,010 in total).  
- Compile, train, and report model information.  
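
The shape and parameter bookkeeping in the steps above can be traced with plain arithmetic. This is a sketch with made-up helper names, assuming stride-1 convolutions without padding and non-overlapping 2×2 pooling, as in the model below.

```python
def conv_out(size, kernel=3):
    """Output width/height of a stride-1 convolution without padding."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Output width/height of non-overlapping pooling (remainder dropped)."""
    return size // pool


def conv_params(kernel, in_channels, filters):
    """Each filter has kernel*kernel*in_channels weights plus one bias."""
    return (kernel * kernel * in_channels + 1) * filters


size = 28
shapes = []
for filters in (64, 64):
    size = conv_out(size)
    shapes.append((size, size, filters))  # after the convolution
    size = pool_out(size)
    shapes.append((size, size, filters))  # after max pooling

flat = size * size * 64
params = [conv_params(3, 1, 64), conv_params(3, 64, 64), (flat + 1) * 10]
print(shapes)
print(flat, params, sum(params))
```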

In [6]:
import time
import tensorflow as tf

data = tf.keras.datasets.fashion_mnist
(training_images, training_labels), (test_images, test_labels) = data.load_data()

training_images = training_images.reshape(60000, 28, 28, 1) / 255.0
test_images = test_images.reshape(10000, 28, 28, 1) / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu", input_shape=(28,28,1)),
    tf.keras.layers.MaxPooling2D(2, 2),
    
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2, 2),
    
    tf.keras.layers.Flatten(),  

    tf.keras.layers.Dense(10, activation="softmax")  
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

start_time = time.time()
model.fit(training_images, training_labels, epochs=5)
model.evaluate(test_images, test_labels)

end_time = time.time()
training_time = end_time - start_time

print(f"Training Time: {training_time} seconds")
model.summary() 

Epoch 1/5
...
Epoch 5/5
Training Time: 214.01053524017334 seconds
Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_5 (Conv2D)           (None, 26, 26, 64)        640       
                                                                 
 max_pooling2d_5 (MaxPooling  (None, 13, 13, 64)       0         
 2D)                                                             
                                                                 
 conv2d_6 (Conv2D)           (None, 11, 11, 64)        36928     
                                                                 
 max_pooling2d_6 (MaxPooling  (None, 5, 5, 64)         0         
 2D)                                                             
                                                                 
 flatten_1 (Flatten)         (None, 1600)              0         
                                                      

# Combined CNN + DNN

The previous examples were only DNNs or CNNs. This example combines both approaches.  

- Use the convolutional layers from the CNN to extract spatial features from the input images.  
- Flatten the output of the convolutional layers to feed it into a fully connected DNN.  
- Compare **training time**, **accuracy**, and **number of parameters** across the three models:  
  1. DNN only  
  2. CNN only  
  3. CNN + DNN combined  


In [7]:
import tensorflow as tf

data = tf.keras.datasets.fashion_mnist
(training_images, training_labels), (test_images, test_labels) = data.load_data()

training_images = training_images.reshape(60000, 28, 28, 1)
training_images = training_images / 255.0
test_images = test_images.reshape(10000, 28, 28, 1)
test_images = test_images / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax),
])

model.compile(optimizer= 'adam', loss= 'sparse_categorical_crossentropy', metrics = ['accuracy'])
model.fit(training_images, training_labels, epochs= 10)
model.evaluate(test_images, test_labels)

classifications = model.predict(test_images)
print(classifications[0])
print(test_labels[0])

Epoch 1/10
...
Epoch 10/10
[1.6829333e-11 2.9011214e-12 1.4376247e-13 9.9139134e-12 3.7999070e-14
 1.7961634e-08 1.6138226e-14 7.5122242e-10 4.2089064e-14 9.9999994e-01]
9
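
The printed vector is the softmax output for the first test image: ten probabilities, one per class, with nearly all of the mass on class 9, which matches the true label. Here is a small sketch of how softmax produces such a vector and how the predicted class is read off. The three raw scores are made up for illustration, while `notebook_probs` is copied from the output above.

```python
import math


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for three classes (not from the notebook)
example_probs = softmax([2.0, 1.0, 0.1])

# Probability vector printed by the notebook for the first test image
notebook_probs = [
    1.6829333e-11, 2.9011214e-12, 1.4376247e-13, 9.9139134e-12, 3.7999070e-14,
    1.7961634e-08, 1.6138226e-14, 7.5122242e-10, 4.2089064e-14, 9.9999994e-01,
]

# The predicted class is the index of the largest probability
predicted_class = max(range(len(notebook_probs)), key=lambda i: notebook_probs[i])
print(example_probs)
print(predicted_class, sum(notebook_probs))
```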


# RGB Image Classification: People vs. Horses

Now we take a look at another dataset, this time with **3 color channels (R, G, B)** and larger images.  
Due to the number of parameters and data size, this may take a few minutes to run.  
The task is to classify images as **people vs. horses**; the images are taken at various angles, and the subjects vary in size and color.  
The model must **fit just enough to capture important distinguishing features**, without overfitting to irrelevant variations in angle, size, or color.  


In [9]:
import os
import urllib.request
import zipfile

import tensorflow as tf
from PIL import Image
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_url = "https://storage.googleapis.com/learning-datasets/horse-or-human.zip"
validation_url = "https://storage.googleapis.com/learning-datasets/validation-horse-or-human.zip"

os.makedirs("data", exist_ok=True)
file_name = "data/horse-or-human.zip" 
validation_file_name = "data/validation-horse-or-human.zip"  

training_dir = "data/horse-or-human/training"  
validation_dir = "data/horse-or-human/validation"  

urllib.request.urlretrieve(train_url, file_name)
urllib.request.urlretrieve(validation_url, validation_file_name)

with zipfile.ZipFile(file_name, "r") as zip_ref:
    zip_ref.extractall(training_dir)

with zipfile.ZipFile(validation_file_name, "r") as zip_ref:
    zip_ref.extractall(validation_dir)
    
train_datagen = ImageDataGenerator(rescale=1/255)
train_generator = train_datagen.flow_from_directory(
    training_dir, 
    target_size=(300, 300),
    class_mode="binary"
)


validation_datagen = ImageDataGenerator(rescale=1/255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir, 
    target_size=(300, 300),
    class_mode="binary"
)

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(16, (3, 3), activation="relu", input_shape=(300, 300, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid")
])

model.compile(loss= "binary_crossentropy", optimizer = RMSprop(learning_rate=0.001), metrics = ["accuracy"])
history = model.fit(train_generator, epochs = 10, validation_data = validation_generator)

Found 1027 images belonging to 2 classes.
Found 256 images belonging to 2 classes.
Epoch 1/10
...
Epoch 10/10


# Dropout Regularization and Image Augmentation

The following are examples of **dropout regularization** and **image augmentation**.  

**Dropout** reduces overfitting during training by randomly setting a subset of neuron outputs to 0 and scaling the remaining outputs by `1 / (1 - dropout rate)`.  

- This prevents the network from relying too heavily on any single sequence of neurons, encouraging it to learn more generalizable features.  
- The subset is selected using a random boolean mask with one entry per neuron, which is regenerated on every training step (each batch in each epoch).  
- Dropout is only active during training. During evaluation and prediction all neurons are used and no scaling is applied.  
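
The mask-and-rescale mechanics can be sketched in plain Python. This illustrates the idea and is not the Keras implementation. The function name, the fixed random seed, and the all-ones activations are assumptions chosen to make the effect easy to see.

```python
import random


def dropout(activations, rate, training, rng):
    """Zero a random subset of activations and rescale the survivors."""
    if not training or rate == 0.0:
        # Evaluation mode: pass everything through unchanged
        return list(activations)
    keep_scale = 1.0 / (1.0 - rate)
    # Fresh boolean mask on every call: True means the neuron is kept
    mask = [rng.random() >= rate for _ in activations]
    return [a * keep_scale if keep else 0.0 for a, keep in zip(activations, mask)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
activations = [1.0] * 10000

train_out = dropout(activations, rate=0.2, training=True, rng=rng)
eval_out = dropout(activations, rate=0.2, training=False, rng=rng)

dropped_fraction = train_out.count(0.0) / len(train_out)
mean_train = sum(train_out) / len(train_out)
print(dropped_fraction, mean_train)
```

About 20% of the outputs are zeroed and the rest are scaled by 1.25, so the average activation stays close to its original value. This is why no correction is needed at evaluation time.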

**Image augmentation** is another technique to reduce overfitting.  

- It modifies images in the training set to expose the network to a broader variety of patterns, improving performance on unseen data.  
- Augmentation helps up to a point. Overdoing it can introduce unrealistic features (e.g., a swirl effect on a horse) that the model will never encounter in real-world data.
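
To make the idea concrete, here is a toy sketch of two of the simplest augmentations, a horizontal flip and a sideways shift that fills the gap with the nearest edge value, applied to a tiny image stored as nested lists. The function names are hypothetical. The notebook itself relies on `ImageDataGenerator` options such as `horizontal_flip` and `fill_mode="nearest"`, which this only loosely imitates.

```python
def horizontal_flip(image):
    """Mirror every row of the image left to right."""
    return [list(reversed(row)) for row in image]


def shift_right(image, pixels):
    """Shift right by `pixels`, filling the gap with the nearest edge value."""
    shifted = []
    for row in image:
        fill = [row[0]] * pixels
        shifted.append(fill + row[:len(row) - pixels])
    return shifted


image = [
    [1, 2, 3],
    [4, 5, 6],
]

flipped = horizontal_flip(image)
shifted = shift_right(image, 1)
print(flipped)
print(shifted)
```

The label of the image does not change under either transformation, which is what makes these variants usable as extra training examples.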


In [8]:
import os
import urllib.request
import zipfile

import tensorflow as tf
from PIL import Image
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_url = "https://storage.googleapis.com/learning-datasets/horse-or-human.zip"
validation_url = "https://storage.googleapis.com/learning-datasets/validation-horse-or-human.zip"

os.makedirs("data", exist_ok=True)
file_name = "data/horse-or-human.zip" 
validation_file_name = "data/validation-horse-or-human.zip"  

training_dir = "data/horse-or-human/training"  
validation_dir = "data/horse-or-human/validation"  

urllib.request.urlretrieve(train_url, file_name)
urllib.request.urlretrieve(validation_url, validation_file_name)

with zipfile.ZipFile(file_name, "r") as zip_ref:
    zip_ref.extractall(training_dir)

with zipfile.ZipFile(validation_file_name, "r") as zip_ref:
    zip_ref.extractall(validation_dir)

train_datagen2 = ImageDataGenerator(
    rescale=1/255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode="nearest"
)
train_generator2 = train_datagen2.flow_from_directory(
    training_dir, 
    target_size=(300, 300),
    class_mode="binary"
)


validation_datagen = ImageDataGenerator(rescale=1/255)
validation_generator = validation_datagen.flow_from_directory(
    validation_dir, 
    target_size=(300, 300),
    class_mode="binary"
)

model2 = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(16, (3, 3), activation="relu", input_shape=(300, 300, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1, activation="sigmoid")
])

model2.compile(loss= "binary_crossentropy", optimizer = RMSprop(learning_rate=0.001), metrics = ["accuracy"])
history2 = model2.fit(train_generator2, epochs = 10, validation_data = validation_generator)
model2.summary()

Found 1027 images belonging to 2 classes.
Found 256 images belonging to 2 classes.
Epoch 1/10
...
Epoch 10/10
Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_10 (Conv2D)          (None, 298, 298, 16)      448       
                                                                 
 max_pooling2d_10 (MaxPoolin  (None, 149, 149, 16)     0         
 g2D)                                                            
                                                                 
 conv2d_11 (Conv2D)          (None, 147, 147, 32)      4640      
                                                                 
 max_pooling2d_11 (MaxPoolin  (None, 73, 73, 32)       0         
 g2D)                                                            
                                                                 
 conv2d_12 (Conv2D)          (None, 71, 71, 64)        18496     
          

# Transfer Learning Example

**Transfer learning** allows you to leverage larger, pre-trained networks to save computing power.  

- It involves "freezing" the weights learned by a pre-trained model (often trained on massive datasets with powerful hardware) and only training a few new layers at the end.  

Starting at the line `base_model = MobileNetV2()`, the following occurs:  

- Loads **MobileNetV2**, a convolutional neural network (CNN) with pre-trained weights from **ImageNet** (1.2M images, 1000 classes). The top layer is excluded since we do not need a 1000-class softmax.  
- Freezes the base model (`trainable=False`) so the pre-trained weights are not updated during training.  
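- The output of the base model is a `(7, 7, 1280)` feature map: 1280 feature channels on a 7×7 spatial grid, produced by the network's stack of convolutional layers.  
- Applies **global average pooling 2D**, reducing `(7, 7, 1280)` → `(1280,)` by averaging each channel over the 7×7 grid. This passes far fewer values to the next layer than flattening would (1,280 instead of 62,720).  
- Adds a 128-neuron ReLU layer with dropout (rate 0.2), followed by a single sigmoid neuron for **binary classification**.  
- Finally, we test the model on a few images. We set a threshold at 0.5 for the sigmoid and output the sigmoid probabilities.  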
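
Global average pooling is simple enough to sketch by hand: each channel is averaged over all spatial positions. The tiny 2×2×3 feature map below is made up for illustration and the function name is hypothetical. The size comparison at the end uses the `(7, 7, 1280)` shape and the 128-unit head from this section.

```python
def global_average_pool(feature_map):
    """Average each channel over all spatial positions: (H, W, C) -> (C,)."""
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    sums = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                sums[c] += value
    return [s / (height * width) for s in sums]


# Made-up 2x2 feature map with 3 channels per position
feature_map = [
    [[1.0, 10.0, 0.0], [3.0, 20.0, 0.0]],
    [[5.0, 30.0, 0.0], [7.0, 40.0, 4.0]],
]
pooled = global_average_pool(feature_map)

# Size comparison for the (7, 7, 1280) MobileNetV2 output
flattened_size = 7 * 7 * 1280
pooled_size = 1280

# Trainable parameters in the new head: Dense(128) followed by Dense(1)
head_params = (pooled_size + 1) * 128 + (128 + 1) * 1
print(pooled, flattened_size, pooled_size, head_params)
```

With the base model frozen, only the head parameters counted here are updated during training.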


In [10]:
import os
import zipfile

import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.image import ImageDataGenerator

file_name = "data/Class_Carnations.zip"  
file_name2 = "data/Class_Roses.zip"   
training_dir = "data/Rose-or-Carnation/training"  
test_zip = "data/Test.zip"
testing_dir = "data/Rose-or-Carnation/testing"

with zipfile.ZipFile(file_name, "r") as zip_ref:
    zip_ref.extractall(training_dir)

with zipfile.ZipFile(file_name2, "r") as zip_ref:
    zip_ref.extractall(training_dir)

os.makedirs(testing_dir, exist_ok=True)

with zipfile.ZipFile(test_zip, "r") as zip_ref:
    zip_ref.extractall(testing_dir)

train_datagen = ImageDataGenerator(rescale=1/255, validation_split=0.1)
test_datagen = ImageDataGenerator(rescale=1/255)

train_generator = train_datagen.flow_from_directory(
    training_dir,
    target_size=(224, 224),
    class_mode="binary",
    subset="training",
    seed=17 #set random seed 
)

val_generator = train_datagen.flow_from_directory(
    training_dir,
    target_size=(224, 224),
    class_mode="binary",
    subset="validation",
    seed=17  # same seed, keeping the validation set separate from training
)

test_generator = test_datagen.flow_from_directory(
    testing_dir,
    target_size=(224, 224),  # Important: match MobileNetV2 input size
    class_mode="binary",
    shuffle=False
)

base_model = MobileNetV2(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
base_model.trainable = False  

x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.2)(x) 
x = Dense(1, activation='sigmoid')(x)

model = Model(inputs=base_model.input, outputs=x)

model.compile(optimizer=Adam(1e-4), loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(train_generator, validation_data=val_generator, epochs=10)

# Evaluate the model
test_loss, test_accuracy = model.evaluate(test_generator)
print(f"Test Loss: {test_loss:.4f}")
print(f"Test Accuracy: {test_accuracy:.4f}")

predictions = model.predict(test_generator) 

predicted_classes = (predictions > 0.5).astype(int).flatten()

filenames = test_generator.filenames

for filename, pred, prob in zip(filenames, predicted_classes, predictions.flatten()):
    print(f"{filename}: Predicted class -> {pred} (Confidence: {prob:.2f})")

Found 1170 images belonging to 2 classes.
Found 128 images belonging to 2 classes.
Found 4 images belonging to 1 classes.
Epoch 1/10
...
Epoch 10/10
Test Loss: 1.6949
Test Accuracy: 0.5000
Test\pic_1.jpg: Predicted class -> 0 (Confidence: 0.27)
Test\pic_2.jpg: Predicted class -> 0 (Confidence: 0.06)
Test\pic_3.jpg: Predicted class -> 1 (Confidence: 0.99)
Test\pic_4.jpg: Predicted class -> 1 (Confidence: 0.73)
