# Task 1

Check out the official repository, which contains many example Keras implementations of various kinds
of deep neural networks. We recommend cloning this repository and getting some of these
examples running on your system (or on Colab/DeepNote). In particular, experiment with the `mnist_mlp.py`
and `mnist_cnn.py` scripts, which show how to build simple neural networks for the MNIST dataset
(useful for the next task).

Next, take two well-known datasets: Fashion MNIST (introduced in Ch 10, p. 298) and CIFAR-10.
The first contains 28x28 grayscale images in 10 categories, with 60,000 images
for training and 10,000 for testing; the second contains 32x32x3 RGB images (50,000/10,000
train/test). Apply two reference networks to the Fashion MNIST dataset.

## (a) MLP

Experiment with different initializations, activations, optimizers (and
their hyperparameters), and regularizations (L1, L2, Dropout, no Dropout). You may also experiment
with changing the architecture of both networks: adding or removing layers, changing the number of
convolutional filters, their sizes, etc.
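
To make the regularization options listed above concrete, here is a minimal NumPy sketch of the arithmetic behind an L1/L2 weight penalty and dropout. It is an illustration only, not how Keras implements these layers internally; the helper names `l1_penalty`, `l2_penalty`, and `inverted_dropout` are hypothetical, and the toy weights are made up.

```python
import numpy as np

def l1_penalty(weights, coef):
    """L1 term added to the loss: coef * sum of absolute weights."""
    return coef * np.sum(np.abs(weights))

def l2_penalty(weights, coef):
    """L2 term added to the loss: coef * sum of squared weights."""
    return coef * np.sum(np.square(weights))

def inverted_dropout(activations, rate, rng):
    """Training-time dropout: zero each unit with probability `rate`
    and rescale the survivors by 1 / (1 - rate) so the expected value is unchanged."""
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)

# Toy weight matrix (made up for illustration).
weights = np.array([[0.5, -1.0], [2.0, 0.0]])
print("L1 penalty:", l1_penalty(weights, 1e-4))  # about 3.5e-4
print("L2 penalty:", l2_penalty(weights, 1e-4))  # about 5.25e-4

# With rate=0.1 roughly 10% of units are zeroed, yet the mean stays near 1.
activations = np.ones((1000, 100))
dropped = inverted_dropout(activations, rate=0.1, rng=rng)
print("fraction zeroed:", np.mean(dropped == 0.0))
print("mean after dropout:", dropped.mean())
```

The penalty grows with the size of the weights, which is why a small coefficient such as 1e-4 is a common starting point; dropout is only active during training and is a no-op at inference time.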

In [1]:
import tensorflow as tf



In [3]:
# Check number of available GPUs

print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

Num GPUs Available:  1


In [6]:
# Use TensorFlow's bundled Keras to ensure compatibility with GPUs
import tensorflow as tf
from tensorflow import keras
import os
from sklearn.preprocessing import StandardScaler

random_state = 900
keras.utils.set_random_seed(random_state)

# Try to enable memory growth for all GPUs so TF doesn't reserve all GPU memory upfront
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        print('Enabled memory growth for GPUs:', gpus)
    except Exception as e:
        print('Could not set memory growth:', e)
else:
    print('No GPU devices found by TensorFlow')



fashion_mnist = keras.datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()
print(X_train_full.shape, y_train_full.shape)

X_valid, X_train = X_train_full[:5000] / 255.0, X_train_full[5000:] / 255.0
y_valid, y_train = y_train_full[:5000], y_train_full[5000:]

X_test = X_test / 255.0
print(X_train.shape, y_train.shape)
print(X_valid.shape, y_valid.shape)

Enabled memory growth for GPUs: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
(60000, 28, 28) (60000,)
(55000, 28, 28) (55000,)
(5000, 28, 28) (5000,)
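
The split and scaling performed in the cell above can be checked on a small synthetic array without downloading the dataset. This is a sketch under stated assumptions: the random `X_full` and `y_full` arrays and the validation size of 10 are stand-ins for the real data and for the 5,000 images held out in the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for X_train_full / y_train_full: 100 fake 28x28 uint8 images.
X_full = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)
y_full = rng.integers(0, 10, size=100)

n_valid = 10  # the notebook holds out the first 5000 images instead

# Same pattern as the notebook: first n_valid samples for validation, the rest for training.
X_valid, X_train = X_full[:n_valid] / 255.0, X_full[n_valid:] / 255.0
y_valid, y_train = y_full[:n_valid], y_full[n_valid:]

print(X_full.dtype, "->", X_train.dtype)  # integer pixels become floats
print(X_train.shape, X_valid.shape)
print("range:", X_train.min(), "to", X_train.max())
```

Dividing by 255.0 maps the integer pixel values into the interval [0, 1], which keeps the inputs on a scale that gradient-based training handles well.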


### Model 1

Below is our first attempt at building a neural network. We use the SGD optimizer.

In [None]:
## First attempt

# Build model
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(400, activation='leaky_relu', kernel_regularizer=keras.regularizers.l2(0.0001)),
    keras.layers.Dropout(0.1),
    keras.layers.Dense(200, activation="leaky_relu", kernel_regularizer=keras.regularizers.l2(0.0001)),
    keras.layers.Dropout(0.05),
    keras.layers.Dense(100, activation="leaky_relu"),
    keras.layers.Dropout(0.05),
    keras.layers.Dense(40, activation="leaky_relu"),    
    keras.layers.Dense(10, activation="softmax")
])
early_stopping = keras.callbacks.EarlyStopping(
    patience=6,
    restore_best_weights=True
)

model.summary()


model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer=keras.optimizers.SGD(learning_rate=5e-3),
    metrics=[
        "accuracy",
        # tf.keras.metrics.Precision(), tf.keras.metrics.Recall()
    ],
)
model.fit(
    X_train, y_train,
    epochs=60,
    validation_data=(X_valid, y_valid),
    callbacks=[early_stopping]
)


Epoch 1/30
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 6s 2ms/step - accuracy: 0.6762 - loss: 1.0317 - val_accuracy: 0.7982 - val_loss: 0.6764
Epoch 2/30
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 790us/step - accuracy: 0.7947 - loss: 0.6742 - val_accuracy: 0.8346 - val_loss: 0.5760
Epoch 3/30
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 661us/step - accuracy: 0.8200 - loss: 0.6048 - val_accuracy: 0.8478 - val_loss: 0.5369
Epoch 4/30
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 808us/step - accuracy: 0.8331 - loss: 0.5642 - val_accuracy: 0.8542 - val_loss: 0.5124
Epoch 5/30
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 737us/step - accuracy: 0.8410 - loss: 0.5390 - val_accuracy: 0.8596 - val_loss: 0.4961
Epoch 6/30
1719/1719 ━━━━━━━━━━━━━━━━━━━━ 1s 748us/step - accuracy: 0.8497 - loss: 0.5175 - val_accuracy: 0.8644 - val_loss: 0.4813
Epoch 7/30
...

<keras.src.callbacks.history.History at 0x70fca8773a40>
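
The `EarlyStopping` callback used above (with `patience=6` and `restore_best_weights=True`) can be understood through a small plain-Python sketch. This is a simplified model of the behaviour, not the Keras implementation: it monitors only the validation loss and ignores options such as `min_delta`. The function name `early_stopping_epoch` is hypothetical and the loss values are made up, not taken from the run above.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 1-based.

    Training stops once `patience` consecutive epochs pass without a new
    best validation loss; restoring the best weights means the model ends
    up in the state it had at `best_epoch`.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Ran out of epochs before the patience budget was exhausted.
    return len(val_losses), best_epoch

# Made-up validation losses: the best value occurs at epoch 5.
losses = [0.68, 0.58, 0.54, 0.51, 0.50, 0.52, 0.51, 0.53, 0.52, 0.55, 0.54, 0.56]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=6)
print("stopped after epoch", stop_epoch, "and kept weights from epoch", best_epoch)
# → stopped after epoch 11 and kept weights from epoch 5
```

A larger `patience` tolerates noisier validation curves at the cost of extra epochs; with `restore_best_weights=True` those extra epochs do not degrade the final model.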

In [35]:
# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(X_test, y_test)
print('Test accuracy:', test_acc)
print('Test loss:', test_loss)
# Accuracy from an earlier run: 0.8834999799728394

313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8812 - loss: 0.4220
Test accuracy: 0.8812000155448914
Test loss: 0.42198416590690613
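
Two numbers from Model 1 can be checked by hand: the parameter count that `model.summary()` reports (its output is not shown here) and the step counts 1719 and 313 that appear in the progress bars. The sketch below does the arithmetic, assuming the default Keras batch size of 32, since neither `fit` nor `evaluate` is given a `batch_size`; the helper name `dense_params` is hypothetical.

```python
import math

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

# Flatten turns each 28x28 image into 784 inputs; Dropout layers add no parameters.
layer_sizes = [28 * 28, 400, 200, 100, 40, 10]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
print(per_layer)       # → [314000, 80200, 20100, 4040, 410]
print(sum(per_layer))  # → 418750

batch_size = 32  # assumed: Keras default when no batch_size is passed
print(math.ceil(55_000 / batch_size))  # training steps per epoch → 1719
print(math.ceil(10_000 / batch_size))  # evaluation steps on the test set → 313
```

Most of the parameters sit in the first dense layer, so changing its width has the largest effect on model size.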


### Model 2

For our second attempt we narrow the hidden layers (300, 100, 100, and 30 units) and switch from SGD to the Adam optimizer with a learning rate of 1e-4.

In [1]:
## second attempt

# Build model
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(300, activation='leaky_relu', kernel_regularizer=keras.regularizers.l2(0.0001)),
    keras.layers.Dropout(0.1),
    keras.layers.Dense(100, activation="leaky_relu", kernel_regularizer=keras.regularizers.l2(0.0001)),
    keras.layers.Dropout(0.05),
    keras.layers.Dense(100, activation="leaky_relu"),
    keras.layers.Dropout(0.05),
    keras.layers.Dense(30, activation="leaky_relu"),    
    keras.layers.Dense(10, activation="softmax")
])
early_stopping = keras.callbacks.EarlyStopping(
    patience=6,
    restore_best_weights=True
)

model.summary()


model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer=keras.optimizers.Adam(learning_rate=1e-4),
    metrics=[
        "accuracy",
        # tf.keras.metrics.Precision(), tf.keras.metrics.Recall()
    ],
)
model.fit(
    X_train, y_train,
    epochs=60,
    validation_data=(X_valid, y_valid),
    callbacks=[early_stopping]
)



In [None]:
# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(X_test, y_test)
print('Test accuracy:', test_acc)
print('Test loss:', test_loss)
# 0.8866999745368958

313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8867 - loss: 0.3845
Test accuracy: 0.8866999745368958
Test loss: 0.38445159792900085
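
The main change between the two models is the optimizer, so it may help to see how a single SGD step differs from a single Adam step. The sketch below uses the textbook update rules in NumPy; it is not the Keras implementation, and the function names `sgd_step` and `adam_step` as well as the toy gradient are assumptions made for illustration. The learning rates match the two models (5e-3 for SGD, 1e-4 for Adam).

```python
import numpy as np

def sgd_step(w, grad, lr):
    """Plain SGD: move against the gradient, scaled by the learning rate."""
    return w - lr * grad

def adam_step(w, grad, m, v, t, lr, beta_1=0.9, beta_2=0.999, eps=1e-7):
    """One textbook Adam update with bias-corrected moment estimates."""
    m = beta_1 * m + (1 - beta_1) * grad        # running mean of gradients
    v = beta_2 * v + (1 - beta_2) * grad**2     # running mean of squared gradients
    m_hat = m / (1 - beta_1**t)                 # bias correction for step t
    v_hat = v / (1 - beta_2**t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

w0 = np.zeros(2)
grad = np.array([10.0, 0.01])  # toy gradient with very different scales per weight

w_sgd = sgd_step(w0, grad, lr=5e-3)
w_adam, m, v = adam_step(w0, grad, m=np.zeros(2), v=np.zeros(2), t=1, lr=1e-4)

print("SGD :", w_sgd)   # step size is proportional to each gradient component
print("Adam:", w_adam)  # both components move by roughly the learning rate
```

Because Adam normalizes each step by the gradient's recent magnitude, weights with small gradients still make progress, which is one reason a much smaller learning rate is typically used with it than with SGD.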


## (b) CNN

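Here we build a convolutional network: a 7x7 convolution with 64 filters, followed by two blocks of paired 3x3 convolutions (128 and 256 filters). Every convolution is followed by batch normalization, and max pooling halves the feature maps after each block. Two dense layers with dropout feed the final softmax layer. The model is trained with SGD.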

In [None]:
model = keras.models.Sequential([
    keras.layers.Conv2D(64, 7, activation="leaky_relu", padding="same",
                        input_shape=[28, 28, 1]),
    keras.layers.BatchNormalization(),
    keras.layers.MaxPooling2D(2),
    keras.layers.Conv2D(128, 3, activation="leaky_relu", padding="same"),
    keras.layers.BatchNormalization(),
    keras.layers.Conv2D(128, 3, activation="leaky_relu", padding="same"),
    keras.layers.BatchNormalization(),
    keras.layers.MaxPooling2D(2),
    # keras.layers.Dropout(0.12),
    keras.layers.Conv2D(256, 3, activation="leaky_relu", padding="same"),
    keras.layers.BatchNormalization(),
    keras.layers.Conv2D(256, 3, activation="leaky_relu", padding="same"),
    keras.layers.BatchNormalization(),
    keras.layers.MaxPooling2D(2),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation="leaky_relu"),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(64, activation="leaky_relu"),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(10, activation="softmax")
])
model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer=keras.optimizers.SGD(learning_rate=2e-3),
    metrics=[
        "accuracy",
        # tf.keras.metrics.Precision(), tf.keras.metrics.Recall()
    ],
)
model.fit(
    X_train, y_train,
    epochs=30,
    validation_data=(X_valid, y_valid),
    # callbacks=[early_stopping]
)

# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(X_test, y_test)
print('Test accuracy:', test_acc)
# Test accuracy recorded for the variants tried:
# base: 0.8420000076293945
# leaky: 0.8525000214576721
# leaky + L2 regularization: 0.8472999930381775
# leaky + batch normalization: 0.8978000283241272

print(tf.__version__)
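
The accuracies recorded in the comments above suggest that batch normalization gave the largest gain of the variants tried (roughly 0.85 to 0.90). As a rough illustration of what the layer computes during training, here is a NumPy sketch of the forward pass for channels-last feature maps. It is a simplification: it omits the moving averages that Keras keeps for inference, the function name `batch_norm_train` is hypothetical, and the input is random data, not real activations.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-3):
    """Training-mode batch normalization over all axes except the last (channels).

    Each channel is shifted and scaled to zero mean and unit variance using the
    statistics of the current batch, then rescaled by gamma and shifted by beta.
    """
    axes = tuple(range(x.ndim - 1))
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(0)

# Fake activations: batch of 32, 14x14 feature maps, 64 channels, far from zero mean.
x = rng.normal(loc=5.0, scale=3.0, size=(32, 14, 14, 64))
y = batch_norm_train(x, gamma=np.ones(64), beta=np.zeros(64))

print("before:", x.mean(), x.std())
print("after (first 3 channels):",
      y.mean(axis=(0, 1, 2))[:3], y.std(axis=(0, 1, 2))[:3])
```

After normalization every channel has mean close to 0 and standard deviation close to 1, regardless of the scale of the incoming activations, which tends to make deeper stacks of convolutions easier to train.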

In [None]:
test_loss, test_acc = model.evaluate(X_test, y_test)
print('Test accuracy:', test_acc)
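
To see how the CNN above shrinks the image and where its parameters sit, the following plain-Python sketch traces the feature-map shape through the convolution and pooling layers and adds up the parameters. It assumes the layer settings used in the model: `padding="same"` convolutions with stride 1 keep the spatial size, and `MaxPooling2D(2)` halves it with rounding down. The helper names and the `layers` list are hypothetical and exist only for this illustration.

```python
def conv_params(kernel, c_in, c_out):
    """Weights plus biases of a square 2D convolution."""
    return kernel * kernel * c_in * c_out + c_out

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

# (kind, filters, kernel size) for the convolutional part of the model.
layers = [
    ("conv", 64, 7), ("pool",),
    ("conv", 128, 3), ("conv", 128, 3), ("pool",),
    ("conv", 256, 3), ("conv", 256, 3), ("pool",),
]

size, channels, total_params = 28, 1, 0
for layer in layers:
    if layer[0] == "conv":
        _, filters, kernel = layer
        total_params += conv_params(kernel, channels, filters)
        # BatchNormalization follows every convolution: gamma, beta,
        # moving mean, and moving variance, one value each per channel.
        total_params += 4 * filters
        channels = filters
    else:
        size //= 2  # 2x2 max pooling: 28 -> 14 -> 7 -> 3
    print(layer[0], (size, size, channels))

flat_features = size * size * channels
n_in = flat_features
for n_out in (128, 64, 10):
    total_params += dense_params(n_in, n_out)
    n_in = n_out

print("flattened features:", flat_features)  # → 2304
print("total parameters:", total_params)     # → 1417162
```

The count includes the non-trainable moving statistics of the batch normalization layers, so it corresponds to the total rather than the trainable figure in a model summary.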