# Handling Overfitting with CIFAR-10

### Q1: How does adding dropout layers affect training vs validation accuracy?


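In this run, dropout (rates 0.5 and 0.3) lowered both training and validation accuracy: at epoch 9 training accuracy was 0.297 against 0.508 for the baseline, and final test accuracy fell from 0.500 to 0.366. Validation accuracy stayed above training accuracy in every logged epoch, so the dropout model is not overfitting, but it is underfitting. The dropout rates are probably too aggressive for this small dense network trained for only 15 epochs.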
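To make the mechanism concrete, here is a minimal sketch of inverted dropout in plain Python. The `dropout` helper and the all-ones activations are illustrative assumptions, not the Keras implementation, though Keras's `Dropout` layer applies the same idea: units are zeroed only during training, and the survivors are scaled by `1 / (1 - rate)`.

```python
import random

def dropout(values, rate, training, rng):
    """Inverted dropout on a list of activations (illustrative helper)."""
    if not training or rate == 0.0:
        return list(values)  # inference: activations pass through unchanged
    keep = 1.0 - rate
    # Zero each unit with probability `rate`; scale survivors by 1/keep
    # so the expected activation matches what the next layer sees at inference.
    return [v / keep if rng.random() < keep else 0.0 for v in values]

rng = random.Random(0)
acts = [1.0] * 10_000  # hypothetical activations
train_out = dropout(acts, rate=0.5, training=True, rng=rng)
eval_out = dropout(acts, rate=0.5, training=False, rng=rng)

dropped_fraction = train_out.count(0.0) / len(train_out)
mean_train = sum(train_out) / len(train_out)
print(f"dropped: {dropped_fraction:.2f}, mean while training: {mean_train:.2f}")
```

Because training accuracy is measured with units dropped while validation accuracy is measured with the full network, validation accuracy can sit above training accuracy, as it does in the dropout logs in this notebook.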


### Q2: Does early stopping prevent wasted training time?

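Yes. With `monitor="val_loss"` and `patience=3`, training halts once validation loss has failed to improve for three consecutive epochs instead of always running all 30, and `restore_best_weights=True` rolls the model back to its best epoch. It did not raise accuracy in this run, though: 0.339 with early stopping versus 0.366 for the same dropout model trained for a fixed 15 epochs.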
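The patience rule can be sketched in a few lines of plain Python. This is a simplified model of the stopping logic, assuming `min_delta=0` and ignoring the other options of the Keras callback; the function name and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 1-indexed."""
    best_loss = float("inf")
    best_epoch = 0
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch  # stop; best weights are from best_epoch
    return len(val_losses), best_epoch  # ran out of epochs without triggering

# Hypothetical validation losses, not taken from the logs in this notebook.
hypothetical = [1.98, 1.94, 1.91, 1.96, 1.95, 1.93, 1.90, 1.89]
stop_epoch, best_epoch = early_stopping_epoch(hypothetical, patience=3)
print("stopped at epoch", stop_epoch, "best epoch", best_epoch)
```

With these values the loop stops at epoch 6 and keeps the weights from epoch 3, even though later epochs would have improved further. A small `patience` can therefore cut training short on a noisy validation curve.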

### Q3: Can L2 weight regularization improve generalization?

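It can, but it did not in this run. L2 penalizes large weights, which typically smooths the decision function, yet test accuracy with `l2(0.001)` was 0.483 against 0.500 for the baseline. The baseline shows only a small train/validation gap in the logged epochs (0.508 vs. 0.485 at epoch 9), so there was little overfitting for the penalty to correct.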
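As a rough illustration of what the penalty adds to the objective, the sketch below computes `lam * sum(w**2)`, which is the form `regularizers.l2(lam)` uses. The helper names and the weight values are assumptions chosen for the example.

```python
def l2_penalty(weights, lam):
    """L2 penalty for one weight tensor, flattened to a list: lam * sum(w^2)."""
    return lam * sum(w * w for w in weights)

def regularized_loss(data_loss, layer_weights, lam):
    """Data loss plus the L2 penalty of every regularized layer."""
    return data_loss + sum(l2_penalty(w, lam) for w in layer_weights)

small = [0.1, -0.2, 0.05]   # hypothetical small weights
large = [1.0, -2.0, 0.5]    # the same weights scaled by 10

penalty_small = l2_penalty(small, lam=0.001)
penalty_large = l2_penalty(large, lam=0.001)
total = regularized_loss(1.5, [small, large], lam=0.001)
print(penalty_small, penalty_large, total)
```

Scaling the weights by 10 raises the penalty by a factor of 100, which is what pushes the optimizer toward small weights. Keras includes this penalty in the `loss` it reports, so the loss values in the L2 log are not directly comparable to the baseline's.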

### Q4: How does model depth affect overfitting?

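As a general expectation, deeper networks fit the training data faster but are more prone to overfitting, while shallower ones are more stable but less accurate. This notebook does not vary depth (every model uses two hidden layers of 512 and 256 units), so this answer is not backed by a measurement here.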
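One way to reason about depth before running an experiment is to count trainable parameters, which is only a rough proxy for capacity. The sketch below does this for fully connected stacks on flattened 32x32x3 inputs. The `[512, 256]` configuration matches the models in this notebook; the shallower and deeper variants are hypothetical.

```python
def dense_param_count(input_dim, hidden_units, num_classes):
    """Weights plus biases of a fully connected stack."""
    sizes = [input_dim, *hidden_units, num_classes]
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

input_dim = 32 * 32 * 3  # flattened CIFAR-10 image

current = dense_param_count(input_dim, [512, 256], 10)         # as in this notebook
shallower = dense_param_count(input_dim, [512], 10)            # hypothetical variant
deeper = dense_param_count(input_dim, [512, 256, 128], 10)     # hypothetical variant

for name, n in [("shallower", shallower), ("current", current), ("deeper", deeper)]:
    print(f"{name:10s} {n:,} parameters")
```

Almost all parameters sit in the first layer, so adding narrower layers on top changes the count only slightly. Any effect of depth on overfitting in such a stack would come from the added nonlinearity rather than from raw parameter count.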

### Data & Setup

In [2]:
from tensorflow import keras
from tensorflow.keras import layers, regularizers
import numpy as np

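# Load CIFAR-10 and scale pixel values from [0, 255] to [0, 1]
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
# Labels arrive with shape (N, 1); flatten them to (N,) integer class ids
y_train = y_train.flatten()
y_test = y_test.flatten()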

input_shape = (32,32,3)
num_classes = 10
results = {}

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 43s 0us/step
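The training cells in this notebook pass the test set as `validation_data`. A common alternative is to hold out part of the training data for validation and keep the test set for the final score only. The sketch below builds such a split with the standard library; the helper name, the 10% fraction, and the seed are assumptions, and the commented lines show how the indices would be applied to the arrays loaded above.

```python
import random

def train_val_indices(n_samples, val_fraction, seed):
    """Shuffle sample indices and split them into (train, validation) lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_val = int(n_samples * val_fraction)
    return indices[n_val:], indices[:n_val]

# CIFAR-10 has 50,000 training images.
train_idx, val_idx = train_val_indices(50_000, val_fraction=0.1, seed=0)
print(len(train_idx), "training samples,", len(val_idx), "validation samples")

# With the NumPy arrays from the setup cell, the split would be applied as:
# x_tr, y_tr = x_train[train_idx], y_train[train_idx]
# x_val, y_val = x_train[val_idx], y_train[val_idx]
```

Keras's `fit` also accepts `validation_split=0.1`, which holds out the last 10% of the provided arrays before shuffling.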


### Baseline (no regularization)

In [None]:
model = keras.Sequential([
    keras.Input(shape=input_shape),
    layers.Flatten(),
    layers.Dense(512, activation="relu"),
    layers.Dense(256, activation="relu"),
    layers.Dense(num_classes, activation="softmax")
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

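# NOTE: the test set doubles as validation data here, so val_accuracy in the
# final epoch and the test accuracy below measure the same thing.
hist = model.fit(x_train, y_train, epochs=15, batch_size=128,
                 validation_data=(x_test, y_test), verbose=2)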

test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
results["baseline"] = float(test_acc)
print("Baseline test acc:", round(test_acc,3))


Epoch 1/15
391/391 - 13s - 32ms/step - accuracy: 0.3212 - loss: 1.9008 - val_accuracy: 0.3719 - val_loss: 1.7317
Epoch 2/15
391/391 - 16s - 42ms/step - accuracy: 0.3998 - loss: 1.6791 - val_accuracy: 0.4229 - val_loss: 1.6185
Epoch 3/15
391/391 - 11s - 29ms/step - accuracy: 0.4334 - loss: 1.5853 - val_accuracy: 0.4521 - val_loss: 1.5322
Epoch 4/15
391/391 - 13s - 32ms/step - accuracy: 0.4494 - loss: 1.5374 - val_accuracy: 0.4586 - val_loss: 1.5144
Epoch 5/15
391/391 - 12s - 30ms/step - accuracy: 0.4689 - loss: 1.4877 - val_accuracy: 0.4610 - val_loss: 1.5164
Epoch 6/15
391/391 - 13s - 32ms/step - accuracy: 0.4810 - loss: 1.4615 - val_accuracy: 0.4613 - val_loss: 1.5194
Epoch 7/15
391/391 - 13s - 33ms/step - accuracy: 0.4923 - loss: 1.4252 - val_accuracy: 0.4727 - val_loss: 1.4829
Epoch 8/15
391/391 - 12s - 31ms/step - accuracy: 0.5008 - loss: 1.4007 - val_accuracy: 0.4887 - val_loss: 1.4258
Epoch 9/15
391/391 - 12s - 31ms/step - accuracy: 0.5078 - loss: 1.3753 - val_accuracy: 0.4849 - 

### Dropout

In [None]:
model_do = keras.Sequential([
    keras.Input(shape=input_shape),
    layers.Flatten(),
    layers.Dense(512, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(num_classes, activation="softmax")
])

model_do.compile(optimizer="adam",
                 loss="sparse_categorical_crossentropy",
                 metrics=["accuracy"])

hist_do = model_do.fit(x_train, y_train, epochs=15, batch_size=128,
                       validation_data=(x_test, y_test), verbose=2)

_, acc_do = model_do.evaluate(x_test, y_test, verbose=0)
results["dropout"] = float(acc_do)
print("Dropout test acc:", round(acc_do,3))

Epoch 1/15
391/391 - 14s - 35ms/step - accuracy: 0.2083 - loss: 2.1076 - val_accuracy: 0.3012 - val_loss: 1.9636
Epoch 2/15
391/391 - 13s - 32ms/step - accuracy: 0.2420 - loss: 2.0058 - val_accuracy: 0.3112 - val_loss: 1.9376
Epoch 3/15
391/391 - 13s - 33ms/step - accuracy: 0.2592 - loss: 1.9736 - val_accuracy: 0.3139 - val_loss: 1.9369
Epoch 4/15
391/391 - 13s - 32ms/step - accuracy: 0.2613 - loss: 1.9664 - val_accuracy: 0.3104 - val_loss: 1.9361
Epoch 5/15
391/391 - 13s - 33ms/step - accuracy: 0.2698 - loss: 1.9468 - val_accuracy: 0.3298 - val_loss: 1.9211
Epoch 6/15
391/391 - 10s - 26ms/step - accuracy: 0.2791 - loss: 1.9322 - val_accuracy: 0.3398 - val_loss: 1.9033
Epoch 7/15
391/391 - 12s - 30ms/step - accuracy: 0.2875 - loss: 1.9192 - val_accuracy: 0.3276 - val_loss: 1.9278
Epoch 8/15
391/391 - 21s - 54ms/step - accuracy: 0.2936 - loss: 1.9054 - val_accuracy: 0.3476 - val_loss: 1.9121
Epoch 9/15
391/391 - 10s - 25ms/step - accuracy: 0.2974 - loss: 1.8907 - val_accuracy: 0.3469 - 

### Early Stopping

In [None]:
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=3, restore_best_weights=True
)

model_es = keras.Sequential([
    keras.Input(shape=input_shape),
    layers.Flatten(),
    layers.Dense(512, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(num_classes, activation="softmax")
])

model_es.compile(optimizer="adam",
                 loss="sparse_categorical_crossentropy",
                 metrics=["accuracy"])

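# NOTE: early stopping monitors val_loss on the test set here, so the best
# epoch is selected on the same data used for the final score (optimistic bias).
hist_es = model_es.fit(x_train, y_train, epochs=30, batch_size=128,
                       validation_data=(x_test, y_test),
                       callbacks=[early_stop], verbose=2)

_, acc_es = model_es.evaluate(x_test, y_test, verbose=0)
results["dropout+earlystop"] = float(acc_es)
print("Dropout+EarlyStopping test acc:", round(acc_es,3))
# len(history) is the number of epochs run, not the best epoch;
# the best epoch is the one with the lowest val_loss (1-indexed).
print("Epochs run:", len(hist_es.history["loss"]))
print("Best epoch:", int(np.argmin(hist_es.history["val_loss"])) + 1)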

Epoch 1/30
391/391 - 14s - 35ms/step - accuracy: 0.2004 - loss: 2.1302 - val_accuracy: 0.2699 - val_loss: 1.9781
Epoch 2/30
391/391 - 27s - 68ms/step - accuracy: 0.2390 - loss: 2.0182 - val_accuracy: 0.2998 - val_loss: 1.9398
Epoch 3/30
391/391 - 19s - 48ms/step - accuracy: 0.2588 - loss: 1.9771 - val_accuracy: 0.3283 - val_loss: 1.9063
Epoch 4/30
391/391 - 11s - 28ms/step - accuracy: 0.2686 - loss: 1.9444 - val_accuracy: 0.3056 - val_loss: 1.9624
Epoch 5/30
391/391 - 19s - 49ms/step - accuracy: 0.2790 - loss: 1.9308 - val_accuracy: 0.3262 - val_loss: 1.9478
Epoch 6/30
391/391 - 22s - 55ms/step - accuracy: 0.2813 - loss: 1.9288 - val_accuracy: 0.3390 - val_loss: 1.8923
Epoch 7/30
391/391 - 14s - 36ms/step - accuracy: 0.2821 - loss: 1.9098 - val_accuracy: 0.3337 - val_loss: 1.9189
Epoch 8/30
391/391 - 13s - 34ms/step - accuracy: 0.2876 - loss: 1.9063 - val_accuracy: 0.3537 - val_loss: 1.8978
Epoch 9/30
391/391 - 12s - 30ms/step - accuracy: 0.2907 - loss: 1.9015 - val_accuracy: 0.3493 - 

### L2 Weight Regularization

In [6]:
model_l2 = keras.Sequential([
    keras.Input(shape=input_shape),
    layers.Flatten(),
    layers.Dense(512, activation="relu", kernel_regularizer=regularizers.l2(0.001)),
    layers.Dense(256, activation="relu", kernel_regularizer=regularizers.l2(0.001)),
    layers.Dense(num_classes, activation="softmax")
])

model_l2.compile(optimizer="adam",
                 loss="sparse_categorical_crossentropy",
                 metrics=["accuracy"])

hist_l2 = model_l2.fit(x_train, y_train, epochs=15, batch_size=128,
                       validation_data=(x_test, y_test), verbose=2)

_, acc_l2 = model_l2.evaluate(x_test, y_test, verbose=0)
results["l2"] = float(acc_l2)
print("L2 test acc:", round(acc_l2,3))

Epoch 1/15
391/391 - 13s - 34ms/step - accuracy: 0.3246 - loss: 2.3939 - val_accuracy: 0.3771 - val_loss: 2.0045
Epoch 2/15
391/391 - 12s - 30ms/step - accuracy: 0.3942 - loss: 1.8892 - val_accuracy: 0.3746 - val_loss: 1.8915
Epoch 3/15
391/391 - 14s - 35ms/step - accuracy: 0.4189 - loss: 1.7643 - val_accuracy: 0.4444 - val_loss: 1.6983
Epoch 4/15
391/391 - 14s - 36ms/step - accuracy: 0.4366 - loss: 1.7033 - val_accuracy: 0.4361 - val_loss: 1.6892
Epoch 5/15
391/391 - 14s - 35ms/step - accuracy: 0.4452 - loss: 1.6695 - val_accuracy: 0.4522 - val_loss: 1.6425
Epoch 6/15
391/391 - 13s - 34ms/step - accuracy: 0.4577 - loss: 1.6314 - val_accuracy: 0.4622 - val_loss: 1.6187
Epoch 7/15
391/391 - 14s - 35ms/step - accuracy: 0.4634 - loss: 1.6198 - val_accuracy: 0.4318 - val_loss: 1.6993
Epoch 8/15
391/391 - 14s - 35ms/step - accuracy: 0.4708 - loss: 1.5931 - val_accuracy: 0.4794 - val_loss: 1.5719
Epoch 9/15
391/391 - 11s - 29ms/step - accuracy: 0.4732 - loss: 1.5853 - val_accuracy: 0.4525 - 

### Summary: test accuracy with and without regularization

In [8]:
print("\n=== Summary (higher is better) ===")
for k,v in results.items():
    print(f"{k:20s} -> {v:.3f}")

print("\nNotes:")
print("- Dropout: lowers train acc; helps val acc only if the model was overfitting.")
print("- EarlyStopping: stops once val_loss stalls and restores the best weights.")
print("- L2: keeps weights small; can help generalization when the model overfits.")
print("- Depth: deeper can overfit; use dropout/L2/ES to control.")

=== Summary (higher is better) ===
baseline             -> 0.500
dropout              -> 0.366
dropout+earlystop    -> 0.339
l2                   -> 0.483

Notes:
- Dropout: lowers train acc; helps val acc only if the model was overfitting.
- EarlyStopping: stops once val_loss stalls and restores the best weights.
- L2: keeps weights small; can help generalization when the model overfits.
- Depth: deeper can overfit; use dropout/L2/ES to control.
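To read the summary against the baseline rather than in absolute terms, the differences can be computed directly. This is a small sketch that hard-codes the four accuracies printed above; the `deltas_vs_baseline` helper is an illustrative name, and in the notebook the `results` dict could be passed in instead.

```python
# Test accuracies as reported in the summary above.
reported = {
    "baseline": 0.500,
    "dropout": 0.366,
    "dropout+earlystop": 0.339,
    "l2": 0.483,
}

def deltas_vs_baseline(results, baseline_key="baseline"):
    """Accuracy difference of each run relative to the baseline run."""
    base = results[baseline_key]
    return {k: round(v - base, 3) for k, v in results.items() if k != baseline_key}

deltas = deltas_vs_baseline(reported)
for name, delta in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} {delta:+.3f}")
```

Every difference is negative, so none of the regularized runs beat the unregularized baseline in this experiment.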
