## Optimized Model: Rethinking the Pipeline

In the previous notebooks, I implemented multiple models of increasing complexity, ranging from logistic regression to deeper neural networks. While these experiments were valuable for understanding core concepts in machine learning and deep learning, the resulting performance remained relatively low and inconsistent.

After analyzing the results more carefully, it became clear that the limitation was not only related to model architecture or optimization techniques, but also to **how the data was labeled and interpreted**.

### Identified Issue with Data Labeling

In the earlier approach, an image was classified as *damaged* if **any single polygon** within the image was labeled as `D_Building` or `Debris`. This means that even a small, localized damaged region could cause the entire image to be labeled as damaged.  
Such a strategy likely introduces noise and label ambiguity, especially for images that are largely intact but contain minor damage.

This coarse labeling scheme may prevent the model from learning meaningful visual patterns related to *overall structural damage*, which is the core objective of this project.

### Objective of This Notebook

In this notebook, I aim to build the **most optimized model so far** by:
- improving the model architecture,
- applying better initialization, regularization, and optimization techniques,
- **revisiting and refining the data labeling strategy itself**.

By aligning the labels more closely with the true semantic meaning of structural damage, the goal is to provide the model with cleaner supervision and enable more reliable learning.

This step marks a transition from experimenting with models to **systematically improving the full machine learning pipeline**, from data understanding to final evaluation.


## Parsing the XML File and Preparing the Data

In [11]:
from Optimized_Model import *

In [2]:
labels_train = parse_destroyed_with_size_check("../EIDSeg_Dataset/data/train/train.xml", min_coverage=0.3)
labels_test  = parse_destroyed_with_size_check("../EIDSeg_Dataset/data/test/test.xml",  min_coverage=0.3)

X_train_org, ordered_filenames_train = load_and_resize_images("../EIDSeg_Dataset/data/train/images/default", target_size=(64,64))
X_test_org,  ordered_filenames_test  = load_and_resize_images("../EIDSeg_Dataset/data/test/images/default",  target_size=(64,64))

Y_train_org = build_label_array(ordered_filenames_train, labels_train)   # shape (n_train,)
Y_test_org  = build_label_array(ordered_filenames_test,  labels_test)    # shape (n_test,)

# quick sanity checks
print("Train positive ratio:", Y_train_org.mean(), "n_train:", Y_train_org.shape[0])
print("Test  positive ratio:", Y_test_org.mean(),  "n_test:",  Y_test_org.shape[0])

Final X shape: (2612, 64, 64, 3)
Final X shape: (327, 64, 64, 3)
Train positive ratio: 0.49119448698315465 n_train: 2612
Test  positive ratio: 0.5168195718654435 n_test: 327
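
The implementation of `parse_destroyed_with_size_check` is not shown in this notebook, so the following is only a sketch of what a coverage-based rule with `min_coverage=0.3` might look like, next to the earlier "any polygon" rule. The helper names, the `(label, points)` polygon format, and the `"Other"` class are assumptions made for illustration; overlapping polygons would be double-counted in this simple version.

```python
DAMAGE_CLASSES = {"D_Building", "Debris"}

def polygon_area(points):
    """Area of a simple polygon [(x, y), ...] via the shoelace formula."""
    n = len(points)
    total = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def label_any_polygon(polygons):
    """Earlier rule: damaged (1) if any polygon carries a damage class."""
    return int(any(name in DAMAGE_CLASSES for name, _ in polygons))

def label_by_coverage(polygons, width, height, min_coverage=0.3):
    """Assumed refined rule: damaged (1) only if damage polygons cover
    at least `min_coverage` of the image area."""
    damaged_area = sum(polygon_area(pts) for name, pts in polygons
                       if name in DAMAGE_CLASSES)
    return int(damaged_area / (width * height) >= min_coverage)

# Hypothetical 100x100 image: a small 10x10 debris patch plus a large intact region
polygons = [
    ("Debris", [(0, 0), (10, 0), (10, 10), (0, 10)]),
    ("Other", [(20, 20), (90, 20), (90, 90), (20, 90)]),
]
print(label_any_polygon(polygons))            # 1: one damage polygon is enough
print(label_by_coverage(polygons, 100, 100))  # 0: only 1% of the image is damaged
```

Under these assumptions, the same image flips from *damaged* to *not damaged*, which is the kind of label change the refined strategy is meant to produce.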


In [3]:
train_x = X_train_org.reshape(X_train_org.shape[0], -1).T
train_y = Y_train_org.reshape(1,-1)
test_x = X_test_org.reshape(X_test_org.shape[0], -1).T
test_y = Y_test_org.reshape(1,-1)


print(train_x.shape, train_y.shape)
print(test_x.shape, test_y.shape)

(12288, 2612) (1, 2612)
(12288, 327) (1, 327)
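
As a quick check on the flattening step, the sketch below uses a small synthetic array (not the real dataset) to confirm that `reshape(m, -1).T` turns each 64×64×3 image into one column of length 12288, with column `i` holding image `i`.

```python
import numpy as np

m = 4  # hypothetical number of images
images = np.arange(m * 64 * 64 * 3, dtype=np.float32).reshape(m, 64, 64, 3)

# Same operation as in the cell above: one row per image, then transpose
flat = images.reshape(m, -1).T
print(flat.shape)  # (12288, 4)

# Column i is exactly image i, flattened in row-major order
print(np.array_equal(flat[:, 2], images[2].ravel()))  # True
```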


## L-layer Neural Network


In [17]:

def L_layer_model(X, Y, layers_dims, learning_rate = 0.0075, num_iterations = 3000, print_cost=False):
    """
    Implements an L-layer neural network: [LINEAR->RELU]*(L-1)->LINEAR->SIGMOID.
    
    Arguments:
    X -- input data, of shape (n_x, number of examples)
    Y -- true "label" vector (containing 1 if damaged, 0 if not damaged), of shape (1, number of examples)
    layers_dims -- list containing the input size and each layer size, of length (number of layers + 1).
    learning_rate -- learning rate of the gradient descent update rule
    num_iterations -- number of iterations of the optimization loop
    print_cost -- if True, it prints the cost every 100 steps
    
    Returns:
    parameters -- parameters learnt by the model. They can then be used to predict.
    costs -- list of costs recorded every 100 iterations
    """

    np.random.seed(1)
    costs = []                         # keep track of cost
    
    # Parameters initialization.

    parameters = initialize_parameters_deep(layers_dims)
        
    # Loop (gradient descent)
    for i in range(0, num_iterations):

        # Forward propagation: [LINEAR -> RELU]*(L-1) -> LINEAR -> SIGMOID.
        AL, caches = L_model_forward(X, parameters)
                
        # Compute cost.
        cost = compute_cost(AL,Y)
            
        # Backward propagation.
        grads = L_model_backward(AL, Y, caches)
        
        # Update parameters.

        parameters = update_parameters(parameters, grads, learning_rate)
                        
        # Print the cost every 100 iterations and for the last iteration
        if print_cost and (i % 100 == 0 or i == num_iterations - 1):
            print("Cost after iteration {}: {}".format(i, np.squeeze(cost)))
        if i % 100 == 0:
            costs.append(cost)
    
    return parameters, costs
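
The helpers called above (`initialize_parameters_deep`, `L_model_forward`, `compute_cost`) come from `Optimized_Model` and are not shown here. As a rough illustration of the `[LINEAR->RELU]*(L-1)->LINEAR->SIGMOID` structure and the binary cross-entropy cost, here is a minimal standalone sketch. The function names and the He-style weight scaling are assumptions, not the notebook's actual implementation.

```python
import numpy as np

def init_params(layers_dims, seed=1):
    """Random weights with He-style scaling (an assumption), zero biases."""
    rng = np.random.default_rng(seed)
    params = {}
    for l in range(1, len(layers_dims)):
        scale = np.sqrt(2.0 / layers_dims[l - 1])
        params["W" + str(l)] = rng.standard_normal((layers_dims[l], layers_dims[l - 1])) * scale
        params["b" + str(l)] = np.zeros((layers_dims[l], 1))
    return params

def forward(X, params):
    """[LINEAR -> RELU] * (L-1) -> LINEAR -> SIGMOID."""
    A = X
    L = len(params) // 2  # number of weight layers
    for l in range(1, L):
        A = np.maximum(0, params["W" + str(l)] @ A + params["b" + str(l)])
    ZL = params["W" + str(L)] @ A + params["b" + str(L)]
    return 1.0 / (1.0 + np.exp(-ZL))

def cross_entropy(AL, Y, eps=1e-12):
    """Mean binary cross-entropy; clipping avoids log(0)."""
    AL = np.clip(AL, eps, 1 - eps)
    return float(-np.mean(Y * np.log(AL) + (1 - Y) * np.log(1 - AL)))

# Tiny synthetic example: 12 features, 5 examples
X = np.random.default_rng(0).random((12, 5))
Y = np.array([[1, 0, 1, 0, 1]])
params = init_params([12, 4, 3, 1])
AL = forward(X, params)
print(AL.shape, cross_entropy(AL, Y))
```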

## Training!!

In [18]:
layers_dims = [12288, 20, 7, 5, 1] 
parameters, costs = L_layer_model(train_x, train_y, layers_dims,learning_rate = 0.009, num_iterations = 2500, print_cost = True)

Cost after iteration 0: 0.7246250162012067
Cost after iteration 100: 0.6872203231761475
Cost after iteration 200: 0.6775461381963849
Cost after iteration 300: 0.6660236810493122
Cost after iteration 400: 0.675322633125315
Cost after iteration 500: 0.6539324891845053
Cost after iteration 600: 0.6572655245114436
Cost after iteration 700: 0.6285363185367053
Cost after iteration 800: 0.6369991025692343
Cost after iteration 900: 0.6473037728599877
Cost after iteration 1000: 0.5995318317603917
Cost after iteration 1100: 0.61476440088501
Cost after iteration 1200: 0.6015165843621063
Cost after iteration 1300: 0.5705049178075906
Cost after iteration 1400: 0.5649023718327169
Cost after iteration 1500: 0.5608144433536896
Cost after iteration 1600: 0.5818106456401206
Cost after iteration 1700: 0.6282697297242217
Cost after iteration 1800: 0.5505934307406615
Cost after iteration 1900: 0.5623603370943553
Cost after iteration 2000: 0.5638067592656174
Cost after iteration 2100: 0.5239095162844934
Cos

In [19]:
print("Train", end= " ")
pred_train = predict(train_x, train_y, parameters)
print("Test",end= " ")
pred_test = predict(test_x, test_y, parameters)

Train :Accuracy: 0.8016845329249619
Test :Accuracy: 0.5382262996941896
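
To put these two numbers in context, the short calculation below compares the test accuracy with a majority-class baseline, using only the values printed earlier in this notebook (test positive ratio and `n_test`). It is plain arithmetic on those outputs, not a new experiment.

```python
# Values copied from the outputs above
test_positive_ratio = 0.5168195718654435
n_test = 327
train_accuracy = 0.8016845329249619
test_accuracy = 0.5382262996941896

# Accuracy of always predicting the more frequent class on the test set
majority_baseline = max(test_positive_ratio, 1 - test_positive_ratio)
margin = test_accuracy - majority_baseline
gap = train_accuracy - test_accuracy

print(round(majority_baseline, 4))  # 0.5168
print(round(margin, 4))             # 0.0214
print(round(margin * n_test))       # 7 images better than the baseline
print(round(gap, 4))                # 0.2635 train/test gap
```

The model beats the constant predictor by only about seven test images, while the train/test gap is roughly 26 percentage points, which points to overfitting rather than meaningful generalization.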


## Let's try Mini-Batch with Adam

In [9]:
import math
def optimized_model(X, Y, layers_dims, optimizer, learning_rate = 0.0007, mini_batch_size = 64, beta = 0.9,
          beta1 = 0.9, beta2 = 0.999,  epsilon = 1e-8, num_epochs = 5000, print_cost = True):
    """
    L-layer neural network model which can be run in different optimizer modes.
    
    Arguments:
    X -- input data, of shape (n_x, number of examples)
    Y -- true "label" vector (1 for damaged / 0 for not damaged), of shape (1, number of examples)
    optimizer -- the optimizer to be passed, gradient descent, momentum or adam
    layers_dims -- python list, containing the size of each layer
    learning_rate -- the learning rate, scalar.
    mini_batch_size -- the size of a mini batch
    beta -- Momentum hyperparameter
    beta1 -- Exponential decay hyperparameter for the past gradients estimates 
    beta2 -- Exponential decay hyperparameter for the past squared gradients estimates 
    epsilon -- hyperparameter preventing division by zero in Adam updates
    num_epochs -- number of epochs
    print_cost -- True to print the cost every 1000 epochs

    Returns:
    parameters -- python dictionary containing your updated parameters 
    """

    L = len(layers_dims)             # number of layers in the neural networks
    costs = []                       # to keep track of the cost
    t = 0                            # initializing the counter required for Adam update
    seed = 10                        # base seed; incremented every epoch so minibatches are reshuffled reproducibly
    m = X.shape[1]                   # number of training examples
    
    # Initialize parameters
    parameters = initialize_parameters_deep(layers_dims)

    # Initialize the optimizer
    if optimizer == "gd":
        pass # no initialization required for gradient descent
    elif optimizer == "momentum":
        v = initialize_velocity(parameters)
    elif optimizer == "adam":
        v, s = initialize_adam(parameters)
    
    # Optimization loop
    for i in range(num_epochs):
        
        # Define the random minibatches. We increment the seed to reshuffle differently the dataset after each epoch
        seed = seed + 1
        minibatches = random_mini_batches(X, Y, mini_batch_size, seed)
        cost_total = 0
        
        for minibatch in minibatches:

            # Select a minibatch
            (minibatch_X, minibatch_Y) = minibatch

            # Forward propagation
            AL, caches = L_model_forward(minibatch_X, parameters)

            # Compute cost and add to the cost total
            cost_total += compute_cost(AL, minibatch_Y)

            # Backward propagation
            grads = L_model_backward(AL, minibatch_Y, caches)

            # Update parameters
            if optimizer == "gd":
                parameters = update_parameters(parameters, grads, learning_rate)
            elif optimizer == "momentum":
                parameters, v = update_parameters_with_momentum(parameters, grads, v, beta, learning_rate)
            elif optimizer == "adam":
                t = t + 1 # Adam counter
                parameters, v, s, _, _ = update_parameters_with_adam(parameters, grads, v, s,
                                                               t, learning_rate, beta1, beta2,  epsilon)
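        # compute_cost appears to return a per-batch mean (the full-batch run above starts near 0.72),
        # so average over the number of minibatches rather than over m
        cost_avg = cost_total / len(minibatches)
        
        # Print the cost every 1000 epochs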
        if print_cost and i % 1000 == 0:
            print ("Cost after epoch %i: %f" %(i, cost_avg))
        if print_cost and i % 100 == 0:
            costs.append(cost_avg)
                
    # plot the cost
    plt.plot(costs)
    plt.ylabel('cost')
    plt.xlabel('epochs (per 100)')
    plt.title("Learning rate = " + str(learning_rate))
    plt.show()

    return parameters
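
`update_parameters_with_adam` lives in `Optimized_Model` and is not shown here. For reference, the standard Adam update with bias correction can be sketched for a single parameter array as follows. The function name `adam_step` is hypothetical, and the defaults mirror those in the signature of `optimized_model`.

```python
import numpy as np

def adam_step(theta, grad, v, s, t, learning_rate=0.0007,
              beta1=0.9, beta2=0.999, epsilon=1e-8):
    """One Adam update for a single parameter array (illustrative sketch)."""
    # Exponentially weighted averages of the gradient and the squared gradient
    v = beta1 * v + (1 - beta1) * grad
    s = beta2 * s + (1 - beta2) * grad ** 2
    # Bias correction compensates for the zero initialization of v and s
    v_hat = v / (1 - beta1 ** t)
    s_hat = s / (1 - beta2 ** t)
    theta = theta - learning_rate * v_hat / (np.sqrt(s_hat) + epsilon)
    return theta, v, s

theta = np.array([1.0, -2.0])
grad = np.array([0.5, -4.0])
v = np.zeros_like(theta)
s = np.zeros_like(theta)

theta_new, v, s = adam_step(theta, grad, v, s, t=1)
print(theta - theta_new)  # roughly [0.0007, -0.0007]
```

On the first step the update has magnitude close to `learning_rate` in every coordinate regardless of the gradient's scale, which is one reason Adam tends to be less sensitive to gradient magnitude than plain gradient descent.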

### Let's train!

In [12]:
layers_dims = [12288, 20, 7, 5, 1] 
parameters = optimized_model(train_x, train_y, layers_dims, optimizer = "adam")
# Predict on the training set
predictions = predict(train_x, train_y, parameters)

# A 2-D decision-boundary plot does not apply here: each input is a 12288-dimensional flattened image

NameError: name 'math' is not defined
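
The traceback is truncated, so the failing line is not visible. One plausible cause, stated as an assumption: a helper such as `random_mini_batches` inside `Optimized_Model` calls `math.floor` (or similar) without that module importing `math`. Python resolves a function's global names in the module where the function is defined, so `import math` in the notebook cell would not help in that case; the import has to sit in the helper's own module. A minimal self-contained sketch of such a helper (hypothetical name and implementation) looks like this:

```python
import math  # must be imported in the same module that defines the helper
import numpy as np

def random_mini_batches_sketch(X, Y, mini_batch_size=64, seed=0):
    """Shuffle the columns of (X, Y) together and split them into minibatches."""
    rng = np.random.default_rng(seed)
    m = X.shape[1]
    perm = rng.permutation(m)
    X_shuffled, Y_shuffled = X[:, perm], Y[:, perm]

    # The last batch may be smaller when m is not a multiple of the batch size
    n_batches = math.ceil(m / mini_batch_size)
    batches = []
    for k in range(n_batches):
        start, end = k * mini_batch_size, (k + 1) * mini_batch_size
        batches.append((X_shuffled[:, start:end], Y_shuffled[:, start:end]))
    return batches

# Same number of examples as the training set, with a small feature size
X = np.zeros((3, 2612))
Y = np.zeros((1, 2612))
batches = random_mini_batches_sketch(X, Y, 64, seed=11)
print(len(batches), batches[-1][0].shape)  # 41 (3, 52)
```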

In [None]:
predictions = predict(test_x, test_y, parameters)
