## Optimized Model: Rethinking the Pipeline

In the previous notebooks, I implemented multiple models of increasing complexity, ranging from logistic regression to deeper neural networks. While these experiments were valuable for understanding core concepts in machine learning and deep learning, the resulting performance remained relatively low and inconsistent.

After analyzing the results more carefully, it became clear that the limitation was not only related to model architecture or optimization techniques, but also to **how the data was labeled and interpreted**.

### Identified Issue with Data Labeling

In the earlier approach, an image was classified as *damaged* if **any single polygon** within the image was labeled as `D_Building` or `Debris`. This means that even a small, localized damaged region could cause the entire image to be labeled as damaged.  
Such a strategy likely introduces noise and label ambiguity, especially for images that are largely intact but contain minor damage.

This coarse labeling scheme may prevent the model from learning meaningful visual patterns related to *overall structural damage*, which is the core objective of this project.

### Objective of This Notebook

In this notebook, I aim to build the **most optimized model so far**, not only by:
- improving model architecture,
- applying better initialization, regularization, and optimization techniques,

but also by **revisiting and refining the data labeling strategy itself**.

By aligning the labels more closely with the true semantic meaning of structural damage, the goal is to provide the model with cleaner supervision and enable more reliable learning.

This step marks a transition from experimenting with models to **systematically improving the full machine learning pipeline**, from data understanding to final evaluation.


## Parsing the XML file
The code computes the area of each polygon labeled `D_Building` or `Debris` and labels the image as damaged only if at least one such polygon covers at least `min_coverage` of the image area.
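
As a quick sanity check of the coverage rule, here is a minimal sketch that applies the shoelace formula to a hypothetical 40×30 rectangle inside a hypothetical 100×100 image. The helper name `shoelace_area` and all the numbers are illustrative assumptions, not values taken from the dataset.

```python
import numpy as np

def shoelace_area(coords):
    """Area of a simple polygon given as a list of (x, y) vertices."""
    x = np.array([p[0] for p in coords], dtype=float)
    y = np.array([p[1] for p in coords], dtype=float)
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

# Hypothetical polygon: a 40 x 30 rectangle in a 100 x 100 image.
rect = [(10, 10), (50, 10), (50, 40), (10, 40)]
image_area = 100 * 100

area = shoelace_area(rect)
coverage = area / image_area
print(area, coverage)      # 1200.0 0.12
print(coverage >= 0.05)    # True: counts as damaged at a 5% threshold
print(coverage >= 0.30)    # False: ignored at a 30% threshold
```

The same polygon flips the image label depending on the threshold, which is why `min_coverage` has such a direct effect on the positive ratio.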

In [1]:
# Put at top of the cell
import os
import xml.etree.ElementTree as ET
from typing import Dict, Tuple, List
import numpy as np
import cv2

def polygon_area(coords: List[Tuple[float, float]]) -> float:
    """Compute polygon area using the shoelace formula.
    coords: list of (x, y) tuples in vertex order (clockwise or ccw).
    """
    if len(coords) < 3:
        return 0.0
    x = np.array([p[0] for p in coords], dtype=float)
    y = np.array([p[1] for p in coords], dtype=float)
    # use roll(-1) to get x_i * y_{i+1}
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))


def parse_destroyed_with_size_check(path: str, min_coverage: float = 0.05) -> Dict[str, int]:
    """
    Return {basename(filename): 0/1} where 1 means the image contains at least
    one polygon labeled as destroyed and that polygon covers >= min_coverage of image area.

    - path: xml annotation file path
    - min_coverage: fraction of image area (0..1), e.g. 0.05 -> 5%
    """
    DESTROYED_LABELS = {"D_Building", "Debris"}
    result: Dict[str, int] = {}

    try:
        tree = ET.parse(path)
    except ET.ParseError as e:
        raise RuntimeError(f"Failed to parse XML {path}: {e}")
    except FileNotFoundError:
        raise RuntimeError(f"XML file not found: {path}")

    root = tree.getroot()

    for image in root.findall(".//image"):
        filename = image.get("name")
        if not filename:
            continue
        # normalize to basename so it matches files in your images folder
        filename_key = os.path.basename(filename)

        # get image size (some annotations may store width/height as attributes)
        try:
            width = int(image.get("width", 0))
            height = int(image.get("height", 0))
        except ValueError:
            width = 0
            height = 0
        image_area = float(width * height)

        # If the size is missing or invalid, coverage cannot be computed.
        if image_area == 0:
            # Fallback: label the image as not destroyed (skipping it is an alternative).
            result[filename_key] = 0
            continue

        is_destroyed = False

        for polygon in image.findall("polygon"):
            label = polygon.get("label")
            points = polygon.get("points")

            if not label or not points:
                continue
            if label not in DESTROYED_LABELS:
                continue

            # Flexible parsing of points:
            # common formats: "x1,y1;x2,y2;..." or "x1,y1 x2,y2 ..." or "x1,y1;x2,y2;"
            pts_str = points.strip()
            if ";" in pts_str:
                raw_pts = pts_str.split(";")
            else:
                raw_pts = pts_str.split()  # split on whitespace

            coords = []
            for p in raw_pts:
                p = p.strip()
                if not p:
                    continue
                # support "x,y" or "x,y," etc.
                if "," not in p:
                    # unexpected format
                    coords = []
                    break
                try:
                    x_str, y_str = p.split(",")[:2]
                    coords.append((float(x_str), float(y_str)))
                except Exception:
                    coords = []
                    break

            if not coords:
                continue

            poly_area = polygon_area(coords)
            coverage = poly_area / image_area

            if coverage >= min_coverage:
                is_destroyed = True
                break

        result[filename_key] = int(is_destroyed)

    return result
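
To make the expected annotation layout concrete, the sketch below parses a small inline XML string with the same attributes the parser reads (`name`, `width`, `height` on `<image>`; `label`, `points` on `<polygon>`). The file name, the `Road` label, and the coordinates are hypothetical; only the attribute names come from the code above.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation snippet in the layout the parser expects.
xml_text = """
<annotations>
  <image name="default/img_001.jpg" width="100" height="100">
    <polygon label="D_Building" points="0,0;60,0;60,50;0,50"/>
    <polygon label="Road" points="0,0;100,0;100,100;0,100"/>
  </image>
</annotations>
"""

root = ET.fromstring(xml_text)
image = root.find(".//image")
image_area = int(image.get("width")) * int(image.get("height"))

coverages = {}
for polygon in image.findall("polygon"):
    pts = [tuple(map(float, p.split(","))) for p in polygon.get("points").split(";") if p]
    # Shoelace formula over consecutive vertex pairs (wrapping around to the first vertex).
    s = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]))
    coverages[polygon.get("label")] = 0.5 * abs(s) / image_area

print(coverages)
```

In this toy case only the `D_Building` polygon would be considered, since `Road` is not in `DESTROYED_LABELS`, and its coverage of 0.3 sits right at the threshold used later in this notebook.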

## Preparing the data 

In [2]:
def load_and_resize_images(images_folder: str, target_size: Tuple[int, int] = (64, 64)):
    """
    Loads image files, resizes to target_size (width, height), normalizes pixels to [0,1].
    Returns:
      X: numpy array of shape (n_images, height, width, 3)
      ordered_filenames: list of basenames corresponding to rows in X
    """
    X = []
    ordered_filenames = []

    for filename in sorted(os.listdir(images_folder)):
        if filename.lower().endswith((".jpg", ".jpeg", ".png")):
            img_path = os.path.join(images_folder, filename)
            img = cv2.imread(img_path)
            if img is None:
                print(f"Warning: Cannot read {filename}, skipping.")
                continue
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            # cv2.resize takes the target size as (width, height)
            img_resized = cv2.resize(img, target_size)
            img_resized = img_resized.astype(np.float32) / 255.0

            X.append(img_resized)
            ordered_filenames.append(filename)

    X = np.array(X)
    print("Final X shape:", X.shape)
    return X, ordered_filenames

In [3]:

def build_label_array(ordered_filenames: List[str], labels_dict: Dict[str, int], default_value: int = 0) -> np.ndarray:
    """
    Build label array matching ordered_filenames. Returns shape (n_samples,) of dtype int.
    """
    Y = []
    for fname in ordered_filenames:
        # ensures we compare basenames
        key = os.path.basename(fname)
        if key in labels_dict:
            Y.append(labels_dict[key])
        else:
            print(f"Warning: No label found for {fname}, assigning {default_value}")
            Y.append(default_value)
    return np.array(Y, dtype=int)


In [4]:
labels_train = parse_destroyed_with_size_check("../EIDSeg_Dataset/data/train/train.xml", min_coverage=0.3)
labels_test  = parse_destroyed_with_size_check("../EIDSeg_Dataset/data/test/test.xml",  min_coverage=0.3)

X_train_org, ordered_filenames_train = load_and_resize_images("../EIDSeg_Dataset/data/train/images/default", target_size=(64,64))
X_test_org,  ordered_filenames_test  = load_and_resize_images("../EIDSeg_Dataset/data/test/images/default",  target_size=(64,64))

Y_train_org = build_label_array(ordered_filenames_train, labels_train)   # shape (n_train,)
Y_test_org  = build_label_array(ordered_filenames_test,  labels_test)    # shape (n_test,)

# quick sanity checks
print("Train positive ratio:", Y_train_org.mean(), "n_train:", Y_train_org.shape[0])
print("Test  positive ratio:", Y_test_org.mean(),  "n_test:",  Y_test_org.shape[0])

Final X shape: (2612, 64, 64, 3)
Final X shape: (327, 64, 64, 3)
Train positive ratio: 0.49119448698315465 n_train: 2612
Test  positive ratio: 0.5168195718654435 n_test: 327


In [5]:
train_x = X_train_org.reshape(X_train_org.shape[0], -1).T
train_y = Y_train_org.reshape(1,-1)
test_x = X_test_org.reshape(X_test_org.shape[0], -1).T
test_y = Y_test_org.reshape(1,-1)


print(train_x.shape, train_y.shape)
print(test_x.shape, test_y.shape)

(12288, 2612) (1, 2612)
(12288, 327) (1, 327)
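
The `reshape(...).T` step puts one flattened image in each column. A minimal sketch on a hypothetical toy batch (2 images of 2×2 pixels) shows the resulting layout and a common pitfall:

```python
import numpy as np

# Toy batch: 2 "images" of 2x2 pixels with 3 channels.
X = np.arange(2 * 2 * 2 * 3, dtype=np.float32).reshape(2, 2, 2, 3)

X_flat = X.reshape(X.shape[0], -1).T   # shape (12, 2): features x examples
print(X_flat.shape)

# Column j holds image j, flattened in row-major order.
assert np.array_equal(X_flat[:, 1], X[1].ravel())

# Pitfall: reshaping straight to (features, examples) mixes pixels across images.
X_wrong = X.reshape(-1, X.shape[0])
print(np.array_equal(X_wrong, X_flat))  # False
```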


## DNN structure

In [6]:
def sigmoid(Z):
    """
    Implements the sigmoid activation in numpy
    
    Arguments:
    Z -- numpy array of any shape
    
    Returns:
    A -- output of sigmoid(z), same shape as Z
    cache -- returns Z as well, useful during backpropagation
    """
    
    A = 1/(1+np.exp(-Z))
    cache = Z
    
    return A, cache

def relu(Z):
    """
    Implement the RELU function.

    Arguments:
    Z -- Output of the linear layer, of any shape

    Returns:
    A -- Post-activation parameter, of the same shape as Z
    cache -- Z, stored for computing the backward pass efficiently
    """
    
    A = np.maximum(0,Z)
    
    assert(A.shape == Z.shape)
    
    cache = Z 
    return A, cache


def relu_backward(dA, cache):
    """
    Implement the backward propagation for a single RELU unit.

    Arguments:
    dA -- post-activation gradient, of any shape
    cache -- 'Z' where we store for computing backward propagation efficiently

    Returns:
    dZ -- Gradient of the cost with respect to Z
    """
    
    Z = cache
    dZ = np.array(dA, copy=True) # just converting dz to a correct object.
    
    # When z <= 0, you should set dz to 0 as well. 
    dZ[Z <= 0] = 0
    
    assert (dZ.shape == Z.shape)
    
    return dZ

def sigmoid_backward(dA, cache):
    """
    Implement the backward propagation for a single SIGMOID unit.

    Arguments:
    dA -- post-activation gradient, of any shape
    cache -- 'Z' where we store for computing backward propagation efficiently

    Returns:
    dZ -- Gradient of the cost with respect to Z
    """
    
    Z = cache
    
    s = 1/(1+np.exp(-Z))
    dZ = dA * s * (1-s)
    
    assert (dZ.shape == Z.shape)
    
    return dZ
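
One way to gain confidence in `sigmoid_backward` is to compare the analytic derivative `s * (1 - s)` with a central finite difference. This is a small self-contained sketch; `sigmoid_value`, the test points, and `eps` are my own illustrative choices.

```python
import numpy as np

def sigmoid_value(Z):
    """Plain sigmoid, without the cache returned by the notebook version."""
    return 1.0 / (1.0 + np.exp(-Z))

Z = np.array([[-2.0, -0.5, 0.0, 0.5, 2.0]])
eps = 1e-6

# Central difference approximation of d(sigmoid)/dZ.
numeric = (sigmoid_value(Z + eps) - sigmoid_value(Z - eps)) / (2 * eps)

# Analytic derivative used in sigmoid_backward (with dA = 1).
s = sigmoid_value(Z)
analytic = s * (1 - s)

print(np.max(np.abs(numeric - analytic)))
```

The printed maximum difference should be tiny compared with the derivative itself; a large value would point to a mistake in the formula.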



In [7]:

def initialize_parameters_deep(layer_dims):
    """
    Arguments:
    layer_dims -- python array (list) containing the dimensions of each layer in our network
    
    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                    Wl -- weight matrix of shape (layer_dims[l], layer_dims[l-1])
                    bl -- bias vector of shape (layer_dims[l], 1)
    """
    
    np.random.seed(1)
    parameters = {}
    L = len(layer_dims)            # number of entries in layer_dims (input layer included)

    for l in range(1, L):
        parameters['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l-1]) / np.sqrt(layer_dims[l-1]) #*0.01
        parameters['b' + str(l)] = np.zeros((layer_dims[l], 1))
        
        assert(parameters['W' + str(l)].shape == (layer_dims[l], layer_dims[l-1]))
        assert(parameters['b' + str(l)].shape == (layer_dims[l], 1))

        
    return parameters
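
To see why the weights are divided by `np.sqrt(layer_dims[l-1])`, the sketch below compares that scaling with a fixed `0.01` scale (the commented-out alternative) for a hidden layer with 20 inputs and 7 units, matching the second layer of the network trained later. The random inputs are a hypothetical stand-in for real activations.

```python
import numpy as np

rng = np.random.default_rng(0)
n_prev, n_curr, m = 20, 7, 10_000
A_prev = rng.standard_normal((n_prev, m))    # hypothetical unit-variance activations

W_scaled = rng.standard_normal((n_curr, n_prev)) / np.sqrt(n_prev)
W_small = rng.standard_normal((n_curr, n_prev)) * 0.01

std_scaled = (W_scaled @ A_prev).std()
std_small = (W_small @ A_prev).std()
print("std of Z with 1/sqrt(n_prev):", std_scaled)
print("std of Z with 0.01 scale:    ", std_small)
```

Under these assumptions the scaled weights keep the pre-activations at roughly unit scale, while the fixed `0.01` scale shrinks them, which tends to shrink the gradients in deeper layers as well.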


In [8]:
def linear_forward(A, W, b):
    """
    Implement the linear part of a layer's forward propagation.

    Arguments:
    A -- activations from previous layer (or input data): (size of previous layer, number of examples)
    W -- weights matrix: numpy array of shape (size of current layer, size of previous layer)
    b -- bias vector, numpy array of shape (size of the current layer, 1)

    Returns:
    Z -- the input of the activation function, also called pre-activation parameter 
    cache -- a tuple (A, W, b), stored for computing the backward pass efficiently
    """
    
    Z = W.dot(A) + b
    
    assert(Z.shape == (W.shape[0], A.shape[1]))
    cache = (A, W, b)
    
    return Z, cache

In [9]:

def linear_activation_forward(A_prev, W, b, activation):
    """
    Implement the forward propagation for the LINEAR->ACTIVATION layer

    Arguments:
    A_prev -- activations from previous layer (or input data): (size of previous layer, number of examples)
    W -- weights matrix: numpy array of shape (size of current layer, size of previous layer)
    b -- bias vector, numpy array of shape (size of the current layer, 1)
    activation -- the activation to be used in this layer, stored as a text string: "sigmoid" or "relu"

    Returns:
    A -- the output of the activation function, also called the post-activation value 
    cache -- a tuple (linear_cache, activation_cache),
             stored for computing the backward pass efficiently
    """
    
    if activation == "sigmoid":
        # Inputs: "A_prev, W, b". Outputs: "A, activation_cache".
        Z, linear_cache = linear_forward(A_prev, W, b)
        A, activation_cache = sigmoid(Z)
    
    elif activation == "relu":
        # Inputs: "A_prev, W, b". Outputs: "A, activation_cache".
        Z, linear_cache = linear_forward(A_prev, W, b)
        A, activation_cache = relu(Z)
        
    else:
        # Fail fast: without this, the assert below would raise a confusing NameError.
        raise ValueError(f'Unsupported activation "{activation}"; expected "sigmoid" or "relu".')
    
    assert (A.shape == (W.shape[0], A_prev.shape[1]))
    cache = (linear_cache, activation_cache)

    return A, cache

In [10]:
def L_model_forward(X, parameters):
    """
    Implement forward propagation for the [LINEAR->RELU]*(L-1)->LINEAR->SIGMOID computation
    
    Arguments:
    X -- data, numpy array of shape (input size, number of examples)
    parameters -- output of initialize_parameters_deep()
    
    Returns:
    AL -- last post-activation value
    caches -- list of caches containing:
                every cache of linear_relu_forward() (there are L-1 of them, indexed from 0 to L-2)
                the cache of linear_sigmoid_forward() (there is one, indexed L-1)
    """

    caches = []
    A = X
    L = len(parameters) // 2                  # number of layers in the neural network
    
    # Implement [LINEAR -> RELU]*(L-1). Add "cache" to the "caches" list.
    for l in range(1, L):
        A_prev = A 
        A, cache = linear_activation_forward(A_prev, parameters['W' + str(l)], parameters['b' + str(l)], activation = "relu")
        caches.append(cache)
    
    # Implement LINEAR -> SIGMOID. Add "cache" to the "caches" list.
    AL, cache = linear_activation_forward(A, parameters['W' + str(L)], parameters['b' + str(L)], activation = "sigmoid")
    caches.append(cache)
    
    assert(AL.shape == (1,X.shape[1]))
            
    return AL, caches

In [11]:
def compute_cost(AL, Y):
    """
    Compute the binary cross-entropy cost.

    Arguments:
    AL -- probability vector corresponding to your label predictions, shape (1, number of examples)
    Y -- true "label" vector (containing 0 if not damaged, 1 if damaged), shape (1, number of examples)

    Returns:
    cost -- cross-entropy cost
    """
    
    m = Y.shape[1]

    # Compute loss from aL and y.
    cost = (1./m) * (-np.dot(Y,np.log(AL).T) - np.dot(1-Y, np.log(1-AL).T))
    
    cost = np.squeeze(cost)      # To make sure your cost's shape is what we expect (e.g. this turns [[17]] into 17).
    assert(cost.shape == ())
    
    return cost
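
One caveat with this cost: if any entry of `AL` reaches exactly 0 or 1, `np.log` yields `-inf` and the cost can become `nan`. A possible guard, shown here as a sketch and not as what this notebook actually runs, is to clip `AL` first. The function name `compute_cost_clipped` and the `eps` value are assumptions of mine.

```python
import numpy as np

def compute_cost_clipped(AL, Y, eps=1e-12):
    """Cross-entropy cost with AL clipped away from 0 and 1 (illustrative variant)."""
    m = Y.shape[1]
    AL = np.clip(AL, eps, 1.0 - eps)
    cost = -(Y * np.log(AL) + (1 - Y) * np.log(1 - AL)).sum() / m
    return float(cost)

Y = np.array([[1, 0, 1]])
AL = np.array([[0.9, 0.2, 1.0]])   # last prediction saturated at exactly 1.0
cost_value = compute_cost_clipped(AL, Y)
print(cost_value)
```

With clipping, the saturated prediction contributes a negligible amount instead of turning the whole cost into `nan`.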

In [12]:

def linear_backward(dZ, cache):
    """
    Implement the linear portion of backward propagation for a single layer (layer l)

    Arguments:
    dZ -- Gradient of the cost with respect to the linear output (of current layer l)
    cache -- tuple of values (A_prev, W, b) coming from the forward propagation in the current layer

    Returns:
    dA_prev -- Gradient of the cost with respect to the activation (of the previous layer l-1), same shape as A_prev
    dW -- Gradient of the cost with respect to W (current layer l), same shape as W
    db -- Gradient of the cost with respect to b (current layer l), same shape as b
    """
    A_prev, W, b = cache
    m = A_prev.shape[1]

    dW = 1./m * np.dot(dZ,A_prev.T)
    db = 1./m * np.sum(dZ, axis = 1, keepdims = True)
    dA_prev = np.dot(W.T,dZ)
    
    assert (dA_prev.shape == A_prev.shape)
    assert (dW.shape == W.shape)
    assert (db.shape == b.shape)
    
    return dA_prev, dW, db
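
For reference, the three assignments in `linear_backward` correspond to the following expressions, written with the same symbols as the code ($m$ is the number of examples, and the sum runs over the columns of $dZ^{[l]}$):

$$
\begin{aligned}
dW^{[l]} &= \frac{1}{m}\, dZ^{[l]} \, A^{[l-1]\,T} \\
db^{[l]} &= \frac{1}{m} \sum_{i=1}^{m} dZ^{[l](i)} \\
dA^{[l-1]} &= W^{[l]\,T} \, dZ^{[l]}
\end{aligned}
$$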

In [13]:
def linear_activation_backward(dA, cache, activation):
    """
    Implement the backward propagation for the LINEAR->ACTIVATION layer.
    
    Arguments:
    dA -- post-activation gradient for current layer l 
    cache -- tuple of values (linear_cache, activation_cache) we store for computing backward propagation efficiently
    activation -- the activation to be used in this layer, stored as a text string: "sigmoid" or "relu"
    
    Returns:
    dA_prev -- Gradient of the cost with respect to the activation (of the previous layer l-1), same shape as A_prev
    dW -- Gradient of the cost with respect to W (current layer l), same shape as W
    db -- Gradient of the cost with respect to b (current layer l), same shape as b
    """
    linear_cache, activation_cache = cache
    
    if activation == "relu":
        dZ = relu_backward(dA, activation_cache)
        dA_prev, dW, db = linear_backward(dZ, linear_cache)
        
    elif activation == "sigmoid":
        dZ = sigmoid_backward(dA, activation_cache)
        dA_prev, dW, db = linear_backward(dZ, linear_cache)
        
    else:
        # Fail fast: without this, the return below would raise a confusing NameError.
        raise ValueError(f'Unsupported activation "{activation}"; expected "sigmoid" or "relu".')
    
    return dA_prev, dW, db

In [14]:

def L_model_backward(AL, Y, caches):
    """
    Implement the backward propagation for the [LINEAR->RELU] * (L-1) -> LINEAR -> SIGMOID group
    
    Arguments:
    AL -- probability vector, output of the forward propagation (L_model_forward())
    Y -- true "label" vector (containing 0 if not damaged, 1 if damaged)
    caches -- list of caches containing:
                every cache of linear_activation_forward() with "relu" (there are (L-1) of them, indexed from 0 to L-2)
                the cache of linear_activation_forward() with "sigmoid" (there is one, at index L-1)
    
    Returns:
    grads -- A dictionary with the gradients
             grads["dA" + str(l)] = ... 
             grads["dW" + str(l)] = ...
             grads["db" + str(l)] = ... 
    """
    grads = {}
    L = len(caches) # the number of layers
    m = AL.shape[1]
    Y = Y.reshape(AL.shape) # after this line, Y is the same shape as AL
    
    # Initializing the backpropagation
    dAL = - (np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))
    
    # Lth layer (SIGMOID -> LINEAR) gradients. Inputs: dAL, caches[L-1]. Outputs: grads["dA" + str(L-1)], grads["dW" + str(L)], grads["db" + str(L)]
    current_cache = caches[L-1]
    grads["dA" + str(L-1)], grads["dW" + str(L)], grads["db" + str(L)] = linear_activation_backward(dAL, current_cache, activation = "sigmoid")
    
    for l in reversed(range(L-1)):
        # lth layer: (RELU -> LINEAR) gradients.
        current_cache = caches[l]
        dA_prev_temp, dW_temp, db_temp = linear_activation_backward(grads["dA" + str(l + 1)], current_cache, activation = "relu")
        grads["dA" + str(l)] = dA_prev_temp
        grads["dW" + str(l + 1)] = dW_temp
        grads["db" + str(l + 1)] = db_temp

    return grads

In [15]:
def update_parameters(parameters, grads, learning_rate):
    """
    Update parameters using gradient descent
    
    Arguments:
    parameters -- python dictionary containing your parameters 
    grads -- python dictionary containing your gradients, output of L_model_backward
    
    Returns:
    parameters -- python dictionary containing your updated parameters 
                  parameters["W" + str(l)] = ... 
                  parameters["b" + str(l)] = ...
    """
    
    L = len(parameters) // 2 # number of layers in the neural network

    # Update rule for each parameter. Use a for loop.
    for l in range(L):
        parameters["W" + str(l+1)] = parameters["W" + str(l+1)] - learning_rate * grads["dW" + str(l+1)]
        parameters["b" + str(l+1)] = parameters["b" + str(l+1)] - learning_rate * grads["db" + str(l+1)]
        
    return parameters

In [16]:

def predict(X, y, parameters):
    """
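    Predict labels with a trained L-layer neural network and print the accuracy.
    
    Arguments:
    X -- data set of examples to label, shape (n_x, number of examples)
    y -- true labels, used only to compute the printed accuracy
    parameters -- parameters of the trained model
    
    Returns:
    p -- 0/1 predictions for the given dataset X, shape (1, number of examples)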
    """
    
    m = X.shape[1]
    n = len(parameters) // 2 # number of layers in the neural network
    p = np.zeros((1,m))
    
    # Forward propagation
    probas, caches = L_model_forward(X, parameters)

    
    # convert probas to 0/1 predictions
    for i in range(0, probas.shape[1]):
        if probas[0,i] > 0.5:
            p[0,i] = 1
        else:
            p[0,i] = 0
    
    #print results
    #print ("predictions: " + str(p))
    #print ("true labels: " + str(y))
    print("Accuracy: "  + str(np.sum((p == y)/m)))
        
    return p
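
The thresholding loop in `predict` can also be written in vectorized form. A minimal sketch on hypothetical probabilities and labels, using the same `> 0.5` rule:

```python
import numpy as np

probas = np.array([[0.1, 0.5, 0.51, 0.9]])   # hypothetical sigmoid outputs
y = np.array([0, 1, 1, 1])                   # hypothetical labels, shape (m,)

p = (probas > 0.5).astype(float)             # shape (1, m), same rule as the loop
accuracy = np.mean(p == y.reshape(1, -1))
print(p, accuracy)
```

Note that a probability of exactly 0.5 maps to class 0 under this rule, just as in the loop version.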

## L-layer Neural Network


In [17]:

def L_layer_model(X, Y, layers_dims, learning_rate = 0.0075, num_iterations = 3000, print_cost=False):
    """
    Implements a L-layer neural network: [LINEAR->RELU]*(L-1)->LINEAR->SIGMOID.
    
    Arguments:
    X -- input data, of shape (n_x, number of examples)
    Y -- true "label" vector (containing 1 if damaged, 0 if not damaged), of shape (1, number of examples)
    layers_dims -- list containing the input size and each layer size, of length (number of layers + 1).
    learning_rate -- learning rate of the gradient descent update rule
    num_iterations -- number of iterations of the optimization loop
    print_cost -- if True, prints the cost every 100 iterations and after the last one
    
    Returns:
    parameters -- parameters learnt by the model. They can then be used to predict.
    costs -- list of costs recorded every 100 iterations
    """

    np.random.seed(1)
    costs = []                         # keep track of cost
    
    # Parameters initialization.

    parameters = initialize_parameters_deep(layers_dims)
        
    # Loop (gradient descent)
    for i in range(0, num_iterations):

        # Forward propagation: [LINEAR -> RELU]*(L-1) -> LINEAR -> SIGMOID.
        AL, caches = L_model_forward(X, parameters)
                
        # Compute cost.
        cost = compute_cost(AL,Y)
            
        # Backward propagation.
        grads = L_model_backward(AL, Y, caches)
        
        # Update parameters.

        parameters = update_parameters(parameters, grads, learning_rate)
                        
        # Print the cost every 100 iterations and for the last iteration
        if print_cost and (i % 100 == 0 or i == num_iterations - 1):
            print("Cost after iteration {}: {}".format(i, np.squeeze(cost)))
        if i % 100 == 0:
            costs.append(cost)
    
    return parameters, costs

## Training!!

In [18]:
layers_dims = [12288, 20, 7, 5, 1] 
parameters, costs = L_layer_model(train_x, train_y, layers_dims,learning_rate = 0.009, num_iterations = 2500, print_cost = True)

Cost after iteration 0: 0.7246250162012067
Cost after iteration 100: 0.6872203231761475
Cost after iteration 200: 0.6775461381963849
Cost after iteration 300: 0.6660236810493122
Cost after iteration 400: 0.675322633125315
Cost after iteration 500: 0.6539324891845053
Cost after iteration 600: 0.6572655245114436
Cost after iteration 700: 0.6285363185367053
Cost after iteration 800: 0.6369991025692343
Cost after iteration 900: 0.6473037728599877
Cost after iteration 1000: 0.5995318317603917
Cost after iteration 1100: 0.61476440088501
Cost after iteration 1200: 0.6015165843621063
Cost after iteration 1300: 0.5705049178075906
Cost after iteration 1400: 0.5649023718327169
Cost after iteration 1500: 0.5608144433536896
Cost after iteration 1600: 0.5818106456401206
Cost after iteration 1700: 0.6282697297242217
Cost after iteration 1800: 0.5505934307406615
Cost after iteration 1900: 0.5623603370943553
Cost after iteration 2000: 0.5638067592656174
Cost after iteration 2100: 0.5239095162844934
Cos

In [19]:
print("Train ", end= ":")
pred_train = predict(train_x, Y_train_org, parameters)
print("Test ",end= ":")
pred_test = predict(test_x, Y_test_org, parameters)

Train :Accuracy: 0.8016845329249619
Test :Accuracy: 0.5382262996941896
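
A rough reading of these two numbers, using only values printed earlier in this notebook, is to compare the test accuracy with the accuracy of always predicting the majority class. This is a back-of-the-envelope sketch, not a full evaluation.

```python
# Numbers copied from the outputs printed above.
train_acc = 0.8016845329249619
test_acc = 0.5382262996941896
test_pos_ratio = 0.5168195718654435
n_test = 327

# Accuracy of always predicting the majority class on the test set.
majority_baseline = max(test_pos_ratio, 1 - test_pos_ratio)

gap = train_acc - test_acc
margin_images = round((test_acc - majority_baseline) * n_test)

print("Generalization gap:", round(gap, 4))
print("Majority baseline: ", round(majority_baseline, 4))
print("Margin over baseline (images):", margin_images)
```

Read this way, the model beats the majority-class baseline on the test set by only a handful of images while fitting the training set much better, which is consistent with overfitting. The label refinement alone does not appear to be enough for this fully connected architecture.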
