## Optimized Model: Rethinking the Pipeline

In the previous notebooks, I implemented multiple models of increasing complexity, ranging from logistic regression to deeper neural networks. While these experiments were valuable for understanding core concepts in machine learning and deep learning, the resulting performance remained relatively low and inconsistent.

After analyzing the results more carefully, it became clear that the limitation was not only related to model architecture or optimization techniques, but also to **how the data was labeled and interpreted**.

### Identified Issue with Data Labeling

In the earlier approach, an image was classified as *damaged* if **any single polygon** within the image was labeled as `D_Building` or `Debris`. This means that even a small, localized damaged region could cause the entire image to be labeled as damaged.  
Such a strategy likely introduces noise and label ambiguity, especially for images that are largely intact but contain minor damage.

This coarse labeling scheme may prevent the model from learning meaningful visual patterns related to *overall structural damage*, which is the core objective of this project.
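To make the difference concrete, here is a minimal sketch that contrasts the "any polygon" rule described above with one possible refinement: labeling an image as damaged only when damaged polygons cover a minimum fraction of the image. The `(class_name, points)` polygon format, the `min_fraction` threshold of 0.2, the helper names, and the `"Intact"` class are all assumptions for illustration, not the dataset's actual schema or the rule used elsewhere in this project. Overlapping polygons would be double-counted by this simple area sum.

```python
import numpy as np

# Damage classes named in the text above.
DAMAGE_CLASSES = {"D_Building", "Debris"}


def polygon_area(points):
    """Area of a simple polygon [(x, y), ...] via the shoelace formula."""
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))


def label_any(polygons):
    """Earlier rule: damaged (1) if ANY polygon has a damage class."""
    return int(any(cls in DAMAGE_CLASSES for cls, _ in polygons))


def label_by_area(polygons, image_area, min_fraction=0.2):
    """Hypothetical refinement: damaged (1) only if damaged polygons
    cover at least `min_fraction` of the image area."""
    damaged = sum(polygon_area(pts) for cls, pts in polygons
                  if cls in DAMAGE_CLASSES)
    return int(damaged / image_area >= min_fraction)


# Toy 100x100 image: one large placeholder "Intact" region, one small debris patch.
square_10 = [(0, 0), (10, 0), (10, 10), (0, 10)]
demo_polygons = [
    ("Intact", [(20, 20), (90, 20), (90, 90), (20, 90)]),  # hypothetical class name
    ("Debris", square_10),                                  # covers 1% of the image
]

any_label = label_any(demo_polygons)
area_label = label_by_area(demo_polygons, image_area=100 * 100)
print(any_label, area_label)
```

In this toy case the two rules disagree: the small debris patch is enough to flip the "any polygon" label, while the area-based rule treats the image as mostly intact.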

### Objective of This Notebook

In this notebook, I aim to build the **most optimized model so far**, not only by:
- improving model architecture,
- applying better initialization, regularization, and optimization techniques,

but also by **revisiting and refining the data labeling strategy itself**.

By aligning the labels more closely with the true semantic meaning of structural damage, the goal is to provide the model with cleaner supervision and enable more reliable learning.

This step marks a transition from experimenting with models to **systematically improving the full machine learning pipeline**, from data understanding to final evaluation.


## Parsing the XML file & Preparing the data 

In [1]:
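import sys

import h5py
import numpy as np

# Make the project's src/ directory importable before loading the model code.
sys.path.append("../src")
# Expected to provide L_layer_model_mini_batch and predict, which are used below.
from Optimized_Model_04.Optimized_Model import *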

In [2]:
with h5py.File("../EIDSeg_Dataset/cache/eidseg_64x64_binary_any.h5", "r") as f:
    X_train_org = f["X_train"][:]
    Y_train_org = f["Y_train"][:]
    X_test_org  = f["X_test"][:]
    Y_test_org  = f["Y_test"][:]

m_train = X_train_org.shape[0]
m_test = X_test_org.shape[0]
num_px = X_train_org.shape[1]

print(f"Number of training examples: m_train = {m_train}")
print(f"Number of testing examples: m_test = {m_test}")
print(f"Height/Width of each image: num_px = {num_px}")
print(f"Each image is of size: ({num_px}, {num_px}, 3)")
print(f"train_set_x shape: {X_train_org.shape}")
print(f"train_set_y shape: {Y_train_org.shape}")
print(f"test_set_x shape: {X_test_org.shape}")
print(f"test_set_y shape: {Y_test_org.shape}")

Number of training examples: m_train = 2612
Number of testing examples: m_test = 327
Height/Width of each image: num_px = 64
Each image is of size: (64, 64, 3)
train_set_x shape: (2612, 64, 64, 3)
train_set_y shape: (1, 2612)
test_set_x shape: (327, 64, 64, 3)
test_set_y shape: (1, 327)


In [5]:
with h5py.File("../EIDSeg_Dataset/cache/eidseg_64x64_binary_any_flat.h5", "r") as f:
    train_x = f["train_x"][:]   # (12288, m)
    train_y = f["train_y"][:]   # (1, m)
    test_x  = f["test_x"][:]    # (12288, m)
    test_y  = f["test_y"][:]    # (1, m)

print ("train_set_x flatten shape: " + str(train_x.shape))
print ("train_set_y shape: " + str(train_y.shape))
print ("test_set_x flatten shape: " + str(test_x.shape))
print ("test_set_y shape: " + str(test_y.shape))

train_set_x flatten shape: (12288, 2612)
train_set_y shape: (1, 2612)
test_set_x flatten shape: (12288, 327)
test_set_y shape: (1, 327)
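
The script that produced the flat cache is not shown in this notebook. As a rough sketch of how a `(m, 64, 64, 3)` image array maps to the `(12288, m)` layout printed above, the snippet below reshapes each image into a column vector. The helper name `flatten_images` and the division by 255 are assumptions; the actual cache may use different scaling.

```python
import numpy as np


def flatten_images(X):
    """Turn (m, h, w, c) images into an (h*w*c, m) matrix, one column per image.

    Scaling to [0, 1] by dividing by 255 is an assumption about the cache.
    """
    m = X.shape[0]
    return X.reshape(m, -1).T.astype(np.float64) / 255.0


# Random stand-in for real images, only to show the shapes.
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 256, size=(5, 64, 64, 3), dtype=np.uint8)

flat_demo = flatten_images(X_demo)
print(flat_demo.shape)  # (12288, 5)
```

Each column has 64 * 64 * 3 = 12288 entries, which is why the first entry of `layers_dims` has to be 12288 for a fully connected network on this input.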


In [6]:
# layers_dims = [12288, 64, 32, 16, 1]
# Input size 12288 = 64 * 64 * 3, three hidden layers, one output unit.
layers_dims = [12288, 128, 64, 32, 1]  # try modest sizes for FC net
params, costs = L_layer_model_mini_batch(
    train_x, train_y, layers_dims,
    learning_rate=0.01,
    num_epochs=400,
    mini_batch_size=64,
    print_cost=True,
    seed=1,
    init="he",
)

# evaluate
preds_train = predict(train_x, params)
acc_train = np.mean(preds_train == train_y) * 100
print("Train accuracy:", acc_train)

preds_test = predict(test_x, params)
acc_test = np.mean(preds_test == test_y) * 100
print("Test accuracy:", acc_test)

Epoch 0/400 - cost: 0.671243
Epoch 40/400 - cost: 0.604665
Epoch 80/400 - cost: 0.604453
Epoch 120/400 - cost: 0.604235
Epoch 160/400 - cost: 0.604018
Epoch 200/400 - cost: 0.603751
Epoch 240/400 - cost: 0.603354
Epoch 280/400 - cost: 0.602917
Epoch 320/400 - cost: 0.602342
Epoch 360/400 - cost: 0.601625
Train accuracy: 70.71209800918837
Test accuracy: 74.00611620795107
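
One way to check whether these accuracies reflect anything beyond class imbalance is to compare them against a majority-class baseline and to count how many predictions fall into each class. The sketch below is a hypothetical helper, not part of `Optimized_Model`. For reference, a constant predictor on labels with a positive fraction of about 0.707 would have a cross-entropy of roughly 0.605, which is close to the plateau in the log above; whether that is what happened here is exactly what this check can confirm.

```python
import numpy as np


def majority_baseline(y):
    """Accuracy (%) and cross-entropy of always predicting the class prior.

    `y` holds binary labels in {0, 1}, any shape (e.g. (1, m)).
    """
    y = np.asarray(y).ravel()
    p = y.mean()                      # fraction of positive labels
    acc = max(p, 1.0 - p) * 100.0     # accuracy of always predicting the majority class
    eps = 1e-12                       # avoids log(0) when one class is absent
    cost = -(p * np.log(p + eps) + (1.0 - p) * np.log(1.0 - p + eps))
    return acc, cost


def prediction_counts(preds):
    """How many predictions fall into class 0 and class 1."""
    preds = np.asarray(preds).ravel().astype(int)
    return {0: int((preds == 0).sum()), 1: int((preds == 1).sum())}


# Toy labels (7 positives, 3 negatives) and a degenerate all-ones predictor.
y_demo = np.array([[1, 1, 1, 1, 1, 1, 1, 0, 0, 0]])
preds_demo = np.ones_like(y_demo)

base_acc, base_cost = majority_baseline(y_demo)
counts = prediction_counts(preds_demo)
print(base_acc, base_cost, counts)
```

In this notebook the same helpers could be applied as `majority_baseline(train_y)` and `prediction_counts(preds_train)`. If the baseline accuracy matches the reported train accuracy and nearly all predictions land in one class, the network has not learned more than the class prior.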


# Final Notes (I declare defeat)

After two weeks of trying, I give myself a pat on the back.

I initially thought I could build a working model without convolutional networks. After trying many approaches (changing the data, adjusting the model, and testing different hyperparameters), I realized that this task is best suited for CNNs.

Because of the dataset imbalance and the early 70% accuracy (which came from the model learning to predict all ones), I got false hope. But it was a valuable learning experience: I learned my lesson and now understand the importance of using the right architecture for the problem.

Even though I hated my life, it was a funny experience to see how many times I thought I had found the solution.

Thank you for reading my notebook :)
