# Problem Definition

Two sets of satellite images are provided. The first set comes with labels that mark the roof areas; the second set is unlabeled.

The objective is to train an appropriate neural network model using the first set of images and their labels, and then use the trained model to predict labels for the second set of images.

## Approach
Utilize the provided training data to train multiple neural network models, compare their accuracy, and select the best-performing model for predicting labels on the given second set of images. The models to be tested include **DeepLabv3** (with various backbone/base models) and **U-Net**.

### Step 1: Importing Libraries
Import the necessary libraries required for the subsequent steps.

In [None]:
import os
import cv2
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
from sklearn.model_selection import train_test_split, ParameterGrid

### Step 2: Data Loading and Processing
The first set of images and its labels are stored in the following locations:
* **Images**: `dataset/train/images`
* **Labels**: `dataset/train/labels`

In this step, the images and labels are read with the OpenCV library, then resized and normalized.
Finally, they are loaded as NumPy arrays.
* Every step is described in the comments in the code.


In [None]:
def load_data(image_dir, label_dir, img_size=(256, 256)):
    images, labels = [], []
    # Sort file names so the sample order is the same on every run
    for file_name in sorted(os.listdir(image_dir)):
        # Each label is expected to share the file name of its image
        if not file_name.lower().endswith(".png"):
            continue
        img_path = os.path.join(image_dir, file_name)
        label_path = os.path.join(label_dir, file_name)

        img = cv2.imread(img_path)  # Read image (OpenCV returns BGR channels)
        lbl = cv2.imread(label_path, cv2.IMREAD_GRAYSCALE)  # Read label as grayscale
        if img is None or lbl is None:  # Skip unreadable files or missing labels
            continue

        img = cv2.resize(img, img_size)  # Resize image
        img = img / 255.0  # Normalize pixel values to [0, 1]

        # Nearest-neighbour interpolation keeps label values binary after resizing
        lbl = cv2.resize(lbl, img_size, interpolation=cv2.INTER_NEAREST)
        lbl = lbl / 255.0  # Normalize label (binary labels)
        lbl = np.expand_dims(lbl, axis=-1)  # Add channel dimension

        images.append(img)
        labels.append(lbl)

    return np.array(images), np.array(labels)

# Function call to load data as NumPy arrays
images, labels = load_data("dataset/train/images", "dataset/train/labels")
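
Before training, it can help to confirm that the arrays returned by `load_data` have the expected layout. The following is a minimal sketch and not part of the original notebook: `check_dataset` is a hypothetical helper, and the synthetic arrays only stand in for the real `images` and `labels`.

```python
import numpy as np

def check_dataset(images, labels):
    """Validate array layout and report simple label statistics."""
    # Images are expected as (N, H, W, 3), labels as (N, H, W, 1)
    assert images.ndim == 4 and images.shape[-1] == 3
    assert labels.shape == images.shape[:3] + (1,)
    # Both arrays were divided by 255, so values should lie in [0, 1]
    assert images.min() >= 0.0 and images.max() <= 1.0
    assert labels.min() >= 0.0 and labels.max() <= 1.0
    return {
        "num_samples": int(images.shape[0]),
        # True only if every label pixel is exactly 0 or 1
        "labels_are_binary": bool(np.isin(labels, (0.0, 1.0)).all()),
        # Share of pixels marked as roof (label >= 0.5)
        "roof_fraction": float((labels >= 0.5).mean()),
    }

# Synthetic stand-in data: 4 images of 256x256, left half of each label is roof
demo_images = np.random.default_rng(0).random((4, 256, 256, 3))
demo_labels = np.zeros((4, 256, 256, 1))
demo_labels[:, :, :128, :] = 1.0

stats = check_dataset(demo_images, demo_labels)
print(stats)
```

If `labels_are_binary` turns out to be false on the real data, the in-between values most likely come from interpolation during resizing or from anti-aliased mask edges.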

### Step 3: Splitting the Data into Train and Validation Sets
The data is split into training and validation sets: 90% of the data is used for training and 10% for validation. A random state of 42 keeps the split consistent between separate runs.
_N.B.: An 80%/20% split is more common, but because the available data is very limited, 90% of it is used for training so that the model sees as many examples as possible._

In [None]:
X_train, X_val, y_train, y_val = train_test_split(images, labels, test_size=0.1, random_state=42)
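
With a very small dataset, it is worth checking how many samples actually land in each set. The sketch below reproduces the rounding that, to my understanding, `train_test_split` applies for a fractional `test_size` (the validation count is rounded up). `split_sizes` is a hypothetical helper, and the sample counts are examples, not figures from this dataset.

```python
import math

def split_sizes(n_samples, test_size=0.1):
    """Return (n_train, n_val) for a fractional test_size."""
    # Assumption: the validation count is rounded up, the rest goes to training
    n_val = math.ceil(test_size * n_samples)
    return n_samples - n_val, n_val

for n in (25, 40, 100):
    n_train, n_val = split_sizes(n)
    print(f"{n} samples -> {n_train} train / {n_val} validation")
```

With only a handful of validation images, the validation loss of a single run is a noisy estimate, which matters when it is used to rank hyperparameter configurations.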

### Step 4: Selection of Model and Hyperparameter Tuning
In this process, I will explore multiple neural network models, including **DeepLabv3** and **U-Net**. After evaluating their performance, I will select the best-performing models to predict labels for the second set of images. 

For **DeepLabv3**, I will experiment with different base models and choose the one with the highest accuracy among them. **U-Net**, with its distinct architecture, is effective at capturing edges and boundaries in labels, where DeepLabv3 may not perform as well. Therefore, I will also use U-Net for predictions.

Additionally, I will perform parameter tuning to optimize each model for the best possible performance.

#### Define DeepLabv3 model
The following block is a function that creates a DeepLabv3 model with the provided backbone (base model).

In [None]:
def get_deeplabv3_model(base_model, num_classes=1):
    
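    # ASPP-style block: dilated (atrous) convolutions with rates 6, 12 and 18,
    # stacked sequentially here rather than applied in parallel branches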
    x = base_model.output
    x = tf.keras.layers.Conv2D(256, (1, 1), activation="relu", padding="same")(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Conv2D(256, (3, 3), dilation_rate=6, activation="relu", padding="same")(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Conv2D(256, (3, 3), dilation_rate=12, activation="relu", padding="same")(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Conv2D(256, (3, 3), dilation_rate=18, activation="relu", padding="same")(x)
    x = tf.keras.layers.BatchNormalization()(x)

    # Decoder
    x = tf.keras.layers.Conv2DTranspose(256, (3, 3), strides=(2, 2), padding="same")(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Conv2DTranspose(128, (3, 3), strides=(2, 2), padding="same")(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Conv2DTranspose(64, (3, 3), strides=(2, 2), padding="same")(x)
    x = tf.keras.layers.BatchNormalization()(x)

    # Final layer to match label size
    x = tf.keras.layers.Conv2D(num_classes, (1, 1), activation="sigmoid", padding="same")(x)
    
    # Ensuring output is (256, 256, num_classes)
    x = tf.keras.layers.UpSampling2D(size=(2, 2))(x)
    x = tf.keras.layers.UpSampling2D(size=(2, 2))(x)

    model = tf.keras.Model(inputs=base_model.input, outputs=x)
    return model
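
The two `UpSampling2D` layers at the end only restore a 256x256 output if the backbone shrinks the input by a specific factor. The sketch below traces the spatial size through the head, assuming a 256x256 input and a backbone with an overall stride of 32 (an assumption that should be checked against `base_model.output.shape` for each backbone). `trace_output_size` is a hypothetical helper that does plain arithmetic and does not call Keras.

```python
def trace_output_size(input_size=256, backbone_stride=32):
    """Follow the spatial size through the DeepLabv3-style head."""
    size = input_size // backbone_stride  # backbone feature map
    sizes = [("backbone output", size)]
    # Dilated convolutions use padding="same" and keep the size unchanged
    sizes.append(("dilated convolutions", size))
    # Three Conv2DTranspose layers with strides=(2, 2) each double the size
    for i in range(3):
        size *= 2
        sizes.append((f"Conv2DTranspose {i + 1}", size))
    # Two UpSampling2D(size=(2, 2)) layers each double the size again
    for i in range(2):
        size *= 2
        sizes.append((f"UpSampling2D {i + 1}", size))
    return sizes

for name, size in trace_output_size():
    print(f"{name:>22}: {size} x {size}")
```

With a different backbone stride, for example 16, the same head would produce a 512x512 output for a 256x256 input, so the number of upsampling layers would need to change.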

#### Functions to tune params and train DeepLabv3 model
The following block defines functions to tune hyperparameters and train the DeepLabv3 model with the provided backbone or base model.

In [None]:
# Return the pretrained backbone that matches the 'base_model' hyperparameter value
def get_base_model(base_model_param):
    if base_model_param == 'xception':
        return tf.keras.applications.Xception(
            weights="imagenet", include_top=False, input_shape=(256, 256, 3)
        )
    if base_model_param == 'resnet50':
        return tf.keras.applications.ResNet50(
            weights="imagenet", include_top=False, input_shape=(256, 256, 3)
        )
    if base_model_param == 'resnet101':
        return tf.keras.applications.ResNet101(
            weights="imagenet", include_top=False, input_shape=(256, 256, 3)
        )
    if base_model_param == 'mobilenetv2':
        return tf.keras.applications.MobileNetV2(
            weights="imagenet", include_top=False, input_shape=(256, 256, 3)
        )
    return None


def tune_and_train():
    # Define the hyperparameter grid
    param_grid = {
        'learning_rate': [0.001, 0.0001],
        'batch_size': [8, 16],
        'optimizer': ['adam', 'sgd'],
        'epochs': [50, 75, 100],
        'base_model': ['xception', 'resnet50', 'resnet101', 'mobilenetv2']
    }

    best_model = None
    best_score = float('inf')
    best_params = None

    for params in ParameterGrid(param_grid):
        print(f"Testing parameters: {params}")

        base_model = get_base_model(params['base_model'])
        if base_model is None:  # Skip unknown backbone names
            continue
            
        # Rebuild the model for each hyperparameter configuration
        model = get_deeplabv3_model(base_model)

        # Configuring optimizer
        if params['optimizer'] == 'adam':
            optimizer = tf.keras.optimizers.Adam(learning_rate=params['learning_rate'])
        elif params['optimizer'] == 'sgd':
            optimizer = tf.keras.optimizers.SGD(learning_rate=params['learning_rate'], momentum=0.9)

        # Compiling the model
        model.compile(
            optimizer=optimizer,
            loss="binary_crossentropy",
            metrics=["accuracy"]
        )
        
        # Training the model
        history = model.fit(
            X_train, y_train,
            validation_data=(X_val, y_val),
            epochs=params['epochs'],
            batch_size=params['batch_size'],
            verbose=1
        )
        
        # Evaluate performance on the validation set
        val_loss = history.history['val_loss'][-1]
        print(f"Validation loss: {val_loss}")

        # Saving the best model configuration
        if val_loss < best_score:
            best_score = val_loss
            best_model = model
            best_params = params

    # Print the best configuration
    print(f"Best Parameters: {best_params}")
    print(f"Best Validation Loss: {best_score}")

    return best_model
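
`tune_and_train` scores each configuration by the validation loss of its final epoch. As an alternative sketch, the lowest validation loss seen during training, and the epoch at which it occurred, can be read from the same history. The loss values below are made up for illustration, and `best_epoch` is a hypothetical helper.

```python
def best_epoch(val_losses):
    """Return (1-based epoch number, loss) of the lowest validation loss."""
    index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return index + 1, val_losses[index]

# Made-up values in the same layout as history.history["val_loss"]
val_loss_history = [0.62, 0.41, 0.33, 0.29, 0.31, 0.35]

epoch, loss = best_epoch(val_loss_history)
print(f"Last epoch loss: {val_loss_history[-1]}")
print(f"Best epoch: {epoch} with loss {loss}")
```

If the best epoch falls well before the last one, the model has probably started to overfit, and the epoch count may not need to be a separate grid dimension.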


#### Tune and train DeepLabv3 model
In the following step, I tune and train the DeepLabv3 model and keep the best-performing configuration.

In [None]:
# Call the tuning function to get the DeepLabv3 model with the best hyperparameters
deeplabv3_model = tune_and_train()

# Saving the best model
if deeplabv3_model is not None:
    deeplabv3_model.save("trained_deeplabv3_roof_segmentation.h5")
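
The grid search trains one model per parameter combination, so its cost grows quickly. The following sketch counts the combinations of the DeepLabv3 grid defined in `tune_and_train` and the total number of epochs if every run completes. It uses only `itertools` and is independent of the training code.

```python
from itertools import product

param_grid = {
    "learning_rate": [0.001, 0.0001],
    "batch_size": [8, 16],
    "optimizer": ["adam", "sgd"],
    "epochs": [50, 75, 100],
    "base_model": ["xception", "resnet50", "resnet101", "mobilenetv2"],
}

# Every combination of values corresponds to one full training run
keys = list(param_grid)
combinations = [dict(zip(keys, values)) for values in product(*param_grid.values())]

num_runs = len(combinations)
total_epochs = sum(c["epochs"] for c in combinations)
print(f"{num_runs} training runs, {total_epochs} epochs in total")
```

Counting the runs up front makes it easier to decide whether the full grid is affordable or whether some dimensions should be narrowed first.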
    

#### Define and train U-Net model with hyperparameter tuning
In this step, I define a U-Net model and tune its hyperparameters to find the best-fitting configuration.

In [None]:
def get_unet_model(input_size=(256, 256, 3)):
    inputs = layers.Input(input_size)
    
    # Encoder
    c1 = layers.Conv2D(64, (3, 3), activation='relu', padding='same')(inputs)
    c1 = layers.Conv2D(64, (3, 3), activation='relu', padding='same')(c1)
    p1 = layers.MaxPooling2D((2, 2))(c1)

    c2 = layers.Conv2D(128, (3, 3), activation='relu', padding='same')(p1)
    c2 = layers.Conv2D(128, (3, 3), activation='relu', padding='same')(c2)
    p2 = layers.MaxPooling2D((2, 2))(c2)

    # Bottleneck
    c3 = layers.Conv2D(256, (3, 3), activation='relu', padding='same')(p2)
    c3 = layers.Conv2D(256, (3, 3), activation='relu', padding='same')(c3)

    # Decoder
    u1 = layers.Conv2DTranspose(128, (2, 2), strides=(2, 2), padding='same')(c3)
    u1 = layers.concatenate([u1, c2])
    c4 = layers.Conv2D(128, (3, 3), activation='relu', padding='same')(u1)
    c4 = layers.Conv2D(128, (3, 3), activation='relu', padding='same')(c4)

    u2 = layers.Conv2DTranspose(64, (2, 2), strides=(2, 2), padding='same')(c4)
    u2 = layers.concatenate([u2, c1])
    c5 = layers.Conv2D(64, (3, 3), activation='relu', padding='same')(u2)
    c5 = layers.Conv2D(64, (3, 3), activation='relu', padding='same')(c5)

    outputs = layers.Conv2D(1, (1, 1), activation='sigmoid')(c5)

    return models.Model(inputs, outputs)

# Hyperparameter grid
param_grid = {
    'learning_rate': [0.001, 0.0001],
    'batch_size': [8, 16],
    'optimizer': ['adam', 'sgd'],
    'epochs': [50, 75, 100]
}

unet_model = None
unet_score = float('inf')
unet_params = None

for params in ParameterGrid(param_grid):
    print(f"Testing parameters: {params}")
    
    # Rebuild the U-Net model for each hyperparameter configuration
    model = get_unet_model()

    # Configuring optimizer
    if params['optimizer'] == 'adam':
        optimizer = tf.keras.optimizers.Adam(learning_rate=params['learning_rate'])
    elif params['optimizer'] == 'sgd':
        optimizer = tf.keras.optimizers.SGD(learning_rate=params['learning_rate'], momentum=0.9)

    # Compiling the model
    model.compile(
        optimizer=optimizer,
        loss="binary_crossentropy",
        metrics=["accuracy"]
    )
    
    # Training the model
    history = model.fit(
        X_train, y_train,
        validation_data=(X_val, y_val),
        epochs=params['epochs'],
        batch_size=params['batch_size'],
        verbose=1
    )
    
    # Evaluate performance on the validation set
    val_loss = history.history['val_loss'][-1]
    print(f"Validation loss: {val_loss}")
    
    # Saving the best model configuration
    if val_loss < unet_score:
        unet_score = val_loss
        unet_model = model
        unet_params = params

# Print the best configuration
print(f"Best Parameters: {unet_params}")
print(f"Best Validation Loss: {unet_score}")

# Saving the best model
if unet_model is not None:
    unet_model.save("trained_unet_roof_segmentation_model.h5")
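
The skip connections in `get_unet_model` only work because each upsampled tensor has the same spatial size as the encoder tensor it is concatenated with. The following sketch traces sizes and channel counts for a 256x256 input. It is plain arithmetic on the layer settings above, not a call into Keras, and `trace_unet` is a hypothetical helper.

```python
def trace_unet(input_size=256):
    """Return (name, spatial size, channels) for the main U-Net tensors."""
    c1 = (input_size, 64)            # two 3x3 convs, padding='same'
    p1 = (c1[0] // 2, 64)            # 2x2 max pooling halves the size
    c2 = (p1[0], 128)
    p2 = (c2[0] // 2, 128)
    c3 = (p2[0], 256)                # bottleneck
    u1 = (c3[0] * 2, 128 + c2[1])    # transposed conv, then concat with c2
    assert u1[0] == c2[0], "skip connection sizes must match"
    c4 = (u1[0], 128)
    u2 = (c4[0] * 2, 64 + c1[1])     # transposed conv, then concat with c1
    assert u2[0] == c1[0], "skip connection sizes must match"
    c5 = (u2[0], 64)
    out = (c5[0], 1)                 # 1x1 conv with sigmoid
    names = ["c1", "p1", "c2", "p2", "c3", "u1", "c4", "u2", "c5", "outputs"]
    tensors = [c1, p1, c2, p2, c3, u1, c4, u2, c5, out]
    return [(n, s, ch) for n, (s, ch) in zip(names, tensors)]

for name, size, channels in trace_unet():
    print(f"{name:>7}: {size} x {size} x {channels}")
```

Because the network pools twice, the input size needs to be divisible by 4 for the concatenations to line up; 256 satisfies this.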


### Best-performing hyperparameters after tuning

During hyperparameter tuning and training, **DeepLabV3** performed better than U-Net. Its best-performing hyperparameter set was as follows.
```
{
    'base_model': 'xception',
    'batch_size': 16,
    'epochs': 100,
    'learning_rate': 0.001,
    'optimizer': 'adam'
}
```
With a validation loss of `0.15872398018836975`.
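
Validation loss and pixel accuracy can look favorable when roofs cover only a small share of each image, because predicting background everywhere is already mostly correct. As a complementary check, which the tuning above did not use, the intersection over union (IoU) of the roof class can be computed from thresholded predictions. The arrays below are toy data, and `roof_iou` is a hypothetical helper.

```python
import numpy as np

def roof_iou(y_true, y_prob, threshold=0.5):
    """Intersection over union of the roof class for one batch of masks."""
    true_mask = y_true >= 0.5
    pred_mask = y_prob > threshold
    intersection = np.logical_and(true_mask, pred_mask).sum()
    union = np.logical_or(true_mask, pred_mask).sum()
    # If neither mask contains roof pixels, treat the prediction as perfect
    return 1.0 if union == 0 else float(intersection / union)

# Toy 4x4 label with a 2x2 roof, and a prediction that finds 3 of its 4 pixels
y_true = np.zeros((1, 4, 4, 1))
y_true[0, 1:3, 1:3, 0] = 1.0
y_prob = np.full((1, 4, 4, 1), 0.1)
y_prob[0, 1, 1:3, 0] = 0.9
y_prob[0, 2, 1, 0] = 0.8

accuracy = float(((y_prob > 0.5) == (y_true >= 0.5)).mean())
print(f"pixel accuracy: {accuracy:.4f}, roof IoU: {roof_iou(y_true, y_prob):.4f}")
```

In this toy case a single missed roof pixel barely moves the pixel accuracy but lowers the IoU noticeably, which is why the two numbers are worth reading together.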


### Step 5: Predict labels for given images
In this step, I predict the labels for the second set of images with the better-performing **DeepLabV3** model. I also include **U-Net** predictions for reference.<br>_U-Net should ideally not be used for the final prediction, because it showed lower accuracy during validation; it is included only for reference._

#### Helper function to predict and save new labels using any given model

In [None]:
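def predict_and_save(model, image_path, save_path, img_size=(256, 256)):
    img = cv2.imread(image_path)
    if img is None:  # cv2.imread returns None for unreadable files
        print("Could not read " + image_path + ", skipping")
        return False
    img_resized = cv2.resize(img, img_size) / 255.0  # Same preprocessing as in training
    img_resized = np.expand_dims(img_resized, axis=0)  # Add batch dimension

    prediction = model.predict(img_resized)[0]
    prediction = (prediction > 0.5).astype(np.uint8)  # Thresholding

    # Saving prediction as .png (0 = background, 255 = roof)
    return cv2.imwrite(save_path, prediction * 255)

def predict_for_folder(model, images_dir, labels_dir):
    # cv2.imwrite does not create missing folders, so create the output folder first
    os.makedirs(labels_dir, exist_ok=True)
    for p in sorted(os.listdir(images_dir)):
        if p.endswith("png"):
            img_path = os.path.join(images_dir, p)
            lbl_path = os.path.join(labels_dir, p)
            print("Saving label for " + img_path + " to " + lbl_path)
            if predict_and_save(model, img_path, lbl_path):
                print("Done\n")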
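
The thresholding step in `predict_and_save` converts sigmoid probabilities into an 8-bit mask. The sketch below shows that conversion on a small made-up probability map, without a model or file I/O. `to_mask` is a hypothetical helper.

```python
import numpy as np

def to_mask(prediction, threshold=0.5):
    """Convert an (H, W, 1) probability map into a uint8 mask of 0 and 255."""
    binary = (prediction > threshold).astype(np.uint8)  # 1 = roof, 0 = background
    return binary * 255  # stays uint8, since 1 * 255 fits in 8 bits

# Made-up probabilities with the same layout as model.predict(...)[0]
probabilities = np.array([[0.05, 0.49, 0.50],
                          [0.51, 0.80, 0.99]])[..., np.newaxis]

mask = to_mask(probabilities)
print(mask[..., 0])
print(mask.dtype, mask.shape)
```

Note that a probability of exactly 0.5 is treated as background, because the comparison is strict.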

#### Predict and save labels for the second set of images

In [None]:
predict_for_folder(deeplabv3_model, "dataset/new_images", "dataset/prediction/deeplabv3")
predict_for_folder(unet_model, "dataset/new_images", "dataset/prediction/unet")