<a href="https://colab.research.google.com/github/CampbellAgreev/Analysis-of-housing-information/blob/master/Unet.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [2]:
import os


zip_file_path = '/content/drive/My Drive/tgs-salt-identification-challenge.zip'


output_extraction_path = '/content/tgs_data/'

if not os.path.exists(output_extraction_path):
    os.makedirs(output_extraction_path)
    print(f"Created directory: {output_extraction_path}")
else:
    print(f"Directory already exists: {output_extraction_path}")

!unzip -q -o "{zip_file_path}" -d "{output_extraction_path}"


!ls "{output_extraction_path}"


Directory already exists: /content/tgs_data/
competition_data.zip  flamingo.zip	     test      train	  train.zip
depths.csv	      sample_submission.csv  test.zip  train.csv


In [3]:
!git clone https://github.com/zhixuhao/unet.git
%cd unet
!ls

fatal: destination path 'unet' already exists and is not an empty directory.
/content/unet
data		   data.py  LICENSE  model.py	  README.md
dataPrepare.ipynb  img	    main.py  __pycache__  trainUnet.ipynb


In [4]:
import os

base_data_path = '/content/tgs_data/'
train_zip_path = os.path.join(base_data_path, 'train.zip')
test_zip_path = os.path.join(base_data_path, 'test.zip')


train_output_path = os.path.join(base_data_path, 'train/')
test_output_path = os.path.join(base_data_path, 'test/')

os.makedirs(train_output_path, exist_ok=True)
os.makedirs(test_output_path, exist_ok=True)

print(f"Unzipping {train_zip_path} to {train_output_path}...")
!unzip -q -o "{train_zip_path}" -d "{train_output_path}"
print("Train data unzipped.")

print(f"Unzipping {test_zip_path} to {test_output_path}...")
!unzip -q -o "{test_zip_path}" -d "{test_output_path}"
print("Test data unzipped.")

print("\nContents of the unzipped train folder:")
!ls "{train_output_path}"

print("\nContents of the unzipped test folder:")
!ls "{test_output_path}"

Unzipping /content/tgs_data/train.zip to /content/tgs_data/train/...
Train data unzipped.
Unzipping /content/tgs_data/test.zip to /content/tgs_data/test/...
Test data unzipped.

Contents of the unzipped train folder:
images	masks

Contents of the unzipped test folder:
images


In [5]:
%cd /content/unet/

/content/unet


In [7]:



import os
import glob
import numpy as np
from skimage.transform import resize
from skimage.io import imsave
import pandas as pd


from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint

try:
    from model import unet
    from data import trainGenerator, testGenerator
except ImportError as e:
    print(f"ImportError: {e}. Please ensure model.py and data.py are accessible.")
    print("If they are in /content/unet/, try running '%cd /content/unet/' in a cell first.")
    raise


BATCH_SIZE = 4
TARGET_SIZE = (128, 128)
ORIGINAL_SIZE = (101, 101)
INPUT_CHANNELS = 1

base_data_path = '/content/tgs_data/'
train_data_dir = os.path.join(base_data_path, 'train/')
test_data_dir = os.path.join(base_data_path, 'test/')

data_gen_args = dict(rotation_range=0.2,  # note: ImageDataGenerator reads rotation_range in degrees
                     width_shift_range=0.05,
                     height_shift_range=0.05,
                     shear_range=0.05,
                     zoom_range=0.05,
                     horizontal_flip=True,
                     fill_mode='nearest')

checkpoint_filepath = 'unet_tgs_best.keras'

print("Creating training data generator...")

my_train_gene = trainGenerator(batch_size=BATCH_SIZE,
                               train_path=train_data_dir,
                               image_folder='images',
                               mask_folder='masks',
                               aug_dict=data_gen_args,
                               target_size=TARGET_SIZE,
                               image_color_mode="grayscale",
                               mask_color_mode="grayscale",
                               flag_multi_class=False)

print("\nCreating test data generator...")

my_test_gene = testGenerator(test_path=test_data_dir,
                             target_size=TARGET_SIZE,
                             as_gray=True)


print("Defining U-Net model...")

model_input_size = (TARGET_SIZE[0], TARGET_SIZE[1], INPUT_CHANNELS)
model = unet(input_size=model_input_size)

print("\nCompiling U-Net model...")

model.compile(optimizer=Adam(learning_rate=1e-4),
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.summary()


print("\nStarting model training...")


train_images_path_pattern = os.path.join(train_data_dir, 'images', '*.png')
num_train_samples = len(glob.glob(train_images_path_pattern))

if num_train_samples == 0:
    raise ValueError(f"No training images found: {train_images_path_pattern}")

steps_per_epoch = num_train_samples // BATCH_SIZE
if steps_per_epoch == 0:
    print(f"Warning: steps_per_epoch is 0 (num_train_samples={num_train_samples}, BATCH_SIZE={BATCH_SIZE}). Setting to 1.")
    steps_per_epoch = 1

print(f"Number of training samples: {num_train_samples}")
print(f"Batch size: {BATCH_SIZE}")
print(f"Steps per epoch: {steps_per_epoch}")


model_checkpoint_callback = ModelCheckpoint(
    filepath=checkpoint_filepath,
    save_weights_only=False,
    monitor='loss',
    mode='min',
    save_best_only=True)

NUMBER_OF_EPOCHS = 10

history = model.fit(my_train_gene,
                    steps_per_epoch=steps_per_epoch,
                    epochs=NUMBER_OF_EPOCHS,
                    callbacks=[model_checkpoint_callback])

print("\nTraining finished.")
print(f"Best model potentially saved to {checkpoint_filepath} based on training loss improvement.")



print(f"\nAttempting to load model from {checkpoint_filepath} for prediction...")

from tensorflow.keras.models import load_model
if os.path.exists(checkpoint_filepath):
    try:
        model = load_model(checkpoint_filepath)
        print("Best model loaded successfully.")
    except Exception as e:
        print(f"Error loading model from checkpoint: {e}. Using model from end of training.")
else:
    print(f"Checkpoint file {checkpoint_filepath} not found. Using model from end of training.")


def rle_encode(img):
    '''
    img: numpy array, 1 - mask, 0 - background. Input should be 2D.
    Returns run length as string formatted
    '''
    pixels = img.flatten(order='F')  # column-major flatten: top to bottom, then left to right
    pixels = np.concatenate([[0], pixels, [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]
    return ' '.join(str(x) for x in runs)


print("\nMaking predictions on the test set and generating RLE strings...")
predictions_output_visualization_dir = "test_predictions_visualized"
if not os.path.exists(predictions_output_visualization_dir):
    os.makedirs(predictions_output_visualization_dir)

submission_data = []
processed_count = 0


my_test_gene_for_prediction = testGenerator(test_path=test_data_dir,
                                            target_size=TARGET_SIZE,
                                            as_gray=True)

for test_img_batch, test_filename in my_test_gene_for_prediction:
    if test_img_batch is None:
        print(f"Warning: testGenerator yielded None for {test_filename}")
        continue


    predicted_mask_batch = model.predict(test_img_batch, verbose=0)


    predicted_mask_at_target_size = predicted_mask_batch[0]

    predicted_mask_binary_at_target_size = (predicted_mask_at_target_size > 0.5).astype(np.uint8)


    predicted_mask_resized_to_original = resize(
        predicted_mask_binary_at_target_size[:,:,0],
        ORIGINAL_SIZE,
        mode='constant',
        preserve_range=True,
        anti_aliasing=False
    )

    predicted_mask_final_for_rle = (predicted_mask_resized_to_original > 0.5).astype(np.uint8)


    rle_string = rle_encode(predicted_mask_final_for_rle)
    image_id = os.path.splitext(test_filename)[0] # Filename without .png
    submission_data.append({'id': image_id, 'rle': rle_string})



    processed_count += 1
    if processed_count % 100 == 0:  # progress report every 100 images
        print(f"Processed {processed_count} test images...")


if submission_data:
    submission_df = pd.DataFrame(submission_data)
    submission_df.to_csv('submission.csv', index=False)
    print(f"\nSubmission file 'submission.csv' created with {len(submission_df)} entries.")
else:
    print("\nNo predictions were processed to create a submission file.")



Creating training data generator...

Creating test data generator...
Defining U-Net model...

Compiling U-Net model...



Starting model training...
Number of training samples: 4000
Batch size: 4
Steps per epoch: 1000
Loading training images from: /content/tgs_data/train/images
Found 4000 images belonging to 1 classes.
Loading training masks from: /content/tgs_data/train/masks
Found 4000 images belonging to 1 classes.
Epoch 1/10
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 122s 105ms/step - accuracy: 0.7544 - loss: 0.5271
Epoch 2/10
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 108s 108ms/step - accuracy: 0.8633 - loss: 0.3785
Epoch 3/10
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 109s 109ms/step - accuracy: 0.8808 - loss: 0.3573
Epoch 4/10
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 109s 109ms/step - accuracy: 0.9010 - loss: 0.3284
Epoch 5/10
1000/1000 ━━━━━━━━━━━━━━━━━━━━ 113s 113ms/step - accuracy: 0.8972 - loss: 0.3309
Epoch 6/10
1000/1000 ━━━━━━━━━━━━━━━━━━━━

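The rle_encode helper defined in the cell above can be sanity-checked on a tiny mask before it is trusted on real predictions. The sketch below repeats the function so that it runs on its own; the 3x3 mask is made up for illustration and is not part of the dataset.

```python
import numpy as np

def rle_encode(img):
    """Run-length encode a 2D binary mask (1 = mask, 0 = background)."""
    pixels = img.flatten(order='F')                    # column-major order
    pixels = np.concatenate([[0], pixels, [0]])        # pad so runs at the edges are detected
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1  # 1-indexed positions where the value changes
    runs[1::2] -= runs[::2]                            # turn each run end into a run length
    return ' '.join(str(x) for x in runs)

# Hypothetical toy mask: a vertical bar of two pixels plus one isolated pixel.
toy_mask = np.array([[0, 1, 0],
                     [0, 1, 0],
                     [0, 0, 1]], dtype=np.uint8)

print(rle_encode(toy_mask))                           # 4 2 9 1
print(repr(rle_encode(np.zeros((3, 3), np.uint8))))   # '' for an empty mask
```

Read column by column, the toy mask flattens to 0 0 0 1 1 0 0 0 1, so the encoding is "start at pixel 4, run of 2; start at pixel 9, run of 1". An all-background prediction produces an empty string.
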
Core U-Net Concepts from the Paper:

Symmetric U-Shape:

The architecture consists of:

- A contracting path (encoder) to capture context.
- An expansive path (decoder) to enable precise localization.

Contracting Path (Encoder):

- Repeated application of two 3x3 convolutions (ReLU activated).
- Followed by a 2x2 max pooling operation (stride 2) for downsampling.
- At each downsampling step, the number of feature channels is typically doubled.

Expansive Path (Decoder):

- Each step starts with an upsampling of the feature map (e.g., 2x2 "up-convolution" or UpSampling2D followed by a 2x2 Conv2D). This typically halves the number of feature channels.
- Concatenation with the corresponding feature map from the contracting path (these are the skip connections). The paper crops the encoder feature maps because its unpadded convolutions leave them larger than the decoder maps. In Keras implementations that use "same" padding, cropping is not needed as long as the input size can be halved cleanly at every pooling step.
- Two 3x3 convolutions (ReLU activated).

Skip Connections: These are vital. They combine high-resolution features from the contracting path with the upsampled output of the expansive path, allowing the network to learn to make more precise localizations.

Final Layer: A 1x1 convolution maps the feature vector from the last decoder stage to the desired number of output classes. For binary segmentation (like salt/no salt or cell/not cell), this is 1 channel with a sigmoid activation.

Padding: The paper's original convolutions are unpadded, leading to a reduction in feature map size at each convolution. This necessitates cropping for skip connections. Many modern implementations (including zhixuhao/unet) use "same" padding in convolutions to maintain feature map dimensions within a block, simplifying skip connections.
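
The effect of the padding choice can be traced with plain arithmetic. The sketch below follows one spatial dimension through a depth-4 U-Net. The function name and return layout are made up for this illustration, and it assumes the upsampling step exactly doubles the size, as in Figure 1 of the paper.

```python
def trace_unet_sizes(input_size, padding, depth=4):
    """Follow one spatial dimension through a U-Net of the given depth.

    Returns (encoder_sizes, mismatches, output_size). mismatches[i] is how many
    pixels larger the encoder feature map is than the upsampled decoder map it
    is concatenated with, listed from the deepest skip connection upward.
    """
    if padding not in ("same", "valid"):
        raise ValueError("padding must be 'same' or 'valid'")
    # Two unpadded 3x3 convolutions lose 2 pixels each; padded ones lose none.
    loss_per_block = 0 if padding == "same" else 4

    size = input_size
    encoder_sizes = []
    for _ in range(depth):
        size -= loss_per_block       # two 3x3 convolutions
        encoder_sizes.append(size)   # feature map kept for the skip connection
        size //= 2                   # 2x2 max pooling, stride 2 (odd sizes are floored)
    size -= loss_per_block           # bottleneck convolutions

    mismatches = []
    for skip_size in reversed(encoder_sizes):
        size *= 2                    # 2x upsampling
        mismatches.append(skip_size - size)
        size -= loss_per_block       # two 3x3 convolutions after concatenation
    return encoder_sizes, mismatches, size

print(trace_unet_sizes(572, "valid"))  # ([568, 280, 136, 64], [8, 32, 80, 176], 388)
print(trace_unet_sizes(128, "same"))   # ([128, 64, 32, 16], [0, 0, 0, 0], 128)
print(trace_unet_sizes(101, "same"))   # ([101, 50, 25, 12], [0, 1, 2, 5], 96)
```

The first line reproduces the paper's 572 -> 388 tile sizes and shows why cropping is required there. The last line suggests one likely reason this notebook resizes the 101x101 images to 128x128: with "same" padding an odd input size leaves the skip connections mismatched.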

Dropout: The paper mentions dropout layers at the end of the contracting path (in the deeper, bottleneck layers).

Comparing model.py to the Paper's Architecture:


Observations and Alignment with Paper:

Overall Structure: The model.py code clearly implements the symmetric U-shape. It has 4 downsampling stages in the encoder and 4 upsampling stages in the decoder, with a bottleneck layer in between. This matches the depth of the U-Net shown in Figure 1 of the paper.

Contracting Path (Encoder - conv1 to pool4):

- Convolution Blocks: Each stage (conv1, conv2, conv3, conv4) consists of two Conv2D(filters, 3, activation='relu', padding='same', kernel_initializer='he_normal') layers. This aligns with the paper's "two 3x3 convolutions, each followed by ReLU".
- padding='same': This implementation uses "same" padding, meaning the output feature map size from the convolution is the same as the input (if stride is 1). This simplifies skip connections as cropping is generally not needed, unlike the original paper's unpadded convolutions.
- kernel_initializer='he_normal': This is a common and good practice for initializing weights in networks with ReLU activations, helping with training stability.
- Downsampling: Each block is followed by MaxPooling2D(pool_size=(2, 2)), which performs 2x2 max pooling and halves the spatial dimensions. Pooling leaves the channel count unchanged; the doubling described in the paper (64 -> 128 -> 256 -> 512) happens in the convolutions of the next stage.
- Dropout: Dropout(0.5) is applied after conv4 (before the last pooling) and after conv5 (the bottleneck). This is consistent with the paper's suggestion to use dropout in the deeper layers.
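
To make the downsampling step concrete, here is a minimal NumPy sketch of 2x2 max pooling with stride 2. It is an illustration only, not the Keras implementation, and max_pool_2x2 is a name invented for this example. It shows that pooling halves height and width while leaving the channel count alone, so any change in channels comes from the convolutions.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on a (height, width, channels) array."""
    h, w, c = x.shape
    # Drop a trailing odd row or column, as unpadded pooling would.
    x = x[: h - h % 2, : w - w % 2, :]
    # Group pixels into 2x2 windows, then take the maximum of each window.
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

feature_map = np.random.rand(128, 128, 64)   # e.g. a conv1-sized output at TARGET_SIZE
pooled = max_pool_2x2(feature_map)
print(feature_map.shape, "->", pooled.shape)  # (128, 128, 64) -> (64, 64, 64)
```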

Bottleneck (conv5, drop5): This is the deepest part of the network, connecting the encoder and decoder. It also has two 3x3 convolutions and a dropout layer. The number of feature channels here is 1024.

Expansive Path (Decoder - up6 to conv9 before output):

- Upsampling: Each stage (up6, up7, up8, up9) starts with an upsampling operation. The code uses UpSampling2D(size=(2,2)) followed by a Conv2D(filters, 2, activation='relu', padding='same', ...). This mirrors the paper's wording, which describes each step as an upsampling of the feature map followed by a 2x2 convolution ("up-convolution"). It plays the same role as a transposed convolution but is not the same operation: the upsampling itself is fixed, and only the Conv2D that follows is learned.
- Channel counts: The number of feature channels is halved at each upsampling stage (e.g., drop5 has 1024, up6 output has 512; conv6 has 512, up7 output has 256, etc., before concatenation).

Skip Connections: The concatenate([encoder_feature_map, upsampled_decoder_feature_map], axis=3) lines implement the skip connections.

- Example: merge6 = concatenate([drop4,up6], axis=3) takes the output of drop4 from the encoder (which has 512 channels) and concatenates it with the output of the upsampling block up6 (which also has 512 channels after the Conv2D). The result merge6 will have 1024 channels (512 from encoder + 512 from decoder path). This is exactly what the paper describes.
- axis=3: Concatenation happens along the channel axis.
- Convolution Blocks: After concatenation, each decoder stage has two Conv2D(filters, 3, activation='relu', padding='same', ...) layers, matching the paper. The number of filters in these convolutions matches the number of filters in the corresponding upsampling layer (e.g., conv6 uses 512 filters, conv7 uses 256, etc.).
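
The decoder pattern described above (upsample, 2x2 convolution, concatenate, two 3x3 convolutions) can be sketched as a toy one-level network. This is not the repository's model.py: the helper names conv_pair and decoder_stage, the 16x16 input and the 8/16 filter counts are assumptions chosen to keep the example small.

```python
from tensorflow.keras import Input, Model, layers

def conv_pair(x, filters):
    """Two 3x3 ReLU convolutions with 'same' padding, as in each U-Net stage."""
    for _ in range(2):
        x = layers.Conv2D(filters, 3, activation="relu", padding="same",
                          kernel_initializer="he_normal")(x)
    return x

def decoder_stage(x, skip, filters):
    """Upsample x, halve its channels, then merge it with the encoder feature map."""
    up = layers.UpSampling2D(size=(2, 2))(x)             # fixed 2x upsampling
    up = layers.Conv2D(filters, 2, activation="relu", padding="same",
                       kernel_initializer="he_normal")(up)  # learned 2x2 convolution
    merged = layers.concatenate([skip, up], axis=3)      # skip connection on the channel axis
    return conv_pair(merged, filters), merged

inputs = Input((16, 16, 1))                      # deliberately tiny input
enc = conv_pair(inputs, 8)                       # encoder stage: 8 channels
pooled = layers.MaxPooling2D(pool_size=(2, 2))(enc)
bottleneck = conv_pair(pooled, 16)               # channels double after pooling
dec, merged = decoder_stage(bottleneck, enc, 8)  # 8 (encoder) + 8 (decoder) channels merged
outputs = layers.Conv2D(1, 1, activation="sigmoid")(dec)
toy_model = Model(inputs, outputs)

print(tuple(merged.shape), toy_model.output_shape)
```

The merged tensor carries the encoder and decoder channels side by side, which is the same bookkeeping as the 512 + 512 = 1024 channels of merge6 in the full model.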

Output Layer (conv9's last part and conv10):

- The zhixuhao/unet implementation has an extra conv9 = Conv2D(2, 3, activation='relu', padding='same', ...) layer before the final 1x1 convolution. The paper's Figure 1 shows the last two 3x3 convolutions (our conv9 block) having 64 filters, and then a 1x1 convolution mapping these 64 features to the number of classes (e.g., 2 for their cell example, or 1 for a binary case with sigmoid).
- The conv10 = Conv2D(1, 1, activation='sigmoid')(conv9) is the final 1x1 convolution that maps the features (from the preceding 2-channel conv9 layer) to a single channel with a sigmoid activation. This is appropriate for binary segmentation (outputting a probability map for one class).
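
The size of this output head can be put in numbers with the standard parameter count for a 2D convolution (kernel height x kernel width x input channels x output channels, plus one bias per output channel). The helper below is a made-up illustration, and the "direct" head is a hypothetical alternative for comparison, not something in model.py.

```python
def conv2d_params(kernel_size, in_channels, out_channels):
    """Weights plus biases of a standard 2D convolution with a square kernel."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels

extra_conv = conv2d_params(3, 64, 2)    # Conv2D(2, 3) applied to the 64-channel conv9 output
final_conv = conv2d_params(1, 2, 1)     # Conv2D(1, 1) applied to the 2-channel output
direct_head = conv2d_params(1, 64, 1)   # hypothetical: a 1x1 conv straight from 64 channels

print(extra_conv, final_conv, extra_conv + final_conv)  # 1154 3 1157
print(direct_head)                                      # 65
```

Either way the head is tiny compared with the rest of the network, which is consistent with treating the extra layer as a minor implementation detail.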

Minor Differences/Implementation Choices:

- Padding: As noted, padding='same' is used, which is a common practical choice.
- Upsampling Implementation: UpSampling2D followed by Conv2D is a valid way to achieve learned upsampling. It serves the same purpose as a transposed convolution, although the two operations are not numerically identical.
- Extra Conv2D(2, ...): The Conv2D(2, ...) in the conv9 block just before the final Conv2D(1, ...) is a slight deviation from a direct mapping from 64 channels (output of the main conv9 block) to 1 channel. It introduces an intermediate 2-channel representation. This might not significantly impact performance but is an extra small step.

Summary of Alignment:

The model.py implementation is a faithful representation of the U-Net architecture described in the paper. It correctly implements:

- The U-shaped encoder-decoder structure.
- The repeated convolution blocks in both paths.
- Max pooling for downsampling and learned upsampling (UpSampling2D + Conv2D) for the decoder.
- The crucial skip connections via concatenation.
- Dropout in the deeper layers.
- A final 1x1 convolution with sigmoid activation for binary segmentation.

The use of "same" padding is a common practical adaptation. The small extra convolution at the end is a minor implementation detail of this specific version.