# U-Net Model for Semantic Segmentation (with DenseNet121 Example)

Paper: https://arxiv.org/abs/2009.02805

This document describes the architecture and data flow of a U-Net model for semantic image segmentation, using DenseNet121 as the encoder. The model is implemented in PyTorch.

## 1. Introduction

Semantic segmentation is the task of assigning a class label to each pixel in an image. U-Net architectures suit this task because the encoder captures high-level context while skip connections carry fine-grained spatial detail to the decoder. This implementation uses a pre-trained DenseNet121 as the encoder, so training starts from previously learned features (transfer learning) rather than from random weights.
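
As a minimal illustration of per-pixel labeling, the sketch below picks the highest-scoring class at every pixel from a set of per-class score maps. The function name `argmax_per_pixel` and the toy 2x2 scores are assumptions made for this example; they are not part of the model code.

```python
def argmax_per_pixel(scores):
    """Map scores[c][y][x] to labels[y][x], the index of the best class per pixel."""
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


# Toy example: two classes on a 2x2 image (values are made up).
scores = [
    [[0.9, 0.2], [0.4, 0.1]],  # class 0 score map
    [[0.1, 0.8], [0.6, 0.9]],  # class 1 score map
]
labels = argmax_per_pixel(scores)
print(labels)  # [[0, 1], [1, 1]]
```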

## 2. Model Architecture

### 2.1 DenseNet-U-Net Model Flowchart

This flowchart visualizes the architecture and data flow of a DenseNet-U-Net model for image segmentation.

    
    graph TD
    A["Input: (batch_size, 3, 256, 256)"] --> B{"Encoder (DenseNet121)"}

    B --> C["Layer 0: conv0, norm0, relu0"]
    C --> D["Layer 1: pool0, denseblock1"]
    D --> E["Layer 2: transition1, denseblock2"]
    E --> F["Layer 3: transition2, denseblock3"]
    F --> G["Layer 4: transition3, denseblock4, norm5, relu5"]

    G --> H{"Bottleneck (ConvBottleneck)"}
    H --> I["Concatenate with Upsampled Decoder Output"]

    I --> J{"Decoder Stage 1"}
    J --> K["Upsample, Conv2d, ReLU"]
    K --> L["Concatenate with Encoder Layer 3 Output"]

    L --> M{"Decoder Stage 2"}
    M --> N["Upsample, Conv2d, ReLU"]
    N --> O["Concatenate with Encoder Layer 2 Output"]

    O --> P{"Decoder Stage 3"}
    P --> Q["Upsample, Conv2d, ReLU"]
    Q --> R["Concatenate with Encoder Layer 1 Output"]

    R --> S{"Decoder Stage 4 (Last Upsample)"}
    S --> T["Upsample, Conv2d, ReLU"]

    T --> U["Final Layer: 1x1 Conv"]
    U --> V["Output: (batch_size, num_classes, 256, 256)"]
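
To make the encoder half of the flowchart concrete, the sketch below tabulates the feature-map shape after each encoder stage. It assumes the standard DenseNet121 configuration (64 initial features, growth rate 32, dense blocks of 6, 12, 24, and 16 layers, each stage halving the resolution) and an input size divisible by 32. The `ENCODER_STAGES` table and `encoder_shapes` helper are illustrative names, not part of the model code.

```python
# (stage name from the flowchart, output channels, spatial stride of the stage)
# Channel counts follow the standard DenseNet121; they are an assumption about
# the encoder used here, not values read from this project's code.
ENCODER_STAGES = [
    ("Layer 0: conv0, norm0, relu0", 64, 2),
    ("Layer 1: pool0, denseblock1", 256, 2),
    ("Layer 2: transition1, denseblock2", 512, 2),
    ("Layer 3: transition2, denseblock3", 1024, 2),
    ("Layer 4: transition3, denseblock4, norm5, relu5", 1024, 2),
]


def encoder_shapes(height, width, batch_size=1):
    """Return (stage name, NCHW shape) pairs for an input of the given size."""
    shapes = []
    for name, channels, stride in ENCODER_STAGES:
        height //= stride
        width //= stride
        shapes.append((name, (batch_size, channels, height, width)))
    return shapes


for name, shape in encoder_shapes(256, 256):
    print(f"{name:50s} -> {shape}")
```

Under these assumptions a 256x256 input reaches the bottleneck at 8x8, and each 2x upsample in the decoder brings the feature map to the resolution of the next-shallower encoder stage, which is what allows the concatenations in the flowchart to line up.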

In [10]:
from google.colab import drive
import os

drive.mount('/content/drive')
os.chdir("/content/drive/My Drive/kaggle/img-classif/unet_pipeline")

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [4]:
!pip install -r ../requirements.txt



In [35]:
!ls experiments/albunet_valid/results/result_top3.pkl

experiments/albunet_valid/results/result_top3.pkl


In [None]:
!python Inference.py experiments/albunet_valid/2nd_stage_inference.yaml


  check_for_updates()
{'DATA_DIRECTORY': '../input/dataset1024', 'SEED': 42, 'NUM_WORKERS': 4, 'DEVICE': 'cuda', 'BATCH_SIZE': 1, 'MODEL': {'PY': 'models.ternausnets', 'CLASS': 'AlbuNet', 'ARGS': {'pretrained': False}}, 'CHECKPOINTS': {'FULL_FOLDER': 'checkpoints', 'BEST_FOLDER': 'checkpoints', 'PIPELINE_PATH': 'experiments/albunet_valid', 'PIPELINE_NAME': 'albunet_1024'}, 'SUBMIT_BEST': False, 'USEFOLDS': [1], 'SELECTED_CHECKPOINTS': {'fold1': [7]}, 'TEST_TRANSFORMS': 'transforms/valid_transforms_1024_old.json', 'FLIP': False, 'RESULT': 'result_top3.pkl', 'RESULT_FOLDER': 'experiments/albunet_valid/results'}
The directory 'experiments/albunet_valid/results' is writable.
  return cls(**args)
experiments/albunet_valid/checkpoints/fold1/albunet_1024_fold1_epoch7.pth
  model.load_state_dict(torch.load(checkpoint_path))
100% 3205/3205 [11:34<00:00,  4.62it/s]
Attempting to save results to: experiments/albunet_valid/results/result_top3.pkl
mask_dict contains 3205 entries.
Successfully saved

In [36]:
from google.colab import files

# Path to the results file, relative to the current working directory
results_path = 'experiments/albunet_valid/results/result_top3.pkl'

# Trigger a browser download of the file
files.download(results_path)


In [6]:
import pickle

file_path = "experiments/albunet_valid/results/result_top3.pkl"

with open(file_path, "rb") as f:
    data = pickle.load(f)

print(type(data))  # Check the type of stored data
print(len(data))   # If it's a list or dict, check its length

FileNotFoundError: [Errno 2] No such file or directory: 'experiments/albunet_valid/results/result_top3.pkl'
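
The path in the cell above is relative, so it only resolves when the working directory is the `unet_pipeline` folder; one likely cause of the error is that the cell ran before the `os.chdir` call. A minimal sketch of a loader that reports the resolved path and the current directory when the file is missing follows. The helper name `load_results` is an assumption made for this example.

```python
import os
import pickle


def load_results(file_path):
    """Load a pickled results object, reporting the resolved path if it is missing."""
    absolute_path = os.path.abspath(file_path)
    if not os.path.isfile(absolute_path):
        raise FileNotFoundError(
            f"{absolute_path} not found (current directory: {os.getcwd()})"
        )
    with open(absolute_path, "rb") as f:
        return pickle.load(f)
```

It would be called as `data = load_results("experiments/albunet_valid/results/result_top3.pkl")` after changing into the pipeline directory.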

In [38]:
if isinstance(data, dict):
    print(f"Number of keys: {len(data)}")
    print(f"Sample keys: {list(data.keys())[:5]}")  # Print first 5 keys

    # Check the type and shape of the first value
    first_key = list(data.keys())[0]
    first_value = data[first_key]
    print(f"Type of first value: {type(first_value)}")

    if hasattr(first_value, 'shape'):
        print(f"Shape of first value: {first_value.shape}")

    # Check the data type of elements
    if hasattr(first_value, 'dtype'):
        print(f"Data type of first value: {first_value.dtype}")

Number of keys: 3205
Sample keys: ['ID_0011fe81e', 'ID_003206608', 'ID_004d6fbb6', 'ID_004d72c54', 'ID_00528aa0e']
Type of first value: <class 'numpy.ndarray'>
Shape of first value: (1024, 1024)
Data type of first value: float32


In [39]:
import torch
from models.ternausnets import AlbuNet  # Adjust this if needed

# Initialize model (ensure args match your config)
model = AlbuNet(pretrained=False)

# Count parameters
total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f"Total parameters: {total_params:,}")
print(f"Trainable parameters: {trainable_params:,}")



Total parameters: 35,117,897
Trainable parameters: 35,117,897


In [40]:
import os

torch.save(model.state_dict(), "temp_model.pth")
model_size_mb = os.path.getsize("temp_model.pth") / (1024 * 1024)
print(f"Model size: {model_size_mb:.2f} MB")

Model size: 134.13 MB
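
The reported file size can be sanity-checked against the parameter count. The sketch below assumes every parameter is stored as a 4-byte float32, which is the PyTorch default; the helper name `fp32_size_mb` is made up for this example.

```python
BYTES_PER_FLOAT32 = 4


def fp32_size_mb(num_parameters):
    """Storage needed for float32 parameters, in MiB (1024 * 1024 bytes)."""
    return num_parameters * BYTES_PER_FLOAT32 / (1024 * 1024)


estimate = fp32_size_mb(35_117_897)
print(f"Estimated parameter storage: {estimate:.2f} MB")  # 133.96 MB
```

The estimate of 133.96 MB is slightly below the 134.13 MB on disk. The difference is plausibly explained by non-parameter buffers in the state dict (for example batch-norm running statistics) and serialization overhead, although this was not measured here.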


In [3]:
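import os

import cv2
import matplotlib.pyplot as plt
import numpy as np


def visualize_prediction(image_path, mask, alpha=0.5):
    """Overlay mask on the original image for visualization."""
    img = cv2.imread(image_path, cv2.IMREAD_COLOR)
    if img is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # Convert to RGB for matplotlib
    # Scale the mask to [0, 255], assuming its values lie in [0, 1]
    mask = (np.clip(mask, 0, 1) * 255).astype(np.uint8)

    # Convert grayscale mask to color; applyColorMap returns BGR, so convert to RGB to match img
    mask_colored = cv2.applyColorMap(mask, cv2.COLORMAP_JET)
    mask_colored = cv2.cvtColor(mask_colored, cv2.COLOR_BGR2RGB)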

    # Blend the image and mask
    overlay = cv2.addWeighted(img, 1 - alpha, mask_colored, alpha, 0)

    # Plot
    plt.figure(figsize=(10, 5))
    plt.subplot(1, 3, 1)
    plt.imshow(img)
    plt.title("Original Image")
    plt.axis("off")

    plt.subplot(1, 3, 2)
    plt.imshow(mask, cmap="gray")
    plt.title("Predicted Mask")
    plt.axis("off")

    plt.subplot(1, 3, 3)
    plt.imshow(overlay)
    plt.title("Overlay")
    plt.axis("off")

    plt.show()

# Example visualization for a few images
image_folder = "../input/dataset1024/test"  # Adjust path accordingly

for i, (image_id, mask) in enumerate(data.items()):
    image_path = os.path.join(image_folder, f"{image_id}.jpg")  # Adjust extension if needed
    if os.path.exists(image_path):
        print(f"Visualizing {image_id}")
        visualize_prediction(image_path, mask)
    else:
        print(f"Image {image_id} not found.")

    if i == 4:  # Show only first 5 images
        break

NameError: name 'data' is not defined

In [None]:
!python Train.py experiments/albunet_valid/train_config_part0.yaml1

Traceback (most recent call last):
  File "/content/drive/MyDrive/kaggle/img-classif/unet_pipeline/Train.py", line 190, in <module>
    main()
  File "/content/drive/MyDrive/kaggle/img-classif/unet_pipeline/Train.py", line 111, in main
    train_config = load_yaml(config_folder)
                   ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/content/drive/MyDrive/kaggle/img-classif/unet_pipeline/utils/helpers.py", line 11, in load_yaml
    with open(file_name, 'r') as stream:
         ^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'experiments/albunet_valid/train_config_part0.yaml1'
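
The failure above comes from the stray `1` at the end of the config path (`.yaml1`). A small pre-flight check can surface this kind of typo before the training script starts. The sketch below is an assumption-laden example: `check_config_path` is a hypothetical helper, not part of `Train.py` or `utils/helpers.py`.

```python
from pathlib import Path


def check_config_path(config_path):
    """Validate a YAML config path before handing it to the training script."""
    path = Path(config_path)
    if path.suffix not in {".yaml", ".yml"}:
        raise ValueError(
            f"unexpected config extension {path.suffix!r} in {config_path}"
        )
    if not path.is_file():
        raise FileNotFoundError(f"config file not found: {path.resolve()}")
    return path
```

Calling it with `experiments/albunet_valid/train_config_part0.yaml1` raises a `ValueError` that names the `.yaml1` extension, which points at the typo more directly than the traceback does.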


In [None]:
!python Train.py experiments/albunet_valid/train_config_part1.yaml

Streaming output truncated to the last 5000 lines.
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
score: 0.78953 on (0.75, 1000, 0.3):  42% 449/1068 [03:16<04:29,  2.30it/s]
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH a
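
The warning reports model outputs in channels-first layout `(N, C, H, W)` and targets in channels-last layout `(N, H, W, C)`. The sketch below works on shape tuples only, so it runs without PyTorch. It shows the axis order that aligns the two layouts and what plain broadcasting would do with the unaligned shapes. The helper names are assumptions for this example, and how `Train.py` actually handles the mismatch is not shown in this notebook.

```python
NHWC_TO_NCHW = (0, 3, 1, 2)  # move the trailing channel axis to position 1


def permute_shape(shape, order):
    """Shape that results from reordering axes, as tensor.permute(*order) would."""
    return tuple(shape[axis] for axis in order)


def broadcast_shape(a, b):
    """Shape produced by NumPy/PyTorch-style broadcasting of two equal-rank shapes."""
    result = []
    for x, y in zip(a, b):
        if x != y and 1 not in (x, y):
            raise ValueError(f"shapes {a} and {b} do not broadcast")
        result.append(max(x, y))
    return tuple(result)


outputs_shape = (2, 1, 1024, 1024)
targets_shape = (2, 1024, 1024, 1)

# Reordering the target axes makes the two shapes agree.
assert permute_shape(targets_shape, NHWC_TO_NCHW) == outputs_shape

# Without the reordering, an elementwise operation would not fail; it would
# silently broadcast to a much larger tensor.
print(broadcast_shape(outputs_shape, targets_shape))  # (2, 1024, 1024, 1024)
```

With PyTorch tensors the equivalent reordering would be `targets.permute(0, 3, 1, 2)`, applied before the loss or metric is computed.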

In [None]:
!python train/Train.py experiments/albunet_valid/train_config_part2.yaml

Streaming output truncated to the last 5000 lines.
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
score: 0.81415 on (0.6, 3000, 0.3):  42% 445/1069 [03:04<04:17,  2.43it/s]
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH af

In [None]:
!python train/Train.py experiments/albunet_valid/train_config_2nd_stage.yaml

Streaming output truncated to the last 5000 lines.
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
score: 0.77815 on (0.75, 1000, 0.3):  42% 444/1068 [03:12<04:27,  2.34it/s]
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH after fixing: outputs torch.Size([2, 1, 1024, 1024]) vs targets torch.Size([2, 1024, 1024, 1])
🚨 MISMATCH a

In [None]:
!pip install "ray[default]==2.40.0"

Traceback (most recent call last):
  File "/usr/local/bin/pip3", line 5, in <module>
    from pip._internal.cli.main import main
ModuleNotFoundError: No module named 'pip'


In [None]:
!python --version

Python 3.12.9


In [None]:
# Update package lists
!sudo apt-get update

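# Install Python 3.12 and its development libraries (-y answers the confirmation prompt)
!sudo apt-get install -y python3.12 python3.12-dev

# Install venv support, then bootstrap pip for Python 3.12
# (distutils was removed from the standard library in Python 3.12, so no distutils package is installed)
!sudo apt-get install -y python3.12-venv
!wget https://bootstrap.pypa.io/get-pip.py
!python3.12 get-pip.py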

# Set Python 3.12 as the default alternative
!sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.12 1
!sudo update-alternatives --set python3 /usr/bin/python3.12

# Ensure pip3.12 is installed and is the correct version
!python3.12 -m pip --version

Hit:1 http://security.ubuntu.com/ubuntu jammy-security InRelease
Hit:2 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease
Hit:3 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease
Hit:4 http://archive.ubuntu.com/ubuntu jammy InRelease
Hit:5 http://archive.ubuntu.com/ubuntu jam

In [None]:
!python /content/test.py

Traceback (most recent call last):
  File "/content/test.py", line 1, in <module>
    import ray
ModuleNotFoundError: No module named 'ray'


In [None]:
!python --version

Python 3.12.9


In [None]:
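import subprocess

import ray

# Connect to the remote cluster through Ray Client
ray.init(address="ray://34.94.234.102:6379")

@ray.remote
def train():
    # Run the training script on a worker and return its exit code
    result = subprocess.run(
        ["python", "Train.py", "experiments/albunet_valid/train_config_part0.yaml"]
    )
    return result.returncode

# ray.get blocks until the task finishes; train.remote() alone only schedules it
print(ray.get(train.remote()))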

2025-02-10 05:22:09,758	INFO client_builder.py:244 -- Passing the following kwargs to ray.init() on the server: log_to_driver


ConnectionError: ray client connection timeout
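
One possible cause of this timeout is the port. By default the Ray Client server listens on port 10001, while 6379 is the default port of the head node itself (the one used by `ray start --head`). Whether this cluster uses the defaults is not known from the notebook, and a firewall rule could produce the same symptom. The sketch below only inspects the address string; `check_ray_client_address` is a hypothetical helper, not a Ray API.

```python
from urllib.parse import urlparse

RAY_CLIENT_DEFAULT_PORT = 10001  # default Ray Client server port
RAY_HEAD_DEFAULT_PORT = 6379     # default head-node port for `ray start --head`


def check_ray_client_address(address):
    """Return a warning string for a suspicious Ray Client address, else None."""
    parsed = urlparse(address)
    if parsed.scheme != "ray":
        return f"{address}: not a Ray Client address (expected ray://host:port)"
    if parsed.port == RAY_HEAD_DEFAULT_PORT:
        return (
            f"{address}: port {RAY_HEAD_DEFAULT_PORT} is the default head-node port; "
            f"Ray Client usually listens on {RAY_CLIENT_DEFAULT_PORT}"
        )
    return None


print(check_ray_client_address("ray://34.94.234.102:6379"))
```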