**Import Essential Libraries**

This cell **imports the libraries** used throughout the notebook for building deep learning models and processing images.
*   `tensorflow`: A library for numerical computation and large-scale machine learning.
*   `cv2` (OpenCV): A widely used library for computer vision tasks.
*   `numpy`: The standard array library for numerical computing.
*   `shutil`, `os`, `glob`: Standard-library modules for copying folders, handling paths, and finding files by pattern.

It also **prints the versions** of TensorFlow and OpenCV so the installed environment can be verified.

In [None]:
import tensorflow as tf
import cv2
import numpy as np
import shutil
import os
import glob

print(f"TensorFlow Version: {tf.__version__}")
print(f"OpenCV Version: {cv2.__version__}")

TensorFlow Version: 2.19.0
OpenCV Version: 4.12.0


**Load and Locate Data Files**

This section handles the initial data setup.
*   The code first **unzips the image and mask datasets** from your project folder into the `/content/data` directory, making the files accessible.
*   It then **defines the directory paths** for the training and validation images and masks.
*   Finally, it **finds and stores the file paths** for all individual image and mask files within those directories, preparing them for use in creating the data pipelines. The number of training and validation images found is printed to confirm that the step worked.
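
The "unzip only if the folder is missing" step described above can also be written in plain Python. The following is a minimal sketch that swaps the `!unzip` shell command for the standard-library `zipfile` module; the helper name `extract_once` and the demo archive are hypothetical and not part of the notebook.

```python
import os
import tempfile
import zipfile


def extract_once(zip_path, dest_dir, marker_subdir):
    """Extract zip_path into dest_dir unless dest_dir/marker_subdir already exists.

    Returns True if extraction ran, False if it was skipped.
    """
    marker = os.path.join(dest_dir, marker_subdir)
    if os.path.isdir(marker):
        return False
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
    return True


# Demo on a throwaway archive (hypothetical contents, not the real dataset).
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = os.path.join(tmp, "demo.zip")
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("gtFine/train/aachen/sample_gtFine_labelIds.png", b"")
    dest = os.path.join(tmp, "data")
    first_run = extract_once(demo_zip, dest, "gtFine")   # extracts
    second_run = extract_once(demo_zip, dest, "gtFine")  # skipped, folder exists
    print(first_run, second_run)  # prints: True False
```

In the notebook this would presumably be called with the archive path under `/content/project` and `/content/data` as the destination. Unlike the shell command, this version also runs outside Colab or IPython.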

In [None]:

# ✅ This is the full version of the training data and model. The lite version is in a later cell.

# Create the local data directory (no-op if it already exists)
os.makedirs('/content/data', exist_ok=True)

# --- 1. HANDLE MASKS (ZIP FILE) ---
# We still unzip the masks because they are in a .zip file
if not os.path.exists('/content/data/gtFine'):
    print("📂 Unzipping masks...")
    !unzip -q /content/project/gtFine_trainvaltest.zip -d /content/data
    print("✅ Masks unzipped.")
else:
    print("✅ Masks already present.")

# --- 2. HANDLE IMAGES (ZIP FILE) ---
# Skip unzipping if the image folder already exists, to save time
if not os.path.exists('/content/data/leftImg8bit'):
    print("📂 Unzipping images (leftImg8bit)... This might take 1-2 mins.")
    # Adjust filename if yours is slightly different
    !unzip -q /content/project/leftImg8bit_trainvaltest.zip -d /content/data
    print("✅ Images unzipped.")
else:
    print("✅ Images already present.")



In [None]:
# --- 3. DEFINE PATHS & GLOB ---

# Define Training Paths
TRAIN_IMG_DIR = '/content/data/leftImg8bit/train/'
TRAIN_MASK_DIR = '/content/data/gtFine/train/'

# Define Validation Paths
VAL_IMG_DIR = '/content/data/leftImg8bit/val/'
VAL_MASK_DIR = '/content/data/gtFine/val/'


In [None]:
# --- Get Training File Paths ---
# The '/*/*.png' part searches inside all the city subfolders
train_image_paths = sorted(glob.glob(f"{TRAIN_IMG_DIR}/*/*.png"))
train_mask_paths = sorted(glob.glob(f"{TRAIN_MASK_DIR}/*/*_gtFine_labelIds.png"))

val_image_paths = sorted(glob.glob(f"{VAL_IMG_DIR}/*/*.png"))
val_mask_paths = sorted(glob.glob(f"{VAL_MASK_DIR}/*/*_gtFine_labelIds.png"))

print(f"Found {len(train_image_paths)} training images.")
print(f"Found {len(val_image_paths)} validation images.")
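
The image and mask lists above are globbed and sorted independently, so they are only paired by position. The following is a minimal sketch of a check that each position really refers to the same frame. The helper names `frame_id` and `find_misaligned` and the demo paths are hypothetical; the two suffixes are the ones used elsewhere in this notebook.

```python
import os

IMG_SUFFIX = "_leftImg8bit.png"
MASK_SUFFIX = "_gtFine_labelIds.png"


def frame_id(path, suffix):
    """Return 'city/frame' for a file path, with the given suffix stripped."""
    city = os.path.basename(os.path.dirname(path))
    name = os.path.basename(path)
    if not name.endswith(suffix):
        raise ValueError(f"{path} does not end with {suffix}")
    return f"{city}/{name[:-len(suffix)]}"


def find_misaligned(image_paths, mask_paths):
    """Return the indices where image and mask refer to different frames."""
    if len(image_paths) != len(mask_paths):
        raise ValueError(f"{len(image_paths)} images vs {len(mask_paths)} masks")
    return [
        i for i, (img, mask) in enumerate(zip(image_paths, mask_paths))
        if frame_id(img, IMG_SUFFIX) != frame_id(mask, MASK_SUFFIX)
    ]


# Demo with made-up paths: the second pair points at different frames.
images = [
    "/data/leftImg8bit/train/aachen/aachen_000000_000019_leftImg8bit.png",
    "/data/leftImg8bit/train/aachen/aachen_000001_000019_leftImg8bit.png",
]
masks = [
    "/data/gtFine/train/aachen/aachen_000000_000019_gtFine_labelIds.png",
    "/data/gtFine/train/aachen/aachen_000002_000019_gtFine_labelIds.png",
]
print(find_misaligned(images, masks))  # prints: [1]
```

Calling `find_misaligned(train_image_paths, train_mask_paths)` after the glob step should return an empty list when every image has its mask in the same position.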

The cell below is the lite version used for the presentation. It loads only the cities listed in `CITIES_TO_LOAD`.

In [None]:
# --- CONFIGURATION: LITE MODE ---
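# Add more cities here if you want more data, or remove one to make it even smaller.
# Train-split options: 'aachen', 'bochum', 'bremen', 'cologne', 'dusseldorf', 'darmstadt', etc.
# Note: the Cityscapes val split uses different cities ('frankfurt', 'lindau', 'munster'),
# so add one of those to get any validation images.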

CITIES_TO_LOAD = ['aachen', 'bochum']

# Create the local data directory
if not os.path.exists('/content/data'):
    os.makedirs('/content/data')

# --- 1. HANDLE MASKS (Unzip All - It's small & fast) ---
if not os.path.exists('/content/data/gtFine'):
    print("📂 Unzipping masks (gtFine)...")
    !unzip -q /content/project/gtFine_trainvaltest.zip -d /content/data
    print("✅ Masks unzipped.")
else:
    print("✅ Masks already present.")

# --- 2. HANDLE IMAGES (LITE COPY) ---
# We iterate through our list and only copy those specific folders
drive_source_root = '/content/project/leftImg8bit_trainvaltest/leftImg8bit' # Check this path!
if not os.path.exists(drive_source_root):
    # Fallback if path is different
    drive_source_root = '/content/project/leftImg8bit'

local_img_root = '/content/data/leftImg8bit'

print(f"📂 Starting Lite Copy for cities: {CITIES_TO_LOAD}")

for split in ['train', 'val']:
    for city in CITIES_TO_LOAD:
        # Define source and destination for this specific city
        src_path = os.path.join(drive_source_root, split, city)
        dst_path = os.path.join(local_img_root, split, city)

        # Only copy if it exists in Drive and we haven't copied it yet
        if os.path.exists(src_path) and not os.path.exists(dst_path):
            print(f"   Copying {split}/{city}...")
            shutil.copytree(src_path, dst_path)
        elif os.path.exists(dst_path):
            print(f"   Skipping {split}/{city} (Already copied)")
        else:
            # A Cityscapes city belongs to only one split, so a missing folder here is expected
            pass

print("✅ Lite Copy Finished.")

# --- 3. DEFINE PATHS & GENERATE LISTS ---
TRAIN_IMG_DIR = '/content/data/leftImg8bit/train/'
VAL_IMG_DIR = '/content/data/leftImg8bit/val/'

# 1. Find all images we just copied
train_image_paths = sorted(glob.glob(f"{TRAIN_IMG_DIR}/*/*.png"))
val_image_paths = sorted(glob.glob(f"{VAL_IMG_DIR}/*/*.png"))

# 2. Derive the matching mask path from each image path
def get_matching_mask_path(img_path):
    # Swap the image folder for the mask folder
    mask_path = img_path.replace('/leftImg8bit/', '/gtFine/')
    # Swap the image suffix for the label-ID mask suffix
    mask_path = mask_path.replace('_leftImg8bit.png', '_gtFine_labelIds.png')
    return mask_path

train_mask_paths = [get_matching_mask_path(p) for p in train_image_paths]
val_mask_paths = [get_matching_mask_path(p) for p in val_image_paths]

print("\n--- DATA LOAD SUMMARY ---")
print(f"Training Images: {len(train_image_paths)}")
print(f"Training Masks:  {len(train_mask_paths)}")
print(f"Validation Images: {len(val_image_paths)}")
print(f"Validation Masks:  {len(val_mask_paths)}")

# Sanity check: confirm the first derived mask path exists on disk
if len(train_image_paths) > 0:
    if os.path.exists(train_mask_paths[0]):
        print(f"✅ SUCCESS! Found match: {train_mask_paths[0]}")
    else:
        print(f"❌ FAILED. Tried to find: {train_mask_paths[0]}")
else:
    print("⚠️ No images found. Check the copy step.")

✅ Masks already present.
📂 Starting Lite Copy for cities: ['aachen', 'bochum']
   Skipping train/aachen (Already copied)
   Skipping train/bochum (Already copied)
✅ Lite Copy Finished.

--- DATA LOAD SUMMARY ---
Training Images: 270
Training Masks:  270
Validation Images: 0
Validation Masks:  0
✅ SUCCESS! Found match: /content/data/gtFine/train/aachen/aachen_000000_000019_gtFine_labelIds.png
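
The summary above reports zero validation images. A likely cause is that both requested cities exist only under the `train` split; in the standard Cityscapes layout the `val` split contains `frankfurt`, `lindau`, and `munster`. The following is a minimal sketch for listing which city folders each split actually contains and which requested cities are absent. The helper names `cities_by_split` and `missing_requests` and the demo folder tree are hypothetical.

```python
import os
import tempfile


def cities_by_split(img_root, splits=("train", "val")):
    """Map each split name to the sorted city folders found under img_root/split."""
    result = {}
    for split in splits:
        split_dir = os.path.join(img_root, split)
        if os.path.isdir(split_dir):
            result[split] = sorted(
                name for name in os.listdir(split_dir)
                if os.path.isdir(os.path.join(split_dir, name))
            )
        else:
            result[split] = []
    return result


def missing_requests(requested, available):
    """Return, per split, the requested cities that are absent from that split."""
    return {
        split: [city for city in requested if city not in cities]
        for split, cities in available.items()
    }


# Demo on a throwaway folder tree that mimics the dataset layout.
with tempfile.TemporaryDirectory() as root:
    for split, city in [("train", "aachen"), ("train", "bochum"), ("val", "frankfurt")]:
        os.makedirs(os.path.join(root, split, city))
    available = cities_by_split(root)
    missing = missing_requests(["aachen", "bochum"], available)
    print(missing)  # prints: {'train': [], 'val': ['aachen', 'bochum']}
```

Running `cities_by_split` on the source image folder before the copy step would show which names can be added to `CITIES_TO_LOAD` to obtain a non-empty validation set.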
