# **IEEE Bigdata Cup 2024: Building extraction**

**Author:** [Yi-Jie Wong](https://www.linkedin.com/in/wongyijie/)<br>
**Challenge link:** [Kaggle](https://www.kaggle.com/competitions/building-extraction-generalization-2024/leaderboard)<br>
**Date created:** 2024/07/10<br>
**Last modified:** 2024/09/12<br>
**Description:** Cross-City Generalizability of Instance Segmentation Model in a Nationwide Building Extraction Task

## **Step 1: Setup Dependencies and Dataset (will auto restart session once complete)**

In [None]:
# clone this repo
!git clone https://github.com/yjwong1999/RSBuildingExtraction.git

In [None]:
# install ultralytics
%pip install ultralytics==8.1

In [None]:
!pip3 install pycocotools requests click

In [None]:
!pip install opendatasets

In [None]:
# https://github.com/opengeos/leafmap/blob/e35e0a75a125614244e5913755c50ec4f307bcab/docs/notebooks/74_map_tiles_to_geotiff.ipynb#L7
# require reload after installation

%pip install -U leafmap
!pip install mercantile

In [None]:
!pip install geomet==1.1.0

In [None]:
# !pip install geopandas

In [None]:
# Restart the Colab session so the freshly installed packages are picked up

import os
os.kill(os.getpid(), 9)


## **Step 2: Download and Setup the Dataset**

In [None]:
# Download the IEEE BEGC 2024 dataset

%cd RSBuildingExtraction

import opendatasets as od

od.download("https://www.kaggle.com/competitions/building-extraction-generalization-2024/data")

%cd ../

In [None]:
# Setup the IEEE BEGC2024 dataset into the necessary format

%cd RSBuildingExtraction

# run the code
!python setup_data.py

%cd ../

## **Step 3: Extract from Microsoft Building Footprint (BF) Dataset (OPTIONAL)**


### Step 3.1 - Define our area of interest (AOI)

We define our area of interest (AOI) as a GeoJSON geometry, then use the `shapely` library to get its bounding box.<br>
Go to [https://geojson.io](https://geojson.io) and find your preferred AOI. Draw a box around it to obtain the coordinates of the AOI region.<br>
Please make sure the selected AOI is covered by the [Microsoft Building Footprint dataset](https://github.com/microsoft/GlobalMLBuildingFootprints/), as not all regions are covered.<br>
We provide the AOIs we used for Redmond, Washington and Las Vegas, Nevada.<br>
However, we recommend using the Las Vegas AOI, which is better for training.

**Note**: the coordinate reference system is EPSG:4326. Coordinates are expressed in (long, lat) order.
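
Because the ordering is (long, lat), the x bounds are longitudes and the y bounds are latitudes. As a minimal sketch, independent of `shapely`, the bounding box of a GeoJSON ring can be computed by hand. `ring_bounds` is a hypothetical helper introduced only for illustration; the ring is the Las Vegas AOI used in this notebook.

```python
def ring_bounds(ring):
    """Return (minx, miny, maxx, maxy) for a GeoJSON ring of [lon, lat] pairs."""
    lons = [pt[0] for pt in ring]  # first element of each pair is the longitude
    lats = [pt[1] for pt in ring]  # second element is the latitude
    return min(lons), min(lats), max(lons), max(lats)


# Outer ring of the Las Vegas AOI (longitude first, latitude second)
las_vegas_ring = [
    [-115.31432742408262, 36.27297250862463],
    [-115.31432742408262, 36.00372747612303],
    [-114.98257204779121, 36.00372747612303],
    [-114.98257204779121, 36.27297250862463],
    [-115.31432742408262, 36.27297250862463],
]

minx, miny, maxx, maxy = ring_bounds(las_vegas_ring)
print(minx, miny, maxx, maxy)
```

The result has the same (minx, miny, maxx, maxy) layout as the `bounds` attribute of a `shapely` geometry.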

In [None]:
import leafmap
import pandas as pd
import geopandas as gpd
from shapely import geometry
import mercantile
from tqdm import tqdm
import os, shutil
import tempfile
from PIL import Image
import numpy as np

from IPython.display import clear_output

In [None]:
# # The Selected AOI is around Redmond, Washington

# # Geometry copied from https://geojson.io
# aoi_geom = {
#     "coordinates": [
#         [
#             [-122.16484503187519, 47.69090474454916],
#             [-122.16484503187519, 47.6217555345674],
#             [-122.06529607517405, 47.6217555345674],
#             [-122.06529607517405, 47.69090474454916],
#             [-122.16484503187519, 47.69090474454916],
#         ]
#     ],
#     "type": "Polygon",
# }
# aoi_shape = geometry.shape(aoi_geom)
# minx, miny, maxx, maxy = aoi_shape.bounds

In [None]:
# The Selected AOI is around Las Vegas, Nevada (recommended)

# Geometry copied from https://geojson.io
aoi_geom = {
    "coordinates": [
        [
            [-115.31432742408262, 36.27297250862463],
            [-115.31432742408262, 36.00372747612303],
            [-114.98257204779121, 36.00372747612303],
            [-114.98257204779121, 36.27297250862463],
            [-115.31432742408262, 36.27297250862463],
        ]
    ],
    "type": "Polygon",
}
aoi_shape = geometry.shape(aoi_geom)
minx, miny, maxx, maxy = aoi_shape.bounds

### Step 3.2 - Determine which tiles intersect our AOI

In [None]:
quad_keys = set()
for tile in list(mercantile.tiles(minx, miny, maxx, maxy, zooms=9)):
    quad_keys.add(mercantile.quadkey(tile))
quad_keys = list(quad_keys)
print(f"The input area spans {len(quad_keys)} tiles: {quad_keys}")
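
For readers curious about what happens inside `mercantile`, the following is a simplified sketch of the Web Mercator tile and quadkey arithmetic. It is not `mercantile`'s implementation (which also handles edge cases at tile boundaries), and the function names `lonlat_to_tile`, `quadkey_from_tile` and `quadkeys_for_bbox` are hypothetical.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) Web Mercator tile that contains a lon/lat point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that points on the far edge stay inside the tile grid
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def quadkey_from_tile(x, y, zoom):
    """Interleave the bits of x and y into a base-4 quadkey string."""
    digits = []
    for i in range(zoom, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if x & mask:
            digit += 1
        if y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


def quadkeys_for_bbox(minx, miny, maxx, maxy, zoom):
    """Quadkeys of all tiles between the top-left and bottom-right corner tiles."""
    x0, y0 = lonlat_to_tile(minx, maxy, zoom)  # top-left corner (tile y grows southward)
    x1, y1 = lonlat_to_tile(maxx, miny, zoom)  # bottom-right corner
    return [
        quadkey_from_tile(x, y, zoom)
        for x in range(x0, x1 + 1)
        for y in range(y0, y1 + 1)
    ]


keys = quadkeys_for_bbox(
    -115.31432742408262, 36.00372747612303,
    -114.98257204779121, 36.27297250862463,
    zoom=9,
)
print(keys)
```

Each quadkey has one digit per zoom level, which is why the `QuadKey` column of the dataset index can be matched against zoom-9 keys as plain strings.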

### Step 3.3 - Download the building footprints for each tile that intersects our AOI and crop the results

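This is where most of the magic happens. We download all the building footprints for each tile that intersects our AOI. Note that the cells below keep every footprint in those tiles; they do not filter down to the footprints _contained_ by the AOI, so the region tiled in the later steps follows the extent of the downloaded footprints.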

*Note*: this step might take a while depending on how many tiles your AOI covers and how many building footprints are in those tiles.
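
If only the footprints lying fully inside the AOI are wanted, a containment test is enough. The sketch below assumes a rectangular AOI, in which case a polygon is contained exactly when its bounding box is. The helper names and the two sample footprints are made up for illustration.

```python
def polygon_bounds(coords):
    """Return (minx, miny, maxx, maxy) of a list of (lon, lat) vertices."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


def bbox_contains(outer, inner):
    """True if the inner box lies fully inside the outer box."""
    ominx, ominy, omaxx, omaxy = outer
    iminx, iminy, imaxx, imaxy = inner
    return ominx <= iminx and ominy <= iminy and imaxx <= omaxx and imaxy <= omaxy


# Bounds of the Las Vegas AOI used in this notebook
aoi = (-115.31432742408262, 36.00372747612303, -114.98257204779121, 36.27297250862463)

# Two hypothetical footprints: one inside the AOI, one straddling its western edge
footprints = {
    "inside": [(-115.2000, 36.1000), (-115.1999, 36.1000), (-115.1999, 36.1001), (-115.2000, 36.1001)],
    "straddling": [(-115.3144, 36.1000), (-115.3142, 36.1000), (-115.3142, 36.1001), (-115.3144, 36.1001)],
}

kept = [name for name, coords in footprints.items() if bbox_contains(aoi, polygon_bounds(coords))]
print(kept)
```

For a non-rectangular AOI a true polygon-in-polygon test would be needed instead of this bounding-box shortcut.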

In [None]:
df = pd.read_csv(
    "https://minedbuildings.blob.core.windows.net/global-buildings/dataset-links.csv", dtype=str
)
df.head()

In [None]:
# create an empty dataframe
df_poly = pd.DataFrame()

# Obtain polygons for each tile that intersects the input geometry
for quad_key in tqdm(quad_keys):
    rows = df[df["QuadKey"] == quad_key]
    if rows.shape[0] == 1:
        url = rows.iloc[0]["Url"]

        df2 = pd.read_json(url, lines=True)
        df2["geometry"] = df2["geometry"].apply(geometry.shape)
        df_poly = pd.concat([df_poly, df2], ignore_index=True)

    elif rows.shape[0] > 1:
        raise ValueError(f"Multiple rows found for QuadKey: {quad_key}! We are not sure how to use such data, so feel free to contribute!")
    else:
        raise ValueError(f"QuadKey not found in dataset: {quad_key}")

### Step 3.4 - Get the outer bbox of AOI

In [None]:
df_poly_geometry = df_poly['geometry']
gdf = gpd.GeoDataFrame(df_poly_geometry, crs="EPSG:4326")
_, _, maxx, maxy = gdf.bounds.max()
minx, miny, _, _ = gdf.bounds.min()
outer_aoi_bbox = [minx, miny, maxx, maxy]
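
The outer bounding box is simply the union of the per-footprint bounds: the smallest `minx` and `miny` together with the largest `maxx` and `maxy`. A minimal pure-Python sketch of the same idea follows; `outer_bbox` is a hypothetical helper and the sample bounds are made up.

```python
def outer_bbox(bounds_list):
    """Union of (minx, miny, maxx, maxy) boxes as [minx, miny, maxx, maxy]."""
    minxs, minys, maxxs, maxys = zip(*bounds_list)
    return [min(minxs), min(minys), max(maxxs), max(maxys)]


# Hypothetical per-footprint bounds
sample_bounds = [
    (0.0, 0.0, 1.0, 1.0),
    (2.0, -1.0, 3.0, 0.5),
]
print(outer_bbox(sample_bounds))
```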

### Step 3.5 - Crop AOI into tiles and obtain outer bbox of each tile

In [None]:
def cropAOI(outer_aoi_bbox, step):
    minx, miny, maxx, maxy = outer_aoi_bbox
    maxx = (maxx - minx)//step *step + minx
    maxy = (maxy - miny)//step *step + miny
    outer_aoi_bbox = minx, miny, maxx, maxy

    aoi_bbox_list = []
    # handle large image situation, crop into tiles
    if (maxx - minx) > step or (maxy - miny) > step:
        new_minx, new_maxy = minx, maxy

        num_x_tiles = int((maxx - minx)//step)
        num_y_tiles = int((maxy - miny)//step)

        print(f'Number of x tiles: {num_x_tiles}')
        print(f'Number of y tiles: {num_y_tiles}')

        print(f'\nTotal number of tiles: {num_x_tiles*num_y_tiles}')
        for i in range(num_y_tiles):
            new_miny = new_maxy - step
            for j in range(num_x_tiles):
                new_maxx = new_minx + step

                aoi_bbox = [new_minx, new_miny, new_maxx, new_maxy]
                aoi_bbox_list.append(aoi_bbox)

                new_minx = new_maxx

            #     break

            new_minx = minx
            new_maxy = new_miny

            # break

    return aoi_bbox_list, outer_aoi_bbox

In [None]:
step = 0.0009
aoi_bbox_list, outer_aoi_bbox = cropAOI(outer_aoi_bbox, step)
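
A step of 0.0009 degrees corresponds to roughly 100 m in latitude. `cropAOI` lays the tiles out row by row, starting from the top-left corner. The sketch below shows the same row-major layout in isolation; it derives every tile from its integer row and column instead of accumulating coordinates, which is a design choice of this sketch rather than of the notebook. `make_grid` and `tile_index` are hypothetical names.

```python
def make_grid(minx, miny, step, num_x_tiles, num_y_tiles):
    """Row-major list of [minx, miny, maxx, maxy] tiles, starting at the top-left."""
    maxy = miny + num_y_tiles * step
    tiles = []
    for row in range(num_y_tiles):          # row 0 is the northernmost row
        tile_maxy = maxy - row * step
        for col in range(num_x_tiles):      # col 0 is the westernmost column
            tile_minx = minx + col * step
            tiles.append([tile_minx, tile_maxy - step, tile_minx + step, tile_maxy])
    return tiles


def tile_index(row, col, num_x_tiles):
    """Position of tile (row, col) in the row-major list."""
    return row * num_x_tiles + col


# Toy grid: 3 columns by 2 rows with a step of 1.0
grid = make_grid(0.0, 0.0, 1.0, num_x_tiles=3, num_y_tiles=2)
print(len(grid), grid[tile_index(1, 2, 3)])
```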

### Step 3.6 - Label each tile sequentially

In [None]:
def labelAOITile(outer_aoi_bbox, bounded_df, step):
    minx, miny, maxx, maxy = outer_aoi_bbox

    bounded_df['left_tile'] = (bounded_df['minx'] - minx)//step
    bounded_df['right_tile'] = (bounded_df['maxx'] - minx)//step + 1
    bounded_df['bottom_tile'] = (bounded_df['miny'] - miny)//step - 1
    bounded_df['top_tile'] = (bounded_df['maxy'] - miny)//step

    num_x_tiles = (maxx - minx)//step
    num_y_tiles = (maxy - miny)//step

    bounded_df['Tile'] = bounded_df.apply(lambda row: calculateTile(row, num_x_tiles, num_y_tiles, step), axis=1)
    # bounded_df = bounded_df.drop(['left_tile', 'right_tile', 'top_tile', 'bottom_tile'], axis=1)

    return bounded_df

def calculateTile(row, num_x_tiles, num_y_tiles, step):
    y_diff = int(row['top_tile'] - row['bottom_tile'])
    x_diff = int(row['right_tile'] - row['left_tile'])

    y_buffer = num_y_tiles - row['top_tile']

    tile_list = []
    for i in range(y_diff):
        for j in range(x_diff):
            tile_list.append(int(row['left_tile'] + j + (i + y_buffer)*num_x_tiles))

    return tile_list

In [None]:
bounded_df = labelAOITile(outer_aoi_bbox, gdf.bounds, step)
bounded_df.head()
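
A building that crosses a tile boundary has to be listed under every tile it touches, which is why the `Tile` column holds a list. The sketch below illustrates the idea for a row-major grid whose row 0 is the northernmost row. It is written independently of `labelAOITile` and `calculateTile`, so its index conventions are an assumption and may differ from theirs; `tiles_for_bounds` is a hypothetical name.

```python
def tiles_for_bounds(bounds, grid_minx, grid_miny, step, num_x_tiles, num_y_tiles):
    """Row-major indices of all grid tiles overlapped by a (minx, miny, maxx, maxy) box."""
    minx, miny, maxx, maxy = bounds
    left = int((minx - grid_minx) // step)
    right = int((maxx - grid_minx) // step)
    bottom = int((miny - grid_miny) // step)  # rows counted from the south edge
    top = int((maxy - grid_miny) // step)

    indices = []
    for row_from_south in range(top, bottom - 1, -1):
        row = num_y_tiles - 1 - row_from_south  # convert to row 0 = north
        for col in range(left, right + 1):
            # Skip anything that falls outside the grid
            if 0 <= row < num_y_tiles and 0 <= col < num_x_tiles:
                indices.append(row * num_x_tiles + col)
    return indices


# Toy grid: 3 columns by 2 rows, step 1.0, origin at (0, 0)
crossing = tiles_for_bounds((0.5, 0.5, 1.5, 1.5), 0.0, 0.0, 1.0, 3, 2)
single = tiles_for_bounds((2.2, 0.1, 2.8, 0.4), 0.0, 0.0, 1.0, 3, 2)
print(crossing, single)
```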

In [None]:
# outer_aoi_bbox

In [None]:
# bounded_df[bounded_df['Tile'].apply(lambda x: 262144 in x)].head()

### Step 3.7 - Normalise Polygon and save image with labels

In [None]:
df_poly = pd.concat([bounded_df, df_poly_geometry], axis=1)
df_poly.head()

In [None]:
def normaliseBbox(aoi_bbox_list, df_poly, step, start_idx=0, end_idx=None, shuffle=False):
    tile_list = df_poly['Tile'].to_list()
    tile_list_unpacked = []
    for sublist in tile_list:
        for item in sublist:
            tile_list_unpacked.append(item)
    tile_list = set(tile_list_unpacked)

    print(f'Number of tiles to be generated: {len(tile_list)}')
    print('\nGenerating tiles...')

    # convert the set back to a list so it can be shuffled and sliced
    tile_list = list(tile_list)

    # set np random seed and shuffle tile_list
    if shuffle:
        np.random.seed(0)
        np.random.shuffle(tile_list)

    tile_list = tile_list[start_idx:end_idx]  # keep only a slice of the tiles (handy for test runs)

    for tile in tqdm(tile_list):
        try:
            print(f'\n~ Tile {tile}')
            aoi_bbox = aoi_bbox_list[tile]
            cropped_minx, cropped_miny, cropped_maxx, cropped_maxy = aoi_bbox

            # df_tile = df_poly[df_poly['Tile'] == tile]
            # copy so the per-tile edits below do not write into a view of df_poly
            df_tile = df_poly[df_poly['Tile'].apply(lambda x: tile in x)].copy()
            for idx, row in df_tile.iterrows():
                polygon = row['geometry']
                # Extract coordinates and adjust them
                new_exterior = []
                for x, y in polygon.exterior.coords:
                    # Adjust x and y coordinates based on the cropped bounding box
                    new_x = (x - cropped_minx) / (cropped_maxx - cropped_minx)
                    new_y = (y - cropped_miny) / (cropped_maxy - cropped_miny)

                    if new_x < 0:
                        new_x = 0
                    elif new_x > 1:
                        new_x = 1

                    if new_y < 0:
                        new_y = 0
                    elif new_y > 1:
                        new_y = 1

                    # print(x - cropped_minx, x, cropped_minx, cropped_maxx, new_x)
                    # print(y - cropped_miny, y, cropped_miny, cropped_maxy, new_y)

                    new_exterior.append((new_x, new_y))

                # Create a new polygon with adjusted coordinates
                new_polygon = geometry.Polygon(new_exterior)

                # Update the 'geometry' column in the DataFrame
                df_tile.at[idx, 'geometry'] = new_polygon

            df_tile['geometry_str'] = df_tile['geometry'].astype(str)
            df_tile['geometry_str'] = df_tile['geometry_str'].str.lstrip('POLYGON ((').str.rstrip('))').str.replace(',', '')

            # save segmentation labels
            with open(f'{os.getcwd()}/labels/tile_{tile}.txt', 'w') as f:
                for item in df_tile['geometry_str']:
                    f.write(f'0 {item}\n')

            # save tif image
            output_path = f'{os.getcwd()}/images_tiff/tile_{tile}.tif'
            zoom = 20
            leafmap.map_tiles_to_geotiff(output_path, aoi_bbox, zoom=zoom, source='SATELLITE')

            clear_output()

            # print(df_tile.head())

        except Exception as e:
            print(f'Error occurred for tile {tile}: {e}')

In [None]:
image_dir_path = f'{os.getcwd()}/images_tiff'
label_dir_path = f'{os.getcwd()}/labels'

if os.path.exists(image_dir_path):
    shutil.rmtree(image_dir_path)
if os.path.exists(label_dir_path):
    shutil.rmtree(label_dir_path)

os.makedirs(image_dir_path)
os.makedirs(label_dir_path)

In [None]:
start_idx = 0
end_idx = 3000
shuffle = True
normaliseBbox(aoi_bbox_list, df_poly, step, start_idx, end_idx, shuffle)
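
The core of `normaliseBbox` is a per-vertex transformation: shift by the tile origin, divide by the tile size, and clamp to [0, 1]. The sketch below isolates that step and emits one label line in the `class x1 y1 x2 y2 ...` layout. The fixed six-decimal formatting and the names `clamp01` and `yolo_seg_line` are assumptions of this sketch; the notebook derives its strings from the polygon's text representation instead.

```python
def clamp01(value):
    """Clamp a number to the closed interval [0, 1]."""
    return min(max(value, 0.0), 1.0)


def yolo_seg_line(coords, tile_bbox, class_id=0):
    """Format polygon vertices as 'class x1 y1 x2 y2 ...' relative to a tile."""
    minx, miny, maxx, maxy = tile_bbox
    parts = [str(class_id)]
    for x, y in coords:
        # As in the notebook, y is measured from the tile's miny edge
        norm_x = clamp01((x - minx) / (maxx - minx))
        norm_y = clamp01((y - miny) / (maxy - miny))
        parts.append(f"{norm_x:.6f}")
        parts.append(f"{norm_y:.6f}")
    return " ".join(parts)


# Hypothetical tile and polygon; two vertices fall outside the tile and get clamped
tile_bbox = (10.0, 20.0, 12.0, 24.0)
polygon = [(10.5, 21.0), (11.5, 21.0), (11.5, 25.0), (9.0, 23.0)]
line = yolo_seg_line(polygon, tile_bbox)
print(line)
```

Clamping keeps buildings that straddle a tile edge in the label file, at the cost of flattening the part that lies outside the tile onto its border.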

In [None]:
# df_poly[df_poly['Tile'].apply(lambda x: 1 in x)].head()

### Step 3.8 - Convert Tiff to JPEG

In [None]:
def convert_tiff_to_jpeg(input_dir, output_dir):
    # check if output_dir exists, if not create it
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)

    for filename in os.listdir(input_dir):
        # check if file is an image (ends with .tif)
        if filename.endswith('.tif'):
            img = Image.open(os.path.join(input_dir, filename))
            img = img.transpose(Image.FLIP_TOP_BOTTOM)

            # check if image is RGB mode, if not convert it
            if img.mode != 'RGB':
                img = img.convert('RGB')

            # create new filename, replace .tif with .jpg
            output_filename = os.path.splitext(filename)[0] + '.jpg'

            # save the image in JPEG format
            img.save(os.path.join(output_dir, output_filename), 'JPEG')

    print("Conversion from TIFF to JPEG completed.")

In [None]:
if os.path.exists(f'{os.getcwd()}/images_jpeg'):
    shutil.rmtree(f'{os.getcwd()}/images_jpeg')

In [None]:
convert_tiff_to_jpeg(f'{os.getcwd()}/images_tiff/', f'{os.getcwd()}/images_jpeg/')

### Step 3.9 - Display Image

In [None]:
import numpy as np
from matplotlib.patches import Polygon
import matplotlib.pyplot as plt
from PIL import Image

i = 0
image_dir = f'{os.getcwd()}/images_tiff/'
img_name = sorted(os.listdir(image_dir))[i]
label_name = img_name.replace('tif', 'txt')

# Load the image
im = Image.open(f'{os.getcwd()}/images_tiff/{img_name}')
im = im.transpose(Image.FLIP_TOP_BOTTOM)
im_array = np.array(im)

# Create a figure and axes
fig, ax = plt.subplots()

# Display the image
ax.imshow(im_array)

width, height = im.size
print(f'Image width: {width}')
print(f'Image height: {height}')

# Read and plot the polygons from the tile's label file
with open(f'{os.getcwd()}/labels/{label_name}', 'r') as f:
    for line in f.readlines():
        coords_str = line.strip().split(' ')[1:]
        coords = [float(c) for c in coords_str]
        coords_x = coords[0::2]
        coords_x = [x * width for x in coords_x]

        coords_y = coords[1::2]
        coords_y = [y * height for y in coords_y]
        polygon = Polygon(list(zip(coords_x, coords_y)), closed=True, fill=False, edgecolor='r')
        ax.add_patch(polygon)

# Set the axis limits to match the image dimensions
ax.set_xlim(0, im_array.shape[1])
ax.set_ylim(im_array.shape[0], 0)  # Invert y-axis for image display

# Show the plot
plt.show()
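
The parsing done inside the plotting loop can also be written as a small standalone function, which makes it easier to check a label file without drawing anything. This is a sketch under the assumption that each line has the form `class x1 y1 x2 y2 ...` with normalised coordinates; `parse_yolo_seg_line` is a hypothetical name.

```python
def parse_yolo_seg_line(line, width, height):
    """Return (class_id, [(x_px, y_px), ...]) for one normalised label line."""
    fields = line.split()
    class_id = int(fields[0])
    values = [float(v) for v in fields[1:]]
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    # Even positions are x values, odd positions are y values
    points = [(x * width, y * height) for x, y in zip(values[0::2], values[1::2])]
    return class_id, points


class_id, points = parse_yolo_seg_line("0 0.25 0.5 0.75 1", width=640, height=480)
print(class_id, points)
```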

### Step 3.10 - Copy to `RSBuildingExtraction/mydata` (the BEGC2024 training set in YOLO format)

In [None]:
cwd = os.getcwd()
!cp -r {cwd}/images_jpeg/* {cwd}/RSBuildingExtraction/mydata/images/train
!cp -r {cwd}/labels/* {cwd}/RSBuildingExtraction/mydata/labels/train

In [None]:
x = os.listdir('/content/RSBuildingExtraction/mydata/images/train')
len(x)

## **Step 3 (Alternative): Download our Preloaded Redmond/Las Vegas Dataset (PREFERRED)**

In [None]:
import locale
locale.getpreferredencoding = lambda: "UTF-8"

In [None]:
# # Download our Redmond dataset
# !curl -L -o "Redmond.zip" "https://www.dropbox.com/scl/fi/9d9yw2d3z777iiequ7mmr/Redmond.zip?rlkey=l9noi6ffibkmag76isteh6is1&st=lxl5ahd1&dl=0"
# !unzip "Redmond.zip"

# # Copy to RSBuildingExtraction/mydata
# cwd = os.getcwd()
# !cp -r {cwd}/Redmond/images_jpeg/* {cwd}/RSBuildingExtraction/mydata/images/train
# !cp -r {cwd}/Redmond/labels/* {cwd}/RSBuildingExtraction/mydata/labels/train

In [None]:
# Download our Las Vegas dataset
!curl -L -o "LasVegas.zip" "https://www.dropbox.com/scl/fi/qaqo2lqd7x7gxh521f8ce/LasVegas.zip?rlkey=j3oute8e9ia9yoa1hc85fw2ev&st=h6agw363&dl=0"
!unzip "LasVegas.zip"

# Copy to RSBuildingExtraction/mydata
import os
cwd = os.getcwd()
!cp -r {cwd}/LasVegas/images_jpeg/* {cwd}/RSBuildingExtraction/mydata/images/train
!cp -r {cwd}/LasVegas/labels/* {cwd}/RSBuildingExtraction/mydata/labels/train

In [None]:
len(os.listdir('RSBuildingExtraction/mydata/images/train'))

In [None]:
len(os.listdir('RSBuildingExtraction/mydata/labels/train'))
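
Comparing the two counts only shows whether the totals agree. A slightly stronger check is to compare file stems, so that an image without a label (or the reverse) is reported by name. The sketch below demonstrates this on throwaway directories with made-up file names; `unpaired_stems` is a hypothetical helper.

```python
import os
import tempfile


def unpaired_stems(image_dir, label_dir):
    """Return (images without labels, labels without images) as sorted stem lists."""
    image_stems = {os.path.splitext(name)[0] for name in os.listdir(image_dir)}
    label_stems = {os.path.splitext(name)[0] for name in os.listdir(label_dir)}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)


# Demo on temporary directories so the sketch runs anywhere
with tempfile.TemporaryDirectory() as root:
    image_dir = os.path.join(root, "images")
    label_dir = os.path.join(root, "labels")
    os.makedirs(image_dir)
    os.makedirs(label_dir)
    for name in ("tile_1.jpg", "tile_2.jpg"):
        open(os.path.join(image_dir, name), "w").close()
    open(os.path.join(label_dir, "tile_1.txt"), "w").close()

    images_without_labels, labels_without_images = unpaired_stems(image_dir, label_dir)

print(images_without_labels, labels_without_images)
```

In the notebook the same function could be pointed at `RSBuildingExtraction/mydata/images/train` and `RSBuildingExtraction/mydata/labels/train`.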

## **Step 4: Training YOLOv8-seg**

In [None]:
from ultralytics import YOLO
import os, shutil

# yaml file of the dataset
yaml_file = "RSBuildingExtraction/mydata/data.yaml"

# use OBB pretrained YOLOv8 models for transfer learning
model = YOLO("yolov8m-seg.pt").load("yolov8m-obb.pt")

# Train the model with mixup augmentation enabled (no other augmentation settings are overridden here)
results = model.train(data=yaml_file, epochs=50, imgsz=640, plots=True, mixup=0.2)

In [None]:
# You can also download our pretrained weights here

import locale
locale.getpreferredencoding = lambda: "UTF-8"

# Fetch the YOLOv8m-seg weights trained with the Las Vegas dataset
!curl -L -o "yolov8m-seg_LasVegas.pt" "https://www.dropbox.com/scl/fi/cdrl62i3mx9p82lqwpik5/yolov8m-seg_LasVegas.pt?rlkey=8ao7a5zz7xnqfd74deffprix2&st=0k24i2xp&dl=0"

In [None]:
# Run the model on every image in RSBuildingExtraction/building-extraction-generalization-2024/test/image

import os
from ultralytics import YOLO

# Load the trained YOLOv8 model
# model = YOLO('runs/segment/train/weights/last.pt') # newly trained from scratch
model = YOLO('yolov8m-seg_LasVegas.pt') # our pretrained models

# Directory containing test images
test_image_dir = 'RSBuildingExtraction/building-extraction-generalization-2024/test/image'

# Decoding according to the .yaml file class names order
decoding_of_predictions ={0: 'building'}

# Iterate through images in the test directory
IDs = []
entries = []
for image_filename in sorted(os.listdir(test_image_dir)):
    # remove extension from image_filename
    ID = int(os.path.splitext(image_filename)[0])
    print(ID)

    image_path = os.path.join(test_image_dir, image_filename)

    # Perform prediction on the image
    results = model.predict(source=image_path, save=True, conf=0.2, imgsz=640, iou=0.9)

    # A single input image yields a single Results object
    r = results[0]
    confidences = r.boxes.conf.cpu().numpy().tolist()
    class_names = [decoding_of_predictions[int(c)] for c in r.boxes.cls.cpu().numpy().tolist()]

    # r.masks is None when nothing is detected
    masks = r.masks.xy if r.masks is not None else []

    # Check that masks, confidences and class names line up
    if len(masks) != len(confidences) or len(masks) != len(class_names):
        print("Error: Number of masks, confidences, and class names should be the same.")
        continue

    entry = []
    for m in masks:
        temp = []
        if len(m) <4:
            continue
        for xy in m:
            x, y = xy[0], xy[1]
            temp.append((int(x), int(y)))
        entry.append(temp)

    IDs.append(ID)
    entries.append(entry)

In [None]:
# Write the IDs and entries lists to a CSV with the columns ImageID and Coordinates

import csv

# Assuming you have the 'IDs' and 'entries' lists as defined in the previous code

# Create a list of dictionaries to store the data
data = []
for i in range(len(IDs)):
  data.append({'ImageID': IDs[i], 'Coordinates': entries[i]})

# Write the data to a CSV file
with open('output.csv', 'w', newline='') as csvfile:
  fieldnames = ['ImageID', 'Coordinates']
  writer = csv.DictWriter(csvfile, fieldnames=fieldnames)

  writer.writeheader()
  writer.writerows(data)
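
Since each `Coordinates` value is a nested Python list, `csv.DictWriter` stores its string representation in the cell. The sketch below shows that round trip with made-up data and an in-memory buffer, and how `ast.literal_eval` recovers the polygons when the file is read back. It only illustrates the serialisation and makes no claim about what the competition's scorer expects.

```python
import ast
import csv
import io

# Made-up predictions: image 0 has one polygon, image 1 has none
IDs = [0, 1]
entries = [[[(1, 2), (3, 4), (5, 6), (7, 8)]], []]

# Write to an in-memory buffer instead of output.csv
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=["ImageID", "Coordinates"])
writer.writeheader()
for image_id, polygons in zip(IDs, entries):
    writer.writerow({"ImageID": image_id, "Coordinates": polygons})

# Read it back: every cell comes back as a string
buffer.seek(0)
rows = list(csv.DictReader(buffer))
decoded = [ast.literal_eval(row["Coordinates"]) for row in rows]
print(rows[0]["Coordinates"])
print(decoded)
```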


In [None]:
# zip output

!zip output.zip output.csv


In [None]:
!kaggle competitions submit -c building-extraction-generalization-2024 -f "/content/output.csv" -m "YOLOv8m-seg with Las Vegas dataset"

## **Step 5: Visualization**
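
The first comparison in this step varies the `iou` argument of `model.predict`, which sets the IoU threshold for non-maximum suppression (NMS). As a generic illustration of why a higher threshold keeps more overlapping detections, here is a minimal box-level sketch of greedy NMS. It is not the Ultralytics implementation, and the boxes and scores are made up.

```python
def box_iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold):
    """Keep boxes in descending score order, dropping those that overlap a kept box too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(box_iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes plus one separate box
boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]

for threshold in (0.7, 0.8, 0.9):
    print(threshold, greedy_nms(boxes, scores, threshold))
```

With adjacent buildings that nearly touch, a permissive threshold such as 0.9 avoids merging neighbours into one detection, at the risk of keeping duplicates.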

In [None]:
import cv2
import matplotlib.pyplot as plt
from ultralytics import YOLO
import numpy as np

# Load the trained YOLOv8 model
# model = YOLO('runs/segment/train/weights/last.pt') # newly trained from scratch
model = YOLO('yolov8m-seg_LasVegas.pt') # our pretrained models

# index
idx = 998 # 90 # 50 # 510 # 0 # 1 # 257 # 998 # 600 # 700 # 701 # 705 # 751 # 754 # 125 # 15
idx_formatted = str(idx).zfill(4)

# path
test_img_path = f'RSBuildingExtraction/building-extraction-generalization-2024/test/image/{idx_formatted}.tif'
test_output_path = f'runs/segment/predict/{idx_formatted}.tif'

# Create a figure and axes
fig, axs = plt.subplots(1, 4, figsize=(20, 5))

# Show the original image in the first panel
img = plt.imread(test_img_path)
axs[0].imshow(img)
axs[0].set_title('Original Image')

# for loop different iou
for i, iou in enumerate([0.7, 0.8, 0.9]):
    # Perform prediction on the image
    results = model.predict(source=test_img_path, save=False, conf=0.2, imgsz=640, iou=iou)

    # Create binary mask
    b_mask = np.zeros(img.shape[:2], np.uint8)

    # masks is None when nothing is detected
    masks = results[0].masks.xy if results[0].masks is not None else []

    # loop all mask
    for m in masks:
        # if less than 4 points, ignore
        if len(m) <4:
            continue
        # get contour
        contour = m.astype(np.int32)
        contour = contour.reshape(-1, 1, 2)
        _ = cv2.drawContours(b_mask, [contour], -1, (255, 255, 255), cv2.FILLED)

    axs[i + 1].imshow(b_mask, cmap='gray')
    axs[i + 1].set_title(f'NMS IoU Threshold = {iou}')

# Remove axis ticks
for ax in axs:
    ax.set_xticks([])
    ax.set_yticks([])

# Show the plot
# plt.tight_layout()
plt.show()

In [None]:
import cv2
import matplotlib.pyplot as plt
from ultralytics import YOLO
import numpy as np

# index
idx = 998 # 90 # 50 # 510 # 0 # 1 # 257 # 998 # 600 # 700 # 701 # 705 # 751 # 754 # 125 # 15
idx_formatted = str(idx).zfill(4)

# path
test_img_path = f'RSBuildingExtraction/building-extraction-generalization-2024/test/image/{idx_formatted}.tif'
test_output_path = f'runs/segment/predict/{idx_formatted}.tif'

# Create a figure and axes
fig, axs = plt.subplots(1, 4, figsize=(20, 5))

# Show the original image in the first panel
img = plt.imread(test_img_path)
axs[0].imshow(img)
axs[0].set_title('Original Image')

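# loop over weights trained with different datasets
# (only yolov8m-seg_LasVegas.pt is downloaded in this notebook; the other weight files must be supplied separately)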
weights = ['yolov8m-seg_basic.pt', 'yolov8m-seg_Washington.pt', 'yolov8m-seg_LasVegas.pt']
names = ["YOLOv8m Basic", "YOLOv8m (w' Washington Dataset)", "YOLOv8m (w' Las Vegas Dataset)"]
for i, weight in enumerate(weights):
    # Load the trained YOLOv8 model
    model = YOLO(weight)

    # Perform prediction on the image
    results = model.predict(source=test_img_path, save=False, conf=0.2, imgsz=640, iou=0.9)

    # Create binary mask
    b_mask = np.zeros(img.shape[:2], np.uint8)

    # masks is None when nothing is detected
    masks = results[0].masks.xy if results[0].masks is not None else []

    # loop all mask
    for m in masks:
        # if less than 4 points, ignore
        if len(m) <4:
            continue
        # get contour
        contour = m.astype(np.int32)
        contour = contour.reshape(-1, 1, 2)
        _ = cv2.drawContours(b_mask, [contour], -1, (255, 255, 255), cv2.FILLED)

    axs[i + 1].imshow(b_mask, cmap='gray')
    axs[i + 1].set_title(f'{names[i]}')

# Remove axis ticks
for ax in axs:
    ax.set_xticks([])
    ax.set_yticks([])

# Show the plot
# plt.tight_layout()
plt.show()

In [None]:
import cv2
import matplotlib.pyplot as plt
from ultralytics import YOLO
import numpy as np

# index
idx = 998 # 90 # 50 # 510 # 0 # 1 # 257 # 998 # 600 # 700 # 701 # 705 # 751 # 754 # 125 # 15
idx_formatted = str(idx).zfill(4)

# path
test_img_path = f'RSBuildingExtraction/building-extraction-generalization-2024/test/image/{idx_formatted}.tif'
test_output_path = f'runs/segment/predict/{idx_formatted}.tif'

# Create a figure and axes
fig, axs = plt.subplots(1, 4, figsize=(20, 5))

# Show the original image in the first panel
img = plt.imread(test_img_path)
axs[0].imshow(img)
axs[0].set_title('Original Image')

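# loop over the different model sizes
# (these weight files are not downloaded in this notebook and must be supplied separately)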
weights = ['yolov8n-seg_basic.pt', 'yolov8s-seg_basic.pt', 'yolov8m-seg_basic.pt']
names = ["YOLOv8n-seg", "YOLOv8s-seg", "YOLOv8m-seg"]
for i, weight in enumerate(weights):
    # Load the trained YOLOv8 model
    model = YOLO(weight)

    # Perform prediction on the image
    results = model.predict(source=test_img_path, save=False, conf=0.2, imgsz=640, iou=0.9)

    # Create binary mask
    b_mask = np.zeros(img.shape[:2], np.uint8)

    # masks is None when nothing is detected
    masks = results[0].masks.xy if results[0].masks is not None else []

    # loop all mask
    for m in masks:
        # if less than 4 points, ignore
        if len(m) <4:
            continue
        # get contour
        contour = m.astype(np.int32)
        contour = contour.reshape(-1, 1, 2)
        _ = cv2.drawContours(b_mask, [contour], -1, (255, 255, 255), cv2.FILLED)

    axs[i + 1].imshow(b_mask, cmap='gray')
    axs[i + 1].set_title(f'{names[i]}')

# Remove axis ticks
for ax in axs:
    ax.set_xticks([])
    ax.set_yticks([])

# Show the plot
# plt.tight_layout()
plt.show()