# **IEEE BigData Cup 2024: Building Extraction**

**Author:** [Yi-Jie Wong](https://www.linkedin.com/in/wongyijie/) et al.<br>
**Challenge link:** [Kaggle](https://www.kaggle.com/competitions/building-extraction-generalization-2024/leaderboard)<br>
**Date created:** 2024/07/10<br>
**Last modified:** 2024/09/12<br>
**Description:** Cross-City Generalizability of Instance Segmentation Model in a Nationwide Building Extraction Task

## Step 1: Set Up the Repo

In [1]:
# clone this repo
!git clone https://github.com/yjwong1999/RSGuidedDiffusion.git

Cloning into 'RSGuidedDiffusion'...



In [2]:
# Remaining dependencies (for segmentation)
!pip install opendatasets==0.1.22
!pip install ever-beta==0.2.3
!pip install git+https://github.com/qubvel/segmentation_models.pytorch
!pip install pycocotools requests click

# Remaining dependencies (for diffusion)
!pip install diffusers==0.21.4
!pip install datasets==2.14.5
!pip install transformers==4.33.2
!pip install tensorboard==2.14.0
!pip install safetensors==0.4.4

Collecting git+https://github.com/qubvel/segmentation_models.pytorch
  Cloning https://github.com/qubvel/segmentation_models.pytorch to /tmp/pip-req-build-wez7cmle
  Running command git clone --filter=blob:none --quiet https://github.com/qubvel/segmentation_models.pytorch /tmp/pip-req-build-wez7cmle

  Resolved https://github.com/qubvel/segmentation_models.pytorch to commit 966bb6deb096bd9963de0baffcbb7ad330cd30ba
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done


## Step 2: Download and Set Up the Dataset

In [3]:
# Download the IEEE BEGC 2024 dataset

%cd RSGuidedDiffusion

import opendatasets as od

od.download("https://www.kaggle.com/competitions/building-extraction-generalization-2024/data")

%cd ../

/teamspace/studios/this_studio/RSGuidedDiffusion
Please provide your Kaggle credentials to download this dataset. Learn more: http://bit.ly/kaggle-creds
Your Kaggle username: yjwong99
Your Kaggle Key: ········
Downloading building-extraction-generalization-2024.zip to ./building-extraction-generalization-2024


100%|██████████| 1.19G/1.19G [00:09<00:00, 135MB/s] 



Extracting archive ./building-extraction-generalization-2024/building-extraction-generalization-2024.zip to ./building-extraction-generalization-2024
/teamspace/studios/this_studio
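
To avoid typing the credentials at the prompt on every run, `opendatasets` can also read them from a `kaggle.json` file in the working directory. A minimal sketch, assuming you keep the username and key in the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables; the helper name `write_kaggle_json` is hypothetical:

```python
import json
import os
from pathlib import Path


def write_kaggle_json(username, key, directory="."):
    """Write Kaggle credentials to <directory>/kaggle.json and return the path."""
    path = Path(directory) / "kaggle.json"
    path.write_text(json.dumps({"username": username, "key": key}))
    # Restrict permissions so the key is not world-readable (limited effect on Windows)
    os.chmod(path, 0o600)
    return path


# Example usage (assumes both environment variables are set):
# write_kaggle_json(os.environ["KAGGLE_USERNAME"], os.environ["KAGGLE_KEY"],
#                   directory="RSGuidedDiffusion")
```

The file has to sit in the directory that is current when `od.download` is called, which is `RSGuidedDiffusion` in the cell above.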




In [4]:
# Set up the IEEE BEGC2024 dataset in the required format

%cd RSGuidedDiffusion

# run the code
!python setup_data.py

%cd ../

/teamspace/studios/this_studio/RSGuidedDiffusion


#-----------------------------------------------------------------------------------
Download the Building Extraction Generalization 2024 Dataset from kaggle
#-----------------------------------------------------------------------------------

Skipping, found downloaded files in "./building-extraction-generalization-2024" (use force=True to force download)


#-----------------------------------------------------------------------------------
Restructuring the dataset into directory "detect" for segmentation/diffusion
#-----------------------------------------------------------------------------------

loading annotations into memory...
Done (t=0.95s)
creating index...
index created!
COCO categories: 
['building']

['building']
The total number of the data: 3784
Index Correspond Table:
{'building': 0}
Creating symbolic links...
Symbolic links for /teamspace/studios/this_studio/RSGuidedDiffusion/buildi

## Step 3: Get the Segmentation Masks Using HRNet32 Pretrained on the LoveDA Dataset

In [5]:
# Get Pretrained HRNet weights
%cd RSGuidedDiffusion/segmentation
!curl -L -o "hrnetw32.pth" "https://www.dropbox.com/scl/fi/5au20lvw3yb5y3btnlamg/hrnetw32.pth?rlkey=eoqio6mlxtq4ykdnaa8n4dp4l&st=d4tg641s&dl=0"

# move the pretrained weights to the designated directory
!mkdir -vp ./log/
!mv "hrnetw32.pth" "./log/hrnetw32.pth"

# make a soft link from "building-extraction-generalization-2024" to get the image data into LoveDA
!ln -s "../building-extraction-generalization-2024" ./LoveDA

# copy the label data into LoveDA
!cp -r "../detect/train/label" ./LoveDA/train
!cp -r "../detect/val/label" ./LoveDA/val

# run the segmentation code
!python3 run.py
%cd ../../

/teamspace/studios/this_studio/RSGuidedDiffusion/segmentation
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    17  100    17    0     0     28      0 --:--:-- --:--:-- --:--:--    28
100  113M  100  113M    0     0  61.3M      0  0:00:01  0:00:01 --:--:--  140M
mkdir: created directory './log/'
INFO:ever.core.logger:HRNetEncoder: pretrained = True
INFO:data.loveda:./LoveDA/train/image -- Dataset images: 3784
100%|███████████████████████████████████████| 3784/3784 [04:51<00:00, 12.98it/s]
INFO:data.loveda:./LoveDA/val/image -- Dataset images: 933
100%|█████████████████████████████████████████| 933/933 [01:09<00:00, 13.41it/s]
/teamspace/studios/this_studio
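
The next step trains on HRNet32 masks that are "fixed with BEGC building labels". As an illustration of what such a fix could look like (this is not the code in `run.py`), the sketch below rasterizes building polygons with Pillow and overwrites those pixels in a predicted class mask. The function name and the building class index are assumptions.

```python
import numpy as np
from PIL import Image, ImageDraw

BUILDING_CLASS = 1  # assumption: index of the building class in the predicted mask


def overlay_building_polygons(pred_mask, polygons, building_class=BUILDING_CLASS):
    """Return a copy of pred_mask with every polygon area set to building_class."""
    height, width = pred_mask.shape
    canvas = Image.new("L", (width, height), 0)
    draw = ImageDraw.Draw(canvas)
    for polygon in polygons:
        if len(polygon) < 3:
            continue  # not a valid polygon
        draw.polygon([(float(x), float(y)) for x, y in polygon], fill=1)
    building_pixels = np.asarray(canvas) > 0
    fixed = pred_mask.copy()
    fixed[building_pixels] = building_class
    return fixed
```

The input mask is left untouched, so the raw HRNet32 prediction and the corrected mask can be compared side by side.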


## Step 4: Train the Segmentation Guided Diffusion Model

In [1]:
# # Train the Guided Diffusion Model using BEGC2024 Training Images + Segmentation Masks (from HRNet32, fixed with BEGC building labels)
# # Uncomment to train from scratch

# %cd RSGuidedDiffusion

# import os
# pwd = os.getcwd()

# !CUDA_VISIBLE_DEVICES=0 python3 main.py --mode train --model_type DDIM --img_size 256 --num_img_channels 3 --dataset BEGC --img_dir {pwd}/segmentation/diffusion_data/data --seg_dir {pwd}/segmentation/diffusion_data/mask --segmentation_guided --segmentation_channel_mode single --num_segmentation_classes 7 --train_batch_size 2 --eval_batch_size 2 --num_epochs 200

# %cd ../
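
The training command above is long, so it can help to assemble it as an argument list before running it. The sketch below reuses exactly the flags from the commented cell; the helper name `build_train_command` and the placeholder path are assumptions.

```python
import shlex


def build_train_command(repo_dir, num_epochs=200, batch_size=2):
    """Assemble the main.py training call from the cell above as an argument list."""
    data_root = f"{repo_dir}/segmentation/diffusion_data"
    return [
        "python3", "main.py",
        "--mode", "train",
        "--model_type", "DDIM",
        "--img_size", "256",
        "--num_img_channels", "3",
        "--dataset", "BEGC",
        "--img_dir", f"{data_root}/data",
        "--seg_dir", f"{data_root}/mask",
        "--segmentation_guided",
        "--segmentation_channel_mode", "single",
        "--num_segmentation_classes", "7",
        "--train_batch_size", str(batch_size),
        "--eval_batch_size", str(batch_size),
        "--num_epochs", str(num_epochs),
    ]


cmd = build_train_command("/path/to/RSGuidedDiffusion")  # placeholder path
print(shlex.join(cmd))
```

The list can then be passed to `subprocess.run(cmd, cwd=repo_dir)` with `CUDA_VISIBLE_DEVICES` set in the environment, which mirrors the shell call in the cell.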

In [14]:
# Download the Pretrained Segmentation Guided Diffusion Model

%cd RSGuidedDiffusion

# download pretrained model
!curl -L -o "ddim-BEGC-256-segguided.zip" "https://www.dropbox.com/scl/fi/86i7mvr3fe1rkgejdewcj/ddim-BEGC-256-segguided.zip?rlkey=eugkdfero832mecdu9mdk0fio&st=k245vc5h&dl=0"
!unzip "ddim-BEGC-256-segguided.zip"
    
%cd ../

/teamspace/studios/this_studio/RSGuidedDiffusion
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    17  100    17    0     0     31      0 --:--:-- --:--:-- --:--:--    30
100   491    0   491    0     0    642      0 --:--:-- --:--:-- --:--:--   642
100  401M  100  401M    0     0  97.5M      0  0:00:04  0:00:04 --:--:--  132M
Archive:  ddim-BEGC-256-segguided.zip
   creating: ddim-BEGC-256-segguided/
  inflating: ddim-BEGC-256-segguided/model_index.json  
   creating: ddim-BEGC-256-segguided/unet/
  inflating: ddim-BEGC-256-segguided/unet/config.json  
  inflating: ddim-BEGC-256-segguided/unet/diffusion_pytorch_model.safetensors  
   creating: ddim-BEGC-256-segguided/scheduler/
  inflating: ddim-BEGC-256-segguided/scheduler/scheduler_config.json  
/teamspace/studios/this_studio
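
Before moving on, it may be worth confirming that the archive unpacked completely. A small sketch that checks for the four files listed in the unzip output above; the helper name `missing_checkpoint_files` is hypothetical:

```python
from pathlib import Path

# Files listed in the unzip output above
EXPECTED_FILES = [
    "model_index.json",
    "unet/config.json",
    "unet/diffusion_pytorch_model.safetensors",
    "scheduler/scheduler_config.json",
]


def missing_checkpoint_files(checkpoint_dir):
    """Return the expected files that are absent from checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


# Example usage:
# missing = missing_checkpoint_files("RSGuidedDiffusion/ddim-BEGC-256-segguided")
# if missing:
#     print("Incomplete download:", missing)
```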


## Step 5: Generate the Synthetic Dataset using our Segmentation Guided Diffusion Model

In [4]:
import shutil

# By default, the code uses the test dataset to generate synthetic images.
# To prevent data leakage, we do not use the BEGC2024 test dataset for this.
# Instead, we copy the train masks into a "test" folder, which is then used to generate the synthetic dataset.
src_path = "RSGuidedDiffusion/segmentation/diffusion_data/mask/all/train"
dst_path = "RSGuidedDiffusion/segmentation/diffusion_data/mask/all/test"

# Recursively copy the entire directory tree
shutil.copytree(src_path, dst_path)

print("Directory copied successfully!")


Directory copied successfully!
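
Note that `shutil.copytree` raises `FileExistsError` when the destination already exists, so rerunning the cell above fails. A sketch of a rerunnable variant, assuming Python 3.8 or newer for `dirs_exist_ok`; the helper name `copy_masks_as_test` is hypothetical:

```python
import shutil
from pathlib import Path


def copy_masks_as_test(src_path, dst_path):
    """Copy src_path to dst_path, tolerating an existing destination.

    Returns the number of files found under dst_path afterwards.
    """
    shutil.copytree(src_path, dst_path, dirs_exist_ok=True)
    return sum(1 for p in Path(dst_path).rglob("*") if p.is_file())


# Example usage with the paths from the cell above:
# n = copy_masks_as_test(
#     "RSGuidedDiffusion/segmentation/diffusion_data/mask/all/train",
#     "RSGuidedDiffusion/segmentation/diffusion_data/mask/all/test",
# )
```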


In [5]:
# # Run inference to generate synthetic data
# # Uncomment to run the code
# # We set --eval_sample_size to 1584 because there are only 1584 data points in our test (train) folder
# # You can set it higher to generate more images

# %cd RSGuidedDiffusion

# import os
# pwd = os.getcwd()

# !CUDA_VISIBLE_DEVICES=0 python3 main.py --mode eval_many --model_type DDIM --img_size 256 --num_img_channels 3 --dataset BEGC --eval_batch_size 1 --eval_sample_size 1584 --seg_dir {pwd}/segmentation/diffusion_data/mask --segmentation_guided --segmentation_channel_mode single --num_segmentation_classes 7 

# %cd ../

In [18]:
# Instead, you can download the synthetic data which we have generated for you

%cd RSGuidedDiffusion/ddim-BEGC-256-segguided

# download the synthetic images generated by our model
!curl -L -o "generated_images.zip" "https://www.dropbox.com/scl/fi/slq3qcg0qhzpj9cc22ws4/generated_images.zip?rlkey=npgj3v4ki6o7sogrca742ubt3&st=fjuxt1vn&dl=0"
!unzip "generated_images.zip"

import os
# %cd above already moved into ddim-BEGC-256-segguided, so the paths are relative to it
os.rename('generated_images', 'samples_many_1584')
    
%cd ../../

## Step 6: Train a YOLOv8 Segmentation Model using IEEE BEGC2024 + Synthetic Data generated by our Diffusion Model

In [19]:
import os, shutil

# to copy the synthetic images and labels to "mydata" directory, which is used to train our YOLO model
train_img_dir = 'RSGuidedDiffusion/mydata/images/train'
train_label_dir = 'RSGuidedDiffusion/mydata/labels/train'
diffused_img_dir = 'RSGuidedDiffusion/ddim-BEGC-256-segguided/samples_many_1584'

diffused_imgs = sorted(os.listdir(diffused_img_dir))
for img in diffused_imgs:
    ori_img = img.replace('condon_', '')
    ori_label = ori_img.replace('.jpg', '.txt')
    label = 'condon_' + ori_label

    # full diffused image path
    ori_img = os.path.join(train_img_dir, ori_img)
    img = os.path.join(diffused_img_dir, img)
    
    # full label path
    ori_label = os.path.join(train_label_dir, ori_label)
    label = os.path.join(train_label_dir, label)

    # check if ori_label exists
    if os.path.isfile(ori_label):
        # copy the synthetic image into train_img_dir
        shutil.copy(img, train_img_dir)
        
        # copy ori_label and name it as label
        shutil.copyfile(ori_label, label)
    else:
        print(f'Skip {img} because its label does not exist')
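
The name mapping in the loop above can be isolated into a small pure function, which makes it easy to check on a few file names before copying anything. This is a sketch: the helper `original_names` is hypothetical, and unlike the loop it strips only a leading `condon_` prefix and accepts any image extension.

```python
from pathlib import Path

PREFIX = "condon_"  # prefix of the generated image file names, as used in the loop above


def original_names(diffused_name, prefix=PREFIX):
    """Map a generated image name to (original image, original label, new label)."""
    if diffused_name.startswith(prefix):
        ori_img = diffused_name[len(prefix):]
    else:
        ori_img = diffused_name
    ori_label = str(Path(ori_img).with_suffix(".txt"))
    return ori_img, ori_label, prefix + ori_label


print(original_names("condon_123.jpg"))
```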

In [22]:
!pip install ultralytics==8.1

Collecting ultralytics==8.1
  Downloading ultralytics-8.1.0-py3-none-any.whl.metadata (39 kB)
Collecting opencv-python>=4.6.0 (from ultralytics==8.1)
  Downloading opencv_python-4.10.0.84-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (20 kB)
Collecting py-cpuinfo (from ultralytics==8.1)
  Downloading py_cpuinfo-9.0.0-py3-none-any.whl.metadata (794 bytes)
Collecting thop>=0.1.1 (from ultralytics==8.1)
  Downloading thop-0.1.1.post2209072238-py3-none-any.whl.metadata (2.7 kB)
Collecting seaborn>=0.11.0 (from ultralytics==8.1)
  Downloading seaborn-0.13.2-py3-none-any.whl.metadata (5.4 kB)
Collecting hub-sdk>=0.0.2 (from ultralytics==8.1)
  Downloading hub_sdk-0.0.10-py3-none-any.whl.metadata (10 kB)
Downloading ultralytics-8.1.0-py3-none-any.whl (699 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 699.2/699.2 kB 37.8 MB/s eta 0:00:00
Downloading hub_sdk-0.0.10-py3-none-any.whl (42 kB)
Downloading opencv_python-4.10.0.84-cp

In [None]:
# Train the YOLOv8 segmentation model

from ultralytics import YOLO
import os, shutil

# yaml file of the training dataset (BEGC2024 + synthetic images)
yaml_file = "RSGuidedDiffusion/mydata/data.yaml"

# use OBB pretrained YOLOv8 models for transfer learning
model = YOLO("yolov8m-seg.pt").load("yolov8m-obb.pt")

# Train the model (only mixup is set here; mosaic, flipud and rotation keep their Ultralytics defaults)
results = model.train(data=yaml_file, epochs=50, imgsz=640, plots=True, mixup=0.2)
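
The `model.train` call above only overrides `mixup`. If you also want to switch off mosaic and add vertical flips and rotation, Ultralytics exposes these as the `mosaic`, `flipud` and `degrees` training arguments. A sketch with illustrative values, which are assumptions and not the settings used for the submission:

```python
# Hypothetical augmentation overrides; the values are illustrative assumptions
aug_overrides = {
    "mosaic": 0.0,    # probability of mosaic augmentation (0 disables it)
    "flipud": 0.5,    # probability of a vertical flip
    "degrees": 90.0,  # maximum random rotation in degrees
    "mixup": 0.2,     # same value as in the cell above
}

train_kwargs = {
    "data": "RSGuidedDiffusion/mydata/data.yaml",
    "epochs": 50,
    "imgsz": 640,
    "plots": True,
    **aug_overrides,
}

# results = model.train(**train_kwargs)  # uncomment to train with these settings
print(train_kwargs)
```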

In [None]:
# Run inference using the trained YOLO model

import os
from ultralytics import YOLO

# Load the trained YOLOv8 model
model = YOLO('runs/segment/train/weights/last.pt')

# Directory containing test images
test_image_dir = 'RSGuidedDiffusion/building-extraction-generalization-2024/test/image'

# Decoding according to the .yaml file class names order
decoding_of_predictions ={0: 'building'}

# Iterate through images in the test directory
IDs = []
entries = []
for image_filename in sorted(os.listdir(test_image_dir)):
    # remove extension from image_filename
    ID = int(os.path.splitext(image_filename)[0])
    print(ID)

    image_path = os.path.join(test_image_dir, image_filename)

    # Perform prediction on the image
    results = model.predict(source=image_path, save=True, conf=0.2, imgsz=640, iou=0.95)

    # A single image yields a single Results object
    r = results[0]

    # Confidence scores and decoded class names of every detection
    confidences = r.boxes.conf.cpu().numpy().tolist()
    class_names = [decoding_of_predictions[int(c)] for c in r.boxes.cls.cpu().numpy().tolist()]

    # bounding_boxes = r.boxes.xyxy.cpu().numpy()

    # r.masks is None when nothing is detected
    masks = r.masks.xy if r.masks is not None else []

    # Check that masks, confidences and class names match
    if len(masks) != len(confidences) or len(masks) != len(class_names):
        print("Error: Number of masks, confidences, and class names should be the same.")
        continue

    entry = []
    for m in masks:
        # skip degenerate polygons with fewer than 4 points
        if len(m) < 4:
            continue
        entry.append([(int(x), int(y)) for x, y in m])

    IDs.append(ID)
    entries.append(entry)
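
The polygon conversion at the end of the loop above can be tested without a model or GPU by pulling it into a function. A minimal sketch, assuming each mask is an N x 2 array of float pixel coordinates; the helper name `polygons_to_entry` is hypothetical:

```python
import numpy as np


def polygons_to_entry(masks, min_points=4):
    """Convert float polygons (N x 2 arrays) to lists of integer (x, y) tuples."""
    entry = []
    for polygon in masks:
        if len(polygon) < min_points:
            continue  # same filter as in the loop above
        entry.append([(int(x), int(y)) for x, y in polygon])
    return entry
```

Note that `int()` truncates toward zero rather than rounding, so a coordinate of 6.7 becomes 6.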

In [None]:
# Generate the output csv file

import csv

# Assuming you have the 'IDs' and 'entries' lists as defined in the previous code

# Create a list of dictionaries to store the data
data = []
for i in range(len(IDs)):
  data.append({'ImageID': IDs[i], 'Coordinates': entries[i]})

# Write the data to a CSV file
with open('output.csv', 'w', newline='') as csvfile:
  fieldnames = ['ImageID', 'Coordinates']
  writer = csv.DictWriter(csvfile, fieldnames=fieldnames)

  writer.writeheader()
  writer.writerows(data)
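
Because `csv.DictWriter` stores each `Coordinates` value as the string form of a Python list, the file can be read back and checked before submitting. A sketch using `ast.literal_eval`; the helper name `read_submission` is hypothetical:

```python
import ast
import csv


def read_submission(csv_path):
    """Read the submission back as {ImageID: list of polygons}."""
    with open(csv_path, newline="") as f:
        return {
            int(row["ImageID"]): ast.literal_eval(row["Coordinates"])
            for row in csv.DictReader(f)
        }


# Example usage:
# submission = read_submission("output.csv")
# print(len(submission), "images in the submission")
```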


In [None]:
# just in case
import locale
locale.getpreferredencoding = lambda: "UTF-8"

# zip the output file (output.csv was written to the current working directory)
!zip output.zip output.csv
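
If the `zip` command is not available in your environment, the same archive can be produced with the standard library. A sketch, assuming `output.csv` is in the current working directory; the helper name `zip_submission` is hypothetical:

```python
import os
import zipfile


def zip_submission(csv_path="output.csv", zip_path="output.zip"):
    """Compress csv_path into zip_path, storing it under its bare file name."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(csv_path, arcname=os.path.basename(csv_path))
    return zip_path


# Example usage:
# zip_submission("output.csv", "output.zip")
```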