# Step 2 - Creating Segmentation Annotations of GWHD 2021 Dataset for Roboflow

Goal
* To convert the bounding boxes of the GWHD 2021 dataset to segmentations for a new segmentation dataset.
* To upload the new segmentation dataset to Roboflow.

Resources

* https://github.com/facebookresearch/sam2#model-description
* https://roboflow.com/formats/yolov8-pytorch-txt
* https://blog.roboflow.com/how-to-use-segment-anything-model-sam/
* https://docs.ultralytics.com/datasets/segment/#ultralytics-yolo-format
* https://blog.roboflow.com/convert-bboxes-masks-polygons/#how-to-convert-a-mask-to-a-polygon
* https://supervision.roboflow.com/detection/utils/
* https://stackoverflow.com/questions/1773805/how-can-i-parse-a-yaml-file-in-python
* https://supervision.roboflow.com/detection/utils/#supervision.detection.utils.mask_to_polygons
* https://colab.research.google.com/github/facebookresearch/sam2/blob/main/notebooks/image_predictor_example.ipynb#scrollTo=3c2e4f6b


# Environment Set-up

Parts of the "Environment Set-up" and "Set-up" code were copied from this [notebook](https://colab.research.google.com/github/facebookresearch/sam2/blob/main/notebooks/image_predictor_example.ipynb#scrollTo=07fabfee).

In [None]:
using_colab = True

In [None]:
if using_colab:
    import torch
    import torchvision
    print("PyTorch version:", torch.__version__)
    print("Torchvision version:", torchvision.__version__)
    print("CUDA is available:", torch.cuda.is_available())
    import sys
    !{sys.executable} -m pip install opencv-python matplotlib
    !{sys.executable} -m pip install 'git+https://github.com/facebookresearch/sam2.git'

    !mkdir -p ../checkpoints/
    !wget -P ../checkpoints/ https://dl.fbaipublicfiles.com/segment_anything_2/092824/sam2.1_hiera_large.pt

PyTorch version: 2.6.0+cu124
Torchvision version: 0.21.0+cu124
CUDA is available: True
Collecting git+https://github.com/facebookresearch/sam2.git
  Cloning https://github.com/facebookresearch/sam2.git to /tmp/pip-req-build-2h8u9d3l
  Running command git clone --filter=blob:none --quiet https://github.com/facebookresearch/sam2.git /tmp/pip-req-build-2h8u9d3l
  Resolved https://github.com/facebookresearch/sam2.git to commit 2b90b9f5ceec907a1c18123530e92e794ad901a4
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
--2025-07-13 15:21:13--  https://dl.fbaipublicfiles.com/segment_anything_2/092824/sam2.1_hiera_large.pt
Resolving dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)... 13.226.210.15, 13.226.210.25, 13.226.210.78, ...
Connecting to dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)|13.226.210.15|:443... connected.
HTTP request sent, awaiting response... 200 OK
Lengt

# Set-up

In [None]:
import os
# if using Apple MPS, fall back to CPU for unsupported ops
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"
import numpy as np
import torch
import matplotlib.pyplot as plt
from PIL import Image
import supervision as sv
from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor
from roboflow import Roboflow
import yaml
import cv2
import shutil
import roboflow

In [None]:
# select the device for computation
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")
print(f"using device: {device}")

if device.type == "cuda":
    # use bfloat16 for the entire notebook
    torch.autocast("cuda", dtype=torch.bfloat16).__enter__()
    # turn on tfloat32 for Ampere GPUs (https://pytorch.org/docs/stable/notes/cuda.html#tensorfloat-32-tf32-on-ampere-devices)
    if torch.cuda.get_device_properties(0).major >= 8:
        torch.backends.cuda.matmul.allow_tf32 = True
        torch.backends.cudnn.allow_tf32 = True
elif device.type == "mps":
    print(
        "\nSupport for MPS devices is preliminary. SAM 2 is trained with CUDA and might "
        "give numerically different outputs and sometimes degraded performance on MPS. "
        "See e.g. https://github.com/pytorch/pytorch/issues/84936 for a discussion."
    )

using device: cuda


In [None]:
!pip install roboflow



In [None]:
# Replace YOUR_API_KEY with your Roboflow Private API key
# More directions can be found on the website below
# https://docs.roboflow.com/developer/authentication/find-your-roboflow-api-key
API_KEY = "YOUR_API_KEY"
rf = Roboflow(api_key=API_KEY)
project = rf.workspace("gwhd-2021").project("gwhd-2021-object-detection")

# If not specified, work will be done inside the dataset.location directory
dataset = project.version(1).download("yolov8")

loading Roboflow workspace...
loading Roboflow project...


Downloading Dataset Version Zip in GWHD-2021-Object-Detection-1 to yolov8:: 100%|██████████| 588937/588937 [00:07<00:00, 77345.32it/s]





Extracting Dataset Version Zip to GWHD-2021-Object-Detection-1 in yolov8:: 100%|██████████| 13034/13034 [00:02<00:00, 5556.77it/s]


In [None]:
! pip install supervision



In [None]:
!pip install PyYAML



# Preparing SAM2

In [None]:
sam2_checkpoint = "../checkpoints/sam2.1_hiera_large.pt"
model_cfg = "configs/sam2.1/sam2.1_hiera_l.yaml"

# Builds SAM2 model
sam2_model = build_sam2(model_cfg, sam2_checkpoint, device=device)
# Creates SAM2 predictor
predictor = SAM2ImagePredictor(sam2_model)

In [None]:
# Create the folders that will hold the segmentation labels for each split
# (exist_ok=True lets the cell be re-run without raising FileExistsError)
for split in ('train', 'valid', 'test'):
    os.makedirs(os.path.join(dataset.location, split, 'segmentation_labels'), exist_ok=True)

In [None]:
def show_mask(mask, ax, random_color=False):
    if random_color:
        color = np.concatenate([np.random.random(3), np.array([0.6])], axis=0)
    else:
        color = np.array([30/255, 144/255, 255/255, 0.6])
    h, w = mask.shape[-2:]
    mask_image = mask.reshape(h, w, 1) * color.reshape(1, 1, -1)
    ax.imshow(mask_image)

def show_points(coords, labels, ax, marker_size=375):
    pos_points = coords[labels==1]
    neg_points = coords[labels==0]
    ax.scatter(pos_points[:, 0], pos_points[:, 1], color='green', marker='*', s=marker_size, edgecolor='white', linewidth=1.25)
    ax.scatter(neg_points[:, 0], neg_points[:, 1], color='red', marker='*', s=marker_size, edgecolor='white', linewidth=1.25)

def show_box(box, ax):
    x0, y0 = box[0], box[1]
    w, h = box[2] - box[0], box[3] - box[1]
    ax.add_patch(plt.Rectangle((x0, y0), w, h, edgecolor='green', facecolor=(0,0,0,0), lw=2))

In [None]:
def bbox_to_mask(base_path, split):
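  """
  Convert the YOLO bounding-box labels of one dataset split into YOLO
  segmentation labels using SAM2.

  For every label file in <base_path>/<split>/labels, the matching .jpg image
  is loaded, each bounding box is passed to the global `predictor` as a box
  prompt, and the predicted mask is written as a normalized polygon to
  <base_path>/<split>/segmentation_labels.

  Assumes 640x640 images and a single class with id 0.
  """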
  images_path = os.path.join(base_path, split, 'images') # path to directory of images of the current split
  labels_path = os.path.join(base_path, split, 'labels') # path to directory of annotations for the images of the current split
  segmentation_labels_path = os.path.join(base_path, split, 'segmentation_labels') # path to directory that will store new segmentation labels

  for label_file in os.listdir(labels_path):
    root, extension = os.path.splitext(label_file)
    image_path = os.path.join(images_path, root + '.jpg') # path to image
    label_path = os.path.join(labels_path, label_file) # path to the corresponding image annotation

    image = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2RGB) # Numpy array representation of the image

    predictor.set_image(image) # Feed the SAM2 predictor the image


    with open(label_path, 'r') as f:
      lines = [line.rstrip() for line in f]

      if len(lines) != 0:
        for line in lines:
          bboxes = []
          coords = line.split(' ')
          bbox_center_x = float(coords[1])*640
          bbox_center_y = float(coords[2])*640
          bbox_width = float(coords[3])*640
          bbox_height = float(coords[4])*640

          xmin = bbox_center_x - bbox_width/2
          ymin = bbox_center_y - bbox_height/2

          xmax = bbox_center_x + bbox_width/2
          ymax = bbox_center_y + bbox_height/2

          bboxes.append([xmin, ymin, xmax, ymax])

          input_box = np.array(bboxes) # Represents the ground truth bounding box points
          masks, _, _ = predictor.predict(
              point_coords=None,
              point_labels=None,
              box=input_box[None, :],
              multimask_output=False,
          )


          polygons = [] # Stores the mask predictions as polygons
          for mask in masks:
            try:
              # sv.mask_to_polygons() returns a list of NumPy arrays, one per
              # polygon found in the mask. Each array holds the (x, y)
              # coordinates of that polygon's vertices.
              polygons.append(sv.mask_to_polygons(mask))
            except Exception:
              print(f'Error with file {root}. Skipping')


          bbox_annotation = "0 "

          try:
            for polygon in polygons:
              if len(polygon[0]) > 0:
                # Assumes predictor only outputted a single mask
                # Therefore, polygons only has one polygon at index 0
                for point in polygon[0]:
                  bbox_annotation += str(float(point[0]) / 640.0) + " " + str(float(point[1]) / 640.0) + " "
              else:
                print(f"Error: empty polygon for file {root}.")

            with open(os.path.join(segmentation_labels_path, root + '.txt'), "a") as f:
              f.write(bbox_annotation.rstrip() + "\n") # use rstrip() to remove trailing white spaces

          except Exception:
            print(f'Error with file {root}. Skipping')

      else: # If the image doesn't have any annotations
        with open(os.path.join(segmentation_labels_path, root + '.txt'), "w") as f:
          pass
      predictor.reset_predictor() # Reset the SAM2 predictor before making predictions on the next image
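
The function above multiplies and divides by a hard-coded 640, which only holds for 640x640 images. The two coordinate conversions it performs could be sketched as standalone helpers that take the image size as arguments (for example from `image.shape[:2]`). The helper names `yolo_bbox_to_xyxy` and `polygon_to_yolo_line` are hypothetical and not part of the notebook or of any library.

```python
def yolo_bbox_to_xyxy(line, img_w, img_h):
    """Convert one YOLO detection line 'cls cx cy w h' (normalized values)
    into a pixel-space box [xmin, ymin, xmax, ymax]."""
    _, cx, cy, w, h = line.split()
    cx, w = float(cx) * img_w, float(w) * img_w
    cy, h = float(cy) * img_h, float(h) * img_h
    return [cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2]


def polygon_to_yolo_line(points, img_w, img_h, class_id=0):
    """Format pixel-space polygon vertices [(x, y), ...] as one YOLO
    segmentation line: 'cls x1 y1 x2 y2 ...' with normalized coordinates."""
    coords = []
    for x, y in points:
        coords.append(f"{float(x) / img_w:.6f}")
        coords.append(f"{float(y) / img_h:.6f}")
    return " ".join([str(class_id)] + coords)


# Example with a 640x480 image (values chosen for illustration only)
box = yolo_bbox_to_xyxy("0 0.5 0.5 0.25 0.5", img_w=640, img_h=480)
line = polygon_to_yolo_line([(160, 120), (480, 120), (320, 360)], img_w=640, img_h=480)
print(box)   # [240.0, 120.0, 400.0, 360.0]
print(line)  # 0 0.250000 0.250000 0.750000 0.250000 0.500000 0.750000
```

Passing the width and height explicitly keeps the same arithmetic as the notebook while avoiding the assumption that every image is square.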

In [None]:
print("Converting train images...")
bbox_to_mask(dataset.location, 'train')

print("Converting valid images...")
bbox_to_mask(dataset.location, 'valid')

print("Converting test images...")
bbox_to_mask(dataset.location, 'test')


Converting train images...
Converting valid images...
Converting test images...
Error with file 497f9dd3e34f946e46ebfafca71ac763d5a1adf4ad8db288b11e2240a13d4908_png.rf.92e464a94529948a89fb27f1863aa648. Skipping


In [None]:
# To check number of files
# Expecting train = 3655, valid = 1476, test = 1380
directory_paths = {
    'train': '/content/GWHD-2021-Object-Detection-1/train/segmentation_labels',
    'valid': '/content/GWHD-2021-Object-Detection-1/valid/segmentation_labels',
    'test': '/content/GWHD-2021-Object-Detection-1/test/segmentation_labels'
}

for split, directory_path in directory_paths.items():
  txt_files = [f for f in os.listdir(directory_path) if f.endswith('.txt') and os.path.isfile(os.path.join(directory_path, f))]

  # Print the total number of files
  print(f"Number of files in '{directory_path}': {len(txt_files)}")

Number of files in '/content/GWHD-2021-Object-Detection-1/train/segmentation_labels': 3655
Number of files in '/content/GWHD-2021-Object-Detection-1/valid/segmentation_labels': 1476
Number of files in '/content/GWHD-2021-Object-Detection-1/test/segmentation_labels': 1380
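
Counting files confirms that every image received a label file, but not that the contents are well formed. As a rough sketch, each non-empty line of a YOLO segmentation label could be checked for a class id followed by at least three (x, y) pairs with values in [0, 1]. The helper `check_segmentation_label` is hypothetical, and the example runs against a temporary file rather than the real dataset.

```python
import os
import tempfile


def check_segmentation_label(path):
    """Return a list of problems found in one YOLO segmentation label file."""
    problems = []
    with open(path) as f:
        for line_no, line in enumerate(f, start=1):
            parts = line.split()
            if not parts:
                continue  # blank lines are ignored
            coords = parts[1:]  # everything after the class id
            if len(coords) < 6 or len(coords) % 2 != 0:
                problems.append(
                    f"line {line_no}: expected at least 3 (x, y) pairs, "
                    f"got {len(coords)} values"
                )
                continue
            if any(not 0.0 <= float(c) <= 1.0 for c in coords):
                problems.append(f"line {line_no}: coordinate outside [0, 1]")
    return problems


# Example: one valid triangle and one line that has only a class id
with tempfile.TemporaryDirectory() as tmp:
    label_path = os.path.join(tmp, "example.txt")
    with open(label_path, "w") as f:
        f.write("0 0.1 0.1 0.5 0.1 0.3 0.4\n")
        f.write("0\n")
    problems = check_segmentation_label(label_path)

print(problems)
```

A line holding only the class id is what the conversion function writes when a mask yields an empty polygon, so this check would flag those cases before upload.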


# Update YAML file

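Update `data.yaml` so that its Roboflow metadata matches the new segmentation dataset: the project name is changed and the URL of the original object-detection project is removed.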

In [None]:
with open(os.path.join(dataset.location, 'data.yaml'), 'r') as file:
  data = yaml.safe_load(file)

data['roboflow']['project'] = 'gwhd-2021-instance-segmentation'
data['roboflow'].pop('url')


# Save the updated YAML file
with open(os.path.join(dataset.location, 'data.yaml'), 'w') as file:
    yaml.dump(data, file)

# Adjust Folder Structure for Roboflow Upload

Roboflow expects a certain folder structure when uploading a dataset. The following statements will make the necessary changes.

In [None]:
# Remove READMEs
os.remove(os.path.join(dataset.location, 'README.dataset.txt'))
os.remove(os.path.join(dataset.location, 'README.roboflow.txt'))

In [None]:
# Remove original label files
shutil.rmtree(os.path.join(dataset.location, 'train', 'labels'))
shutil.rmtree(os.path.join(dataset.location, 'valid', 'labels'))
shutil.rmtree(os.path.join(dataset.location, 'test', 'labels'))

In [None]:
# Rename folders to match Roboflow upload format

os.rename(os.path.join(dataset.location, 'train', 'segmentation_labels'), os.path.join(dataset.location, 'train', 'labels'))
os.rename(os.path.join(dataset.location, 'valid', 'segmentation_labels'), os.path.join(dataset.location, 'valid', 'labels'))
os.rename(os.path.join(dataset.location, 'test', 'segmentation_labels'), os.path.join(dataset.location, 'test', 'labels'))

In [None]:
os.rename(dataset.location, 'GWHD-2021-Instance-Segmentation')
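
After the renames above, each split folder is expected to hold an `images` folder and a `labels` folder with one `.txt` file per image. A minimal sketch of a check for that pairing is shown below. The helper `missing_labels` is hypothetical, it assumes `.jpg` images as in the conversion step, and the example builds a throwaway directory instead of touching the real dataset.

```python
from pathlib import Path
import tempfile


def missing_labels(dataset_dir, splits=("train", "valid", "test")):
    """Return the image paths that have no same-named .txt file
    in the labels folder of their split."""
    missing = []
    for split in splits:
        images_dir = Path(dataset_dir) / split / "images"
        labels_dir = Path(dataset_dir) / split / "labels"
        for image_path in sorted(images_dir.glob("*.jpg")):
            # Path.stem drops only the final suffix, matching os.path.splitext
            if not (labels_dir / (image_path.stem + ".txt")).is_file():
                missing.append(image_path)
    return missing


# Example: two images, only one of which has a label file
with tempfile.TemporaryDirectory() as tmp:
    images = Path(tmp) / "train" / "images"
    labels = Path(tmp) / "train" / "labels"
    images.mkdir(parents=True)
    labels.mkdir(parents=True)
    (images / "a.jpg").touch()
    (images / "b.jpg").touch()
    (labels / "a.txt").touch()
    missing = [p.name for p in missing_labels(tmp, splits=("train",))]

print(missing)  # ['b.jpg']
```

On the real dataset the call would be `missing_labels('GWHD-2021-Instance-Segmentation')`, and an empty list would indicate that every image has a label file.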

# Upload Dataset

In [None]:
roboflow.login(force=True)

visit https://app.roboflow.com/auth-cli to get your authentication token.
Paste the authentication token here: ··········


In [None]:
# Replace YOUR_API_KEY with your Roboflow Private API key
# More directions can be found on the website below
# https://docs.roboflow.com/developer/authentication/find-your-roboflow-api-key
API_KEY = "YOUR_API_KEY"
rf = Roboflow(api_key=API_KEY)

In [None]:
# Connect to gwhd-2021 workspace on Roboflow
workspace = rf.workspace("gwhd-2021")
print(rf.workspace())

loading Roboflow workspace...
loading Roboflow workspace...
{
  "name": "GWHD 2021",
  "url": "gwhd-2021",
  "projects": [
    "gwhd-2021/gwhd-2021-instance-segmentation",
    "gwhd-2021/gwhd-2021-object-detection",
    "gwhd-2021/gwhd2021od-dgg1n",
    "gwhd-2021/gwhd2021od-od3w5",
    "gwhd-2021/gwhd2021od-youfr"
  ]
}


In [None]:
# Upload the new segmentation dataset
workspace.upload_dataset(
    '/content/GWHD-2021-Instance-Segmentation', # Make sure this is the location of the new segmentation dataset
    'GWHD-2021-Instance-Segmentation-V2',
    num_workers = 10,
    project_license = "MIT",
    project_type = "instance-segmentation",
    batch_name = "6511",
    num_retries=5
)

Streaming output truncated to the last 5000 lines.
[UPLOADED] /content/GWHD-2021-Instance-Segmentation/train/images/073e3a608e65126ba07fcec190f03cbb3fcc1e47cdf3fb6c70c1f91e596b1bb4_png.rf.f22fde73a5a8ed2f8b435c9b019854ad.jpg (9lPrLwkZzscqMi7H8QZV) [0.6s] / annotations = OK [0.5s]
[UPLOADED] /content/GWHD-2021-Instance-Segmentation/train/images/076a8ddf55e025ef4114ac9065f7c88a0f6686031227ac60ee4d944724061ca0_png.rf.9ef90c9c8e3f0c34e0477dbfbf0eb72b.jpg (G4TPtzDLugidgLe2ha7X) [0.6s] / annotations = OK [0.4s]
[UPLOADED] /content/GWHD-2021-Instance-Segmentation/train/images/06f97c2c7ff80a4baf439b64195700d0d9829573e523be476a83f991f33f26a5_png.rf.a5a73b34e60a310d8a43f8c658fc317d.jpg (CTiVJZ4CM8EsfDqULmgC) [1.6s] / annotations = OK [0.4s]
[UPLOADED] /content/GWHD-2021-Instance-Segmentation/train/images/078eee55a02d1b4c1996100975cdafd1431afafb89ff370ad6f1b3aeee725783_png.rf.e98e1980dfb065c07bf69ea1765e1b35.jpg (jf9SCu1UrEKTeAWSD0B8) [0.8s] / annotations = OK [0.4s]
[UPLOADED] /con