Add support for bounding boxes and instance masks to VectorDataset #2819

mayrajeo · 2025-06-06T08:17:09Z

Related to #2505 , a proposal to add bbox_xyxy, segmentation and label as the return values for VectorDataset.__getitem__. After this, VectorDataset.__getitem__ returns

        sample = {
            'mask': masks, # As before
            'bbox_xyxy': boxes_xyxy, # N, 4 tensor containing the bounding boxes in xmin, ymin, xmax, ymax
            'segmentation': segmentations, # A list of N tensors containing the COCO format segmentations
            'crs': self.crs,
            'label': labels, # N tensor containing the int labels for each object
            'bounds': query,
        }

Major downside: as the object detection related things are returned by default, VectorDataset can't be used with most of the common collate_fns, such as stack_samples if the batch size is larger than 1.

Example gist: https://gist.github.com/mayrajeo/1de0497dd82d2c9a2b6381ef482face8

mayrajeo · 2025-06-06T08:18:44Z

@microsoft-github-policy-service agree

isaaccorley · 2025-06-06T13:57:31Z

@adamjstewart Thoughts on adding an "output_type" arg to VectorDataset? My only concern is computing mask/instance mask/boxes for each polygon can add unwanted overhead in the case that I just want boxes, or I just want the mask.

adamjstewart · 2025-06-06T14:13:23Z

I'm not completely opposed. In fact, this is one of my implementation questions in #2505.

I guess I would be curious to know exactly how much overhead we are talking about. As long as the overhead is noticeable (10% slower) and the change is minimal (a few lines of code, plus a new parameter in all subclasses) then let's do it. Only downside of a parameter is that the return type becomes even more dynamic, and it's difficult to statically type check.

Note that Mask R-CNN (our only instance segmentation model) requires both boxes and masks.

isaaccorley · 2025-06-06T14:21:35Z

Yeah I was thinking output types would be [semantic, instance, boxes].

My concern is that instance masks have the ability to take up a ton of memory.

Semantic returns a mask of shape (num_classes, h, w)
Boxes aren't memory intensive (num_objects, 4)
for Instance we are turning boxes, and instance masks of shape (num_objects, h, w)

If num_objects is large (which is common for small object detection) this can take up quite a bit of memory in my experience, so I imagine we shouldn't return all output types and only the one the user wants.

adamjstewart · 2025-06-06T14:23:34Z

I wasn't thinking about the difference between semantic and instance masks, that's a good point. Yeah, we should probably have a knob to control this then.

mayrajeo · 2025-06-09T05:17:53Z

Should the sample keys be consistent so that masks is an empty list or a tensor if semantic segmentation masks are not desired, or depend on the flags specified? Maybe the latter, because it could be possible to add both oriented bounding boxes either as instance masks ([x1, y1, x2, y2, x3, y3, x4, y4]) or detectron2-style ([x_center, y_center, w, h, rotation], -180 <= rotation < 180) annotation format, or keypoints if needed.

Something like

class VectorDataset(GeoDataset):
    ...
    def __init    def __init__(
        self,
        paths: Path | Iterable[Path] = 'data',
        crs: CRS | None = None,
        res: float | tuple[float, float] = (0.0001, 0.0001),
        transforms: Callable[[dict[str, Any]], dict[str, Any]] | None = None,
        masks: bool = True,
        boxes: bool = False,
        instances: bool = False,
        label_name: str | None = None,
        gpkg_layer: str | int | None = None,
    ) -> None:
    ...
    self.masks = masks
    self.boxes = boxes
    self.instances = instances
    ...

    def __getitem__(self, query: BoundingBox) -> dict[str, Any]:
         # Create things depending on self.masks, self.boxes and self.instances
         ...
        sample = {
            'crs': self.crs,
            'bounds': query,
        }
        if self.masks:
            sample['mask'] = masks
       # etc...

maybe?

isaaccorley · 2025-06-09T11:25:33Z

I think it should depend on the flags specified although I don't think we should have a flag, it should be a list of output types that a user can specify.

For OBB I prefer the polygon approach like (x1, y1, ...,x4, y4) instead of using rotation angles just because there's several definitions of the rotation angle and keeping it in polygon format makes easy conversion to<->from shapely/geopandas.

adamjstewart · 2025-06-09T12:47:02Z

The sample keys aren't up to us, they need to follow the format expected by Kornia. Note that Kornia does not yet support OBB, so we need to add this to Kornia first.

mayrajeo · 2025-06-10T07:45:08Z

For the output types, would output_type: List[str] = ['masks'] with possible keys being ['masks', 'boxes', 'instances'] be good? Though masks and instances are mutually exclusive, as kornia seems to expect masks key for both semantic and instance masks correct? Another option might be to specify task, which is one of segmentation, object_detection, instance_segmentation or something like that, since there isn't many cases where only instance masks are needed right?.

Note that Kornia does not yet support OBB, so we need to add this to Kornia first.

They can be thought as polygons, so they need to be converted into binary masks. Getting them from polygons with shapely is easy (oriented_envelope) but they can be omitted for now.

adamjstewart · 2025-06-15T10:44:47Z

See https://github.com/kornia/kornia/blob/v0.8.1/kornia/constants.py#L142 for a list of valid keys. Our TorchGeo object detection trainer is setup to look for bbox_xyxy and label, and our instance segmentation trainer is setup to look for bbox_xyxy, label, and mask. So we should use these keys in the return dict. I'm leaning towards:

task: Literal['object_detection', 'semantic_segmentation', 'instance_segmentation'] = 'semantic_segmentation'`

for the __init__ parameter and default, although those names are quite long.

isaaccorley · 2025-06-16T17:08:56Z

See kornia/kornia@v0.8.1/kornia/constants.py#L142 for a list of valid keys. Our TorchGeo object detection trainer is setup to look for bbox_xyxy and label, and our instance segmentation trainer is setup to look for bbox_xyxy, label, and mask. So we should use these keys in the return dict. I'm leaning towards:
task: Literal['object_detection', 'semantic_segmentation', 'instance_segmentation'] = 'semantic_segmentation'`
for the __init__ parameter and default, although those names are quite long.

They are long but it's likely better to be explicit here tbh. In the future we can add support for multiple output types but let's keep it simple for now. A user can always make separate datasets with different output types and join them.

mayrajeo · 2025-06-17T10:06:44Z

Added tasks, examples here: https://gist.github.com/mayrajeo/596c43c0a0c815e6fc09c1f8e20136cd

Add support for bounding boxes and instance masks to VectorDataset

c50cbc7

github-actions bot added the datasets Geospatial or benchmark datasets label Jun 6, 2025

adamjstewart added this to the 0.8.0 milestone Jun 6, 2025

add task parameter

89f2add

Merge branch 'main' into object-instance-detection

1151372

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add support for bounding boxes and instance masks to VectorDataset #2819

Add support for bounding boxes and instance masks to VectorDataset #2819

mayrajeo commented Jun 6, 2025

Uh oh!

mayrajeo commented Jun 6, 2025

Uh oh!

isaaccorley commented Jun 6, 2025

Uh oh!

adamjstewart commented Jun 6, 2025

Uh oh!

isaaccorley commented Jun 6, 2025 •

edited

Loading

Uh oh!

adamjstewart commented Jun 6, 2025

Uh oh!

mayrajeo commented Jun 9, 2025

Uh oh!

isaaccorley commented Jun 9, 2025

Uh oh!

adamjstewart commented Jun 9, 2025

Uh oh!

mayrajeo commented Jun 10, 2025

Uh oh!

adamjstewart commented Jun 15, 2025

Uh oh!

isaaccorley commented Jun 16, 2025

Uh oh!

mayrajeo commented Jun 17, 2025

Uh oh!

Uh oh!

Add support for bounding boxes and instance masks to VectorDataset #2819

Are you sure you want to change the base?

Add support for bounding boxes and instance masks to VectorDataset #2819

Conversation

mayrajeo commented Jun 6, 2025

Uh oh!

mayrajeo commented Jun 6, 2025

Uh oh!

isaaccorley commented Jun 6, 2025

Uh oh!

adamjstewart commented Jun 6, 2025

Uh oh!

isaaccorley commented Jun 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

adamjstewart commented Jun 6, 2025

Uh oh!

mayrajeo commented Jun 9, 2025

Uh oh!

isaaccorley commented Jun 9, 2025

Uh oh!

adamjstewart commented Jun 9, 2025

Uh oh!

mayrajeo commented Jun 10, 2025

Uh oh!

adamjstewart commented Jun 15, 2025

Uh oh!

isaaccorley commented Jun 16, 2025

Uh oh!

mayrajeo commented Jun 17, 2025

Uh oh!

Uh oh!

isaaccorley commented Jun 6, 2025 •

edited

Loading