improve COCO prototype #4650

pmeier · 2021-10-19T08:17:29Z

This adds a few improvements:

Add the ability to select which annotations to load, e.g. instances (default) or captions. None is a special value that loads only the images, which is much faster when the annotations are not needed. This is useful, for example, for unsupervised training.
Provide categories for each bounding box.
Add tests.

Add support for segmentation masks. Now you can do something like this:

from torchvision.prototype import datasets
from torchvision.utils import draw_segmentation_masks

for sample in datasets.load("coco"):
    draw_segmentation_masks(sample["image"], sample["segmentations"])
    break
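The annotation selection described above can be pictured with a minimal, self-contained sketch. It does not use torchvision and is not the code from this PR: `iter_samples`, its `annotations` parameter, and the in-memory `IMAGES` / `ANNOTATIONS` data are hypothetical stand-ins, chosen only to show why passing None can skip the annotation lookup entirely.

```python
from typing import Any, Dict, Iterator, List, Optional

# Hypothetical in-memory stand-ins for the image files and annotation files.
IMAGES = ["000001.jpg", "000002.jpg"]
ANNOTATIONS: Dict[str, Dict[str, List[Any]]] = {
    "instances": {"000001.jpg": [{"category": "dog"}], "000002.jpg": []},
    "captions": {"000001.jpg": ["a dog on grass"], "000002.jpg": ["an empty street"]},
}


def iter_samples(annotations: Optional[str] = "instances") -> Iterator[Dict[str, Any]]:
    """Yield one sample per image, optionally joined with one kind of annotation."""
    if annotations is None:
        # Fast path: no annotation lookup at all, only the images are yielded.
        for image in IMAGES:
            yield {"image": image}
        return

    if annotations not in ANNOTATIONS:
        raise ValueError(f"Unknown annotations {annotations!r}")

    per_image = ANNOTATIONS[annotations]
    for image in IMAGES:
        # The sample key matches the selected annotation kind.
        yield {"image": image, annotations: per_image[image]}
```

In this sketch, `list(iter_samples(None))` yields dictionaries with only an `"image"` key, while `iter_samples("captions")` adds a `"captions"` entry per image.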

cc @pmeier @bjuncek

pmeier · 2021-11-16T08:04:56Z

The new implementation is working, but it is a major perf regression. Cold start now takes ~20 minutes on my system. With minimal fake data, warm-up is minimal, so I'm guessing one or more things I used in the implementation do not scale to more inputs.

pmeier · 2021-11-18T15:06:22Z

I've added COCO to the benchmarks. Running

python -m torchvision.prototype.datasets.benchmark -n1 --no-start coco

with the patch described in #4650 (comment) gives:

################################################################################
coco (instances)
legacy iteration 185.052 it/s
new iteration 152.523 it/s
################################################################################
coco (captions)
legacy iteration 240.053 it/s
new iteration 198.828 it/s
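To put the numbers above in relation, here is a small sketch that computes the new throughput as a fraction of the legacy throughput. The figures are copied from the benchmark output; the helper name `relative_throughput` is only illustrative. Both configurations come out at roughly 82 to 83 percent of the legacy speed.

```python
# Throughput in iterations per second, copied from the benchmark output above.
results = {
    "instances": {"legacy": 185.052, "new": 152.523},
    "captions": {"legacy": 240.053, "new": 198.828},
}


def relative_throughput(legacy: float, new: float) -> float:
    """Return the new throughput as a fraction of the legacy throughput."""
    return new / legacy


for name, rates in results.items():
    ratio = relative_throughput(rates["legacy"], rates["new"])
    print(f"coco ({name}): new runs at {ratio:.1%} of legacy speed")
```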

fmassa

LGTM, thanks!

fmassa · 2021-11-30T16:57:33Z

torchvision/prototype/datasets/_builtin/coco.py

+            ),
+            areas=torch.tensor([ann["area"] for ann in anns]),
+            crowds=torch.tensor([ann["iscrowd"] for ann in anns], dtype=torch.bool),
+            bounding_boxes=BoundingBox(


For a future PR, I think it might be preferable to rename the class to BoundingBoxes as we hold more than one box now.

fmassa · 2021-11-30T16:58:32Z

torchvision/prototype/datasets/_builtin/coco.py

+            segmentations=torch.stack(
+                [
+                    self._segmentation_to_mask(ann["segmentation"], is_crowd=ann["iscrowd"], image_size=image_size)
+                    for ann in anns
+                ]
+            ),


Yes, we can even have a custom class that holds the raw polygons and knows how to convert itself if needed. But that can be discussed at a separate stage.
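As a rough illustration of the idea in this comment, the sketch below keeps the raw polygons and only rasterizes them when a mask is requested. Everything here is an assumption for illustration: the class name `PolygonSegmentation`, its fields, and the pure-Python even-odd rasterization are hypothetical and are not what the PR's `_segmentation_to_mask` does. Polygons are assumed to be flat `[x0, y0, x1, y1, ...]` lists.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class PolygonSegmentation:
    """Hypothetical container keeping raw polygons until a mask is requested."""

    # Each polygon is assumed to be a flat [x0, y0, x1, y1, ...] list.
    polygons: Sequence[Sequence[float]]
    image_size: Tuple[int, int]  # (height, width)

    @staticmethod
    def _contains(polygon: Sequence[float], x: float, y: float) -> bool:
        # Even-odd rule: count how many polygon edges a horizontal ray crosses.
        points = list(zip(polygon[0::2], polygon[1::2]))
        inside = False
        for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
            if (y0 > y) != (y1 > y):
                # x coordinate where the edge crosses the horizontal line at y.
                x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
                if x < x_cross:
                    inside = not inside
        return inside

    def to_mask(self) -> List[List[bool]]:
        """Rasterize on demand by testing every pixel center against every polygon."""
        height, width = self.image_size
        return [
            [
                any(self._contains(p, col + 0.5, row + 0.5) for p in self.polygons)
                for col in range(width)
            ]
            for row in range(height)
        ]
```

A real implementation would likely delegate rasterization to an optimized library and return a tensor; the point of the sketch is only that conversion can be deferred until it is needed.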

fmassa · 2021-11-30T16:59:39Z

ONNX-test error seems suspicious, but probably unrelated. I'm merging this, but let's keep an eye on it.

Summary:
* improve COCO prototype
* test 2017 annotations
* add option to include captions
* fix categories and add tests
* cleanup
* add correct image size to bounding boxes
* fix annotation collation
* appease mypy
* add benchmark
* always use image as reference
* another refactor
* add support for segmentations
* add support for segmentations
* fix CI dependencies

Reviewed By: NicolasHug
Differential Revision: D32759200
fbshipit-source-id: 9033e959a1014761541ec2959ec5647eaccf5d0a

improve COCO prototype

664a80f

pmeier added enhancement module: datasets prototype labels Oct 19, 2021

pmeier requested a review from datumbox October 19, 2021 08:17

pytorch-probot bot added the ciflow/default label Oct 19, 2021

facebook-github-bot added the cla signed label Oct 19, 2021

pmeier commented Oct 19, 2021

torchvision/prototype/datasets/_builtin/coco.categories

torchvision/prototype/datasets/_builtin/coco.py

datumbox reviewed Oct 19, 2021

torchvision/prototype/datasets/_builtin/coco.categories

mszhanyi mentioned this pull request Oct 19, 2021

check if installed torch with cuda #4639

Open

pmeier added 3 commits October 19, 2021 14:40

test 2017 annotations

5a852c6

Merge branch 'main' into datasets/improve-coco

66a31df

add option to include captions

9bd8c4b

pmeier marked this pull request as draft October 20, 2021 06:34

pmeier added 4 commits November 15, 2021 10:14

Merge branch 'main' into datasets/improve-coco

d5500fa

fix categories and add tests

e3ee82d

Merge branch 'main' into datasets/improve-coco

2a2349c

cleanup

5567c47

pmeier requested review from datumbox and fmassa November 15, 2021 15:26

pmeier marked this pull request as ready for review November 15, 2021 15:26

add correct image size to bounding boxes

a6b55a3

pmeier requested a review from ejguan November 16, 2021 08:05

ejguan reviewed Nov 16, 2021

torchvision/prototype/datasets/_builtin/coco.py

fix annotation collation

4e89013

ejguan mentioned this pull request Nov 17, 2021

[DataPipe] Grouper causes perf regression pytorch/pytorch#68539

Closed

pmeier added 3 commits November 18, 2021 09:12

Merge branch 'main' into datasets/improve-coco

33fc0a7

appease mypy

12dd776

Merge branch 'main' into datasets/improve-coco

756a7a4

fmassa reviewed Nov 18, 2021

torchvision/prototype/datasets/_builtin/coco.py

add benchmark

0b0f958

pmeier added 8 commits November 23, 2021 10:21

always use image as reference

56f694e

Merge branch 'main' into datasets/improve-coco

77ae2b9

another refactor

5cb4a5d

Merge branch 'main' into datasets/improve-coco

1abbb3c

add support for segmentations

550087e

add support for segmentations

ab54d7f

Merge branch 'main' into datasets/improve-coco

2260a95

fix CI dependencies

a272555

pmeier requested a review from fmassa November 25, 2021 18:36

Merge branch 'main' into datasets/improve-coco

6eb4238

fmassa approved these changes Nov 30, 2021

fmassa merged commit 39cf02a into main Nov 30, 2021

fmassa deleted the datasets/improve-coco branch November 30, 2021 16:59
