[multimodal] Add Foundation Model for Object Detection #3164

FANGAreNotGnu · 2023-04-19T22:10:54Z

Integrate GroundingDINO into AutoGluon to support open-vocabulary detection.
Add an open-vocabulary detection problem type.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

review-notebook-app · 2023-04-21T23:09:42Z

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

zhiqiangdon

Per our offline discussion, we won't integrate the grounding-dino source code into our repo. Instead, users install grounding-dino from source. In addition, we can file two issues in the grounding-dino repo: one requesting a PyPI release, and one requesting support for tokenized tensor input to the model.
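Since grounding-dino is installed separately rather than vendored, the import has to be guarded at runtime (the commit list mentions a try/except block for the import). A minimal sketch of such a guard follows. The helper name `import_optional` and the hint text are hypothetical and not taken from this PR.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, install_hint: str) -> ModuleType:
    """Import an optional dependency, or fail with an actionable message.

    `import_optional` and `install_hint` are hypothetical names used for
    illustration; they are not part of AutoGluon or GroundingDINO.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with a hint so the user knows how to fix the environment.
        raise ImportError(
            f"'{module_name}' is required for this feature but is not installed. "
            f"{install_hint}"
        ) from exc


# Usage with a module that always exists in the standard library.
json_module = import_optional("json", "It ships with Python.")
print(json_module.dumps({"ok": True}))
```

Raising at call time rather than at package import time keeps the rest of the library usable when the optional dependency is absent.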

github-actions · 2023-04-26T03:40:07Z

Job PR-3164-67d55be is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3164/67d55be/index.html

.github/workflow_scripts/env_setup.sh

examples/automm/ovd/ovd_demo.py

multimodal/src/autogluon/multimodal/configs/model/fusion_mlp_image_text_tabular.yaml

...odal/src/autogluon/multimodal/configs/pretrain/ovd/grounding_dino/GroundingDINO_SwinB.cfg.py

multimodal/src/autogluon/multimodal/data/preprocess_dataframe.py

multimodal/src/autogluon/multimodal/optimization/utils.py

multimodal/src/autogluon/multimodal/presets.py

zhiqiangdon · 2023-05-04T00:45:37Z

multimodal/src/autogluon/multimodal/utils/data.py

+        elif per_name == OVD:
+            # create a multimodal processor for OVD.
+            data_processors[OVD].append(
+                create_data_processor(
+                    data_type=OVD,
+                    config=config,
+                    model=per_model,
+                )
+            )
+            if data_types is not None and IMAGE in data_types:
+                data_types.remove(IMAGE)


OVD is not a data type. If image and text are processed independently of each other, we can create an image processor and a text processor for the OVD model.

FANGAreNotGnu · 2023-05-05

Processing of images and bounding boxes will be coupled once we support training. I will refactor out the text processor in the next PR, and will keep the OVD data type until we combine this with traditional detection (mmdet) so that both use the ROIS data type.
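The reviewer's suggestion can be illustrated with a small sketch: instead of one processor keyed by an OVD data type, the model registers one processor per real data type (image and text). All names below are hypothetical stand-ins written in plain Python, not the AutoGluon implementation. Splitting the prompt on periods is an assumption made for illustration.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

IMAGE = "image"
TEXT = "text"


def image_processor(path: str) -> Dict[str, Any]:
    # Hypothetical stand-in: a real processor would load and transform the image.
    return {"image_path": path}


def text_processor(prompt: str) -> Dict[str, Any]:
    # Hypothetical stand-in: split a prompt such as "cat. dog." into phrases.
    phrases = [p.strip() for p in prompt.split(".") if p.strip()]
    return {"phrases": phrases}


def create_processors_for_ovd() -> Dict[str, List[Callable[[Any], Dict[str, Any]]]]:
    """Register one processor per data type instead of a single OVD processor."""
    processors: Dict[str, List[Callable[[Any], Dict[str, Any]]]] = defaultdict(list)
    processors[IMAGE].append(image_processor)
    processors[TEXT].append(text_processor)
    return processors


def process_sample(sample: Dict[str, Any], processors) -> Dict[str, Any]:
    """Run every registered processor on the field matching its data type."""
    out: Dict[str, Any] = {}
    for data_type, procs in processors.items():
        for proc in procs:
            out.update(proc(sample[data_type]))
    return out


result = process_sample({IMAGE: "a.jpg", TEXT: "cat. dog."}, create_processors_for_ovd())
print(result)
```

This layout only works while the two modalities are processed independently. As the reply notes, training would couple image and bounding-box processing, which is the stated reason for keeping the OVD data type for now.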

multimodal/src/autogluon/multimodal/data/process_ovd.py

zhiqiangdon · 2023-05-04T00:59:36Z

multimodal/src/autogluon/multimodal/predictor.py

@@ -2225,6 +2235,12 @@ def predict(
                    detection_classes=self._model.model.CLASSES,
                    result_path=None,
                )
+            elif self._problem_type == OPEN_VOCABULARY_OBJECT_DETECTION:
+                pred = save_ovd_result_df(


Why do we need to save results for OVD? result_path=None means nothing is saved, but the function name is still misleading. By default, we should return the same format as the input.

FANGAreNotGnu · 2023-05-05

We return a dict if as_pandas is False. Here we reuse save_ovd_result_df, which both formats the result as a DataFrame and saves it (if result_path is not None). Later we will add a save-result feature for OVD (using this function) together with evaluation for OVD.

zhiqiangdon · 2023-05-10

save_ovd_result_df seems to always return a pandas DataFrame, even when the input data is a dict.

FANGAreNotGnu · 2023-05-10

Yes, that's also true for object detection. We only save a pandas DataFrame, but we return a list of dicts if the input is not a DataFrame and as_pandas=False.
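The naming concern in this thread comes from one function doing both formatting and saving. A minimal sketch of how the two steps and the return-format rule could be separated is shown below. The function names, record fields, and the JSON output format are hypothetical choices for illustration and do not reflect the actual save_ovd_result_df implementation.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List, Optional


def format_ovd_result(image_paths: List[str], bboxes_per_image: List[Any]) -> List[Dict[str, Any]]:
    """Pair each image path with its predicted boxes (hypothetical record layout)."""
    if len(image_paths) != len(bboxes_per_image):
        raise ValueError("Need exactly one prediction entry per image.")
    return [{"image": p, "bboxes": b} for p, b in zip(image_paths, bboxes_per_image)]


def save_result(records: List[Dict[str, Any]], result_path: Optional[str]) -> Optional[Path]:
    """Write records to disk only when a path is given; None means no saving."""
    if result_path is None:
        return None
    path = Path(result_path)
    path.write_text(json.dumps(records, indent=2))
    return path


def to_output(records: List[Dict[str, Any]], input_is_dataframe: bool, as_pandas: bool):
    """Return a DataFrame if requested or if the input was one, else a list of dicts."""
    if as_pandas or input_is_dataframe:
        import pandas as pd  # imported lazily so the list path needs no pandas

        return pd.DataFrame(records)
    return records


records = format_ovd_result(
    ["a.jpg"], [[{"class": "cat", "bbox": [0, 0, 10, 10], "score": 0.9}]]
)
with tempfile.TemporaryDirectory() as tmp_dir:
    saved = save_result(records, str(Path(tmp_dir) / "result.json"))
    print(saved.name)
print(to_output(records, input_is_dataframe=False, as_pandas=False))
```

Keeping formatting, saving, and output conversion in separate functions makes the "result_path=None means not saving" behavior explicit at the call site.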

github-actions · 2023-05-04T01:06:10Z

Job PR-3164-669fec5 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3164/669fec5/index.html

github-actions · 2023-05-06T02:15:12Z

Job PR-3164-4a02883 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3164/4a02883/index.html

github-actions · 2023-05-08T21:26:20Z

Job PR-3164-ac5478b is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3164/ac5478b/index.html

github-actions · 2023-05-08T21:31:11Z

Job PR-3164-9bdb722 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3164/9bdb722/index.html

multimodal/src/autogluon/multimodal/utils/ovd.py

multimodal/src/autogluon/multimodal/data/process_ovd.py

multimodal/src/autogluon/multimodal/data/utils.py

multimodal/src/autogluon/multimodal/utils/metric.py

github-actions · 2023-05-10T23:15:34Z

Job PR-3164-6f247cd is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3164/6f247cd/index.html

docs/tutorials/multimodal/object_detection/quick_start/quick_start_ovd.ipynb

github-actions · 2023-05-12T01:54:16Z

Job PR-3164-d8246ff is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3164/d8246ff/index.html

github-actions · 2023-05-12T01:59:48Z

Job PR-3164-3a09673 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3164/3a09673/index.html

github-actions · 2023-05-12T22:49:33Z

Job PR-3164-a7a9891 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3164/a7a9891/index.html

github-actions · 2023-05-15T10:04:09Z

Job PR-3164-0221d57 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3164/0221d57/index.html

zhiqiangdon

LGTM. Consider adding an OVD unit test later.

FANGAreNotGnu added 4 commits April 19, 2023 22:09

init code structure

7b2113c

batch inference ready

6f2f40c

ovd inference done.

5fd3f31

fix bbox format

fb08ac9

fix picker

95de36d

FANGAreNotGnu added and removed the "model list checked" label (You have updated the model list after modifying multimodal unit tests/docs) Apr 21, 2023

FANGAreNotGnu added 6 commits April 21, 2023 23:46

black

584be5b

black

5c2775c

fix description fr util function

88c4eb7

lint

72517b7

lint and doc index

d56f76a

Merge https://github.com/autogluon/autogluon into zeroshot_detection

b8e92e6

zhiqiangdon reviewed Apr 24, 2023


FANGAreNotGnu added 2 commits April 25, 2023 20:49

remove grounding dino source code

42a64bb

fix to import based on source code structure

cec2f93

FANGAreNotGnu added the "model list checked" label (You have updated the model list after modifying multimodal unit tests/docs) Apr 25, 2023

FANGAreNotGnu added 4 commits April 25, 2023 21:45

Merge https://github.com/autogluon/autogluon into zeroshot_detection

cebede8

remove import in setup.py

7652cb5

add setup in workflow, add try except block for import

b5f730f

fix typo

67d55be

FANGAreNotGnu added 7 commits April 26, 2023 20:19

update doc and index

e569a52

update numgpu=1 in preset for ovd

5efd05e

update doc

8ec5f42

update ovd swinb config

ef2a4be

refine code

0c159a6

Merge https://github.com/autogluon/autogluon into zeroshot_detection

b919011

fix index

dcccae5

FANGAreNotGnu requested a review from zhiqiangdon May 3, 2023 22:28

zhiqiangdon reviewed May 4, 2023


resolve comments

4a02883

FANGAreNotGnu added 2 commits May 8, 2023 18:56

edit the import for ovd doc

ac5478b

Merge https://github.com/autogluon/autogluon into zeroshot_detection

9bdb722

FANGAreNotGnu requested a review from zhiqiangdon May 8, 2023 18:58

zhiqiangdon reviewed May 10, 2023

multimodal/src/autogluon/multimodal/utils/ovd.py (outdated, resolved)

zhiqiangdon reviewed May 10, 2023

multimodal/src/autogluon/multimodal/data/process_ovd.py (outdated, resolved)

minor fix

6f247cd

FANGAreNotGnu requested a review from zhiqiangdon May 10, 2023 21:21

zhiqiangdon reviewed May 10, 2023

multimodal/src/autogluon/multimodal/data/utils.py (outdated, resolved)

zhiqiangdon reviewed May 10, 2023

multimodal/src/autogluon/multimodal/utils/metric.py (outdated, resolved)

zhiqiangdon reviewed May 11, 2023

docs/tutorials/multimodal/object_detection/quick_start/quick_start_ovd.ipynb (outdated, resolved)

zhiqiangdon reviewed May 11, 2023

docs/tutorials/multimodal/object_detection/quick_start/quick_start_ovd.ipynb (resolved)

FANGAreNotGnu added 2 commits May 11, 2023 23:15

fix tutorial and example

d8246ff

change if or to if in list

3a09673

remove is_image_input

a7a9891

remove debugging code

0221d57

FANGAreNotGnu requested a review from zhiqiangdon May 15, 2023 18:09

zhiqiangdon approved these changes May 15, 2023


zhiqiangdon merged commit ee0967d into autogluon:master May 15, 2023
29 checks passed
