[Refactor][Bug fix] Fix fit_summary + move problem_type outside predictor.py #2578
Conversation
```python
name: str                      # Name of the problem
support_fit: bool = True       # Whether the problem type supports `.fit()`
inference_ready: bool = False  # Supports `.predict()` and `.evaluate()` without calling `.fit()`
```
I think inference_ready is better kept as a state of the predictor.
This indicates whether the problem supports zero-shot inference.
How about using the name `support_zero_shot`? I think we may need `inference_ready` in the predictor in later refactoring.
Sounds good. support_zero_shot is reasonable.
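For illustration, the agreed-upon flag could sit in a small dataclass like this (a minimal sketch based on the reviewed snippet; the actual definition in the PR may differ):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProblemTypeProperty:
    """Properties of a problem type (sketch based on the reviewed snippet)."""
    name: str                        # Name of the problem
    support_fit: bool = True         # Whether the problem type supports `.fit()`
    support_zero_shot: bool = False  # Supports `.predict()`/`.evaluate()` without `.fit()`

# A zero-shot-capable problem type, e.g. text matching.
text_similarity = ProblemTypeProperty(name="text_similarity", support_zero_shot=True)
```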
Job PR-2578-27d2029 is done.
```
Usually you do not need to preprocess the text / image data. AutoGluon Multimodal has built-in
support of text / image preprocessing. However, this won't block you from appending custom
preprocessing logic before feeding in the dataframe to AutoGluon Multimodal.
```
> However, this won't block you from appending custom preprocessing logic before feeding in the dataframe to AutoGluon Multimodal.

Just want to confirm: if the user has already preprocessed the image (e.g., subtracted 255, divided by a customized mean, or applied data augmentation), will AutoGluon do another round of preprocessing and data augmentation?
We will do another round of normalization, as in:

```python
processor.append(self.normalization)
```
I can add a flag to disable normalization in a later PR. I think this is a good point.
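A possible shape for such a flag (purely hypothetical; `build_image_processors`, `skip_normalization`, and the transform placeholder are illustrative names, not the PR's actual API):

```python
def build_image_processors(normalization, skip_normalization=False):
    """Assemble image preprocessing steps; optionally skip built-in normalization."""
    processors = []
    # ... other preprocessing/augmentation steps would be appended here ...
    if not skip_normalization:
        processors.append(normalization)
    return processors

# With the flag set, images the user already normalized pass through untouched.
procs = build_image_processors(normalization="normalize-transform", skip_normalization=True)
```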
Thanks for the refactor, looks good to me, with one question.
```python
from typing import OrderedDict as t_OrderedDict
from typing import Union
```
Are these two lines unused?
`t_OrderedDict` is used for a type hint. Will remove `Union`.
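For context, `typing.OrderedDict` is typically aliased like this so it does not shadow the runtime `collections.OrderedDict` class (a small illustrative sketch, not the PR's code; `format_metrics` is a made-up name):

```python
from collections import OrderedDict
# Alias the typing variant so it does not shadow the runtime class.
from typing import OrderedDict as t_OrderedDict

def format_metrics(metrics: t_OrderedDict[str, float]) -> str:
    # Join metrics in insertion order, e.g. for a summary printout.
    return ", ".join(f"{k}={v:.3f}" for k, v in metrics.items())

m = OrderedDict([("acc", 0.91), ("f1", 0.88)])
# format_metrics(m) -> "acc=0.910, f1=0.880"
```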
```diff
@@ -1152,7 +1119,7 @@ def _fit(
         )
         config = update_config_by_rules(
-            problem_type=self._problem_type,
+            problem_type=self.problem_type,
```
I see all `self._problem_type` are replaced by `self.problem_type`. Mind telling the reasons?
Just to access it via the property call. Both self._problem_type and self.problem_type should be okay.
Right. Currently, `self.problem_type` has the same value as `self._problem_type`. But since the property is more user-facing, there may be cases where they are not exactly the same. I think using `self._problem_type` internally may be a better choice to avoid possible changes.
I can also change all `self.problem_type` to `self._problem_type` so that it will be consistent.
Sounds good. Thanks.
One advantage of `self.problem_type` over `self._problem_type` is that you can never write `self.problem_type = XXX`, which would accidentally overwrite the value.
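That write-protection can be seen in a minimal sketch (class and attribute names mirror the discussion but are illustrative, not the actual predictor code):

```python
class Predictor:
    def __init__(self, problem_type):
        self._problem_type = problem_type  # internal, writable attribute

    @property
    def problem_type(self):
        # Read-only view: no setter is defined, so assignment raises AttributeError.
        return self._problem_type

p = Predictor("multiclass")
assert p.problem_type == "multiclass"  # reading via the property works

try:
    p.problem_type = "binary"  # no setter -> AttributeError
    overwritten = True
except AttributeError:
    overwritten = False
```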
Job PR-2578-88c338f is done.
```python
support_zero_shot=True,
is_matching=True,
supported_modality_type={TEXT},
supported_label_type={CATEGORICAL, NUMERICAL},
```
For matching problems, labels can be absent. Will it support this case?
For those cases, we can treat the label type as BINARY. But we need to submit a follow-up PR to refactor the codebase to make better use of these flags.
The logic of no label is different from a binary label. We'd better distinguish between them.
Actually, there are multiple concepts: modality type, column type, and problem type. This is the first PR to split the problem type into a separate file. We need to submit follow-up PRs.
Matching itself also needs a major refactor, so I haven't touched it extensively.
```python
IMAGE_SIMILARITY,
IMAGE_TEXT_SIMILARITY,
MULTICLASS,
NAMED_ENTITY_RECOGNITION,
```
I do not recall using this constant: `NAMED_ENTITY_RECOGNITION`. Shall we use `NER` only?
Both `NER` and `NAMED_ENTITY_RECOGNITION` will be mapped to `NER`.
You can view `NAMED_ENTITY_RECOGNITION` as an alias of the `NER` problem type.
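The alias behavior described could be implemented with a small lookup table like this (hypothetical sketch; `PROBLEM_TYPE_ALIASES` and `normalize_problem_type` are illustrative names, not the PR's code):

```python
NER = "ner"
NAMED_ENTITY_RECOGNITION = "named_entity_recognition"

# Hypothetical alias table: both spellings resolve to the same problem type.
PROBLEM_TYPE_ALIASES = {
    NER: NER,
    NAMED_ENTITY_RECOGNITION: NER,
}

def normalize_problem_type(problem_type):
    """Map a user-supplied problem-type string to its canonical name."""
    key = problem_type.lower()
    return PROBLEM_TYPE_ALIASES.get(key, key)
```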
LGTM!
LGTM
Job PR-2578-69f61a0 is done.
Issue #, if available:
Description of changes:
`predictor.fit_summary()` won't work after you have loaded the predictor from a folder.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.