Remove pipeline from matcher initialization API #2569

Merged
zhiqiangdon merged 6 commits into autogluon:master from the mm-matcher branch on Dec 17, 2022

Conversation

Contributor

@zhiqiangdon zhiqiangdon commented Dec 14, 2022

Issue #, if available:
Need to align some details between matcher and predictor before fully merging them.

Description of changes:

  1. Remove pipeline from the matcher initialization API.
  2. Remove pipeline from infer_metrics(). We still need to know whether the task is a matching task, but we no longer require the data modalities in matching.
  3. For multiclass and regression problem types, we will use Spearman correlation, as suggested in the papers https://arxiv.org/pdf/1908.10084.pdf, https://arxiv.org/pdf/2010.08240.pdf, and https://arxiv.org/pdf/2004.09813.pdf. This PR adds a metric placeholder for these problem types. Follow-up PRs will add support for the metric in matching.
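
For readers unfamiliar with the metric in item 3: Spearman correlation is the Pearson correlation of the ranks of two sequences, here the predicted similarity scores and the ground-truth labels. The following is a minimal, dependency-free sketch. The function names and the sample numbers are illustrative assumptions, not AutoGluon code, and the metric this PR actually wires in may be computed differently.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = avg_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def spearman(scores: Sequence[float], labels: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on average ranks."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    return pearson(average_ranks(scores), average_ranks(labels))


# Hypothetical similarity scores for four query/response pairs and their labels.
predicted_similarity = [0.10, 0.40, 0.35, 0.80]
gold_labels = [1, 3, 2, 4]
rho = spearman(predicted_similarity, gold_labels)
```

Because only the ordering matters, the metric applies equally to regression-style labels and to ordinal multiclass labels, which is presumably why one placeholder can cover both problem types.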

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@sxjscience
Collaborator

Is it a temporary solution? We should later unify the design of problem_type.

@github-actions

Job PR-2569-c54b2cc is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2569/c54b2cc/index.html

@zhiqiangdon
Contributor Author

> Is it a temporary solution? We should later unify the design of problem_type.

According to our general design discussions, the predictor may need a presets argument to better support various presets. This can also avoid conflicts between some new problem types and the traditional problem types.

@zhiqiangdon zhiqiangdon changed the title from "Replace matcher pipeline with presets" to "Replace pipeline from matcher initialization API" Dec 15, 2022
@zhiqiangdon zhiqiangdon changed the title from "Replace pipeline from matcher initialization API" to "Remove pipeline from matcher initialization API" Dec 15, 2022
@zhiqiangdon
Contributor Author

> Is it a temporary solution? We should later unify the design of problem_type.

PR updated. Now matcher uses only problem_type to specify image_similarity, text_similarity, and image_text_similarity.

@github-actions

Job PR-2569-d423d4b is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2569/d423d4b/index.html

Contributor

@bryanyzhu bryanyzhu left a comment

LGTM

Contributor

@cheungdaven cheungdaven left a comment

lgtm

@@ -501,3 +502,7 @@ def convert_data_for_ranking(
     response_data = pd.DataFrame({response_column: data[response_column].unique().tolist()})

     return data_with_label, query_data, response_data, label_column
+
+
+def is_matching(pipeline: str):
Collaborator

Shall we use the existing problem_property_dict = OrderedDict(...) here? For example:

problem_property_dict[pipeline].is_matching

Contributor Author

There is a circular import issue if we import problem_property_dict from predictor.py. We may refactor this later.
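
To illustrate the trade-off discussed in this thread, here is a minimal sketch of a standalone helper that keeps its own small lookup set, so it does not need to import anything from predictor.py. The diff shows only the function signature, so the body, the constant name MATCHING_PROBLEM_TYPES, and the handling of None are assumptions. The three names in the set are the matching problem types mentioned earlier in this conversation.

```python
from typing import Optional

# Assumed constant: lives next to the helper, so no import from predictor.py is needed.
MATCHING_PROBLEM_TYPES = frozenset(
    {"image_similarity", "text_similarity", "image_text_similarity"}
)


def is_matching(pipeline: Optional[str]) -> bool:
    """Return True if the given name denotes a matching task (hypothetical body)."""
    if pipeline is None:
        return False
    # Compare case-insensitively, mirroring the .lower() normalization in the matcher.
    return pipeline.lower() in MATCHING_PROBLEM_TYPES
```

The cost of this approach is a second source of truth alongside problem_property_dict, which is presumably why a later refactor is suggested.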

-        self._problem_type = problem_type.lower() if problem_type is not None else None
-        self._pipeline = pipeline.lower() if pipeline is not None else None
+        self._problem_type = None  # always infer problem type for matching.
+        self._pipeline = problem_type.lower() if problem_type is not None else None
Collaborator

In the next PR, we can remove pipeline.
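
The effect of the changed lines above can be sketched with a stripped-down stand-in class. MatcherSketch is a hypothetical name and omits every other constructor argument. It only shows how the user-facing problem_type is routed to the internal _pipeline attribute while _problem_type is left to be inferred.

```python
from typing import Optional


class MatcherSketch:
    """Hypothetical stand-in for the matcher; not the actual AutoGluon class."""

    def __init__(self, problem_type: Optional[str] = None):
        # The task-level problem type is always inferred later for matching.
        self._problem_type = None
        # The public problem_type argument now selects the matching pipeline.
        self._pipeline = problem_type.lower() if problem_type is not None else None


matcher = MatcherSketch(problem_type="Image_Text_Similarity")
```

Until the internal pipeline attribute is removed in a follow-up, callers see only problem_type, while the rest of the matcher can keep reading _pipeline unchanged.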

@sxjscience
Collaborator

We can refactor the implementation further after #2578

@github-actions

Job PR-2569-f723802 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2569/f723802/index.html

@zhiqiangdon zhiqiangdon merged commit 6d655cd into autogluon:master Dec 17, 2022
@zhiqiangdon zhiqiangdon deleted the mm-matcher branch December 19, 2022 18:31