[AutoMM] Support using data path in fit() #3006

Merged (4 commits) on Mar 7, 2023

zhiqiangdon
Contributor

Issue #, if available:

Description of changes:
Support passing the path to the training data in fit(), as an alternative to an in-memory DataFrame.

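from autogluon.multimodal import MultiModalPredictor

predictor = MultiModalPredictor(label="label")
predictor.fit(train_data=train_data_path)  # train_data_path: string path to the training data file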
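
For illustration only, here is a minimal sketch of how an argument that may be either a DataFrame or a path could be normalized before training. The helper name `load_data_if_path` is hypothetical and does not come from this PR, and the sketch assumes CSV input; the actual AutoGluon implementation may differ.

```python
from typing import Union

import pandas as pd


def load_data_if_path(data: Union[pd.DataFrame, str]) -> pd.DataFrame:
    """Hypothetical helper: return a DataFrame for either a DataFrame or a file path."""
    if isinstance(data, pd.DataFrame):
        # Already in memory: hand it back unchanged.
        return data
    if isinstance(data, str):
        # Assumption: this sketch only handles CSV files.
        return pd.read_csv(data)
    raise TypeError(f"Expected a DataFrame or a path string, got {type(data).__name__}")
```

With a helper like this, `fit()` can call it once at the top and work with a DataFrame from then on, regardless of what the caller passed.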

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@zhiqiangdon zhiqiangdon requested a review from liangfu March 6, 2023 06:15
@zhiqiangdon zhiqiangdon added the "model list checked" label (You have updated the model list after modifying multimodal unit tests/docs) Mar 6, 2023
Contributor

@suzhoum suzhoum left a comment

LGTM! Great feature!

Contributor

@FANGAreNotGnu FANGAreNotGnu left a comment

LGTM

github-actions bot commented Mar 6, 2023

Job PR-3006-7c81474 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3006/7c81474/index.html

Comment on lines 579 to 586
def split_train_tuning_data(
    train_data: Union[pd.DataFrame, str],
    tuning_data: Optional[Union[pd.DataFrame, str]] = None,
    holdout_frac: Optional[float] = None,
    is_classification: Optional[bool] = False,
    label_column: Optional[str] = None,
    seed: Optional[int] = 123,
):
Contributor

Data loading should not be done during data splitting. This is not the logical place for it.

Contributor Author

I considered loading the data in predictor.py before calling split_train_tuning_data, but I would then need to repeat the same data-loading code in matcher.py, which increases the boilerplate. How about renaming this function to prepare_train_tuning_data, so that it covers more than just splitting the data?

Contributor Author

Considering PR #3004, I moved the data loading outside of split_train_tuning_data.
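
As a rough sketch of that separation (not the code in this PR): the caller resolves a path into a DataFrame first, and the splitter only ever receives DataFrames. The name `fit_sketch`, the 0.2 default holdout fraction, the CSV assumption, and the plain random (non-stratified) split are all assumptions made for this example.

```python
from typing import Optional, Tuple, Union

import pandas as pd


def split_train_tuning_data(
    train_data: pd.DataFrame,
    tuning_data: Optional[pd.DataFrame] = None,
    holdout_frac: float = 0.2,  # assumed default for this sketch
    seed: int = 123,
) -> Tuple[pd.DataFrame, pd.DataFrame]:
    """Simplified splitter: it only accepts DataFrames, never paths."""
    if tuning_data is not None:
        # The user supplied tuning data, so there is nothing to split.
        return train_data, tuning_data
    # Random, non-stratified holdout; requires a unique index for drop() to be exact.
    tuning_data = train_data.sample(frac=holdout_frac, random_state=seed)
    train_data = train_data.drop(tuning_data.index)
    return train_data, tuning_data


def fit_sketch(train_data: Union[pd.DataFrame, str]) -> Tuple[pd.DataFrame, pd.DataFrame]:
    """Hypothetical caller: load first (if given a path), then split."""
    if isinstance(train_data, str):
        train_data = pd.read_csv(train_data)  # assumption: CSV input
    return split_train_tuning_data(train_data)
```

Keeping the loading step in the caller means the splitter has a single responsibility, at the cost of each caller having to do the loading itself.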

Contributor Author

@Innixma Any further questions on this?

Contributor

Since it is moved outside, I am ok to merge.

The ideal long-term solution is to use a mix-in, or to have both predictor.py and matcher.py inherit from the same abstract class. This would avoid the code duplication.
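
To make the mix-in idea concrete, here is a small dependency-free sketch. All class names are hypothetical stand-ins (they are not the real classes in predictor.py or matcher.py), and `csv.DictReader` stands in for a DataFrame reader so the example runs with the standard library only.

```python
import csv
from typing import Dict, List, Union

Rows = List[Dict[str, str]]


class DataLoadingMixin:
    """Hypothetical mix-in: the path-or-data logic lives in exactly one place."""

    def _load_data(self, data: Union[Rows, str]) -> Rows:
        if isinstance(data, str):
            # csv.DictReader stands in for a DataFrame reader in this sketch.
            with open(data, newline="") as f:
                return list(csv.DictReader(f))
        # In-memory data is passed through unchanged.
        return data


class PredictorSketch(DataLoadingMixin):
    """Stand-in for the class defined in predictor.py."""

    def fit(self, train_data: Union[Rows, str]) -> "PredictorSketch":
        self.train_data = self._load_data(train_data)
        return self


class MatcherSketch(DataLoadingMixin):
    """Stand-in for the class defined in matcher.py."""

    def fit(self, train_data: Union[Rows, str]) -> "MatcherSketch":
        self.train_data = self._load_data(train_data)
        return self
```

Both classes get the same loading behavior from the mix-in, so a change to how paths are handled would only need to be made once.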

github-actions bot commented Mar 7, 2023

Job PR-3006-1f12920 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3006/1f12920/index.html

github-actions bot commented Mar 7, 2023

Job PR-3006-13509b2 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3006/13509b2/index.html

Contributor

@Innixma Innixma left a comment

LGTM!

@zhiqiangdon zhiqiangdon merged commit 67e6e83 into autogluon:master Mar 7, 2023
@zhiqiangdon zhiqiangdon deleted the mm-fix branch March 10, 2023 05:31