AutoMM: Fix crash due to undefined DataContainer #2630

Innixma · 2023-01-03T23:44:27Z

Issue #, if available:

Description of changes:

Fix a crash caused by `DataContainer` being undefined on macOS without a GPU.

Example:

/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/utils/environment.py:96: UserWarning: Only CPU is detected in the instance. This may result in slow speed for MultiModalPredictor. Consider using an instance with GPU support.
  warnings.warn(
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
  | Name              | Type                | Params
----------------------------------------------------------
0 | model             | MultimodalFusionMLP | 109 M 
1 | validation_metric | AUROC               | 0     
2 | loss_func         | CrossEntropyLoss    | 0     
----------------------------------------------------------
109 M     Trainable params
0         Non-trainable params
109 M     Total params
439.137   Total estimated model params size (MB)
/Users/neerick/workspace/virtual/autogluon38/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py:1892: PossibleUserWarning: The number of training batches (1) is smaller than the logging interval Trainer(log_every_n_steps=10). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.
  rank_zero_warn(
Epoch 0, global step 1: 'val_roc_auc' reached 1.00000 (best 1.00000), saving model to '/Users/neerick/workspace/code/autogluon-scratch/scripts/AutogluonModels/ag-20230103_230805/models/MultiModalPredictor/automm_model/epoch=0-step=1.ckpt' as top 3
Epoch 1, global step 2: 'val_roc_auc' reached 1.00000 (best 1.00000), saving model to '/Users/neerick/workspace/code/autogluon-scratch/scripts/AutogluonModels/ag-20230103_230805/models/MultiModalPredictor/automm_model/epoch=1-step=2.ckpt' as top 3
Epoch 2, global step 3: 'val_roc_auc' reached 1.00000 (best 1.00000), saving model to '/Users/neerick/workspace/code/autogluon-scratch/scripts/AutogluonModels/ag-20230103_230805/models/MultiModalPredictor/automm_model/epoch=2-step=3.ckpt' as top 3
Epoch 3, global step 4: 'val_roc_auc' was not in top 3
Epoch 4, global step 5: 'val_roc_auc' was not in top 3
Epoch 5, global step 6: 'val_roc_auc' was not in top 3
Epoch 6, global step 7: 'val_roc_auc' was not in top 3
Epoch 7, global step 8: 'val_roc_auc' was not in top 3
Epoch 8, global step 9: 'val_roc_auc' was not in top 3
Epoch 9, global step 10: 'val_roc_auc' was not in top 3
`Trainer.fit` stopped: `max_epochs=10` reached.
	Warning: Exception caused MultiModalPredictor to fail during training... Skipping this model.
		name 'DataContainer' is not defined
Detailed Traceback:
Traceback (most recent call last):
  File "/Users/neerick/workspace/code/autogluon/core/src/autogluon/core/trainer/abstract_trainer.py", line 1422, in _train_and_save
    model = self._train_single(X, y, model, X_val, y_val, total_resources=total_resources, **model_fit_kwargs)
  File "/Users/neerick/workspace/code/autogluon/core/src/autogluon/core/trainer/abstract_trainer.py", line 1367, in _train_single
    model = model.fit(X=X, y=y, X_val=X_val, y_val=y_val, total_resources=total_resources, **model_fit_kwargs)
  File "/Users/neerick/workspace/code/autogluon/core/src/autogluon/core/models/abstract/abstract_model.py", line 703, in fit
    out = self._fit(**kwargs)
  File "/Users/neerick/workspace/code/autogluon/tabular/src/autogluon/tabular/models/automm/automm_model.py", line 180, in _fit
    self.model.fit(
  File "/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/predictor.py", line 848, in fit
    self._fit(**_fit_args)
  File "/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/predictor.py", line 1517, in _fit
    self._top_k_average(
  File "/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/predictor.py", line 1593, in _top_k_average
    best_score = self.evaluate(val_df, [validation_metric_name])[validation_metric_name]
  File "/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/predictor.py", line 2044, in evaluate
    outputs = predict(
  File "/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/utils/inference.py", line 525, in predict
    outputs = realtime_predict(
  File "/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/utils/inference.py", line 415, in realtime_predict
    output = infer_batch(
  File "/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/utils/inference.py", line 165, in infer_batch
    output = move_to_device(output, device=torch.device("cpu"))
  File "/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/utils/environment.py", line 151, in move_to_device
    res[k] = move_to_device(v, device)
  File "/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/utils/environment.py", line 151, in move_to_device
    res[k] = move_to_device(v, device)
  File "/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/utils/environment.py", line 158, in move_to_device
    elif isinstance(obj, (int, float, str, DataContainer)):
NameError: name 'DataContainer' is not defined
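
The traceback ends in an `isinstance` check that references a name which was never bound in this environment. As a rough illustration only, here is a minimal sketch of how this class of error can arise and one way to guard against it, assuming the name comes from an optional dependency that is imported conditionally. The module name `_hypothetical_optional_dependency` and the helper `passthrough_types` are made up for this example, and the real `environment.py` and the actual change in this PR may look different.

```python
# Sketch only: an optional import that may fail on some machines.
# The module name below is hypothetical and is expected NOT to be installed.
try:
    from _hypothetical_optional_dependency import DataContainer
except ImportError:
    # Bind the name anyway, so later references cannot raise NameError.
    DataContainer = None


def passthrough_types():
    """Types that are returned unchanged instead of being moved to a device."""
    base = (int, float, str)
    if DataContainer is not None:
        return base + (DataContainer,)
    return base


def move_to_device(obj, device):
    """Recursively move nested containers to `device` (simplified stand-in)."""
    if isinstance(obj, dict):
        return {k: move_to_device(v, device) for k, v in obj.items()}
    if isinstance(obj, list):
        return [move_to_device(v, device) for v in obj]
    if isinstance(obj, passthrough_types()):
        return obj
    if hasattr(obj, "to"):
        # Tensor-like objects expose a `.to(device)` method.
        return obj.to(device)
    raise TypeError(f"Unsupported type: {type(obj).__name__}")


# Nested plain values pass through even though the optional import failed.
result = move_to_device({"a": {"b": 1.5, "c": "x"}, "d": [1, 2]}, "cpu")
```

Without the `except` branch binding `DataContainer`, the `isinstance` call would fail with the same `NameError` shown above as soon as a value reached that branch.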

Script:

if __name__ == '__main__':
    from autogluon.tabular import TabularPredictor, TabularDataset
    path_prefix = 'https://autogluon.s3.amazonaws.com/datasets/AdultIncomeBinaryClassification/'
    path_train = path_prefix + 'train_data.csv'
    path_test = path_prefix + 'test_data.csv'

    label = 'class'
    sample = 10  # Number of rows to use to train / infer
    train_data = TabularDataset(path_train)

    if sample is not None and (sample < len(train_data)):
        train_data = train_data.sample(n=sample, random_state=0).reset_index(drop=True)

    test_data = TabularDataset(path_test)
    fit_kwargs = dict(
        train_data=train_data,
        hyperparameters={
            'AG_AUTOMM': {"optimization.max_epochs": 1,},
        },
        # time_limit=120,
        # num_bag_folds=2,
        # num_stack_levels=1,
        # hyperparameter_tune_kwargs='auto',
    )
    predictor = TabularPredictor(
        label=label,
        eval_metric='roc_auc',
        verbosity=2,
    )
    predictor.fit(**fit_kwargs)

    leaderboard = predictor.leaderboard(test_data)

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

zhiqiangdon

LGTM!

github-actions · 2023-01-04T01:27:18Z

Job PR-2630-82b71a9 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2630/82b71a9/index.html

Fix crash due to undefined DataContainer

cf0595d

Innixma added this to the 0.6.2 Release milestone Jan 3, 2023

Innixma requested a review from zhiqiangdon January 3, 2023 23:44

fixed lint error

82b71a9

zhiqiangdon approved these changes Jan 4, 2023

Innixma merged commit b418cab into master Jan 4, 2023

Innixma deleted the fix_missing_datacontainer branch January 18, 2023 18:03
