AutoMM: Fix crash due to undefined DataContainer #2630

Merged (2 commits) on Jan 4, 2023

@Innixma Innixma commented Jan 3, 2023

Issue #, if available:

Description of changes:

  • Fix crash due to undefined DataContainer on macOS without a GPU

Example:

/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/utils/environment.py:96: UserWarning: Only CPU is detected in the instance. This may result in slow speed for MultiModalPredictor. Consider using an instance with GPU support.
  warnings.warn(
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
  | Name              | Type                | Params
----------------------------------------------------------
0 | model             | MultimodalFusionMLP | 109 M 
1 | validation_metric | AUROC               | 0     
2 | loss_func         | CrossEntropyLoss    | 0     
----------------------------------------------------------
109 M     Trainable params
0         Non-trainable params
109 M     Total params
439.137   Total estimated model params size (MB)
/Users/neerick/workspace/virtual/autogluon38/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py:1892: PossibleUserWarning: The number of training batches (1) is smaller than the logging interval Trainer(log_every_n_steps=10). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.
  rank_zero_warn(
Epoch 0, global step 1: 'val_roc_auc' reached 1.00000 (best 1.00000), saving model to '/Users/neerick/workspace/code/autogluon-scratch/scripts/AutogluonModels/ag-20230103_230805/models/MultiModalPredictor/automm_model/epoch=0-step=1.ckpt' as top 3
Epoch 1, global step 2: 'val_roc_auc' reached 1.00000 (best 1.00000), saving model to '/Users/neerick/workspace/code/autogluon-scratch/scripts/AutogluonModels/ag-20230103_230805/models/MultiModalPredictor/automm_model/epoch=1-step=2.ckpt' as top 3
Epoch 2, global step 3: 'val_roc_auc' reached 1.00000 (best 1.00000), saving model to '/Users/neerick/workspace/code/autogluon-scratch/scripts/AutogluonModels/ag-20230103_230805/models/MultiModalPredictor/automm_model/epoch=2-step=3.ckpt' as top 3
Epoch 3, global step 4: 'val_roc_auc' was not in top 3
Epoch 4, global step 5: 'val_roc_auc' was not in top 3
Epoch 5, global step 6: 'val_roc_auc' was not in top 3
Epoch 6, global step 7: 'val_roc_auc' was not in top 3
Epoch 7, global step 8: 'val_roc_auc' was not in top 3
Epoch 8, global step 9: 'val_roc_auc' was not in top 3
Epoch 9, global step 10: 'val_roc_auc' was not in top 3
`Trainer.fit` stopped: `max_epochs=10` reached.
	Warning: Exception caused MultiModalPredictor to fail during training... Skipping this model.
		name 'DataContainer' is not defined
Detailed Traceback:
Traceback (most recent call last):
  File "/Users/neerick/workspace/code/autogluon/core/src/autogluon/core/trainer/abstract_trainer.py", line 1422, in _train_and_save
    model = self._train_single(X, y, model, X_val, y_val, total_resources=total_resources, **model_fit_kwargs)
  File "/Users/neerick/workspace/code/autogluon/core/src/autogluon/core/trainer/abstract_trainer.py", line 1367, in _train_single
    model = model.fit(X=X, y=y, X_val=X_val, y_val=y_val, total_resources=total_resources, **model_fit_kwargs)
  File "/Users/neerick/workspace/code/autogluon/core/src/autogluon/core/models/abstract/abstract_model.py", line 703, in fit
    out = self._fit(**kwargs)
  File "/Users/neerick/workspace/code/autogluon/tabular/src/autogluon/tabular/models/automm/automm_model.py", line 180, in _fit
    self.model.fit(
  File "/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/predictor.py", line 848, in fit
    self._fit(**_fit_args)
  File "/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/predictor.py", line 1517, in _fit
    self._top_k_average(
  File "/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/predictor.py", line 1593, in _top_k_average
    best_score = self.evaluate(val_df, [validation_metric_name])[validation_metric_name]
  File "/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/predictor.py", line 2044, in evaluate
    outputs = predict(
  File "/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/utils/inference.py", line 525, in predict
    outputs = realtime_predict(
  File "/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/utils/inference.py", line 415, in realtime_predict
    output = infer_batch(
  File "/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/utils/inference.py", line 165, in infer_batch
    output = move_to_device(output, device=torch.device("cpu"))
  File "/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/utils/environment.py", line 151, in move_to_device
    res[k] = move_to_device(v, device)
  File "/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/utils/environment.py", line 151, in move_to_device
    res[k] = move_to_device(v, device)
  File "/Users/neerick/workspace/code/autogluon/multimodal/src/autogluon/multimodal/utils/environment.py", line 158, in move_to_device
    elif isinstance(obj, (int, float, str, DataContainer)):
NameError: name 'DataContainer' is not defined
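
Note on the traceback: the NameError is raised at the `isinstance` check in `move_to_device`, which suggests that `DataContainer` was never bound in this environment. Because Python resolves names only when a line executes, the failure shows up after training, on the first evaluation call. Below is a minimal sketch of one way to guard against this, assuming the class comes from an optional dependency. The module name `optional_backend` is a placeholder, the `.to()` check is a duck-typed stand-in for tensors, and this is not necessarily the change made in this PR.

```python
# Hypothetical sketch; not the code from this PR.
try:
    # `optional_backend` is a placeholder for whichever optional package
    # would provide DataContainer in a full install.
    from optional_backend import DataContainer
except ImportError:
    # Keep the name bound so later references cannot raise NameError.
    DataContainer = None

# Leaf types returned unchanged; DataContainer is included only if it imported.
_PASSTHROUGH = (int, float, str) + (() if DataContainer is None else (DataContainer,))


def move_to_device(obj, device):
    """Recursively move tensor-like leaves of a nested structure to `device`."""
    if callable(getattr(obj, "to", None)):
        # Duck-typed stand-in for a tensor's `.to(device)` method.
        return obj.to(device)
    if isinstance(obj, dict):
        return {k: move_to_device(v, device) for k, v in obj.items()}
    if isinstance(obj, list):
        return [move_to_device(v, device) for v in obj]
    if isinstance(obj, tuple):
        return tuple(move_to_device(v, device) for v in obj)
    if isinstance(obj, _PASSTHROUGH):
        return obj
    raise TypeError(f"Unsupported type for move_to_device: {type(obj).__name__}")
```

With a guard like this, an environment that lacks the optional package still passes plain Python values through unchanged instead of failing at the `isinstance` check.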

Script:

if __name__ == '__main__':
    from autogluon.tabular import TabularPredictor, TabularDataset
    path_prefix = 'https://autogluon.s3.amazonaws.com/datasets/AdultIncomeBinaryClassification/'
    path_train = path_prefix + 'train_data.csv'
    path_test = path_prefix + 'test_data.csv'

    label = 'class'
    sample = 10  # Number of rows to use to train / infer
    train_data = TabularDataset(path_train)

    if sample is not None and (sample < len(train_data)):
        train_data = train_data.sample(n=sample, random_state=0).reset_index(drop=True)

    test_data = TabularDataset(path_test)
    fit_kwargs = dict(
        train_data=train_data,
        hyperparameters={
            'AG_AUTOMM': {"optimization.max_epochs": 1,},
        },
        # time_limit=120,
        # num_bag_folds=2,
        # num_stack_levels=1,
        # hyperparameter_tune_kwargs='auto',
    )
    predictor = TabularPredictor(
        label=label,
        eval_metric='roc_auc',
        verbosity=2,
    )
    predictor.fit(**fit_kwargs)

    leaderboard = predictor.leaderboard(test_data)

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@Innixma Innixma added this to the 0.6.2 Release milestone Jan 3, 2023

@zhiqiangdon zhiqiangdon left a comment

LGTM!

github-actions bot commented Jan 4, 2023

Job PR-2630-82b71a9 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2630/82b71a9/index.html

@Innixma Innixma merged commit b418cab into master Jan 4, 2023
@Innixma Innixma deleted the fix_missing_datacontainer branch January 18, 2023 18:03