
Tabular: Refactor params_aux & memory checks #3033

Merged: 5 commits merged into autogluon:master on Mar 16, 2023

Conversation

@Innixma (Contributor) commented Mar 12, 2023

Issue #, if available:

resolves #3031

Description of changes:

1: Refactor params_aux

  • Previously, it was confusing how to specify a model's aux hyperparameters.
  • Now we simply prefix the hyperparameter with ag. to mark it as an aux hyperparameter, which is much easier.
  • Prior API functionality remains intact; this is purely an additional way to pass the arguments.
  • Added unit tests for this logic.
  • I've added a TODO to expand on this long term for improved ease of use and extensibility by enabling aux hyperparameters to be part of HPO search spaces. This should be available by the v1.0 release, but is too complicated to include in this PR.

Mainline:

hyperparameters = {
    'GBM': {'ag_args_fit': {'max_memory_usage_ratio': 2.5}},
}

This PR:

hyperparameters = {
    'GBM': {'ag.max_memory_usage_ratio': 2.5}
}
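
For context, a minimal sketch of the new form used end-to-end with TabularPredictor (the dataset path and label column below are placeholders, and learning_rate is just an example of a regular model hyperparameter):

from autogluon.tabular import TabularPredictor

# 'ag.'-prefixed keys are treated as aux hyperparameters; plain keys are
# passed to the underlying model (here, LightGBM) as usual.
hyperparameters = {
    'GBM': {
        'ag.max_memory_usage_ratio': 2.5,  # aux: relax the memory-safety check
        'learning_rate': 0.05,             # regular LightGBM hyperparameter
    },
}
predictor = TabularPredictor(label='target').fit('train.csv', hyperparameters=hyperparameters)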

2: Refactor memory checks

  • Previously, it was unclear what users should do when they ran into memory check errors.
  • This has been greatly expanded on; the logs now provide ample information.
  • Additionally, the logic has been refactored and standardized to remove code duplication.

Mainline:

Fitting model: LightGBM ...
	Warning: Not enough memory to safely train model, roughly requires: 10.0 GB, but only 6.517 GB is available...
	Not enough memory to train LightGBM... Skipping this model.

This PR:

Fitting model: LightGBM ...
	Warning: Not enough memory to safely train model. Estimated to require 10.0 GB out of 8.273 GB available memory (134.312%)... (90.0% of avail memory is the max safe size)
	To force training the model, specify the model hyperparameter "ag.max_memory_usage_ratio" to a larger value (currently 1.0, set to >=1.39 to avoid the error)
		To set the same value for all models, do the following when calling predictor.fit: `predictor.fit(..., ag_args_fit={"ag.max_memory_usage_ratio": VALUE})`
		Setting "ag.max_memory_usage_ratio" to values above 1 may result in out-of-memory errors. You may consider using a machine with more memory as a safer alternative.
	Not enough memory to train LightGBM... Skipping this model.
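
As the new log message itself suggests, the threshold can also be raised for all models at once. A minimal sketch (the train data and label column are placeholders; values above 1.0 trade memory safety for the chance to fit larger models):

from autogluon.tabular import TabularPredictor

predictor = TabularPredictor(label='target').fit(
    'train.csv',
    ag_args_fit={'ag.max_memory_usage_ratio': 1.5},  # applies to every model
)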

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@Innixma added labels on Mar 12, 2023: API & Doc (Improvements or additions to documentation), enhancement (New feature or request), module: tabular, priority: 1 (High priority)
@Innixma added this to the 0.7.1 Release milestone on Mar 12, 2023
@github-actions commented:

Job PR-3033-06d16a0 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3033/06d16a0/index.html

@yinweisu (Collaborator) left a comment:

LGTM

@github-actions commented:

Job PR-3033-24b8d59 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3033/24b8d59/index.html

if ag_arg_prefix is not None:
    param_aux_keys = list(params_aux.keys())
    for k in param_aux_keys:
        if isinstance(k, str) and k.startswith(ag_arg_prefix):
A reviewer (Contributor) commented:

Should we log a warning if the key is not a string? I don't think we ever use non-string keys, so a non-string key must be a coding error.

@Innixma (Contributor, Author) replied:

Technically there is nothing stopping a model from using an integer as a key for a hyperparameter (although I would be actively angry if one did). Maybe the scenario is so absurd that we should disallow it, or log a warning until proven otherwise.

@Innixma (Contributor, Author) replied:

Added:

Warning: Specified LGBModel hyperparameter key is not of type str: 5 (type=<class 'int'>). There might be a bug in your configuration.
Warning: Specified LGBModel hyperparameter key is not of type str: 7.3 (type=<class 'float'>). There might be a bug in your configuration.
Fitting 1 L1 models ...
Fitting model: LightGBM ...
[LightGBM] [Warning] Unknown parameter: 7.3
[LightGBM] [Warning] Unknown parameter: 5
	-9.7463	 = Validation score   (-root_mean_squared_error)
	2.13s	 = Training   runtime
	0.02s	 = Validation runtime
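
For reference, a minimal sketch of the kind of key-type check that could emit the warnings above (the surrounding names, such as cls and logger, are assumed from the model code rather than taken from this PR):

import logging

logger = logging.getLogger(__name__)

def warn_on_non_str_keys(cls, params: dict) -> None:
    # Hyperparameter keys are expected to be strings; anything else is
    # almost certainly a typo or coding error in the configuration.
    for k in params:
        if not isinstance(k, str):
            logger.warning(f'Warning: Specified {cls.__name__} hyperparameter key is not of type str: '
                           f'{k} (type={type(k)}). There might be a bug in your configuration.')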

param_aux_keys = list(params_aux.keys())
for k in param_aux_keys:
    if isinstance(k, str) and k.startswith(ag_arg_prefix):
        k_no_prefix = k[len(ag_arg_prefix):]
A reviewer (Contributor) commented:

Do we expect just one level of nesting? Would it bring value to support multi-level nesting?

@Innixma (Contributor, Author) replied:

We could add it in the future, but nothing currently implemented would take advantage of it.
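
To make the single-level behavior concrete, a standalone illustration of what the prefix-stripping above does (the keys and values are invented for the example):

ag_arg_prefix = 'ag.'
params_aux = {'ag.max_memory_usage_ratio': 1.5, 'max_rows': 10000}

# Only string keys carrying the prefix are rewritten; one level deep, no recursion.
stripped = {(k[len(ag_arg_prefix):] if isinstance(k, str) and k.startswith(ag_arg_prefix) else k): v
            for k, v in params_aux.items()}
print(stripped)  # {'max_memory_usage_ratio': 1.5, 'max_rows': 10000}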

Comment on lines 201 to 209
for k in param_keys:
    if isinstance(k, str) and k.startswith(ag_arg_prefix):
        k_no_prefix = k[len(ag_arg_prefix):]
        if k_no_prefix in params_aux:
            logger.warning(f'Warning: {cls.__name__} hyperparameter "{k}" is present '
                           f'in both `ag_args_fit` and `hyperparameters`. '
                           f'Will use `hyperparameters` value.')
        params_aux[k_no_prefix] = params[k]
        params.pop(k)
A reviewer (Contributor) commented:

This is effectively the same code block as above, just at a different nesting level. Do we want to generalize this function to a multi-level nested construct?

@Innixma (Contributor, Author) replied:

Given that there are meaningful differences between the code blocks and very different log statements, I think it is OK to keep as is. If you'd like to propose a refactor of the logic, I'd be happy to review it.
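
To make the conflict case this warning guards against concrete, a hypothetical configuration that would trigger it (values invented):

hyperparameters = {
    'GBM': {
        'ag_args_fit': {'max_memory_usage_ratio': 2.0},
        'ag.max_memory_usage_ratio': 2.5,  # prefixed value overwrites; a warning is logged
    },
}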

    logger.warning('\tWarning: Potentially not enough memory to safely train model, roughly requires: %s GB, but only %s GB is available...' % (round(approx_mem_size_req / 1e9, 3), round(available_mem / 1e9, 3)))
elif min_warning_memory_ratio > max_memory_usage_warning_ratio:
    log_user_guideline += f'\n\tTo avoid this warning, specify the model hyperparameter "ag.max_memory_usage_ratio" to a larger value ' \
                          f'(currently {max_memory_usage_ratio}, set to >={round(min_warning_memory_ratio + 0.05, 2)} to avoid the warning)' \
A reviewer (Contributor) commented:

round is not required; the formatter supports this out of the box:

f'(currently {max_memory_usage_ratio}, set to >={min_warning_memory_ratio + 0.05:.2f} to avoid the warning)'

@Innixma (Contributor, Author) replied:

Good catch, updated.
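
A quick standalone check of the reviewer's point (note that the :.2f format spec also pads trailing zeros, which round() followed by default printing does not):

ratio = 1.3456
print(f'set to >={round(ratio + 0.05, 2)}')  # set to >=1.4
print(f'set to >={ratio + 0.05:.2f}')        # set to >=1.40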

Comment on lines 1550 to 1555
log_user_guideline = f'Estimated to require {round(approx_mem_size_req / 1e9, 3)} GB ' \
                     f'out of {round(available_mem / 1e9, 3)} GB available memory ({round(min_error_memory_ratio*100, 3)}%)... ' \
                     f'({round(max_memory_usage_error_ratio*100, 3)}% of avail memory is the max safe size)'
if min_error_memory_ratio > max_memory_usage_error_ratio:
    log_user_guideline += f'\n\tTo force training the model, specify the model hyperparameter "ag.max_memory_usage_ratio" to a larger value ' \
                          f'(currently {max_memory_usage_ratio}, set to >={round(min_error_memory_ratio + 0.05, 2)} to avoid the error)' \
A reviewer (Contributor) commented:

Same here: round is not required; the formatter supports this out of the box:

f'(currently {max_memory_usage_ratio}, set to >={min_error_memory_ratio + 0.05:.2f} to avoid the error)'

@Innixma (Contributor, Author) replied:

Good catch, updated.

@github-actions commented:

Job PR-3033-8ba915c is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3033/8ba915c/index.html

@github-actions commented:

Job PR-3033-6d70b14 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3033/6d70b14/index.html

@Innixma merged commit 7d0772c into autogluon:master on Mar 16, 2023
@Innixma changed the milestone from 0.7.1 Release to 0.8 Release on May 16, 2023
Linked issue resolved by this PR: Tabular: Clarify memory requirements when memory is insufficient to fit (#3031)