-
Notifications
You must be signed in to change notification settings - Fork 862
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tabular: Refactor params_aux & memory checks #3033
Conversation
Job PR-3033-06d16a0 is done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Job PR-3033-24b8d59 is done. |
if ag_arg_prefix is not None: | ||
param_aux_keys = list(params_aux.keys()) | ||
for k in param_aux_keys: | ||
if isinstance(k, str) and k.startswith(ag_arg_prefix): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we do a warning if the key is not string? I don't think we ever use non-string keys -> must be coding error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically there is nothing stopping a model from using an integer as a key for a hyperparameter (although I would actively be angry if they did). Maybe the scenario is so absurd that we should disallow it or log a warning until proven otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added:
Warning: Specified LGBModel hyperparameter key is not of type str: 5 (type=<class 'int'>). There might be a bug in your configuration.
Warning: Specified LGBModel hyperparameter key is not of type str: 7.3 (type=<class 'float'>). There might be a bug in your configuration.
Fitting 1 L1 models ...
Fitting model: LightGBM ...
[LightGBM] [Warning] Unknown parameter: 7.3
[LightGBM] [Warning] Unknown parameter: 5
-9.7463 = Validation score (-root_mean_squared_error)
2.13s = Training runtime
0.02s = Validation runtime
param_aux_keys = list(params_aux.keys()) | ||
for k in param_aux_keys: | ||
if isinstance(k, str) and k.startswith(ag_arg_prefix): | ||
k_no_prefix = k[len(ag_arg_prefix):] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we expect just one level of nesting? Would it bring value supporting multi-level nesting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could add in future, but there is nothing implemented that would take advantage of it currently.
for k in param_keys: | ||
if isinstance(k, str) and k.startswith(ag_arg_prefix): | ||
k_no_prefix = k[len(ag_arg_prefix):] | ||
if k_no_prefix in params_aux: | ||
logger.warning(f'Warning: {cls.__name__} hyperparameter "{k}" is present ' | ||
f'in both `ag_args_fit` and `hyperparameters`. ' | ||
f'Will use `hyperparameters` value.') | ||
params_aux[k_no_prefix] = params[k] | ||
params.pop(k) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is effectively a repeating code block as above, just on different nesting level -> do we want to generalize this function to multi-level nested construct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that there are meaningful differences between the code blocks and very different log statements, I think it is ok to keep as is. If you'd like to propose a code refactor of the logic, I'd be happy to review.
logger.warning('\tWarning: Potentially not enough memory to safely train model, roughly requires: %s GB, but only %s GB is available...' % (round(approx_mem_size_req / 1e9, 3), round(available_mem / 1e9, 3))) | ||
elif min_warning_memory_ratio > max_memory_usage_warning_ratio: | ||
log_user_guideline += f'\n\tTo avoid this warning, specify the model hyperparameter "ag.max_memory_usage_ratio" to a larger value ' \ | ||
f'(currently {max_memory_usage_ratio}, set to >={round(min_warning_memory_ratio + 0.05, 2)} to avoid the warning)' \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
round
is not required - formatter supports that out-of-the box:
f'(currently {max_memory_usage_ratio}, set to >={min_warning_memory_ratio + 0.05:.2f} to avoid the warning)'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, updated
log_user_guideline = f'Estimated to require {round(approx_mem_size_req / 1e9, 3)} GB ' \ | ||
f'out of {round(available_mem / 1e9, 3)} GB available memory ({round(min_error_memory_ratio*100, 3)}%)... ' \ | ||
f'({round(max_memory_usage_error_ratio*100, 3)}% of avail memory is the max safe size)' | ||
if min_error_memory_ratio > max_memory_usage_error_ratio: | ||
log_user_guideline += f'\n\tTo force training the model, specify the model hyperparameter "ag.max_memory_usage_ratio" to a larger value ' \ | ||
f'(currently {max_memory_usage_ratio}, set to >={round(min_error_memory_ratio + 0.05, 2)} to avoid the error)' \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
round
is not required - formatter supports that out-of-the box:
f'(currently {max_memory_usage_ratio}, set to >={min_warning_memory_ratio + 0.05:.2f} to avoid the warning)'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, updated
Job PR-3033-8ba915c is done. |
Job PR-3033-6d70b14 is done. |
Issue #, if available:
resolves #3031
Description of changes:
1: Refactor params_aux
ag.
to indicate it as an aux hyperparameter, which is much easier.Mainline:
This PR:
2: Refactor memory checks
Mainline:
This PR:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.