Enable evaluation with LLMs <7B #3478
…e model weights are re-registered after training. TODO: fix evaluation
for more information, see https://pre-commit.ci
…into zero_copy_load
…into zero_copy_load
…into deepspeed-eval
ludwig/api.py
Outdated
@@ -1567,7 +1568,8 @@ def load(
 # Upgrades deprecated fields and adds new required fields in case the config loaded from disk is old.
 config_obj = ModelConfig.from_dict(config)

-if backend_param is None and "backend" in config:
+# Ensure that the original backend is used if it was specified in the config and user requests it
+if use_backend_from_config or (backend_param is None and "backend" in config):
Hmm, is `use_backend_from_config` needed here? Seems like the same effect can be achieved by letting the `backend` param be `None`. Am I missing something?
Yeah, so the issue comes from the evaluate CLI:
Lines 267 to 274 in adc82cc
args.backend = initialize_backend(args.backend)
if args.backend.is_coordinator():
    print_ludwig("Evaluate", LUDWIG_VERSION)
    logger.info(f"Dataset path: {args.dataset}")
    logger.info(f"Model path: {args.model_path}")
    logger.info("")

evaluate_cli(**vars(args))
The backend is initialized fresh when running `ludwig evaluate`. In doing this, the backend config is entirely ignored. This made it difficult to iterate quickly on this PR (my strategy was to run `ludwig train` once, then run `ludwig evaluate` to debug batch evaluation and prediction).
Okay, going to push a quick change to fix the issue here, by not plumbing through the backend used in the CLI evaluate code path.
LGTM!
No description provided.