
Enable evaluation with LLMs <7B #3478

Merged · 47 commits from deepspeed-eval into master on Jul 29, 2023
Conversation

geoffreyangus (Collaborator):

No description provided.

arnavgarg1 and others added 29 commits July 14, 2023 14:35
…e model weights are re-registered after training. TODO: fix evaluation
github-actions bot commented Jul 25, 2023:

Unit Test Results

 6 files   +1        6 suites   +1       58m 23s ⏱️ −20m 15s
34 tests   −2,740   29 ✔️ −2,727    5 💤 −7    0 ❌ −6
88 runs    −2,721   72 ✔️ −2,713   16 💤 −2    0 ❌ −6

Results for commit 3f0bc9f, compared against base commit 539e8e0.

♻️ This comment has been updated with latest results.

ludwig/api.py (outdated)

```diff
@@ -1567,7 +1568,8 @@ def load(
         # Upgrades deprecated fields and adds new required fields in case the config loaded from disk is old.
         config_obj = ModelConfig.from_dict(config)

-        if backend_param is None and "backend" in config:
+        # Ensure that the original backend is used if it was specified in the config and user requests it
+        if use_backend_from_config or (backend_param is None and "backend" in config):
```
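For context, a minimal sketch of the backend-resolution logic this diff touches (illustrative only: `resolve_backend` is a hypothetical helper, the real logic lives inline in `LudwigModel.load()`; only `initialize_backend` and the condition itself come from the source):

```python
from ludwig.backend import initialize_backend

def resolve_backend(backend_param, config, use_backend_from_config=False):
    # Hypothetical helper distilled from the diff above.
    if use_backend_from_config or (backend_param is None and "backend" in config):
        # Re-create the backend recorded in the saved model config
        # (e.g. a ray backend used at training time).
        return initialize_backend(config["backend"])
    # Otherwise honor whatever the caller passed in; None falls back to
    # the default local backend.
    return initialize_backend(backend_param)
```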
Collaborator:
Hmm, is use_backend_from_config needed here? Seems like the same effect can be achieved by letting backend param be None. Am I missing something?

geoffreyangus (Collaborator, Author) commented Jul 27, 2023:
Yeah, so the issue comes from the evaluate CLI:

ludwig/evaluate.py, lines 267 to 274 in adc82cc:

```python
args.backend = initialize_backend(args.backend)
if args.backend.is_coordinator():
    print_ludwig("Evaluate", LUDWIG_VERSION)
    logger.info(f"Dataset path: {args.dataset}")
    logger.info(f"Model path: {args.model_path}")
    logger.info("")
evaluate_cli(**vars(args))
```

The backend is initialized fresh when running ludwig evaluate. In doing this, the backend config is entirely ignored. This made it difficult to iterate quickly on this PR (my strategy was to run ludwig train once, then run ludwig evaluate to debug batch evaluation and prediction).
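To make the failure mode concrete, here's a rough sketch of that code path (illustrative, not the exact Ludwig source; it assumes only the `backend` keyword on `LudwigModel.load()`, which the diff above operates on):

```python
from ludwig.api import LudwigModel
from ludwig.backend import initialize_backend

def cli_evaluate(args):
    # The CLI eagerly builds a backend from the CLI flag (usually None,
    # which yields a fresh local backend)...
    backend = initialize_backend(args.backend)
    # ...and hands that concrete object down to load(), so the
    # `backend_param is None` fallback never fires and the backend stored
    # in the model config (e.g. ray + deepspeed) is silently ignored.
    model = LudwigModel.load(args.model_path, backend=backend)
    return model.evaluate(dataset=args.dataset)
```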

Collaborator:
Okay, going to push a quick change to fix the issue here, by not plumbing through the backend used in the CLI evaluate code path.
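In code terms, the fix sketched above amounts to something like the following (again illustrative, not the actual commit; the model path and dataset are placeholders):

```python
from ludwig.api import LudwigModel

# With no backend plumbed in from the CLI, load() sees backend=None and
# falls back to the backend recorded in the saved model config.
model = LudwigModel.load("results/experiment_run/model", backend=None)
results = model.evaluate(dataset="eval.csv")
```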

Resolved review threads: ludwig/backend/ray.py (outdated, two threads) · ludwig/distributed/deepspeed.py
tgaddair (Collaborator) left a comment:
LGTM!

tgaddair merged commit 8f5cec6 into master on Jul 29, 2023 · 16 checks passed
tgaddair deleted the deepspeed-eval branch on July 29, 2023 at 19:44.
Development

Successfully merging this pull request may close these issues:

- Testing new llm feature with mosaicml/mpt-7b-8k, bigscience/bloomz-3b and openlm-research/open_llama_3b_v2
3 participants