e2e eval integration tests updates#1569

Merged
nikg7 merged 16 commits into main from nikg4/test-e2e-march
Mar 25, 2025

@nikg7 nikg7 commented Mar 25, 2025

Description

-- Reset shard_for_eval to False for e2e tests that use 1 GPU (leaving it enabled leads to errors)
-- Optimize some test params, e.g., move iterative_logs into the test specs (it is currently hardcoded)
-- Update the Phi3 eval config to use bfloat16
-- Tested the e2e eval tests on A100:1 and A100:4

Related issues

Towards OPE-909

Before submitting

  • This PR only changes documentation. (You can ignore the following checks in that case)
  • Did you read the contributor guideline Pull Request guidelines?
  • Did you link the issue(s) related to this PR in the section above?
  • Did you add / update tests where needed?

Reviewers

At least one review from a member of oumi-ai/oumi-staff is required.

@nikg7 nikg7 requested review from kaisopos, optas, taenin and wizeng23 March 25, 2025 22:51
@nikg7 nikg7 marked this pull request as ready for review March 25, 2025 22:52
@nikg7 nikg7 changed the title from "[WIP] e2e integration tests updates" to "e2e eval integration tests updates" Mar 25, 2025
@@ -77,13 +80,20 @@ def _test_eval_impl(
config_path = test_config.config_path
# Overriding nested fields using OmegaConf's dot-list syntax is complicated,

nit: Note that this should be possible now: #1430
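
For context on the dot-list form mentioned in the diff comment above: an override such as `model.shard_for_eval=false` names a nested field by its dotted path. The sketch below is a plain-Python illustration of that idea only. It is not OmegaConf's implementation and not the code in this PR; `apply_dotlist`, `_parse_value`, and the `model.*` key layout are hypothetical names chosen for the example.

```python
from __future__ import annotations

import copy
from typing import Any


def _parse_value(raw: str) -> Any:
    """Convert an override's text value to bool, int, or float when it looks like one."""
    lowered = raw.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw  # Fall back to the raw string, e.g. "bfloat16".


def apply_dotlist(config: dict[str, Any], overrides: list[str]) -> dict[str, Any]:
    """Return a deep copy of `config` with `dotted.key=value` overrides applied."""
    result = copy.deepcopy(config)  # Leave the caller's config untouched.
    for item in overrides:
        dotted_key, sep, raw_value = item.partition("=")
        if not sep:
            raise ValueError(f"Override {item!r} is missing '='")
        *parents, leaf = dotted_key.split(".")
        node = result
        for part in parents:
            # Walk down the path, creating intermediate dicts as needed.
            node = node.setdefault(part, {})
            if not isinstance(node, dict):
                raise TypeError(f"Cannot descend into non-dict field {part!r}")
        node[leaf] = _parse_value(raw_value)
    return result


# Hypothetical config layout, used only for illustration.
base = {"model": {"shard_for_eval": True, "torch_dtype_str": "float32"}}
single_gpu = apply_dotlist(
    base,
    ["model.shard_for_eval=false", "model.torch_dtype_str=bfloat16"],
)
print(single_gpu)
```

Returning a copy rather than mutating `base` keeps one test's overrides from leaking into the next, which is usually what a parameterized test suite wants.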

@nikg7 nikg7 merged commit ac3dcda into main Mar 25, 2025
2 checks passed
@nikg7 nikg7 deleted the nikg4/test-e2e-march branch March 25, 2025 23:17
penfever pushed a commit that referenced this pull request Aug 27, 2025