
Change how to run multiple domains #100

Merged
JosephMarinier merged 4 commits into main from joseph/revert-multi-domains
May 14, 2026

Collaborator

@JosephMarinier JosephMarinier commented Apr 29, 2026

The current implementation for running multiple domains doesn't scale to running multiple perturbations, models, or other configurations. Also, that code felt like fighting against Pydantic. So, I suggest:

  1. Revert "Allow running multiple domains" for now, so we can think of a more generalizable and Pydantic-friendly implementation.
  2. Document how to run multiple domains, models, etc. using simple shell loops. Although not a first-class EVA feature, this is quite simple and generalizable.

What do you think?

Here is what I added to the README:


Running Multiple Configurations

Here is an example of a shell loop that sweeps over domains, models, or any combination of parameters.
Each iteration is an independent eva run. The loop continues on failure and exits with the last non-zero exit code.

exit_code=0;
for domain in airline itsm medical_hr; do
    for llm in gpt-5-mini gpt-5; do
        eva --domain "$domain" --model.llm "$llm" || exit_code=$?;
    done;
done;
exit $exit_code
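
If it helps to know which combinations failed, and not only the last exit code, the same loop can record them as it goes. This is only a sketch of one possible variant: the `run_sweep` function name, the `failed` variable, and the `EVA_CMD` override are assumptions made for illustration and are not part of EVA.

```shell
# Sketch: the same sweep, wrapped in a function that remembers failed combinations.
# EVA_CMD is a hypothetical override (not an EVA feature) that makes dry runs possible;
# when it is unset, the real `eva` command is used.
run_sweep() {
    exit_code=0
    failed=""
    for domain in airline itsm medical_hr; do
        for llm in gpt-5-mini gpt-5; do
            # On failure, keep the exit code and note the combination, then continue.
            "${EVA_CMD:-eva}" --domain "$domain" --model.llm "$llm" || {
                exit_code=$?
                failed="$failed $domain/$llm"
            }
        done
    done
    # Report the failed combinations, if any, on stderr.
    [ -z "$failed" ] || echo "Failed runs:$failed" >&2
    # Same contract as the plain loop: the last non-zero exit code, or 0.
    return "$exit_code"
}
```

Calling `run_sweep` and passing its status to `exit` gives the same behavior as the plain loop, with a summary line added on stderr.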

💡 If you need a single command, like in Docker, you can wrap the shell script with sh -c '...'.
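
As a rough illustration of the `sh -c` tip, the loop can be kept in a single-quoted string and handed to the shell as one argument. The `sweep_script` variable name and the `eva-image` Docker image name below are hypothetical and only serve the example.

```shell
# The whole sweep as one string, suitable for passing to `sh -c`.
# Single quotes keep "$domain", "$llm", and $? unexpanded until the inner shell runs.
sweep_script='
exit_code=0
for domain in airline itsm medical_hr; do
    for llm in gpt-5-mini gpt-5; do
        eva --domain "$domain" --model.llm "$llm" || exit_code=$?
    done
done
exit $exit_code
'

# Run it as a single command:
#   sh -c "$sweep_script"
# With Docker, assuming a hypothetical image named eva-image that provides sh and eva:
#   docker run --rm --entrypoint sh eva-image -c "$sweep_script"
```

The exit status of `sh -c` is the status of the script, so the calling process still sees the last non-zero exit code.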

@JosephMarinier JosephMarinier added this pull request to the merge queue May 14, 2026
Merged via the queue into main with commit d21bdbb May 14, 2026
1 check passed
@JosephMarinier JosephMarinier deleted the joseph/revert-multi-domains branch May 14, 2026 15:55
2 participants