diff --git a/docs/guides/about_the_framework.md b/docs/guides/about_the_framework.md
index ff19a4f..5a642c1 100644
--- a/docs/guides/about_the_framework.md
+++ b/docs/guides/about_the_framework.md
@@ -39,7 +39,7 @@ The Python CLI entry point is `run-evals`, defined in
 run-evals --json ./eval_set.json
 
 # Mode 2: Direct CLI arguments (what you used in Part 1)
-run-evals --task question_answer --model google/gemini-2.0-flash --dataset samples.json
+run-evals --task question_answer --model google/gemini-2.5-flash --dataset samples.json
 ```
 
 ### JSON runner
diff --git a/docs/guides/using_the_cli.md b/docs/guides/using_the_cli.md
index b105d91..e1fb7ee 100644
--- a/docs/guides/using_the_cli.md
+++ b/docs/guides/using_the_cli.md
@@ -35,7 +35,7 @@ my-project/
 - The starter task uses `func: analyze_codebase` — fine for a smoke test, but
   you'll want to change `func` to match your eval type (`question_answer`,
   `bug_fix`, `code_gen`, etc.)
-- The job defaults to `google/gemini-2.0-flash`. Update `models:` to the
+- The job defaults to `google/gemini-2.5-flash`. Update `models:` to the
   provider(s) you want to test.
 - `files` points at `../../` (your project root). Update if your workspace
   lives elsewhere.
diff --git a/packages/devals_cli/example/evals/jobs/local_dev.yaml b/packages/devals_cli/example/evals/jobs/local_dev.yaml
index 1af8005..06c9eb8 100644
--- a/packages/devals_cli/example/evals/jobs/local_dev.yaml
+++ b/packages/devals_cli/example/evals/jobs/local_dev.yaml
@@ -40,7 +40,7 @@
 # Which models to evaluate. Format: "provider/model-name"
 # If omitted, falls back to DEFAULT_MODELS from the Python registries.
 models:
-  - google/gemini-2.0-flash
+  - google/gemini-2.5-flash
 
 # =============================================================================
 # VARIANTS (Optional)
diff --git a/packages/devals_cli/lib/src/commands/init_command.dart b/packages/devals_cli/lib/src/commands/init_command.dart
index 40ff771..ace1790 100644
--- a/packages/devals_cli/lib/src/commands/init_command.dart
+++ b/packages/devals_cli/lib/src/commands/init_command.dart
@@ -67,7 +67,7 @@ class InitCommand extends Command {
     File(jobPath).writeAsStringSync(
       initJobTemplate(
         name: 'local_dev',
-        models: ['google/gemini-2.0-flash'],
+        models: ['google/gemini-2.5-flash'],
         tasks: ['get_started'],
       ),
     );