Skip to content

fix(config): update OpenRouter vision model id#630

Merged
andreatgretel merged 7 commits into
mainfrom
andreatgretel/fix/openrouter-vision-model
May 11, 2026
Merged

fix(config): update OpenRouter vision model id#630
andreatgretel merged 7 commits into
mainfrom
andreatgretel/fix/openrouter-vision-model

Conversation

@andreatgretel
Copy link
Copy Markdown
Contributor

@andreatgretel andreatgretel commented May 11, 2026

Summary

Fix the scheduled provider health checks by updating the OpenRouter vision default to the currently routable nano-3 omni free model, aligning its inference parameters with the nano-3 omni config, making empty chat responses easier to diagnose, and documenting hosted-provider data handling expectations. This is separate from #626, which updates startup alias collection but does not change provider defaults.

cc @nabinchha

Changes

Changed

  • Updated constants.py so openrouter-vision uses nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free.
  • Aligned openrouter-vision with NEMOTRON_3_NANO_OMNI_30B_A3B_REASONING_INFERENCE_PARAMS (temperature=0.60, top_p=0.95), matching the nano-3 omni default.
  • Updated test_default_model_settings.py to assert both the new built-in model ID and inference parameters.
  • Updated the default model settings docs and Fern page to show the new openrouter-vision model ID and matching inference parameters.

Fixed

  • Hardened health_checks.py to retry empty/non-string chat responses once before failing.
  • Replaced bare provider health-check assertions with provider/model-specific failure messages.

Docs

  • Added a hosted-provider data handling warning to default-model-settings.md, covering hosted endpoints, provider terms/privacy practices, and confidential or personal data.

Review Feedback

  • Addressed Nabin's inline comments by using the nano-3 omni free OpenRouter model and its matching NEMOTRON_3_NANO_OMNI_30B_A3B_REASONING_INFERENCE_PARAMS.
  • Addressed Greptile's stale-evidence concern by confirming the PR description and testing notes now refer to nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free, the model currently committed.

Testing

  • .venv/bin/pytest packages/data-designer-config/tests/config/test_default_model_settings.py
  • .venv/bin/ruff check --fix packages/data-designer-config/src/data_designer/config/utils/constants.py packages/data-designer-config/tests/config/test_default_model_settings.py scripts/health_checks.py
  • .venv/bin/ruff format packages/data-designer-config/src/data_designer/config/utils/constants.py packages/data-designer-config/tests/config/test_default_model_settings.py scripts/health_checks.py
  • git diff --check
  • env -u NVIDIA_API_KEY -u OPENAI_API_KEY .venv/bin/python scripts/health_checks.py
  • OpenRouter-only health check passes with nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free: 4 passed, 0 failed, 8 skipped.

Description updated with AI

@andreatgretel andreatgretel marked this pull request as ready for review May 11, 2026 14:19
@andreatgretel andreatgretel requested a review from a team as a code owner May 11, 2026 14:19
@github-actions
Copy link
Copy Markdown
Contributor

Review: PR #630 — fix(config): update OpenRouter vision model id

Summary

Small, focused PR (+25/-5) that does two things:

  1. Updates the built-in openrouter-vision default from nvidia/nemotron-nano-12b-v2-vl to nvidia/nemotron-nano-12b-v2-vl:free (and updates the corresponding assertion in test_default_model_settings.py) so that the scheduled provider health checks can route to a currently available OpenRouter model.
  2. Hardens scripts/health_checks.py:
    • Replaces bare assert statements with explicit AssertionErrors that include the provider, model type, model name, and actual result — much easier to diagnose from CI logs.
    • Adds a one-shot retry (CHAT_COMPLETION_ATTEMPTS = 2) for chat-completion probes that return an empty/non-string result.

The PR description matches the diff; a live OpenRouter probe against the new ID is reported as returning OK.

Findings

Correctness

  • The constants change is a one-line swap and the accompanying test was updated in lock-step — index-based assertion at builtin_model_configs[10] still matches. Good.
  • The new embedding path correctly returns early after a single attempt; only chat completions retry. This matches intent (empty strings are a known flaky behavior for some chat models; embeddings don't have the same failure mode).
  • Loop is bounded (range(1, CHAT_COMPLETION_ATTEMPTS + 1)), and the terminal AssertionError fires only after all attempts are exhausted. No infinite-loop risk.
  • result = None initialization before the loop is defensive but fine — the loop always executes at least once, so the result!r in the final error message will always reflect the last real response.

Style / conventions

  • scripts/health_checks.py already used assert for validation; the PR moves to explicit AssertionError with rich messages. Keeping AssertionError (rather than a project-specific error type) preserves existing semantics for this ad-hoc script — reasonable since it lives under scripts/ and isn't exercised by library callers.
  • No changes to import direction or the three-package layering — the fix is contained to packages/data-designer-config/ (constants + its test) and scripts/ (standalone script).
  • from __future__ import annotations is not present in scripts/health_checks.py; the repo invariant in AGENTS.md calls for it in every Python source file. The PR does not regress this — it was already absent — but it's worth noting as a pre-existing drift, not a blocker for this change.

Robustness of the retry

  • Retries are immediate (no sleep / backoff). For a 2-attempt CI smoke test this is acceptable, but if OpenRouter's transient emptiness is rate-limit-shaped, a short time.sleep(1) between attempts would be cheap insurance. Optional nit.
  • Only empty/non-string responses trigger a retry — if the underlying call raises, the exception propagates without retry. That's consistent with the original behavior and probably intentional (distinguish "API reachable but returning junk" from "API broken"), but worth being deliberate about.
  • The retry notice is printed to stdout via print(...), matching the script's existing print-based output convention (no logger is used elsewhere in the file). Consistent.

Coupling to :free SKU

  • Hard-coding :free into a built-in default couples the DataDesigner default to OpenRouter's free-tier SKU naming. If the :free variant is ever deprecated or rate-capped more aggressively, this default will silently degrade. The trade-off is that without :free, the model currently isn't routable at all — so this is the correct short-term choice. Consider filing a follow-up to document (in constants or PR history) why the :free suffix is load-bearing.

Test coverage

  • test_default_model_settings.py was updated to match the new ID — the existing index-based assertion style is brittle but pre-existing, not introduced here.
  • No unit test was added for the new retry behavior in health_checks.py. Given that it's a CI-only script and the retry logic is five lines, a targeted test isn't strictly required; the existing env -u ... python scripts/health_checks.py manual probe documented in the PR description is reasonable validation.

Security

  • No secret handling changes. The improved error messages include provider name, model type, model name, and the raw result repr. result comes from the model API and can in principle contain anything, but for a chat-completion probe with prompt "Say 'OK' and nothing else." the blast radius is negligible. No API keys or environment values are logged.

Suggestions (all optional)

  1. Add a brief comment near the CHAT_COMPLETION_ATTEMPTS = 2 constant explaining why retries are needed (e.g., "OpenRouter occasionally returns empty content on first call for newly-routed free-tier models").
  2. Consider a small time.sleep(1) between attempts to cover transient rate-limit cases.
  3. Follow-up (out of scope): add from __future__ import annotations to scripts/health_checks.py to bring it in line with the repo-wide invariant.

Verdict

Low-risk, well-scoped fix that does exactly what the description says. Safe to merge. No blockers; the suggestions above are nice-to-haves.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 11, 2026

Greptile Summary

This PR fixes the scheduled provider health checks by updating the openrouter-vision default model from nvidia/nemotron-nano-12b-v2-vl to nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free and aligning its inference parameters with the existing nano-3 omni config (temperature=0.60, top_p=0.95). Tests, docs, and the Fern page are all updated to match.

  • constants.py: Swaps the vision model ID and switches from DEFAULT_VISION_INFERENCE_PARAMS to NEMOTRON_3_NANO_OMNI_30B_A3B_REASONING_INFERENCE_PARAMS, keeping it consistent with the nim/nvidia entry that already uses the same constant.
  • health_checks.py: Adds one retry for empty/non-string chat responses and replaces bare assert statements with descriptive AssertionError messages that include provider, model type, and model name.
  • Docs: Updates the openrouter-vision table row in both the Markdown and Fern MDX pages and adds a hosted-provider data handling warning.

Confidence Score: 5/5

Safe to merge — the change is a targeted model ID and inference parameter update with matching test, doc, and health-check coverage.

All three touch points (constants, tests, health checks) are updated consistently. The new model ID is already used in the codebase for the nim/nvidia entry, the retry logic in health_checks.py is correct, and the live probe confirmed the new OpenRouter model routes successfully.

No files require special attention.

Important Files Changed

Filename Overview
packages/data-designer-config/src/data_designer/config/utils/constants.py Switches openrouter-vision from nvidia/nemotron-nano-12b-v2-vl with DEFAULT_VISION_INFERENCE_PARAMS to nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free with NEMOTRON_3_NANO_OMNI_30B_A3B_REASONING_INFERENCE_PARAMS (temperature=0.60, top_p=0.95); no issues found.
packages/data-designer-config/tests/config/test_default_model_settings.py Updates builtin model config assertions to match new model ID and adds ChatCompletionInferenceParams assertion for temperature=0.60, top_p=0.95; ChatCompletionInferenceParams is already imported.
scripts/health_checks.py Adds one retry for empty/non-string chat responses and replaces bare assertions with descriptive provider/model-specific error messages; embedding path is unchanged; retry logic is correct.
docs/concepts/models/default-model-settings.md Updates openrouter-vision table row to new model ID and parameters; adds hosted-provider data handling warning.
fern/versions/v0.5.8/pages/concepts/models/default-model-settings.mdx Mirrors the markdown doc changes in the Fern MDX page: updated model ID, parameters, and added the hosted-provider Warning block.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[health_checks.py: _check_model] --> B{model_type == embedding?}
    B -- Yes --> C[facade.generate_text_embeddings]
    C --> D{1 result, non-empty?}
    D -- Yes --> E[return OK]
    D -- No --> F[raise AssertionError with provider/model details]
    B -- No --> G[attempt loop: 1..CHAT_COMPLETION_ATTEMPTS=2]
    G --> H[facade.generate prompt='Say OK']
    H --> I{result is non-empty str?}
    I -- Yes --> E
    I -- No, attempt < 2 --> J[print RETRY and loop]
    J --> G
    I -- No, attempt == 2 --> K[raise AssertionError with provider/model details]
Loading

Reviews (5): Last reviewed commit: "Merge branch 'main' into andreatgretel/f..." | Re-trigger Greptile

"reasoning": {"model": "openai/gpt-oss-20b", "inference_parameters": DEFAULT_REASONING_INFERENCE_PARAMS},
"vision": {"model": "nvidia/nemotron-nano-12b-v2-vl", "inference_parameters": DEFAULT_VISION_INFERENCE_PARAMS},
"vision": {
"model": "nvidia/nemotron-nano-12b-v2-vl:free",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to switch to free, we should use the nano-3-omni

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Updated openrouter-vision to use nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free instead of nvidia/nemotron-nano-12b-v2-vl:free.

I also added a hosted-provider data handling warning to the default model settings docs since this default now depends on a free hosted endpoint, and updated the docs table to match the new model ID.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 11, 2026

Docs preview: https://9e5d3189.dd-docs-preview.pages.dev

Notebook tutorials are placeholder-only in previews.

"vision": {"model": "nvidia/nemotron-nano-12b-v2-vl", "inference_parameters": DEFAULT_VISION_INFERENCE_PARAMS},
"vision": {
"model": "nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free",
"inference_parameters": DEFAULT_VISION_INFERENCE_PARAMS,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should use NEMOTRON_3_NANO_OMNI_30B_A3B_REASONING_INFERENCE_PARAMS instead.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Updated openrouter-vision to use NEMOTRON_3_NANO_OMNI_30B_A3B_REASONING_INFERENCE_PARAMS instead of DEFAULT_VISION_INFERENCE_PARAMS.

I also synced the docs/Fern tables and added a test assertion for the OpenRouter vision params: temperature=0.60, top_p=0.95.

@andreatgretel andreatgretel merged commit 16db8d6 into main May 11, 2026
50 checks passed
@andreatgretel andreatgretel deleted the andreatgretel/fix/openrouter-vision-model branch May 14, 2026 01:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants