Fix configured model filtering and test SDK bump by shanselman · Pull Request #714 · openclaw/openclaw-windows-node

shanselman · 2026-06-08T00:09:58Z

Summary

filter explicitly unconfigured gateway models out of the chat model selector while preserving older gateway payloads that omit configured
preserve the configured flag presence when parsing models.list
bump Microsoft.NET.Test.Sdk from 18.4.0 to 18.6.0

Fixes #684.
Fixes #705.

Validation

OPENCLAW_REPO_ROOT=D:\github\copilot-worktrees\moltbot-windows-hub\shanselman-sturdy-bassoon; .\build.ps1
OPENCLAW_REPO_ROOT=D:\github\copilot-worktrees\moltbot-windows-hub\shanselman-sturdy-bassoon; dotnet test .\tests\OpenClaw.Shared.Tests\OpenClaw.Shared.Tests.csproj --no-restore
OPENCLAW_REPO_ROOT=D:\github\copilot-worktrees\moltbot-windows-hub\shanselman-sturdy-bassoon; dotnet test .\tests\OpenClaw.Tray.Tests\OpenClaw.Tray.Tests.csproj --no-restore

Filter explicitly unconfigured gateway models out of the chat selector while preserving compatibility with older gateway payloads that omit the configured flag. Bump Microsoft.NET.Test.Sdk to 18.6.0 to finish the dependency chore. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

clawsweeper · 2026-06-08T00:10:54Z

Codex review: needs real behavior proof before merge. Reviewed June 7, 2026, 8:14 PM ET / 00:14 UTC.

Summary
The PR filters explicitly unconfigured gateway models from the tray chat model selector while preserving legacy payloads without configured, adds parser/provider regression tests, and bumps Microsoft.NET.Test.Sdk to 18.6.0.

Reproducibility: yes. source-reproducible: current master collapses explicit configured: false and missing configured into the same IsConfigured=false value, then the tray selector includes every non-empty id. I did not execute the app path in this read-only review.

Review metrics: 2 noteworthy metrics.

Diff scope: 6 files changed, 84 additions, 3 deletions. The PR is small but combines runtime selector behavior, regression tests, and one test SDK package bump.
Validation listed: 3 required commands listed. The PR body names the build, shared tests, and tray tests expected by AGENTS.md, though outputs are not included.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🐚 platinum hermit
Result: blocked until real behavior proof is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

[P1] Add real behavior proof showing a configured-false model absent from the selector while a legacy model without configured remains visible.
[P1] Ensure CI or maintainer-run AGENTS validation covers the build plus shared and tray tests before merge.

Proof guidance:

[P1] Needs real behavior proof before merge: Missing: the PR body lists build/test commands but no after-fix real app or gateway proof of the model dropdown filtering. Please add a screenshot, recording, copied live output, terminal output, or redacted logs showing the behavior, redact private details, and update the PR body so ClawSweeper can re-review automatically; if it does not, ask a maintainer to comment @clawsweeper re-review.

Mantis proof suggestion
A short visual proof of the tray chat model selector would materially satisfy the missing real behavior proof gate. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

visual task: verify that a models.list payload with configured false keeps that model out of the chat model dropdown while legacy models without configured still appear.

Risk before merge

[P1] Real behavior proof is missing: the body lists validation commands but does not show the after-fix selector behavior in a real app or gateway setup.
[P1] This read-only review did not rerun the Windows build/test validation, so merge should still be gated on CI or maintainer-run AGENTS validation.

Maintainer options:

Decide the mitigation before merge
Merge the narrow flag-preserving filter with its regression coverage once real behavior proof and required validation are available.
Pause or close
Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

[P1] No narrow code repair is needed from ClawSweeper; the remaining blocker is contributor-supplied real behavior proof plus normal validation/merge review.

Security
Cleared: No concrete security or supply-chain concern found; the diff does not touch secrets, permissions, workflows, scripts, or executable download paths, and the package bump targets a NuGet-listed Microsoft test SDK version. (nuget.org)

Review details

Best possible solution:

Merge the narrow flag-preserving filter with its regression coverage once real behavior proof and required validation are available.

Do we have a high-confidence way to reproduce the issue?

Yes, source-reproducible: current master collapses explicit configured: false and missing configured into the same IsConfigured=false value, then the tray selector includes every non-empty id. I did not execute the app path in this read-only review.

Is this the best way to solve the issue?

Yes: preserving flag presence is the narrowest compatibility fix because it filters explicit false while keeping older gateway payloads that omit configured. No supported existing tray path appears to solve that distinction today.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 25c11dd07f09.

Label changes

Label changes:

add P2: The PR fixes a normal model/provider selection bug with limited blast radius and adds focused regression coverage.
add rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🐚 platinum hermit.
add status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: Missing: the PR body lists build/test commands but no after-fix real app or gateway proof of the model dropdown filtering. Please add a screenshot, recording, copied live output, terminal output, or redacted logs showing the behavior, redact private details, and update the PR body so ClawSweeper can re-review automatically; if it does not, ask a maintainer to comment @clawsweeper re-review.

Label justifications:

P2: The PR fixes a normal model/provider selection bug with limited blast radius and adds focused regression coverage.
rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🐚 platinum hermit.
status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: Missing: the PR body lists build/test commands but no after-fix real app or gateway proof of the model dropdown filtering. Please add a screenshot, recording, copied live output, terminal output, or redacted logs showing the behavior, redact private details, and update the PR body so ClawSweeper can re-review automatically; if it does not, ask a maintainer to comment @clawsweeper re-review.

Evidence reviewed

What I checked:

Current parser loses flag presence: On current master, ParseModelsList only sets IsConfigured when configured is JSON true, so explicit false and missing legacy fields both become false. (src/OpenClaw.Shared/OpenClawGatewayClient.cs:3569, 25c11dd07f09)
Current selector includes every model id: ExtractModelNames currently adds every non-empty model id to AvailableModels, which explains the reported dropdown showing models that the gateway marked unconfigured. (src/OpenClaw.Tray.WinUI/Chat/OpenClawChatDataProvider.cs:1151, 25c11dd07f09)
PR implements the narrow distinction: The PR head adds HasConfiguredFlag, preserves it during models.list parsing, and filters only entries with configured: false, leaving legacy entries without the field visible. (src/OpenClaw.Tray.WinUI/Chat/OpenClawChatDataProvider.cs:1150, 8491c4d63710)
Regression tests cover parser and tray behavior: The PR adds shared parser coverage for true, false, and missing configured, plus tray provider coverage that excludes explicit false and keeps legacy models. (tests/OpenClaw.Tray.Tests/OpenClawChatDataProviderTests.cs:1174, 8491c4d63710)
Dependency bump is isolated: Current tests/Directory.Build.props has a single Microsoft.NET.Test.Sdk reference at 18.4.0, and the PR changes only that reference to 18.6.0; NuGet Gallery lists Microsoft.NET.Test.Sdk 18.6.0. (nuget.org) (tests/Directory.Build.props:29, 25c11dd07f09)
History provenance: git blame points the current model parsing, ModelInfo, and chat selector extraction lines to 85445c7; recent chat provider history also includes 9a3a7a6. (src/OpenClaw.Shared/OpenClawGatewayClient.cs:3549, 85445c78066b)

Likely related people:

christineyan4: Available blame attributes the current model list parser, ModelInfo, and chat model extraction code to commit 85445c7. (role: introduced behavior in available history; confidence: medium; commits: 85445c78066b; files: src/OpenClaw.Shared/OpenClawGatewayClient.cs, src/OpenClaw.Shared/Models.cs, src/OpenClaw.Tray.WinUI/Chat/OpenClawChatDataProvider.cs)
ranjeshj: Recent history touches OpenClawChatDataProvider and its tests in the same chat provider area affected by this PR. (role: recent adjacent contributor; confidence: medium; commits: 9a3a7a6131a8, 3d26594b4448; files: src/OpenClaw.Tray.WinUI/Chat/OpenClawChatDataProvider.cs, tests/OpenClaw.Tray.Tests/OpenClawChatDataProviderTests.cs)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels Jun 8, 2026

shanselman merged commit d1b1363 into master Jun 8, 2026
29 of 30 checks passed

shanselman deleted the shanselman/issue-triage branch June 8, 2026 00:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix configured model filtering and test SDK bump#714

Fix configured model filtering and test SDK bump#714
shanselman merged 1 commit into
masterfrom
shanselman/issue-triage

shanselman commented Jun 8, 2026

Uh oh!

clawsweeper Bot commented Jun 8, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

shanselman commented Jun 8, 2026

Summary

Validation

Uh oh!

clawsweeper Bot commented Jun 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

clawsweeper Bot commented Jun 8, 2026 •

edited

Loading