fix(opus-4.7): flip CSV to adaptive + remove PDD_FORCE gate on github_copilot #1156

Merged: gltanaka merged 2 commits into main from fix/opus-47-adaptive-and-copilot-filter on May 24, 2026

@gltanaka
Contributor

Summary

  • Flip the Anthropic claude-opus-4-7 CSV row to reasoning_type=adaptive. Anthropic enforced the new shape ~2026-05-23 17:25 UTC; PR #1047 (feat(llm_invoke): add reasoning_type='adaptive' for Anthropic Opus 4.7) added the code path but deferred the CSV flip as "out of scope". This PR closes that gap.
  • Teach the catalog generator about adaptive so re-generation doesn't revert the flip.
  • Drop the PDD_FORCE gate on the github_copilot token-file check so non-interactive contexts (Cloud Run, library use) don't hang on litellm device-flow OAuth when no Copilot token is present.

See commit body for the full incident timeline and rationale.

Test plan

  • pytest tests/test_llm_invoke.py -k github_copilot -v (three new tests pass)
  • pytest tests/test_generate_model_catalog.py -k reasoning_type -v (three new generator tests pass)
  • pytest tests/test_llm_invoke.py -v (no regression in existing tests)
  • /gcbrun cloud-test passes
  • After merge + 0.0.249 release + downstream redeploy: verify that PROD fixcode returns success on POST with model=claude-opus-4-7 and that no "Please visit github.com/login/device" warnings appear in Cloud Run logs.

🤖 Generated with Claude Code

@gltanaka
Contributor Author

/gcbrun


@gltanaka
Contributor Author

LGTM — approved for merge as-is.

Verified all ten checklist items against the diff and /tmp/pdd-fix-tier2 on disk:

  • CSV row: the Anthropic claude-opus-4-7 row is well-formed (all 11 columns present), reasoning_type=adaptive, max_reasoning_tokens=16000. The adaptive code path in llm_invoke.py (~L3556) does not read max_reasoning_tokens; only the budget path (~L3491) does, so the 16000 value is harmless.
  • Generator self-consistency: _infer_reasoning_type returns 'adaptive' for the Anthropic row (via _is_adaptive_anthropic_model); _infer_max_reasoning_tokens returns 16000 for adaptive models. A fresh generator run reproduces the intended CSV values — no silent revert risk.
  • _is_adaptive_anthropic_model correctness: matches on bare model-id claude-opus-4-7 only when provider is direct Anthropic; returns False for azure_ai/claude-opus-4-7, Vertex, and Bedrock. Future suffix variants (e.g., claude-opus-4-7-fast) would NOT match the hardcoded set — acceptable for now since the set is intentionally narrow; a follow-up can widen it when those variants exist.
  • PDD_FORCE blast radius: removal is confined to the github_copilot token-file check. The interactive API-key-prompt skip at ~L2397 retains its own if os.environ.get('PDD_FORCE'): guard — unaffected.
  • Backward compat: users with a valid Copilot token file still pass the check. test_github_copilot_allowed_when_token_present_no_pdd_force exercises the real branch and returns True because the token file exists, not because of any mock short-circuit.
  • Warning text: pdd setup does invoke the GitHub Copilot OAuth flow via provider_manager.py. The wording is accurate.
  • Test isolation: monkeypatch restores env vars after each test; tmp_path is per-test. No global model-dataframe cache in pdd.llm_invoke that would leak across tests.
  • Pytest: pytest tests/test_llm_invoke.py -v → 296 passed, 0 failed; pytest tests/test_generate_model_catalog.py -v → 25 passed, 0 failed.
  • Build mirror: build/lib/pdd/data/llm_model.csv does not exist — no stale mirror to update.
  • Commit references: git show 8646b166e resolves; a grep for PR #1047 (feat(llm_invoke): add reasoning_type='adaptive' for Anthropic Opus 4.7) confirms the referenced history is present.
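
The generator behaviour described in the checklist above (exact match on the bare model id, direct Anthropic provider only) can be sketched roughly as follows. This is an illustration, not the code in pdd/generate_model_catalog.py: the function signatures, the provider string comparison, and the `fallback` parameter are assumptions made for the example.

```python
# Hypothetical sketch of the adaptive classification; names and signatures
# are illustrative and are not copied from pdd/generate_model_catalog.py.

# Intentionally narrow, hardcoded set (mirrors the {"claude-opus-4-7"} set
# mentioned in the commit message).
ADAPTIVE_ANTHROPIC_MODELS = frozenset({"claude-opus-4-7"})


def is_adaptive_anthropic_model(provider: str, model: str) -> bool:
    """True only for a bare model id served by the direct Anthropic provider."""
    # Assumption: the provider column reads "Anthropic" for the direct API.
    is_direct_anthropic = provider.strip().lower() == "anthropic"
    # Exact-match lookup: prefixed ids (azure_ai/...) and suffix variants
    # (claude-opus-4-7-fast) deliberately do not match.
    return is_direct_anthropic and model in ADAPTIVE_ANTHROPIC_MODELS


def infer_reasoning_type(provider: str, model: str, fallback: str) -> str:
    """Return 'adaptive' for known adaptive models, else the caller's fallback."""
    if is_adaptive_anthropic_model(provider, model):
        return "adaptive"
    return fallback
```

Because the lookup is an exact match, widening support to a future variant would mean adding its id to the set rather than relaxing the comparison.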

@gltanaka
Contributor Author

/heal

gltanaka and others added 2 commits May 24, 2026 12:53
…_copilot

Anthropic enforced the new adaptive thinking API for Claude Opus 4.7 on
2026-05-23 ~17:25 UTC; the legacy thinking.type.enabled shape now returns
400 invalid_request_error. PR #1047 (commit 8646b16) added the adaptive
code path on 2026-05-17 but explicitly deferred the CSV flip "out of
scope". The deferral expired; PROD pdd cloud functions (fixcode,
verifycode, crashcode, generatetest, generateexample) were broken for
~12h.
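
For readers unfamiliar with the two request shapes, the difference can be sketched as below. This is a minimal illustration under assumptions: the helper name `build_thinking_param` is hypothetical, and the `{"type": "adaptive"}` payload is assumed from the reasoning_type name rather than taken from llm_invoke.py. The provider gate and the unused budget on the adaptive path follow the review notes in this PR.

```python
from typing import Any, Dict, Optional


def build_thinking_param(
    provider: str, reasoning_type: str, max_reasoning_tokens: int = 0
) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: map a CSV row's reasoning settings to a request fragment."""
    if reasoning_type == "adaptive" and provider.strip().lower() == "anthropic":
        # Adaptive shape (assumed): no token budget is sent, so
        # max_reasoning_tokens is ignored on this path.
        return {"type": "adaptive"}
    if reasoning_type == "budget" and max_reasoning_tokens > 0:
        # Legacy shape: explicit thinking budget. Per the commit message,
        # Opus 4.7 on the direct Anthropic API now rejects this with a 400.
        return {"type": "enabled", "budget_tokens": max_reasoning_tokens}
    # No reasoning configured, or a combination this sketch does not cover.
    return None
```

With the CSV row still set to budget, a helper like this would keep emitting the legacy shape for claude-opus-4-7, which is the failure mode the commit describes.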

Compounding issue: when the 400 fires, candidate iteration falls through
to github_copilot/* rows. The credential check at _ensure_api_key gates
the token-file existence check behind PDD_FORCE, which server contexts
(Cloud Run) don't set. Copilot models pass the check, get tried, and
hang for minutes on litellm device-flow OAuth.

Changes:

1. pdd/data/llm_model.csv: flip Anthropic,claude-opus-4-7 to
   reasoning_type=adaptive, max_reasoning_tokens=16000. Azure AI row
   stays at budget pending separate audit — adaptive serialization in
   llm_invoke.py is gated on provider=='anthropic' anyway.

2. pdd/generate_model_catalog.py: teach _infer_reasoning_type and
   _infer_max_reasoning_tokens about the adaptive shape so regeneration
   doesn't revert the manual flip. Adaptive list is hardcoded
   ({"claude-opus-4-7"}) — extend as future models require adaptive.

3. pdd/llm_invoke.py: drop the `and os.environ.get('PDD_FORCE')` gate on
   the github_copilot token-file check. The token file is a precondition
   for any successful Copilot call (interactive or not); checking it
   unconditionally turns a multi-minute device-flow hang into a clean
   fast-fail with a `pdd setup` hint. Authenticated CLI users with a
   token file present are unaffected.

4. tests: cover the three github_copilot credential paths (no token /
   token present / PDD_FORCE-set) plus the generator's adaptive
   classification.
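
Change 3 amounts to checking for the token file unconditionally and failing fast when it is missing. A rough sketch of that idea follows; the function names, the return shape, and the token-file location passed in by the caller are all assumptions for illustration and do not reproduce _ensure_api_key.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def copilot_credentials_ok(token_file: Path) -> Tuple[bool, Optional[str]]:
    """Hypothetical check: succeed only if the Copilot token file exists.

    Deliberately does not consult PDD_FORCE or any other environment
    variable, so interactive and server contexts behave the same way.
    """
    if token_file.is_file():
        return True, None
    hint = (
        f"GitHub Copilot token not found at {token_file}; "
        "run `pdd setup` to authenticate."
    )
    return False, hint


def filter_candidates(models: List[str], token_file: Path) -> List[str]:
    """Drop github_copilot/* candidates when no token file is present."""
    kept = []
    for model in models:
        if model.startswith("github_copilot/"):
            ok, _hint = copilot_credentials_ok(token_file)
            if not ok:
                continue  # fast-fail instead of entering device-flow OAuth
        kept.append(model)
    return kept
```

In a server context with no token file, the Copilot rows are skipped immediately and candidate iteration moves on, rather than blocking on an OAuth prompt nobody can answer.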

Process note: this PR closes the deferred CSV flip from PR #1047. Future
commits that add new reasoning_type values to llm_invoke.py should land
the CSV row atomically — deferring past a release means production can
break the moment a provider enforces the new shape (as happened here).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@gltanaka gltanaka force-pushed the fix/opus-47-adaptive-and-copilot-filter branch from 4efc683 to 18a3a94 Compare May 24, 2026 19:53
@gltanaka gltanaka merged commit 1c32e98 into main May 24, 2026
9 checks passed
@gltanaka gltanaka deleted the fix/opus-47-adaptive-and-copilot-filter branch May 24, 2026 19:56