
fix: use max_completion_tokens for gpt-4.1+, gpt-5.x, and o-series models #38

Open
YuHuang0525 wants to merge 1 commit into Pickle-Pixel:main from YuHuang0525:fix/openai-max-completion-tokens

Conversation

@YuHuang0525

Problem

Newer OpenAI models — gpt-4.1, gpt-5.x, o1, o3, o4 — reject the legacy max_tokens parameter with HTTP 400 and require max_completion_tokens instead. This means any user who sets LLM_MODEL=gpt-5.2 (or any other newer model) gets an immediate 400 error on every LLM call, breaking scoring, tailoring, and cover letter generation entirely.

Relevant code before this fix (_chat_compat() in llm.py):

payload = {
    "model": self.model,
    "messages": messages,
    "temperature": temperature,
    "max_tokens": max_tokens,   # ← rejected by gpt-4.1+, gpt-5.x, o-series
}
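
For reference, a minimal repro of the failure using the official Python SDK (the model name is illustrative, and the exact error wording comes from the API, not this repo):

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Raises a 400 BadRequestError on newer models: the API rejects
# `max_tokens` and asks for `max_completion_tokens` instead.
client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=64,
)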

Fix

Detect the model prefix at call time and send the correct parameter:

_new_param_models = ("gpt-4.1", "gpt-5", "o1", "o3", "o4")
if any(self.model.startswith(p) for p in _new_param_models):
    token_param = {"max_completion_tokens": max_tokens}
else:
    token_param = {"max_tokens": max_tokens}
  • All other providers (Gemini compat, Gemini native, local/Ollama) are unaffected — they continue using max_tokens as before.
  • The native Gemini path already uses maxOutputTokens and is untouched.
  • No behaviour change for existing gpt-4o, gpt-4o-mini, or local model users.
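
Putting it together, the payload assembly in _chat_compat() ends up roughly like this (a sketch; the surrounding method body is assumed from the snippets above):

payload = {
    "model": self.model,
    "messages": messages,
    "temperature": temperature,
    **token_param,  # exactly one of max_tokens / max_completion_tokens
}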

Testing

Verified manually with gpt-5.2 — the 400 error is resolved and completions return successfully after this change. No existing automated tests cover _chat_compat() directly.
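
If the prefix check were factored into a standalone helper, it would be trivially unit-testable. A sketch (helper name and tests are hypothetical, not part of this PR):

import pytest

def _token_param(model: str, max_tokens: int) -> dict:
    # Hypothetical standalone version of the selection logic above.
    new_param_models = ("gpt-4.1", "gpt-5", "o1", "o3", "o4")
    if any(model.startswith(p) for p in new_param_models):
        return {"max_completion_tokens": max_tokens}
    return {"max_tokens": max_tokens}

@pytest.mark.parametrize("model,key", [
    ("gpt-5.2", "max_completion_tokens"),
    ("o3-mini", "max_completion_tokens"),
    ("gpt-4o-mini", "max_tokens"),
])
def test_token_param(model, key):
    assert _token_param(model, 64) == {key: 64}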

fix: use max_completion_tokens for gpt-4.1+, gpt-5.x, and o-series models

Newer OpenAI models (gpt-4.1+, gpt-5.x, o1, o3, o4) reject the legacy
max_tokens parameter with HTTP 400 and require max_completion_tokens instead.

_chat_compat() now detects the model prefix at call time and sends the
correct parameter, while all other providers (Gemini, local) continue
using max_tokens unchanged.
Marissa0912 pushed a commit to Marissa0912/ApplyPilot that referenced this pull request on Apr 29, 2026:
LinkedIn (and other major sites' bot detection) probe extensions
by attempting to fetch resources from `chrome-extension://{id}/...`
where {id} is one of ~6,000 known extension IDs. If the resource
loads, the site has a strong signal that this user runs that
extension.

Empty `web_accessible_resources: []` means our extension exposes
NO resources to web pages, so the probe always fails. Combined
with the per-install random extension key (decision Pickle-Pixel#38), this
removes ApplyPilot's extension from cross-user fingerprintability
entirely.

`alarms` permission was already present in the manifest, so the
SW-heartbeat support from spec §3.1 is also in place.

Smoke verified: extension still loads in CfT 148 with the empty
WAR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>