Skip to content

[codex] Add provider model request profiles#278

Merged
michaelmwu merged 5 commits into
mainfrom
michaelmwu/provider-model-factory
May 14, 2026
Merged

[codex] Add provider model request profiles#278
michaelmwu merged 5 commits into
mainfrom
michaelmwu/provider-model-factory

Conversation

@michaelmwu
Copy link
Copy Markdown
Member

@michaelmwu michaelmwu commented May 14, 2026

Summary

  • add a shared ProviderModel factory for OpenAI-compatible clients
  • add compiled model request profiles for top OpenAI model choices
  • route job matching, resume extraction, and skills extraction through profile-aware request kwargs

Why

Bifrost strictly rejects unsupported model parameters. gpt-5-mini does not accept non-default temperature, but job posting extraction and reranking always sent temperature=0.1, causing automatic posting analysis to fail.

Impact

GPT-5-style models now omit unsupported temperature while older chat models such as gpt-4.1-mini keep it. OpenRouter model prefixing is centralized in the same factory.

Validation

  • ./scripts/test.sh
  • ./scripts/lint.sh
  • ./scripts/mypy.sh

Summary by CodeRabbit

  • New Features

    • Added LLM model profile configuration system supporting OpenAI-compatible models with capability-aware request parameter handling.
  • Refactor

    • Refactored LLM request construction across multiple modules to use standardized provider model abstraction for improved configuration consistency.
  • Tests

    • Added unit tests validating model profile behavior and request option support across different LLM providers.

Review Change Stack

Copilot AI review requested due to automatic review settings May 14, 2026 10:31
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 14, 2026

Warning

Rate limit exceeded

@michaelmwu has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 1 minute and 15 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1a4ed1a3-3fbf-4e09-bf1d-2c9bd8d82019

📥 Commits

Reviewing files that changed from the base of the PR and between e1d9bc5 and 54dcc01.

📒 Files selected for processing (9)
  • apps/worker/src/five08/worker/crm/skills_extractor.py
  • packages/shared/pyproject.toml
  • packages/shared/src/five08/data/model-profiles.json
  • packages/shared/src/five08/llm.py
  • packages/shared/src/five08/resume_extractor.py
  • packages/shared/src/five08/resume_skills_extractor.py
  • tests/evals/model-profiles.json
  • tests/unit/test_llm.py
  • tests/unit/test_resume_extractor.py
📝 Walkthrough

Walkthrough

Introduces a ProviderModel abstraction that centralizes OpenAI-compatible LLM request construction. New module defines model profiles, conditionally builds request kwargs based on model capabilities, and normalizes provider-prefixed model names. Four existing modules refactored to route LLM requests through this abstraction instead of directly passing parameters to OpenAI client.

Changes

LLM Provider Model Abstraction and Refactoring

Layer / File(s) Summary
Provider Model Abstraction and Configuration
packages/shared/src/five08/llm.py, packages/shared/src/five08/llm_model_profiles.json
New module introduces ProviderModel, ModelProfile, and OpenAIClientKwargs types. ProviderModel.openai_compatible(...) resolves model names (including OpenRouter normalization), detects provider prefixes, and loads capability profiles from packaged JSON. Conditional kwargs building includes temperature, response_format, reasoning_effort, and verbosity only when the model profile supports them. Helper functions normalize model-name lookups and resolve OpenRouter hosts.
Provider Model Unit Tests
tests/unit/test_llm.py
Five comprehensive tests validate profile resolution for gpt-5-mini and gpt-4.1-mini, OpenRouter name prefixing, unknown model fallback behavior, and rejection of unsupported options (e.g., seed on unsupported models).
Worker Skills Extractor Refactoring
apps/worker/src/five08/worker/crm/skills_extractor.py
Refactors SkillsExtractor to build ProviderModel during init, derive self.model from provider, initialize OpenAI client via provider_model.client_kwargs(), and construct LLM requests through provider_model.chat_completion_kwargs(...) instead of direct parameter passing.
Resume Skills Extractor Refactoring
packages/shared/src/five08/resume_skills_extractor.py
Similar refactoring: stores provider_model, sets self.model from provider, wires OpenAI client with provider-generated kwargs, and routes extract_skills LLM request through chat_completion_kwargs(...).
Job Match Pipeline Refactoring
packages/shared/src/five08/job_match.py, tests/unit/test_job_match.py
Refactors both extract_job_requirements and rerank_shortlisted_candidates to use ProviderModel.openai_compatible(...), client-kwargs generation, and chat_completion_kwargs(...) for request construction. Two unit tests tightened to assert exact model (gpt-5-mini), JSON response format, and omission of temperature in request arguments.
Resume Extractor Refactoring
packages/shared/src/five08/resume_extractor.py, tests/unit/test_resume_extractor.py
Extracts _provider_model() helper, stores base_url on instance, updates _build_completion_kwargs(...) to accept response_format and delegate to provider_model.chat_completion_kwargs(...). Refactors both structured-output and fallback LLM paths to use the new abstraction. Name-splitting LLM call also refactored. Five test fixtures updated to explicitly set extractor.model = "fake-model" after client wiring to ensure consistent model handling across structured and unstructured scenarios.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • 508-dev/508-workflows#142: Refactors job_match.py's OpenAI request construction for extract_job_requirements and reranking to use the ProviderModel abstraction.
  • 508-dev/508-workflows#113: Adds ResumeProfileExtractor with LLM extraction logic that is now refactored in this PR to use ProviderModel for standardized chat-completion kwargs construction.

Poem

🐰 Model profiles dance in JSON arrays bright,
ProviderModels weave request options just right—
Temperature here, reasoning_effort there,
Each LLM call now has a profile to share!
🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 47.37% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title '[codex] Add provider model request profiles' accurately summarizes the main change—introduction of a shared ProviderModel factory with compiled model request profiles—and is clear, concise, and directly related to the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch michaelmwu/provider-model-factory

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a centralized provider/model “request profile” layer for OpenAI-compatible chat completions, then routes key extraction flows (job matching + resume/skills extraction) through that layer so strict providers (e.g., Bifrost) don’t reject unsupported parameters (notably temperature on GPT‑5-style models).

Changes:

  • Added ProviderModel + compiled model request profiles to filter request kwargs based on model capabilities (e.g., omit temperature for gpt-5-mini).
  • Updated job requirements extraction, candidate reranking, resume extraction, and skills extraction to build OpenAI request kwargs via ProviderModel.chat_completion_kwargs().
  • Added/updated unit tests to assert profile-driven kwargs behavior (model selection, response_format, and omitted temperature).

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tests/unit/test_resume_extractor.py Adjusts tests to set a non-GPT5 model where temperature must remain present.
tests/unit/test_llm.py Adds unit coverage for profile-driven kwargs filtering and OpenRouter prefixing.
tests/unit/test_job_match.py Asserts job-match calls use JSON response_format and omit temperature for GPT‑5 models.
packages/shared/src/five08/resume_skills_extractor.py Routes skills extraction calls through ProviderModel-filtered kwargs.
packages/shared/src/five08/resume_extractor.py Routes resume extraction + name-splitting calls through profile-aware kwargs.
packages/shared/src/five08/llm.py Adds ProviderModel/ModelProfile helpers + profile loading/filtering logic.
packages/shared/src/five08/llm_model_profiles.json Adds compiled request-option support matrix for common models.
packages/shared/src/five08/job_match.py Routes extraction and reranking calls through ProviderModel-filtered kwargs.
apps/worker/src/five08/worker/crm/skills_extractor.py Updates worker-side skills extraction to use ProviderModel for request kwargs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +85 to +112
def chat_completion_kwargs(
self,
*,
messages: list[dict[str, str]],
temperature: float | None = None,
max_tokens: int | None = None,
response_format: Any | None = None,
reasoning_effort: str | None = None,
verbosity: str | None = None,
**extra: Any,
) -> dict[str, Any]:
"""Build strict-provider-safe kwargs for chat.completions calls."""
kwargs: dict[str, Any] = {
"model": self.model,
"messages": messages,
}
if max_tokens is not None:
kwargs["max_tokens"] = max_tokens
if response_format is not None and self.supports("response_format"):
kwargs["response_format"] = response_format
if temperature is not None and self.supports("temperature"):
kwargs["temperature"] = temperature
if reasoning_effort is not None and self.supports("reasoning_effort"):
kwargs["reasoning_effort"] = reasoning_effort
if verbosity is not None and self.supports("verbosity"):
kwargs["verbosity"] = verbosity
kwargs.update(extra)
return kwargs
Comment thread packages/shared/src/five08/llm.py Outdated
Comment on lines +128 to +135
def get_model_profile(model: str) -> ModelProfile | None:
"""Look up a model profile, handling provider-prefixed OpenAI names."""
profiles = _load_model_profiles()
normalized = _profile_lookup_key(model)
payload = profiles.get(normalized)
if payload is None:
return _fallback_profile(model)
return _profile_from_payload(normalized, payload)
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
tests/unit/test_job_match.py (1)

178-181: ⚡ Quick win

Cover the legacy-model branch too.

Line 178 and Line 405 only pin the default gpt-5-mini behavior. Add one test that passes model="gpt-4.1-mini" and asserts temperature=0.1 is still forwarded, so this layer also guards the model-dependent filtering path it now relies on.

Also applies to: 405-408

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/test_job_match.py` around lines 178 - 181, The tests only assert
behavior for the default model "gpt-5-mini" and miss the legacy-model branch;
add an additional assertion case in tests referencing the mock call args
(mock_client.chat.completions.create.call_args.kwargs / create_kwargs) that
invokes the code with model="gpt-4.1-mini" and checks that "temperature" is
present and equals 0.1 (i.e., assert create_kwargs["model"] == "gpt-4.1-mini"
and assert create_kwargs["temperature"] == 0.1) so the model-dependent filtering
path is covered; replicate the same additional assertion in the other test block
that currently pins "gpt-5-mini" to cover both spots.
tests/unit/test_resume_extractor.py (1)

953-953: ⚡ Quick win

Use real profiled model names in these request-shaping tests.

fake-model will bypass the concrete model profiles this PR adds, so these assertions can keep passing even if the GPT-5 / gpt-4.1 request-shaping logic regresses. Prefer explicit profiled names per scenario so the test exercises the branch you actually ship.

Also applies to: 1562-1562, 1641-1641, 1712-1712

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/test_resume_extractor.py` at line 953, Replace the placeholder
"fake-model" with the actual profiled model name that matches the scenario being
tested (e.g., a gpt-4.1 or gpt-5 profile string) so the request-shaping branches
are exercised; specifically update extractor.model assignments in
tests/unit/test_resume_extractor.py (the occurrence setting extractor.model =
"fake-model" and the other occurrences referenced) to use the concrete profiled
names for each scenario so the test hits the real profile-based logic rather
than bypassing it.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/shared/src/five08/llm.py`:
- Around line 85-94: The messages parameter in chat_completion_kwargs is too
narrow (list[dict[str, str]]); change its annotation to allow non-string content
(e.g., list[dict[str, Any]] or list[dict[str, Union[str, list[Any]]]]) so
callers can pass structured content for vision models; update the signature of
chat_completion_kwargs and any internal handling that relies on messages being
str-only to accept and pass through arbitrary content types without casting.

In `@packages/shared/src/five08/resume_extractor.py`:
- Around line 2686-2706: The code calls self.client.chat.completions.create with
kwargs from self._provider_model().chat_completion_kwargs but does not request
JSON output, yet later unconditionally parses message.content as JSON; update
the chat completion call to include the response_format (e.g.,
response_format="json" or response_format=True) via
_provider_model().chat_completion_kwargs so the provider is asked to return
JSON, keeping the system/user messages (the name-splitting prompt) intact;
ensure the provider_model call that builds chat_completion_kwargs is passed this
response_format so message.content parsing in the name-splitting routine
succeeds without falling back to heuristics.

---

Nitpick comments:
In `@tests/unit/test_job_match.py`:
- Around line 178-181: The tests only assert behavior for the default model
"gpt-5-mini" and miss the legacy-model branch; add an additional assertion case
in tests referencing the mock call args
(mock_client.chat.completions.create.call_args.kwargs / create_kwargs) that
invokes the code with model="gpt-4.1-mini" and checks that "temperature" is
present and equals 0.1 (i.e., assert create_kwargs["model"] == "gpt-4.1-mini"
and assert create_kwargs["temperature"] == 0.1) so the model-dependent filtering
path is covered; replicate the same additional assertion in the other test block
that currently pins "gpt-5-mini" to cover both spots.

In `@tests/unit/test_resume_extractor.py`:
- Line 953: Replace the placeholder "fake-model" with the actual profiled model
name that matches the scenario being tested (e.g., a gpt-4.1 or gpt-5 profile
string) so the request-shaping branches are exercised; specifically update
extractor.model assignments in tests/unit/test_resume_extractor.py (the
occurrence setting extractor.model = "fake-model" and the other occurrences
referenced) to use the concrete profiled names for each scenario so the test
hits the real profile-based logic rather than bypassing it.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9ae12f52-607a-4823-afd0-d844e2b156b2

📥 Commits

Reviewing files that changed from the base of the PR and between bf167ff and e1d9bc5.

📒 Files selected for processing (9)
  • apps/worker/src/five08/worker/crm/skills_extractor.py
  • packages/shared/src/five08/job_match.py
  • packages/shared/src/five08/llm.py
  • packages/shared/src/five08/llm_model_profiles.json
  • packages/shared/src/five08/resume_extractor.py
  • packages/shared/src/five08/resume_skills_extractor.py
  • tests/unit/test_job_match.py
  • tests/unit/test_llm.py
  • tests/unit/test_resume_extractor.py

Comment thread packages/shared/src/five08/llm.py
Comment thread packages/shared/src/five08/resume_extractor.py
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Comment on lines +161 to +168
@lru_cache(maxsize=1)
def _load_model_profiles() -> dict[str, Mapping[str, Any]]:
try:
raw = resources.files("five08").joinpath(_PROFILE_RESOURCE).read_text()
data = json.loads(raw)
except Exception as exc: # pragma: no cover - static package data should exist
logger.warning("Failed to load LLM model profiles: %s", exc)
return {}
Comment on lines +209 to +214
def _profile_lookup_key(model: str) -> str:
value = (model or "").strip()
for prefix in _OPENAI_PROVIDER_PREFIXES:
if value.startswith(prefix):
return value.removeprefix(prefix)
return value
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0d7b492063

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

temperature=temperature,
max_tokens=max_tokens,
response_format=response_format,
reasoning_effort="minimal",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Use a gpt-5.1-supported reasoning effort

When the resume extractor is configured with gpt-5.1, the new profile in llm_model_profiles.json marks reasoning_effort as supported, so this hard-coded default is sent unchanged; however, the OpenAI Chat Completions docs list gpt-5.1 reasoning values as none, low, medium, and high rather than minimal (https://platform.openai.com/docs/api-reference/chat/create-completion). In that configuration, resume extraction requests will be rejected before falling back or parsing, so this should be model-specific or omitted unless the catalog supplies a valid value.

Useful? React with 👍 / 👎.

Copilot AI review requested due to automatic review settings May 14, 2026 14:30
@michaelmwu michaelmwu merged commit a767be2 into main May 14, 2026
9 checks passed
@michaelmwu michaelmwu deleted the michaelmwu/provider-model-factory branch May 14, 2026 14:34
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated no new comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants