[codex] Add provider model request profiles by michaelmwu · Pull Request #278 · 508-dev/508-workflows

michaelmwu · 2026-05-14T10:31:35Z

Summary

add a shared ProviderModel factory for OpenAI-compatible clients
add compiled model request profiles for top OpenAI model choices
route job matching, resume extraction, and skills extraction through profile-aware request kwargs

Why

Bifrost strictly rejects unsupported model parameters. gpt-5-mini does not accept non-default temperature, but job posting extraction and reranking always sent temperature=0.1, causing automatic posting analysis to fail.

Impact

GPT-5-style models now omit unsupported temperature while older chat models such as gpt-4.1-mini keep it. OpenRouter model prefixing is centralized in the same factory.

Validation

./scripts/test.sh
./scripts/lint.sh
./scripts/mypy.sh

Summary by CodeRabbit

New Features
- Added LLM model profile configuration system supporting OpenAI-compatible models with capability-aware request parameter handling.
Refactor
- Refactored LLM request construction across multiple modules to use standardized provider model abstraction for improved configuration consistency.
Tests
- Added unit tests validating model profile behavior and request option support across different LLM providers.

coderabbitai · 2026-05-14T10:31:45Z

Warning

Rate limit exceeded

@michaelmwu has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 1 minute and 15 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1a4ed1a3-3fbf-4e09-bf1d-2c9bd8d82019

📥 Commits

Reviewing files that changed from the base of the PR and between e1d9bc5 and 54dcc01.

📒 Files selected for processing (9)

apps/worker/src/five08/worker/crm/skills_extractor.py
packages/shared/pyproject.toml
packages/shared/src/five08/data/model-profiles.json
packages/shared/src/five08/llm.py
packages/shared/src/five08/resume_extractor.py
packages/shared/src/five08/resume_skills_extractor.py
tests/evals/model-profiles.json
tests/unit/test_llm.py
tests/unit/test_resume_extractor.py

📝 Walkthrough

Walkthrough

Introduces a ProviderModel abstraction that centralizes OpenAI-compatible LLM request construction. New module defines model profiles, conditionally builds request kwargs based on model capabilities, and normalizes provider-prefixed model names. Four existing modules refactored to route LLM requests through this abstraction instead of directly passing parameters to OpenAI client.

Changes

LLM Provider Model Abstraction and Refactoring

Layer / File(s)	Summary
Provider Model Abstraction and Configuration `packages/shared/src/five08/llm.py`, `packages/shared/src/five08/llm_model_profiles.json`	New module introduces `ProviderModel`, `ModelProfile`, and `OpenAIClientKwargs` types. `ProviderModel.openai_compatible(...)` resolves model names (including OpenRouter normalization), detects provider prefixes, and loads capability profiles from packaged JSON. Conditional kwargs building includes `temperature`, `response_format`, `reasoning_effort`, and `verbosity` only when the model profile supports them. Helper functions normalize model-name lookups and resolve OpenRouter hosts.
Provider Model Unit Tests `tests/unit/test_llm.py`	Five comprehensive tests validate profile resolution for `gpt-5-mini` and `gpt-4.1-mini`, OpenRouter name prefixing, unknown model fallback behavior, and rejection of unsupported options (e.g., `seed` on unsupported models).
Worker Skills Extractor Refactoring `apps/worker/src/five08/worker/crm/skills_extractor.py`	Refactors `SkillsExtractor` to build `ProviderModel` during init, derive `self.model` from provider, initialize OpenAI client via `provider_model.client_kwargs()`, and construct LLM requests through `provider_model.chat_completion_kwargs(...)` instead of direct parameter passing.
Resume Skills Extractor Refactoring `packages/shared/src/five08/resume_skills_extractor.py`	Similar refactoring: stores `provider_model`, sets `self.model` from provider, wires OpenAI client with provider-generated kwargs, and routes `extract_skills` LLM request through `chat_completion_kwargs(...)`.
Job Match Pipeline Refactoring `packages/shared/src/five08/job_match.py`, `tests/unit/test_job_match.py`	Refactors both `extract_job_requirements` and `rerank_shortlisted_candidates` to use `ProviderModel.openai_compatible(...)`, client-kwargs generation, and `chat_completion_kwargs(...)` for request construction. Two unit tests tightened to assert exact model (`gpt-5-mini`), JSON response format, and omission of temperature in request arguments.
Resume Extractor Refactoring `packages/shared/src/five08/resume_extractor.py`, `tests/unit/test_resume_extractor.py`	Extracts `_provider_model()` helper, stores `base_url` on instance, updates `_build_completion_kwargs(...)` to accept `response_format` and delegate to `provider_model.chat_completion_kwargs(...)`. Refactors both structured-output and fallback LLM paths to use the new abstraction. Name-splitting LLM call also refactored. Five test fixtures updated to explicitly set `extractor.model = "fake-model"` after client wiring to ensure consistent model handling across structured and unstructured scenarios.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

508-dev/508-workflows#142: Refactors job_match.py's OpenAI request construction for extract_job_requirements and reranking to use the ProviderModel abstraction.
508-dev/508-workflows#113: Adds ResumeProfileExtractor with LLM extraction logic that is now refactored in this PR to use ProviderModel for standardized chat-completion kwargs construction.

Poem

🐰 Model profiles dance in JSON arrays bright,
ProviderModels weave request options just right—
Temperature here, reasoning_effort there,
Each LLM call now has a profile to share!
🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 47.37% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title '[codex] Add provider model request profiles' accurately summarizes the main change—introduction of a shared ProviderModel factory with compiled model request profiles—and is clear, concise, and directly related to the changeset.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch michaelmwu/provider-model-factory

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Copilot

Pull request overview

This PR introduces a centralized provider/model “request profile” layer for OpenAI-compatible chat completions, then routes key extraction flows (job matching + resume/skills extraction) through that layer so strict providers (e.g., Bifrost) don’t reject unsupported parameters (notably temperature on GPT‑5-style models).

Changes:

Added ProviderModel + compiled model request profiles to filter request kwargs based on model capabilities (e.g., omit temperature for gpt-5-mini).
Updated job requirements extraction, candidate reranking, resume extraction, and skills extraction to build OpenAI request kwargs via ProviderModel.chat_completion_kwargs().
Added/updated unit tests to assert profile-driven kwargs behavior (model selection, response_format, and omitted temperature).

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
tests/unit/test_resume_extractor.py	Adjusts tests to set a non-GPT5 model where temperature must remain present.
tests/unit/test_llm.py	Adds unit coverage for profile-driven kwargs filtering and OpenRouter prefixing.
tests/unit/test_job_match.py	Asserts job-match calls use JSON response_format and omit temperature for GPT‑5 models.
packages/shared/src/five08/resume_skills_extractor.py	Routes skills extraction calls through `ProviderModel`-filtered kwargs.
packages/shared/src/five08/resume_extractor.py	Routes resume extraction + name-splitting calls through profile-aware kwargs.
packages/shared/src/five08/llm.py	Adds `ProviderModel`/`ModelProfile` helpers + profile loading/filtering logic.
packages/shared/src/five08/llm_model_profiles.json	Adds compiled request-option support matrix for common models.
packages/shared/src/five08/job_match.py	Routes extraction and reranking calls through `ProviderModel`-filtered kwargs.
apps/worker/src/five08/worker/crm/skills_extractor.py	Updates worker-side skills extraction to use `ProviderModel` for request kwargs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+    def chat_completion_kwargs(
+        self,
+        *,
+        messages: list[dict[str, str]],
+        temperature: float | None = None,
+        max_tokens: int | None = None,
+        response_format: Any | None = None,
+        reasoning_effort: str | None = None,
+        verbosity: str | None = None,
+        **extra: Any,
+    ) -> dict[str, Any]:
+        """Build strict-provider-safe kwargs for chat.completions calls."""
+        kwargs: dict[str, Any] = {
+            "model": self.model,
+            "messages": messages,
+        }
+        if max_tokens is not None:
+            kwargs["max_tokens"] = max_tokens
+        if response_format is not None and self.supports("response_format"):
+            kwargs["response_format"] = response_format
+        if temperature is not None and self.supports("temperature"):
+            kwargs["temperature"] = temperature
+        if reasoning_effort is not None and self.supports("reasoning_effort"):
+            kwargs["reasoning_effort"] = reasoning_effort
+        if verbosity is not None and self.supports("verbosity"):
+            kwargs["verbosity"] = verbosity
+        kwargs.update(extra)
+        return kwargs


+def get_model_profile(model: str) -> ModelProfile | None:
+    """Look up a model profile, handling provider-prefixed OpenAI names."""
+    profiles = _load_model_profiles()
+    normalized = _profile_lookup_key(model)
+    payload = profiles.get(normalized)
+    if payload is None:
+        return _fallback_profile(model)
+    return _profile_from_payload(normalized, payload)


coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

tests/unit/test_job_match.py (1)
178-181: ⚡ Quick win

Cover the legacy-model branch too.

Line 178 and Line 405 only pin the default gpt-5-mini behavior. Add one test that passes model="gpt-4.1-mini" and asserts temperature=0.1 is still forwarded, so this layer also guards the model-dependent filtering path it now relies on.

Also applies to: 405-408
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/test_job_match.py` around lines 178 - 181, The tests only assert
behavior for the default model "gpt-5-mini" and miss the legacy-model branch;
add an additional assertion case in tests referencing the mock call args
(mock_client.chat.completions.create.call_args.kwargs / create_kwargs) that
invokes the code with model="gpt-4.1-mini" and checks that "temperature" is
present and equals 0.1 (i.e., assert create_kwargs["model"] == "gpt-4.1-mini"
and assert create_kwargs["temperature"] == 0.1) so the model-dependent filtering
path is covered; replicate the same additional assertion in the other test block
that currently pins "gpt-5-mini" to cover both spots.
tests/unit/test_resume_extractor.py (1)
953-953: ⚡ Quick win

Use real profiled model names in these request-shaping tests.

fake-model will bypass the concrete model profiles this PR adds, so these assertions can keep passing even if the GPT-5 / gpt-4.1 request-shaping logic regresses. Prefer explicit profiled names per scenario so the test exercises the branch you actually ship.

Also applies to: 1562-1562, 1641-1641, 1712-1712
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/test_resume_extractor.py` at line 953, Replace the placeholder
"fake-model" with the actual profiled model name that matches the scenario being
tested (e.g., a gpt-4.1 or gpt-5 profile string) so the request-shaping branches
are exercised; specifically update extractor.model assignments in
tests/unit/test_resume_extractor.py (the occurrence setting extractor.model =
"fake-model" and the other occurrences referenced) to use the concrete profiled
names for each scenario so the test hits the real profile-based logic rather
than bypassing it.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/shared/src/five08/llm.py`:
- Around line 85-94: The messages parameter in chat_completion_kwargs is too
narrow (list[dict[str, str]]); change its annotation to allow non-string content
(e.g., list[dict[str, Any]] or list[dict[str, Union[str, list[Any]]]]) so
callers can pass structured content for vision models; update the signature of
chat_completion_kwargs and any internal handling that relies on messages being
str-only to accept and pass through arbitrary content types without casting.

In `@packages/shared/src/five08/resume_extractor.py`:
- Around line 2686-2706: The code calls self.client.chat.completions.create with
kwargs from self._provider_model().chat_completion_kwargs but does not request
JSON output, yet later unconditionally parses message.content as JSON; update
the chat completion call to include the response_format (e.g.,
response_format="json" or response_format=True) via
_provider_model().chat_completion_kwargs so the provider is asked to return
JSON, keeping the system/user messages (the name-splitting prompt) intact;
ensure the provider_model call that builds chat_completion_kwargs is passed this
response_format so message.content parsing in the name-splitting routine
succeeds without falling back to heuristics.

---

Nitpick comments:
In `@tests/unit/test_job_match.py`:
- Around line 178-181: The tests only assert behavior for the default model
"gpt-5-mini" and miss the legacy-model branch; add an additional assertion case
in tests referencing the mock call args
(mock_client.chat.completions.create.call_args.kwargs / create_kwargs) that
invokes the code with model="gpt-4.1-mini" and checks that "temperature" is
present and equals 0.1 (i.e., assert create_kwargs["model"] == "gpt-4.1-mini"
and assert create_kwargs["temperature"] == 0.1) so the model-dependent filtering
path is covered; replicate the same additional assertion in the other test block
that currently pins "gpt-5-mini" to cover both spots.

In `@tests/unit/test_resume_extractor.py`:
- Line 953: Replace the placeholder "fake-model" with the actual profiled model
name that matches the scenario being tested (e.g., a gpt-4.1 or gpt-5 profile
string) so the request-shaping branches are exercised; specifically update
extractor.model assignments in tests/unit/test_resume_extractor.py (the
occurrence setting extractor.model = "fake-model" and the other occurrences
referenced) to use the concrete profiled names for each scenario so the test
hits the real profile-based logic rather than bypassing it.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9ae12f52-607a-4823-afd0-d844e2b156b2

📥 Commits

Reviewing files that changed from the base of the PR and between bf167ff and e1d9bc5.

📒 Files selected for processing (9)

apps/worker/src/five08/worker/crm/skills_extractor.py
packages/shared/src/five08/job_match.py
packages/shared/src/five08/llm.py
packages/shared/src/five08/llm_model_profiles.json
packages/shared/src/five08/resume_extractor.py
packages/shared/src/five08/resume_skills_extractor.py
tests/unit/test_job_match.py
tests/unit/test_llm.py
tests/unit/test_resume_extractor.py

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

+@lru_cache(maxsize=1)
+def _load_model_profiles() -> dict[str, Mapping[str, Any]]:
+    try:
+        raw = resources.files("five08").joinpath(_PROFILE_RESOURCE).read_text()
+        data = json.loads(raw)
+    except Exception as exc:  # pragma: no cover - static package data should exist
+        logger.warning("Failed to load LLM model profiles: %s", exc)
+        return {}


+def _profile_lookup_key(model: str) -> str:
+    value = (model or "").strip()
+    for prefix in _OPENAI_PROVIDER_PREFIXES:
+        if value.startswith(prefix):
+            return value.removeprefix(prefix)
+    return value


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0d7b492063

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-14T13:53:11Z

+            temperature=temperature,
+            max_tokens=max_tokens,
+            response_format=response_format,
+            reasoning_effort="minimal",


Use a gpt-5.1-supported reasoning effort

When the resume extractor is configured with gpt-5.1, the new profile in llm_model_profiles.json marks reasoning_effort as supported, so this hard-coded default is sent unchanged; however, the OpenAI Chat Completions docs list gpt-5.1 reasoning values as none, low, medium, and high rather than minimal (https://platform.openai.com/docs/api-reference/chat/create-completion). In that configuration, resume extraction requests will be rejected before falling back or parsing, so this should be model-specific or omitted unless the catalog supplies a valid value.

Useful? React with 👍 / 👎.

Copilot

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated no new comments.

Add provider model request profiles

307ad13

Copilot AI review requested due to automatic review settings May 14, 2026 10:31

Copilot started reviewing on behalf of michaelmwu May 14, 2026 10:32 View session

Copilot AI reviewed May 14, 2026

View reviewed changes

michaelmwu added 2 commits May 14, 2026 21:32

Address provider model review feedback

e1d9bc5

Merge origin/main into provider model factory

c737354

Copilot AI review requested due to automatic review settings May 14, 2026 13:38

michaelmwu temporarily deployed to test May 14, 2026 13:38 — with GitHub Actions Inactive

Copilot started reviewing on behalf of michaelmwu May 14, 2026 13:38 View session

coderabbitai Bot reviewed May 14, 2026

View reviewed changes

Comment thread packages/shared/src/five08/llm.py

Comment thread packages/shared/src/five08/resume_extractor.py

Copilot AI reviewed May 14, 2026

View reviewed changes

Address model profile review comments

0d7b492

michaelmwu temporarily deployed to test May 14, 2026 13:48 — with GitHub Actions Inactive

chatgpt-codex-connector Bot reviewed May 14, 2026

View reviewed changes

Add gpt-5.1 chat completion profile

54dcc01

Copilot AI review requested due to automatic review settings May 14, 2026 14:30

michaelmwu temporarily deployed to test May 14, 2026 14:31 — with GitHub Actions Inactive

Copilot started reviewing on behalf of michaelmwu May 14, 2026 14:31 View session

michaelmwu merged commit a767be2 into main May 14, 2026
9 checks passed

michaelmwu deleted the michaelmwu/provider-model-factory branch May 14, 2026 14:34

Copilot AI reviewed May 14, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[codex] Add provider model request profiles#278

[codex] Add provider model request profiles#278
michaelmwu merged 5 commits into
mainfrom
michaelmwu/provider-model-factory

michaelmwu commented May 14, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 14, 2026 •

edited

Loading

Rate limit exceeded

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

Copilot AI left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 14, 2026

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

michaelmwu commented May 14, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why

Impact

Validation

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 14, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

michaelmwu commented May 14, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 14, 2026 •

edited

Loading