Skip to content

refactor(llm-access-codex): split request.rs into request/ submodules#4

Merged
acking-you merged 2 commits into
masterfrom
refactor/split-codex-request
May 30, 2026
Merged

refactor(llm-access-codex): split request.rs into request/ submodules#4
acking-you merged 2 commits into
masterfrom
refactor/split-codex-request

Conversation

@acking-you
Copy link
Copy Markdown
Owner

@acking-you acking-you commented May 30, 2026

Summary

Second step of the incremental llm-access* refactor: split the 3886-line
llm-access-codex/src/request.rs into a request/mod.rs facade plus 9
concern-focused submodules. Pure structural change — no behavior difference.

What changed

submodule concern
prepare request body intake: read, size-guard, decode, drive normalization
policy gpt-5.3-codex spark remap, fast tier, store alignment, billing multiplier
last_message best-effort last-message text extraction for usage logging
headers header / client-IP / origin / URL parsing
native_responses native /responses instruction injection, field stripping, tool-role repair, validation
normalization upstream base-url, model/effort, field filter/validate; hosts normalize_responses_request
tools tool-schema normalization + OpenAI tool-name mangle/restore maps
chat_completions OpenAI Chat Completions → /responses input adaptation
path gateway path classification

Module style (the agreed standard going forward):

  • mod.rs convention — matches the rest of the workspace.
  • No glob imports/re-exports. Submodules import each dependency explicitly
    from its origin (no use super::*); the facade re-exports only the 12
    items used outside the module
    , by name (no pub use X::*).
  • Minimal visibility. Items crossing a module boundary are pub with a
    detailed /// doc; file-internal helpers stay private. No pub(crate).
  • Tests inlined. The 2 adapt_openai_chat_completions_request unit tests
    move into chat_completions.rs (so the fn under test needs no extra
    exposure); the 8 pipeline integration tests stay in mod.rs against the
    public API. prepare_gateway_request / read_gateway_request_body are only
    reachable from tests, so they're #[cfg(test)]-gated.

Also removes 2 dead pub fn that PR #3 missed — codex normalize_name /
normalize_status had zero workspace callers but grep -w couldn't tell them
apart from admin's identically-named fns. Plus the 2 now-orphan
LLM_GATEWAY_KEY_STATUS_* consts they used.

Verification

  • cargo clippy -p llm-access-codex -- -D warnings: clean.
  • 79 codex tests pass (the 2 relocated unit tests included).
  • Reverse-dependency build (llm-access binary + llm-access-store) compiles
    against the narrowed public surface — proves the re-export list is complete.

Net diff is +4042 / −3886; the growth is module boilerplate + the /// docs
now required on the cross-module surface.

Review feedback (gemini-code-assist)

  • UTF-8 truncate panic in tools.rs — applied (commit in this PR). The
    mcp__ tool-name shortening used String::truncate, which panics when the
    byte index splits a multi-byte char; switched to chars().take(..).collect()
    to match the fallback already in the same fn. ASCII-identical, multi-byte-safe.
  • Redundant JSON re-parse in normalize_codex_public_model — deferred to a
    focused follow-up PR. It's a logic refactor (out of scope for this pure
    structural split), the suggested snippet would change error semantics (turns a
    non-JSON-object error into a silent pass-through), and it only affects the
    cold non-gpt--prefix fallback path, not normal codex requests.

Break the 3886-line request.rs into a request/mod.rs facade plus 9
concern-focused submodules (prepare, policy, last_message, headers,
native_responses, normalization, tools, chat_completions, path), following
the agreed module style:

- mod.rs convention (matches the rest of the workspace).
- No `use super::*` in submodules: each imports its deps explicitly from
  their origin. No `pub use X::*` facade globs: the parent re-exports only
  the 12 items used outside the module, by name.
- Minimal visibility: items crossing a module boundary are `pub` with a
  detailed `///` doc; file-internal helpers stay private. No `pub(crate)`.
- Tests inlined: the 2 adapt_openai_chat_completions_request unit tests move
  into chat_completions.rs (so the fn under test needs no extra exposure);
  the 8 pipeline integration tests stay in mod.rs against the public API.
  prepare_gateway_request / read_gateway_request_body are #[cfg(test)] (only
  reachable from tests), so they're gated accordingly.

Also removes 2 dead pub fns PR #3 missed because grep -w could not tell them
apart from admin's same-named fns: codex normalize_name / normalize_status
(zero workspace callers; admin has its own), plus the 2 now-orphan
LLM_GATEWAY_KEY_STATUS_* consts they used.

No behavior change: clippy -D warnings clean, 79 codex tests pass, and the
reverse-dep build (llm-access binary + llm-access-store) compiles against the
narrowed public surface.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a comprehensive request normalization and adaptation layer for a Codex/OpenAI-compatible gateway. It adds modules to handle chat completions mapping, header extraction, last-message logging, native responses validation, and billing policies. The review feedback identifies two critical issues: a performance bottleneck in prepare.rs due to redundant JSON deserialization and serialization during model normalization, and a potential thread panic in tools.rs when truncating tool names containing multi-byte UTF-8 characters. Addressing these issues will improve both the performance and robustness of the gateway.

Comment on lines +198 to +206
let request_body = match json_value {
Some(value) => Bytes::from(
serde_json::to_vec(&value)
.map_err(|err| internal_error("Failed to encode gateway request body", err))?,
),
None => body,
};

let (request_body, model) = normalize_codex_public_model(request_body, model)?;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Performance Bottleneck: Redundant Deserialization/Serialization

Currently, normalize_codex_public_model takes the serialized request_body (as Bytes), deserializes it back into a serde_json::Value, modifies the model field, and then serializes it back to Bytes.

Since json_value is already parsed and available as a mutable Value in prepare_gateway_request_from_bytes, we can perform this normalization directly on json_value before serializing it. This completely avoids an extra round of JSON parsing and serialization, which is highly beneficial for performance under load.

    let mut model = model;
    if let Some(ref m) = model {
        if !m.starts_with("gpt-") {
            if let Some(Value::Object(ref mut root)) = json_value {
                root.insert("model".to_string(), Value::String(super::DEFAULT_PUBLIC_GPT_MODEL_ID.to_string()));
                model = Some(super::DEFAULT_PUBLIC_GPT_MODEL_ID.to_string());
            }
        }
    }

    let request_body = match json_value {
        Some(value) => Bytes::from(
            serde_json::to_vec(&value)
                .map_err(|err| internal_error("Failed to encode gateway request body", err))?,
        ),
        None => body,
    };

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid point on the redundant parse, but holding it out of this PR. (1) This PR is a pure structural split (no behaviour change) — inlining normalize_codex_public_model into prepare_gateway_request_from_bytes is a logic refactor, separate concern. (2) The suggested snippet changes error semantics: the current fn returns internal_error when model is non-gpt- and the body is not a JSON object, whereas if let Some(Value::Object(..)) = json_value silently passes that through. (3) It is a cold path — only the non-gpt--prefixed public-model fallback re-parses; normal codex requests (gpt-5.3-codex, ...) hit the early model.starts_with("gpt-") return and never re-serialize. Will do it as a focused follow-up PR with the error semantics preserved.

Comment on lines +104 to +106
if candidate.len() > MAX_OPENAI_TOOL_NAME_LEN {
candidate.truncate(MAX_OPENAI_TOOL_NAME_LEN);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Potential Panic on Multi-byte UTF-8 Characters

String::truncate panics if the specified limit does not fall on a UTF-8 character boundary. If a client sends a tool name containing multi-byte Unicode characters, truncating it using a byte-length limit like MAX_OPENAI_TOOL_NAME_LEN can cause a thread panic and crash the request handler.

To prevent this, truncate safely using character boundaries.

Suggested change
if candidate.len() > MAX_OPENAI_TOOL_NAME_LEN {
candidate.truncate(MAX_OPENAI_TOOL_NAME_LEN);
}
if candidate.len() > MAX_OPENAI_TOOL_NAME_LEN {
candidate = candidate.chars().take(MAX_OPENAI_TOOL_NAME_LEN).collect();
}

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in be58bd5: the mcp__ branch now truncates with chars().take(MAX_OPENAI_TOOL_NAME_LEN).collect() (matching the char-based fallback at the end of the fn), so a multi-byte UTF-8 tool name can no longer panic String::truncate. ASCII-identical.

`String::truncate` panics when the byte index splits a multi-byte UTF-8
character. A tool name with Unicode after the `mcp__` prefix could exceed
MAX_OPENAI_TOOL_NAME_LEN bytes and crash the request handler on truncate.
Use char-based truncation (`chars().take(..).collect()`), matching the
fallback path already in the same fn. ASCII-identical; multi-byte-safe.

Addresses a PR review finding from gemini-code-assist.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant