refactor(llm-access-codex): split request.rs into request/ submodules#4
Conversation
Break the 3886-line request.rs into a request/mod.rs facade plus 9 concern-focused submodules (prepare, policy, last_message, headers, native_responses, normalization, tools, chat_completions, path), following the agreed module style: - mod.rs convention (matches the rest of the workspace). - No `use super::*` in submodules: each imports its deps explicitly from their origin. No `pub use X::*` facade globs: the parent re-exports only the 12 items used outside the module, by name. - Minimal visibility: items crossing a module boundary are `pub` with a detailed `///` doc; file-internal helpers stay private. No `pub(crate)`. - Tests inlined: the 2 adapt_openai_chat_completions_request unit tests move into chat_completions.rs (so the fn under test needs no extra exposure); the 8 pipeline integration tests stay in mod.rs against the public API. prepare_gateway_request / read_gateway_request_body are #[cfg(test)] (only reachable from tests), so they're gated accordingly. Also removes 2 dead pub fns PR #3 missed because grep -w could not tell them apart from admin's same-named fns: codex normalize_name / normalize_status (zero workspace callers; admin has its own), plus the 2 now-orphan LLM_GATEWAY_KEY_STATUS_* consts they used. No behavior change: clippy -D warnings clean, 79 codex tests pass, and the reverse-dep build (llm-access binary + llm-access-store) compiles against the narrowed public surface. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request introduces a comprehensive request normalization and adaptation layer for a Codex/OpenAI-compatible gateway. It adds modules to handle chat completions mapping, header extraction, last-message logging, native responses validation, and billing policies. The review feedback identifies two critical issues: a performance bottleneck in prepare.rs due to redundant JSON deserialization and serialization during model normalization, and a potential thread panic in tools.rs when truncating tool names containing multi-byte UTF-8 characters. Addressing these issues will improve both the performance and robustness of the gateway.
| let request_body = match json_value { | ||
| Some(value) => Bytes::from( | ||
| serde_json::to_vec(&value) | ||
| .map_err(|err| internal_error("Failed to encode gateway request body", err))?, | ||
| ), | ||
| None => body, | ||
| }; | ||
|
|
||
| let (request_body, model) = normalize_codex_public_model(request_body, model)?; |
There was a problem hiding this comment.
Performance Bottleneck: Redundant Deserialization/Serialization
Currently, normalize_codex_public_model takes the serialized request_body (as Bytes), deserializes it back into a serde_json::Value, modifies the model field, and then serializes it back to Bytes.
Since json_value is already parsed and available as a mutable Value in prepare_gateway_request_from_bytes, we can perform this normalization directly on json_value before serializing it. This completely avoids an extra round of JSON parsing and serialization, which is highly beneficial for performance under load.
let mut model = model;
if let Some(ref m) = model {
if !m.starts_with("gpt-") {
if let Some(Value::Object(ref mut root)) = json_value {
root.insert("model".to_string(), Value::String(super::DEFAULT_PUBLIC_GPT_MODEL_ID.to_string()));
model = Some(super::DEFAULT_PUBLIC_GPT_MODEL_ID.to_string());
}
}
}
let request_body = match json_value {
Some(value) => Bytes::from(
serde_json::to_vec(&value)
.map_err(|err| internal_error("Failed to encode gateway request body", err))?,
),
None => body,
};There was a problem hiding this comment.
Valid point on the redundant parse, but holding it out of this PR. (1) This PR is a pure structural split (no behaviour change) — inlining normalize_codex_public_model into prepare_gateway_request_from_bytes is a logic refactor, separate concern. (2) The suggested snippet changes error semantics: the current fn returns internal_error when model is non-gpt- and the body is not a JSON object, whereas if let Some(Value::Object(..)) = json_value silently passes that through. (3) It is a cold path — only the non-gpt--prefixed public-model fallback re-parses; normal codex requests (gpt-5.3-codex, ...) hit the early model.starts_with("gpt-") return and never re-serialize. Will do it as a focused follow-up PR with the error semantics preserved.
| if candidate.len() > MAX_OPENAI_TOOL_NAME_LEN { | ||
| candidate.truncate(MAX_OPENAI_TOOL_NAME_LEN); | ||
| } |
There was a problem hiding this comment.
Potential Panic on Multi-byte UTF-8 Characters
String::truncate panics if the specified limit does not fall on a UTF-8 character boundary. If a client sends a tool name containing multi-byte Unicode characters, truncating it using a byte-length limit like MAX_OPENAI_TOOL_NAME_LEN can cause a thread panic and crash the request handler.
To prevent this, truncate safely using character boundaries.
| if candidate.len() > MAX_OPENAI_TOOL_NAME_LEN { | |
| candidate.truncate(MAX_OPENAI_TOOL_NAME_LEN); | |
| } | |
| if candidate.len() > MAX_OPENAI_TOOL_NAME_LEN { | |
| candidate = candidate.chars().take(MAX_OPENAI_TOOL_NAME_LEN).collect(); | |
| } |
There was a problem hiding this comment.
Fixed in be58bd5: the mcp__ branch now truncates with chars().take(MAX_OPENAI_TOOL_NAME_LEN).collect() (matching the char-based fallback at the end of the fn), so a multi-byte UTF-8 tool name can no longer panic String::truncate. ASCII-identical.
`String::truncate` panics when the byte index splits a multi-byte UTF-8 character. A tool name with Unicode after the `mcp__` prefix could exceed MAX_OPENAI_TOOL_NAME_LEN bytes and crash the request handler on truncate. Use char-based truncation (`chars().take(..).collect()`), matching the fallback path already in the same fn. ASCII-identical; multi-byte-safe. Addresses a PR review finding from gemini-code-assist. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Second step of the incremental
llm-access*refactor: split the 3886-linellm-access-codex/src/request.rsinto arequest/mod.rsfacade plus 9concern-focused submodules. Pure structural change — no behavior difference.
What changed
preparepolicylast_messageheadersnative_responses/responsesinstruction injection, field stripping, tool-role repair, validationnormalizationnormalize_responses_requesttoolschat_completions/responsesinput adaptationpathModule style (the agreed standard going forward):
mod.rsconvention — matches the rest of the workspace.from its origin (no
use super::*); the facade re-exports only the 12items used outside the module, by name (no
pub use X::*).pubwith adetailed
///doc; file-internal helpers stay private. Nopub(crate).adapt_openai_chat_completions_requestunit testsmove into
chat_completions.rs(so the fn under test needs no extraexposure); the 8 pipeline integration tests stay in
mod.rsagainst thepublic API.
prepare_gateway_request/read_gateway_request_bodyare onlyreachable from tests, so they're
#[cfg(test)]-gated.Also removes 2 dead
pub fnthat PR #3 missed — codexnormalize_name/normalize_statushad zero workspace callers butgrep -wcouldn't tell themapart from admin's identically-named fns. Plus the 2 now-orphan
LLM_GATEWAY_KEY_STATUS_*consts they used.Verification
cargo clippy -p llm-access-codex -- -D warnings: clean.llm-accessbinary +llm-access-store) compilesagainst the narrowed public surface — proves the re-export list is complete.
Net diff is
+4042 / −3886; the growth is module boilerplate + the///docsnow required on the cross-module surface.
Review feedback (gemini-code-assist)
tools.rs— applied (commit in this PR). Themcp__tool-name shortening usedString::truncate, which panics when thebyte index splits a multi-byte char; switched to
chars().take(..).collect()to match the fallback already in the same fn. ASCII-identical, multi-byte-safe.
normalize_codex_public_model— deferred to afocused follow-up PR. It's a logic refactor (out of scope for this pure
structural split), the suggested snippet would change error semantics (turns a
non-JSON-object error into a silent pass-through), and it only affects the
cold non-
gpt--prefix fallback path, not normal codex requests.