-
Notifications
You must be signed in to change notification settings - Fork 7.4k
fix(core) Preserve base_instructions in SessionMeta #9427
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
aibrahim-oai
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clean!
|
@codex review this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7b51f01332
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| let base_instructions = sess.get_base_instructions().await; | ||
|
|
||
| let prompt = Prompt { | ||
| input, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Align token estimation with session base instructions
Sampling requests now pull base_instructions from the session (sess.get_base_instructions()), but token estimation/auto‑compaction still uses turn_context.client.get_model_info().base_instructions in ContextManager::estimate_token_count. When a session’s base instructions diverge from the model default (e.g., config override, resume/fork with preserved instructions, or switching models mid‑session), the estimate becomes inaccurate, which can delay auto‑compaction or misreport token usage and lead to context‑window errors. Consider basing the estimate on the session base instructions used for prompts so the count matches actual request payloads.
Useful? React with 👍 / 👎.
|
Looks like bazel build is failing due to cross-crate import, will duplicate across before merging |
jif-oai
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this means that we can't modify the instructions once the session is started (like the /search was doing before for example)
If we are ok with this, good for me
| "cwd": ".", | ||
| "originator": "test_originator", | ||
| "cli_version": "test_version", | ||
| "base_instructions": null, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make sure we have a test where this is just missing as well?
| let mut payload = serde_json::json!({ | ||
| "id": uuid, | ||
| "timestamp": ts_str, | ||
| "cwd": ".", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to have a test where this is not null
|
|
||
| fn build_agent_spawn_config(turn: &TurnContext) -> Result<Config, FunctionCallError> { | ||
| fn build_agent_spawn_config( | ||
| base_instructions: &BaseInstructions, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are already some base_instructions in TurnContext, why do you need different once here? Can you document a bit please? (tbh I don't even make sure we need this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm removing base_instructions from TurnContext altogether, since it shouldn't change from turn to turn. Happy to add a general comment.
But I think the spirit of this change is "build_agent_spawn_config" should inherit session-level config, not just config housed in TurnContext. If the spirit of TurnContext is to denormalize session-level information as well, I'm happy to shift in that direction, but I think we'd want to refactor the serialization there, since we wouldn't want to save the entire base_instructions to the rollout in every turn.
|
|
||
| use super::*; | ||
|
|
||
| struct InstructionsTestCase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yesssss
| } | ||
|
|
||
| /// Base instructions for the model in a thread. Corresponds to the `instructions` field in the ResponsesAPI. | ||
| #[derive(Debug, Clone, Serialize, Deserialize, PartialEq, JsonSchema, TS)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make 100% we don't serialize using this serializer when sending to the model as it would be wrong
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@codex check for existing test coverage on the instructions field for the responses api request
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Answer
I checked for existing tests that explicitly assert the instructions field sent in Responses API requests (the field described in BaseInstructions in protocol/src/models.rs). There is existing coverage in core suite tests that validate the instructions field contents across requests, including cases like prompt caching, compaction, and resume/fork flows.
Existing tests that assert instructions in Responses requests
- Prompt caching: Verifies that multiple requests send identical
instructions(derived from base instructions plus tool instructions when applicable). This is a direct assertion onbody["instructions"]for two requests. codex-rs/core/tests/suite/prompt_caching.rsL130-L223 - Manual compaction: Confirms the
instructionsfield is preserved across the initial request and the compaction request (ensuring the same instructions are sent). codex-rs/core/tests/suite/compact.rsL198-L211 - Resume with model switch: Captures
instructionsfrom the initial request and asserts the resumed request uses the sameinstructionsvalue. codex-rs/core/tests/suite/resume.rsL167-L205 - Compact → Resume → Fork: Saves
requests[0]["instructions"]and reuses it as the expected prompt payload for later requests, ensuring the original instructions are carried forward across flows. codex-rs/core/tests/suite/compact_resume_fork.rsL215-L263
These tests together exercise the instructions field in the Responses API request bodies across multiple session lifecycle scenarios, which should cover the concern raised at protocol/src/models.rs line ~170 (the BaseInstructions struct tied to the instructions field).
Commands I ran to verify coverage
rg -n "instructions" codex-rs -g"*.rs"(used to locate tests referencinginstructionsand request bodies). codex-rs/core/tests/suite/prompt_caching.rsL130-L223sed -n '130,240p' codex-rs/core/tests/suite/prompt_caching.rscodex-rs/core/tests/suite/prompt_caching.rsL130-L223sed -n '170,240p' codex-rs/core/tests/suite/compact.rscodex-rs/core/tests/suite/compact.rsL198-L211sed -n '150,230p' codex-rs/core/tests/suite/resume.rscodex-rs/core/tests/suite/resume.rsL167-L205sed -n '200,270p' codex-rs/core/tests/suite/compact_resume_fork.rscodex-rs/core/tests/suite/compact_resume_fork.rsL215-L263
If you want deeper coverage focused specifically on serialization paths in protocol/src/models.rs, I can add a unit test directly in that crate, but the current suite already validates the instructions field in actual request payloads sent to the Responses API.
Summary
This PR consolidates base_instructions onto SessionMeta / SessionConfiguration, so we ensure
base_instructionsis set once per session and should be (mostly) immutable, unless:In a future PR, we should convert all references to
base_instructionsto consistently used the typed struct, so it's less likely that we put other strings there. See #9423. However, this PR is already quite complex, so I'm deferring that to a follow-up.Testing
resume_switches_models_preserves_base_instructionsfails against main.Existing test coverage thats assert base instructions are preserved across multiple requests in a session: