Conversation

@dylan-hurd-oai
Collaborator

@dylan-hurd-oai dylan-hurd-oai commented Jan 18, 2026

Summary

This PR consolidates base_instructions onto SessionMeta / SessionConfiguration, so that base_instructions is set once per session and is (mostly) immutable afterwards, except when:

  • overridden by config on resume / fork
  • sub-agent tasks, like review or collab
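
The "set once per session" rule above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the `text` field, the `InstructionsSource` enum, and `resolve_base_instructions` are hypothetical names, and the precedence (config override, then instructions persisted with the session, then the model default) is an assumption inferred from the summary and the resume test described below.

```rust
/// Hypothetical stand-in for the typed struct discussed in this PR.
/// The `text` field is an assumption made for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct BaseInstructions {
    pub text: String,
}

/// Where the resolved instructions came from (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum InstructionsSource {
    ConfigOverride,
    PersistedSession,
    ModelDefault,
}

/// Resolve base instructions once, at session start, resume, or fork.
/// Assumed precedence: explicit config override, then whatever was
/// persisted with the original session, then the current model's default.
pub fn resolve_base_instructions(
    config_override: Option<&str>,
    persisted: Option<&str>,
    model_default: &str,
) -> (BaseInstructions, InstructionsSource) {
    let (text, source) = match (config_override, persisted) {
        (Some(text), _) => (text, InstructionsSource::ConfigOverride),
        (None, Some(text)) => (text, InstructionsSource::PersistedSession),
        (None, None) => (model_default, InstructionsSource::ModelDefault),
    };
    (BaseInstructions { text: text.to_string() }, source)
}
```

Under this sketch, resuming a session with a different model and no override keeps the persisted instructions, because the new model's default is only consulted when nothing was persisted.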

In a future PR, we should convert all references to base_instructions to consistently use the typed struct, so it's less likely that we put other strings there. See #9423. However, this PR is already quite complex, so I'm deferring that to a follow-up.

Testing

  • Added a resume test to assert that instructions are preserved. In particular, resume_switches_models_preserves_base_instructions fails against main.

Existing test coverage that asserts base instructions are preserved across multiple requests in a session:

  • Manual compact keeps baseline instructions: core/tests/suite/compact.rs:199
  • Auto-compact keeps baseline instructions: core/tests/suite/compact.rs:1142
  • Prompt caching reuses the same instructions across two requests: core/tests/suite/prompt_caching.rs:150 and core/tests/suite/prompt_caching.rs:157
  • Prompt caching with explicit expected string across two requests: core/tests/suite/prompt_caching.rs:213 and core/tests/suite/prompt_caching.rs:222
  • Resume with model switch keeps original instructions: core/tests/suite/resume.rs:136
  • Compact/resume/fork uses request 0 instructions for later expected payloads: core/tests/suite/compact_resume_fork.rs:215

Collaborator

@aibrahim-oai aibrahim-oai left a comment

clean!

@aibrahim-oai
Collaborator

@codex review this

Contributor

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7b51f01332

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +2795 to 2798
let base_instructions = sess.get_base_instructions().await;

let prompt = Prompt {
input,
Contributor

[P2] Align token estimation with session base instructions

Sampling requests now pull base_instructions from the session (sess.get_base_instructions()), but token estimation/auto‑compaction still uses turn_context.client.get_model_info().base_instructions in ContextManager::estimate_token_count. When a session’s base instructions diverge from the model default (e.g., config override, resume/fork with preserved instructions, or switching models mid‑session), the estimate becomes inaccurate, which can delay auto‑compaction or misreport token usage and lead to context‑window errors. Consider basing the estimate on the session base instructions used for prompts so the count matches actual request payloads.

Useful? React with 👍 / 👎.
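
The mismatch described in this suggestion can be illustrated with a small sketch. Everything here is hypothetical: the four-bytes-per-token heuristic, the free function `estimate_token_count` (the real one is a method on ContextManager with a different signature), and `should_auto_compact` are assumptions chosen only to show why the estimate should use the same instructions string that is sent in the request.

```rust
/// Crude token estimate: about one token per four bytes.
/// This heuristic is an assumption for illustration, not the real estimator.
fn approx_tokens(text: &str) -> usize {
    (text.len() + 3) / 4
}

/// Estimate the request size from the instructions actually sent plus history.
/// Passing the session's base instructions (rather than the model default)
/// keeps the estimate consistent with the real payload.
pub fn estimate_token_count(base_instructions: &str, history: &[&str]) -> usize {
    let history_tokens: usize = history.iter().map(|item| approx_tokens(item)).sum();
    approx_tokens(base_instructions) + history_tokens
}

/// Hypothetical auto-compaction trigger: compact once the estimate reaches the limit.
pub fn should_auto_compact(estimated_tokens: usize, limit: usize) -> bool {
    estimated_tokens >= limit
}
```

If the session's instructions are much longer than the model default, an estimate built from the default undercounts, and a check like `should_auto_compact` fires later than it should.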

@dylan-hurd-oai
Collaborator Author

Looks like the Bazel build is failing due to a cross-crate import; will duplicate across crates before merging.

Collaborator

@jif-oai jif-oai left a comment

So this means that we can't modify the instructions once the session has started (as /search was doing before, for example).
If we are OK with this, it's good for me.

"cwd": ".",
"originator": "test_originator",
"cli_version": "test_version",
"base_instructions": null,
Collaborator

Can you make sure we have a test where this is just missing as well?

let mut payload = serde_json::json!({
"id": uuid,
"timestamp": ts_str,
"cwd": ".",
Collaborator

It would be good to have a test where this is not null.


- fn build_agent_spawn_config(turn: &TurnContext) -> Result<Config, FunctionCallError> {
+ fn build_agent_spawn_config(
+ base_instructions: &BaseInstructions,
Collaborator

There are already some base_instructions in TurnContext, so why do you need different ones here? Can you document this a bit, please? (To be honest, I'm not even sure we need this.)

Collaborator Author

@dylan-hurd-oai dylan-hurd-oai Jan 19, 2026

I'm removing base_instructions from TurnContext altogether, since it shouldn't change from turn to turn. Happy to add a general comment.

But I think the spirit of this change is "build_agent_spawn_config" should inherit session-level config, not just config housed in TurnContext. If the spirit of TurnContext is to denormalize session-level information as well, I'm happy to shift in that direction, but I think we'd want to refactor the serialization there, since we wouldn't want to save the entire base_instructions to the rollout in every turn.
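
A rough sketch of the split described in this reply, where a spawned agent inherits session-level instructions while per-turn values still come from the turn. All types and fields below (`TurnContextSketch`, `SpawnConfigSketch`, `model`, `cwd`, `text`) are hypothetical stand-ins, not the crate's real definitions, and `spawn_config_sketch` is not the actual `build_agent_spawn_config` signature.

```rust
/// Hypothetical stand-in for the typed struct; the `text` field is assumed.
#[derive(Debug, Clone, PartialEq)]
pub struct BaseInstructions {
    pub text: String,
}

/// Hypothetical per-turn state. It deliberately carries no base instructions,
/// since those should not change from turn to turn.
#[derive(Debug, Clone)]
pub struct TurnContextSketch {
    pub model: String,
    pub cwd: String,
}

/// Hypothetical config handed to a spawned sub-agent.
#[derive(Debug, Clone, PartialEq)]
pub struct SpawnConfigSketch {
    pub base_instructions: BaseInstructions,
    pub model: String,
    pub cwd: String,
}

/// Build a sub-agent config: session-level instructions are passed in
/// explicitly, while turn-scoped values are read from the turn context.
pub fn spawn_config_sketch(
    base_instructions: &BaseInstructions,
    turn: &TurnContextSketch,
) -> SpawnConfigSketch {
    SpawnConfigSketch {
        base_instructions: base_instructions.clone(),
        model: turn.model.clone(),
        cwd: turn.cwd.clone(),
    }
}
```

Keeping the instructions out of the per-turn struct also means they are not re-serialized with every turn, which is the rollout-size concern raised above.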


use super::*;

struct InstructionsTestCase {
Collaborator

Yesssss

}

/// Base instructions for the model in a thread. Corresponds to the `instructions` field in the ResponsesAPI.
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, JsonSchema, TS)]
Collaborator

Let's make 100% sure we don't serialize using this serializer when sending to the model, as that would be wrong.

Collaborator Author

@codex check for existing test coverage on the instructions field for the responses api request

Contributor

Answer

I checked for existing tests that explicitly assert the instructions field sent in Responses API requests (the field described in BaseInstructions in protocol/src/models.rs). There is existing coverage in core suite tests that validate the instructions field contents across requests, including cases like prompt caching, compaction, and resume/fork flows.

Existing tests that assert instructions in Responses requests

These tests together exercise the instructions field in the Responses API request bodies across multiple session lifecycle scenarios, which should cover the concern raised at protocol/src/models.rs line ~170 (the BaseInstructions struct tied to the instructions field).


Commands I ran to verify coverage


If you want deeper coverage focused specifically on serialization paths in protocol/src/models.rs, I can add a unit test directly in that crate, but the current suite already validates the instructions field in actual request payloads sent to the Responses API.

@dylan-hurd-oai dylan-hurd-oai merged commit 675f165 into main Jan 20, 2026
32 checks passed
@dylan-hurd-oai dylan-hurd-oai deleted the dh--session-base-instructions branch January 20, 2026 05:59
@github-actions github-actions bot locked and limited conversation to collaborators Jan 20, 2026