Move scratchpad injection to tail of message list for cache efficiency #147
Merged
Previously the scratchpad was injected right after the pinned zone (position ~2), meaning any content change invalidated the cache for the entire conversation that followed. Now it's inserted just before the final user message, keeping the long prefix byte-identical across turns and maximizing provider prompt-cache hit rates. Also strips ALL stale scratchpad blocks (not just the first match) to prevent duplicates after compaction merges messages. https://claude.ai/code/session_011wxfX5yFjds1KXZSzrRYhC
Instead of splicing before the last user message, just push to the end. Simpler code, same cache benefit, and the scratchpad is the last thing the model sees before generating. https://claude.ai/code/session_011wxfX5yFjds1KXZSzrRYhC
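A minimal sketch of the strip-then-append approach described in the two commits above. The `Message` shape, the `<scratchpad>` marker, the `user` role for the injected message, and the function names are all assumptions for illustration; the PR text does not show the actual middleware code.

```typescript
// Hypothetical message shape; the real types are not shown in this PR.
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

// Assumed marker that identifies a scratchpad block.
const SCRATCHPAD_OPEN = "<scratchpad>";
const SCRATCHPAD_CLOSE = "</scratchpad>";

function isScratchpad(msg: Message): boolean {
  return msg.content.startsWith(SCRATCHPAD_OPEN);
}

// Remove every stale scratchpad message (not just the first match),
// then append the fresh one at the very end of the list.
function injectScratchpad(messages: Message[], scratchpad: string): Message[] {
  const kept = messages.filter((m) => !isScratchpad(m));
  kept.push({
    role: "user", // assumed role for the injected block
    content: `${SCRATCHPAD_OPEN}${scratchpad}${SCRATCHPAD_CLOSE}`,
  });
  return kept;
}

// A history that ended up with two stale blocks, as could happen after
// compaction merges messages.
const history: Message[] = [
  { role: "system", content: "You are a helpful agent." },
  { role: "user", content: "<scratchpad>old note A</scratchpad>" },
  { role: "user", content: "Fix the failing test." },
  { role: "user", content: "<scratchpad>old note B</scratchpad>" },
  { role: "assistant", content: "Looking at it now." },
];

const next = injectScratchpad(history, "new note");
const scratchpadCount = next.filter(isScratchpad).length;
```

Because the scratchpad is the only message that changes between turns and it sits last, everything before it stays unchanged and remains eligible for a provider-side prefix cache hit.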
Tool ordering from Map insertion order is fragile — conditionally enabled tools shift the list and break provider prompt caches. Sorting by name ensures the tool block is byte-identical across sessions regardless of registration order. https://claude.ai/code/session_011wxfX5yFjds1KXZSzrRYhC
Skill iteration order from Map depends on filesystem glob order, which isn't deterministic across runs. Sorting by name ensures the skills XML block in the pinned zone is byte-identical across sessions. https://claude.ai/code/session_011wxfX5yFjds1KXZSzrRYhC
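The tool and skill ordering fixes share one idea: sort by name before serializing, so the output no longer depends on Map insertion order. The sketch below illustrates this under assumed names (`Named`, `sortedByName`, `renderSkillsXml`) and an assumed XML layout; none of these come from the PR itself.

```typescript
// Hypothetical registry entry; the real tool and skill types are not shown.
interface Named {
  name: string;
  description: string;
}

// Compare by UTF-16 code units instead of localeCompare, so the resulting
// order does not vary with the host locale.
function byName(a: Named, b: Named): number {
  return a.name < b.name ? -1 : a.name > b.name ? 1 : 0;
}

// Return registry values in name order, ignoring insertion order.
function sortedByName<T extends Named>(registry: Map<string, T>): T[] {
  return Array.from(registry.values()).sort(byName);
}

// Assumed XML layout for the skills block; escaping is omitted for brevity.
function renderSkillsXml(skills: Map<string, Named>): string {
  const items = sortedByName(skills).map(
    (s) => `  <skill name="${s.name}">${s.description}</skill>`
  );
  return ["<skills>", ...items, "</skills>"].join("\n");
}

// Two runs that register the same skills in different orders.
const runA = new Map<string, Named>([
  ["review", { name: "review", description: "Review a diff" }],
  ["deploy", { name: "deploy", description: "Ship a build" }],
]);
const runB = new Map<string, Named>([
  ["deploy", { name: "deploy", description: "Ship a build" }],
  ["review", { name: "review", description: "Review a diff" }],
]);

const sameBlock = renderSkillsXml(runA) === renderSkillsXml(runB);
const orderA = sortedByName(runA).map((s) => s.name).join(",");
```

The same `sortedByName` helper could be applied to a tool registry before building the tool block, which is the case the tool-ordering commit describes.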
Summary
Refactored the scratchpad middleware to inject the scratchpad message just before the final user message (in the tail region) instead of after the first user message (in the pinned prefix zone). This optimization keeps the long, stable prefix (system prompt + context files + conversation history) byte-identical across turns, maximizing provider prompt-cache hit rates.
Key Changes
Changed `break` statements to `continue` in the stripping loop to ensure all scratchpad blocks are removed before re-injection
Implementation Details
https://claude.ai/code/session_011wxfX5yFjds1KXZSzrRYhC