feat(chat): reorder harness markers and split compaction buckets by devin-ai-integration[bot] · Pull Request #222 · getsentry/junior

devin-ai-integration · 2026-04-19T19:08:40Z

Summary

Reshapes the user-turn prompt wrapper and thread-background rendering so Claude Sonnet and GPT-5 treat the current instruction as authoritative and prior thread context as read-only reference material. Addresses the failure mode tracked in #221 and getsentry/junior-prod#35, where Junior drifts onto a narrowed-but-superseded ask from earlier in a thread.

Changes in packages/junior/src/chat/respond-helpers.ts (buildUserTurnText):

Order (top → bottom): <thread-background>, <session-context>, <turn-context>, <current-instruction priority="highest"> — <current-instruction> is always the final block, matching Anthropic's long-context guidance to place the active query last.
Drops legacy <current-message> / <thread-conversation-context> wrappers.
No explanatory prose inside markers — tag names carry the signal.
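As a rough illustration of the ordering described above, the wrapper could be assembled along these lines. This is a minimal sketch, not the code in respond-helpers.ts: the `UserTurnParts` shape, the `wrapBlock` helper, and the name `sketchUserTurnText` are hypothetical, and the real `buildUserTurnText` signature is not shown in this PR description.

```typescript
// Hypothetical input shape; the real function's parameters may differ.
interface UserTurnParts {
  threadBackground?: string;
  sessionContext?: string;
  turnContext?: string;
  currentInstruction: string;
}

// Wraps a body in a bare tag, or returns undefined when there is nothing to wrap.
function wrapBlock(tag: string, body: string | undefined): string | undefined {
  if (body === undefined || body.trim() === "") return undefined;
  return `<${tag}>\n${body.trim()}\n</${tag}>`;
}

function sketchUserTurnText(parts: UserTurnParts): string {
  // Optional reference blocks, in top-to-bottom order.
  const reference = [
    wrapBlock("thread-background", parts.threadBackground),
    wrapBlock("session-context", parts.sessionContext),
    wrapBlock("turn-context", parts.turnContext),
  ].filter((block): block is string => block !== undefined);

  // The active ask is always emitted, and always last.
  const instruction =
    `<current-instruction priority="highest">\n` +
    `${parts.currentInstruction.trim()}\n` +
    `</current-instruction>`;

  return [...reference, instruction].join("\n\n");
}
```

Keeping the instruction outside the filtered list means no combination of missing reference blocks can move it away from the final position.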

Changes in packages/junior/src/chat/services/conversation-memory.ts:

buildConversationContext wraps each compaction in <compaction index=… covered_messages=… created_at=…> and each transcript entry in <message index=… ts=… role=… author=… slack_ts=…>, so each prior item is an individually addressable reference instead of a flat blob.
summarizeConversationChunk prompt now produces three fixed sections — <active-asks>, <superseded-or-completed-asks>, <facts> — so stale or already-acted-on asks stop reading as live constraints after compaction.
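The per-item wrapping can be pictured roughly as follows. This is a sketch under assumptions rather than the code in conversation-memory.ts: the `Compaction` and `TranscriptMessage` shapes, the 1-based `index`, and the helper names are hypothetical, and `escapeAttrValue` is only a stand-in for the shared escaping utility. The attribute names and the outer `<thread-compactions>` / `<thread-transcript>` markers are taken from the PR text.

```typescript
// Hypothetical record shapes; only the attribute names come from the PR description.
interface Compaction {
  summary: string; // already holds the <active-asks> / <superseded-or-completed-asks> / <facts> sections
  coveredMessages: number;
  createdAt: string;
}

interface TranscriptMessage {
  ts: string;
  role: "user" | "assistant";
  author: string;
  slackTs?: string;
  text: string;
}

// Stand-in for the shared XML escaping utility.
function escapeAttrValue(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Renders key="value" pairs, skipping attributes that have no value.
function renderAttrs(pairs: Record<string, string | number | undefined>): string {
  return Object.entries(pairs)
    .filter(([, value]) => value !== undefined)
    .map(([key, value]) => ` ${key}="${escapeAttrValue(String(value))}"`)
    .join("");
}

function sketchConversationContext(
  compactions: Compaction[],
  messages: TranscriptMessage[],
): string | undefined {
  if (compactions.length === 0 && messages.length === 0) return undefined;
  const sections: string[] = [];

  if (compactions.length > 0) {
    const items = compactions.map((c, i) => {
      const attrs = renderAttrs({ index: i + 1, covered_messages: c.coveredMessages, created_at: c.createdAt });
      return `<compaction${attrs}>\n${c.summary}\n</compaction>`;
    });
    sections.push(`<thread-compactions>\n${items.join("\n")}\n</thread-compactions>`);
  }

  if (messages.length > 0) {
    const items = messages.map((m, i) => {
      const attrs = renderAttrs({ index: i + 1, ts: m.ts, role: m.role, author: m.author, slack_ts: m.slackTs });
      return `<message${attrs}>\n${m.text}\n</message>`;
    });
    sections.push(`<thread-transcript>\n${items.join("\n")}\n</thread-transcript>`);
  }

  return sections.join("\n\n");
}
```

Each prior item carries its own metadata, so the model can refer to "message 3" or "compaction 1" instead of reasoning over an undifferentiated block.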

Rationale and authoritative prior art (Anthropic long-context guide, OpenAI GPT-5 prompting guide, OpenAI Model Spec chain-of-command) are cited in #221.

Review & Testing Checklist for Human

Sanity-check the new buildUserTurnText output shape against a real thread turn (e.g. local dev or an eval snapshot) and confirm the final tag emitted is </current-instruction> and <thread-background> precedes it.
Spot-check one compacted conversation in a real thread to confirm the summarizer is producing the three-bucket XML (active / superseded / facts) rather than a free-form paragraph. Because the summarizer is model-generated, the prompt change only shapes output — run against the production fast model to verify it complies.
Decide whether this should be gated behind an eval sweep on both Sonnet and GPT-5 gateway models before relying on the new marker shape for production traffic. This PR does not add such an eval.
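The first two checklist items could be partly automated with a small structural check like the one below. This is a hedged sketch, not something this PR adds: the function names are hypothetical, and it only looks for the presence and order of tags with plain string searches instead of parsing XML.

```typescript
const REQUIRED_SECTIONS = ["active-asks", "superseded-or-completed-asks", "facts"] as const;

// Returns the section names that lack a well-ordered open/close pair in a summary.
function missingSummarySections(summary: string): string[] {
  return REQUIRED_SECTIONS.filter((name) => {
    const open = summary.indexOf(`<${name}>`);
    const close = summary.indexOf(`</${name}>`);
    return open === -1 || close === -1 || close < open;
  });
}

// Returns human-readable problems with the emitted user-turn text; empty means it looks right.
function checkUserTurnShape(text: string): string[] {
  const problems: string[] = [];
  const trimmed = text.trimEnd();

  if (!trimmed.endsWith("</current-instruction>")) {
    problems.push("final tag is not </current-instruction>");
  }

  const background = trimmed.indexOf("<thread-background>");
  const instruction = trimmed.lastIndexOf("<current-instruction");
  if (instruction === -1) {
    problems.push("missing <current-instruction>");
  } else if (background !== -1 && background > instruction) {
    problems.push("<thread-background> appears after <current-instruction>");
  }

  return problems;
}
```

Because the summarizer output is model-generated, a check like this is better suited to an eval or a manual spot-check script than to a unit test.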

Notes

Intentionally preserved the <thread-transcript> / <thread-compactions> marker names; routing fixtures in tests/unit/routing/subscribed-decision.test.ts still reference them.
No runtime behavior change beyond the emitted prompt text; no new dependencies, no schema changes. Compaction storage format (summary: string) is unchanged — only the prompt that generates it is updated.
Pre-existing unit-test failure tests/unit/services/turn-checkpoint.test.ts > reuses the latest stored transcript… reproduces on main (requires REDIS_URL) and is unrelated to this PR.
Follow-up candidates (not in this PR): add an eval that exercises narrow-then-broaden instruction drift across a compacted thread; consider also marking the assistant's own prior tool calls with an executed flag in <message> wrappers.

Link to Devin session: https://app.devin.ai/sessions/f46faf27a4354f7dab95abd8dfc50211
Requested by: @dcramer

…ecedence Put thread background first, latest user instruction last, and add an explicit instruction-precedence block. Wrap per-compaction and per-message items with metadata attributes so they read as individual references instead of one flat blob. Split compaction summaries into active-asks / superseded-or-completed-asks / facts buckets so stale or completed asks stop reading as currently active. Rationale and citations in #221. Co-Authored-By: Devin <devin@cognition.ai>

devin-ai-integration · 2026-04-19T19:08:43Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR that start with 'DevinAI' or '@devin'.
Look at CI failures and help fix them.

Note: I can only respond to comments from users who have write access to this repository.


vercel · 2026-04-19T19:08:45Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
junior-docs	Ready	Preview, Comment	Apr 19, 2026 9:01pm

Shrink JSDocs on buildUserTurnText and buildConversationContext to the intent only, drop the redundant <thread-background> preamble (the precedence block already covers it), fold the single-use attr helpers into buildConversationContext, and collapse single-line section pushes. No behavior change. Co-Authored-By: Devin <devin@cognition.ai>

…er block Keep <latest-user-instruction> as the final block of the user turn so the model sees the active ask last, and move <instruction-precedence> to the top of the wrapper so the reconciliation rules frame the context that follows. Each marker (<thread-background>, <session-context>, <turn-context>, <latest-user-instruction>, and the <thread-compactions>/<thread-transcript> blocks inside background) now opens with a one-line purpose statement so the role of every block is self-describing. Co-Authored-By: Claude sonnet-4.5 <devin-ai-integration[bot]@users.noreply.github.com>

Tag names are the system markers; they do not need an explanatory sentence inside each block. Remove the <instruction-precedence> wrapper and the descriptor lines from <thread-background>, <session-context>, <turn-context>, <latest-user-instruction>, <thread-compactions>, and <thread-transcript>. Behavior-relevant structure (ordering, per-compaction/per-message metadata, priority="highest" on the latest instruction) is preserved. Co-Authored-By: Claude sonnet-4.5 <devin-ai-integration[bot]@users.noreply.github.com>

…ion> The 'user' qualifier is implicit in the turn context and 'current' is more direct than 'latest'. Attribute (priority="highest") and placement (final block of the wrapper) are unchanged. Co-Authored-By: Claude sonnet-4.5 <devin-ai-integration[bot]@users.noreply.github.com>

specs/testing/unit-spec.md:47 bans unit tests that assert exact or substring prompt prose on prompt builders. Keep only the pure-logic branch cases (raw pass-through, empty-conversation undefined) and defer structural XML validation to integration or eval coverage. Co-Authored-By: Claude sonnet-4.5 <devin-ai-integration[bot]@users.noreply.github.com>

The local escapeAttr only handled double quotes, so author names and slack_ts values containing &, <, or > would produce malformed XML attributes. Swap to the shared escapeXml utility from @/chat/xml, which covers all five XML special characters. Co-Authored-By: Claude sonnet-4.5 <devin-ai-integration[bot]@users.noreply.github.com>
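The difference described in this last commit can be seen with a small comparison. Both functions below are reconstructions for illustration only: `escapeAttrQuotesOnly` mimics the old behavior as the commit message describes it, and `escapeXmlSketch` is a guess at what a five-character escaper looks like, not the actual `escapeXml` from `@/chat/xml` (in particular, the entity used for the apostrophe is an assumption).

```typescript
// Old behavior as described: only double quotes were escaped.
function escapeAttrQuotesOnly(value: string): string {
  return value.replace(/"/g, "&quot;");
}

// The five XML special characters and one common choice of entity for each.
const XML_ENTITIES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&apos;",
};

// Escapes all five characters in a single pass, so "&" in inserted entities is never re-escaped.
function escapeXmlSketch(value: string): string {
  return value.replace(/[&<>"']/g, (ch) => XML_ENTITIES[ch] ?? ch);
}

const author = 'R&D <ops> "bot"';

// Leaves a raw "&" and "<" inside the attribute value, which is not well-formed XML.
const before = `<message author="${escapeAttrQuotesOnly(author)}">`;

// Produces a well-formed attribute value.
const after = `<message author="${escapeXmlSketch(author)}">`;

console.log(before); // <message author="R&D <ops> &quot;bot&quot;">
console.log(after);  // <message author="R&amp;D &lt;ops&gt; &quot;bot&quot;">
```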

devin-ai-integration Bot assigned dcramer Apr 19, 2026

devin-ai-integration Bot requested a review from dcramer April 19, 2026 19:08

vercel Bot deployed to Preview – junior-docs April 19, 2026 19:08 View deployment

vercel Bot deployed to Preview – junior-docs April 19, 2026 19:14 View deployment

vercel Bot deployed to Preview – junior-docs April 19, 2026 19:56 View deployment

vercel Bot deployed to Preview – junior-docs April 19, 2026 20:03 View deployment

devin-ai-integration Bot changed the title ~~feat(chat): reorder harness markers for Sonnet + GPT-5 instruction precedence~~ feat(chat): reorder harness markers and split compaction buckets Apr 19, 2026

vercel Bot deployed to Preview – junior-docs April 19, 2026 20:10 View deployment

dcramer marked this pull request as ready for review April 19, 2026 20:17


vercel Bot deployed to Preview – junior-docs April 19, 2026 20:27 View deployment

vercel Bot deployed to Preview – junior-docs April 19, 2026 21:01 View deployment

dcramer merged commit 7f8f845 into main Apr 19, 2026
15 checks passed

dcramer deleted the devin/1776625437-harness-markers-instruction-precedence branch April 19, 2026 21:08

dcramer merged 7 commits into main from devin/1776625437-harness-markers-instruction-precedence