feat(slack): Slack mrkdwn output contract in <output> prompt section#212
feat(slack): Slack mrkdwn output contract in <output> prompt section#212devin-ai-integration[bot] wants to merge 10 commits intomainfrom
<output> prompt section#212Conversation
Formalize the design from issue #208 as a canonical draft spec covering: - Render-intent layer boundary between assistant output and Slack delivery - Three lanes: final reply, in-flight progress, durable entities - Plugin renderer registry contract (match/buildIntent/buildFallbackText/ buildActions/buildWorkObject) - SDK-first phasing using the installed chat/@chat-adapter/slack surfaces before adding Slack-specific block abstractions - Accessibility and fallback rules requiring top-level text for every block-bearing message - Failure model, degradation rules, and verification coverage targets The spec sits alongside slack-agent-delivery-spec.md and slack-outbound-contract-spec.md without changing their contracts, and is marked Draft because implementation has not landed yet. Refs: #208 Co-Authored-By: Claude Sonnet 4.5 <devin-ai-integration[bot]@users.noreply.github.com>
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
…a SKILL.md - Remove work_object_reference intent and the durable-entity lane. - Replace the plugin renderer registry with a guidance model: plugins influence rendering through SKILL.md content, not code or YAML templates. Rendering code stays in core; the intent palette is not plugin-extensible. - Collapse lanes to two: final reply and in-flight progress. - Rework the core render pipeline around one core renderer per intent kind with schema validation and plain_reply degradation on failure. - Drop the work-object observability attribute and durable_entity lane label. Drop work-object coverage from first-party targets. Refs: #208 Co-Authored-By: Claude Sonnet 4.5 <devin-ai-integration[bot]@users.noreply.github.com>
Introduce the closed, core-owned set of Slack render intents defined in specs/slack-rendering-spec.md: - plain_reply (pass-through) - summary_card - alert - comparison_table - result_carousel - progress_plan Each intent is validated by a zod discriminated-union schema. The renderer translates an intent into Slack Block Kit blocks plus a non-empty top-level fallback text derived from the same structured fields, which satisfies the outbound-contract requirement that every block-bearing message carry a non-empty top-level `text`. Wiring these intents into the turn-end path (so the model can emit an intent instead of raw text) is a follow-up change in the same track. Co-Authored-By: Devin <devin-ai-integration[bot]@users.noreply.github.com> Co-Authored-By: David Cramer <david@sentry.io>
Phase 1 of the Slack rendering engine (issue #208). - Add optional 'reply' tool (Renderer pattern): one tool whose input schema is the SlackRenderIntent discriminated union. Plain text replies keep working unchanged; 'reply' is only called when the model wants a richer intent. - Thread the captured intent from the tool -> AssistantReply -> planSlackReplyPosts -> postSlackApiReplyPosts, rendering blocks + non-empty fallback text at delivery time. - Add GitHub SKILL.md guidance teaching the model when to emit a summary_card for PRs/issues, with exact field recipes. - Update the spec to document the Intent Delivery Mechanism (ToolStrategy, Renderer vs Terminator trade-off, one tool with discriminated union). - Tests: reply-tool schema + capture; planSlackReplyPosts with intent produces blocks+fallback; existing plain text path unaffected. Co-Authored-By: David Cramer <david@sentry.io>
…ce to Linear and Sentry Per PR review: - The rendering engine ships as one coherent feature. Removed Phase 1/ Phase 2 language from the spec; replaced the 'SDK-First Phasing' section with 'Renderer Implementation' that describes what ships and notes that newer Slack block primitives can swap in behind the same intent schema without a contract change. - Strengthened the plugin guidance model to describe both axes plugins influence through SKILL.md: when to use an intent kind for their domain objects and how those objects populate the intent's structured fields. - Added render-intent reference files for Linear (issue, project) and Sentry (issue) so the pattern is proven across three plugins, not just GitHub. Each file documents field recipes, when to prefer alert/carousel, and when not to call reply at all. Co-Authored-By: David Cramer <david@sentry.io>
…im plugin recipes Add a <render-capabilities> section to the Slack system prompt when the native reply tool is registered, so the agent learns the palette and the selection rules from one place instead of duplicating them across plugin SKILL files. Plugin slack-render-intents files now carry only the domain-specific field recipes (GitHub PR/issue, Linear issue/project, Sentry issue) — no more palette preamble, no more 'when not to call reply' boilerplate. Also document the split in specs/slack-rendering-spec.md (section 5 'Guidance Model' + section 8 'Prompt and Model Behavior'): core teaches the capability, plugins teach the recipes. Co-Authored-By: David Cramer <david@sentry.io>
…e <slack-output> section
Agent was emitting GFM markdown tables (pipes render literally in Slack)
when users asked for comparisons. Root causes: (1) the <output-contract>
section carved out 'avoid tables unless explicitly requested', which
gave the model permission to emit pipe-tables; (2) mrkdwn vs GFM rules
were loose ('Slack-friendly markdown') with no syntax enumeration.
Consolidate <output-contract> and <render-capabilities> into a single
authoritative <slack-output> section that:
- Declares two forms: Form A (reply tool call with one palette intent)
and Form B (plain Slack mrkdwn). Forbids any third shape.
- Enumerates allowed mrkdwn syntax explicitly (*bold*, _italic_,
~strike~, <url|label>, etc.) and marks each GFM equivalent as literal.
- Lists forbidden constructs explicitly: markdown tables, # headings,
[label](url), HTML, raw Block Kit.
- Redirects table / comparison / matrix / diff requests to Form A
comparison_table when the reply tool is registered.
Intent rules and mrkdwn rules now live together so the model sees one
coherent Slack surface contract. Extract buildSlackOutputContract as a
top-level helper to keep buildSystemPrompt readable.
Also update specs/slack-rendering-spec.md sections 5 and 8 to reflect
the merged section name.
Co-Authored-By: David Cramer <david@sentry.io>
…als, bump model to gpt-5.4 Previously no eval exercised reply-tool intent selection end-to-end, so a regression like the agent emitting a GFM pipe-table when asked for a comparison would only surface in manual Slack testing. - Capture reply intents in the eval harness by recording AssistantReply.intent on the replyExecutor override and threading it through to EvalResult.replyIntents. - Surface reply_intents on the judge's serialized output schema so rubrics can assert 'the agent called reply with kind=comparison_table' or 'reply_intents is empty for a plain prose answer'. - Add packages/junior-evals/evals/core/slack-render-intents.eval.ts with two scenarios: (1) explicit 'give me a comparison table' must fire the reply tool with comparison_table and must not emit GFM pipe syntax; (2) a plain one-sentence question must not fire the reply tool and must stay in ordinary mrkdwn. - Bump the eval judge model from openai/gpt-5.2 to openai/gpt-5.4 to match the rest of the codebase (chat-config, vision fixtures). Co-Authored-By: David Cramer <david@sentry.io>
…rkdwn formatting guidance Drops the render-intent palette (summary_card, alert, comparison_table, result_carousel, progress_plan, plain_reply), the `reply` tool that selected them, the per-intent renderer, and the plugin-side slack-render-intents.md recipes. The output surface is now plain Slack `mrkdwn` text; the prompt's job is to teach the model which `mrkdwn` syntax Slack actually renders. - `<slack-output>` renamed to `<output surface="slack" ...>` and simplified to an allow-list (`*bold*`, `_italic_`, `~strike~`, inline/fenced code, block quotes, `<url|label>` links, mentions, bullet lists, bold section labels) and a forbid-list (pipe tables, `##` headings, `[label](url)`, `**bold**`, `~~strike~~`, HTML, raw Block Kit JSON). - `slack-rendering-spec.md` rewritten as a short output-contract spec; AGENTS.md and `specs/index.md` updated to match. - `buildRuntimeServices` / `collectResults` / `EvalResult` no longer carry `replyIntents`; `reply_intents` removed from eval output schema. - Plugin SKILL.md files (GitHub, Linear, Sentry) drop references to the deleted `slack-render-intents.md` recipes. - Replaces the render-intent eval with three mrkdwn-hygiene evals (no pipe-tables, Slack-shape emphasis/link syntax, bold section labels instead of markdown headings). Co-Authored-By: Devin <devin-ai-integration[bot]@users.noreply.github.com> Co-Authored-By: David Cramer <david@sentry.io>
<output> prompt section
The 'table' keyword has a strong prior in the model's training data that prompt-level rules don't reliably override. Dropping that scenario until we're ready to address it with a deterministic mechanism (outbound post-processor or provider-native response_format). The emphasis/link and bold-section-labels scenarios still catch the failure modes the <output> contract is designed to prevent. Co-Authored-By: Devin <devin-ai-integration[bot]@users.noreply.github.com> Co-Authored-By: David Cramer <david@sentry.io>
|
Superseded by #219. |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 8b7cfc4. Configure here.
| ], | ||
| }), | ||
| }); | ||
| }); |
There was a problem hiding this comment.
Missing pipe-table eval scenario described in PR
Medium Severity
The PR description states there are three eval scenarios, but only two are present. The missing first scenario — "Give me a short comparison table…" that validates GFM pipe-table syntax is not emitted — is absent. This is the primary regression the PR aims to fix ("the real regression we kept hitting in dev (e.g. GFM pipe-tables in comparison replies)"), and the Review & Testing Checklist instructs reviewers to "Run the three new evals," yet only two exist.
Reviewed by Cursor Bugbot for commit 8b7cfc4. Configure here.
| .replaceAll("&", "&") | ||
| .replaceAll("<", "<") | ||
| .replaceAll(">", ">"); | ||
| } |
There was a problem hiding this comment.
Unused exported escape function duplicates existing one
Low Severity
escapeSlackMrkdwnText is exported from render/blocks.ts but never imported or called anywhere in the codebase. It's functionally identical to the private escapeSlackMrkdwn in footer.ts. This is dead code left over from the removed render-intent layer.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 8b7cfc4. Configure here.


Summary
Addresses #208. Defines a single authoritative output contract for the Slack reply surface: the assistant's final reply is plain Slack
mrkdwntext, and the prompt teaches the exact syntax Slack renders versus the CommonMark/GFM it silently ignores.This PR started as a broader render-intent engine (native
replytool, six-kind intent palette, per-plugin recipe files, Block Kit renderer). Based on product review, that was scope-cut: tool calls are not preferred over a normal assistant response for visible replies, and the real regression we kept hitting in dev (e.g. GFM pipe-tables in comparison replies) was a prompt-adherence failure on mrkdwn, not an absence of a structured-layout mechanism. The render-intent code, thereplytool, the per-intent renderer, per-pluginslack-render-intents.mdrecipes, and thereplyIntentseval plumbing were all removed; what remains is the simplified output contract below.Prompt —
<output surface="slack">(chat/prompt.ts)buildSlackOutputContracthelper renames the old<output-contract format="slack-mrkdwn">section to<output surface="slack" ...>and rewrites it around two explicit lists:*bold*,_italic_,~strike~, inline/fenced code,> quotes,<url|label>links,<@USERID>/<#CHANNELID>/<!subteam^TEAMID>mentions,- itembullets, bold section labels.##headings,[label](url),**bold**,~~strike~~, HTML, raw Block Kit JSON.##→ bold label;[label](url)→<url|label>or bare URL).Spec (
specs/slack-rendering-spec.md)mrkdwn), allow-list, forbid-list, prompt surface (<output surface="slack">), failure model, verification.replytool, no structured-layout mechanism, no model-authored Block Kit. Revisit if and when there's a concrete product reason.AGENTS.mdandspecs/index.mdupdated to match.Evals
packages/junior-evals/evals/core/slack-mrkdwn-hygiene.eval.ts:*ready*,~draft~, and<url|label>(not**…**,~~…~~,[…](…)).#/##headings.openai/gpt-5.4.Small residual diffs
footer.tsstill exposesbuildSlackFooterContextBlockand re-exportsSlackMessageBlockfromrender/blocks.ts. These were extracted during the render-intent work; the extraction is kept because it's a safer shape for future composition and keeps the footer-type definition in one place.slack/reply.tschange is whitespace-only.Review & Testing Checklist for Human
<output surface="slack">copy inbuildSlackOutputContractend-to-end. This is the whole feature — if the copy drifts from how the model actually interprets it, the contract fails silently. Pay particular attention to the forbid-list reasons and the positive redirects.pnpm --filter @sentry/junior-evals test -- slack-mrkdwn-hygiene. These are the only live check that the new prompt actually steers the model away from pipe-tables,**bold**,[label](url), and##headings. CI green alone does not prove that.| pipes |, no##headings, no[label](url), no**bold**. Confirm normal conversational replies are unchanged.buildSlackFooterContextBlockinfooter.tsis currently only consumed bybuildSlackReplyBlocksin the same file. If it has no other callers after this PR, decide whether to inline it back or leave it as-is for a future composition surface.Notes
pnpm typecheck,pnpm lint,pnpm --filter @sentry/junior run test:slack-boundary,pnpm --filter @sentry/junior run test:arch-boundary, andpnpm skills:checkall pass locally.pnpm --filter @sentry/junior testhas one pre-existing failure (turn-checkpoint.test.ts,REDIS_URL is required) that reproduces onmainand is unrelated to this PR.Draft.slack-agent-delivery-spec.mdandslack-outbound-contract-spec.mdare unchanged; the output contract sits in front of them.Link to Devin session: https://app.devin.ai/sessions/8938b584a489401ba1b62021159f085d
Requested by: @dcramer