fix(commands): move noise stripping into jq pipelines for fetch-pr-feedback by anderskev · Pull Request #55 · existential-birds/beagle

anderskev · 2026-02-07T19:33:11Z

Summary

- Move noise stripping from LLM instructions (Step 4) into clean_body jq functions applied directly in both comment-fetching pipelines (Step 3)
- Strip <details> blocks, HTML comments, and bot footer boilerplate at the jq level before data reaches the LLM
- Add 4000 char per-comment safety net with [comment truncated] marker
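The cleaning steps listed above can be sketched roughly as follows. This is a Python rendering for illustration only, not the jq function from the PR. The regex patterns and the whitespace trim are assumptions. Bot footer stripping is left out because the footer pattern is not shown on this page.

```python
import re

MAX_CHARS = 4000                      # per-comment cap stated in the summary
MARKER = "\n\n[comment truncated]"    # marker text stated in the summary

# Assumed patterns: non-greedy, with DOTALL so blocks can span lines.
DETAILS_BLOCK = re.compile(r"<details>.*?</details>", re.DOTALL)
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def clean_body(body: str) -> str:
    """Strip noise from one comment body, then cap its length."""
    cleaned = DETAILS_BLOCK.sub("", body)
    cleaned = HTML_COMMENT.sub("", cleaned)
    cleaned = cleaned.strip()  # assumption: trimming is not described in the PR
    # Check the length before slicing so the marker can actually be appended.
    if len(cleaned) > MAX_CHARS:
        return cleaned[:MAX_CHARS] + MARKER
    return cleaned
```

Cleaning before the data reaches the model means the cap applies to what is left after noise removal, so a comment is only truncated when its useful text alone is oversized.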

Context

Real data from a CodeRabbit-reviewed PR showed 125K chars total with only ~4.5K useful (3.6% signal). A single walkthrough comment was 70K chars. The previous approach relied on LLM instructions to strip noise, but the data was already too large before the LLM could process it.

Test plan

- Run /beagle-core:fetch-pr-feedback against a PR with CodeRabbit reviews
- Verify output is reasonable size (~5-10K chars vs 125K before)
- Confirm no actionable review content is lost
- Check [comment truncated] markers appear only on genuinely oversized comments (if any)
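The size and marker checks in this plan could be scripted along these lines. This is a hypothetical helper, not part of the PR. The function names and the idea of passing the cleaned bodies in as a list of strings are assumptions.

```python
MARKER = "[comment truncated]"


def signal_ratio(useful_chars: int, total_chars: int) -> float:
    """Fraction of characters that carry useful review content."""
    return useful_chars / total_chars if total_chars else 0.0


def summarize(bodies: list[str]) -> dict:
    """Report the figures the test plan asks a reviewer to eyeball."""
    return {
        "total_chars": sum(len(b) for b in bodies),
        "largest": max((len(b) for b in bodies), default=0),
        "truncated": sum(1 for b in bodies if b.endswith(MARKER)),
    }


# The Context figures: ~4.5K useful characters out of 125K total.
ratio = signal_ratio(4_500, 125_000)  # 0.036, i.e. the 3.6% quoted above
```

A `truncated` count above zero is not necessarily a failure. The plan only asks that the marker appear on comments that are genuinely oversized after cleaning.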

🤖 Generated with Claude Code

…edback

Comment bodies from bot reviewers (e.g. CodeRabbit) contain massive amounts of noise (<details> blocks, HTML comments, bot footers) that inflate feedback files to ~125K chars with only ~4.5K useful. The noise stripping rules were LLM instructions in Step 4, meaning the data was already too large before stripping could happen.

Move stripping into a clean_body jq function applied directly in both the issue comments and review comments pipelines. Add a 4000 char per-comment safety net with [comment truncated] marker. Update Step 4 to reference the jq-level stripping instead of listing LLM rules.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai · 2026-02-07T19:33:26Z

Walkthrough

A new clean_body jq function was added to sanitize comment text by removing <details> blocks, HTML comments, and trailing boilerplate, then truncating to 4000 characters and appending a [comment truncated] marker when needed. The transformation for both issue comments and review comments now maps body: (.body | clean_body) instead of using raw .body. Inline noise-stripping steps were removed from the narrative, and documentation text was updated to reflect that noise stripping and truncation are handled by clean_body.
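The mapping described here, where each comment keeps its other fields and only the body is passed through the cleaner, might look like this in Python. The names `map_bodies` and `strip_html_comments` are hypothetical, the sample comments are made up, and treating a missing body as an empty string is an assumption rather than documented behavior of the jq pipelines.

```python
import re
from typing import Callable


def strip_html_comments(body: str) -> str:
    # Stand-in cleaner for this demo; the real clean_body does more.
    return re.sub(r"<!--.*?-->", "", body, flags=re.DOTALL)


def map_bodies(comments: list[dict], clean: Callable[[str], str]) -> list[dict]:
    """Return copies of the comments with only the body field cleaned."""
    return [{**c, "body": clean(c.get("body") or "")} for c in comments]


# Made-up sample data for the two pipelines.
issue_comments = [{"id": 1, "body": "Looks good<!-- bot metadata -->"}]
review_comments = [{"id": 2, "path": "a.md", "body": None}]

# The same cleaner is applied to both lists, as in the PR.
cleaned_issue = map_bodies(issue_comments, strip_html_comments)
cleaned_review = map_bodies(review_comments, strip_html_comments)
```

Passing the cleaner in as a parameter keeps one definition shared by both pipelines. The PR instead defines clean_body once per pipeline, which is why the follow-up fix had to touch both definitions.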

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main change: moving noise stripping logic from LLM instructions into jq pipelines for the fetch-pr-feedback command.
Description check	✅ Passed	The description is directly related to the changeset, providing context about why the change was made, what was changed, and a test plan for verification.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

No actionable comments were generated in the recent review. 🎉

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@plugins/beagle-core/commands/fetch-pr-feedback.md`:
- Around line 55-61: The truncation marker logic in clean_body never triggers
because you slice with .[:4000] before checking length; change the pipeline to
preserve the original string length (e.g., save the cleaned string to a
temporary symbol/variable like original = .) then emit original[:4000] + (if
original|length > 4000 then "\n\n[comment truncated]" else "" end); update the
clean_body pipeline to reference that temporary before slicing so the condition
can detect >4000 and append the marker.
- Line 56: The current gsub call uses a greedy pattern
"(?s)<details>.*</details>" which will remove content across multiple <details>
blocks; update the regex used in the gsub invocation to a non-greedy match like
"(?s)<details>.*?</details>" so each <details>...</details> pair is removed
independently (leave the gsub call and surrounding logic intact, only change the
regex).
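Both review findings can be reproduced in a few lines. The sketch below uses Python's `re` module rather than jq, so it illustrates the behavior and is not the reviewed code. The sample string and the function names are made up.

```python
import re

sample = "A<details>x</details>KEEP<details>y</details>B"

# Greedy: .* runs to the LAST </details>, swallowing the text in between.
greedy = re.sub(r"(?s)<details>.*</details>", "", sample)    # "AB"
# Non-greedy: .*? stops at the FIRST </details>, so each block goes separately.
lazy = re.sub(r"(?s)<details>.*?</details>", "", sample)     # "AKEEPB"

MARKER = "\n\n[comment truncated]"


def truncate_buggy(text: str, limit: int = 4000) -> str:
    sliced = text[:limit]
    # len(sliced) can never exceed limit, so the marker is never appended.
    return sliced + (MARKER if len(sliced) > limit else "")


def truncate_fixed(text: str, limit: int = 4000) -> str:
    # Compare the ORIGINAL length against the limit, then slice.
    return text[:limit] + (MARKER if len(text) > limit else "")
```

With a 5000-character input, `truncate_buggy` returns exactly 4000 characters with no marker, while `truncate_fixed` returns the same 4000 characters followed by the marker.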

plugins/beagle-core/commands/fetch-pr-feedback.md

The <details> regex used greedy .* which would consume content between multiple blocks. The truncation check ran after slicing, so the marker was never appended. Both clean_body definitions (issue + review comments) are fixed.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

coderabbitai bot requested changes Feb 7, 2026

plugins/beagle-core/commands/fetch-pr-feedback.md (resolved)

plugins/beagle-core/commands/fetch-pr-feedback.md (outdated, resolved)

anderskev self-assigned this Feb 7, 2026

anderskev added the bug Something isn't working label Feb 7, 2026

coderabbitai bot approved these changes Feb 7, 2026

anderskev merged commit 79cee7f into main Feb 7, 2026
1 check passed

anderskev deleted the fix/fetch-pr-feedback-noise-stripping branch February 7, 2026 20:00

anderskev mentioned this pull request Feb 7, 2026

chore(release): 2.1.1 #56

Merged

3 tasks

anderskev merged 2 commits into main from fix/fetch-pr-feedback-noise-stripping
