Integrate bug fix/triage tool into BugBug by suhaibmujahid · Pull Request #5959 · mozilla/bugbug

suhaibmujahid · 2026-04-22T19:04:56Z

For now, it runs via CLI; I'll follow up by creating a cloud-based service.

uv run --with bugsy,grizzly-framework,prefpicker scripts/run_bug_fix.py --bug_id 1234567

Co-Authored-By: Christian Holler (:decoder) <choller@mozilla.com>

Copilot

Pull request overview

This PR integrates the Larrey bug-triage/fix workflow into BugBug as a locally runnable CLI tool, and adds a new “duplicate_bugs” agent with associated prompts/config to support duplicate detection workflows.

Changes:

Add a CLI entry script to run the bug-fix triage agent locally (dry-run, verbose) with Pydantic CLI/env settings.
Introduce Bugzilla + Firefox in-process MCP servers and Firefox evaluation/build helpers to support triage investigation.
Add a new duplicate-bugs tool (agent + prompts + config) and ensure prompt/rule artifacts are packaged in wheels.
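As a rough illustration of the CLI/env settings idea in the first bullet, the sketch below uses the standard library's `argparse` rather than the Pydantic settings the PR actually uses. Only `--bug_id` comes from the PR description; the `BUG_ID` environment variable and the `--dry_run` and `--verbose` flag names are assumptions made for this example.

```python
import argparse
import os


def parse_settings(argv=None, env=None):
    """Parse run settings from CLI flags, falling back to environment variables."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Run the bug_fix tool locally.")
    # --bug_id appears in the PR description; the BUG_ID env var is an assumption.
    parser.add_argument("--bug_id", default=env.get("BUG_ID"))
    # Hypothetical flag names for the dry-run and verbose modes.
    parser.add_argument("--dry_run", action="store_true")
    parser.add_argument("--verbose", action="store_true")
    args = parser.parse_args(argv)
    if args.bug_id is None:
        parser.error("a bug id is required (--bug_id or BUG_ID)")
    args.bug_id = int(args.bug_id)
    return args


settings = parse_settings(["--bug_id", "1234567", "--dry_run"], env={})
print(settings.bug_id, settings.dry_run, settings.verbose)
```

An explicit CLI flag wins over the environment here because the environment value is only used as the argument's default.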

Reviewed changes

Copilot reviewed 19 out of 21 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| scripts/run_bug_fix.py | Local CLI wrapper for running `BugFixTool` with settings parsed from CLI/env. |
| pyproject.toml | Adds runtime deps and includes prompt/rule artifacts in wheel builds. |
| bugbug/tools/duplicate_bugs/prompts/dupdetector_local_to_local.md | Prompt for local crash-directory deduping. |
| bugbug/tools/duplicate_bugs/prompts/dupdetector_local.md | Prompt for matching a local crash dir to Bugzilla bugs. |
| bugbug/tools/duplicate_bugs/prompts/dupdetector_bugs.md | Prompt for finding duplicates among Bugzilla blockers. |
| bugbug/tools/duplicate_bugs/config.py | Shared config + verdict parsing helpers for duplicate-bugs flows. |
| bugbug/tools/duplicate_bugs/agent.py | Implements the duplicate-bugs agent modes and CLI-style runner logic. |
| bugbug/tools/duplicate_bugs/\_\_init\_\_.py | Exports `DuplicateBugsTool`. |
| bugbug/tools/bug_fix/rules/unsupported-config.md | Adds a triage ruleset about unsupported pref/config cases. |
| bugbug/tools/bug_fix/rules/README.md | Documents how triage rulesets are discovered/used. |
| bugbug/tools/bug_fix/prompts/system.md | System prompt for Larrey triage agent, including tool usage constraints. |
| bugbug/tools/bug_fix/firefox_tools/js_shell_evaluator.py | Runs JS shell testcases and captures crash output. |
| bugbug/tools/bug_fix/firefox_tools/evaluate_testcase.py | Runs browser testcases via grizzly and captures crash output. |
| bugbug/tools/bug_fix/firefox_tools/build_firefox.py | Builds Firefox with ASAN fuzzing mozconfig. |
| bugbug/tools/bug_fix/firefox_tools/\_\_init\_\_.py | Exposes Firefox tool implementations. |
| bugbug/tools/bug_fix/firefox_mcp.py | MCP server exposing Firefox build + evaluation tools. |
| bugbug/tools/bug_fix/config.py | Bug-fix tool config constants (tool allowlists, config keys). |
| bugbug/tools/bug_fix/bugzilla_mcp.py | MCP server wrapping Bugzilla REST access (read/write w/ dry-run + confirm). |
| bugbug/tools/bug_fix/agent.py | Main Larrey triage agent orchestration (Bugzilla + Firefox MCP, rules, streaming). |
| .gitignore | Ignores `.env` for local runs. |

Comments suppressed due to low confidence (1)

pyproject.toml:23

The bugsy dependency is unconstrained while most dependencies are version-pinned or at least have a lower bound. To avoid unexpected breakage from upstream releases, add a version range (minimum tested version, optionally an upper bound).

    "beautifulsoup4==4.14.3",
    "boto3==1.42.78",
    "claude-agent-sdk>=0.1.30",
    "httpx==0.28.1",

Co-authored-by: Marco Castelluccio <mcastelluccio@mozilla.com>

marco-c · 2026-04-23T20:40:08Z

+    if result.simulated_writes:
+        print(f"simulated writes: {len(result.simulated_writes)}")


What are the "simulated writes"?

This is a printout of the changes that would have been made when running in dry-run mode.
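To make the answer concrete, here is a minimal sketch of how a dry-run writer could collect simulated writes. The `BugWriter` class and its fields are hypothetical and are not taken from the PR; only the `simulated_writes` name and the final `print` mirror the diff above.

```python
from dataclasses import dataclass, field


@dataclass
class BugWriter:
    """Hypothetical writer that records changes instead of sending them in dry-run."""

    dry_run: bool = True
    simulated_writes: list = field(default_factory=list)
    applied_writes: list = field(default_factory=list)

    def update_bug(self, bug_id, changes):
        record = {"bug_id": bug_id, "changes": dict(changes)}
        if self.dry_run:
            # Nothing leaves the process; the change is only remembered.
            self.simulated_writes.append(record)
            return "simulated"
        # Stand-in for a real Bugzilla REST call.
        self.applied_writes.append(record)
        return "applied"


writer = BugWriter(dry_run=True)
writer.update_bug(1234567, {"keywords": ["unsupported-config"]})
if writer.simulated_writes:
    print(f"simulated writes: {len(writer.simulated_writes)}")
```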

marco-c · 2026-04-23T20:40:23Z

@@ -0,0 +1,47 @@
+"""Run the bug_fix triage tool locally."""


Suggested change

-"""Run the bug_fix triage tool locally."""
+"""Run the bug_fix tool locally."""

not just triage :)

Fixed in ff61e16

marco-c · 2026-04-23T20:42:02Z

@@ -0,0 +1,745 @@
+"""In-process MCP server wrapping bugsy for Bugzilla REST access.


Let's file an issue to merge this into the default MCP

Done: #5964

marco-c · 2026-04-23T20:54:59Z

+_CONFIG_KEYS = {"base_url", "source_repo", "rules_dir", "model", "max_turns", "effort"}
+
+# Valid values for the SDK's `effort` knob (adaptive thinking control).
+EFFORT_CHOICES = ("low", "medium", "high", "max")


there's also xhigh now, right?

It is not used anywhere, so I dropped it in ed57dbe.

marco-c · 2026-04-23T20:55:55Z

+import yaml
+
+# Tools that can modify the source repo — blocked under dry-run.
+SOURCE_WRITE_TOOLS = {"Write", "Edit", "MultiEdit", "NotebookEdit"}


What is "NotebookEdit"?

It modifies Jupyter notebook cells. It could be needed if the agent is using a Jupyter notebook for the STRP.
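For context, blocking these tools under dry-run could look roughly like the sketch below. The `SOURCE_WRITE_TOOLS` set is copied from the diff; the `allowed_tools` helper is a hypothetical name, and the PR may enforce the restriction differently.

```python
# Tools that can modify the source repo (copied from the diff above).
SOURCE_WRITE_TOOLS = {"Write", "Edit", "MultiEdit", "NotebookEdit"}


def allowed_tools(requested, dry_run):
    """Return the requested tools, dropping source-writing ones under dry-run."""
    if not dry_run:
        return list(requested)
    return [name for name in requested if name not in SOURCE_WRITE_TOOLS]


print(allowed_tools(["Read", "Edit", "Bash", "NotebookEdit"], dry_run=True))
```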

marco-c · 2026-04-23T20:58:53Z

+
+    @tool(
+        "evaluate_testcase",
+        "Run a testcase in an ASAN-instrumented Firefox under xvfb and "


It depends on the mozconfig, we should not always use ASAN

Fixed in 071cc9c

marco-c · 2026-04-23T20:59:19Z

+        "Build Firefox with the ASAN+UBSAN fuzzing mozconfig. Slow (tens of "
+        "minutes on a cold build, faster incremental). Returns JSON: "


Same here, given it's slow, we won't always use ASAN and UBSAN

Fixed in 071cc9c

marco-c · 2026-04-23T21:04:49Z

+Only label something as `unsupported-config` if that assessment would be true for all
+currently supported channels.
+
+Currently supported versions are: ESR115, ESR140, 149, 150 and 151.


This is going to be outdated very quickly :)

The whole file is hardcoded, and we should consider replacing it with a dynamic way to provide the same information.
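One possible shape for that, sketched under the assumption that the version numbers are supplied by the caller at run time (where they would come from is left open here): the sentence in the ruleset is rendered from data instead of being typed into the file. The function name is hypothetical.

```python
def supported_versions_line(esr_versions, release_versions):
    """Render the 'currently supported versions' sentence from version lists."""
    labels = [f"ESR{v}" for v in esr_versions] + [str(v) for v in release_versions]
    if not labels:
        raise ValueError("at least one supported version is required")
    if len(labels) == 1:
        listing = labels[0]
    else:
        listing = ", ".join(labels[:-1]) + " and " + labels[-1]
    return f"Currently supported versions are: {listing}."


# Reproduces the hardcoded line from the diff when given the same numbers.
print(supported_versions_line([115, 140], [149, 150, 151]))
```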

marco-c · 2026-04-23T21:05:20Z

+Unless you add the `unsupported-config` keyword, append a `[prefs-checked]` tag to the
+whiteboard so we don't have to repeat this process again.


I don't know if we'll want to do this for every bug

Dropped in e5007df

marco-c · 2026-04-23T21:13:44Z

+
+Process each bug exactly once. Do not loop back.
+
+**Task mode:** If the user message gives you a specific task directive (e.g. "set keyword X on bugs that match Y"), that directive replaces the default rules-driven triage workflow above. The rules directory remains available but is not mandatory — follow the task.


I guess we don't need this for now

I agree. Dropped in 7d77c90.

marco-c · 2026-04-23T21:16:07Z

+
+# Source repository
+
+Your working directory is the source repository for the product these bugs are filed against. You have Read, Grep, Glob, and Bash to inspect it. Use this to answer questions like "does this function still exist", "where is this string defined", "what does this test actually check".


In a follow-up, we should also add the other tools that we already have like searchfox

I filed #5965

marco-c · 2026-04-23T21:17:24Z

+
+These tools are not gated by --dry-run: reproducing a crash is assessment, not modification.
+
+Only produce a fix when explicitly asked for, and follow these rules:


We might want to remove this and always have the agent attempt a fix if possible

Fixed in d7e1825

marco-c · 2026-04-23T21:17:58Z

+
+Only produce a fix when explicitly asked for, and follow these rules:
+
+- Before trying to reproduce or fix anything, ensure you are at origin/main with no local source changes.


If we create a fresh clone for each run, this line might not be necessary

Fixed in d7e1825

marco-c · 2026-04-23T21:18:52Z

+- Reproduce the issue first, then plan your fix and test that the issue no longer reproduces. If you cannot
+  reproduce the bug, do not post a fix patch. Comment instead that the bug wasn't reproducible automatically
+  and needs manual attention.


In some cases it would be feasible to attempt a fix even when the bug can't be reproduced (though we can add this option later I guess)

Let's keep it for now to reduce false positives, and we can iterate later if it turns out to be too restrictive.
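The rule being kept can be summarized as a small decision function. This is only an illustrative sketch of the prompt's wording; the function and the returned action names are hypothetical, and in the PR the behavior is expressed as prompt text, not code.

```python
def next_action(reproduced, fix_verified):
    """Map the reproduction outcome to the action the prompt asks for."""
    if not reproduced:
        # No patch without a reproduction: ask for manual attention instead.
        return "comment_not_reproducible"
    if fix_verified:
        # The issue reproduced before the fix and no longer reproduces after it.
        return "post_patch"
    return "keep_investigating"


print(next_action(reproduced=False, fix_verified=False))
```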

marco-c · 2026-04-23T21:21:24Z

+- **What** you are about to change and **why** (cite the specific rule)
+- **Your confidence**: high / medium / low
+
+Only call `update_bug` to change fields when confidence is **high** and a specific triage rule directs it. If confidence is medium or low, `add_comment` instead to ask for clarification or note your findings — do not silently skip.


Given we have few triage rules, the confidence might never be "high" right now

Let us keep it and see how it works.
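For reference, the gating described in the quoted prompt amounts to the following choice between the two tools. `update_bug` and `add_comment` are the tool names from the prompt; the `choose_write_tool` helper and its arguments are assumptions made for this sketch.

```python
CONFIDENCE_LEVELS = {"high", "medium", "low"}


def choose_write_tool(confidence, rule):
    """Pick update_bug only for high confidence backed by a specific triage rule."""
    if confidence not in CONFIDENCE_LEVELS:
        raise ValueError(f"unknown confidence: {confidence!r}")
    if confidence == "high" and rule:
        return "update_bug"
    # Medium or low confidence, or no rule to cite: comment, never skip silently.
    return "add_comment"


print(choose_write_tool("high", "unsupported-config"))
print(choose_write_tool("medium", "unsupported-config"))
```

With few rules available, most calls would fall through to `add_comment`, which is the concern raised in this thread.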

suhaibmujahid requested a review from Copilot April 22, 2026 19:04

Copilot started reviewing on behalf of suhaibmujahid April 22, 2026 19:05

Integrate Larrey into BugBug

69ce6d2

Co-Authored-By: Christian Holler (:decoder) <choller@mozilla.com>

suhaibmujahid force-pushed the larrey-in-bugbug branch from 71915fb to 69ce6d2 Compare April 22, 2026 19:07

Copilot AI reviewed Apr 22, 2026

suhaibmujahid requested a review from marco-c April 22, 2026 20:03

suhaibmujahid marked this pull request as ready for review April 22, 2026 20:03

marco-c reviewed Apr 22, 2026

Comment thread bugbug/tools/bug_fix/firefox_tools/build_firefox.py Outdated

marco-c reviewed Apr 22, 2026

Comment thread bugbug/tools/bug_fix/firefox_tools/build_firefox.py Outdated

marco-c reviewed Apr 22, 2026

Comment thread bugbug/tools/bug_fix/firefox_tools/evaluate_testcase.py Outdated

marco-c reviewed Apr 22, 2026

Comment thread bugbug/tools/bug_fix/firefox_tools/evaluate_testcase.py Outdated

suhaibmujahid and others added 2 commits April 22, 2026 16:44

Apply suggestions from code review

bffdd25

Co-authored-by: Marco Castelluccio <mcastelluccio@mozilla.com>

Normalize naming and clarify wording

5f16dd1

marco-c changed the title ~~Integrate Larrey into BugBug~~ Integrate bug fix/triage tool into BugBug Apr 22, 2026

suhaibmujahid requested a review from marco-c April 23, 2026 16:43

marco-c reviewed Apr 23, 2026

marco-c previously approved these changes Apr 23, 2026

suhaibmujahid mentioned this pull request Apr 24, 2026

Merge bug_fix/bugzilla_mcp.py into the main MCP #5964

Open

suhaibmujahid added 2 commits April 24, 2026 12:34

Rename TriageResult to BugFixResult

a3d0366

Fix the docstring in run_bug_fix script

ff61e16

suhaibmujahid dismissed marco-c’s stale review via ff61e16 April 24, 2026 16:35

suhaibmujahid added 5 commits April 24, 2026 14:58

Remove EFFORT_CHOICES constant

ed57dbe

Remove task-mode and loop restriction from prompt

7d77c90

Clarify reproduction tooling guidance

d7e1825

Clarify and generalize tool/help text to not assume ASAN builds

071cc9c

Remove prefs-checked whiteboard guidance

e5007df

suhaibmujahid mentioned this pull request Apr 24, 2026

Expand the local tools that the bug fix agent can use #5965

Open

suhaibmujahid merged commit e4aec7a into mozilla:master Apr 27, 2026
6 checks passed

suhaibmujahid deleted the larrey-in-bugbug branch April 27, 2026 12:17
