Integrate bug fix/triage tool into BugBug #5959
suhaibmujahid merged 10 commits into mozilla:master
Co-Authored-By: Christian Holler (:decoder) <choller@mozilla.com>
Force-pushed from 71915fb to 69ce6d2.
Pull request overview
This PR integrates the Larrey bug-triage/fix workflow into BugBug as a locally runnable CLI tool, and adds a new “duplicate_bugs” agent with associated prompts/config to support duplicate detection workflows.
Changes:
- Add a CLI entry script to run the bug-fix triage agent locally (dry-run, verbose) with Pydantic CLI/env settings.
- Introduce Bugzilla + Firefox in-process MCP servers and Firefox evaluation/build helpers to support triage investigation.
- Add a new duplicate-bugs tool (agent + prompts + config) and ensure prompt/rule artifacts are packaged in wheels.
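As a rough illustration of the first item, a local wrapper can read its settings from flags with environment-variable fallbacks. The sketch below uses only the standard library rather than Pydantic settings, and the function name, flag names and `BUG_FIX_*` variables are hypothetical; none of them are taken from scripts/run_bug_fix.py.

```python
import argparse
import os


def parse_settings(argv, env=None):
    """Parse hypothetical run settings; environment values only set the defaults."""
    env = os.environ if env is None else env

    def env_flag(name):
        # Treat "1", "true" and "yes" (any case) as enabled.
        return env.get(name, "").lower() in {"1", "true", "yes"}

    parser = argparse.ArgumentParser(description="Run the bug_fix tool locally (sketch).")
    parser.add_argument("--dry-run", action="store_true", default=env_flag("BUG_FIX_DRY_RUN"))
    parser.add_argument("--verbose", action="store_true", default=env_flag("BUG_FIX_VERBOSE"))
    return parser.parse_args(argv)


settings = parse_settings(["--dry-run"], env={})
print(settings.dry_run, settings.verbose)
```

Passing `env` explicitly keeps the parser easy to exercise without touching the real process environment.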
Reviewed changes
Copilot reviewed 19 out of 21 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| scripts/run_bug_fix.py | Local CLI wrapper for running BugFixTool with settings parsed from CLI/env. |
| pyproject.toml | Adds runtime deps and includes prompt/rule artifacts in wheel builds. |
| bugbug/tools/duplicate_bugs/prompts/dupdetector_local_to_local.md | Prompt for local crash-directory deduping. |
| bugbug/tools/duplicate_bugs/prompts/dupdetector_local.md | Prompt for matching a local crash dir to Bugzilla bugs. |
| bugbug/tools/duplicate_bugs/prompts/dupdetector_bugs.md | Prompt for finding duplicates among Bugzilla blockers. |
| bugbug/tools/duplicate_bugs/config.py | Shared config + verdict parsing helpers for duplicate-bugs flows. |
| bugbug/tools/duplicate_bugs/agent.py | Implements the duplicate-bugs agent modes and CLI-style runner logic. |
| bugbug/tools/duplicate_bugs/__init__.py | Exports DuplicateBugsTool. |
| bugbug/tools/bug_fix/rules/unsupported-config.md | Adds a triage ruleset about unsupported pref/config cases. |
| bugbug/tools/bug_fix/rules/README.md | Documents how triage rulesets are discovered/used. |
| bugbug/tools/bug_fix/prompts/system.md | System prompt for Larrey triage agent, including tool usage constraints. |
| bugbug/tools/bug_fix/firefox_tools/js_shell_evaluator.py | Runs JS shell testcases and captures crash output. |
| bugbug/tools/bug_fix/firefox_tools/evaluate_testcase.py | Runs browser testcases via grizzly and captures crash output. |
| bugbug/tools/bug_fix/firefox_tools/build_firefox.py | Builds Firefox with ASAN fuzzing mozconfig. |
| bugbug/tools/bug_fix/firefox_tools/__init__.py | Exposes Firefox tool implementations. |
| bugbug/tools/bug_fix/firefox_mcp.py | MCP server exposing Firefox build + evaluation tools. |
| bugbug/tools/bug_fix/config.py | Bug-fix tool config constants (tool allowlists, config keys). |
| bugbug/tools/bug_fix/bugzilla_mcp.py | MCP server wrapping Bugzilla REST access (read/write w/ dry-run + confirm). |
| bugbug/tools/bug_fix/agent.py | Main Larrey triage agent orchestration (Bugzilla + Firefox MCP, rules, streaming). |
| .gitignore | Ignores .env for local runs. |
Comments suppressed due to low confidence (1)
pyproject.toml:23
- The `bugsy` dependency is unconstrained while most dependencies are version-pinned or at least have a lower bound. To avoid unexpected breakage from upstream releases, add a version range (minimum tested version, optionally an upper bound).
"beautifulsoup4==4.14.3",
"boto3==1.42.78",
"claude-agent-sdk>=0.1.30",
"httpx==0.28.1",
Co-authored-by: Marco Castelluccio <mcastelluccio@mozilla.com>
    if result.simulated_writes:
        print(f"simulated writes: {len(result.simulated_writes)}")
What are the "simulated writes"?
This is a printout of the changes that would have been made, shown when running in dry-run mode.
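To make the idea concrete, here is a minimal sketch of how dry-run writes could be recorded instead of applied. The `WriteRecorder` class, its `update_bug` signature and the bug ID `123` are hypothetical placeholders, not the PR's actual implementation; only the `simulated_writes` name and the print line come from the snippet above.

```python
from dataclasses import dataclass, field


@dataclass
class WriteRecorder:
    """Applies writes, or records them instead when dry_run is set."""

    dry_run: bool = True
    simulated_writes: list = field(default_factory=list)

    def update_bug(self, bug_id, changes, apply):
        if self.dry_run:
            # Record what would have been sent; nothing reaches the tracker.
            self.simulated_writes.append({"bug_id": bug_id, "changes": changes})
            return None
        # Outside dry-run, delegate to the real write callable.
        return apply(bug_id, changes)


recorder = WriteRecorder(dry_run=True)
recorder.update_bug(123, {"keywords": ["unsupported-config"]}, apply=lambda *_: None)
if recorder.simulated_writes:
    print(f"simulated writes: {len(recorder.simulated_writes)}")
```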
    @@ -0,0 +1,47 @@
    """Run the bug_fix triage tool locally."""
Suggested change:
    - """Run the bug_fix triage tool locally."""
    + """Run the bug_fix tool locally."""
Not just triage :)
    @@ -0,0 +1,745 @@
    """In-process MCP server wrapping bugsy for Bugzilla REST access.
Let's file an issue to merge this into the default MCP.
    _CONFIG_KEYS = {"base_url", "source_repo", "rules_dir", "model", "max_turns", "effort"}

    # Valid values for the SDK's `effort` knob (adaptive thinking control).
    EFFORT_CHOICES = ("low", "medium", "high", "max")
There's also xhigh now, right?
It is not used anywhere, so I dropped it in ed57dbe.
    import yaml

    # Tools that can modify the source repo — blocked under dry-run.
    SOURCE_WRITE_TOOLS = {"Write", "Edit", "MultiEdit", "NotebookEdit"}
NotebookEdit modifies Jupyter notebook cells; it could be needed if the agent is using a Jupyter notebook for the STRP.
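A minimal sketch of how such a set could gate tool calls under dry-run follows. The `is_tool_allowed` helper is a hypothetical illustration and does not reflect how the PR actually wires the check into the agent SDK.

```python
# Copied from the snippet above: tools that can modify the source repo.
SOURCE_WRITE_TOOLS = {"Write", "Edit", "MultiEdit", "NotebookEdit"}


def is_tool_allowed(tool_name, dry_run):
    """Return False for source-modifying tools while in dry-run mode."""
    return not (dry_run and tool_name in SOURCE_WRITE_TOOLS)


for name in ("Read", "NotebookEdit"):
    print(name, is_tool_allowed(name, dry_run=True))
```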
    @tool(
        "evaluate_testcase",
        "Run a testcase in an ASAN-instrumented Firefox under xvfb and "
It depends on the mozconfig; we should not always use ASAN.
        "Build Firefox with the ASAN+UBSAN fuzzing mozconfig. Slow (tens of "
        "minutes on a cold build, faster incremental). Returns JSON: "
Same here: given it's slow, we won't always use ASAN and UBSAN.
    Only label something as `unsupported-config` if that assessment would be true for all
    currently supported channels.

    Currently supported versions are: ESR115, ESR140, 149, 150 and 151.
This is going to be outdated very quickly :)
The whole file is hardcoded; we should consider replacing it with a dynamic way to provide the same information.
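One possible shape for that dynamic approach is to render the sentence from version data supplied at run time. This is only a sketch: the function name and parameters are hypothetical, and where the version numbers would come from is left open.

```python
def supported_versions_sentence(esr_versions, release_versions):
    """Render the ruleset's version sentence from lists of version numbers."""
    names = [f"ESR{v}" for v in esr_versions] + [str(v) for v in release_versions]
    if len(names) < 2:
        raise ValueError("expected at least two supported versions")
    # Join all but the last name with commas, then attach the last with "and".
    head, last = ", ".join(names[:-1]), names[-1]
    return f"Currently supported versions are: {head} and {last}."


print(supported_versions_sentence([115, 140], [149, 150, 151]))
```

With the values currently hardcoded in the rules file, this reproduces the existing sentence.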
    Unless you add the `unsupported-config` keyword, append a `[prefs-checked]` tag to the
    whiteboard so we don't have to repeat this process again.
I don't know if we'll want to do this for every bug.
    Process each bug exactly once. Do not loop back.

    **Task mode:** If the user message gives you a specific task directive (e.g. "set keyword X on bugs that match Y"), that directive replaces the default rules-driven triage workflow above. The rules directory remains available but is not mandatory — follow the task.
I guess we don't need this for now.
    # Source repository

    Your working directory is the source repository for the product these bugs are filed against. You have Read, Grep, Glob, and Bash to inspect it. Use this to answer questions like "does this function still exist", "where is this string defined", "what does this test actually check".
In a follow-up, we should also add the other tools that we already have, such as Searchfox.
    These tools are not gated by --dry-run: reproducing a crash is assessment, not modification.

    Only produce a fix when explicitly asked for, and follow these rules:
We might want to remove this and always have the agent attempt a fix if possible.
    Only produce a fix when explicitly asked for, and follow these rules:

    - Before trying to reproduce or fix anything, ensure you are at origin/main with no local source changes.
If we create a fresh clone for each run, this line might not be necessary.
    - Reproduce the issue first, then plan your fix and test that the issue no longer reproduces. If you cannot
      reproduce the bug, do not post a fix patch. Comment instead that the bug wasn't reproducible automatically
      and needs manual attention.
In some cases it would be feasible to attempt a fix even when the bug can't be reproduced (though I guess we can add this option later).
Let's keep it for now to reduce false positives; we can iterate later if it turns out to be too restrictive.
    - **What** you are about to change and **why** (cite the specific rule)
    - **Your confidence**: high / medium / low

    Only call `update_bug` to change fields when confidence is **high** and a specific triage rule directs it. If confidence is medium or low, `add_comment` instead to ask for clarification or note your findings — do not silently skip.
Given we have few triage rules, the confidence might never be "high" right now.
Let's keep it and see how it works.
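The gating described in the quoted prompt can be sketched as a small decision function. In the PR this policy is expressed as prompt text for the agent, not as code; the `choose_action` helper and its parameters below are hypothetical and only illustrate the rule.

```python
VALID_CONFIDENCE = {"high", "medium", "low"}


def choose_action(confidence, rule=None):
    """Pick a Bugzilla tool name following the quoted confidence policy."""
    if confidence not in VALID_CONFIDENCE:
        raise ValueError(f"unknown confidence: {confidence!r}")
    # Field changes need both high confidence and a specific triage rule.
    if confidence == "high" and rule:
        return "update_bug"
    # Otherwise leave a comment rather than silently skipping the bug.
    return "add_comment"


print(choose_action("high", rule="unsupported-config"))
print(choose_action("medium", rule="unsupported-config"))
```

This also shows the reviewer's concern: with few rules available, `rule` is often empty, so most bugs would fall through to `add_comment`.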
For now, it runs via CLI; I'll follow up by creating a cloud-based service.