Record tool usage using hooks instead by haoranpb · Pull Request #621 · microsoft/BC-Bench

haoranpb · 2026-04-21T07:25:17Z

Metrics parsing previously relied on the debug logs, which no longer contain tool usage.

Instead of parsing logs, we now use the PreToolUse hook to record all tool usage.
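
For readers unfamiliar with the approach, here is a rough Python sketch of what a hook-side recorder could look like. It is not the PowerShell script from this PR. The function name `record_tool_use`, the payload keys `tool_name` / `toolName`, and the shape of the written entry are all assumptions made for illustration.

```python
from __future__ import annotations

import json
from datetime import datetime, timezone
from pathlib import Path


def record_tool_use(raw_payload: str, log_path: Path) -> bool:
    """Append one JSONL entry per tool call; return False if nothing was written."""
    try:
        payload = json.loads(raw_payload)
    except json.JSONDecodeError:
        return False  # ignore payloads that are not valid JSON
    if not isinstance(payload, dict):
        return False

    # Assumption: the agent passes the tool name under one of these keys.
    name = payload.get("tool_name") or payload.get("toolName")
    if not isinstance(name, str) or not name:
        return False

    # Hypothetical entry shape: tool name plus a UTC timestamp.
    entry = {
        "tool_name": name,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    log_path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode keeps earlier entries; one JSON object per line.
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
    return True
```

A hook entry point would typically call something like `record_tool_use(sys.stdin.read(), Path("tool_usage.jsonl"))`, assuming the agent delivers the payload on standard input.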

Copilot

Pull request overview

This PR updates BC-Bench’s agent metrics collection to stop relying on Copilot/Claude debug logs for tool usage (which no longer include it) and instead records tool usage via PreToolUse hooks, then parses a dedicated tool_usage.jsonl output.

Changes:

Add hook setup + a PowerShell hook script to log tool usage to tool_usage.jsonl.
Add a shared parser for the hooks output and wire it into Copilot/Claude agents.
Remove tool-usage-from-log parsing, keep Copilot turn counting from session logs, and update tests accordingly.
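
To illustrate the parsing side, here is a minimal sketch of aggregating tool usage counts from a hook-written JSONL file. It is not the code in `hooks_parser.py`. The function name `count_tool_usage` and the `tool_name` key are hypothetical, and returning `None` when the file is missing is an assumption loosely modelled on the test description that tool usage remains `None` without hooks.

```python
from __future__ import annotations

import json
from collections import Counter
from pathlib import Path


def count_tool_usage(jsonl_path: Path) -> dict[str, int] | None:
    """Return {tool name: call count}, or None if no hook output exists."""
    if not jsonl_path.exists():
        return None  # hooks never ran, so there is nothing to report

    counts: Counter[str] = Counter()
    for line in jsonl_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of failing the run
        if not isinstance(entry, dict):
            continue
        # Assumption: each entry stores the tool name under "tool_name".
        name = entry.get("tool_name")
        if isinstance(name, str) and name:
            counts[name] += 1
    return dict(counts)
```

Skipping malformed lines rather than raising is a design choice in this sketch: a partially written entry then costs one missing count instead of the whole metrics record.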

Reviewed changes

Copilot reviewed 17 out of 18 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `src/bcbench/operations/hooks_operations.py` | New operation to write Copilot/Claude hook configuration and define tool usage log destination. |
| `src/bcbench/agent/shared/hooks_parser.py` | New JSONL parser that aggregates tool usage counts from hook output. |
| `src/bcbench/agent/shared/hooks/log-tool-usage.ps1` | New hook script that appends tool usage entries to a JSONL file. |
| `src/bcbench/agent/copilot/metrics.py` | Removes tool usage parsing from Copilot logs; keeps turn counting. |
| `src/bcbench/agent/copilot/agent.py` | Configures hooks and attaches parsed tool usage to collected metrics. |
| `src/bcbench/agent/claude/metrics.py` | Removes debug-log-based tool usage parsing from Claude metrics. |
| `src/bcbench/agent/claude/agent.py` | Configures hooks and attaches parsed tool usage to collected metrics. |
| `src/bcbench/commands/run.py` | Updates “copilot-inspector” to use hook-based tool usage parsing + log-based turn counting. |
| `src/bcbench/config.py` | Adds hook script path and new file-pattern config entries. |
| `src/bcbench/operations/__init__.py` | Exposes `setup_hooks` via operations package exports. |
| `src/bcbench/agent/shared/__init__.py` | Re-exports `parse_tool_usage_from_hooks`. |
| `tests/test_hooks_parser.py` | New unit tests for hook JSONL parsing behavior. |
| `tests/test_hooks_operations.py` | New unit tests for hook config generation for Copilot/Claude. |
| `tests/test_tool_usage_parser.py` | Replaces tool-usage-from-log tests with turn-count-from-log tests. |
| `tests/test_copilot_metrics_parsing.py` | Removes tool usage assertions from session logs; keeps turn count assertions. |
| `tests/test_claude_code_metrics.py` | Removes debug-log tool usage tests; asserts tool usage remains `None` without hooks. |
| `pyproject.toml` / `uv.lock` | Version bump to `0.5.2`. |

Co-authored-by: Copilot <copilot@github.com>

haoranpb added 2 commits April 21, 2026 09:17

Refactor metrics parsing to utilize session transcripts and remove de…

a8324fc

…bug log dependency

use hooks for tool usage recording

32300a3

haoranpb changed the title ~~Update metrics parsing after Claude Code update~~ Record tool usage using hooks instead Apr 23, 2026

haoranpb added 2 commits April 23, 2026 12:37

fix tests for linux

43f7d72

bump version to 0.5.2

01ab670

haoranpb marked this pull request as ready for review April 23, 2026 10:43

Merge branch 'main' into bugs/fix-claude-code-tool-uage

29c5a8c

haoranpb requested review from Jiawen-CS and Copilot April 23, 2026 10:44

Copilot started reviewing on behalf of haoranpb April 23, 2026 10:44

Copilot AI reviewed Apr 23, 2026

Comment thread src/bcbench/commands/run.py Outdated

Comment thread src/bcbench/operations/hooks_operations.py

haoranpb and others added 2 commits April 23, 2026 13:07

refactor: remove unused copilot-inspector command and related code

ee72926

Co-authored-by: Copilot <copilot@github.com>

fix: update tool name extraction and enhance hook command structure

d198e79

haoranpb enabled auto-merge (squash) April 24, 2026 07:24

Jiawen-CS approved these changes Apr 24, 2026

haoranpb merged commit a9cd827 into main Apr 24, 2026
22 of 27 checks passed

haoranpb deleted the bugs/fix-claude-code-tool-uage branch April 24, 2026 10:37

haoranpb merged 7 commits into main from bugs/fix-claude-code-tool-uage

3 participants
