[BOT ISSUE] Anthropic `server_tool_use` usage metrics not captured in span metrics #77

## Summary

The Anthropic instrumentation extracts `input_tokens` and `output_tokens` from the Messages API response `usage` object, but does not capture the `server_tool_use` sub-object. When Claude uses server-side tools (web search, code execution), the API returns usage counters like `web_search_requests` and `web_fetch_requests` inside `usage.server_tool_use`. These are silently dropped, making server-side tool usage invisible in Braintrust metrics.

## What is missing

In `InstrumentationSemConv.tagAnthropicResponse()` (lines 220–230), the usage extraction only handles top-level token fields:

```java
if (responseJson.has("usage")) {
    JsonNode usage = responseJson.get("usage");
    if (usage.has("input_tokens")) metrics.put("prompt_tokens", usage.get("input_tokens"));
    if (usage.has("output_tokens")) metrics.put("completion_tokens", usage.get("output_tokens"));
    // ... total tokens
}
```

There is no check for `server_tool_use`. A real response that uses web search and web fetch contains a `usage` object like this:

```json
{
  "usage": {
    "input_tokens": 1200,
    "output_tokens": 350,
    "server_tool_use": {
      "web_search_requests": 3,
      "web_fetch_requests": 2
    }
  }
}
```

The fix should dynamically map each field inside `server_tool_use` to a metric named `server_tool_use_<field_name>`, so that new counters are picked up without a code change, e.g.:
- `server_tool_use_web_search_requests`
- `server_tool_use_web_fetch_requests`
- `server_tool_use_code_execution_requests`
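
The dynamic mapping described above could look roughly like the sketch below. It is illustrative only: `ServerToolUseMetrics` and `extract` are hypothetical names, not part of the SDK, and the sketch uses plain `java.util.Map` in place of Jackson's `JsonNode` so that it runs without dependencies. It assumes every counter of interest is a JSON number and skips anything else.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the Braintrust SDK. */
final class ServerToolUseMetrics {
    private static final String PREFIX = "server_tool_use_";

    private ServerToolUseMetrics() {}

    /**
     * Copies every numeric counter found in usage.server_tool_use into a flat
     * metrics map, prefixing each field name with "server_tool_use_".
     */
    static Map<String, Number> extract(Map<String, ?> usage) {
        Map<String, Number> metrics = new LinkedHashMap<>();
        Object serverToolUse = usage.get("server_tool_use");
        if (!(serverToolUse instanceof Map<?, ?>)) {
            return metrics; // field absent, null, or not an object: nothing to record
        }
        for (Map.Entry<?, ?> entry : ((Map<?, ?>) serverToolUse).entrySet()) {
            // Skip non-numeric values so an unexpected shape never fails the span.
            if (entry.getKey() instanceof String && entry.getValue() instanceof Number) {
                metrics.put(PREFIX + entry.getKey(), (Number) entry.getValue());
            }
        }
        return metrics;
    }
}
```

Applied to the example response above, this would yield `server_tool_use_web_search_requests` and `server_tool_use_web_fetch_requests`. In `tagAnthropicResponse()` the same loop would presumably iterate the fields of `usage.get("server_tool_use")` and write into the existing `metrics` object instead of returning a new map.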

Note: the full response JSON is stored in `braintrust.output_json`, so the raw data is technically present — but it is not extracted into `braintrust.metrics` where Braintrust's UI and cost calculations can use it.

For comparison, the OpenAI handler in the same file already extracts nested usage details (`output_tokens_details.reasoning_tokens` at lines 145–150), so this pattern is established.

## Braintrust docs status

- **supported** — Braintrust docs at https://www.braintrust.dev/docs/integrations/ai-providers/anthropic explicitly document server-side tool metrics: "When Claude uses server-side tools, Braintrust records the provider's tool usage counters dynamically." The documented metric names are `server_tool_use_web_search_requests`, `server_tool_use_web_fetch_requests`, `server_tool_use_code_execution_requests`.

## Upstream sources

- **Anthropic server-side tools**: https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/web-search-tool — web search is a GA server-side tool
- **Anthropic Messages API `usage` object**: https://docs.anthropic.com/en/api/messages — documents `server_tool_use` as part of the response `usage` object
- **Anthropic Java SDK**: `com.anthropic:anthropic-java:2.2.0` (depended on by this repo) — the HTTP response contains `server_tool_use` regardless of SDK version since it's a JSON field

## Local files inspected

- `braintrust-sdk/src/main/java/dev/braintrust/instrumentation/InstrumentationSemConv.java` — lines 208–235 (`tagAnthropicResponse` only extracts `input_tokens` and `output_tokens`; no `server_tool_use` check)
- `braintrust-sdk/instrumentation/anthropic_2_2_0/src/main/java/dev/braintrust/instrumentation/anthropic/v2_2_0/TracingHttpClient.java` — HTTP-level instrumentation, delegates to `InstrumentationSemConv`
- `braintrust-sdk/instrumentation/anthropic_2_2_0/src/test/java/dev/braintrust/instrumentation/anthropic/v2_2_0/BraintrustAnthropicTest.java` — no test cases exercise server-side tool responses
