_call_user_fn_args silently reassigns kwargs positionally when a scorer declares an unrelated parameter #230

## Summary

`braintrust.framework._call_user_fn_args` ([framework.py:488](https://github.com/braintrustdata/braintrust-sdk-python/blob/main/src/braintrust/framework.py#L488)) matches scorer parameters to Braintrust-provided kwargs by name. When a declared parameter name is **not** among the provided kwargs, however, it pops the first remaining kwarg (in dict insertion order) and assigns it **positionally** to that parameter anyway:

```python
for name, param in signature.parameters.items():
    if param.kind in (VAR_POSITIONAL, VAR_KEYWORD):
        continue
    if name in kwargs:
        final_kwargs[name] = kwargs.pop(name)
    else:
        next_arg = list(kwargs.keys())[0]
        final_kwargs[name] = kwargs.pop(next_arg)   # <-- surprising
```

This means any scorer that declares a parameter whose name Braintrust doesn't inject (e.g. for dependency injection, configuration, a client, a cached resource) will silently receive one of Braintrust's own kwargs under the wrong name, typically `metadata` or `trace`. This happens even when the parameter is keyword-only and has a default; the declared default is discarded.

## Repro

```python
from dataclasses import dataclass

@dataclass
class Config:
    threshold: float = 0.5

DEFAULT_CONFIG = Config()

async def my_scorer(
    input: str,
    output,
    expected,
    *,
    config: Config = DEFAULT_CONFIG,  # default silently ignored
    **_,
):
    # At runtime, `config` is actually the `metadata` dict that
    # Braintrust passed, not DEFAULT_CONFIG.
    return config.threshold  # AttributeError: 'dict' object has no attribute 'threshold'
```

Braintrust calls scorers with `input`, `expected`, `metadata`, `output`, `trace` ([framework.py:1571](https://github.com/braintrustdata/braintrust-sdk-python/blob/main/src/braintrust/framework.py#L1571)). `input`, `output`, and `expected` are matched by name and removed from the dict. Because `config` isn't in the provided set, `_call_user_fn_args` then pops the first remaining kwarg, `metadata`, and assigns it to `config`. `**_` is populated only with whatever remains after the walk.
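The walk can be reproduced without installing `braintrust`. The following is a minimal standalone sketch: `bind_like_call_user_fn_args` is a hypothetical helper that copies only the loop quoted in the Summary, not the real function, and the kwargs dict assumes the insertion order listed above. The scorer is synchronous here purely to keep the example short.

```python
import inspect
from dataclasses import dataclass


@dataclass
class Config:
    threshold: float = 0.5


DEFAULT_CONFIG = Config()


def bind_like_call_user_fn_args(fn, kwargs):
    """Hypothetical stand-in that mirrors the loop quoted in the Summary."""
    kwargs = dict(kwargs)  # copy so the caller's dict is left untouched
    final_kwargs = {}
    for name, param in inspect.signature(fn).parameters.items():
        if param.kind in (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD):
            continue
        if name in kwargs:
            final_kwargs[name] = kwargs.pop(name)
        else:
            # Positional fallback: take whichever kwarg happens to be first.
            next_arg = list(kwargs.keys())[0]
            final_kwargs[name] = kwargs.pop(next_arg)
    return final_kwargs, kwargs


def my_scorer(input, output, expected, *, config: Config = DEFAULT_CONFIG, **_):
    return config  # return what was actually received, for inspection


# Assumed insertion order, matching the list of kwargs above.
provided = {
    "input": "q",
    "expected": "a",
    "metadata": {"source": "unit-test"},
    "output": "a",
    "trace": None,
}

bound, leftover = bind_like_call_user_fn_args(my_scorer, provided)
received = my_scorer(**bound)
print(type(received).__name__)  # dict
print(sorted(leftover))         # ['trace']
```

Under these assumptions `config` ends up holding the `metadata` dict rather than `DEFAULT_CONFIG`, and only `trace` is left over for `**_`.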

## Why this is a problem

1. **Silent type-punning.** The scorer parameter has a declared default and a declared type; both are ignored with no warning.
2. **Order-dependent.** Which kwarg gets reassigned depends on the insertion order of the kwargs dict and on the order in which the scorer declares its parameters, which is subtle and fragile.
3. **Breaks DI patterns.** The natural way to make a scorer testable is to accept an injectable dependency with a default; this behavior makes that unsafe.
4. **No warning or error.** The user only notices when the wrong-type value explodes deep in the call stack.

## Expected behavior

One of:

- Only bind parameters that are present in Braintrust's provided kwargs by name; leave declared-but-absent parameters to their defaults.
- If reassignment is intentional for some backward-compat reason, raise or warn when a declared parameter name has no matching kwarg and is filled positionally.
- At minimum, document this behavior prominently in the scorer authoring docs.
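As an illustration of the first option, here is a rough sketch of name-only binding. `bind_by_name_only` is a hypothetical helper, not a proposed patch to the SDK: it passes through only the kwargs the function can accept by name and leaves everything else to the function's own defaults. It does not handle positional-only parameters, and it would change behavior for any scorer that currently relies on the positional fallback.

```python
import inspect


def bind_by_name_only(fn, kwargs):
    """Hypothetical alternative: bind only the names the function can accept."""
    params = inspect.signature(fn).parameters
    # If the function declares **kwargs, every provided kwarg is acceptable.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    bindable = {
        name
        for name, p in params.items()
        if p.kind in (inspect.Parameter.POSITIONAL_OR_KEYWORD, inspect.Parameter.KEYWORD_ONLY)
    }
    # Drop kwargs the function does not declare; absent names keep their defaults.
    return {k: v for k, v in kwargs.items() if k in bindable}


def scorer_with_default(input, output, expected, *, config="DEFAULT", **_):
    return config


def strict_scorer(output, expected):
    return output == expected


provided = {"input": "q", "expected": "a", "metadata": {}, "output": "a", "trace": None}

print(scorer_with_default(**bind_by_name_only(scorer_with_default, provided)))  # DEFAULT
print(strict_scorer(**bind_by_name_only(strict_scorer, provided)))              # True
```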

## Workaround

Drop the extra keyword parameter from the scorer's public signature and reference the dependency from a module-level variable (or closure). Monkeypatch the module-level variable for tests.
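A minimal sketch of the closure variant of this workaround follows. `make_scorer` and the scoring rule are hypothetical, chosen only for illustration. The point is that the scorer's signature declares nothing beyond names Braintrust provides plus `**_`, so the positional fallback in the quoted loop is never reached.

```python
from dataclasses import dataclass


@dataclass
class Config:
    threshold: float = 0.5


def make_scorer(config=None):
    """Hypothetical factory: the dependency lives in the closure, not the signature."""
    config = config if config is not None else Config()

    def my_scorer(input, output, expected, **_):
        raw = 1.0 if output == expected else 0.0
        return 1.0 if raw >= config.threshold else 0.0

    return my_scorer


# Production scorer uses the default config; tests can inject their own.
my_scorer = make_scorer()
unreachable_scorer = make_scorer(Config(threshold=2.0))

print(my_scorer(input="q", output="a", expected="a", metadata={}, trace=None))           # 1.0
print(unreachable_scorer(input="q", output="a", expected="a", metadata={}, trace=None))  # 0.0
```

Tests can then build a scorer with a custom `Config` instead of monkeypatching a module-level variable.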

## Environment

- `braintrust` (latest on PyPI as of 2026-04-08)
- Python 3.14
