[OAI] Support braintrust >=0.13 wrapping (fix Python CI) by wong-codaio · Pull Request #193 · braintrustdata/autoevals

Kenny Wong (wong-codaio) · 2026-06-02T13:00:47Z

Summary

[OAI] Support braintrust >=0.13 wrapping (fix Python CI)

braintrust 0.13 changed wrap_openai to patch methods in place (via wrapt) instead of returning a NamedWrapper, and dropped the *Wrapper classes. This broke is_wrapped detection and the imports in test_oai.py (pytest collection aborted, so Python CI was red on every PR).
Fix-forward (no pin): detect wrapping in a version-agnostic way, via NamedWrapper (<0.13) or a wrapt wrapper on create() (>=0.13).
Behavior change: braintrust >=0.13 only instruments the v1 SDK, so v0 clients are no longer traced. Tests updated to match.

Test plan

pytest py/autoevals/ → 68 passed on braintrust 0.23 (was: collection ImportError)
Manual sanity check below: verifies detection and that a real classifier completes through the wrapped client (so the injected braintrust span metadata doesn't break the call):

braintrust 0.23.0
OK: wrapping detection works (structural)
OK: live classifier through wrapped client -> score=1

check_braintrust_wrapping.py (run it yourself)

"""Sanity check: autoevals still works with the installed braintrust.

braintrust periodically changes how wrap_openai instruments OpenAI clients
(e.g. 0.13 moved from a NamedWrapper proxy to in-place wrapt patching). If
detection regresses, scorer spans silently lose their `purpose` -- and the
injected span metadata could even break the wrapped call. Run this after a
braintrust upgrade to confirm nothing regressed.

  # structural checks only (no network/key):
  python check_braintrust_wrapping.py

  # also run a real classifier through the wrapped client:
  BRAINTRUST_API_KEY=... python check_braintrust_wrapping.py
"""

import importlib.metadata as md
import os

import openai
from braintrust.oai import wrap_openai

from autoevals import Factuality, init
from autoevals.oai import LLMClient, get_openai_wrappers, openai_client_is_wrapped, prepare_openai

print("braintrust", md.version("braintrust"))
NamedWrapper, _ = get_openai_wrappers()

# 1. helper detects a wrapped client and does not false-positive on a plain one.
wrapped = wrap_openai(openai.OpenAI(api_key="x"))
plain = openai.OpenAI(api_key="x")
assert openai_client_is_wrapped(wrapped, NamedWrapper), "wrapped client not detected"
assert not openai_client_is_wrapped(plain, NamedWrapper), "plain client falsely detected"

# 2. init() -> prepare_openai() wraps the default client and reports is_wrapped.
init(openai.OpenAI(api_key="x"))
assert prepare_openai().is_wrapped, "prepare_openai() did not wrap"

# 3. a client with custom methods opts out of wrapping.
init(None)
custom = openai.OpenAI(api_key="x")
assert not LLMClient(openai=custom, complete=custom.chat.completions.create).is_wrapped
print("OK: wrapping detection works (structural)")

# 4. live: run a real classifier through the wrapped client. The wrapped call
#    must accept the injected braintrust span metadata and return a score.
if os.environ.get("OPENAI_API_KEY") or os.environ.get("BRAINTRUST_API_KEY"):
    init(None)  # default client -> braintrust proxy + BRAINTRUST_API_KEY
    assert prepare_openai().is_wrapped
    r = Factuality(model="gpt-4o-mini").eval(
        output="Paris", expected="Paris", input="What is the capital of France?"
    )
    assert r.error is None, r.error
    assert r.score == 1, r.score
    print(f"OK: live classifier through wrapped client -> score={r.score}")
else:
    print("SKIP live classifier (set BRAINTRUST_API_KEY or OPENAI_API_KEY to run it)")

Other Notes

Supersedes [Deps] Pin braintrust <0.13 to unbreak Python CI #192 (the pin approach); surfaced while diagnosing [OAI] Allow forcing Responses API for non-gpt-5 model names #190.

braintrust 0.13 removed the wrapper classes and changed wrap_openai to patch resource methods in place (wrapt FunctionWrapper) instead of returning a NamedWrapper proxy. As a result, the isinstance(client, NamedWrapper) check no longer detected wrapping (is_wrapped was wrongly False, so scorer spans lost their purpose), and test_oai.py imported now-removed classes, which aborted collection. This change detects wrapping in a version-agnostic way: NamedWrapper for <0.13, wrapt wrapper type on create() for >=0.13. Note that braintrust >=0.13 only instruments the v1 SDK, so v0 clients are no longer traced; tests are updated to match.
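
The two-branch detection described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `looks_wrapped`, `make_client`, and both stand-in classes are hypothetical names, and the stand-ins replace the real braintrust NamedWrapper and wrapt wrapper types so the snippet runs without either package installed.

```python
from types import SimpleNamespace
from typing import Any, Optional, Tuple


def looks_wrapped(
    client: Any,
    named_wrapper_cls: Optional[type],
    patched_types: Tuple[type, ...] = (),
) -> bool:
    """Hypothetical helper: report whether `client` appears instrumented."""
    # Older style (<0.13 in the PR description): the client itself is a proxy object.
    if named_wrapper_cls is not None and isinstance(client, named_wrapper_cls):
        return True
    # Newer style (>=0.13): the client is unchanged, but create() has been
    # replaced in place by a wrapper object. Walk the attribute chain defensively.
    chat = getattr(client, "chat", None)
    completions = getattr(chat, "completions", None)
    create = getattr(completions, "create", None)
    return bool(patched_types) and isinstance(create, patched_types)


# Stand-ins (assumptions for illustration only, not braintrust or wrapt classes).
class StandInNamedWrapper:
    def __init__(self, wrapped: Any) -> None:
        self._wrapped = wrapped


class StandInPatchedCallable:
    def __init__(self, fn: Any) -> None:
        self._fn = fn

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        return self._fn(*args, **kwargs)


def make_client(create: Any) -> SimpleNamespace:
    # Mimics the client.chat.completions.create attribute path.
    return SimpleNamespace(
        chat=SimpleNamespace(completions=SimpleNamespace(create=create))
    )


def plain_create(**kwargs: Any) -> dict:
    return kwargs


plain = make_client(plain_create)
proxied = StandInNamedWrapper(plain)  # proxy-style wrapping
patched = make_client(StandInPatchedCallable(plain_create))  # in-place patching

print(looks_wrapped(proxied, StandInNamedWrapper, (StandInPatchedCallable,)))
print(looks_wrapped(patched, None, (StandInPatchedCallable,)))
print(looks_wrapped(plain, StandInNamedWrapper, (StandInPatchedCallable,)))
```

Passing the wrapper types in as arguments keeps the check independent of which braintrust version is installed: a caller can supply `None` for the proxy class when it no longer exists, and an empty tuple when no in-place wrapper types are known.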

Kenny Wong (wong-codaio) mentioned this pull request Jun 2, 2026

[Deps] Pin braintrust <0.13 to unbreak Python CI #192 (Closed, 2 tasks)

Kenny Wong (wong-codaio) force-pushed the wong/deps/support-braintrust-0.13 branch from da14027 to 7da7460 on June 2, 2026 13:13

ekeith (evanmkeith) requested a review from Erin McNulty (erin2722) on June 2, 2026 13:40

Kenny Wong (wong-codaio) wants to merge 1 commit into braintrustdata:main from wong-codaio:wong/deps/support-braintrust-0.13
