Summary
The Replicate Python SDK (replicate) is the official client for the Replicate platform, which hosts 100k+ machine learning models for language generation, image synthesis, video, audio, and more. Its primary execution surfaces are replicate.run(model, input={}) and replicate.async_run(), which form a platform-agnostic model execution API distinct from any provider-specific format. This repository has zero instrumentation for any Replicate SDK surface: no integration directory, no wrapper, no patcher, and no auto_instrument() support.
The Replicate API is fundamentally different from OpenAI-style APIs: there is no chat.completions.create() shape, model identifiers are version strings (e.g. "meta/llama-3-70b-instruct:..." or "owner/model"), and streaming is token-by-token via an iterator. wrap_openai() cannot be used with the Replicate client. Users who follow Replicate's official documentation and pip install replicate get zero Braintrust tracing.
What needs to be instrumented
The replicate package exposes these execution surfaces via module-level functions and the Client class, none of which are instrumented:
Model execution (highest priority)
| SDK Method | Description | Streaming | Return type |
| --- | --- | --- | --- |
| replicate.run(model, input) | Sync model execution: runs a model version and returns the complete output | stream=True yields ServerSentEvent objects | str, list, or model-specific output |
| replicate.async_run(model, input) | Async model execution | stream=True yields ServerSentEvent | Same as sync |
| replicate.stream(model, input) | Sync streaming execution: returns an iterator of output tokens/chunks | Always streaming | Iterator[ServerSentEvent] |
| replicate.async_stream(model, input) | Async streaming execution | Always streaming | AsyncIterator[ServerSentEvent] |
Span shape for language models: The model parameter (a version string) maps to span metadata model. Inputs are the input dict (contains prompt, system_prompt, max_tokens, etc. depending on model). Output is the concatenated string of streamed tokens. Token usage is not returned by the standard API (models expose usage differently), so metrics may require best-effort extraction from model-specific output.
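The span shape described above can be sketched as a plain function that assembles the fields for one call. This is only an illustration: build_span_fields and the "input" / "output" / "metadata" / "metrics" keys are hypothetical names chosen for this sketch, not an existing Braintrust or Replicate API.

```python
from typing import Any, Dict, Iterable


def build_span_fields(
    model: str, model_input: Dict[str, Any], chunks: Iterable[Any]
) -> Dict[str, Any]:
    """Assemble hypothetical span fields for one language-model call."""
    return {
        # The model-specific input dict is logged unchanged as the span input.
        "input": dict(model_input),
        # Streamed tokens are concatenated into a single output string.
        "output": "".join(str(chunk) for chunk in chunks),
        # The model reference string goes into metadata under "model".
        "metadata": {"model": model},
        # No token usage is assumed, so metrics stay empty by default.
        "metrics": {},
    }


fields = build_span_fields(
    "owner/model",
    {"prompt": "Say hi", "max_tokens": 16},
    ["Hel", "lo", "!"],
)
print(fields["output"])
```

Leaving metrics empty by default reflects the point that token usage is not returned uniformly; any extraction would have to be added per model.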
Predictions API (lower priority)
| SDK Method | Description | Return type |
| --- | --- | --- |
| replicate.predictions.create() | Create a model prediction (lower-level than run()) | Prediction |
| replicate.predictions.get() | Poll a prediction by ID | Prediction |
Deployments (lower priority)
| SDK Method | Description |
| --- | --- |
| replicate.deployments.predictions.create() | Run a model via a named deployment |
All module-level functions delegate to a default Client instance. Client and AsyncClient have corresponding instance methods.
Implementation notes
Model ID format: replicate.run() accepts model as either "owner/model" (latest version) or "owner/model:version_hash". The integration should extract and log both the model name and version.
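A minimal sketch of splitting the two string forms into name and version follows. ModelRef and parse_model_ref are hypothetical helper names for illustration, and the sketch handles only the two string formats described above.

```python
from typing import NamedTuple, Optional


class ModelRef(NamedTuple):
    owner: str
    name: str
    version: Optional[str]  # None means "latest version"


def parse_model_ref(ref: str) -> ModelRef:
    """Split 'owner/model' or 'owner/model:version_hash' into its parts."""
    path, sep, version = ref.partition(":")
    owner, slash, name = path.partition("/")
    if not slash or not owner or not name:
        raise ValueError(f"expected 'owner/model[:version]', got {ref!r}")
    return ModelRef(owner, name, version if sep and version else None)


print(parse_model_ref("owner/model"))
print(parse_model_ref("owner/model:abc123"))
```

Logging the name and version as separate metadata fields would let spans be grouped by model regardless of which version ran.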
No single response type: Unlike OpenAI where all chat completions return ChatCompletion, Replicate output varies by model (strings, lists of strings, dicts, file URLs). The integration should log the raw output and handle streaming aggregation generically.
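One way to log raw output generically is to convert it into JSON-friendly data without interpreting it. The helper below is a sketch under that assumption; to_loggable is a hypothetical name, and falling back to str() for unknown objects (such as file-like outputs) is a design choice of this example rather than documented SDK behavior.

```python
import json
from typing import Any


def to_loggable(output: Any) -> Any:
    """Convert arbitrary model output into JSON-friendly data."""
    if output is None or isinstance(output, (str, int, float, bool)):
        return output
    if isinstance(output, dict):
        return {str(key): to_loggable(value) for key, value in output.items()}
    if isinstance(output, (list, tuple)):
        return [to_loggable(value) for value in output]
    # Unknown objects fall back to their string form.
    return str(output)


sample = {"text": "hi", "images": ("https://example.com/a.png",), "score": 0.5}
print(json.dumps(to_loggable(sample)))
```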
Streaming: stream=True in run() and the dedicated stream() function yield ServerSentEvent objects with data fields. Accumulated output should be logged as the span output.
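The accumulation step could look roughly like the generator below, which passes events through unchanged and reports the joined data when the stream ends. FakeEvent is a hypothetical stand-in that assumes only a data attribute, and traced_stream and on_finish are illustrative names; a real integration would likely also need to skip events that carry no output, which is left out here.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Iterator, List


@dataclass
class FakeEvent:
    """Hypothetical stand-in for an SDK event; only `data` is assumed."""
    data: str


def traced_stream(
    events: Iterable[Any], on_finish: Callable[[str], None]
) -> Iterator[Any]:
    """Yield events unchanged, then report the accumulated `data` once."""
    parts: List[str] = []
    try:
        for event in events:
            parts.append(str(getattr(event, "data", "")))
            yield event
    finally:
        # Runs on normal exhaustion, on error, and when the consumer
        # closes the generator early.
        on_finish("".join(parts))


logged: List[str] = []
for _event in traced_stream([FakeEvent("Hel"), FakeEvent("lo")], logged.append):
    pass
print(logged)
```

Reporting from the finally block means a partially consumed stream still produces a span output containing whatever was received.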
Async first-class: The SDK has AsyncClient with async_run() and async_stream() — both must be instrumented.
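For the async surface, the same pass-through idea can be written as an async generator. This is a sketch only: traced_async_stream and fake_events are hypothetical names, and SimpleNamespace objects stand in for SDK events, assuming nothing beyond a data attribute.

```python
import asyncio
from types import SimpleNamespace
from typing import Any, AsyncIterator, Callable, List


async def traced_async_stream(
    events: AsyncIterator[Any], on_finish: Callable[[str], None]
) -> AsyncIterator[Any]:
    """Yield async events unchanged, then report the accumulated `data`."""
    parts: List[str] = []
    try:
        async for event in events:
            parts.append(str(getattr(event, "data", "")))
            yield event
    finally:
        on_finish("".join(parts))


async def fake_events() -> AsyncIterator[Any]:
    """Hypothetical async event source used only for this example."""
    for token in ("Hel", "lo"):
        yield SimpleNamespace(data=token)


async def main() -> List[str]:
    logged: List[str] = []
    async for _event in traced_async_stream(fake_events(), logged.append):
        pass
    return logged


result = asyncio.run(main())
print(result)
```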
input dict as span input: The input dict is model-specific but typically contains prompt, system_prompt, max_tokens, temperature, etc. for language models.
No coverage in any instrumentation layer
- No integration directory (py/src/braintrust/integrations/replicate/)
- No wrapper function (e.g. wrap_replicate())
- No patcher in any existing integration
- No nox test session (test_replicate)
- No version entry in py/src/braintrust/integrations/versioning.py
- No mention in py/src/braintrust/integrations/__init__.py
- No entry in [tool.braintrust.matrix] in py/pyproject.toml
A grep for replicate across py/src/braintrust/ returns zero matches in integration code.
Braintrust docs status
not_found — Replicate is not listed on the Braintrust integrations directory or the tracing guide.
Upstream references
Local repo files inspected
- py/src/braintrust/integrations/: no replicate/ directory exists on main
- py/src/braintrust/wrappers/: no Replicate wrapper
- py/noxfile.py: no test_replicate session
- py/src/braintrust/integrations/__init__.py: Replicate not listed in integration registry
- py/src/braintrust/integrations/versioning.py: no Replicate version matrix
- py/pyproject.toml: no Replicate entries in [tool.braintrust.matrix]
- Full repo grep for "replicate" across py/src/braintrust/: zero matches