-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create separate proxy-based wrapper for openai. #65
base: main
Are you sure you want to change the base?
Conversation
This is the new default for when people call `wrap_openai`, though they can opt-into using the non-proxy wrappers by passing `proxy_base_url=None` (or equivalent in javascript). Other changes to the SDK to facilitate this: - Add logic for serializing spans, so we can send the root span over to the proxy. - Add logic for serializing login info, so we can send that over to the proxy and instantiate a BackgroundLogger for the span using the serialized state. This requires parameterizing the BackgroundLogger on a given `logConn`, rather than using the global state connection. - Change the `get_caller_location` logic to ignore files in the stacktrace that in a subdirectory of the first stacktrace entry, not just the same directory. This seems to slightly change a couple of the expect test stack traces, but the changes are not really important (getting rid of meaningless, internal nodeJS callsites). - Remove the UnterminatedObjectsHandler. I wanted to "stop using" the root span in the proxy after spawning a subspan, without actually ending the root span, and I didn't want these warning messages. I think the warnings are no longer useful, because we really only care about warning people when they forget to unmark their spans as current. But we've already changed the behavior so that spans can only be marked current under a context manager / within an `AsyncLocalStorage` callback, so there's really very little risk of them making this mistake.
@@ -0,0 +1,139 @@ | |||
import { Span, startSpan } from "../logger"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is purely a cut-and-paste from oai.ts
, except for changing the wrapper function name.
@@ -0,0 +1,180 @@ | |||
"""Wrappers for the openAI client which do not go through the Braintrust |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is purely a cut-and-paste from oai.py
, except for changing the names of the wrapper classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline: can we simplify the state we're sending to the proxy?
js/src/logger.ts
Outdated
* | ||
* @param args.setCurrent If true (the default), the span will be marked as the currently-active span for the duration of the callback. Equivalent to calling `braintrust.withCurrent(span, callback)`. | ||
*/ | ||
export function withResumedSpan<R>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect this will only be used internally. If so, could we avoid adding yet another one of these functions? Or if not, is there a more elegant way that allows you to resume a span, and then use it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will just remove this for now.
js/src/logger.ts
Outdated
} | ||
|
||
let unterminatedObjects = new UnterminatedObjectsHandler(); | ||
|
||
class FailedHTTPResponse extends Error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this carried over from the other change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was done independently, because I didn't want a warning from resuming the serialized root span and not ending it in the proxy. But now that we've decided to only create subspans out of a serialized span, I can leave the removal to the other PR.
js/src/logger.ts
Outdated
public serializeLoginInfo(): string { | ||
return JSON.stringify({ | ||
api_url: this.apiUrl, | ||
login_token: this.loginToken, | ||
org_id: this.orgId, | ||
org_name: this.orgName, | ||
log_url: this.logUrl, | ||
}); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is redundant, and if you use your BRAINTRUST_API_KEY
we already re-derive this info in the logger. Perhaps we should just send x-bt-org-name
, and let the logic in the proxy handle the rest?
js/src/logger.ts
Outdated
|
||
type SerializedSpanInfo = SpanObjectIds & SpanParentSpanIds; | ||
|
||
function SerializedSpanInfoToString(info: SerializedSpanInfo): string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this capitalized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
un-capitalized it
js/src/logger.ts
Outdated
const objectKindIds = (() => { | ||
if (info.object_kind === "e") { | ||
return [info.project_id, info.experiment_id]; | ||
} else if (info.object_kind === "pl") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems like good logic to write some simple unit tests for
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done!
js/src/oai.ts
Outdated
* | ||
* Currently, this only supports the `v4` API. | ||
* | ||
* @param openai | ||
* @param openai The `OpenAI` object. | ||
* @param proxyBaseUrl Use the Braintrust proxy (https://github.com/braintrustdata/braintrust-proxy) as the base URL. When left `undefined`, the URL is obtained from the environment variable `BRAINTRUST_PROXY_URL`, defaulting to `https://braintrustproxy.com/v[api_version]`. Pass `null` to leave the original base URL intact and use the non-proxy wrapper, or pass a string to use a custom URL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a fairly subtle distinction (especially between undefined
and null
). I think it'd be more intuitive to:
- rename
args
tooptions
- make the first argument
useProxy
and make it astring | bool
. Iftrue
, it uses the default proxy url. If a string, it uses that URL. If we want to keep the proxy off by default, then maybe we create a flag likedisableProxy
. - make the second argument
apiKey
.
js/src/oai.ts
Outdated
* | ||
* Currently, this only supports the `v4` API. | ||
* | ||
* @param openai | ||
* @param openai The `OpenAI` object. | ||
* @param proxyBaseUrl Use the Braintrust proxy (https://github.com/braintrustdata/braintrust-proxy) as the base URL. When left `undefined`, the URL is obtained from the environment variable `BRAINTRUST_PROXY_URL`, defaulting to `https://braintrustproxy.com/v[api_version]`. Pass `null` to leave the original base URL intact and use the non-proxy wrapper, or pass a string to use a custom URL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine to just mention v1
instead of the api version
function getProxyApiKeyDefault(): string | undefined { | ||
return process.env["BRAINTRUST_API_KEY"]; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably need to check whether process is defined or not. In browser/edge environments, it's not
* @param proxyBaseUrl Use the Braintrust proxy (https://github.com/braintrustdata/braintrust-proxy) as the base URL. When left `undefined`, the URL is obtained from the environment variable `BRAINTRUST_PROXY_URL`, defaulting to `https://braintrustproxy.com/v[api_version]`. Pass `null` to leave the original base URL intact and use the non-proxy wrapper, or pass a string to use a custom URL. | ||
* @param proxyApiKey For clients using the proxy wrapper (`proxyBaseUrl` is not `null`). When left `undefined`, the API key is set from `BRAINTRUST_API_KEY` if available. Pass `null` to leave the original API key intact, or pass a string to use a custom API key. | ||
* @param options.useProxy By default (or if `false`), the wrapper does not trace through the proxy. Pass `true` to use the Braintrust proxy as the base URL. The URL is obtained from the environment variable `BRAINTRUST_PROXY_URL`, defaulting to `https://braintrustproxy.com/v1. Pass a string to use a custom proxy URL. | ||
* @param options.apiKey Only used when `useProxy` is set. By default, the API key is set from `BRAINTRUST_API_KEY` if available. Pass a string to use a custom API key. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite follow the apiKey
setting. Why would someone need to set this, instead of setting the api key in the OpenAI client? I'm imagining the code like
wrapOpenAI(new OpenAI({ apiKey: process.env.BRAINTRUST_API_KEY }), { useProxy: true })
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess there are two ways someone can initialize the OpenAI client
// Uses OPENAI_API_KEY
new OpenAI();
// Uses custom api key
new OpenAI({apiKey: [custom_api_key]});
I was thinking that in the first case (which I imagine is the most common), they don't have to change anything about their code (just add the wrapper) and it'll change the API key to BRAINTRUST_API_KEY
. In the second case, they can move the apiKey
argument to the wrapper to tell it to set a custom API key instead of BRAINTRUST_API_KEY
. But that wouldn't work if they hadn't defined OPENAI_API_KEY
at all.
We could change the API key functionality so that if openai.apiKey === process.env.OPENAI_API_KEY
, only then do we change it to BRAINTRUST_API_KEY
, otherwise we leave it intact? And we could add a flag keepApiKey?: boolean
, which if true, will not alter the API key at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see. I think this is fine as is then, because by default it'll just use their braintrust key.
Maybe instead of using the env variable though, we should use whatever key they used to login? Eg if someone does
logger = braintrust.initLogger({apiKey: myEnv.BRAINTRUST_KEY})
then the wrap call should use that, instead of process.env.BRAINTRUST_API_KEY
by default?
|
||
|
||
def wrap_openai(openai, proxy_base_url=PROXY_SENTINEL, proxy_api_key=PROXY_SENTINEL): | ||
def wrap_openai(openai, use_proxy: Union[None, str, bool] = None, api_key: Optional[str] = None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super nit: slightly easier to read as Union[None, bool, str]
Oops I didn't mean quite to "accept" -- it would be good to flesh out that remaining api key argument first. But looking really good |
This is the new default for when people call
wrap_openai
, though they can opt-into using the non-proxy wrappers by passingproxy_base_url=None
(or equivalent in javascript).Other changes to the SDK to facilitate this:
Add logic for serializing spans, so we can send the root span over to the proxy.
Add logic for serializing login info, so we can send that over to the proxy and instantiate a BackgroundLogger for the span using the serialized state. This requires parameterizing the BackgroundLogger on a given
logConn
, rather than using the global state connection.Change the
get_caller_location
logic to ignore files in the stacktrace that in a subdirectory of the first stacktrace entry, not just the same directory. This seems to slightly change a couple of the internal expect test stack traces, but the changes are not really important (getting rid of meaningless, internal nodeJS callsites).