## OpenAI Monitoring Examples

This updates our monitoring code to:
1. Be compatible with `openai>=1.0.0`
2. Patch the `openai` library in place (instead of forcing the user to import from our custom implementation)
3. Use a callback system to add new functionality before/after certain stages
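The patch-plus-callbacks idea in the list above can be sketched in a few lines. This is a minimal illustration only, not the implementation in `weave.monitoring.openai`: the `Callback` interface, `RecordCalls`, the `patch` helper, and `FakeCompletions` are all hypothetical names made up here so the example runs without network access.

```python
import functools
from typing import Any, Callable, List


class Callback:
    """Hypothetical callback interface: hooks that run around a patched call."""

    def before_call(self, kwargs: dict) -> None:
        pass

    def after_call(self, kwargs: dict, result: Any) -> None:
        pass


class RecordCalls(Callback):
    """Example callback that keeps inputs and outputs in memory."""

    def __init__(self) -> None:
        self.records: List[dict] = []

    def after_call(self, kwargs: dict, result: Any) -> None:
        self.records.append({"inputs": kwargs, "output": result})


def patch(owner: Any, name: str, callbacks: List[Callback]) -> Callable[[], None]:
    """Replace owner.<name> with a wrapper that runs the callbacks.

    Returns a function that restores the original attribute.
    """
    original = getattr(owner, name)

    @functools.wraps(original)
    def wrapper(**kwargs: Any) -> Any:
        for cb in callbacks:
            cb.before_call(kwargs)
        result = original(**kwargs)
        for cb in callbacks:
            cb.after_call(kwargs, result)
        return result

    setattr(owner, name, wrapper)

    def unpatch() -> None:
        setattr(owner, name, original)

    return unpatch


class FakeCompletions:
    """Stand-in for an API client (assumption: not the real openai client)."""

    def create(self, **kwargs: Any) -> str:
        return f"echo: {kwargs['messages'][0]['content']}"


completions = FakeCompletions()
recorder = RecordCalls()
unpatch = patch(completions, "create", [recorder])

# The caller's code is unchanged; the callbacks run around the call.
reply = completions.create(
    model="fake-model",
    messages=[{"role": "user", "content": "hi"}],
)

# After unpatching, calls no longer reach the callbacks.
unpatch()
completions.create(model="fake-model", messages=[{"role": "user", "content": "again"}])
```

The point of the design is that user code keeps calling the same function it always did, while monitoring behavior is attached and removed from the outside.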

In [1]:
import openai
import weave
from weave.monitoring.openai.patch import LogToStreamTable, ReassembleStream


### Setup

In [2]:
# TODO: Find a better way to let the user specify their own stream
weave.monitoring.openai.patch(
    callbacks=[
        ReassembleStream(),
        LogToStreamTable.from_stream_name("monitoring4", "openai", "megatruong"),
    ]
)
# weave.monitoring.openai.unpatch()


wandb: Patching OpenAI completions. To unpatch, call weave.monitoring.openai.patch.unpatch()


### Module-level Sync Completion

In [3]:
result = openai.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "system", "content": "Tell me a joke"}],
)


In [4]:
result


ChatCompletion(id='chatcmpl-8NmdF7OBwWCTo3poGgKGhpa2Kz5Ix', choices=[Choice(finish_reason='stop', index=0, message=ChatCompletionMessage(content="Sure, here's a classic one for you:\n\nWhy don't scientists trust atoms?\n\nBecause they make up everything!", role='assistant', function_call=None, tool_calls=None))], created=1700679177, model='gpt-3.5-turbo-0613', object='chat.completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=23, prompt_tokens=11, total_tokens=34))

### Sync Completion

In [5]:
client = openai.OpenAI()

result = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "system", "content": "Tell me a joke"}],
)


In [6]:
result


ChatCompletion(id='chatcmpl-8NmdG6rqiuOeHKmrwGgzak1qsRQUX', choices=[Choice(finish_reason='stop', index=0, message=ChatCompletionMessage(content="Sure, here's a classic one for you:\n\nWhy don't scientists trust atoms?\n\nBecause they make up everything!", role='assistant', function_call=None, tool_calls=None))], created=1700679178, model='gpt-3.5-turbo-0613', object='chat.completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=23, prompt_tokens=11, total_tokens=34))

### Sync Completion (Streaming)

In [7]:
client = openai.OpenAI()

stream = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "system", "content": "Tell me a joke"}],
    stream=True
)


In [8]:
# Iterate to exhaust the stream; each item is one chunk of the response
for chunk in stream:
    ...
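With `stream=True` the response arrives as a sequence of chunks rather than one message, so anything that wants to log the full reply has to stitch the pieces back together. The sketch below shows one way a callback such as `ReassembleStream` might do that; it is an assumption about the general technique, not weave's actual code. `FakeChunk`, `fake_stream`, and `reassemble` are hypothetical stand-ins so the example runs offline (in the real client the text fragment lives at `chunk.choices[0].delta.content`).

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass
class FakeChunk:
    """Hypothetical stand-in for one streamed chunk."""

    content: Optional[str]
    finish_reason: Optional[str] = None


def fake_stream() -> Iterator[FakeChunk]:
    """Yield a reply in pieces, ending with a content-free final chunk."""
    yield FakeChunk("Why don't ")
    yield FakeChunk("scientists trust atoms?")
    yield FakeChunk(None, finish_reason="stop")


def reassemble(chunks: Iterable[FakeChunk]) -> dict:
    """Join the text fragments and keep the last finish reason seen."""
    parts: List[str] = []
    finish_reason = None
    for chunk in chunks:
        if chunk.content is not None:
            parts.append(chunk.content)
        if chunk.finish_reason is not None:
            finish_reason = chunk.finish_reason
    return {"content": "".join(parts), "finish_reason": finish_reason}


message = reassemble(fake_stream())
```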


### Async Completion

In [9]:
client = openai.AsyncOpenAI()

result = await client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "system", "content": "Tell me a joke"}],
)


In [10]:
result


ChatCompletion(id='chatcmpl-8NmdIxsXqb8gSA0VXB1WPaEN3OxW2', choices=[Choice(finish_reason='stop', index=0, message=ChatCompletionMessage(content="Why don't scientists trust atoms?\n\nBecause they make up everything!", role='assistant', function_call=None, tool_calls=None))], created=1700679180, model='gpt-3.5-turbo-0613', object='chat.completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=13, prompt_tokens=11, total_tokens=24))

### Async Completion (Streaming)

In [11]:
client = openai.AsyncOpenAI()

stream = await client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "system", "content": "Tell me a joke"}],
    stream=True
)


In [12]:
# Iterate to exhaust the async stream; each item is one chunk of the response
async for chunk in stream:
    ...
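For the async case, a monitoring layer can wrap the stream so that chunks still reach the caller as they arrive while a copy is collected for logging. The following is a minimal sketch under that assumption, not weave's implementation: `fake_async_stream`, `tee_stream`, and the `on_complete` hook are hypothetical names, and plain strings stand in for chunk objects.

```python
import asyncio
from typing import AsyncIterator, Callable, List


async def fake_async_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for an async streamed response."""
    for piece in ["Because they ", "make up ", "everything!"]:
        await asyncio.sleep(0)  # yield control, as a network read would
        yield piece


async def tee_stream(
    stream: AsyncIterator[str],
    on_complete: Callable[[str], None],
) -> AsyncIterator[str]:
    """Pass pieces through unchanged, then report the joined text once."""
    parts: List[str] = []
    async for piece in stream:
        parts.append(piece)
        yield piece
    # Runs only after the caller has consumed the whole stream.
    on_complete("".join(parts))


async def main() -> tuple:
    logged: List[str] = []
    seen: List[str] = []
    async for piece in tee_stream(fake_async_stream(), logged.append):
        seen.append(piece)
    return seen, logged


seen, logged = asyncio.run(main())
```

In this sketch the completion hook only fires once the stream is exhausted, which is one reason to iterate to the end as the cell above does. Inside a notebook, where an event loop is already running, use `await main()` instead of `asyncio.run(main())`.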
