
OpenAI async support #15

Open
bcordo opened this issue Jan 8, 2024 · 7 comments · Fixed by #16

Comments

@bcordo commented Jan 8, 2024

@ishanjainn

What happened?
I'm following this guide: https://grafana.com/blog/2023/11/02/monitor-your-openai-usage-with-grafana-cloud/. It works great with the normal synchronous OpenAI client (from openai import OpenAI; client = OpenAI(api_key=OPENAI_API_KEY)), but it doesn't work with the asynchronous client (from openai import AsyncOpenAI; aclient = AsyncOpenAI(api_key=OPENAI_API_KEY)).

I get the error:

Exception has occurred: AttributeError
'coroutine' object has no attribute 'usage'
  File ..., in async_call_chatgpt
    response = await aclient.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "", line 1, in
AttributeError: 'coroutine' object has no attribute 'usage'

It has to do with the fact that internally the wrapper isn't using await, so it isn't expecting a coroutine.
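
As a quick standalone illustration (names here are made up, not the library's code): calling an async function without await returns a coroutine object, which has no usage attribute until it is awaited.

import asyncio

async def create(**kwargs):
    # Stand-in for AsyncOpenAI's chat.completions.create
    return "response"

coro = create()           # no await: this is a coroutine object, not a response
print(type(coro))         # <class 'coroutine'>
# coro.usage              # would raise AttributeError, exactly as in the traceback
print(asyncio.run(coro))  # running/awaiting the coroutine yields the real value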

What was expected to happen?
I was expecting it to also work, or to have access to an async version of the monitoring code.

Steps to reproduce the problem:

  1. Run:

     from openai import AsyncOpenAI

     aclient = AsyncOpenAI(api_key=OPENAI_API_KEY)
     aclient.chat.completions.create = chat_v2.monitor(
         aclient.chat.completions.create,
         metrics_url="[redacted]",
         metrics_username=[redacted],
         logs_username=[redacted],
         access_token="[redacted]",
     )

  2. Run:

     response = await aclient.chat.completions.create(
         model=model,
         temperature=temperature,
         max_tokens=max_tokens,
         n=max_responses,
         top_p=top_p,
         frequency_penalty=frequency_penalty,
         presence_penalty=presence_penalty,
         messages=messages,
         stream=stream,
     )

  3. Get the error:

     Exception has occurred: AttributeError
     'coroutine' object has no attribute 'usage'
       File ..., in async_call_chatgpt
         response = await aclient.chat.completions.create(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       File "", line 1, in
     AttributeError: 'coroutine' object has no attribute 'usage'
  3. Get error:
    Exception has occurred: AttributeError
    'coroutine' object has no attribute 'usage'
    File ... in asynccall_chatgpt
    response = await aclient.chat.completions.create(
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "", line 1, in
    AttributeError: 'coroutine' object has no attribute 'usage'

Version numbers (grafana, prometheus, graphite, plugins, operating system, etc.):
Mac, grafana cloud, OpenAI plugin

@ishanjainn (Member) commented Jan 10, 2024

Hey @bcordo, thanks for raising this issue. I've added support for AsyncOpenAI; it should be available in Python library version 0.0.8. To use it with AsyncOpenAI, you'll now need to pass a flag to the function, use_async, and set its value to True. Here's a snippet:

from openai import OpenAI
import asyncio
from grafana_openai_monitoring import chat_v2

client = OpenAI(
    api_key="sk-***",
)

# Apply the custom decorator to the OpenAI API function
client.chat.completions.create = chat_v2.monitor(
    client.chat.completions.create,
    metrics_url="https://prometheus.grafana.net/api/prom",
    logs_url="https://logs.grafana.net/loki/api/v1/push",
    metrics_username=123456,
    logs_username=987654,
    access_token="glc_ey....",
    use_async=True,  # Set to True if the function is async
)

async def main() -> None:
    chat_completion = await client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "Say this is a test",
            }
        ],
        model="gpt-3.5-turbo",
    )

    print(chat_completion)

asyncio.run(main())

Please feel free to reopen the issue if you still run into problems.
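
Roughly, the flag selects an async-aware wrapper that awaits the response before reading usage. A simplified sketch of the idea (illustrative only, not the library's actual source; record_usage is a stand-in for the metrics/logs export):

def monitor(func, metrics_url=None, logs_url=None, metrics_username=None,
            logs_username=None, access_token=None, use_async=False):
    def record_usage(usage):
        # Stand-in for shipping metrics and logs to Grafana Cloud
        print("prompt tokens:", usage.prompt_tokens)

    if use_async:
        async def async_wrapper(*args, **kwargs):
            response = await func(*args, **kwargs)  # await first, then read .usage
            record_usage(response.usage)
            return response
        return async_wrapper

    def sync_wrapper(*args, **kwargs):
        response = func(*args, **kwargs)
        record_usage(response.usage)
        return response
    return sync_wrapper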

@bcordo (Author) commented Jan 10, 2024

Wow @ishanjainn that was super fast! I appreciate it, you're awesome. Thanks.

@bcordo (Author) commented Jan 10, 2024

Hi @ishanjainn. Thanks for your fast response. I ran your sample code:

import os
from openai import OpenAI
import asyncio
from grafana_openai_monitoring import chat_v2
from dotenv import load_dotenv

load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
GRAFANA_API_KEY = os.getenv("GRAFANA_API_KEY")
GRAFANA_METRICS_URL = os.getenv("GRAFANA_METRICS_URL")
GRAFANA_LOGS_URL = os.getenv("GRAFANA_LOGS_URL")
GRAFANA_METRICS_USERNAME = os.getenv("GRAFANA_METRICS_USERNAME")
GRAFANA_LOGS_USERNAME = os.getenv("GRAFANA_LOGS_USERNAME")

client = OpenAI(api_key=OPENAI_API_KEY)

# Apply the custom decorator to the OpenAI API client functions to measure usage
client.chat.completions.create = chat_v2.monitor(
    client.chat.completions.create,
    metrics_url=GRAFANA_METRICS_URL,
    logs_url=GRAFANA_LOGS_URL,
    metrics_username=GRAFANA_METRICS_USERNAME,
    logs_username=GRAFANA_LOGS_USERNAME,
    access_token=GRAFANA_API_KEY,
)

async def main() -> None:
    chat_completion = await client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "Say this is a test",
            }
        ],
        model="gpt-3.5-turbo",
    )

    print(chat_completion)

asyncio.run(main())

But I get the error:

Traceback (most recent call last):
  File "test.py", line 48, in <module>
    asyncio.run(main())
  File ".conda/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File ".conda/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".conda/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "test.py", line 36, in main
    chat_completion = await client.chat.completions.create(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: object ChatCompletion can't be used in 'await' expression

I thought there may have been a typo so I reran using AsyncOpenAI instead of OpenAI:

client = AsyncOpenAI(
    api_key=OPENAI_API_KEY,
)

And I get the error:

Traceback (most recent call last):
  File "test.py", line 48, in <module>
    asyncio.run(main())
  File ".conda/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File ".conda/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".conda/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "test.py", line 36, in main
    chat_completion = await client.chat.completions.create(
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".conda/lib/python3.11/site-packages/grafana_openai_monitoring/chat_v2.py", line 161, in wrapper
    response.usage.prompt_tokens,
    ^^^^^^^^^^^^^^
AttributeError: 'coroutine' object has no attribute 'usage'
sys:1: RuntimeWarning: coroutine 'AsyncCompletions.create' was never awaited

And just to make sure I was using the updated version, which seems right:

pip show grafana_openai_monitoring 
Name: grafana-openai-monitoring
Version: 0.0.8
Summary: Library to monitor your OpenAI usage and send metrics and logs to Grafana Cloud
Home-page: https://github.com/grafana/grafana-openai-monitoring
Author: Ishan Jain
Author-email: ishan.jain@grafana.com
License: 
Location: .conda/lib/python3.11/site-packages
Requires: requests

Maybe I am just missing something here.

@ishanjainn (Member) commented Jan 10, 2024

use_async=True

This seems to be missing.

@bcordo (Author) commented Jan 15, 2024

@ishanjainn Thanks again for the quick response. Yes, that worked, thanks. The remaining problem was in my main script, which still had the error even with use_async=True, and now I see the difference: I need to use the streaming API, and when you add stream=True you get the error. In detail:

import os
import asyncio

from openai import OpenAI
from grafana_openai_monitoring import chat_v2
from dotenv import load_dotenv

load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
GRAFANA_API_KEY = os.getenv("GRAFANA_API_KEY")
GRAFANA_METRICS_URL = os.getenv("GRAFANA_METRICS_URL")
GRAFANA_LOGS_URL = os.getenv("GRAFANA_LOGS_URL")
GRAFANA_METRICS_USERNAME = os.getenv("GRAFANA_METRICS_USERNAME")
GRAFANA_LOGS_USERNAME = os.getenv("GRAFANA_LOGS_USERNAME")

client = OpenAI(api_key=OPENAI_API_KEY)

# Apply the custom decorator to the OpenAI API client functions to measure usage
client.chat.completions.create = chat_v2.monitor(
    client.chat.completions.create,
    metrics_url=GRAFANA_METRICS_URL,
    logs_url=GRAFANA_LOGS_URL,
    metrics_username=GRAFANA_METRICS_USERNAME,
    logs_username=GRAFANA_LOGS_USERNAME,
    access_token=GRAFANA_API_KEY,
    use_async=True,
)

async def main() -> None:
    chat_completion = await client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "Say this is a test",
            }
        ],
        model="gpt-3.5-turbo",
        stream=True,
    )
    
    for chunk in chat_completion:
        current_content = chunk.choices[0].delta.content
        print(current_content)

asyncio.run(main())
which fails with:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "test.py", line 51, in <module>
    asyncio.run(main())
  File ".conda/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File ".conda/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".conda/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "test.py", line 37, in main
    chat_completion = await client.chat.completions.create(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".conda/lib/python3.11/site-packages/grafana_openai_monitoring/chat_v2.py", line 65, in async_wrapper
    response.usage.prompt_tokens,
    ^^^^^^^^^^^^^^
AttributeError: 'Stream' object has no attribute 'usage'

I guess the problem is the one reported in How_to_stream_completions.ipynb:

Another small drawback of streaming responses is that the response no longer includes the usage field to tell you how many tokens were consumed. After receiving and combining all of the responses, you can calculate this yourself using tiktoken.

Any good solutions to this? Thanks again.

ishanjainn reopened this Jan 15, 2024
@ishanjainn (Member) commented Jan 15, 2024

Hey @bcordo, for streaming responses it's more than token usage: the library doesn't yet support streaming at all, since streaming needs somewhat different processing. We have that tracked as an open issue, #13. I'll see what can be done for it.

For token calculation when streaming, yes, tiktoken is the way to go (although I don't think it's 100% accurate either, it gives a near-enough estimate).
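
For example, something along these lines once the streamed chunks have been collected (a sketch; chunks stands in for the delta.content values you accumulate, and the count is only an estimate):

import tiktoken

# Placeholder for the delta.content pieces accumulated from the stream
chunks = ["This ", "is ", "a ", "test."]
completion_text = "".join(chunks)

encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
completion_tokens = len(encoding.encode(completion_text))
print("estimated completion tokens:", completion_tokens)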

@ishanjainn (Member) commented

I've also reopened this issue while we get streaming implemented here. Thanks!
