
bug: ValueError: Usage object must have either {input, output, total, unit} or {promptTokens, completionTokens, totalTokens} #1249

Closed
pavelm10 opened this issue Feb 25, 2024 · 12 comments
@pavelm10

Describe the bug

When using the Langfuse callback handler with the OpenAI class (from langchain_openai import OpenAI), tracking fails. When using ChatOpenAI (from langchain_openai import ChatOpenAI), tracking works.

traceback:

Traceback (most recent call last):
  File "/app/.venv/lib/python3.11/site-packages/langfuse/client.py", line 1125, in update
    "usage": _convert_usage_input(usage) if usage is not None else None,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.11/site-packages/langfuse/utils/__init__.py", line 100, in _convert_usage_input
    raise ValueError(
ValueError: Usage object must have either {input, output, total, unit} or {promptTokens, completionTokens, totalTokens}
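
For reference, a minimal sketch of the two usage shapes the error message refers to (the field values are illustrative, not from the original report):

# Langfuse-style usage object
usage_generic = {"input": 10, "output": 5, "total": 15, "unit": "TOKENS"}

# OpenAI/Langchain-style usage object
usage_openai = {"promptTokens": 10, "completionTokens": 5, "totalTokens": 15}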

To reproduce

How we use it:

from operator import itemgetter

from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import Runnable
from langchain_openai import OpenAI
from langfuse.callback import CallbackHandler

# prompt, langfuse_public_key, langfuse_secret_key, and user_id are defined elsewhere
callbacks = []
lf_handler = CallbackHandler(
    langfuse_public_key,
    langfuse_secret_key,
    user_id=user_id,
)
callbacks.append(lf_handler)

chain: Runnable = (
    {
        "text": itemgetter("text"),
        "change_instruction": itemgetter("change_instruction"),
    }
    | prompt
    | OpenAI(
        model="some-instruct-model-name",
        temperature=0,
        api_key="some-api-key",
        callbacks=callbacks,
        max_retries=3,
        timeout=30,
    )
    | StrOutputParser()
)

Is the error expected because we are using Langfuse incorrectly with the OpenAI class? Note that we cannot use the ChatOpenAI class in our use case, as we need a completion (instruct) model.

Additional information

python 3.11
poetry 1.7.1
langfuse 2.16.2
langchain 0.1.9
langchain-openai 0.0.6

@maxdeichmann
Member

@pavelm10 thanks for raising this issue! Does it only happen with "some-instruct-model-name"? Which model exactly are you using?

@maxdeichmann
Member

I am trying to reproduce the issue here: https://github.com/langfuse/langfuse-python/pull/413/files

@pavelm10
Author

@maxdeichmann I am using gpt-3.5-turbo-instruct. I will try to reproduce your test as well and come back to you.

@pavelm10
Author

@maxdeichmann I tried your test and it indeed does not produce the traceback. However, our use case passes a list of inputs to the chain's batch() method - sorry for not including that in the original description.

input_list = [
    {"question": "where did harrison work", "language": "english"},
    {"question": "how is your day", "language": "english"},
]
runnable_chain.batch(input_list)

@maxdeichmann
Member


Thanks for the clarification - the batch() example helped me reproduce. I will ship a fix shortly.

@maxdeichmann
Member

@pavelm10 this is an issue in Langchain, but I added a workaround on our end to ensure token counts are correct for whole LLM chains. You will see the total sum of tokens on the first Generation in Langfuse; all subsequent ones will show 0 tokens.

Background:
For batches, Langchain sends the sum of all tokens in the first event; all subsequent events send an empty dict. The workaround ensures that we do not calculate tokens on our end for the empty dicts, which would result in wrong numbers for our users.

https://github.com/langfuse/langfuse-python/pull/413/files#diff-d9c0f4dec8a45fe6c408de5493f7580c8ad07ec7e251957afc57888405f35c8eR671-R673
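
A rough sketch of what that workaround amounts to (hypothetical helper, not the actual code from the PR):

def _usage_or_none(usage: dict):
    # Langchain's batch() reports the summed token usage only on the first
    # generation; all later generations arrive with an empty usage dict.
    # Treat an empty dict as "no usage" so Langfuse does not compute token
    # counts for it, which would otherwise show up as wrong totals.
    if not usage:
        return None
    return usage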

@istandleet

istandleet commented Feb 29, 2024

@maxdeichmann I believe I am running into the same issue. I tracked it down to https://github.com/langfuse/langfuse-python/blame/main/langfuse/utils/__init__.py#L55. In particular, the object we are passing in from langchain/OpenAI is such that dict(usage) yields exactly what you want, whereas usage.__dict__ yields a wrapper dict with a key called _previous that holds the dict you want. (Left untouched, the object also behaves as you want with respect to key lookup.)
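
To illustrate the difference, a small self-contained sketch with a hypothetical stand-in for the OpenAIObject wrapper described above:

class OpenAIObjectLike:
    # Hypothetical stand-in: the real values live in an internal dict.
    def __init__(self, data):
        self._previous = data

    def __iter__(self):
        # dict(obj) consumes this iterable of (key, value) pairs
        return iter(self._previous.items())

    def __getitem__(self, key):
        # plain key lookup also reaches the wrapped dict
        return self._previous[key]

usage = OpenAIObjectLike({"promptTokens": 10, "completionTokens": 5, "totalTokens": 15})
print(dict(usage))     # {'promptTokens': 10, 'completionTokens': 5, 'totalTokens': 15}
print(usage.__dict__)  # {'_previous': {'promptTokens': 10, ...}} - the wrapper, not the payload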

@maxdeichmann
Member

Ok, do you have more details, such as a screenshot? I get an empty object from Langchain even before it reaches Langfuse and the code line you shared.

@istandleet

[screenshot of the usage object attached]

Hopefully this helps! It appears to be something called an OpenAIObject.

@maxdeichmann
Member

@istandleet this release should fix your issue. Let me know if you run into further issues. Regarding the tokenisation, please create an issue with Langchain; unfortunately, they only send tokens on the first generation.

@pavelm10
Author

pavelm10 commented Mar 2, 2024

@maxdeichmann thanks for the quick fix on your end.

@gautamcrhythmx

gautamcrhythmx commented Mar 12, 2024

@maxdeichmann I am still facing this issue. As @istandleet pointed out, calling __dict__ at this point causes the wrong object to be passed on. Can we change it to dict(usage)?
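
Sketched against the linked line in _convert_usage_input, the proposed change would look roughly like this (variable name is hypothetical, not the actual source):

# current behavior (roughly): the wrapper leaks through
usage_dict = usage.__dict__   # {'_previous': {...}} - payload hidden one level down

# proposed: go through the dict protocol instead
usage_dict = dict(usage)      # yields the token fields directly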
