# How to get log probabilities

:::info Prerequisites

This guide assumes familiarity with the following concepts:
- [Chat models](/docs/concepts/chat_models)
- [Tokens](/docs/concepts/tokens)

:::

Certain [chat models](/docs/concepts/chat_models/) can be configured to return token-level log probabilities representing the likelihood of a given token. This guide walks through how to get this information in LangChain.
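As a quick illustration of what these values mean, a log probability can be turned back into an ordinary probability by exponentiating it. The sketch below assumes the values are natural logarithms, and `logprob_to_prob` is a hypothetical helper name, not part of LangChain:

```python
import math


def logprob_to_prob(logprob: float) -> float:
    """Convert a natural-log probability into a probability between 0 and 1."""
    return math.exp(logprob)


# Values copied from the sample OpenAI output shown later in this guide.
samples = [("I'm", -0.04862971603870392), (" program", -0.9782673120498657)]

for token, logprob in samples:
    print(f"{token!r}: {logprob_to_prob(logprob):.2%}")
```

A logprob close to 0 means the model was nearly certain of the token, while more negative values indicate less likely tokens.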

## OpenAI

Install the `langchain-openai` integration package and set your OpenAI API key:

In [1]:
%pip install -qU langchain-openai

Note: you may need to restart the kernel to use updated packages.


In [1]:
import getpass
import os

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

For the OpenAI API to return log probabilities, set the `logprobs=True` parameter. The logprobs are then included on each output [`AIMessage`](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.AIMessage.html) as part of its `response_metadata`:

In [5]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini").bind(logprobs=True)

msg = llm.invoke([("human", "how are you today")])

msg.response_metadata["logprobs"]["content"][:5]

[{'token': "I'm",
  'bytes': [73, 39, 109],
  'logprob': -0.04862971603870392,
  'top_logprobs': []},
 {'token': ' just',
  'bytes': [32, 106, 117, 115, 116],
  'logprob': -0.045479487627744675,
  'top_logprobs': []},
 {'token': ' a',
  'bytes': [32, 97],
  'logprob': -7.696077227592468e-05,
  'top_logprobs': []},
 {'token': ' program',
  'bytes': [32, 112, 114, 111, 103, 114, 97, 109],
  'logprob': -0.9782673120498657,
  'top_logprobs': []},
 {'token': ',',
  'bytes': [44],
  'logprob': -1.8550976164988242e-05,
  'top_logprobs': []}]
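Per-token logprobs are often combined into a single score for the whole span. The following is a minimal sketch, assuming natural-log values; `summarize_logprobs` is a hypothetical helper, and the entries are copied from the output above with the unused fields dropped:

```python
import math

# Entries copied from the sample output above ("bytes" and "top_logprobs" omitted).
sample_content = [
    {"token": "I'm", "logprob": -0.04862971603870392},
    {"token": " just", "logprob": -0.045479487627744675},
    {"token": " a", "logprob": -7.696077227592468e-05},
    {"token": " program", "logprob": -0.9782673120498657},
    {"token": ",", "logprob": -1.8550976164988242e-05},
]


def summarize_logprobs(content):
    """Aggregate token-level logprobs into span-level statistics."""
    if not content:
        raise ValueError("content must contain at least one token entry")
    total = sum(entry["logprob"] for entry in content)
    return {
        "text": "".join(entry["token"] for entry in content),
        "total_logprob": total,  # log of the joint probability
        "joint_prob": math.exp(total),  # probability of the whole span
        "perplexity": math.exp(-total / len(content)),  # 1.0 means fully confident
    }


summary = summarize_logprobs(sample_content)
print(summary)
```

Summing logprobs rather than multiplying probabilities avoids numerical underflow on long outputs.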

Logprobs are also included in streamed message chunks:

In [4]:
ct = 0
full = None
for chunk in llm.stream([("human", "how are you today")]):
    if ct < 5:
        full = chunk if full is None else full + chunk
        if "logprobs" in full.response_metadata:
            print(full.response_metadata["logprobs"]["content"])
    else:
        break
    ct += 1

[]
[{'token': "I'm", 'bytes': [73, 39, 109], 'logprob': -0.06202784180641174, 'top_logprobs': []}]
[{'token': "I'm", 'bytes': [73, 39, 109], 'logprob': -0.06202784180641174, 'top_logprobs': []}, {'token': ' just', 'bytes': [32, 106, 117, 115, 116], 'logprob': -0.03149718791246414, 'top_logprobs': []}]
[{'token': "I'm", 'bytes': [73, 39, 109], 'logprob': -0.06202784180641174, 'top_logprobs': []}, {'token': ' just', 'bytes': [32, 106, 117, 115, 116], 'logprob': -0.03149718791246414, 'top_logprobs': []}, {'token': ' a', 'bytes': [32, 97], 'logprob': -9.555654105497524e-05, 'top_logprobs': []}]
[{'token': "I'm", 'bytes': [73, 39, 109], 'logprob': -0.06202784180641174, 'top_logprobs': []}, {'token': ' just', 'bytes': [32, 106, 117, 115, 116], 'logprob': -0.03149718791246414, 'top_logprobs': []}, {'token': ' a', 'bytes': [32, 97], 'logprob': -9.555654105497524e-05, 'top_logprobs': []}, {'token': ' computer', 'bytes': [32, 99, 111, 109, 112, 117, 116, 101, 114], 'logprob': -0.4784653484821319
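Because the loop above adds chunks together, each printed list is cumulative. If only the newly streamed tokens are of interest, one option is to compare successive snapshots. This is a sketch under the assumption that the accumulated list only ever grows by appending; `new_entries` is a hypothetical helper, and the snapshots below reuse values from the output above instead of calling the API:

```python
def new_entries(previous, current):
    """Return the entries appended to `current` since `previous` (append-only assumption)."""
    return current[len(previous):]


# Simulated accumulated snapshots, using tokens and logprobs from the output above.
first = {"token": "I'm", "logprob": -0.06202784180641174}
second = {"token": " just", "logprob": -0.03149718791246414}
snapshots = [[], [first], [first, second]]

seen = []
for snapshot in snapshots:
    # Print only what arrived since the previous snapshot.
    for entry in new_entries(seen, snapshot):
        print(repr(entry["token"]), entry["logprob"])
    seen = snapshot
```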

## Next steps

You've now learned how to get logprobs from OpenAI models in LangChain.

Next, check out the other how-to guides on chat models in this section, like [how to get a model to return structured output](/docs/how_to/structured_output) or [how to track token usage](/docs/how_to/chat_token_usage_tracking).

In [4]:
# Check if pandas is installed, and install if missing
try:
    import pandas as pd
except ImportError:
    import sys
    !{sys.executable} -m pip install pandas
    import pandas as pd

In [2]:
# Import required libraries
import pandas as pd
from langchain_openai import ChatOpenAI

# Create the LLM and get a response with logprobs
llm = ChatOpenAI(model="gpt-4o-mini").bind(logprobs=True)
msg = llm.invoke([("human", "It's a lovely day and I am thirsty. I will go to the ")])

# Get token probabilities
token_probs = msg.response_metadata["logprobs"]["content"][:7]  # Example: first 7 tokens

# Convert to DataFrame for tabular display
df = pd.DataFrame(token_probs)

# Display the DataFrame
df

|   | token | bytes | logprob | top_logprobs |
|---|-------|-------|---------|--------------|
| 0 | It's | [73, 116, 39, 115] | -2.322808 | [] |
| 1 | a | [32, 97] | -0.030191 | [] |
| 2 | lovely | [32, 108, 111, 118, 101, 108, 121] | -3e-06 | [] |
| 3 | day | [32, 100, 97, 121] | -0.000229 | [] |
| 4 | , | [44] | -1.332562 | [] |
| 5 | and | [32, 97, 110, 100] | -0.074767 | [] |
| 6 | I | [32, 73] | -0.053837 | [] |
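The tabular view hides leading whitespace in the `token` column, but the `bytes` column preserves it (byte 32 is a space). As a minimal sketch, the exact text can be recovered by decoding those bytes as UTF-8; `decode_token` is a hypothetical helper, and the byte lists are copied from the table above:

```python
def decode_token(byte_values):
    """Decode one token's UTF-8 bytes; invalid sequences become U+FFFD."""
    return bytes(byte_values).decode("utf-8", errors="replace")


# The "bytes" column from the table above, one list per token.
token_bytes = [
    [73, 116, 39, 115],
    [32, 97],
    [32, 108, 111, 118, 101, 108, 121],
    [32, 100, 97, 121],
    [44],
    [32, 97, 110, 100],
    [32, 73],
]

# Per-token decoding shows the exact token text, including leading spaces.
tokens = [decode_token(values) for values in token_bytes]
print([repr(token) for token in tokens])

# Concatenating all bytes before decoding is the safer way to rebuild full text,
# since a multi-byte character may be split across tokens.
text = bytes(b for values in token_bytes for b in values).decode("utf-8")
print(text)
```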
