---------------------------
#### Parameters of OpenAI LLMs: log probabilities (`logprobs`)
-------------------------------

In [1]:
import os
import openai

In [2]:
openai.__version__

'1.45.1'

#### Logprobs Attribute Summary

| Logprobs | Description                                                                                              | Example                           |
|----------|----------------------------------------------------------------------------------------------------------|-----------------------------------|
| None     | Indicates that there are no associated log probabilities provided for the completion.                   | "logprobs": None                  |
| {}       | Represents an empty dictionary, meaning no log probabilities were calculated for this response.         | "logprobs": {}                    |
| {...}    | Contains a dictionary of log probabilities for each token in the generated text, if applicable.         | "logprobs": {"tokens": [...], "token_logprobs": [...], "top_logprobs": [...], "text_offset": [...] } |

#### Use Cases for the Logprobs Attribute with Example Prompts

| Use Case                            | Prompt                                      | Output                                | Logprobs                                                                                                                                                                       | Interpretation                                                                                              |
|-------------------------------------|---------------------------------------------|---------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------|
| Analyzing Model Confidence           | "What is the capital of France?"           | "Paris"                               | {"tokens": ["Paris"], "token_logprobs": [-0.1]}                                                                                                                             | A log probability of -0.1 indicates high confidence in "Paris" as the correct answer.                      |
| Understanding Alternatives           | "The sky is usually..."                    | "blue"                                | {"top_logprobs": [{"blue": -0.5}, {"gray": -1.2}, {"green": -1.5}]}                                                                                                         | The model strongly preferred "blue," with "gray" and "green" being less likely alternatives.                |
| Improving Sampling Strategies        | "A great place to relax is..."             | "the beach"                           | {"tokens": ["the", "beach"], "token_logprobs": [-0.3, -1.0], "top_logprobs": [{"the beach": -0.3}, {"a park": -1.5}, {"a forest": -1.8}]}                                 | "the beach" scores higher than the alternatives "a park" and "a forest"; the size of such gaps shows how often sampling would pick a different continuation. |
| Evaluating Output Quality            | "What is 2 + 2?"                           | "4"                                   | {"tokens": ["4"], "token_logprobs": [-0.2]}                                                                                                                                 | A log probability close to 0 for "4" means the model assigned it a high probability (about 0.82), which suggests a reliable answer. |
| Training and Fine-tuning Insights   | "The cat chased the..."                    | "mouse"                               | {"tokens": ["mouse"], "token_logprobs": [-2.0]}                                                                                                                             | A low log probability (about 0.14 as a probability) indicates the model only weakly associated "cat" with "mouse," which may signal a training gap. |
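
The interpretations in the table rest on converting a log probability back to a probability with the exponential function. A minimal sketch, assuming the illustrative values from the table (they are not real API output):

```python
import math

# Illustrative log probabilities taken from the table above (not real API output).
examples = {"Paris": -0.1, "4": -0.2, "mouse": -2.0}

# exp(logprob) recovers the probability the model assigned to the token.
probabilities = {token: math.exp(lp) for token, lp in examples.items()}

for token, p in probabilities.items():
    print(f"{token!r}: logprob={examples[token]:.1f} -> probability={p:.3f}")
```

Values near 0 map to probabilities near 1, while more negative values map to probabilities near 0.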


In [3]:
# Example
{
  "choices": [
    {
      "text": "Hello, world!",
      "logprobs": {
        "tokens"        : ["Hello", ",", "world", "!"],
        "token_logprobs": [-0.1, -0.2, -0.3, -0.4],
        "top_logprobs"  : [...]
      }
    }
  ]
}


{'choices': [{'text': 'Hello, world!',
   'logprobs': {'tokens': ['Hello', ',', 'world', '!'],
    'token_logprobs': [-0.1, -0.2, -0.3, -0.4],
    'top_logprobs': [Ellipsis]}}]}

In [4]:
# Example
{
  "choices": [
    {
      "text": "Hello, world!",
      "logprobs": {
        "tokens"        : ["Hello", ",", "world", "!"],
        "token_logprobs": [-0.1, -0.2, -0.3, -0.4],
        "top_logprobs": [
          {"Hello": -0.1, "Hi": -0.15, "Hey": -0.2},
          {",": -0.2, ".": -0.3, "-": -0.35},
          {"world": -0.3, "earth": -0.35, "globe": -0.4},
          {"!": -0.4, ".": -0.5, "?": -0.55}
        ]
      }
    }
  ]
}

{'choices': [{'text': 'Hello, world!',
   'logprobs': {'tokens': ['Hello', ',', 'world', '!'],
    'token_logprobs': [-0.1, -0.2, -0.3, -0.4],
    'top_logprobs': [{'Hello': -0.1, 'Hi': -0.15, 'Hey': -0.2},
     {',': -0.2, '.': -0.3, '-': -0.35},
     {'world': -0.3, 'earth': -0.35, 'globe': -0.4},
     {'!': -0.4, '.': -0.5, '?': -0.55}]}}]}
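
One way to read a `top_logprobs` list like the one above is to rank the alternatives at each position. The sketch below assumes the same made-up example data; `rank_alternatives` is a hypothetical helper, not part of the OpenAI library.

```python
# Example data copied from the cell above (illustrative values, not real API output).
logprobs = {
    "tokens": ["Hello", ",", "world", "!"],
    "token_logprobs": [-0.1, -0.2, -0.3, -0.4],
    "top_logprobs": [
        {"Hello": -0.1, "Hi": -0.15, "Hey": -0.2},
        {",": -0.2, ".": -0.3, "-": -0.35},
        {"world": -0.3, "earth": -0.35, "globe": -0.4},
        {"!": -0.4, ".": -0.5, "?": -0.55},
    ],
}


def rank_alternatives(alternatives):
    """Return (token, logprob) pairs sorted from most to least likely."""
    return sorted(alternatives.items(), key=lambda item: item[1], reverse=True)


ranked_per_position = [rank_alternatives(alts) for alts in logprobs["top_logprobs"]]

for token, ranked in zip(logprobs["tokens"], ranked_per_position):
    best_token, best_logprob = ranked[0]
    print(f"chosen={token!r} best={best_token!r} ({best_logprob}) others={ranked[1:]}")
```

In this example the chosen token is also the top-ranked alternative at every position; with sampling enabled, that need not be the case.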

In [5]:
from openai import OpenAI

In [6]:
client = OpenAI(
    # Read the key from the environment; never hard-code a secret key in a notebook.
    api_key=os.environ.get("OPENAI_API_KEY")
)

In [17]:
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant with master-level general knowledge."},
        {"role": "user",   "content": '''What is the capital of Punjab state in India? Only return the name'''},
    ],
    logprobs = True
)

In [18]:
# view logprobs
response.to_dict()['choices'][0]

{'finish_reason': 'stop',
 'index': 0,
 'logprobs': {'content': [{'token': 'Ch',
    'bytes': [67, 104],
    'logprob': -1.735894e-05,
    'top_logprobs': []},
   {'token': 'and',
    'bytes': [97, 110, 100],
    'logprob': -4.3202e-07,
    'top_logprobs': []},
   {'token': 'igarh',
    'bytes': [105, 103, 97, 114, 104],
    'logprob': -4.9617593e-06,
    'top_logprobs': []}],
  'refusal': None},
 'message': {'content': 'Chandigarh', 'refusal': None, 'role': 'assistant'}}

In [19]:
response.to_dict()['choices'][0]['logprobs']

{'content': [{'token': 'Ch',
   'bytes': [67, 104],
   'logprob': -1.735894e-05,
   'top_logprobs': []},
  {'token': 'and',
   'bytes': [97, 110, 100],
   'logprob': -4.3202e-07,
   'top_logprobs': []},
  {'token': 'igarh',
   'bytes': [105, 103, 97, 114, 104],
   'logprob': -4.9617593e-06,
   'top_logprobs': []}],
 'refusal': None}
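
Each entry under `content` carries the token text, its UTF-8 bytes, and its log probability. `top_logprobs` is most likely empty here because the request only passes `logprobs=True`; the Chat Completions API also accepts an integer `top_logprobs` parameter to request alternatives. As a rough sketch, the `bytes` fields can be joined to rebuild the answer text, which is the safer route when a multi-byte character is split across tokens:

```python
# Token entries copied from the response shown above.
content = [
    {"token": "Ch", "bytes": [67, 104], "logprob": -1.735894e-05},
    {"token": "and", "bytes": [97, 110, 100], "logprob": -4.3202e-07},
    {"token": "igarh", "bytes": [105, 103, 97, 114, 104], "logprob": -4.9617593e-06},
]

# Concatenate the raw bytes of every token, then decode once as UTF-8.
all_bytes = bytes(b for entry in content for b in entry["bytes"])
decoded_text = all_bytes.decode("utf-8")

print(decoded_text)
```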

In [10]:
# import matplotlib.pyplot as plt
# import math

# # Manually create log-probabilities (from -10 to 0)
# log_probs = [-10 + i * (10 / 99) for i in range(100)]  # 100 points between -10 and 0

# # Calculate the corresponding probabilities (confidence)
# probs = [math.exp(lp) for lp in log_probs]

# # Plotting the relationship between log-probabilities and probabilities
# plt.figure(figsize=(6, 4))
# plt.plot(log_probs, probs, label='Confidence (Probability)', color='b')

# # Adding labels and title
# plt.title('Correlation Between Log-prob and Probability (Confidence)')
# plt.xlabel('Log-probability')
# plt.ylabel('Probability (Confidence)')

# # Show grid
# plt.grid(True)

# # Show the plot
# plt.legend()
# plt.show()


#### Why use log probabilities

- `logprobs` stands for log probabilities: the natural logarithm of the probability the model assigned to each generated token.
- These values represent how likely the model considers a token to come next. They are always less than or equal to 0, and values closer to 0 indicate more probable tokens.
- Log probabilities are often preferred because they are more numerically stable than raw probabilities, especially when many very small probability values are multiplied together.
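
The numerical-stability point can be illustrated with a small sketch. The sequence below is hypothetical (500 tokens, each with probability 0.01); it only shows that the product of raw probabilities underflows while the sum of logs does not.

```python
import math

# Hypothetical sequence: 500 tokens, each assigned probability 0.01.
token_probs = [0.01] * 500

# Multiplying raw probabilities underflows to 0.0 in double precision.
product = 1.0
for p in token_probs:
    product *= p

# Summing log probabilities stays well within floating-point range.
log_sum = sum(math.log(p) for p in token_probs)

print(product)
print(log_sum)
```

The true product is 1e-1000, far below the smallest positive double, so it collapses to zero, whereas the log sum (about -2302.6) is represented without difficulty.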

**Cumulative log probability and probability of the full answer**

In [20]:
import numpy as np

In [21]:
def calculate_final_logprob_and_prob(data):
    # Extract log-probabilities from the input data
    logprobs_content = data['logprobs']['content']
    
    # Initialize cumulative log-probability
    cumulative_logprob = 0
    
    # Iterate over each token and sum the log-probabilities
    for token_info in logprobs_content:
        cumulative_logprob += token_info['logprob']
    
    # Calculate the final probability (confidence)
    final_prob = np.exp(cumulative_logprob)
    
    return cumulative_logprob, final_prob

In [22]:
# Calculate and print the final log-probability and probability
final_logprob, final_prob = calculate_final_logprob_and_prob(response.to_dict()['choices'][0])
print(f"The cumulative log-probability for 'Chandigarh' is: {final_logprob}")
print(f"The probability (confidence) for 'Chandigarh' is: {final_prob}")

The cumulative log-probability for 'Chandigarh' is: -2.27527193e-05
The probability (confidence) for 'Chandigarh' is: 0.9999772475395412
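
The function sums per-token log probabilities because the probability of the whole answer is the product of the per-token probabilities, each conditioned on the tokens before it. Writing $p_i$ for the probability of token $i$ (notation introduced here for illustration), the three tokens in the response shown earlier give:

$$
\log P(\text{answer}) = \sum_{i=1}^{n} \log p_i = (-1.735894\times10^{-5}) + (-4.3202\times10^{-7}) + (-4.9617593\times10^{-6}) = -2.27527193\times10^{-5}
$$

$$
P(\text{answer}) = \exp\!\left(-2.27527193\times10^{-5}\right) \approx 0.9999772
$$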


#### Exercises

In [23]:
# text retrieved
text_retrieved = """Augusta Ada King, Countess of Lovelace (née Byron; 10 December 1815 – 27 November 1852) was an 
English mathematician and writer, chiefly known for her work on Charles Babbage's proposed mechanical general-purpose 
computer, the Analytical Engine. She was the first to recognise that the machine had applications beyond pure 
calculation.
Ada Byron was the only legitimate child of poet Lord Byron and reformer Lady Byron. All Lovelace's half-siblings, 
Lord Byron's other children, were born out of wedlock to other women. Byron separated from his wife a month after 
Ada was born and left England forever. He died in Greece when Ada was eight. Her mother was anxious about her 
upbringing and promoted Ada's interest in mathematics and logic in an effort to prevent her from developing her 
father's perceived insanity. Despite this, Ada remained interested in him, naming her two sons Byron and Gordon. 

Upon her death, she was buried next to him at her request. Although often ill in her childhood, Ada pursued her studies 
assiduously. She married William King in 1835. King was made Earl of Lovelace in 1838, Ada thereby becoming Countess 
of Lovelace.
Her educational and social exploits brought her into contact with scientists such as Andrew Crosse, Charles Babbage,
Sir David Brewster, Charles Wheatstone, Michael Faraday, and the author Charles Dickens, contacts which she used to 
further her education. Ada described her approach as "poetical science" and herself as an "Analyst (& Metaphysician)".
When she was eighteen, her mathematical talents led her to a long working relationship and friendship with fellow 
British mathematician Charles Babbage, who is known as "the father of computers". She was in particular interested 
in Babbage's work on the Analytical Engine. Lovelace first met him in June 1833, through their mutual friend, and 
her private tutor, Mary Somerville.
Between 1842 and 1843, Ada translated an article by the military engineer Luigi Menabrea (later Prime Minister of Italy) 
about the Analytical Engine, supplementing it with an elaborate set of seven notes, simply called "Notes".
Lovelace's notes are important in the early history of computers, especially since the seventh one contained 
what many consider to be the first computer program—that is, an algorithm designed to be carried out by a machine. 
Other historians reject this perspective and point out that Babbage's personal notes from the years 1836/1837 
contain the first programs for the engine. She also developed a vision of the capability of computers to go 
beyond mere calculating or number-crunching, while many others, including Babbage himself, focused only on those capabilities. Her mindset of "poetical science" led her to ask questions about the Analytical Engine (as shown in her notes) examining how individuals and society relate to technology as a collaborative tool.
"""

In [27]:
easy_questions = [
    "Was Ada Lovelace known for her work on Charles Babbage's Analytical Engine?",
    "Was Ada Lovelace buried next to her father, Lord Byron, at her request ?",
]

diff_questions = [
    "   ",
    "   ",  
]

In [28]:
prompt = """You retrieved this article: {article}. The question is: {question}.
Before even answering the question, consider whether you have sufficient information in the article to answer the question fully.
Your output should be JUST the boolean true or false, indicating whether you have sufficient information in the article to answer the question.
Respond with just one word, the boolean true or false. You must output the word 'True', or the word 'False', nothing else.
"""

In [29]:
# Loop through the questions.
# For each question, send a chat completion request to the OpenAI model,
# then print the log probability of each returned token.
for question in easy_questions:
    
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "user", "content": prompt.format(article=text_retrieved, question=question)}
        ],
        logprobs = True,
    )

    print('Qs : ', question)
    
    for logprob in response.choices[0].logprobs.content:

        print(f'\tToken : {logprob.token}, logprobs : {logprob.logprob},  probability (%): {np.round(np.exp(logprob.logprob)*100,4)}')

Qs :  Was Ada Lovelace known for her work on Charles Babbage's Analytical Engine?
	Token : True, logprobs : -4.3202e-07,  probability (%): 100.0
Qs :  Was Ada Lovelace buried next to her father, Lord Byron, at her request ?
	Token : True, logprobs : -4.3202e-07,  probability (%): 100.0
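
A one-token True/False reply can be turned into a decision plus a confidence score. The sketch below is one possible approach: `answer_confidence` and its `threshold` default are hypothetical, chosen only for illustration, and the input value is copied from the output above.

```python
import math


def answer_confidence(token, logprob, threshold=0.9):
    """Map a one-token True/False reply to (answer, probability, trusted).

    `threshold` is a hypothetical cut-off, not a recommended value.
    """
    probability = math.exp(logprob)
    answer = token.strip().lower() == "true"
    return answer, probability, probability >= threshold


# Token and log probability copied from the output above.
answer, probability, trusted = answer_confidence("True", -4.3202e-07)
print(answer, round(probability, 6), trusted)
```

Replies whose probability falls below the threshold could be routed to a fallback, such as retrieving more context before answering.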
