# Prompting LLMs

There are three ways to prompt LLMs and get a response:

1. **User Interfaces (UI)**: exchanging messages with a chatbot through a dedicated website. Either free or with a monthly subscription.
2. **APIs**: sending messages with a bit more control via APIs and getting a reply. Charged per token.
3. **Running Locally**: downloading open-source models and running them with full control on a machine. Free.

Below is a summary of some commonly used models:

|Model Family|Developed by|User Interface|Open Source?|
|--|--|--|--|
|GPT|OpenAI|[chatgpt.com](https://chatgpt.com)| Only GPT-OSS|
|Gemini|Google|[gemini.google.com](https://gemini.google.com/app)| No |
|Llama|Meta|[meta.ai](https://www.meta.ai)| Yes |
|Qwen|Alibaba|[chat.qwen.ai](https://chat.qwen.ai)| Yes |
|Mixtral|Mistral|[chat.mistral.ai](https://chat.mistral.ai)| Yes |

# 1 - Interacting with GPT via API

Calling OpenAI's API is simple. After adding $5 to the account and generating an API key, the key is stored in a `.env` file rather than hard-coded in the notebook, so it is not exposed when the code is shared. The key is then loaded into the environment via `load_dotenv()`, and the system and user prompts are crafted and sent to the API.
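
A missing or empty key otherwise only surfaces when the first request fails, so it can help to check for it up front. Below is a minimal sketch, assuming the key is stored under `OPENAI_API_KEY` (the environment variable the `openai` client reads by default). The helper name `require_api_key` is hypothetical and not part of any library; it only inspects a mapping, so it can be tried without calling the API.

```python
import os


def require_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key stored under `name`, or raise if it is missing.

    Illustrative helper only; it is not part of the openai package.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Add a line like {name}=<your key> to the .env file."
        )
    return key


# Example with an in-memory mapping instead of the real environment
print(require_api_key({"OPENAI_API_KEY": "sk-example"}))
```

In the notebook, such a check would sit between `load_dotenv(override=True)` and `OpenAI()`.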

In [83]:
from dotenv import load_dotenv
from openai import OpenAI
import os

# Load OpenAI API keys from .env
load_dotenv(override=True)
openai = OpenAI()

# Craft the message that will be sent to the API
messages = [

    # System prompt
    {'role': 'system', 
     'content': """
         You are a science expert that can simplify concepts really well.
         Address the user's questions and explain like he is 5 years old.
         Make the explanations short and intuitive, preferably using animals as analogies.
         """},
    
    # User prompt
    {'role': 'user', 
     'content': 'What is a large language model?'}
    
]

# Pick arguments (temperature, model etc) and pass the message to get a response
response = openai.chat.completions.create(

    # choose model
    model="gpt-4o-mini",

    # temperature (must be <= 2)
    temperature=1.5,

    # pass messages (user + system prompt)
    messages=messages
)

# Print response
print('GPT-4o-Mini:\n')
print(response.choices[0].message.content)

GPT-4o-Mini:

Imagine a big library filled with lots of books about all kinds of animals. Now, pretend there's a friendly robot parrot who really loves to listen to the stories from those books. This robot learns tons of words and stories until it gets really good at chatting! 

A large language model is like that parrot. It listens to lots and lots of writing (like reading all those books) so it can put together sentences and talk to you about anything – like asking about your favorite animals or telling you a fun story!
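
The triple-quoted system prompt in the cell above is sent with its leading indentation and surrounding line breaks intact. Below is a minimal sketch of a helper that strips that whitespace while building the message list. The name `build_messages` is hypothetical, and the shortened prompt is only for illustration.

```python
from textwrap import dedent


def build_messages(system_prompt, user_prompt):
    """Build a two-message list: system prompt first, then user prompt."""
    return [
        # dedent removes the common indentation, strip removes outer blank lines
        {"role": "system", "content": dedent(system_prompt).strip()},
        {"role": "user", "content": user_prompt.strip()},
    ]


messages = build_messages(
    """
    You are a science expert.
    Explain concepts as if the user were 5 years old.
    """,
    "What is a large language model?",
)

for message in messages:
    print(f"{message['role']}: {message['content']!r}")
```

The resulting list has the same shape as the `messages` argument passed to `openai.chat.completions.create` above.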


___

It's also possible to get the log-probabilities of the top-k candidates for each generated token by setting the argument _logprobs_ to `True` and passing k via _top_logprobs_.

In [103]:
# Pick arguments (temperature, model etc) and pass the message to get a response
response = openai.chat.completions.create(

    # choose model
    model="gpt-4o-mini",

    # temperature (must be <= 2)
    temperature=0.01,

    # pass messages (user + system prompt)
    messages= [{'role': 'user', 
               'content': 'Who came up with the theory of General Relativity? Give me just the name.'}],

    # will return log-probabilities
    logprobs=True,

    # will return the log-probs for the 5 most likely alternatives for each token generated
    top_logprobs=5,

    # max tokens in the response
    max_tokens=5
)

# Print response
print('GPT-4o-Mini:\n')
print(response.choices[0].message.content)

GPT-4o-Mini:

Albert Einstein.


In [104]:
for token_info in response.choices[0].logprobs.content:
    chosen_token = token_info.token
    
    # show top-k alternatives
    for n, alt in enumerate(token_info.top_logprobs):
        token = alt.token
        logprob = alt.logprob
        
        print(f"{n+1}:    {token!r:15} (log prob. = {logprob:0.1f})")
    print(f"Chosen token: {chosen_token!r}")
    print(" \n")

1:    'Albert'        (log prob. = -0.0)
2:    'Ein'           (log prob. = -11.3)
3:    'Al'            (log prob. = -16.0)
4:    ' Albert'       (log prob. = -17.8)
5:    'Isa'           (log prob. = -19.0)
Chosen token: 'Albert'
 

1:    ' Einstein'     (log prob. = 0.0)
2:    'Ein'           (log prob. = -18.6)
3:    ' Ein'          (log prob. = -20.5)
4:    ' ein'          (log prob. = -21.6)
5:    ' EIN'          (log prob. = -21.9)
Chosen token: ' Einstein'
 

1:    '.'             (log prob. = -0.0)
2:    '<|end|>'       (log prob. = -4.4)
3:    '。'             (log prob. = -14.5)
4:    '<|end|>'       (log prob. = -14.8)
5:    '.\n'           (log prob. = -14.9)
Chosen token: '.'
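
Log-probabilities are easier to read once converted back to probabilities. The sketch below assumes the values are natural logarithms, so `p = exp(logprob)`, and uses three values copied from the third token of the output above. Those values are rounded to one decimal in the printout, so the resulting probabilities are approximate and need not sum to exactly 1. The helper name `to_probability` is made up for this example.

```python
import math


def to_probability(logprob):
    """Convert a natural-log probability back to a probability in [0, 1]."""
    return math.exp(logprob)


# Values copied from the output above (rounded to one decimal there)
candidates = {".": -0.0, "<|end|>": -4.4, "。": -14.5}

for token, logprob in candidates.items():
    print(f"{token!r:12} log prob. = {logprob:6.1f}  ->  p = {to_probability(logprob):.6f}")
```

A log-probability close to 0 therefore means the model was almost certain of that token, while -4.4 corresponds to roughly a 1% chance.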
 



___

If the temperature is high, the LLM becomes more likely to pick lower-ranked tokens. In the run below, the first token chosen ('Exper') is not even among the five most likely candidates.

In [111]:
temperature = 2
prompt = 'What is the meaning of life? Reply in 5 words.'

###################

response = openai.chat.completions.create(
    model="gpt-4o-mini",
    temperature=temperature,
    messages= [{'role': 'user', 
               'content': prompt}],
    logprobs=True,
    top_logprobs=5,
    max_tokens=10
)

print('GPT-4o-Mini:\n')
print(response.choices[0].message.content)
print('___\n')

for token_info in response.choices[0].logprobs.content:
    chosen_token = token_info.token
    for n, alt in enumerate(token_info.top_logprobs):
        token = alt.token
        logprob = alt.logprob
        
        print(f"{n+1}:    {token!r:15} (log prob. = {logprob:0.1f})")
    print(f"Chosen token: {chosen_token!r}")
    print(" \n")

GPT-4o-Mini:

Experiencing love, connection, growth, purpose
___

1:    'Seek'          (log prob. = -1.1)
2:    'P'             (log prob. = -1.4)
3:    'To'            (log prob. = -2.5)
4:    'Experience'    (log prob. = -2.6)
5:    'Find'          (log prob. = -3.0)
Chosen token: 'Exper'
 

1:    'ien'           (log prob. = -0.1)
2:    'iences'        (log prob. = -2.6)
3:    'iential'       (log prob. = -9.0)
4:    'ience'         (log prob. = -11.0)
5:    'iene'          (log prob. = -15.5)
Chosen token: 'ien'
 

1:    'cing'          (log prob. = 0.0)
2:    'cer'           (log prob. = -18.5)
3:    'c'             (log prob. = -18.8)
4:    'cin'           (log prob. = -20.0)
5:    'cers'          (log prob. = -20.5)
Chosen token: 'cing'
 

1:    ' love'         (log prob. = -0.2)
2:    ' joy'          (log prob. = -2.2)
3:    ','             (log prob. = -3.1)
4:    ' connection'   (log prob. = -4.0)
5:    ' connections'  (log prob. = -5.5)
Chosen token: ' love'
 

1:    ','   
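
Temperature is commonly described as dividing the model's logits by T before the softmax. The sketch below assumes that formulation and is not a statement about how OpenAI implements it. It uses the first-token log-probabilities from the output above as stand-ins for logits (the two differ by a constant that cancels in the softmax), and it renormalizes over the top 5 candidates only, so the numbers approximate the full-vocabulary distribution. The name `apply_temperature` is hypothetical.

```python
import math


def apply_temperature(logprobs, temperature):
    """Rescale log-probabilities by 1/temperature and renormalize (softmax)."""
    scaled = [lp / temperature for lp in logprobs]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# First-token candidates copied from the output above (top 5 only)
tokens = ["Seek", "P", "To", "Experience", "Find"]
logprobs = [-1.1, -1.4, -2.5, -2.6, -3.0]

for temperature in (0.5, 1.0, 2.0):
    probs = apply_temperature(logprobs, temperature)
    row = "  ".join(f"{t}={p:.2f}" for t, p in zip(tokens, probs))
    print(f"T={temperature}: {row}")
```

As T grows, the probabilities move closer together, which is why lower-ranked tokens are sampled more often.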

___

If the temperature is low, the model gravitates towards the most likely token at every step, making the answer nearly constant no matter how many times it is asked.

In [109]:
temperature = 0.01
prompt = 'What is the meaning of life? Reply in 5 words.'

###################

response = openai.chat.completions.create(
    model="gpt-4o-mini",
    temperature=temperature,
    messages= [{'role': 'user', 
               'content': prompt}],
    logprobs=True,
    top_logprobs=5,
    max_tokens=10
)

print('GPT-4o-Mini:\n')
print(response.choices[0].message.content)
print('___\n')

for token_info in response.choices[0].logprobs.content:
    chosen_token = token_info.token
    for n, alt in enumerate(token_info.top_logprobs):
        token = alt.token
        logprob = alt.logprob
        
        print(f"{n+1}:    {token!r:15} (log prob. = {logprob:0.1f})")
    print(f"Chosen token: {chosen_token!r}")
    print(" \n")

GPT-4o-Mini:

Seek purpose, connection, and happiness.
___

1:    'Seek'          (log prob. = -1.2)
2:    'P'             (log prob. = -1.6)
3:    'Experience'    (log prob. = -2.4)
4:    'To'            (log prob. = -2.7)
5:    'Find'          (log prob. = -3.1)
Chosen token: 'Seek'
 

1:    ' purpose'      (log prob. = -1.2)
2:    ' happiness'    (log prob. = -1.2)
3:    ' joy'          (log prob. = -1.4)
4:    ' connection'   (log prob. = -2.7)
5:    ' love'         (log prob. = -3.2)
Chosen token: ' purpose'
 

1:    ','             (log prob. = -0.0)
2:    ' and'          (log prob. = -7.5)
3:    ';'             (log prob. = -9.8)
4:    ' through'      (log prob. = -11.0)
5:    ' in'           (log prob. = -13.9)
Chosen token: ','
 

1:    ' connection'   (log prob. = -1.1)
2:    ' create'       (log prob. = -1.2)
3:    ' love'         (log prob. = -1.5)
4:    ' connect'      (log prob. = -3.0)
5:    ' find'         (log prob. = -3.5)
Chosen token: ' connection'
 

1:    ','     
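
The effect of repeated asking can be simulated locally without calling the API. The sketch below assumes temperature divides the log-probabilities before sampling, and it draws the first token 1000 times from the top-5 candidates in the output above (the rest of the vocabulary is ignored, so this is only an approximation). The name `sample_tokens` and the fixed seed are choices made for this example.

```python
import math
import random


def sample_tokens(tokens, logprobs, temperature, n, seed=0):
    """Draw n tokens from the temperature-scaled distribution over `tokens`."""
    rng = random.Random(seed)
    peak = max(logprobs)
    # Unnormalized weights; random.choices normalizes them internally
    weights = [math.exp((lp - peak) / temperature) for lp in logprobs]
    return rng.choices(tokens, weights=weights, k=n)


# First-token candidates copied from the output above (top 5 only)
tokens = ["Seek", "P", "Experience", "To", "Find"]
logprobs = [-1.2, -1.6, -2.4, -2.7, -3.1]

for temperature in (0.01, 2.0):
    draws = sample_tokens(tokens, logprobs, temperature, n=1000)
    counts = {t: draws.count(t) for t in tokens}
    print(f"T={temperature}: {counts}")
```

At T=0.01 every draw lands on 'Seek', the top candidate, whereas at T=2 the draws are spread across several tokens.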