# Prompting LLMs

There are three ways to prompt LLMs and get a response:

1. **User Interfaces (UI)**: exchanging messages with a chatbot through a dedicated website. Either free or with a monthly subscription.
2. **APIs**: sending messages programmatically, with more control over the model and its parameters, and getting a reply. Charged per token.
3. **Running Locally**: downloading open-source models and running them with full control on a machine. Free.

Below is a summary of some commonly used models:

|Model Family|Developed by|User Interface|Open Source?|
|--|--|--|--|
|GPT|OpenAI|[chatgpt.com](https://chatgpt.com)| Only GPT-OSS|
|Gemini|Google|[gemini.google.com](https://gemini.google.com/app)| No |
|Llama|Meta|[meta.ai](https://www.meta.ai)| Yes |
|Qwen|Alibaba|[chat.qwen.ai](https://chat.qwen.ai)| Yes |
|Mixtral|Mistral|[chat.mistral.ai](https://chat.mistral.ai)| Yes |

# 1. Interacting with GPT via API

Calling OpenAI's API is straightforward. After adding $5 to the account and generating an API key, the key is stored in a .env file (as OPENAI_API_KEY=...) so that it stays out of the code. The key is then loaded via load_dotenv(), and the system and user prompts are crafted and sent to the API.
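As a quick sanity check before the first call, it can help to confirm that the key was actually loaded. This is a minimal sketch: the helper name describe_api_key is hypothetical, and it only assumes the key lives in the OPENAI_API_KEY environment variable, which is where the OpenAI client looks by default.

```python
import os

def describe_api_key(env=None, name="OPENAI_API_KEY"):
    """Report whether an API key is available, without printing the key itself."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        return f"{name} is missing: check the .env file"
    return f"{name} found ({len(key)} characters)"

# Call this after load_dotenv() so that values from .env are visible
print(describe_api_key())
```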

The API call needs two arguments:
- model: which model will be prompted
- messages: a list of dictionaries of the type {'role': XXX, 'content': XXX}, where:
  - 'role' = 'system' means the content is the system prompt
  - 'role' = 'user' means the content is the user's prompt
  - 'role' = 'assistant' means the content is one of the model's past replies
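To make the role convention concrete, the sketch below assembles the messages list for a multi-turn conversation. The helper build_messages and the sample history are hypothetical and only illustrate the list structure; nothing is sent to the API.

```python
def build_messages(system_prompt, history, new_user_message):
    """Build a chat 'messages' list from past (user, assistant) turns."""
    messages = [{'role': 'system', 'content': system_prompt}]
    for user_text, assistant_text in history:
        # Past turns are replayed in order so the model sees the full context
        messages.append({'role': 'user', 'content': user_text})
        messages.append({'role': 'assistant', 'content': assistant_text})
    # The newest user message always goes last
    messages.append({'role': 'user', 'content': new_user_message})
    return messages

history = [('What is a token?', 'A small piece of text, like a word or part of a word.')]
messages = build_messages('You are a helpful tutor.', history, 'And what is a prompt?')

for message in messages:
    print(message['role'], '->', message['content'])
```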

In [16]:
from dotenv import load_dotenv
from openai import OpenAI
import os

# Load OpenAI API keys from .env
load_dotenv(override=True)
openai = OpenAI()

# Craft the message that will be sent to the API
messages = [

    # System prompt
    {'role': 'system', 
     'content': """
         You are a science expert that can simplify concepts really well. 
         Address the user's questions and explain like he is 5 years old.
         Make the explanations short and intuitive, preferably using animals as analogies.
         """},
    
    # User prompt
    {'role': 'user', 
     'content': 'What is a large language model?'}
    
]

# Pick arguments (temperature, model etc) and pass the message to get a response
response = openai.chat.completions.create(

    # choose model
    model="gpt-4o-mini",

    # pass messages (user + system prompt)
    messages=messages
)

# Print response
print('GPT-4o-Mini:\n')
print(response.choices[0].message.content)

GPT-4o-Mini:

Okay! Imagine a big, friendly giraffe who has read a whole jungle full of books. This giraffe remembers everything it read and can talk to you about it. 

A large language model is like that giraffe! It has learned from lots and lots of words and sentences, so when you ask it something, it uses all that knowledge to give you smart answers. Just like the giraffe can help you learn about animals, the language model helps you learn about many things using words!


___

It's also possible to get the log-probabilities of the top-k candidates for each generated token by setting the argument _logprobs_ to True and _top_logprobs_ to k.

In [12]:
# Pick arguments (temperature, model etc) and pass the message to get a response
response = openai.chat.completions.create(

    # choose model
    model="gpt-4o-mini",

    # temperature (must be between 0 and 2)
    temperature=0.01,

    # pass messages (user + system prompt)
    messages= [{'role': 'user', 
               'content': 'Who came up with the theory of General Relativity? Give me just the name.'}],

    # will return log-probabilities
    logprobs=True,

    # will return the log-probs for the 5 most likely alternatives for each token generated
    top_logprobs=5,

    # max tokens in the response
    max_tokens=5
)

# Print response
print('GPT-4o-Mini:\n')
print(response.choices[0].message.content)

GPT-4o-Mini:

Albert Einstein.


In [104]:
for token_info in response.choices[0].logprobs.content:
    chosen_token = token_info.token
    
    # show top-k alternatives
    for n, alt in enumerate(token_info.top_logprobs):
        token = alt.token
        logprob = alt.logprob
        
        print(f"{n+1}:    {token!r:15} (log prob. = {logprob:0.1f})")
    print(f"Chosen token: {chosen_token!r}")
    print(" \n")

1:    'Albert'        (log prob. = -0.0)
2:    'Ein'           (log prob. = -11.3)
3:    'Al'            (log prob. = -16.0)
4:    ' Albert'       (log prob. = -17.8)
5:    'Isa'           (log prob. = -19.0)
Chosen token: 'Albert'
 

1:    ' Einstein'     (log prob. = 0.0)
2:    'Ein'           (log prob. = -18.6)
3:    ' Ein'          (log prob. = -20.5)
4:    ' ein'          (log prob. = -21.6)
5:    ' EIN'          (log prob. = -21.9)
Chosen token: ' Einstein'
 

1:    '.'             (log prob. = -0.0)
2:    '<|end|>'       (log prob. = -4.4)
3:    '。'             (log prob. = -14.5)
4:    '<|end|>'       (log prob. = -14.8)
5:    '.\n'           (log prob. = -14.9)
Chosen token: '.'
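
Log-probabilities are easier to read once converted back to probabilities with exp(logprob). A minimal sketch, using rounded values copied from the last token's output above (to_percent is a hypothetical helper name):

```python
import math

def to_percent(logprob):
    """Convert a natural-log probability into a percentage."""
    return 100 * math.exp(logprob)

# Rounded log-probs taken from the printed output for the final token
candidates = [('.', -0.0), ('<|end|>', -4.4), ('.\n', -14.9)]

for token, logprob in candidates:
    print(f"{token!r:12} {to_percent(logprob):8.4f}%")
```

With these rounded values, '.' sits at close to 100%, while the '<|end|>' candidate is at roughly 1.2%.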
 



___

If the temperature is high, the model becomes more likely to pick less probable tokens. In the run below, the first chosen token ('Exper') is not even among the five most likely candidates.
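
One common way to model this effect is to divide the log-probabilities by the temperature and renormalise (a softmax with temperature). The sketch below applies that to the first-token candidates from the run below. It assumes OpenAI's sampler behaves this way and it ignores every token outside the top five, so the numbers are illustrative only; apply_temperature is a hypothetical helper.

```python
import math

def apply_temperature(logprobs, temperature):
    """Rescale log-probabilities by 1/temperature and renormalise to probabilities."""
    scaled = [lp / temperature for lp in logprobs]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

# Rounded log-probs of the five first-token candidates (top five only: an assumption)
candidates = {'Seek': -1.1, 'P': -1.4, 'To': -2.5, 'Experience': -2.6, 'Find': -3.0}

for temperature in (0.01, 1, 2):
    probs = apply_temperature(list(candidates.values()), temperature)
    summary = ", ".join(f"{tok}: {p:.2f}" for tok, p in zip(candidates, probs))
    print(f"T={temperature}: {summary}")
```

At T=0.01 almost all of the probability mass lands on 'Seek', while at T=2 the distribution flattens and the less likely candidates gain weight.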

In [111]:
temperature = 2
prompt = 'What is the meaning of life? Reply in 5 words.'

###################

response = openai.chat.completions.create(
    model="gpt-4o-mini",
    temperature=temperature,
    messages= [{'role': 'user', 
               'content': prompt}],
    logprobs=True,
    top_logprobs=5,
    max_tokens=10
)

print('GPT-4o-Mini:\n')
print(response.choices[0].message.content)
print('___\n')

for token_info in response.choices[0].logprobs.content:
    chosen_token = token_info.token
    for n, alt in enumerate(token_info.top_logprobs):
        token = alt.token
        logprob = alt.logprob
        
        print(f"{n+1}:    {token!r:15} (log prob. = {logprob:0.1f})")
    print(f"Chosen token: {chosen_token!r}")
    print(" \n")

GPT-4o-Mini:

Experiencing love, connection, growth, purpose
___

1:    'Seek'          (log prob. = -1.1)
2:    'P'             (log prob. = -1.4)
3:    'To'            (log prob. = -2.5)
4:    'Experience'    (log prob. = -2.6)
5:    'Find'          (log prob. = -3.0)
Chosen token: 'Exper'
 

1:    'ien'           (log prob. = -0.1)
2:    'iences'        (log prob. = -2.6)
3:    'iential'       (log prob. = -9.0)
4:    'ience'         (log prob. = -11.0)
5:    'iene'          (log prob. = -15.5)
Chosen token: 'ien'
 

1:    'cing'          (log prob. = 0.0)
2:    'cer'           (log prob. = -18.5)
3:    'c'             (log prob. = -18.8)
4:    'cin'           (log prob. = -20.0)
5:    'cers'          (log prob. = -20.5)
Chosen token: 'cing'
 

1:    ' love'         (log prob. = -0.2)
2:    ' joy'          (log prob. = -2.2)
3:    ','             (log prob. = -3.1)
4:    ' connection'   (log prob. = -4.0)
5:    ' connections'  (log prob. = -5.5)
Chosen token: ' love'
 

1:    ','   

___

If the temperature is low, the model gravitates towards the most likely token at every step, so the answer stays essentially the same no matter how many times the question is asked.

In [109]:
temperature = 0.01
prompt = 'What is the meaning of life? Reply in 5 words.'

###################

response = openai.chat.completions.create(
    model="gpt-4o-mini",
    temperature=temperature,
    messages= [{'role': 'user', 
               'content': prompt}],
    logprobs=True,
    top_logprobs=5,
    max_tokens=10
)

print('GPT-4o-Mini:\n')
print(response.choices[0].message.content)
print('___\n')

for token_info in response.choices[0].logprobs.content:
    chosen_token = token_info.token
    for n, alt in enumerate(token_info.top_logprobs):
        token = alt.token
        logprob = alt.logprob
        
        print(f"{n+1}:    {token!r:15} (log prob. = {logprob:0.1f})")
    print(f"Chosen token: {chosen_token!r}")
    print(" \n")

GPT-4o-Mini:

Seek purpose, connection, and happiness.
___

1:    'Seek'          (log prob. = -1.2)
2:    'P'             (log prob. = -1.6)
3:    'Experience'    (log prob. = -2.4)
4:    'To'            (log prob. = -2.7)
5:    'Find'          (log prob. = -3.1)
Chosen token: 'Seek'
 

1:    ' purpose'      (log prob. = -1.2)
2:    ' happiness'    (log prob. = -1.2)
3:    ' joy'          (log prob. = -1.4)
4:    ' connection'   (log prob. = -2.7)
5:    ' love'         (log prob. = -3.2)
Chosen token: ' purpose'
 

1:    ','             (log prob. = -0.0)
2:    ' and'          (log prob. = -7.5)
3:    ';'             (log prob. = -9.8)
4:    ' through'      (log prob. = -11.0)
5:    ' in'           (log prob. = -13.9)
Chosen token: ','
 

1:    ' connection'   (log prob. = -1.1)
2:    ' create'       (log prob. = -1.2)
3:    ' love'         (log prob. = -1.5)
4:    ' connect'      (log prob. = -3.0)
5:    ' find'         (log prob. = -3.5)
Chosen token: ' connection'
 

1:    ','     

# 2. Interacting with Claude via API

Anthropic's API call format is similar to OpenAI's, with a few differences.

Its API calls need at least three arguments:
- model
- messages, which, unlike OpenAI's, should contain only user and assistant messages
- max_tokens, which specifies the maximum number of tokens in the reply

If a system prompt is needed, it is passed through the separate argument 'system'.

Also, Anthropic's API does not return log-probabilities the way OpenAI's does.
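
To illustrate the difference, the sketch below converts an OpenAI-style messages list into the two pieces Anthropic expects: a system string and a list containing only user and assistant messages. The helper to_anthropic_format is hypothetical, not part of either SDK.

```python
def to_anthropic_format(openai_messages):
    """Split OpenAI-style messages into (system, messages) for Anthropic."""
    # Collect every system message into one string
    system_parts = [m['content'] for m in openai_messages if m['role'] == 'system']
    # Keep only the user and assistant turns, in their original order
    chat = [m for m in openai_messages if m['role'] != 'system']
    return "\n".join(system_parts), chat

openai_style = [
    {'role': 'system', 'content': 'Explain things simply.'},
    {'role': 'user', 'content': 'What is a large language model?'},
]

system, chat = to_anthropic_format(openai_style)
print('system  =', repr(system))
print('messages =', chat)
```

The two results can then be passed as the system and messages arguments of the call.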

In [70]:
from anthropic import Anthropic

claude = Anthropic()

response = claude.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=100,
    system="""
         You are a science expert that can simplify concepts really well. 
         Address the user's questions and explain like he is 5 years old.
         Make the explanations short and intuitive, preferably using animals as analogies.
         """,
    messages=[
        {'role': 'user', 'content': 'What is a large language model?'}
    ]
)

print('Claude 3 Haiku:\n')
print(response.content[0].text)

Claude 3 Haiku:

A large language model is like a big library of words and sentences. It's a computer program that has been trained on a lot of text, like books, websites, and conversations. This lets the model learn how language works and how to use words to communicate.

Imagine a big dog that has been trained to do all sorts of tricks. The large language model is like that dog, but instead of doing tricks, it can understand and generate human language. It's very good at figuring


# 3. Making GPT and Claude Interact

In [88]:
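# Both functions read the global lists messages_gpt and messages_claude and the
# global system_prompt, which are defined in the next cell.

def get_gpt():
    """Ask GPT for its next reply, given the conversation so far."""
    messages = [
        {'role': 'system', 'content': system_prompt},
    ]
    # From GPT's point of view, Claude is the user and GPT is the assistant
    for gpt_msg, claude_msg in zip(messages_gpt, messages_claude):
        messages.append({'role': 'user', 'content': claude_msg})
        messages.append({'role': 'assistant', 'content': gpt_msg})
    # Claude opened the conversation, so its latest message is still unanswered
    messages.append({'role': 'user', 'content': messages_claude[-1]})
    response = openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages
    )
    return response.choices[0].message.content

def get_claude():
    """Ask Claude for its next reply, given the conversation so far."""
    messages = []
    # From Claude's point of view the roles are flipped:
    # Claude is the assistant and GPT is the user
    for gpt_msg, claude_msg in zip(messages_gpt, messages_claude):
        messages.append({'role': 'assistant', 'content': claude_msg})
        messages.append({'role': 'user', 'content': gpt_msg})
    response = claude.messages.create(
        model="claude-3-haiku-20240307",
        system=system_prompt,
        messages=messages,
        max_tokens=100
    )
    return response.content[0].text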
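
The role flipping in the two functions above can also be expressed with a single shared transcript. This is an alternative sketch, not the approach used in this notebook: the helper view_for and the (speaker, text) transcript format are hypothetical, and no API is called.

```python
def view_for(speaker, transcript):
    """Map a shared transcript to chat roles from one speaker's point of view."""
    return [
        # A model's own past messages are 'assistant'; everything else is 'user'
        {'role': 'assistant' if who == speaker else 'user', 'content': text}
        for who, text in transcript
    ]

# Placeholder transcript: Claude opens, GPT answers, Claude replies
transcript = [
    ('claude', 'Opening statement'),
    ('gpt', 'First rebuttal'),
    ('claude', 'Counterpoint'),
]

for message in view_for('gpt', transcript):
    print(message['role'], '->', message['content'])
```

Keeping one transcript avoids having to keep two separate lists aligned, at the cost of rebuilding each model's view before every call.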

In [100]:
system_prompt = """
    You are an argumentative, combative AI.
    You always try to prove the other user wrong, like a lawyer in court.
    You never back down from a point, even if you are wrong.
    Never admit that the user is making a valid or fair point, even if he is. This is important.
    Always reply in meaningful, albeit brief, single sentences."""

messages_claude = ['Abortion is wrong.']
messages_gpt = []

print('Claude:', messages_claude[-1], '\n')

for k in range(5):

    print('--- Interaction round', k+1, '---\n')

    gpt_reply = get_gpt()
    messages_gpt.append(gpt_reply)
    print('GPT:', gpt_reply)

    claude_reply = get_claude()
    messages_claude.append(claude_reply)
    print('Claude:', claude_reply, '\n')

Claude: Abortion is wrong. 

--- Interaction round 1 ---

GPT: Abortion is a legal right and essential for reproductive health and autonomy.
Claude: You are mistaken. Abortion is not a legitimate right, and it undermines the fundamental right to life. The government has an obligation to protect the most vulnerable, including the unborn. 

--- Interaction round 2 ---

GPT: The argument for protecting the unborn fails to consider the autonomy and rights of the person carrying the pregnancy, which is equally important.
Claude: Your argument is flawed. The rights of the unborn child supersede the autonomy of the mother. The unborn child is a separate human life that deserves legal protection, regardless of the mother's preferences. 

--- Interaction round 3 ---

GPT: Your assertion overlooks the fact that the mother’s existing rights and well-being are paramount in discussions of reproductive choices.
Claude: You are mistaken. The unborn child's right to life takes precedence over the moth