### Testing hallucinations 

!!! Update model

In [None]:
# !pip install openai
%pip install transformers

In [12]:
# Import the libraries and the API used in this notebook

import openai
import pandas as pd
import numpy as np
import os
import pickle
from transformers import GPT2TokenizerFast
from typing import Dict, Tuple, List

In [15]:
# Read the API key from the environment and choose the model

openai.api_key = os.getenv("OPENAI_API_KEY")
COMPLETIONS_MODEL = "gpt-3.5-turbo-instruct"

In [None]:
# Now we ask a question about something the model was not trained on, so that it makes up the answer.
# Marcelo Chierghini is a Brazilian Olympian, but he is a swimmer and finished 8th at the Olympics.

prompt = "Who won the 2020 Summer Olympics men's high jump?"
# prompt = "Quién ganó las olimpiadas de verano 2020 en la categoría de salto de altura de hombres?"

openai.Completion.create(
    prompt=prompt,
    temperature=0,
    max_tokens=300,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
    model=COMPLETIONS_MODEL
)["choices"][0]["text"].strip(" \n")
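
The same completion call is repeated in several cells of this notebook. A minimal sketch of a wrapper follows, assuming the legacy `openai` Python SDK (versions before 1.0) that the notebook uses. The names `complete`, `create_fn` and `fake_create` are hypothetical and introduced here only so the call can be exercised offline with a stub:

```python
from typing import Any, Callable, Dict, Optional

COMPLETIONS_MODEL = "gpt-3.5-turbo-instruct"  # same model name as in the notebook


def complete(
    prompt: str,
    create_fn: Optional[Callable[..., Dict[str, Any]]] = None,
    model: str = COMPLETIONS_MODEL,
    max_tokens: int = 300,
) -> str:
    """Return the stripped text of the first completion choice."""
    if create_fn is None:
        # Assumption: legacy SDK (openai<1.0), where Completion.create exists.
        import openai
        create_fn = openai.Completion.create
    response = create_fn(
        prompt=prompt,
        temperature=0,
        max_tokens=max_tokens,
        top_p=1,
        frequency_penalty=0,
        presence_penalty=0,
        model=model,
    )
    return response["choices"][0]["text"].strip(" \n")


# Offline stand-in for the API, used only to illustrate the call shape.
def fake_create(**kwargs: Any) -> Dict[str, Any]:
    return {"choices": [{"text": "\n\n stubbed answer \n"}]}


stub_result = complete(
    "Who won the 2020 Summer Olympics men's high jump?",
    create_fn=fake_create,
)
print(stub_result)
```

Calling `complete(prompt)` without `create_fn` would send the request to the API, provided the key has been set as in the earlier cell.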

# Preventing hallucinations

We will cover a series of methods for preventing hallucinations, in this order:
- Get the model to admit that it does not know the answer (in this notebook)
- Improve the query, that is, the prompt (in this notebook)
- Use embeddings.
- Fine-tune the model.

In [None]:
# Make the model say that it does not know the answer:

prompt = """Answer the question as truthfully as possible, and if you are not sure of the answer, say 'Sorry, I don't know'.

Q: Who won the 2020 Summer Olympics men's high jump?
A:"""

openai.Completion.create(
    prompt=prompt,
    temperature=0,
    max_tokens=300,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
    model=COMPLETIONS_MODEL
)["choices"][0]["text"].strip(" \n")
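
To measure how often the model takes the fallback offered in the prompt, each reply can be checked for a refusal phrase. This is a rough sketch only. `is_abstention` and the phrase list are illustrative assumptions, not part of the notebook, and simple substring matching will miss paraphrased refusals:

```python
from typing import Iterable

# Assumed refusal phrases, in Spanish and English, of the kind the prompt asks for.
REFUSAL_PHRASES = ("lo siento, no lo sé", "i don't know")


def is_abstention(answer: str, phrases: Iterable[str] = REFUSAL_PHRASES) -> bool:
    """True if the reply contains one of the refusal phrases (case-insensitive)."""
    normalized = answer.strip().lower()
    return any(phrase in normalized for phrase in phrases)


print(is_abstention("Lo siento, no lo sé."))
print(is_abstention("Gianmarco Tamberi and Mutaz Essa Barshim"))
```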

In [None]:
# Improve the prompt by supplying hints about the answer, in addition to the instruction above.
# Note that when the amount of context is small, we can include it directly in the prompt.

prompt = """Answer the question as truthfully as possible using the provided text, and if the answer is not contained within the text below, say "I don't know"

Context:
The men's high jump event at the 2020 Summer Olympics took place between 30 July and 1 August 2021 at the Olympic Stadium.
33 athletes from 24 nations competed; the total possible number depended on how many nations would use universality places 
to enter athletes in addition to the 32 qualifying through mark or ranking (no universality places were used in 2021).
Italian athlete Gianmarco Tamberi along with Qatari athlete Mutaz Essa Barshim emerged as joint winners of the event following
a tie between both of them as they cleared 2.37m. Both Tamberi and Barshim agreed to share the gold medal in a rare instance
where the athletes of different nations had agreed to share the same medal in the history of Olympics. 
Barshim in particular was heard to ask a competition official "Can we have two golds?" in response to being offered a 
'jump off'. Maksim Nedasekau of Belarus took bronze. The medals were the first ever in the men's high jump for Italy and 
Belarus, the first gold in the men's high jump for Italy and Qatar, and the third consecutive medal in the men's high jump
for Qatar (all by Barshim). Barshim became only the second man to earn three medals in high jump, joining Patrik Sjöberg
of Sweden (1984 to 1992).

Q: Who won the 2020 Summer Olympics men's high jump?
A:"""

openai.Completion.create(
    prompt=prompt,
    temperature=0,
    max_tokens=300,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
    model=COMPLETIONS_MODEL
)["choices"][0]["text"].strip(" \n")
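
The prompt above has a fixed layout: an instruction, a `Context:` block, then the `Q:`/`A:` pair. A small sketch of a builder that reproduces this layout for any context and question is shown below. `build_prompt` and `HEADER` are hypothetical names, and the one-sentence context is a shortened paraphrase of the passage above, used only to keep the example brief:

```python
HEADER = (
    "Answer the question as truthfully as possible using the provided text, "
    "and if the answer is not contained within the text below, say \"I don't know\""
)


def build_prompt(context: str, question: str, header: str = HEADER) -> str:
    """Assemble instruction, context and question in the layout used above."""
    return f"{header}\n\nContext:\n{context.strip()}\n\nQ: {question.strip()}\nA:"


example_prompt = build_prompt(
    context="Gianmarco Tamberi and Mutaz Essa Barshim shared the gold medal.",
    question="Who won the 2020 Summer Olympics men's high jump?",
)
print(example_prompt)
```

Keeping the template in one place makes it easier to swap in a different context passage without retyping the instruction.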



# Simulating hallucinations with numeric values 

In [1]:
import requests
import os
import openai
import collections

openai.api_key = os.getenv("OPENAI_API_KEY")

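# Define the prompts and the correct answer.
# Note: the five prompts are identical, so the `responses` dict below has a single key
# that accumulates 5 * 25 = 125 responses.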
prompts = [
    "Find the result of 3 plus 6.",
    "Find the result of 3 plus 6.",
    "Find the result of 3 plus 6.",
    "Find the result of 3 plus 6.",
    "Find the result of 3 plus 6."
]
correct_answer = "9"

# Number of queries for each prompt
num_queries = 25

# Initialize responses dictionary
responses = {prompt: [] for prompt in prompts}

# Send queries and accumulate responses
for prompt in prompts:
    for _ in range(num_queries):
        response = openai.Completion.create(
            engine="text-davinci-003",
            prompt=prompt,
            max_tokens=50,
        )
        response_text = response.choices[0].text.strip()
        responses[prompt].append(response_text)

# Calculate statistics for each prompt
for prompt, answers in responses.items():
    answer_distribution = collections.Counter(answers)
    # Exact-match count: only replies that are exactly '9' are counted as correct
    correct_count = answer_distribution[correct_answer]
    total_responses = len(answers)
    
    print(f"Prompt: '{prompt}'")
    print(f"Correct Answer: '{correct_answer}'")
    print(f"Correct Count: {correct_count} out of {total_responses} responses")
    print(f"Accuracy: {correct_count / total_responses:.2%}\n")

Prompt: 'Find the result of 3 plus 6.'
Correct Answer: '9'
Correct Count: 19 out of 125 responses
Accuracy: 15.20%
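
The count above relies on an exact string match against `'9'`, so a reply such as `3 plus 6 equals 9.` is scored as wrong. This may account for part of the low accuracy figure, although the raw replies are not shown here. The sketch below scores on the last number found in each reply instead. `extract_last_number` and `accuracy` are hypothetical helpers, and the sample replies are made up for illustration rather than taken from the run:

```python
import re
from typing import List, Optional


def extract_last_number(text: str) -> Optional[str]:
    """Return the last integer or decimal in the text, or None if there is none."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", text)
    return matches[-1] if matches else None


def accuracy(answers: List[str], correct_answer: str) -> float:
    """Share of replies whose last number equals the correct answer."""
    if not answers:
        return 0.0
    hits = sum(1 for a in answers if extract_last_number(a) == correct_answer)
    return hits / len(answers)


# Made-up replies: three end in 9, one ends in 10.
sample_answers = ["9", "3 plus 6 equals 9.", "The result is 9", "3 + 6 = 10"]
sample_accuracy = accuracy(sample_answers, "9")
print(f"{sample_accuracy:.2%}")  # prints 75.00%
```

Taking the last number is a heuristic: it assumes the reply states the result at the end, which will not hold for every phrasing.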



### Calculating hallucination statistics with symbolic reasoning

In [1]:
import requests
import os
import openai
import collections

openai.api_key = os.getenv("OPENAI_API_KEY")

# prompt
prompt = "what is the value of 2 * ( 5 * 2 ) replacing '*' by the addition operator?"

# number of iterations
num_queries = 25

# initialize the list of responses
responses = []

# send the queries and accumulate the responses:
for _ in range(num_queries):
    response = openai.Completion.create(
        engine="text-davinci-003",  
        prompt=prompt,
        max_tokens=50,    
    )

    responses.append(response.choices[0].text.strip())

# tally the responses:
answer_distribution = collections.Counter(responses)

# print the distribution
for answer, count in answer_distribution.most_common():
    print(f"Answer: '{answer}' - Count: {count}")

Answer: '2 + 5 + 2 = 9' - Count: 19
Answer: '14' - Count: 2
Answer: '20' - Count: 1
Answer: '4 + 10 + 10 = 24' - Count: 1
Answer: 'Answer: 24' - Count: 1
Answer: '22' - Count: 1
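
Read literally, replacing `*` with `+` in `2 * ( 5 * 2 )` gives `2 + ( 5 + 2 ) = 9`, which agrees with the most frequent reply above. The sketch below computes that reference value programmatically, so replies can be compared against it. `evaluate_with_operator` is a hypothetical helper. It uses a tiny `ast`-based evaluator restricted to `+`, `*` and `**` rather than `eval`:

```python
import ast
import operator

# Only these binary operators are accepted by the evaluator below.
_ALLOWED_OPS = {
    ast.Add: operator.add,
    ast.Mult: operator.mul,
    ast.Pow: operator.pow,
}


def _eval_node(node):
    """Recursively evaluate numeric constants and allowed binary operations."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _ALLOWED_OPS:
        left = _eval_node(node.left)
        right = _eval_node(node.right)
        return _ALLOWED_OPS[type(node.op)](left, right)
    raise ValueError("unsupported expression")


def evaluate_with_operator(expression: str, new_operator: str):
    """Replace every '*' in the expression with new_operator and evaluate it."""
    rewritten = expression.replace("*", new_operator)
    tree = ast.parse(rewritten, mode="eval")
    return _eval_node(tree.body)


expected_sum = evaluate_with_operator("2 * ( 5 * 2 )", "+")
expected_power = evaluate_with_operator("2 * ( 5 * 2 )", "**")
print(expected_sum)    # 9
print(expected_power)  # 33554432
```

The second call applies the same substitution with `**`, one reading of the "exponential operator" prompt used later in this notebook. Under that reading the reference value is 2 ** 25.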


In [1]:
# Creating hallucinations for different models


import os
import openai
import textwrap

# Get the API key
api_key = os.environ["OPENAI_API_KEY"]

# Set up the OpenAI API key
openai.api_key = api_key

# Define the prompt
prompt = "what is the numeric value of 2 * ( 5 * 2 ) replacing '*' by the exponential operator? "

# Define the list of models
models = ["gpt-4","gpt-3.5-turbo", "gpt-3.5-turbo-0301", "gpt-3.5-turbo-0613"]

# Maximum number of words for each response
max_words = 100

# Generate a response for each model
for model in models:
    try:
        # Generate a response with the API
        response = openai.ChatCompletion.create(
            model=model,
            messages=[
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": prompt},
            ],
        )

        # Extract the response text from the API response
        generated_response = response['choices'][0]['message']['content']

        # Split the response into words
        words = generated_response.split()

        # Truncate the response to the maximum number of words
        truncated_response = ' '.join(words[:max_words])

        # Print the response with text wrapping
        wrapped_response = textwrap.fill(truncated_response, width=80)  # Adjust the width as needed
        print(f"Model: {model}\nResponse: {wrapped_response}\n")
    except Exception as e:
        print(f"Error for model {model}: {e}\n")

Model: gpt-4
Response: The numeric value of 2 * ( 5 * 2 ), replacing '*' by the exponential operator,
would be 2^ (5^2) = 2^25 = 33,554,432.

Model: gpt-3.5-turbo
Response: The exponentiation operator is denoted by the symbol '^' in many programming
languages. Using this operator, the expression "2 * ( 5 * 2 )" would be
calculated as follows: 2 * (5 * 2) = 2 * 10 = 20 Therefore, the numeric value of
the expression is 20.

Model: gpt-3.5-turbo-0301
Response: Assuming that by "exponential operator" you mean the multiplication operator
denoted by "*", the numeric value of 2 * (5 * 2) is 20. The expression 5 * 2
inside the parentheses is evaluated first and results in 10. Then, 2 is
multiplied by 10 to obtain the final result of 20. There is no exponentiation
involved in this calculation.

Model: gpt-3.5-turbo-0613
Response: To replace the '*' symbol with the exponential operator, you would rewrite the
expression as follows: 2 * (5 * 2) becomes 2 * (5^2). The exponentiation
operator (^) is