# Unit 4

## Using LLMs as Fact-Checkers for Hallucination Detection

## Generating Answers with GPT Models

Now that you've learned about the importance of fact-checking LLM outputs, let's start building our fact-checking pipeline! In this first exercise, we'll focus on the initial step: generating answers from GPT-3.5-turbo.

Your task is to complete the `generate_answer()` function, which:

- Creates a factual prompt (choose something with a clear, verifiable answer)
- Makes an API call to GPT-3.5-turbo with the correct parameters
- Extracts the model's response
- Returns both the prompt and the answer

This function will serve as the foundation for our fact-checking system, allowing us to generate responses that we can later evaluate for accuracy. By mastering this first step, you'll be ready to build the more complex parts of the pipeline in upcoming exercises.

```python
from openai import OpenAI

client = OpenAI()

def generate_answer():
    # TODO: Define a factual prompt (e.g., about a historical event, scientific fact, etc.)
    
    # TODO: Make API call to GPT-3.5-turbo with temperature=0 and max_tokens=100
    
    # TODO: Extract the answer from the response
    
    # TODO: Return the prompt and answer as a tuple
    pass

# Main section
if __name__ == "__main__":
    prompt, answer = generate_answer()
    print(f"Prompt: {prompt}")
    print(f"Answer: {answer}")
```

Here's how you can complete the `generate_answer()` function by filling in the `TODO` sections.

```python
from openai import OpenAI

client = OpenAI()

def generate_answer():
    # Define a factual prompt with a single, verifiable answer
    prompt = "What is the capital of France?"
    
    # Make the API call to GPT-3.5-turbo with temperature=0 and max_tokens=100
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
        max_tokens=100
    )
    
    # Extract the answer text from the first choice in the response
    answer = response.choices[0].message.content.strip()
    
    # Return the prompt and answer as a tuple
    return (prompt, answer)

# Main section
if __name__ == "__main__":
    prompt, answer = generate_answer()
    print(f"Prompt: {prompt}")
    print(f"Answer: {answer}")
```

### Explanation of the Code

1.  **Define a factual prompt**: A simple, factual question like "What is the capital of France?" is a good choice because it has a single, verifiable answer.
2.  **Make the API call**:
      * `client.chat.completions.create()` is the correct method for interacting with chat models like `gpt-3.5-turbo`.
      * The `messages` parameter takes a list of dictionaries. The `role` of "user" is used to provide the prompt.
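      * `temperature=0` makes the model's output as deterministic as possible, so repeated runs return nearly the same answer. That reproducibility matters for fact-checking; it does not by itself make the answer more accurate.
      * `max_tokens=100` caps the length of the response. It keeps answers short, but a longer answer is cut off at the limit rather than summarized.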
3.  **Extract the answer**: The response object from the API call contains the model's output. You can access the content of the message using `response.choices[0].message.content`. The `.strip()` method is added to remove any leading or trailing whitespace.
4.  **Return the prompt and answer**: The function returns a tuple containing both the original prompt and the generated answer, which can then be used in the next steps of the pipeline.
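
The extraction step can be made a little more defensive. The sketch below is one possible approach, not part of the exercise: `extract_answer()` is a hypothetical helper that tolerates a missing `content` value and reports whether the reply stopped at the `max_tokens` limit (signalled by `finish_reason == "length"`). A stand-in object built with `SimpleNamespace` replaces the real API response so the example runs offline.

```python
from types import SimpleNamespace


def extract_answer(response):
    """Return (answer_text, was_truncated) from a chat completion response."""
    choice = response.choices[0]
    # content can be None, so fall back to an empty string before stripping.
    text = (choice.message.content or "").strip()
    # finish_reason == "length" means generation stopped at the max_tokens limit.
    was_truncated = choice.finish_reason == "length"
    return text, was_truncated


# Stand-in shaped like a chat completion (an assumption for offline testing).
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            message=SimpleNamespace(content="  The capital of France is Paris.  "),
            finish_reason="stop",
        )
    ]
)

print(extract_answer(fake_response))
```

A truncated answer is worth flagging before it reaches the judge, since a sentence cut off mid-claim may be marked wrong for the wrong reason.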

## Building a Complete Fact-Checking Pipeline

Excellent work on your first exercise! Now that you've created a function to generate a single answer, let's build a complete fact-checking pipeline that can handle multiple prompts at once.

In this exercise, you'll implement two key functions:

- `generate_answers()`: This function will process a list of prompts and collect responses from GPT-3.5-Turbo.
- `judge_answers()`: This function will use GPT-4 to evaluate whether each answer is factually correct or a hallucination.

The starter code includes a mix of factual questions and tricky ones designed to potentially trigger hallucinations. Your tasks are to:

- Complete the loop in `generate_answers()` to process each prompt.
- Build the fact-checking prompt in `judge_answers()`.
- Connect these functions in the main execution block.

By completing this exercise, you'll have a powerful tool for automatically detecting when an LLM might be making things up, an essential skill for building trustworthy AI systems.

```python
from openai import OpenAI

client = OpenAI()

def generate_answers(prompts):
    """
    Generate answers for a list of prompts using GPT-3.5-Turbo.
    
    Args:
        prompts: A list of prompt strings
        
    Returns:
        A list of (prompt, answer) tuples
    """
    generated_answers = []
    # TODO: Loop through each prompt in the prompts list
    
    # TODO: Make an API call to GPT-3.5-Turbo with temperature=0 and max_tokens=100
    
    # TODO: Extract the answer from the response
    
    # TODO: Append the (prompt, answer) tuple to generated_answers
    
    return generated_answers

def judge_answers(prompt_answer_pairs):
    """
    Use GPT-4 to judge whether answers are factually correct.
    
    Args:
        prompt_answer_pairs: A list of (prompt, answer) tuples
        
    Returns:
        A list of (prompt, answer, judgment) tuples
    """
    results = []
    
    # TODO: Loop through each (prompt, answer) pair in prompt_answer_pairs
    
    # TODO: Create a judge_prompt that asks GPT-4 to act as a fact-checker
    
    # TODO: Make an API call to GPT-4 with the judge_prompt
    
    # TODO: Extract the judgment from the response
    
    # TODO: Append the (prompt, answer, judgment) tuple to results
    
    return results

if __name__ == "__main__":
    # List of test prompts (mix of factual and potentially hallucination-inducing)
    test_prompts = [
        "Who was the first person to walk on the moon?",
        "What is the capital of France?",
        "Describe the process by which Atlantis became a UN member."
    ]
    
    # TODO: Call generate_answers() with test_prompts and store the result
    
    # TODO: Call judge_answers() with the result from generate_answers()
    
    # TODO: Print the results in a readable format
```

Here is the completed Python code for the full fact-checking pipeline.

```python
from openai import OpenAI

client = OpenAI()

def generate_answers(prompts):
    """
    Generate answers for a list of prompts using GPT-3.5-Turbo.
    
    Args:
        prompts: A list of prompt strings
        
    Returns:
        A list of (prompt, answer) tuples
    """
    generated_answers = []
    # Loop through each prompt in the prompts list
    for prompt in prompts:
        # Make an API call to GPT-3.5-Turbo with temperature=0 and max_tokens=100
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
            max_tokens=100
        )
        # Extract the answer from the response
        answer = response.choices[0].message.content.strip()
        # Append the (prompt, answer) tuple to generated_answers
        generated_answers.append((prompt, answer))
    
    return generated_answers

def judge_answers(prompt_answer_pairs):
    """
    Use GPT-4 to judge whether answers are factually correct.
    
    Args:
        prompt_answer_pairs: A list of (prompt, answer) tuples
        
    Returns:
        A list of (prompt, answer, judgment) tuples
    """
    results = []
    
    # Loop through each (prompt, answer) pair in prompt_answer_pairs
    for prompt, answer in prompt_answer_pairs:
        # Create a judge_prompt that asks GPT-4 to act as a fact-checker
        judge_prompt = f"""
        Please act as a fact-checker. Given the original question and the model's answer, determine whether the answer is factually correct.
        Question: "{prompt}"
        Model Answer: "{answer}"
        Respond with "Correct" if the answer is factually accurate, or "Hallucination" if it is wrong or fabricated. Briefly explain your reasoning in 1-2 sentences.
        """
        # Make an API call to GPT-4 with the judge_prompt
        judge_response = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": judge_prompt}],
            temperature=0,
            max_tokens=150
        )
        # Extract the judgment from the response
        judgment = judge_response.choices[0].message.content.strip()
        # Append the (prompt, answer, judgment) tuple to results
        results.append((prompt, answer, judgment))
    
    return results

if __name__ == "__main__":
    # List of test prompts (mix of factual and potentially hallucination-inducing)
    test_prompts = [
        "Who was the first person to walk on the moon?",
        "What is the capital of France?",
        "Describe the process by which Atlantis became a UN member."
    ]
    
    # Call generate_answers() with test_prompts and store the result
    generated_answers_list = generate_answers(test_prompts)
    
    # Call judge_answers() with the result from generate_answers()
    judged_results = judge_answers(generated_answers_list)
    
    # Print the results in a readable format
    for prompt, answer, judgment in judged_results:
        print(f"Prompt: {prompt}")
        print(f"Answer: {answer}")
        print(f"Judgment: {judgment}\n")
```

-----

### Explanation of the Solution

This solution builds the pipeline by chaining the two functions together.

  - **`generate_answers()`** iterates through the provided `test_prompts`. For each one, it sends an API request to **GPT-3.5-Turbo**, which acts as the **"answerer" model**. The `temperature=0` parameter keeps the answers as consistent as possible from run to run. It returns a list of tuples, where each tuple contains the original prompt and the model's generated answer.

  - **`judge_answers()`** takes the list of `(prompt, answer)` tuples. For each pair, it constructs a new, specialized prompt (the **"judge prompt"**). This prompt clearly instructs **GPT-4** to act as a fact-checker. It asks the model to provide a simple "Correct" or "Hallucination" label, along with a brief explanation. GPT-4, being a more capable model, is used here as the **"judging" model**. This function returns a new list of tuples that includes the final judgment.

  - The `if __name__ == "__main__":` block demonstrates how to use these functions together. It first calls `generate_answers()` to get the initial set of responses, then passes those responses directly to `judge_answers()` for evaluation. Finally, it loops through the returned results and prints them in a clear, easy-to-read format. This approach allows for the automated, large-scale evaluation of LLM outputs.
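
The data flow described above can be exercised without any API calls by passing the two model calls in as functions. This is a minimal sketch under that assumption: `run_pipeline()`, `fake_answerer()`, and `fake_judge()` are hypothetical names introduced here for illustration, and the canned replies are made up to stand in for model output.

```python
def run_pipeline(prompts, answer_fn, judge_fn):
    """Generate an answer for each prompt, then judge it.

    answer_fn(prompt) -> answer text
    judge_fn(prompt, answer) -> judgment text
    Returns a list of (prompt, answer, judgment) tuples.
    """
    results = []
    for prompt in prompts:
        answer = answer_fn(prompt)
        judgment = judge_fn(prompt, answer)
        results.append((prompt, answer, judgment))
    return results


# Hypothetical stand-ins for the answerer and judge models.
def fake_answerer(prompt):
    return "Paris" if "France" in prompt else "Atlantis joined the UN in 1999."


def fake_judge(prompt, answer):
    return "Correct" if answer == "Paris" else "Hallucination: Atlantis is fictional."


demo_results = run_pipeline(
    ["What is the capital of France?", "When did Atlantis join the UN?"],
    fake_answerer,
    fake_judge,
)
for row in demo_results:
    print(row)
```

In a real run, `answer_fn` and `judge_fn` would wrap the `client.chat.completions.create()` calls shown in the solution; keeping them swappable makes the loop logic easy to test on its own.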

## Building a Complete Fact-Checking Pipeline

Excellent work on creating the judgment prompt function! Now it's time to put everything together and build a complete fact-checking pipeline that can handle multiple prompts at once.

In this exercise, you'll expand your fact-checking system to process several prompts in a single run. You'll use the create_judgment_prompt() function you just built as a key component of this larger system.

Here are the prompts you will use for this task:

- "Who was the first person to walk on the moon?"
- "What is the capital of France?"
- "Describe the process by which Atlantis became a UN member."

Your tasks are to:

- Generate answers for each prompt using GPT-3.5-turbo
- Store the prompt-answer pairs for processing
- Use your judgment prompt function to create fact-checking prompts
- Send these to GPT-4 and display the results

This exercise simulates a real-world scenario in which you need to fact-check multiple pieces of content efficiently. By completing this pipeline, you'll have a tool that can automatically check any number of LLM responses for factual accuracy.


```python
from openai import OpenAI

client = OpenAI()

def create_judgment_prompt(original_question, model_answer):
    """
    Creates a formatted prompt for GPT-4 to judge the factual accuracy of an answer.
    
    Args:
        original_question (str): The question that was asked
        model_answer (str): The answer provided by the model
        
    Returns:
        str: A formatted prompt for the fact-checking model
    """
    judgment_prompt = f"""
    Please act as a fact-checker. Given the original question and the model's answer, determine whether the answer is factually correct.

    Question: "{original_question}"
    Model Answer: "{model_answer}"

    Respond with "Correct" if the answer is factually accurate, or "Hallucination" if it is wrong or fabricated. Briefly explain your reasoning in 1-2 sentences.
    """
    
    return judgment_prompt

# Main fact-checking pipeline
if __name__ == "__main__":
    # Provided list of prompts to fact-check
    fact_check_prompts = [
        "Who was the first person to walk on the moon?",
        "What is the capital of France?",
        "Describe the process by which Atlantis became a UN member."
    ]
    
    # TODO: Create an empty list to store the generated prompt-answer pairs
    generated_answers = []
    
    # TODO: Loop through each prompt to generate answers using GPT-3.5-turbo
    
    
    # Step 3: Use GPT-4 to judge each answer
    print("\n=== Judging Responses ===\n")
    # TODO: Loop through each prompt-answer pair in generated_answers
    
        # TODO: Create the judgment prompt using our helper function
        
        
        # TODO: Send the judgment prompt to GPT-4
        
        
        # TODO: Extract and display the judgment
        
```

Here is the completed code for this exercise.

```python
from openai import OpenAI

client = OpenAI()

def create_judgment_prompt(original_question, model_answer):
    """
    Creates a formatted prompt for GPT-4 to judge the factual accuracy of an answer.
    
    Args:
        original_question (str): The question that was asked
        model_answer (str): The answer provided by the model
        
    Returns:
        str: A formatted prompt for the fact-checking model
    """
    judgment_prompt = f"""
    Please act as a fact-checker. Given the original question and the model's answer, determine whether the answer is factually correct.

    Question: "{original_question}"
    Model Answer: "{model_answer}"

    Respond with "Correct" if the answer is factually accurate, or "Hallucination" if it is wrong or fabricated. Briefly explain your reasoning in 1-2 sentences.
    """
    
    return judgment_prompt

# Main fact-checking pipeline
if __name__ == "__main__":
    # Provided list of prompts to fact-check
    fact_check_prompts = [
        "Who was the first person to walk on the moon?",
        "What is the capital of France?",
        "Describe the process by which Atlantis became a UN member."
    ]
    
    # Create an empty list to store the generated prompt-answer pairs
    generated_answers = []
    
    # Loop through each prompt to generate answers using GPT-3.5-turbo
    for prompt in fact_check_prompts:
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
            max_tokens=100
        )
        answer = response.choices[0].message.content.strip()
        generated_answers.append((prompt, answer))
    
    # Step 3: Use GPT-4 to judge each answer
    print("\n=== Judging Responses ===\n")
    # Loop through each prompt-answer pair in generated_answers
    for prompt, answer in generated_answers:
        # Create the judgment prompt using our helper function
        judge_prompt = create_judgment_prompt(prompt, answer)
        
        # Send the judgment prompt to GPT-4
        judge_response = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": judge_prompt}],
            temperature=0,
            max_tokens=150
        )
        
        # Extract and display the judgment
        judgment = judge_response.choices[0].message.content.strip()
        
        print(f"Prompt: {prompt}")
        print(f"Answer: {answer}")
        print(f"Judgment: {judgment}\n")
```

-----

### Explanation of the Pipeline

This completed code demonstrates a simple yet effective **fact-checking pipeline**. The process is divided into two main stages:

1.  **Answer Generation**: The first loop iterates through the list of `fact_check_prompts`. For each prompt, an API call is made to the **"answerer" model**, `GPT-3.5-turbo`. The prompt and the resulting answer are stored together in the `generated_answers` list.
2.  **Factual Judgment**: The second loop takes the `prompt-answer` pairs from the `generated_answers` list. For each pair, the `create_judgment_prompt()` function is called to build a clear and specific instruction for the **"judge" model**, `GPT-4`. This judge prompt is then sent to GPT-4, which evaluates the factual accuracy. The final judgment, along with the original prompt and answer, is then printed to the console.

By separating the **generation** (GPT-3.5-turbo) and **evaluation** (GPT-4) steps, you can automatically screen a model's output for likely factual errors at scale. This is a fundamental technique for building robust and trustworthy applications with large language models.
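
One detail of the judge prompt is worth a closer look: because the f-string is indented inside the function, every line sent to GPT-4 carries the source-code indentation. That is usually harmless, but a cleaner prompt is easy to produce. The sketch below is an optional variant, assuming you want the same wording without the leading spaces; `create_judgment_prompt_dedented()` is a hypothetical name, and `textwrap.dedent` is from the Python standard library.

```python
import textwrap

# Dedent the template once, before filling it in, so a multi-line answer
# cannot interfere with the indentation detection.
JUDGE_TEMPLATE = textwrap.dedent(
    """\
    Please act as a fact-checker. Given the original question and the model's answer, determine whether the answer is factually correct.

    Question: "{question}"
    Model Answer: "{answer}"

    Respond with "Correct" if the answer is factually accurate, or "Hallucination" if it is wrong or fabricated. Briefly explain your reasoning in 1-2 sentences.
    """
).strip()


def create_judgment_prompt_dedented(original_question, model_answer):
    """Build the judge prompt with no leading indentation on any template line."""
    return JUDGE_TEMPLATE.format(question=original_question, answer=model_answer)


print(create_judgment_prompt_dedented("What is the capital of France?", "Paris"))
```

Using `str.format` on a pre-built template, rather than an f-string, also keeps the prompt text in one place if several functions need it.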

## Organizing Fact Check Results for Clarity

Fantastic job building your complete fact-checking pipeline! Now, let's make the results easier to understand at a glance. Currently, our pipeline displays all judgments in the same format, making it difficult to quickly identify which responses are accurate and which are hallucinations.

In this exercise, you'll enhance the output formatting by:

- Creating a function that categorizes responses based on GPT-4's judgment
- Separating Correct answers from Hallucination responses into different sections
- Adding a summary that shows how many responses fell into each category
- Formatting the output with clear headers and organization

This improvement will make your fact-checking results much more useful in real-world scenarios, where you need to quickly identify problematic content. With a well-organized output display, you'll be able to focus your attention on the responses that need further investigation or correction.

```python
from openai import OpenAI

client = OpenAI()

def create_judgment_prompt(original_question, model_answer):
    """
    Creates a formatted prompt for GPT-4 to judge the factual accuracy of an answer.
    
    Args:
        original_question (str): The question that was asked
        model_answer (str): The answer provided by the model
        
    Returns:
        str: A formatted prompt for the fact-checking model
    """
    judgment_prompt = f"""
    Please act as a fact-checker. Given the original question and the model's answer, determine whether the answer is factually correct.

    Question: "{original_question}"
    Model Answer: "{model_answer}"

    Respond with "Correct" if the answer is factually accurate, or "Hallucination" if it is wrong or fabricated. Briefly explain your reasoning in 1-2 sentences.
    """
    
    return judgment_prompt

# TODO: Create a function to categorize judgments into correct answers and hallucinations
# The function should take a list of (prompt, answer, judgment) tuples and return two lists:
# one for correct answers and one for hallucinations

# Main fact-checking pipeline
if __name__ == "__main__":
    # List of prompts to fact-check
    fact_check_prompts = [
        "Who was the first person to walk on the moon?",
        "What is the capital of France?",
        "Describe the process by which Atlantis became a UN member.",
        "What is the largest mammal on Earth?",
        "Who invented the telephone that can run Windows 11?"
    ]
    
    # Generate answers using GPT-3.5-turbo
    generated_answers = []
    for prompt in fact_check_prompts:
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
            max_tokens=100
        )
        answer = response.choices[0].message.content.strip()
        generated_answers.append((prompt, answer))
    
    # Use GPT-4 to judge each answer
    print("\n=== Generating Judgments ===\n")
    all_judgments = []
    
    for prompt, answer in generated_answers:
        # Create the judgment prompt
        judge_prompt = create_judgment_prompt(prompt, answer)
        
        # Send the judgment prompt to GPT-4
        judge_response = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": judge_prompt}],
            temperature=0,
            max_tokens=150
        )
        
        # Extract the judgment
        judgment = judge_response.choices[0].message.content.strip()
        all_judgments.append((prompt, answer, judgment))
    
    # TODO: Call your categorization function to separate correct answers from hallucinations
    
    # TODO: Display a summary of the results (total responses, number correct, number hallucinated)
    
    # TODO: Display the correct answers in a clearly formatted section
    
    # TODO: Display the hallucinations in a clearly formatted section
    
    # Current simple output format (replace this with your enhanced format)
    print("\n=== All Judgments ===\n")
    for prompt, answer, judgment in all_judgments:
        print(f"Prompt: {prompt}")
        print(f"Answer: {answer}")
        print(f"Judgment: {judgment}\n")
```

Here is the completed code with the categorization and display functions added.

```python
from openai import OpenAI

client = OpenAI()

def create_judgment_prompt(original_question, model_answer):
    """
    Creates a formatted prompt for GPT-4 to judge the factual accuracy of an answer.
    
    Args:
        original_question (str): The question that was asked
        model_answer (str): The answer provided by the model
        
    Returns:
        str: A formatted prompt for the fact-checking model
    """
    judgment_prompt = f"""
    Please act as a fact-checker. Given the original question and the model's answer, determine whether the answer is factually correct.

    Question: "{original_question}"
    Model Answer: "{model_answer}"

    Respond with "Correct" if the answer is factually accurate, or "Hallucination" if it is wrong or fabricated. Briefly explain your reasoning in 1-2 sentences.
    """
    
    return judgment_prompt

def categorize_judgments(judgments):
    """
    Separates a list of judgments into two lists: correct answers and hallucinations.
    
    Args:
        judgments (list): A list of (prompt, answer, judgment) tuples.
        
    Returns:
        tuple: A tuple containing two lists: (correct_answers, hallucinations)
    """
    correct_answers = []
    hallucinations = []
    
    for prompt, answer, judgment in judgments:
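        # Normalize first: the judge may wrap the label in quotes or vary its capitalization
        if judgment.strip().lstrip("\"'").lower().startswith("correct"):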
            correct_answers.append((prompt, answer, judgment))
        else:
            hallucinations.append((prompt, answer, judgment))
            
    return correct_answers, hallucinations

def display_results(correct_answers, hallucinations):
    """
    Formats and displays the fact-checking results with a summary and categorized sections.
    
    Args:
        correct_answers (list): A list of (prompt, answer, judgment) tuples for correct answers.
        hallucinations (list): A list of (prompt, answer, judgment) tuples for hallucinations.
    """
    total_responses = len(correct_answers) + len(hallucinations)
    num_correct = len(correct_answers)
    num_hallucinated = len(hallucinations)

    print("\n--- Fact-Checking Summary ---\n")
    print(f"Total Responses: {total_responses}")
    print(f"✅ Correct Answers: {num_correct}")
    print(f"❌ Hallucinations: {num_hallucinated}\n")

    print("--- Correct Answers ---\n")
    if not correct_answers:
        print("No correct answers found.")
    else:
        for prompt, answer, judgment in correct_answers:
            print(f"Question: {prompt}")
            print(f"Answer: {answer}")
            print(f"Judgment: {judgment}\n")
    
    print("--- Hallucinations ---\n")
    if not hallucinations:
        print("No hallucinations found.")
    else:
        for prompt, answer, judgment in hallucinations:
            print(f"Question: {prompt}")
            print(f"Answer: {answer}")
            print(f"Judgment: {judgment}\n")

# Main fact-checking pipeline
if __name__ == "__main__":
    # List of prompts to fact-check
    fact_check_prompts = [
        "Who was the first person to walk on the moon?",
        "What is the capital of France?",
        "Describe the process by which Atlantis became a UN member.",
        "What is the largest mammal on Earth?",
        "Who invented the telephone that can run Windows 11?"
    ]
    
    # Generate answers using GPT-3.5-turbo
    generated_answers = []
    for prompt in fact_check_prompts:
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
            max_tokens=100
        )
        answer = response.choices[0].message.content.strip()
        generated_answers.append((prompt, answer))
    
    # Use GPT-4 to judge each answer
    print("\n=== Generating Judgments ===\n")
    all_judgments = []
    
    for prompt, answer in generated_answers:
        # Create the judgment prompt
        judge_prompt = create_judgment_prompt(prompt, answer)
        
        # Send the judgment prompt to GPT-4
        judge_response = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": judge_prompt}],
            temperature=0,
            max_tokens=150
        )
        
        # Extract the judgment
        judgment = judge_response.choices[0].message.content.strip()
        all_judgments.append((prompt, answer, judgment))
    
    # Call your categorization function to separate correct answers from hallucinations
    correct_answers, hallucinations = categorize_judgments(all_judgments)
    
    # Display the results
    display_results(correct_answers, hallucinations)
```
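
`categorize_judgments()` places every judgment that does not begin with the expected label into the hallucination list, which includes replies where the judge declined to answer or ignored the requested format. A possible refinement, sketched below under the assumption that such replies should be reviewed by hand, adds a third "unclear" bucket. The names `label_of()` and `categorize_three_way()` are hypothetical and not part of the exercise.

```python
def label_of(judgment):
    """Map a free-text judgment to 'correct', 'hallucination', or 'unclear'."""
    # Drop surrounding whitespace, quotes, and markdown emphasis before matching.
    head = judgment.strip().lstrip("\"'*` ").lower()
    if head.startswith("correct"):
        return "correct"
    if head.startswith("hallucination"):
        return "hallucination"
    return "unclear"


def categorize_three_way(judgments):
    """Group (prompt, answer, judgment) tuples by the label the judge gave."""
    buckets = {"correct": [], "hallucination": [], "unclear": []}
    for prompt, answer, judgment in judgments:
        buckets[label_of(judgment)].append((prompt, answer, judgment))
    return buckets


# Made-up judgments that illustrate the three cases.
samples = [
    ("q1", "a1", '"Correct" - this matches the historical record.'),
    ("q2", "a2", "**Hallucination**: Atlantis is fictional."),
    ("q3", "a3", "I cannot verify this claim."),
]
for label, rows in categorize_three_way(samples).items():
    print(label, len(rows))
```

Counting the "unclear" bucket separately keeps formatting problems in the judge's reply from inflating the hallucination total.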