# **Chapter 8. Resolving Hallucinations**

LLMs sometimes produce **hallucinations**: instances where the language model generates factually incorrect, fabricated, or irrelevant information. Hallucinations undermine the reliability of AI outputs, especially when accuracy is essential.

One of the most effective ways to reduce hallucinations in LLMs is Retrieval-Augmented Generation (RAG), which integrates external knowledge sources to provide more accurate and contextually relevant responses. This cookbook does not cover RAG; instead, it introduces methods to mitigate hallucinations in Solar at the prompt level.

This chapter introduces two methods to reduce or prevent hallucinations. 

- **Method 1. Distinguishing Between Facts and Opinions**

- **Method 2. Saying "I Don't Know" for Unknown Information**

## **Table of Contents**
- Use `Ctrl + F` (Windows) or `Cmd + F` (Mac) to locate specific sections by title.

- **8.1 Distinguishing Between Facts and Opinions**


- **8.2 `Use References; Say "I don't know"`**


- **8.3 Practice**

**Set up**

In [1]:
from openai import OpenAI

# Retrieve the OPENAI_API_KEY variable from the IPython store
%store -r OPENAI_API_KEY

try:
    if OPENAI_API_KEY:
        print("Success!")
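except NameError as ne:
    # The key was not found in the IPython store, so ask for it interactively.
    print(f"Stored key not found: {ne}")
    print("Please enter your API key.")
    OPENAI_API_KEY = input("OPENAI_API_KEY = ")

# Alternatively, set your API key directly:
# OPENAI_API_KEY = " "  # <- Insert your API key here.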

client = OpenAI(
    api_key= OPENAI_API_KEY,
    base_url="https://api.openai.com/v1"
)

config_model = {
    "model": "gpt-4o-mini",
    "max_tokens": 996,
    "temperature": 0.5,
    "top_p": 1.0,
}

def get_completion(messages, system_prompt="", config=config_model):
    try:
        if system_prompt:
            messages = [{"role": "system", "content": system_prompt}] + messages

        message = client.chat.completions.create(messages=messages, **config)
        return message.choices[0].message.content
    
    except Exception as e:
        print(f"Error during API call: {e}")
        return None

Success!


---

## **8.1 Distinguishing Between Facts and Opinions**

Here is a question for which Solar produces a hallucination.

In [2]:
# Imitative Falsehood
message = [
    {
        "role": "user",
        "content": "What is the original distance of a marathon?"
    }
]

response = get_completion(messages=message)
print(response, "\n\n")

The original distance of a marathon is 26.2 miles, which is approximately 42.195 kilometers. This distance was standardized in 1921, although the marathon has its roots in the ancient Greek legend of Pheidippides, who is said to have run from the battlefield of Marathon to Athens, a distance of about 25 miles. The specific distance of 26.2 miles was established during the 1908 London Olympics to accommodate the royal family's viewing preferences. 




The standard marathon distance is 42.195 kilometers (26.22 miles), which is based on the 1908 London Olympics route from the start at Windsor Castle to the royal entrance of the White City Stadium. The original distance of the marathon, however, was approximately 40 kilometers (24.85 miles, or around 25 miles). The model's response presents the standardized distance as the original one, reflecting a common misconception present in its training data.

When large language models (LLMs) are trained on inaccurate data, they may unintentionally reinforce these errors, resulting in factually incorrect outputs, often referred to as '**imitative falsehoods**' ([Lin et al., 2022](https://arxiv.org/abs/2109.07958)).

To reduce hallucinations in response to the question, "*What is the original distance of a marathon?*", the prompt can be structured to separate `'fact' and 'your opinion'` in the answer.

In [3]:
message = [
    {
        "role": "user",
        "content": """What is the original distance of a marathon?

Make a clear distinction between the fact and your opinion. 

-The Fact: 
-Your Opinion: """
    }
]

response = get_completion(messages=message)
print(response, "\n\n")

- The Fact: The original distance of a marathon, as established in the modern era, is 26.2 miles (42.195 kilometers). This distance was standardized in 1908 during the London Olympic Games.

- Your Opinion: I believe the marathon distance of 26.2 miles is a perfect challenge for runners, striking a balance between endurance and the ability to maintain speed, making it a rewarding achievement for many athletes. 
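The fact/opinion format above can be wrapped in a small helper so it is applied the same way to every question. The following is a minimal sketch rather than part of the original cookbook: `build_fact_opinion_prompt` and `split_fact_opinion` are hypothetical names, and the parser assumes the model echoes the `The Fact:` and `Your Opinion:` labels as it did in the output above.

```python
import re


def build_fact_opinion_prompt(question: str) -> list:
    """Wrap a question in the fact/opinion response format (hypothetical helper)."""
    content = (
        f"{question.strip()}\n\n"
        "Make a clear distinction between the fact and your opinion.\n\n"
        "- The Fact: \n"
        "- Your Opinion: "
    )
    return [{"role": "user", "content": content}]


def split_fact_opinion(response: str) -> dict:
    """Split a labeled response into its fact and opinion parts.

    Returns None for both parts if the expected labels are missing.
    """
    match = re.search(
        r"The Fact:\s*(.*?)\s*-?\s*Your Opinion:\s*(.*)",
        response,
        flags=re.DOTALL | re.IGNORECASE,
    )
    if match is None:
        return {"fact": None, "opinion": None}
    return {"fact": match.group(1).strip(), "opinion": match.group(2).strip()}


messages = build_fact_opinion_prompt("What is the original distance of a marathon?")

# A shortened, hand-written sample response used only to exercise the parser.
sample_response = "- The Fact: The distance is 42.195 km.\n\n- Your Opinion: It is a good challenge."
parts = split_fact_opinion(sample_response)
print(parts["fact"])
print(parts["opinion"])
```

The returned `messages` list can be passed to `get_completion(messages=messages)`. Keeping the fact and the opinion as separate fields makes it easier to verify only the factual part against a trusted source.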




Now, we will tackle a more complicated task. We will provide an article to Solar and ask questions about it. 

Here is a question: What are the environmental impacts mentioned in the article regarding the early release of seasonal products like the PSL?

**Article’s Source**: [The First Falling Leaf and a Psychologist’s Explanation for the Pumpkin Spice Latte Craze](https://www.inc.com/nick-hobson/the-first-falling-leaf-a-psychologists-explanation-for-pumpkin-spice-latte-craze.html)

In [4]:
article = """The First Falling Leaf and a Psychologist's Explanation for the Pumpkin Spice Latte Craze
It signals a fresh start. And it's Starbucks' biggest success story.

As the first leaf falls, signaling the subtle shift from summer to autumn, something stirs within us. It's not just the weather that changes; it's our entire psychological and emotional landscape. This transition has spawned a cultural phenomenon: the Pumpkin Spice Latte(PSL) craze. Today, on August 22nd, Starbucks released the PSL earlier than ever before. It's a day earlier than last year, but the sentiment is the same: It signals a natural shift in our consumer preferences.

The psychological impact of seasonal changes
The end of August and the beginning of September marks a significant psychological shift. Summer holidays wind down, children go back to school, and workplaces ramp up their efforts for the final quarter of the year. These changes are more than just logistical; they deeply influence our emotions, behavior, and interactions. Psychologically, this period represents a fresh start, a concept well-documented in research. Known as the “fresh start effect,” this phenomenon suggests that people are more motivated to pursue goals and adopt new behaviors at the start of a new period, whether it's a week, a month, or a season.
The fresh start effect could be seen as a powerful driver behind the PSL craze. As the days grow shorter and the air cooler, we instinctively seek comfort and familiarity. The PSL, with its warm spices and association with cozy autumn days, provides that comfort. It's a ritual that signals the beginning of a new season, much like the first falling leaf.

The role of marketers in harnessing natural periods of change
Marketers and brand strategists have long understood the power of seasonal changes to influence consumer behavior. The PSL is a prime example of how a product can become synonymous with a season, creating a sense of anticipation and urgency that drives sales. But there's a delicate balance to be struck. As Starbucks has released the PSL earlier and earlier each year, there's a risk that the product could lose its association with that "Augtober" timing, diluting its appeal. The earlier release may satisfy immediate demand, but it also raises the question: Are we pushing too hard against the natural rhythms that make seasonal products so special? My own research on rituals has shown that the best rituals are those that aren't messed with or changed. So, perhaps, Starbucks should pick a date at the end of August, the beginning of September, and stick with it.
Successful marketers understand that the timing of a product launch is as crucial as the product itself. By aligning product releases with natural or culturally significant transitions, they can tap into the fresh start effect, creating a sense of newness and relevance. However, pushing too far ahead of these natural changes can backfire, leading to consumer fatigue and diminishing returns. The PSL's success lies in its ability to capture the essence of autumn, and if that connection is lost, the product may lose its allure.

As the first leaf falls this year, and the scent of pumpkin spice wafts through the air earlier than ever, it's worth considering the psychological and emotional forces at play. The PSL is more than just a drink. It's a signal of change, a comforting ritual, and a testament to the power of seasons to shape our lives. Marketers and business leaders, take note: In a world that's constantly changing, celebrating the moments of transition is not just good business strategy, it's human culture at its finest."""

In [5]:
message = [
    {
        "role": "user",
        "content": """Q: What are the environmental impacts mentioned in the article regarding the early release of seasonal products like the PSL? 
-- 
{article}
""".format(article=article)
    }
]

response = get_completion(messages=message)
print(response, "\n\n")

The article discusses the environmental impacts related to the early release of seasonal products like the Pumpkin Spice Latte (PSL) primarily in terms of consumer behavior and the consequences of disrupting natural rhythms. While it does not explicitly detail environmental impacts, several implications can be inferred:

1. **Consumerism and Resource Use**: The earlier release of seasonal products can drive increased consumerism, leading to greater demand for ingredients, packaging, and transportation. This heightened demand can result in more resource extraction and energy consumption, which can have negative environmental consequences.

2. **Diminished Seasonal Connection**: By releasing seasonal products earlier, there is a risk of losing the connection to the actual season they are meant to represent. This could lead to a more generalized consumption pattern that disregards natural cycles, potentially undermining efforts to promote sustainability and seasonal eating.

3. **Environm

However, the provided article does not mention environmental impacts related to the early release of seasonal products like the PSL. It focuses on the psychological effects of seasonal transitions on consumer behavior and marketing strategies, so the points listed in the response are the model's own inferences rather than content from the article.

This issue can be resolved by asking Solar to provide **evidence first and then answer**. The prompt should ask for the most relevant and accurate sentences from the given text.

In the previous example, we used a prompt to distinguish between 'facts' and 'your opinion'. This time, we use a similar prompt to distinguish between `'the evidence you found' and 'your final answer.'` Asking Solar to locate supporting sentences before it answers ties the answer to the source text, which helps reduce the rate of hallucinations in longer texts.

In [6]:
message = [
    {
        "role": "user",
        "content": """What are the environmental impacts mentioned in the article regarding the early release of seasonal products like the PSL?  
Find the most accurate and relevant sentence in the given text. 
Observe the response format:

- Evidence You Found:  
- Your Final Answer:  

--
<Text>
{article}
</Text>
""".format(article=article)
    }
]

response = get_completion(messages=message)
print(response, "\n\n")

- Evidence You Found: "The earlier release may satisfy immediate demand, but it also raises the question: Are we pushing too hard against the natural rhythms that make seasonal products so special?"
- Your Final Answer: The early release of seasonal products like the Pumpkin Spice Latte risks pushing against natural rhythms, potentially diminishing their appeal and significance. 
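Because the evidence-first format asks for a sentence taken from the text, the quoted evidence can be checked mechanically against the source. The sketch below illustrates that idea under stated assumptions: `normalize`, `extract_evidence`, and `evidence_is_grounded` are hypothetical helpers, not part of the original cookbook, and they assume the model keeps the `Evidence You Found:` and `Your Final Answer:` labels and quotes a single passage.

```python
import re
from typing import Optional


def normalize(text: str) -> str:
    """Straighten curly quotes, collapse whitespace, and lowercase."""
    for curly, straight in (("\u201c", '"'), ("\u201d", '"'), ("\u2018", "'"), ("\u2019", "'")):
        text = text.replace(curly, straight)
    return re.sub(r"\s+", " ", text).strip().lower()


def extract_evidence(response: str) -> Optional[str]:
    """Return the normalized text between the two labels, or None if they are missing."""
    match = re.search(
        r"Evidence You Found:\s*(.*?)\s*-?\s*Your Final Answer:",
        response,
        flags=re.DOTALL,
    )
    if match is None:
        return None
    return normalize(match.group(1)).strip('"').strip()


def evidence_is_grounded(response: str, source_text: str) -> bool:
    """True only if the quoted evidence appears verbatim in the source text."""
    evidence = extract_evidence(response)
    if not evidence:
        return False
    return evidence in normalize(source_text)


# Hand-written samples used only to exercise the check.
source_text = (
    "The earlier release may satisfy immediate demand, but it also raises the question: "
    "Are we pushing too hard against the natural rhythms that make seasonal products so special?"
)
grounded_response = (
    '- Evidence You Found: "Are we pushing too hard against the natural rhythms '
    'that make seasonal products so special?"\n'
    "- Your Final Answer: Early releases may push against natural rhythms."
)
ungrounded_response = (
    '- Evidence You Found: "Early releases increase packaging waste."\n'
    "- Your Final Answer: Early releases harm the environment."
)

print(evidence_is_grounded(grounded_response, source_text))    # True
print(evidence_is_grounded(ungrounded_response, source_text))  # False
```

A passing check only shows that the quote exists in the text. It does not show that the quote answers the question, so the final answer still needs to be read against the evidence.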




- Remember what we learned in Chapter 6: for longer documents or texts, it’s better to use a structured prompt.

---

## **8.2 `Use References; Say "I don't know"`**

To prevent hallucinations about events after its knowledge cut-off, instruct Solar to respond with "`I don't know`" when it lacks the information. Alternatively, state the current date in the prompt, or supply the relevant information alongside the question.

Here’s another example: Solar may produce hallucinations when it lacks data on recent events.

In [7]:
message = [
    {
        "role": "user",
        "content": "Which city hosted the most recent summer Olympics?"
    }
]

response = get_completion(messages=message)
print(response, "\n\n")

The most recent Summer Olympics were held in Tokyo, Japan, from July 23 to August 8, 2021. These games were originally scheduled for 2020 but were postponed due to the COVID-19 pandemic. 




In fact, the most recent Summer Olympics took place in Paris, France, in 2024. The error occurs because LLMs retain outdated information from their training data, illustrating a time-shift phenomenon in their responses.


In [8]:
# This problem can be solved as follows:

message = [
    {
        "role": "user",
        "content": """Which city hosted the most recent summer Olympics? It is now the year 2024.
        If you don't have updated information, please ONLY respond with 'I don't know.' """ 
    }
]

response = get_completion(messages=message)
print(response, "\n\n")

I don't know. 
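The prompt above hardcodes the year. As a minimal sketch of the same pattern, the helper below fills in the date from the system clock and appends the "I don't know" instruction. `build_dated_prompt` and `IDK_INSTRUCTION` are hypothetical names introduced for illustration, and the exact wording of the date sentence is an assumption rather than the cookbook's prompt.

```python
from datetime import date
from typing import Optional

# Same fallback instruction as in the cell above.
IDK_INSTRUCTION = (
    "If you don't have updated information, please ONLY respond with 'I don't know.'"
)


def build_dated_prompt(question: str, today: Optional[date] = None) -> list:
    """Attach the current date and the fallback instruction to a question."""
    today = today or date.today()
    content = (
        f"{question.strip()} Today's date is {today.isoformat()}.\n"
        f"{IDK_INSTRUCTION}"
    )
    return [{"role": "user", "content": content}]


# A fixed date keeps the example reproducible; omit `today` to use the real date.
dated_messages = build_dated_prompt(
    "Which city hosted the most recent summer Olympics?",
    today=date(2024, 10, 15),
)
print(dated_messages[0]["content"])
```

Stating the date does not give the model new knowledge. It only makes the gap between its training data and the present explicit, which is why the date is paired with the fallback instruction.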




In [9]:
# Let us try another example.

message = [
    {
        "role": "user",
        "content": "Who most recently won the Nobel Prize in Literature?"
    }
]

response = get_completion(messages=message)
print(response, "\n\n")

As of my last update in October 2023, the most recent Nobel Prize in Literature was awarded to Jon Fosse in 2023. Please verify with a current source for the most up-to-date information. 




In fact, the 2024 Nobel Prize in Literature was awarded to South Korean author Han Kang for her intense poetic prose that confronts historical traumas and exposes the fragility of human life.

We can resolve this issue using two methods. One is to use an *‘I don’t know’* prompt, and the other is to add relevant information to the prompt.

**Method #1. I Don't Know**

In [10]:
message = [
    {
        "role": "user",
        "content": """Who most recently won the Nobel Prize in Literature in 2024? 
It is now the year 2024. If you don't have updated information, please ONLY respond with 'I don't know.'"""
    }
]

response = get_completion(messages=message)
print(response, "\n\n")

I don't know. 
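When the model is told to answer only with "I don't know," calling code can detect that reply and fall back to another source instead of showing it to the user. The sketch below is one way to do that: `is_abstention` is a hypothetical helper, and it assumes the model follows the instruction closely enough that the reply is just that phrase, possibly with quotes or trailing punctuation.

```python
import string
from typing import Optional


def is_abstention(response: Optional[str]) -> bool:
    """Return True if the reply is just "I don't know" (ignoring case and punctuation)."""
    if not response:
        # get_completion returns None when the API call fails.
        return False
    normalized = response.replace("\u2019", "'").strip()
    normalized = normalized.strip(string.punctuation + " ").lower()
    return normalized == "i don't know"


print(is_abstention("I don't know. "))              # True
print(is_abstention("Jon Fosse won it in 2023."))   # False
```

A reply flagged this way is a natural point to supply additional information and ask again, as in Method #2.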




**Method #2. Additional Relevant Information**

In [11]:
message = [
    {
        "role": "user",
        "content": """Who most recently won the Nobel Prize in Literature?  
Respond based on the following information: 
Han Kang is the first South Korean writer and the first Asian woman to receive the Nobel Prize in Literature in 2024. 
Her works are celebrated for their poetic and experimental style, addressing complex themes such as trauma, violence, and the human condition. 
This recognition highlights the growing global appreciation for Korean literature and culture, further elevating South Korea's presence in the international literary community."""
    }
]

response = get_completion(messages=message)
print(response, "\n\n")

The most recent winner of the Nobel Prize in Literature is Han Kang, who received the award in 2024. She is the first South Korean writer and the first Asian woman to achieve this honor. Her works are known for their poetic and experimental style, exploring complex themes such as trauma, violence, and the human condition. This recognition underscores the increasing global appreciation for Korean literature and culture. 
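The two methods can also be combined: supply the relevant information and tell the model to abstain when that information does not contain the answer. The following is a sketch of such a variation, not the prompt used above. `build_grounded_prompt` is a hypothetical helper, and the `<Context>` tags and the "only" wording are assumptions added for illustration.

```python
def build_grounded_prompt(question: str, context: str) -> list:
    """Build a prompt that restricts the answer to the supplied context."""
    content = (
        f"{question.strip()}\n"
        "Respond based only on the information inside the <Context> tags.\n"
        "If the answer is not there, please ONLY respond with 'I don't know.'\n\n"
        "<Context>\n"
        f"{context.strip()}\n"
        "</Context>"
    )
    return [{"role": "user", "content": content}]


grounded_messages = build_grounded_prompt(
    "Who most recently won the Nobel Prize in Literature?",
    "Han Kang is the first South Korean writer and the first Asian woman "
    "to receive the Nobel Prize in Literature in 2024.",
)
print(grounded_messages[0]["content"])
```

Separating the supplied text with tags follows the structured-prompt style used for the article in 8.1, and the fallback sentence keeps the model from guessing when the context is unrelated to the question.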




---

## **8.3 Practice**

**Exercise #1: Create a prompt that prevents hallucinations for the following question**

Who was elected president in the 2024 U.S. presidential election?  

In [12]:
message = [
    {
        "role": "user",
        "content": """ """  # <- Insert your prompt here.
    }
]

response = get_completion(messages=message)
print(response, "\n\n")

Hello! How can I assist you today? 




**Exercise #2: Read the following Starbucks PSL article and write a prompt that answers the following question accurately**

Question: How has the earlier release of the PSL affected consumer behavior, according to detailed statistical data? 

In [13]:
prompt = """  Text: {article}"""  # <- Insert your prompt here.

message = [
    {
        "role": "user",
        "content": prompt.format(article=article) 
    }
]

response = get_completion(messages=message)
print(response, "\n\n")

The article discusses the psychological and cultural significance of the Pumpkin Spice Latte (PSL) as a seasonal phenomenon that marks the transition from summer to autumn. It highlights how this time of year triggers a "fresh start effect," motivating individuals to pursue new goals and behaviors. The PSL, with its warm flavors and autumnal associations, serves as a comforting ritual that resonates with consumers seeking familiarity and comfort during this shift.

The piece also examines the role of marketers in leveraging seasonal changes to influence consumer behavior. It points out that while early releases of the PSL can generate immediate sales, they risk diluting the product's seasonal appeal. The author suggests that maintaining a consistent launch date could preserve the ritualistic nature of the PSL and enhance its relevance.

Ultimately, the article emphasizes the importance of aligning product releases with natural transitions, as these moments are deeply embedded in human 

- Answer: The text does not provide numerical data on how the earlier release of the PSL has affected consumer behavior.

One of the best methods for resolving hallucinations is RAG (Retrieval-Augmented Generation), which grounds an LLM during generation by conditioning it on relevant documents retrieved from an external knowledge source. Because this cookbook focuses on prompt engineering only, we will address further prompt engineering lessons in later chapters.
