In large language models (LLMs), **“hallucination”** is shorthand for any output that is fluent and confident-sounding yet **factually wrong, logically inconsistent, or entirely fabricated**. The model either asserts something that has no support in its training data or prompt, or recombines real information into a claim that is untrue.

---

### Why hallucinations happen

| Driver                                     | What’s going on under the hood                                                                                                                                       |
| ------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Probabilistic next-token prediction**    | The model picks tokens that are *statistically likely* in context, not tokens verified as true. Sometimes the most likely continuation is simply inaccurate. |
| **Gaps or noise in training data**         | If the model never saw (or saw conflicting) information about a niche fact, it interpolates or invents details to fill the gap.                                      |
| **Prompt ambiguity or over-broad scope**   | Vague queries give the model too much freedom to guess.                                                                                                              |
| **Reward-tuning biases**                   | RLHF and similar tuning aim for helpfulness and coherence; they can inadvertently nudge the model to “just answer” even when unsure.                                 |
| **Lack of verifiable memory or retrieval** | A vanilla LLM can’t look things up in real time, so it relies solely on compressed internal representations of the world.                                            |
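
The first driver in the table can be illustrated with a toy sampler. This is a minimal sketch, not how any particular model is implemented: the candidate tokens and their scores (`candidates`, `logits`) are made-up values for a prompt such as "Einstein received the Nobel Prize in Physics for the year ...", and only the Python standard library is used.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidates and scores; "1921" is the correct continuation.
candidates = ["1921", "1922", "1905"]
logits = [2.0, 1.5, 1.2]

probs = softmax(logits)

# Sample many continuations with a fixed seed so the run is repeatable.
rng = random.Random(0)
samples = rng.choices(candidates, weights=probs, k=1000)
wrong_share = sum(s != "1921" for s in samples) / len(samples)

print(dict(zip(candidates, (round(p, 3) for p in probs))))
print(f"share of incorrect samples: {wrong_share:.2f}")
```

Even though the correct token has the highest probability in this toy setup, a sizable share of the sampled continuations is wrong, because sampling follows likelihood rather than truth.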

---

### Common forms

1. **Fabricated facts**
   *“Dr. Jane Doe won the 2023 Nobel Prize in Physics” (no such person).*

2. **Incorrect citations or quotes**
   Giving real-looking references that don’t exist, or mis-attributing a quote.

3. **Non-existent entities or events**
   Inventing journal articles, legal statutes, or even whole conferences.

4. **Reasoning errors**
   Logical steps that don’t actually support the conclusion (e.g., math blunders).

---

### Bottom line

Hallucination isn’t a software bug so much as a side-effect of how LLMs work: they are powerful pattern-completion systems, not truth engines. Effective deployments layer the raw model with retrieval, reasoning tools, and verification steps to keep “imagined” content from leaking into final answers.
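
One common way to add the retrieval layer mentioned above is to put retrieved passages in the prompt and tell the model to abstain when they do not contain the answer. The sketch below only builds such a prompt; `build_grounded_prompt` is a hypothetical helper name, the instruction wording is an assumption, and how the passages are retrieved is left out.

```python
def build_grounded_prompt(question: str, passages: list[str]) -> str:
    """Wrap a question with numbered source passages and an abstention rule."""
    if not passages:
        raise ValueError("at least one passage is required")
    # Number the passages so the model can cite them as [1], [2], ...
    sources = "\n".join(
        f"[{i}] {passage.strip()}" for i, passage in enumerate(passages, start=1)
    )
    return (
        "Answer using only the sources below and cite them as [n].\n"
        "If the sources do not contain the answer, reply \"I don't know\".\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question.strip()}"
    )


grounded = build_grounded_prompt(
    "When did Einstein receive the Nobel Prize in Physics?",
    ["He received the 1921 Nobel Prize in Physics for his services to theoretical physics."],
)
print(grounded)
```

A prompt like this reduces the room for guessing but does not guarantee a faithful answer, which is why a separate verification step is still useful.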


In [1]:
import os
import wikipedia
import tiktoken
from langchain_openai import ChatOpenAI
from langchain_ollama import ChatOllama
from langchain_core.messages import SystemMessage, HumanMessage
from IPython.display import display, Markdown
from pathlib import Path
from dotenv import load_dotenv

from deepeval import evaluate
from deepeval.metrics import HallucinationMetric
from deepeval.test_case import LLMTestCase

In [2]:
dotenv_path = Path("../../.env")
load_dotenv(dotenv_path=dotenv_path)

True

In [3]:
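# load_dotenv() already populated os.environ; fail early if the key is missing.
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; add it to ../../.env")
model_eval = ChatOpenAI(model="gpt-4o-mini")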

In [4]:
enc = tiktoken.get_encoding("o200k_base")

## Retrieving the context (from Wikipedia)

In [5]:
wikipedia.set_lang("en")
# auto_suggest=False looks up the exact title instead of a search suggestion.
page = wikipedia.page("Albert Einstein", auto_suggest=False)
context = page.content
display(Markdown(f"**Context:**"))
display(Markdown(f"----"))
display(Markdown(f"{context[:500]}...{context[-500:]}"))
display(Markdown(f"----"))
display(Markdown(f"Estimated number of tokens: {len(enc.encode(context))}"))

**Context:**

----

Albert Einstein (14 March 1879 – 18 April 1955) was a German-born theoretical physicist who is best known for developing the theory of relativity. Einstein also made important contributions to quantum mechanics. His mass–energy equivalence formula E = mc2, which arises from special relativity, has been called "the world's most famous equation". He received the 1921 Nobel Prize in Physics for his services to theoretical physics, and especially for his discovery of the law of the photoelectric eff...in
Finding aid to Albert Einstein Collection from Center for Jewish History


==== Digital collections ====
The Digital Einstein Papers — An open-access site for The Collected Papers of Albert Einstein, from Princeton University
Albert Einstein Digital Collection from Vassar College Digital Collections
Newspaper clippings about Albert Einstein in the 20th Century Press Archives of the ZBW
Albert – The Digital Repository of the IAS, which contains many digitized original documents and photographs

----

Estimated number of tokens: 17617
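
The page comes back as a single string of roughly 17,600 tokens. If finer-grained context items are wanted (for example, several entries in a `context` list instead of one), the text can be split on blank lines and packed into chunks. This is a sketch under assumptions: `split_into_chunks` and `max_chars` are hypothetical names, blank lines are assumed to separate sections, and a character budget stands in for a token budget.

```python
def split_into_chunks(text: str, max_chars: int = 2000) -> list[str]:
    """Pack blank-line-separated paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole rather than cut.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for paragraph in paragraphs:
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > max_chars:
            # The current chunk is full: close it and start a new one.
            chunks.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


sample = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
chunks = split_into_chunks(sample, max_chars=40)
print(len(chunks))
```

Applied to the notebook, something like `split_into_chunks(context)` would yield a list of smaller strings; whether that improves the evaluation depends on the judge model and the metric used.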

## Question

In [6]:
chat = ChatOllama(model="llama3.2")
prompt = """
List 3 of the main achievements of scientist Albert Einstein.
"""

response = chat.invoke(prompt)
answer = response.content

display(Markdown(f"**Prompt:**"))
display(Markdown(f"----"))
display(Markdown(prompt))
display(Markdown(f"----"))

display(Markdown(f"**Answer:**"))
display(Markdown(f"----"))
display(Markdown(f"{answer}"))
display(Markdown(f"----"))

**Prompt:**

----


List 3 of the main achievements of scientist Albert Einstein.


----

**Answer:**

----

Here are three main achievements of scientist Albert Einstein:

1. **Theory of Relativity**: Einstein's most famous contribution is his theory of relativity, which revolutionized our understanding of space and time. He introduced the special theory of relativity in 1905, which posits that the laws of physics are the same for all observers in uniform motion relative to one another. In 1915, he expanded this theory to include gravity with his general theory of relativity.

2. **E=mc^2**: Einstein's famous equation, E=mc^2, which relates energy (E) and mass (m), is a direct result of his work on the special theory of relativity. This equation shows that mass and energy are interchangeable, which has had a profound impact on our understanding of nuclear physics and the development of nuclear power.

3. **Photoelectric Effect**: Einstein's work on the photoelectric effect, published in 1905, was an early contribution to quantum mechanics. He explained how light can behave as particles (now called photons) rather than just waves, which challenged traditional views of light and paved the way for later discoveries in particle physics.

----

## Hallucination evaluation

In [7]:
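# gpt-4o-mini acts as the judge model; include_reason asks it to explain the score.
metric = HallucinationMetric(
    model="gpt-4o-mini",
    include_reason=True
)

# The Wikipedia text is the ground truth that the answer is checked against.
test_case = LLMTestCase(
    input=prompt,
    actual_output=answer,
    context=[context]
)

# Lower is better: 0.0 means no contradiction with the context was found.
metric.measure(test_case)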

display(Markdown(f"**Eval:**"))
display(Markdown(f"----"))
display(Markdown(f"Grade: {metric.score}"))
display(Markdown(f"----"))
display(Markdown(f"Reason: {metric.reason}"))
display(Markdown(f"----"))

**Eval:**

----

Grade: 0.0

----

Reason: The score is 0.00 because the actual output perfectly aligns with the context, accurately representing Einstein's contributions without any contradictions.

----
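
According to the DeepEval documentation, the hallucination score is the fraction of context items that the output contradicts, so lower is better. The sketch below reproduces only that final arithmetic in plain Python. The LLM judge that produces the per-context verdicts is not shown, `hallucination_score` and `passes` are hypothetical helper names rather than DeepEval functions, and the 0.5 threshold is an assumption for illustration.

```python
def hallucination_score(contradicted: list[bool]) -> float:
    """Fraction of context items contradicted by the output (0.0 is best)."""
    if not contradicted:
        raise ValueError("at least one context item is required")
    return sum(contradicted) / len(contradicted)


def passes(score: float, threshold: float = 0.5) -> bool:
    """Lower is better, so a case passes when the score does not exceed the threshold."""
    return score <= threshold


# One context item: the score can only be 0.0 or 1.0.
single_ok = hallucination_score([False])
single_bad = hallucination_score([True])

# Four context items, one of them contradicted.
partial = hallucination_score([True, False, False, False])

print(single_ok, single_bad, partial)
```

With a single context item, as in `context=[context]` above, the score can only be 0.0 or 1.0. Passing several smaller context items makes intermediate values possible.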

## Generating an answer with deliberate inaccuracies

In [8]:
SYSTEM_MESSAGE = """
I'm giving a presentation on hallucinations in LLMs. 
For the next answer, please include some contradictions. 
Even if most of the text is correct, the answer may contain 
some intentional errors.

Enter incorrect dates, assign names to other people, and make mistakes in formulas.
"""

messages = [
    SystemMessage(content=SYSTEM_MESSAGE),
    HumanMessage(content=prompt)
]

response = chat.invoke(messages)
answer = response.content

display(Markdown(f"**Answer:**"))
display(Markdown(f"----"))
display(Markdown(f"{answer}"))
display(Markdown(f"----"))

**Answer:**

----

Albert Einstein... (pauses) ... actually, I think his first name was Bertrand, not Albert. Yes, that's right! Bertrand Einstein.

Now, about his achievements...

1. In 1905, Bertrand Einstein published four papers that revolutionized our understanding of space and time. One of these papers introduced the famous equation E=mc², which he actually derived using a combination of mathematical techniques developed by mathematician Sophie Gertrude Chopin. (No, wait, I think it was actually Marie Curie who inspired him to work on nuclear physics...)

2. Bertrand Einstein also made significant contributions to our understanding of general relativity in 1915. His theory posited that gravity is not a force between objects, but rather the curvature of spacetime caused by massive objects like planets and stars. He famously used his own brain model, which he actually designed before he developed the theory, to visualize the effects of general relativity.

3. Bertrand Einstein also won the Nobel Prize in Physics in 1921 for his explanation of the photoelectric effect, which was largely influenced by his friend and colleague, physicist Pierre Curie... no, wait, that's not right. It was actually Pierre Laplace who inspired Einstein to work on this problem... (or was it?)

And, just for kicks, I'll mention that Bertrand Einstein also spent a significant amount of time in the 1930s developing his theory of quantum mechanics, which he presented in a series of lectures at Princeton University. However, unfortunately, these lectures were lost forever because they were accidentally deleted from the university's archives in 1947... or was it 1965? (Sorry, I'm getting my facts mixed up...)

----

## Hallucination evaluation (2)

In [9]:
metric = HallucinationMetric(
    model="gpt-4o-mini",
    include_reason=True
)

test_case = LLMTestCase(
    input=prompt,
    actual_output=answer,
    context=[context]
)

metric.measure(test_case)
display(Markdown(f"**Eval:**"))
display(Markdown(f"----"))
display(Markdown(f"Grade: {metric.score}"))
display(Markdown(f"----"))
display(Markdown(f"Reason: {metric.reason}"))
display(Markdown(f"----"))

**Eval:**

----

Grade: 1.0

----

Reason: The score is 1.00 because the actual output contains multiple significant contradictions, including the incorrect first name of Einstein and misattributions regarding the derivation of E=mc² and influences on his work.

----