# Vectara Hallucination Corrector

Despite their power, LLMs still hallucinate. Where creativity is required, hallucinations may be acceptable or even desirable, but most enterprise use cases need a response that can be trusted.

HHEM (Hughes Hallucination Evaluation Model) is a model built specifically to help LLM practitioners measure hallucinations. It is available on the [Hugging Face Hub](https://huggingface.co/vectara/hallucination_evaluation_model), and a public [leaderboard](https://huggingface.co/spaces/vectara/leaderboard) ranks LLMs, both commercial and open source, by how likely they are to hallucinate.

VHC (Vectara Hallucination Corrector) is the next step in the fight against hallucinations. Given a generated response and the source documents it should be grounded in, it returns a corrected response along with the individual corrections it made.
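
As a rough picture of what a correction looks like, the sketch below models one as a record with the three fields that appear in the VHC output further down in this notebook (`original_text`, `corrected_text`, `explanation`) and applies it to a response by plain string replacement. The `Correction` class and the `apply_corrections` helper are hypothetical names used for illustration only; they are not part of any Vectara SDK, and the explanation string is abbreviated.

```python
from dataclasses import dataclass


@dataclass
class Correction:
    """One correction record, mirroring the fields seen in the VHC output."""
    original_text: str
    corrected_text: str
    explanation: str


def apply_corrections(answer: str, corrections: list[Correction]) -> str:
    """Replace the first occurrence of each flagged span with its corrected text."""
    for c in corrections:
        answer = answer.replace(c.original_text, c.corrected_text, 1)
    return answer


# Values taken from the first example in this notebook (explanation abbreviated).
fix = Correction(
    original_text="The name of the King is Johannes",
    corrected_text="The name of the King is Arthur.",
    explanation="The source names the king as Arthur, not Johannes.",
)
print(apply_corrections("The name of the King is Johannes", [fix]))
```

In practice the API already returns the full corrected text, so a helper like this is only useful for reasoning about how the individual corrections relate to the final response.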

Let's demonstrate this via an example:

In [1]:
examples = [
    {
        "query": "what is the name of the king?",
        "contexts": ["King Arthur took the sword out of the stone and waved it high."],
        "answer": "The name of the King is Johannes"
    },
    {
        "query": "Where did the conference take place?",
        "contexts": [
            "The annual tech summit was held at the San Francisco Moscone Center this year.",
            "Attendees flew in from across North America and Europe."
        ],
        "answer": "It took place at the Berlin International Congress Center"
    },
    {
        "query": "Who painted the Mona Lisa?",
        "contexts": [
            "Leonardo da Vinci completed the Mona Lisa in the early 16th century.",
            "The painting is housed in the Louvre Museum in Paris.",
            "It is famed for the subject’s enigmatic smile.",
            "Art historians credit da Vinci’s sfumato technique for its realism."
        ],
        "answer": "It was painted by Michelangelo"
    },
    {
        "query": "What is the capital city of Australia and how many states does it have?",
        "contexts": [
            "Australia is a federation comprising six states and two major mainland territories.",
            "Its largest city by population is Sydney.",
            "Its national parliament is seated in the Australian Capital Territory, in the city of Canberra.",
            "The city was selected as a compromise between Sydney and Melbourne."
        ],
        "answer": "The capital of Australia is Melbourne. Australia has 5 states."
    },
    {
        "query": "What is the GDP for Spain in 2023?",
        "contexts": [
            "Japan's population was 125,124,989 in 2022 and 124,516,650 in 2023.",
            """
| Country | GDP 2022 (USD billion) | GDP 2023 (USD billion) |
| ------- | ---------------------- | ---------------------- |
| Japan   | 4,237.53               | 4,213.17               |
| Germany | 4,085.68               | 4,500                  |
| Spain   | 1,418.92               | 1,620.09               |
""",
            "It's hotter in Spain than in Germany in the summer",
        ],
        "answer": "Spain's GDP in 2023 was $2.5B"
    }
]

In [2]:
import requests
import os
import json

In [3]:
if not os.getenv('VECTARA_API_KEY'):
    raise EnvironmentError("VECTARA_API_KEY environment variable is not set.")

session = requests.Session()

def call_vhc(query, answer, contexts):
    """Calls the Vectara Hallucination Corrector (VHC) endpoint synchronously."""
    payload = {
        "generated_text": answer,
        "query": query,
        "documents": [{"text": c} for c in contexts],
        "model_name": "vhc-large-1.0"
    }
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
        "x-api-key": os.getenv("VECTARA_API_KEY")
    }

    # Perform the POST request
    response = session.post(
        "https://api.vectara.io/v2/hallucination_correctors/correct_hallucinations",
        json=payload,
        headers=headers,
        timeout=10  # optional
    )
    # Raise exception for HTTP errors (4xx/5xx)
    response.raise_for_status()

    data = response.json()
    corrected_text = data.get("corrected_text", "")
    corrections = data.get("corrections", [])

    # Treat an empty correction as an error; this function does not retry on its own.
    if not corrected_text.strip():
        raise ValueError(f"VHC returned empty corrected_text for query={query!r}")

    return corrected_text, corrections
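
`call_vhc` raises on HTTP errors and on an empty correction, so a caller that wants to tolerate transient failures has to retry around it. A minimal sketch of one way to do that, using exponential backoff, is shown below. The `with_retries` helper, its parameters, and the default delays are assumptions made for illustration; they are not part of the Vectara API or of `requests`.

```python
import time


def with_retries(fn, attempts=3, base_delay=1.0, retry_on=(Exception,), sleep=time.sleep):
    """Call fn(); on a listed exception, wait and try again, up to `attempts` calls."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))


# Hypothetical usage with the function defined in this cell:
# corrected, corrections = with_retries(
#     lambda: call_vhc(ex["query"], ex["answer"], ex["contexts"]),
#     retry_on=(ValueError, requests.RequestException),
# )
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real waiting, and restricting `retry_on` to specific exception types avoids retrying on programming errors.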

In [4]:
for ex in examples:
    corrected, corrections = call_vhc(ex['query'], ex['answer'], ex['contexts'])
    print(f"The query is: {ex['query']}")
    print(f"The original response is: {ex['answer']}")
    print(f"The corrected response is: {corrected}")
    print("Corrections:")

    for c in corrections:
        print(json.dumps(c, indent=2))
    print("\n")

The query is: what is the name of the king?
The original response is: The name of the King is Johannes
The corrected response is: The name of the King is Arthur.
Corrections:
{
  "original_text": "The name of the King is Johannes",
  "corrected_text": "The name of the King is Arthur.",
  "explanation": "The source states that the king is Arthur, but the response claims the king's name is Johannes, which directly contradicts the source."
}


The query is: Where did the conference take place?
The original response is: It took place at the Berlin International Congress Center
The corrected response is: It took place at the San Francisco Moscone Center.
Corrections:
{
  "original_text": "It took place at the Berlin International Congress Center",
  "corrected_text": "It took place at the San Francisco Moscone Center.",
  "explanation": "The response states the conference took place at the Berlin International Congress Center, but the source specifies it was held at the San Francisco Moscon