# Chain-of-Verification Recipe - Prompt Engineering
Chain-of-Verification (CoVe) is a **prompt engineering technique for reducing hallucinations**. An LLM first generates a baseline response to a user query, which may contain errors. CoVe then plans a set of verification questions, answers them to check the claims in the baseline response, and revises the final answer based on those checks. The revised answer is typically more accurate than the initial response. **[Link to Paper](https://arxiv.org/pdf/2309.11495.pdf)**

**Check out the open-source tool used here! 🚀 [AIConfig Github Repo](https://github.com/lastmile-ai/aiconfig)**

[Link to Colab](https://colab.research.google.com/drive/1h_Cneit5S2wI4nVPKI8AWGzTadFHwDk3#scrollTo=4MiNxiJc9GPI)
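
As a rough orientation, the flow described above can be sketched in plain Python. This is a minimal sketch, not the AIConfig implementation used in the rest of this notebook: `chain_of_verification`, `ask`, and `extract_entities` are hypothetical names, `ask` stands in for any function that sends a prompt to an LLM and returns its text, and the template uses Python's `{entity}` placeholder rather than the `{{entity}}` syntax of the config.

```python
from typing import Callable, List


def chain_of_verification(
    ask: Callable[[str], str],                       # hypothetical LLM call: prompt in, text out
    baseline_prompt: str,
    verification_template: str,                      # e.g. "Where was {entity} born?"
    extract_entities: Callable[[str], List[str]],    # hypothetical parser for the baseline list
) -> str:
    # 1. Baseline response, which may contain errors.
    baseline = ask(baseline_prompt)

    # 2. Plan one verification question per entity in the baseline response.
    questions = [
        verification_template.format(entity=entity)
        for entity in extract_entities(baseline)
    ]

    # 3. Answer each verification question independently.
    verification_data = " ".join(ask(question) for question in questions)

    # 4. Ask for a revised response that takes the verification data into account.
    final_prompt = (
        "Cross-check the verification data with the baseline response.\n\n"
        f"Baseline prompt: {baseline_prompt}\n"
        f"Baseline response: {baseline}\n"
        f"Verification data: {verification_data}"
    )
    return ask(final_prompt)
```

Passing the LLM call in as `ask` keeps the sketch testable with a stub function, so no API key is needed to try it.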

In [None]:
# Install AIConfig package
!pip install python-aiconfig

In [3]:
# Import required modules from AIConfig and other dependencies
import openai
import json
import pandas as pd
from aiconfig import AIConfigRuntime, CallbackManager, InferenceOptions
from IPython.display import display, Markdown

# Use your OpenAI key, read here from the Colab secret named 'openai_key'
import os
from google.colab import userdata
os.environ['OPENAI_API_KEY'] = userdata.get('openai_key')

**The cell below defines the CoVe prompt template config.**

Alternatively, download the config [here](https://github.com/lastmile-ai/aiconfig/blob/main/cookbooks/Chain-of-Verification/cove_template_config.json) and load it with

`config = AIConfigRuntime.load('cove_template_config.json')`.

In [9]:
# @title
cove_template_config = {
  "name": "Chain-of-Verification (CoVe)  Template",
  "schema_version": "latest",
  "metadata": {
    "models": {
      "gpt-4": {
        "model": "gpt-4",
        "top_p": 1,
        "temperature": 0,
        "presence_penalty": 0,
        "frequency_penalty": 0
      }
    },
    "parameters": {
      "baseline_prompt": "Name 25 politicians who were born in New York City, New York. ",
      "verification_question": "Where was {{entity}} born? "
    }
  },
  "prompts": [
    {
      "name": "baseline_response_gen",
      "input": "{{baseline_prompt}}",
      "metadata": {
        "model": {
          "name": "gpt-4",
          "settings": {
            "system_prompt": ""
          }
        },
        "parameters": {},
        "remember_chat_context": False
      }
    },
    {
      "name": "verification",
      "input": "{{verification_question}}",
      "metadata": {
        "model": {
          "name": "gpt-4",
          "settings": {
            "system_prompt": "{{entity}}"
          }
        },
        "parameters": {
          "entity": "George Pataki"
        },
        "remember_chat_context": False
      }
    },
    {
      "name": "final_response_gen",
      "input": "Cross-check the provided list of verification data with the original baseline response that is supposed to accurately answer the baseline prompt. \n\nBaseline prompt: {{baseline_prompt}} \nBaseline response: {{baseline_response_gen.output}}\nVerification data: {{verification_results}}",
      "metadata": {
        "model": {
          "name": "gpt-4",
          "settings": {
            "system_prompt": "For each entity from the baseline response, verify that the entity met the criteria asked for in the baseline prompt based on the verification data. \n\nOutput Format: \n\n### Revised Response \nThis is the revised response after running chain-of-verification. \n(Please output the revised response after the cross-check.)\n\n### Failed Entities \nThese are the entities that failed the cross-check and are no longer included in revised response. \n(List the entities that failed the cross-check with a concise reason why)"
          }
        },
        "parameters": {
          "verification_results": "Theodore Roosevelt was born in New York City, New York on October 27, 1858. Franklin D. Roosevelt was born in Hyde Park, New York on January 30, 1882. Alexander Hamilton was born in Charlestown, Nevis on January 11, 1755. John Jay was born in New York City, New York on December 12, 1745. DeWitt Clinton was born in Little Britain, New York on March 2, 1769. William H. Seward was born in Florida, New York on May 16, 1801. Charles Evans Hughes was born in Glens Falls, New York on April 11, 1862. Nelson Rockefeller was born in Bar Harbor, Maine on July 8, 1908. Robert F. Wagner Jr. was born in Manhattan, New York on April 20, 1910. Bella Abzug was born in New York City, New York on July 24, 1920. Shirley Chisholm was born in Brooklyn, New York on November 30, 1924. Geraldine Ferraro was born in Newburgh, New York on August 26, 1935. Eliot Spitzer was born in The Bronx, New York on June 10, 1959. Michael Bloomberg was born in Boston, Massachusetts on February 14, 1942. Andrew Cuomo was born in New York City, New York on December 6, 1957. Bill de Blasio was born in Manhattan, New York on May 8, 1961. Charles Rangel was born in Harlem, New York City on June 11, 1930. Daniel Patrick Moynihan was born in Tulsa, Oklahoma on March 16, 1927. Jacob Javits was born in New York City, New York on May 18, 1904. Al Smith was born in New York City, New York on December 30, 1873. Rudy Giuliani was born in Brooklyn, New York on May 28, 1944. George Pataki was born in Peekskill, New York on June 24, 1945. Kirsten Gillibrand was born in Albany, New York on December 9, 1966. Chuck Schumer was born in Brooklyn, New York on November 23, 1950. Alexandria Ocasio-Cortez was born in The Bronx, New York City, New York on October 13, 1989."
        },
        "remember_chat_context": False
      }
    }
  ]
}


## 1. Baseline Response
Prompt the LLM with a user question that asks for a list. The baseline response may contain inaccuracies, which the following steps verify.

**Prompt: Name 20 programming languages that were developed in the United States.**

In [18]:

config = AIConfigRuntime.create(**cove_template_config) # loads config (see code above)
config.callback_manager = CallbackManager([])

inference_options = InferenceOptions() # setup streaming

In [19]:
# <<TODO>>: Update baseline_prompt but ensure it is structured in a way that outputs a list of entities where each can be verified.
baseline_prompt = "Name 20 programming languages that were developed in the United States. Include the developer name in parentheses."

# Run baseline prompt to generate initial response which might contain errors
async def run_baseline_prompt(baseline_prompt):
    config.update_parameter("baseline_prompt", baseline_prompt)
    config.save()

    await config.run("baseline_response_gen", options=inference_options) # run baseline prompt
    return config.get_output_text("baseline_response_gen")

baseline_response = await run_baseline_prompt(baseline_prompt)

1. C (Dennis Ritchie, Bell Labs)
2. C++ (Bjarne Stroustrup, Bell Labs)
3. Java (James Gosling, Sun Microsystems)
4. Python (Guido van Rossum, Python Software Foundation)
5. JavaScript (Brendan Eich, Netscape Communications)
6. Ruby (Yukihiro Matsumoto, Ruby community)
7. Swift (Apple Inc.)
8. Go (Robert Griesemer, Rob Pike, and Ken Thompson, Google Inc.)
9. Perl (Larry Wall)
10. PHP (Rasmus Lerdorf)
11. Rust (Graydon Hoare, Mozilla Foundation)
12. TypeScript (Microsoft)
13. C# (Microsoft)
14. Objective-C (Brad Cox and Tom Love, Stepstone)
15. Lua (Roberto Ierusalimschy, Waldemar Celes, and Luiz Henrique de Figueiredo, PUC-Rio)
16. Dart (Google)
17. Kotlin (JetBrains)
18. Groovy (James Strachan, Guillaume Laforge, Jochen Theodorou, Paul King, Cedric Champeau)
19. R (Ross Ihaka and Robert Gentleman, University of Auckland)
20. Julia (Jeff Bezanson, Stefan Karpinski, Viral B. Shah, and Alan Edelman, Julia Computing)


## 2. Setup and Test Verification Question
Given both the query and the baseline response, generate a verification question that helps identify mistakes in the original response. We use a single verification question here.

**Verification Prompt: Where was this coding language developed: {{entity}}?**

In [20]:
#  <<TODO>>: Update verification question that takes in entity as a parameter
# verification_question = "Where was {{entity}} born?"
verification_question =  "Where was this coding language developed: {{entity}}?"

# Run verification on a single entity from baseline response to test
async def run_single_verification(verification_question, entity):
    params = {"entity": entity}
    config.update_parameter("verification_question", verification_question)
    config.save()

    verification_completion = await config.run("verification", params, options=inference_options)
    return verification_completion

#  <<TODO>>: Update with an entity from the baseline response
verification_completion = await run_single_verification(verification_question, "clojure")

Clojure was developed in the United States.
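
To make the role of `{{entity}}` concrete, here is a minimal sketch of how a double-brace placeholder could be filled in before the prompt is sent. `render_template` is a hypothetical helper written for illustration only; AIConfig resolves parameters internally, and its actual implementation may differ.

```python
import re


def render_template(template: str, params: dict) -> str:
    """Replace each {{name}} placeholder with params[name] (illustrative only)."""

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        # Leave unknown placeholders untouched so missing parameters stay visible.
        return str(params.get(key, match.group(0)))

    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", substitute, template)


question = render_template(
    "Where was this coding language developed: {{entity}}?",
    {"entity": "Clojure"},
)
print(question)  # Where was this coding language developed: Clojure?
```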

## 3. Execute Verifications
Answer the verification question for each entity from the baseline response. Save the verification results in a single string.

In [None]:
import re

# Extract entity names from a baseline response by applying a regex to each line.
# The pattern captures the text after the list number, up to the first comma.
# TODO: Update regex if the format of the baseline response changes. (ex. not a numbered list)
def gen_entities_list(baseline_response):
    entities = []
    for row in baseline_response.split('\n'):
        match = re.search(r'\d+\.\s([^,]*)', row)
        if match:  # skip blank lines and lines that are not numbered list items
            entities.append(match.group(1))
    return entities

# Run the verification question for each entity and concatenate the answers into a single string.
async def gen_verification_results(entities):
    verification_data = ""
    for entity in entities:
        params = {
            "verification_question": verification_question,
            "entity": entity
        }
        await config.run("verification", params, options=inference_options)
        verification_data += " " + config.get_output_text("verification")
        print("\n")  # separate the outputs printed for each entity

    return verification_data


entities = gen_entities_list(baseline_response)
verification_data = await gen_verification_results(entities)

The C programming language was developed at Bell Labs in the United States.
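
Note that the regex in `gen_entities_list` captures everything up to the first comma, so an entry such as `1. C (Dennis Ritchie, Bell Labs)` yields `C (Dennis Ritchie` rather than `C`. If only the language name is wanted as the entity, one possible variant stops at the opening parenthesis instead. The sketch below is an assumption about what a stricter parser could look like; `extract_entity_names` and `ENTITY_PATTERN` are hypothetical names and are not part of the original notebook.

```python
import re

# Hypothetical pattern: list number, then the name, then an optional "(...)" suffix.
ENTITY_PATTERN = re.compile(r"^\s*\d+\.\s+([^(]+?)\s*(?:\(.*)?$")


def extract_entity_names(numbered_list: str) -> list[str]:
    """Return the text between each list number and the first '(' on that line."""
    names = []
    for line in numbered_list.splitlines():
        match = ENTITY_PATTERN.match(line)
        if match:  # blank or non-numbered lines are skipped
            names.append(match.group(1))
    return names


print(extract_entity_names("1. C (Dennis Ritchie, Bell Labs)\n13. C# (Microsoft)"))
# ['C', 'C#']
```

Whether the shorter entity gives better verification answers depends on the question: keeping the developer name adds context, while the bare name keeps the verification prompt unambiguous.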



## 4. Generate Revised Response
Given the discovered inconsistencies (if any), generate a revised response incorporating the verification results.

In [None]:
# Generate the revised response using the verification data
params = {"verification_results": verification_data}
revised_response = await config.run("final_response_gen", params)

# Display with Markdown
display(Markdown(config.get_output_text("final_response_gen")))
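
Because the system prompt for `final_response_gen` asks for a `### Revised Response` section and a `### Failed Entities` section, the output can also be split programmatically when the two parts are needed separately. The following is a minimal sketch under the assumption that the model follows that format; `split_sections` is a hypothetical helper, and the sample text is hand-written for illustration, not actual model output.

```python
import re


def split_sections(markdown_text: str) -> dict:
    """Group the lines of a markdown string under their '### ' headings."""
    sections = {}
    current = None
    for line in markdown_text.splitlines():
        heading = re.match(r"^###\s+(.*\S)\s*$", line)
        if heading:
            current = heading.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}


# Hand-written sample in the requested output format (not real model output).
sample_output = (
    "### Revised Response \n1. C\n2. Go\n\n"
    "### Failed Entities \n- Ruby: developed in Japan\n"
)
parsed = split_sections(sample_output)
print(parsed["Failed Entities"])  # - Ruby: developed in Japan
```

In the notebook, the same helper could be applied to `config.get_output_text("final_response_gen")`; since LLM output does not always follow the requested format, it is worth checking that both headings are present before relying on the result.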