# Chain-of-Verification (CoVe) Pipeline
Given a user query, an LLM generates a baseline response that may contain inaccuracies, such as factual hallucinations. To improve on this, CoVe first plans a set of verification questions, then executes that plan by answering them and checking the answers against the baseline response. The paper reports that individual verification questions are typically answered more accurately than the same facts in the original long-form generation. Finally, the revised response takes the verification results into account. The steps below follow this [paper](https://arxiv.org/pdf/2309.11495.pdf):
1. Generate Baseline Response
2. Plan Verification
3. Execute Verification
4. Generate Final Verified Response
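
The four steps above can be sketched as one small function. This is a minimal illustration, not the AIConfig pipeline built in this notebook: `chain_of_verification`, the `llm` callable, the `plan` callable, and the wording of the final prompt are all hypothetical names and assumptions introduced here.

```python
from typing import Callable, List


def chain_of_verification(
    query: str,
    llm: Callable[[str], str],
    plan: Callable[[str, str], List[str]],
) -> str:
    """Run the four CoVe steps with caller-supplied (hypothetical) callables.

    llm:  takes a prompt string and returns the model's text response.
    plan: takes (query, baseline) and returns verification questions.
    """
    # 1. Generate the baseline response.
    baseline = llm(query)

    # 2. Plan the verification questions.
    questions = plan(query, baseline)

    # 3. Execute the plan: answer each question on its own.
    answers = [llm(question) for question in questions]

    # 4. Generate the final response from the baseline plus the evidence.
    evidence = "\n".join(
        f"Q: {q}\nA: {a}" for q, a in zip(questions, answers)
    )
    final_prompt = (
        f"Question: {query}\n"
        f"Draft answer: {baseline}\n"
        f"Verification results:\n{evidence}\n"
        "Rewrite the draft answer so it agrees with the verification results."
    )
    return llm(final_prompt)
```

Passing the model call in as a plain function keeps the control flow visible and makes the sketch easy to exercise with a stub in place of a real model.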

**Workflow**
1. Started with prompts in [AI Workbook](https://lastmileai.dev/workbooks/clon69opk00c1qrfiighw3k92).
2. Downloaded AIConfig (prompts, model params) from workbook - [cove_demo_config.json](https://drive.google.com/file/d/1GXahbgGCV_HReL3hWVZ2L5tVXn_3Iinf/view?usp=sharing)
3. Created the chain-of-verification pipeline in this notebook.

In [None]:
# Install dependencies
!pip install python-aiconfig
!pip install openai==0.28.1

import openai
from google.colab import userdata

# Read the OpenAI API key from Colab's secrets store
openai.api_key = userdata.get('openai_api_key')

In [2]:
# Load AI Config
from aiconfig.Config import AIConfigRuntime

config_file_path = "cove_config.json"
config = AIConfigRuntime.from_config(config_file_path)




## 1. Baseline Response
Prompt the LLM with the user's question. The baseline response may contain inaccuracies that we want to verify. The prompt is stored in the AIConfig under the name `baseline_response_gen`.

**Prompt: Name 25 politicians who were born in NY, New York.**

In [3]:
from aiconfig.default_parsers.parameterized_model_parser import InferenceOptions

params = {}
inference_options = InferenceOptions()

baseline_response_completion = await config.run("baseline_response_gen", params, inference_options)
baseline_response = config.get_output_text("baseline_response_gen")

1. Theodore Roosevelt, (26th President of the United States)
2. Franklin D. Roosevelt, (32nd President of the United States)
3. Alexander Hamilton, (First Secretary of the Treasury)
4. John Jay, (First Chief Justice of the United States)
5. DeWitt Clinton, (6th Governor of New York)
6. William H. Seward, (Secretary of State under Abraham Lincoln)
7. Charles Evans Hughes, (11th Chief Justice of the United States)
8. Nelson Rockefeller, (41st Vice President of the United States)
9. Robert F. Wagner Jr., (Mayor of New York City)
10. Bella Abzug, (U.S. Representative)
11. Shirley Chisholm, (First African American woman elected to Congress)
12. Geraldine Ferraro, (First female Vice Presidential candidate from a major party)
13. Eliot Spitzer, (54th Governor of New York)
14. Michael Bloomberg, (108th Mayor of New York City)
15. Andrew Cuomo, (56th Governor of New York)
16. Bill de Blasio, (109th Mayor of New York City)
17. Charles Rangel, (U.S. Representative)
18. Daniel Patrick Moynihan, (U

## 2. Plan Verification
Given the query and the baseline response, generate a list of verification questions that help check the original response for mistakes. We use a single verification question template here. The verification prompt is stored in the AIConfig under the name `verification`.

**Verification Prompt: Where was {{name}} born?**

In [4]:
params = {"name":"Theodore Roosevelt"}
verification_completion = await config.run("verification", params)
verification_response = config.get_output_text("verification")
print(verification_response)

Theodore Roosevelt was born in New York City, New York on October 27, 1858.
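
To make the role of the `{{name}}` placeholder concrete, the sketch below shows how a parameter dictionary could be substituted into a prompt template. It only illustrates the idea and is not AIConfig's actual templating code; `render_prompt` is a hypothetical helper introduced here.

```python
import re


def render_prompt(template: str, params: dict) -> str:
    """Replace each {{key}} placeholder with its value from params."""

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in params:
            # Fail loudly rather than send a half-filled prompt to the model.
            raise KeyError(f"missing prompt parameter: {key}")
        return str(params[key])

    # Match {{key}}, allowing optional spaces inside the braces.
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


print(render_prompt("Where was {{name}} born?", {"name": "Theodore Roosevelt"}))
# prints: Where was Theodore Roosevelt born?
```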


In [5]:
# Set remember_chat_context to False so each verification question is answered without earlier prompts and outputs as context
config.set_metadata("remember_chat_context", False, "verification")


## 3. Execute Verifications
Answer each verification question in turn for the baseline response.  

**1. Run Verification Prompt for each politician from baseline prompt list.**

**2. Save outputs from all verification prompts into single text to be used as context.**

In [6]:
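# Get individual names from the baseline response
import re

names = []

for row in baseline_response.split('\n'):
    # Each row looks like "1. Name, (description)": capture the text
    # between the list number and the first comma
    match = re.match(r'\d+\.\s([^,]*)', row.strip())
    if match:  # skip blank rows and rows that are not numbered list items
        names.append(match.group(1))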


# Execute verification question for each name
verification_list = ""

for n in names:
    params = {"name": n}
    verification_completion = await config.run("verification", params)
    verification_text = config.get_output_text("verification")
    verification_list += " " + verification_text


In [7]:
print(verification_list)

 Theodore Roosevelt was born in New York City, New York on October 27, 1858. Franklin D. Roosevelt was born in Hyde Park, New York on January 30, 1882. Alexander Hamilton was born in Charlestown, Nevis on January 11, 1755. John Jay was born in New York City, New York on December 12, 1745. DeWitt Clinton was born in Little Britain, New York on March 2, 1769. William H. Seward was born in Florida, New York on May 16, 1801. Charles Evans Hughes was born in Glens Falls, New York on April 11, 1862. Nelson Rockefeller was born in Bar Harbor, Maine on July 8, 1908. Robert F. Wagner Jr. was born in Manhattan, New York on April 20, 1910. Bella Abzug was born in New York City, New York on July 24, 1920. Shirley Chisholm was born in Brooklyn, New York on November 30, 1924. Geraldine Ferraro was born in Newburgh, New York on August 26, 1935. Eliot Spitzer was born in The Bronx, New York on June 10, 1959. Michael Bloomberg was born in Boston, Massachusetts on February 14, 1942. Andrew Cuomo was bor
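
The loop above concatenates every answer into one space-separated string. One possible alternative, sketched here as an assumption rather than as part of the original pipeline, is to keep the answers keyed by name and join them only when building the context, which keeps each answer traceable to the name it verifies. `build_verification_context` is a hypothetical helper; the two sample answers are copied from the output above.

```python
from typing import Dict


def build_verification_context(answers: Dict[str, str]) -> str:
    """Join per-name answers into one newline-separated block of text."""
    return "\n".join(
        f"- {name}: {answer.strip()}" for name, answer in answers.items()
    )


answers = {
    "Theodore Roosevelt": "Theodore Roosevelt was born in New York City, New York on October 27, 1858.",
    "Alexander Hamilton": "Alexander Hamilton was born in Charlestown, Nevis on January 11, 1755.",
}
print(build_verification_context(answers))
```

The resulting string could be passed as `verification_list` in the same way as the concatenated version.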

## 4. Generate Final Verified Response
Given the discovered inconsistencies (if any), generate a revised response incorporating the verification results.

**Run the final prompt, which cross-checks the original output of the baseline prompt against the verification context.**

In [8]:
# Set the remember_chat_context to False
config.set_metadata("remember_chat_context", False, "final_response_gen")

params = {"verification_list": verification_list}

final_verified_response = await config.run("final_response_gen", params)
output = config.get_output_text("final_response_gen")
print(output)

Politicians born in NY, New York:

1. Theodore Roosevelt, (26th President of the United States)
2. John Jay, (First Chief Justice of the United States)
3. Robert F. Wagner Jr., (Mayor of New York City)
4. Bella Abzug, (U.S. Representative)
5. Shirley Chisholm, (First African American woman elected to Congress)
6. Eliot Spitzer, (54th Governor of New York)
7. Andrew Cuomo, (56th Governor of New York)
8. Bill de Blasio, (109th Mayor of New York City)
9. Charles Rangel, (U.S. Representative)
10. Jacob Javits, (U.S. Senator)
11. Al Smith, (42nd Governor of New York)
12. Rudy Giuliani, (107th Mayor of New York City)
13. Chuck Schumer, (U.S. Senator)
14. Alexandria Ocasio-Cortez, (U.S. Representative)

Politicians not born in NY, New York, along with their birthplaces:

1. Franklin D. Roosevelt, Hyde Park, New York
2. Alexander Hamilton, Charlestown, Nevis
3. DeWitt Clinton, Little Britain, New York
4. William H. Seward, Florida, New York
5. Charles Evans Hughes, Glens Falls, New York
6. Nel

CoVe identified and corrected the mistakes in the baseline response: of the 25 politicians originally listed, 14 were actually born in NY, New York.