## Imports

In [3]:
%pip install langchain
%pip install langchain-openai







Collecting langchain-openai
  Downloading langchain_openai-0.1.6-py3-none-any.whl (34 kB)
Collecting openai<2.0.0,>=1.24.0
  Downloading openai-1.28.0-py3-none-any.whl (320 kB)
     -------------------------------------- 320.1/320.1 kB 6.6 MB/s eta 0:00:00
Installing collected packages: openai, langchain-openai
  Attempting uninstall: openai
    Found existing installation: openai 1.11.1
    Uninstalling openai-1.11.1:
      Successfully uninstalled openai-1.11.1
Successfully installed langchain-openai-0.1.6 openai-1.28.0
Note: you may need to restart the kernel to use updated packages.





## Functions

In [9]:
from langchain.chains import LLMChain
from langchain.output_parsers import PydanticOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_openai import OpenAI  # provided by the langchain-openai package installed above

# Define the prompt templates
pronoun_prompt = PromptTemplate(
    input_variables=["text"],
    template="""
Your task is to replace all the pronouns in the following text with the nouns they refer to:

<text>
{text}
</text>

The goal is to make the text more explicit and clear by replacing potentially ambiguous pronouns like "he", "she", "it", "they", "them", etc. with the specific nouns or names they refer to.

For example:
Original: John went to the store. He bought some milk.
Pronoun replaced: John went to the store. John bought some milk.

Here are the steps to complete this task:

1. Carefully read the provided text and identify all the pronouns 
2. For each pronoun, look back in the text to determine which noun or name it is referring to
3. If the pronoun is part of a direct quote, do not replace it
4. Replace each pronoun with the most recent noun or name it refers to
5. If a pronoun does not have a clear referent noun or name, do not replace it
6. Repeat this process until all the pronouns with clear referents have been replaced

    """,
)

parse_prompt = PromptTemplate(
    input_variables=["text"],
    template="""

    Please parse the following text into a list of individual facts:

<text>
{text}
</text>

Read the text carefully. Your task is to break it down into the key facts it contains. Parse out each individual fact into a separate sentence, even if that means splitting up or rewording the original sentences. The goal is to have a clear, concise list of the core facts contained in the text.

Output the parsed facts in a numbered list, with each fact written as a complete sentence on its own line. Use <facts> tags to demarcate the start and end of the list.
    """,
)

compare_prompt = PromptTemplate(
    input_variables=["context_list", "answer_list"],
    template="""

You will be comparing facts between a context and an answer to determine which facts are shared and which are unique to each.

Here is the context:

<context>

{context_list}

</context>

And here is the answer: 

<answer>

{answer_list}

</answer>

Carefully analyze the facts presented in the context and answer, focusing on the semantic meaning rather than the exact wording.

Then, output a dictionary with the following keys and corresponding lists of facts as values:

1. "facts_in_both": A list of facts that are present in both the context and the answer

2. "facts_only_in_answer": A list of facts that are only present in the answer 

3. "facts_only_in_context": A list of facts that are only present in the context

Remember, the facts do not need to be worded identically to be considered the same. Focus on whether the core meaning is shared or unique.

Provide your results in this format:

{{
    "facts_in_both": [
        "Fact 1 present in both",
        "Fact 2 present in both"
    ],
    "facts_only_in_answer": [
        "Fact 1 only in answer",
        "Fact 2 only in answer"  
    ],
    "facts_only_in_context": [
        "Fact 1 only in context",
        "Fact 2 only in context"
    ]
}}

"""
)


class ComparisonResult(BaseModel):
    facts_in_both: list[str] = Field(default_factory=list, description="List of facts present in both context and answer")
    facts_only_in_answer: list[str] = Field(default_factory=list, description="List of facts only present in the answer")
    facts_only_in_context: list[str] = Field(default_factory=list, description="List of facts only present in the context")

def process_data(context, answer):
    # Replace pronouns in the context and answer
    context_replace_pronouns = pronoun_chain.run(text=context)
    answer_replace_pronouns = pronoun_chain.run(text=answer)

    # Parse the context and answer into numbered fact lists (each returned as a single <facts>-tagged string)
    context_list = parse_chain.run(text=context_replace_pronouns)
    answer_list = parse_chain.run(text=answer_replace_pronouns)

    # Compare the context and answer statements
    comparison_result = parser.parse(compare_chain.run(context_list=context_list, answer_list=answer_list))

    return {
        "context_replace_pronouns": context_replace_pronouns,
        "answer_replace_pronouns": answer_replace_pronouns,
        "context_list": context_list,
        "answer_list": answer_list,
        "comparison_result": comparison_result,
    }


def calculate_metrics(comparison_result):
    facts_in_both_count = len(comparison_result.facts_in_both)
    facts_only_in_answer_count = len(comparison_result.facts_only_in_answer)
    facts_only_in_context_count = len(comparison_result.facts_only_in_context)

    total_answer_facts = facts_in_both_count + facts_only_in_answer_count
    total_context_facts = facts_in_both_count + facts_only_in_context_count

    groundedness = facts_in_both_count / total_answer_facts * 100 if total_answer_facts > 0 else 0
    thoroughness = facts_in_both_count / total_context_facts * 100 if total_context_facts > 0 else 0

    return {
        "groundedness": groundedness,
        "thoroughness": thoroughness,
    }
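
As a quick sanity check of the two ratios, the sketch below repeats the same arithmetic on plain counts instead of a `ComparisonResult`. The helper name `metrics_from_counts` is hypothetical and not part of the notebook; the example counts (two shared facts, one fact only in the answer, three facts only in the context) are chosen to mirror the fox and hedgehog pair.

```python
def metrics_from_counts(in_both: int, only_in_answer: int, only_in_context: int) -> dict:
    """Hypothetical helper: same ratios as calculate_metrics, computed from raw counts."""
    total_answer_facts = in_both + only_in_answer
    total_context_facts = in_both + only_in_context

    # Groundedness: share of the answer's facts that also appear in the context
    groundedness = in_both / total_answer_facts * 100 if total_answer_facts > 0 else 0.0
    # Thoroughness: share of the context's facts that the answer covers
    thoroughness = in_both / total_context_facts * 100 if total_context_facts > 0 else 0.0

    return {"groundedness": groundedness, "thoroughness": thoroughness}


example = metrics_from_counts(in_both=2, only_in_answer=1, only_in_context=3)
print(f"Groundedness: {example['groundedness']:.2f}%")  # 2 / 3 of the answer facts
print(f"Thoroughness: {example['thoroughness']:.2f}%")  # 2 / 5 of the context facts
```

Both metrics fall back to 0 when the denominator is empty, so an answer with no extracted facts scores 0 rather than raising a `ZeroDivisionError`.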



In [10]:
# Set up the language model and chains
model = OpenAI(temperature=0)
pronoun_chain = LLMChain(llm=model, prompt=pronoun_prompt)
parse_chain = LLMChain(llm=model, prompt=parse_prompt)
parser = PydanticOutputParser(pydantic_object=ComparisonResult)
compare_prompt_with_instructions = PromptTemplate(
    template=compare_prompt.template + "\n{format_instructions}",
    input_variables=["context_list", "answer_list"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)
compare_chain = LLMChain(llm=model, prompt=compare_prompt_with_instructions)

# Example usage
context = "The quick brown fox jumps over the rock because he's happy. He was born in 2005. The hedgehog was born in 2010, but she's even happier than him."
answer = "The quick brown fox was born in 2005, and the hedgehog in 2010. The quick brown fox is not as happy as the hedgehog"

result = process_data(context, answer)

metrics = calculate_metrics(result["comparison_result"])

print("Context with replaced pronouns:")
print(result["context_replace_pronouns"])

print("\nAnswer with replaced pronouns:")
print(result["answer_replace_pronouns"])

print("\nContext list:")
print(result["context_list"])

print("\nAnswer list:")
print(result["answer_list"])

print("\nComparison result:")
print(result["comparison_result"])

print("\nMetrics:")
print(f"Groundedness: {metrics['groundedness']:.2f}%")
print(f"Thoroughness: {metrics['thoroughness']:.2f}%")

Context with replaced pronouns:

The quick brown fox jumps over the rock because the fox is happy. The fox was born in 2005. The hedgehog was born in 2010, but the hedgehog is even happier than the fox.

Answer with replaced pronouns:

The quick brown fox was born in 2005, and the hedgehog in 2010. The quick brown fox is not as happy as the hedgehog.

Context list:

<facts>
1. The quick brown fox jumps over the rock.
2. The fox is happy.
3. The fox was born in 2005.
4. The hedgehog was born in 2010.
5. The hedgehog is even happier than the fox.
</facts>

Answer list:

<facts>
1. The quick brown fox was born in 2005.
2. The hedgehog was born in 2010.
3. The quick brown fox is not as happy as the hedgehog.
</facts>

Comparison result:
facts_in_both=['The quick brown fox was born in 2005.', 'The hedgehog was born in 2010.'] facts_only_in_answer=['The quick brown fox is not as happy as the hedgehog.'] facts_only_in_context=['The quick brown fox jumps over the rock.', 'The fox is happy.', '
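
The context and answer lists printed above are single strings wrapped in `<facts>` tags, not Python lists. If real lists are needed downstream, one option is a small post-processing step like the sketch below. The helper `extract_facts` is a hypothetical addition, not part of the notebook, and it assumes the model follows the numbered-list format requested in `parse_prompt`.

```python
import re


def extract_facts(tagged_output: str) -> list[str]:
    """Hypothetical helper: pull numbered facts out of a <facts>...</facts> block."""
    match = re.search(r"<facts>(.*?)</facts>", tagged_output, flags=re.DOTALL)
    # Fall back to the whole string if the model omitted the tags
    body = match.group(1) if match else tagged_output

    facts = []
    for line in body.splitlines():
        # Drop a leading "1. " or "1) " style number, then surrounding whitespace
        cleaned = re.sub(r"^\s*\d+[.)]\s*", "", line).strip()
        if cleaned:
            facts.append(cleaned)
    return facts


sample = """<facts>
1. The quick brown fox was born in 2005.
2. The hedgehog was born in 2010.
3. The quick brown fox is not as happy as the hedgehog.
</facts>"""

answer_facts = extract_facts(sample)
print(answer_facts)
```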

## Run on one pair of statements

In [11]:
context = "To boil pasta, first bring a large pot of salted water to a rolling boil over high heat."
answer = "To boil pasta, begin by filling a large pot with water, making sure there's enough to fully submerge the pasta. Bring the water to a rolling boil over high heat, then add salt to enhance the pasta's flavor. Once the water is boiling, carefully add the pasta, stirring gently to prevent sticking. Cook the pasta according to the package instructions or until it reaches your desired level of tenderness, usually around 8-12 minutes. To check for doneness, taste a piece of pasta—it should be tender but still slightly firm (al dente)."

result = process_data(context, answer)

metrics = calculate_metrics(result["comparison_result"])

print("Context with replaced pronouns:")
print(result["context_replace_pronouns"])

print("\nAnswer with replaced pronouns:")
print(result["answer_replace_pronouns"])

print("\nContext list:")
print(result["context_list"])

print("\nAnswer list:")
print(result["answer_list"])

print("\nComparison result:")
print(result["comparison_result"])

print("\nMetrics:")
print(f"Groundedness: {metrics['groundedness']:.2f}%")
print(f"Thoroughness: {metrics['thoroughness']:.2f}%")

Context with replaced pronouns:

To boil pasta, first bring a large pot of salted water to a rolling boil over high heat.

Answer with replaced pronouns:

To boil pasta, begin by filling a large pot with water, making sure there's enough water to fully submerge the pasta. Bring the water to a rolling boil over high heat, then add salt to enhance the pasta's flavor. Once the water is boiling, carefully add the pasta, stirring gently to prevent the pasta from sticking. Cook the pasta according to the package instructions or until the pasta reaches your desired level of tenderness, usually around 8-12 minutes. To check for doneness, taste a piece of pasta—it should be tender but still slightly firm (al dente).

Context list:

<facts>
1. To boil pasta, you need to bring a large pot of salted water to a rolling boil.
2. The water should be brought to a rolling boil over high heat.
3. The pot should be large.
4. The water should be salted.
5. The heat should be high.
</facts>

Answer list:



## Run on a list of dictionaries - return DF

In [28]:
data_list = [
    {
        'context': 'The quick brown fox jumps over the rock because he\'s happy. He was born in 2005. The hedgehog was born in 2010, but she\'s even happier than him.',
        'answer': 'The quick brown fox was born in 2005, and the hedgehog in 2010. The quick brown fox is not as happy as the hedgehog'
    },
    {
        'context': 'The sun is a star at the center of our solar system. It is about 93 million miles away from Earth. The sun is a hot ball of glowing gases that provides light and warmth to Earth.',
        'answer': 'The sun is a star located approximately 93 million miles from Earth. It is the source of light and heat for our planet. The sun is not a solid object, but rather a sphere of hot glowing gases.'
    },
    {
        'context': 'Birds are warm-blooded vertebrates that lay eggs and have feathers, wings, and beaks. There are over 10,000 species of birds worldwide. Some common bird species include sparrows, pigeons, and parrots.',
        'answer': 'Birds are a diverse group of animals with feathers and wings. They are warm-blooded egg-laying vertebrates. The number of bird species globally exceeds 10,000. Pigeons, parrots, and sparrows are among the most familiar bird types.'
    },
    {
        'context': 'The Eiffel Tower is a wrought-iron lattice tower located on the Champ de Mars in Paris, France. It was constructed from 1887 to 1889 and stands at a height of 324 meters. The tower is named after Gustave Eiffel, whose company designed and built it.',
        'answer': 'The Eiffel Tower, found in Paris, France, is a lattice tower made of wrought iron. Built between 1887 and 1889, it reaches a height of 324 meters. Gustave Eiffel\'s company was responsible for the tower\'s design and construction, hence its name.'
    },
    {
        'context': 'The Great Wall of China is a series of fortifications and walls built across the historical northern borders of ancient Chinese states and Imperial China. The most well-known sections were built during the Ming dynasty, which ruled from 1368 to 1644.',
        'answer': 'The Great Wall of China, a series of walls and fortifications, was constructed along the northern borders of ancient Chinese states and Imperial China. The Ming dynasty, which lasted from 1368 to 1644, is responsible for the construction of the most famous sections of the wall.'
    }
]

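# Note: this call passes a list of dicts and expects a DataFrame back, so it relies on a
# batch version of process_data that is not shown in this section; the two-argument
# process_data(context, answer) defined above would raise a TypeError here.
result_df = process_data(data_list)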

[2, 1, 3]
[4, 1, 2]
[8, 1, 3]
[5, 8, 1]
[4, 0, 0]
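
The batch call above takes a list of dictionaries, which the two-argument `process_data(context, answer)` from the Functions section does not accept. A minimal sketch of how such a batch wrapper might look is given below. The names `process_pairs` and `fake_process_one` are hypothetical; the stand-in function replaces the LLM pipeline so the example runs offline, and this is not the notebook's actual batch implementation.

```python
from typing import Callable


def process_pairs(data_list: list[dict], process_one: Callable[[str, str], dict]) -> list[dict]:
    """Hypothetical batch wrapper: run a per-pair function over a list of dicts."""
    rows = []
    for item in data_list:
        result = process_one(item["context"], item["answer"])
        # Keep the original inputs next to the per-pair outputs
        rows.append({"context": item["context"], "answer": item["answer"], **result})
    return rows


def fake_process_one(context: str, answer: str) -> dict:
    # Stand-in for the LLM pipeline (assumption); returns simple lengths instead of facts
    return {"context_length": len(context), "answer_length": len(answer)}


rows = process_pairs(
    [{"context": "The sun is a star.", "answer": "The sun is a star."}],
    fake_process_one,
)
print(rows)
```

With pandas installed, passing `rows` to `pandas.DataFrame` would give one row per context/answer pair, which matches the shape of the result table in this section.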


In [29]:
result_df

| | context | answer | context_replace_pronouns | answer_replace_pronouns | context_list | answer_list | classifications | groundedness | thoroughness |
|---|---|---|---|---|---|---|---|---|---|
| 0 | The quick brown fox jumps over the rock becaus... | The quick brown fox was born in 2005, and the ... | The quick brown fox jumps over the rock becaus... | The quick brown fox was born in 2005, and the ... | [The quick brown fox jumps over the rock., The... | [The quick brown fox was born in 2005., The he... | [[The quick brown fox was born in 2005., The h... | 66.666667 | 40.0 |
| 1 | The sun is a star at the center of our solar s... | The sun is a star located approximately 93 mil... | The sun is a star at the center of our solar s... | The sun is a star located approximately 93 mil... | [The sun is a star at the center of our solar ... | [The sun is a star., The sun is located approx... | [[The sun is a star., The sun is located appro... | 80.0 | 66.666667 |
| 2 | Birds are warm-blooded vertebrates that lay eg... | Birds are a diverse group of animals with feat... | Birds are warm-blooded vertebrates that lay eg... | Birds are a diverse group of animals with feat... | [Birds are warm-blooded vertebrates., Birds la... | [Birds are a diverse group of animals with fea... | [[Birds are warm-blooded vertebrates., Birds l... | 88.888889 | 72.727273 |
| 3 | The Eiffel Tower is a wrought-iron lattice tow... | The Eiffel Tower, found in Paris, France, is a... | The Eiffel Tower is a wrought-iron lattice tow... | The Eiffel Tower, found in Paris, France, is a... | [The Eiffel Tower is a wrought-iron lattice to... | [The Eiffel Tower is found in Paris, France., ... | [[The Eiffel Tower is located on the Champ de ... | 38.461538 | 83.333333 |
| 4 | The Great Wall of China is a series of fortifi... | The Great Wall of China, a series of walls and... | The Great Wall of China is a series of fortifi... | The Great Wall of China, a series of walls and... | [The Great Wall of China is a series of fortif... | [The Great Wall of China is a series of walls ... | [[The Great Wall of China is a series of walls... | 100.0 | 100.0 |
