# Prompt-Response Alignment Score Example

This notebook demonstrates how to use the Prompt-Response Alignment Metric to evaluate how well responses align with their prompts.

In [1]:
from examples.llm_aware_metrics.code.alignment_score import PromptAlignmentMetric
from llm_metrics.semantic_similarity_metrics import BERTScore

## Example Data
One example prompt and three responses with varying degrees of alignment to it.

In [2]:
prompt1 = "What are the main causes of climate change?"

# Well-aligned response
response1 = "The main causes of climate change include greenhouse gas emissions from burning fossil fuels, deforestation, and industrial processes."

# Partially aligned response
response2 = "Climate change is a serious issue. We need to reduce pollution and plant more trees."

# Poorly aligned response
response3 = "The weather has been quite unusual lately. Yesterday it rained all day."

## Using the Alignment Metric

In [7]:
# Initialize the metric
base_metric = BERTScore(model_type="microsoft/deberta-xlarge-mnli")
alignment_metric = PromptAlignmentMetric(base_metric)
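
The cell above wraps a base similarity metric in a prompt-aware metric. The internals of `PromptAlignmentMetric` are not shown in this notebook, so the following is only a toy sketch of that wrapping pattern, under stated assumptions: `token_overlap` and `ToyPromptAlignment` are hypothetical names, the base metric is plain token overlap rather than BERTScore, and weighting response similarity by the mean prompt alignment is one possible combination rule, not necessarily the one the real class uses.

```python
from collections import Counter


def token_overlap(candidate: str, reference: str) -> dict:
    """Toy stand-in for a base metric: token-level precision/recall/F1."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # tokens shared by both texts
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


class ToyPromptAlignment:
    """Hypothetical wrapper: scales response similarity by prompt alignment."""

    def __init__(self, base_metric):
        self.base_metric = base_metric

    def calculate_prompt_alignment(self, prompt: str, response: str) -> dict:
        # Assumption: alignment is the base metric applied to (response, prompt).
        return self.base_metric(response, prompt)

    def calculate_with_prompt(self, response_a: str, response_b: str, prompt: str) -> dict:
        similarity = self.base_metric(response_a, response_b)
        align_a = self.calculate_prompt_alignment(prompt, response_a)["f1"]
        align_b = self.calculate_prompt_alignment(prompt, response_b)["f1"]
        weight = (align_a + align_b) / 2  # assumed combination rule
        return {name: value * weight for name, value in similarity.items()}


toy_metric = ToyPromptAlignment(token_overlap)
print(toy_metric.calculate_with_prompt("a b c", "a b d", "a b"))
```

Like the real metric in this notebook, the sketch returns a dict with `precision`, `recall`, and `f1` keys rather than a single number.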

In [11]:
# Compare the well-aligned response with the partially aligned one
score1 = alignment_metric.calculate_with_prompt(
    response1,
    response2,
    prompt1
)

print(f"Alignment score with partially aligned response:\n{score1}")

Alignment score with partially aligned response:
{'precision': 0.6514176527659098, 'recall': 0.5639405051867167, 'f1': 0.600588838259379}


In [12]:
# Compare with poorly aligned response
score2 = alignment_metric.calculate_with_prompt(
    response1,
    response3,
    prompt1
)

print(f"Alignment score with poorly aligned response:\n{score2}")

Alignment score with poorly aligned response:
{'precision': 0.5123771925767263, 'recall': 0.4340182642141978, 'f1': 0.4655380845069885}
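
To make the difference between the two results easier to read, the gap can be computed per component. This is a minimal sketch: the numbers are copied from the two outputs above, and `score_gap` is a hypothetical helper, not part of the metric's API.

```python
# Scores copied from the two cell outputs above.
score_with_response2 = {
    "precision": 0.6514176527659098,
    "recall": 0.5639405051867167,
    "f1": 0.600588838259379,
}
score_with_response3 = {
    "precision": 0.5123771925767263,
    "recall": 0.4340182642141978,
    "f1": 0.4655380845069885,
}


def score_gap(a: dict, b: dict) -> dict:
    """Per-component difference a - b for two score dicts with the same keys."""
    return {name: a[name] - b[name] for name in a}


gap = score_gap(score_with_response2, score_with_response3)
for name, value in gap.items():
    print(f"{name}: {value:+.3f}")
```

The pair containing `response3` scores lower on every component, with an F1 gap of roughly 0.135.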


## Analyzing Components of the Score

In [13]:
# Break down the components of the alignment score.
# At least one of the calls below returns a dict (precision/recall/f1, as in the
# outputs above) rather than a float, so applying ':.3f' to it directly raises
# "TypeError: unsupported format string passed to dict.__format__".
def format_score(score):
    # Format every component of a dict score; fall back to a plain number.
    if isinstance(score, dict):
        return ", ".join(f"{name}={value:.3f}" for name, value in score.items())
    return f"{score:.3f}"

def analyze_alignment(response1, response2, prompt):
    # Get individual components
    alignment1 = alignment_metric.calculate_prompt_alignment(prompt, response1)
    alignment2 = alignment_metric.calculate_prompt_alignment(prompt, response2)
    response_similarity = base_metric.calculate(response1, response2)

    print(f"Response 1 Prompt Alignment: {format_score(alignment1)}")
    print(f"Response 2 Prompt Alignment: {format_score(alignment2)}")
    print(f"Response Similarity: {format_score(response_similarity)}")

print("Analysis with partially aligned response:")
analyze_alignment(response1, response2, prompt1)

print("\nAnalysis with poorly aligned response:")
analyze_alignment(response1, response3, prompt1)