## Check templates

In [1]:
import jinja2

In [2]:
environment = jinja2.Environment(loader=jinja2.FileSystemLoader('prompts'))
template = environment.get_template('movierecaps_coherence.jinja')
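
Before rendering, it can help to list which variables a template actually expects. The following is a minimal sketch using `jinja2.meta.find_undeclared_variables`. `SAMPLE_SOURCE` and `expected_variables` are hypothetical names introduced here for illustration, and the inline template text is only a stand-in for the real file under `prompts/`.

```python
import jinja2
from jinja2 import meta

# Hypothetical inline stand-in for a prompt file such as those under 'prompts/'.
SAMPLE_SOURCE = (
    "*Question:* {{ question }}\n"
    "*Claims Extracted from Model Response:*\n"
    "{{ claims }}\n"
)


def expected_variables(source: str) -> set[str]:
    """Return the names a template reads but never defines itself."""
    env = jinja2.Environment()
    parsed = env.parse(source)  # build the template AST without rendering
    return meta.find_undeclared_variables(parsed)


variables = expected_variables(SAMPLE_SOURCE)
print(sorted(variables))
```

For a file-backed template, the source text should be obtainable with `environment.loader.get_source(environment, 'movierecaps_coherence.jinja')[0]`, assuming the loader configured above.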

In [5]:
# Render with sample inputs to inspect the final prompt text.
print(template.render(
    claims=["The movie was thrilling from start to finish."],
    question="Inception is a mind-bending thriller that keeps viewers on the edge of their seats.",
))

You are an expert coherence evaluator for video question answering systems. Your task is to evaluate the internal logical coherence of a model's response by comparing the claims it makes against each other. A coherent response should not contain claims that contradict one another or are excessively redundant.

## Input Information
*Question:* Inception is a mind-bending thriller that keeps viewers on the edge of their seats.
*Claims Extracted from Model Response:*
['The movie was thrilling from start to finish.']

## Evaluation Task
For EACH claim in the model response, evaluate its logical consistency by checking against:
- All other claims in the same response.
- The goal is to identify internal contradictions, not to check against external facts.

## Scoring Rubric
For each claim, assign ONE of these labels:
- CONSISTENT (CO): The claim is logically consistent with all other claims in the response. It introduces new information or builds on previous claims without conflict.
- REDUND

In [6]:
# The environment from In [2] can be reused; only the template changes.
template = environment.get_template('movierecaps_factuality.jinja')
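
With four inputs, a forgotten argument is easy to miss, because by default Jinja2 renders an undefined variable as an empty string. As a hedged sketch, an environment built with `jinja2.StrictUndefined` raises an error instead. `FACTUALITY_SOURCE`, `strict_env`, and `render_or_report` are hypothetical names, and the inline template is only a stand-in for the real prompt file.

```python
import jinja2

# Hypothetical inline stand-in for the factuality prompt; not the real file.
FACTUALITY_SOURCE = (
    "*Question:* {{ question }}\n"
    "*Claims:* {{ claims }}\n"
    "*Facts:* {{ facts }}\n"
    "*Context:* {{ context }}\n"
)

# StrictUndefined makes any use of a missing variable raise UndefinedError.
strict_env = jinja2.Environment(undefined=jinja2.StrictUndefined)
strict_template = strict_env.from_string(FACTUALITY_SOURCE)


def render_or_report(template, **variables):
    """Render the template, or return a message naming the missing variable."""
    try:
        return template.render(**variables)
    except jinja2.UndefinedError as exc:
        return f"missing variable: {exc}"


complete = render_or_report(
    strict_template,
    question="Q?",
    claims=["A claim."],
    facts=["A fact."],
    context="Some dialogue.",
)
# 'context' is deliberately left out to trigger the error path.
incomplete = render_or_report(
    strict_template, question="Q?", claims=["A claim."], facts=["A fact."]
)
print(complete)
print(incomplete)
```

The same option could be passed alongside `loader=jinja2.FileSystemLoader('prompts')` when building the notebook's environment, if strict checking is wanted for the real templates.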

In [7]:
print(template.render(
    claims=["The movie was thrilling from start to finish."],
    question="Inception is a mind-bending thriller that keeps viewers on the edge of their seats.",
    facts=["Inception was released in 2010.", "The director of Inception is Christopher Nolan."],
    context="Inception is a science fiction film that explores the concept of dream invasion.",
))

You are an expert factuality evaluator for video question answering systems. Your task is to evaluate the factual accuracy of claims made in a model’s response by comparing them against ground truth atomic facts and dialogue from the same video segment.

## Input Information
*Question:* Inception is a mind-bending thriller that keeps viewers on the edge of their seats.
*Claims Extracted from Model Response:*
['The movie was thrilling from start to finish.']
*Ground Truth Atomic Facts from Video Segment:*
['Inception was released in 2010.', 'The director of Inception is Christopher Nolan.']
*SRT Dialogue Context:*
Inception is a science fiction film that explores the concept of dream invasion.

## Evaluation Task
For EACH claim in the model response, evaluate its factual accuracy by checking against:
1. The ground truth atomic facts
2. The SRT dialogue context

## Scoring Rubric
For each claim, assign ONE of these labels:
- *SUPPORTED (S)*: The claim is directly supported by the ground 