In [1]:
from data.samples import get_goal_driven_examples
from llm_explain.llm.propose import create_proposer_diff_prompt_body
from llm_explain.models.diff import explain_diff

### Comparing datasets of airline reviews.

This dataset contrasts reviews of United Airlines (a U.S. carrier, reviews in English) with reviews of Air France (mostly in French). The goal is to understand how the two airlines differ in aspects of their service.

If we do not add any constraint, the system reports that the latter "is more often in French", which is true but not useful for this goal.

Below we show the prompt we use, then compare the system's output with and without the goal constraint.

In [2]:
dataset = get_goal_driven_examples(seed=42)
print(dataset["context"])
print(dataset["constraint"])

I'm trying to decide which airline (United or Air France) to fly on, I want to understand the difference between aspects of the service.
The predicate should be about aspects of the service, and does NOT mention airline names (United or Air France), positive or negative classes, or language (French or English). Be specific, for example, 'has a positive sentiment' is not a good predicate, but 'complains about flight delays' is a good predicate.
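
To make the constraint concrete, the sketch below checks candidate predicates against the kinds of mentions it rules out. The `violates_constraint` helper and its term list are hypothetical and not part of `llm_explain`; in this notebook the constraint is passed into the prompt as text, and this filter is only an illustration.

```python
import re

# Hypothetical list of terms the constraint rules out: airline names,
# class labels, and languages. Not taken from llm_explain.
BANNED_TERMS = [
    "united", "air france",
    "positive class", "negative class",
    "french", "english",
]

def violates_constraint(predicate, banned_terms=BANNED_TERMS):
    """Return True if the predicate mentions a banned term as a whole word or phrase."""
    text = predicate.lower()
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", text)
        for term in banned_terms
    )

for p in ["is in French", "complains about flight delays"]:
    print(p, "->", "rejected" if violates_constraint(p) else "kept")
```

A keyword filter like this cannot judge whether a predicate is specific enough (the second half of the constraint); that part still relies on the model following the instruction.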


### The prompt that includes the goal

(Pay attention to the last part of the prompt printed by the next cell, starting from "Here is some context about the text x_samples".)

In [3]:
prompt = create_proposer_diff_prompt_body(x_samples=dataset['X'], y=dataset['Y'], constraint=dataset['constraint'], context=dataset['context'], num_explanations=3)
print(prompt)
                                          


Here are two sets of text x_samples.

Some x_samples from the negative class:
Negative class sample.0:  on United Airlines, cabin lighting was well adjusted
Negative class sample.1:  on United Airlines, safety instructions were clear
Negative class sample.2:  on United Airlines, seat recline mechanism worked smoothly
Negative class sample.3:  on United Airlines, plane looked new and modern inside
Negative class sample.4:  on United Airlines, temperature remained comfortable
Negative class sample.5:  on United Airlines, convenient flight times and connections
Negative class sample.6:  on United Airlines, wifi connection was reliable
Negative class sample.7:  on United Airlines, plenty of legroom in economy
Negative class sample.8:  on United Airlines, minimal turbulence during the flight
Negative class sample.9:  on United Airlines, smooth takeoff and landing
Negative class sample.10:  on United Airlines, entertainment system worked great
Negative class sample.11:  on United Airlines, 

### Explaining differences WITH the goal-based constraint

In [4]:
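args = {
    "proposer_num_rounds": 5,  # number of proposal rounds
    "proposer_num_explanations_per_round": 3,  # explanations requested per round
    "proposer_precise": False,
    **dataset,  # forwards the dataset fields, including context and constraint
}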
result = explain_diff(**args)

print(result)

Printing top 3 explanations:
Explanation: comments on the excellence of in-flight hospitality
Accuracy: 0.8571428571428572

Explanation: expresses satisfaction with the dining experience
Accuracy: 0.8214285714285714

Explanation: expresses satisfaction with the dining experience
Accuracy: 0.8214285714285714
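
The top-3 list above contains the same explanation twice. A minimal sketch of deduplicating ranked explanations follows, assuming they are available as `(explanation, accuracy)` pairs. The structure of `result` is not shown in this notebook, so the pair format and the `dedupe_explanations` helper are assumptions.

```python
def dedupe_explanations(pairs):
    """Keep the first occurrence of each explanation, preserving order."""
    seen = set()
    unique = []
    for explanation, accuracy in pairs:
        key = explanation.strip().lower()  # case- and whitespace-insensitive match
        if key not in seen:
            seen.add(key)
            unique.append((explanation, accuracy))
    return unique

# Values copied from the output printed above.
top = [
    ("comments on the excellence of in-flight hospitality", 0.8571428571428572),
    ("expresses satisfaction with the dining experience", 0.8214285714285714),
    ("expresses satisfaction with the dining experience", 0.8214285714285714),
]

for explanation, accuracy in dedupe_explanations(top):
    print(f"{accuracy:.3f}  {explanation}")
```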




### Explaining differences WITHOUT the goal-based constraint

In [5]:
# Drop the goal-related fields so that explain_diff receives no context or constraint.
del dataset["constraint"]
del dataset["context"]

args = {
    "proposer_num_rounds": 5,
    "proposer_num_explanations_per_round": 3,
    "proposer_precise": False,
    **dataset,
}
result = explain_diff(**args)

print(result)

Printing top 3 explanations:
Explanation: uses French language
Accuracy: 0.8928571428571428

Explanation: includes phrases in French
Accuracy: 0.8928571428571428

Explanation: is in French
Accuracy: 0.8928571428571428
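
One way to read the accuracy numbers is to treat an explanation as a boolean predicate over each sample and count how often its truth value agrees with the class label. The sketch below assumes this definition; the notebook does not show how `explain_diff` scores explanations. The samples, labels, and keyword-based `looks_french` check are made up for illustration.

```python
def predicate_accuracy(predicate, samples, labels):
    """Fraction of samples where predicate(sample) equals the boolean label."""
    matches = sum(predicate(s) == bool(y) for s, y in zip(samples, labels))
    return matches / len(samples)

# Toy stand-in for a model's judgment: flags a few common French words.
FRENCH_WORDS = {"le", "la", "était", "très", "et"}

def looks_french(text):
    return any(word in FRENCH_WORDS for word in text.lower().split())

# Made-up samples; label 1 marks the positive class, 0 the negative class.
samples = [
    "le repas était très bon",
    "la cabine était propre",
    "great meal and friendly crew",
    "wifi connection was reliable",
]
labels = [1, 1, 1, 0]

print(predicate_accuracy(looks_french, samples, labels))  # 0.75
```

A language predicate can score well under this measure while saying nothing about the service, which is why the goal constraint matters here.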


