# Self-Ask Exploration

The goal is to test self-ask prompting on complex questions from QAMPARI and RoMQA.

We'll start with QAMPARI, getting all of the complex questions from the dev set.

## Results

Created a prompt using hand-selected questions from the training data.
Ran the prompt on 100 questions of each type and automatically evaluated whether the predicted question type was correct.
Then manually annotated whether each decomposition was correct and whether the ground-truth (GT) question type was itself correct.

Results (percent correct per question type; the left two columns include the manual corrections):
```
                          | Type Correct %      Decomp Correct % | Auto Type Correct %
---------------------------------------------------------------------------------------
simple                    |    96.00                100.00       |       96.00
composition               |    85.00                 83.00       |       77.00
intersection              |    80.81                 80.81       |       61.62
```
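
The manually corrected columns can be reproduced from the automatic counts plus the number of examples annotated as "GT Wrong". The sketch below is only an illustration of that arithmetic: the per-type counts are inferred from the reported percentages (for example, 61.62% is 61/99), not read from the output files, and `corrected_percentages` is a hypothetical helper name.

```python
# Counts inferred from the percentages reported above (an assumption, not file contents).
# type: (num_total, auto_type_correct, auto_decomp_correct, num_gt_wrong)
counts = {
    "simple": (100, 96, None, 0),
    "composition": (100, 77, 75, 8),
    "intersection": (99, 61, 61, 19),
}


def corrected_percentages(total, auto_type, auto_decomp, gt_wrong):
    """Return (type %, decomp %) after crediting examples whose GT label was wrong."""
    type_pct = 100.0 * (auto_type + gt_wrong) / total
    # Simple questions need no decomposition, so all of them count as correct.
    decomp_right = total if auto_decomp is None else auto_decomp + gt_wrong
    decomp_pct = 100.0 * decomp_right / total
    return round(type_pct, 2), round(decomp_pct, 2)


table = {qtype: corrected_percentages(*vals) for qtype, vals in counts.items()}
for qtype, (type_pct, decomp_pct) in table.items():
    print(f"{qtype:15} type={type_pct:6.2f} decomp={decomp_pct:6.2f}")
```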

In [1]:
import jsonlines
import json
import numpy as np
import random
import openai
from collections import defaultdict

import multiqa_utils.openai_utils as ou
import multiqa_utils.qampari_utils as qu
import multiqa_utils.decomposition_utils as du

%load_ext autoreload
%autoreload 2

In [2]:
output_dir = "/scratch/ddr8143/multiqa/qampari_data/decomposition_v0/"

**Load & Sort QAMPARI Data**

In [3]:
# These calls use the function defaults; at the time of writing, the default paths were:
#qmp_dev_data_path = "/scratch/ddr8143/multiqa/downloads/data/qampari/dev_data.jsonl"
#qmp_train_data_path = "/scratch/ddr8143/multiqa/downloads/data/qampari/train_data.jsonl"
qmp_dev = qu.load_wikidata_dev_data()
qmp_train = qu.load_wikidata_train_data()
print("Num Dev Wikidata Qs:", len(qmp_dev))
print("Num Train Wikidata Qs:", len(qmp_train))

Num Dev Wikidata Qs: 815
Num Train Wikidata Qs: 56075


In [4]:
for i in range(10):
    print(i, qmp_train[i]['question_text'])

0 Which movie, clip, TV show etc. had Chezhiyan as director of photography?
1 Which movie, clip, TV show etc. had Andrew Droz Palermo as director of photography?
2 Which movie, clip, TV show etc. had Nizar Shafi as director of photography?
3 Which movie, clip, TV show etc. had Steven Soderbergh as director of photography?
4 Which movie, clip, TV show etc. had Haris Savides as director of photography?
5 Which movie, clip, TV show etc. had Philip H. Lathrop as director of photography?
6 Which movie, clip, TV show etc. had Claude Lelouch as director of photography?
7 Which movie, clip, TV show etc. had Russ Meyer as director of photography?
8 Which movie, clip, TV show etc. had Herman Schopp as director of photography?
9 Which movie, clip, TV show etc. had Peter Hyams as director of photography?


In [5]:
for i in range(10):
    print(i, qmp_dev[i]['question_text'])

0 What manga was drawn by Ryoichi Ikegami?
1 Harmony Korine was both screenwriter and director of what movie?
2 Where did administrators of the UN Development Programme attend school?
3 Who directed a film that had P. Balachandran as a screenwriter?
4 The Russian Empire has what ships registered to it?
5 What player was selected in the draft by the Seattle Storm?
6 Where was a Bishop of Bradford taught?
7 What movies did Scott Z. Burns screenwrite?
8 What piece of literature did David G. Hartwell edit?
9 For which movie did Mani Ratnam work on the script and serve as producer?


## Create Final Prompts with Train Data

**Sample an equal number of questions from each type for the prompt**

In [6]:
# Example of how I created the fixed lists below
qtype_id_lists = qu.split_dataset_by_question_type(qmp_train)
inds_to_use = qu.random_sample_n_per_type(qtype_id_lists, 10)

wikidata_simple:          28574, first 5: [0, 1, 2, 3, 4]
wikidata_comp:            25200, first 5: [28574, 28575, 28576, 28577, 28578]
wikidata_intersection:    2301, first 5: [53774, 53775, 53776, 53777, 53778]
wikidata_simple:            10, first 5: [28381, 18396, 5283, 6085, 17607]
wikidata_comp:              10, first 5: [47899, 46792, 43017, 40833, 30750]
wikidata_intersection:      10, first 5: [54560, 55473, 54972, 53785, 54033]
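
For readers without access to `multiqa_utils`, here is a rough sketch of what splitting by question type and sampling per type could look like. It assumes the question type is the middle field of a qid such as `374__wikidata_intersection__dev`; the function names `split_by_question_type` and `sample_n_per_type` are hypothetical and are not the `qu` helpers themselves.

```python
import random
from collections import defaultdict


def split_by_question_type(dataset):
    """Map each question type (taken from the qid) to a list of dataset indices."""
    type_to_inds = defaultdict(list)
    for ind, qdata in enumerate(dataset):
        # Assumed qid layout: "<number>__<question type>__<split>"
        qtype = qdata["qid"].split("__")[1]
        type_to_inds[qtype].append(ind)
    return dict(type_to_inds)


def sample_n_per_type(type_to_inds, n, seed=0):
    """Draw n indices per type without replacement, reproducibly."""
    rng = random.Random(seed)
    return {qtype: rng.sample(inds, n) for qtype, inds in type_to_inds.items()}


# Toy dataset (hypothetical) with four questions of each type.
toy_types = ["wikidata_simple", "wikidata_comp", "wikidata_intersection"] * 4
toy_data = [{"qid": f"{i}__{t}__train"} for i, t in enumerate(toy_types)]

toy_split = split_by_question_type(toy_data)
toy_sample = sample_n_per_type(toy_split, 2)
print(toy_sample)
```

Fixing the seed, or hard-coding the sampled indices as done in the next cell, keeps the prompt examples stable across reruns.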


In [7]:
# Fix the inds for train data to make prompts ("random" sampling variant)
# Train Data
inds_to_use = {
    "wikidata_simple": [16043, 4484, 11212, 7078, 22726, 22999, 13441, 26081, 8947, 25243],
    "wikidata_comp": [49871, 53056, 40308, 36854, 44983, 40413, 44919, 51530, 45160, 38873],
    "wikidata_intersection": [55083, 54453, 54839, 53887, 55449, 55315, 55284, 54935, 54931, 55950],
}

In [8]:
# Dev Data
dev_inds_to_use = {
    "wikidata_simple": [152, 189, 76, 536, 591, 197, 76, 478, 258, 463, 573, 501, 648, 591, 401, 392, 404, 158, 790, 238],
    "wikidata_comp": [515, 522, 723, 154, 678, 671, 202, 79, 343, 719, 584, 62, 767, 107, 802, 662, 379, 488, 618, 569],
    "wikidata_intersection": [735, 380, 582, 92, 71, 178, 365, 410, 596, 677, 553, 633, 600, 1, 780, 611, 652, 673, 63, 222],
}

In [9]:
## Load my manual decompositions to use for prompt creation
manual_train_decomp_file = f"{output_dir}manual_decompositions_train.json"
#json.dump(official_train_decomp, open(manual_train_decomp_file, 'w+'))
manual_train_decomp = {int(k): v for k, v in json.load(open(manual_train_decomp_file)).items()}
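
The `int(k)` conversion above is needed because JSON object keys are always strings, so integer indices do not survive a round trip. A small illustration with a made-up record:

```python
import json

# Made-up decomposition record keyed by an integer dataset index.
decomp = {16043: {"question_type": "simple"}}

round_tripped = json.loads(json.dumps(decomp))
print(list(round_tripped))  # ['16043'] (the key is now a string)

# Restore integer keys so lookups by dataset index work again.
restored = {int(k): v for k, v in round_tripped.items()}
print(restored == decomp)  # True
```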

In [14]:
## An old prompt example here
#qu.qdata_to_print_prompt_v0(qmp_train, manual_train_decomp, num_each=3, include_simple=False, shuffle=True)

## But the actual prompt we want to use is:
curr_best_prompt = du.get_qmp_decomp_prompt_base_v1(qmp_train, manual_train_decomp)
print(curr_best_prompt)

Instructions:
Choose the question type out of: composition or intersection.

Simple questions only require one piece of information. Example:

“Question: Which software, art, etc. has Don Broco as performer?
Question Type: simple.
Explanation: This is a simple question because we only need to know what Don Broco has performed in.”

Composition questions require getting one answer and then getting more information about that answer. Example:

“Question: What are the dates of death of persons that were a member of the political party Australian Labor Party (Anti-Communist)?
Question Type: composition.
Explanation: This is a complosition question because we need to know who were members of the political party and then we need to get additional information about each of them.”

Intersection questions require getting answers to two questions and then combining them. Example:

“Question: Which film has M. G. Ramachandran as a member of its cast and has J. Jayalalithaa as a member of its cast

## Use GPT3 API

**Setup Using GPT3 Prompt**

In [11]:
ou.setup_apikey()

>> API key set


In [13]:
# Example
res, res_text = ou.prompt_openai(
    "Now that we know Elon Musk is an adventurer, I think his next project is",
    #engine='code-davinci-002',
    engine='text-davinci-003',
)
print(res_text)

 going to be something that pushes the boundaries of exploration and discovery. He could be working on a project to explore the depths of the ocean, or to develop a new type of spacecraft that could take us to the outer reaches of our solar system. He could also be working on a project to develop new technologies that could help us better understand the universe around us. Whatever it is, I'm sure it will be something that will be groundbreaking and inspiring.


In [None]:
# Next goal: sample the data and run it through, evaluate it and then report the results!

**Sample the data to evaluate**

**Then we can make individual prompts like so:**

In [22]:
# Choose the question to evaluate
to_eval_dev_inds_file = f"{output_dir}dev_indices_100random_each_for_decomp.json"
to_eval_dev_inds = json.load(open(to_eval_dev_inds_file))
ev_ind = to_eval_dev_inds['wikidata_comp'][0]

# Load the necessary data
#qmp_train = qu.load_wikidata_train_data()
#qmp_dev = qu.load_wikidata_dev_data()
manual_train_decomp = du.load_manual_train_decomp('qampari')

# Create the prompt
qdata = qmp_dev[ev_ind]
qprompt = du.get_qmp_decomp_prompt_v1(qmp_train, manual_train_decomp, qdata)

# Run the query
_, res_text = ou.prompt_openai(
    qprompt,
    engine='code-davinci-002',
)
print("Res text:", res_text, "\n------------")
q_output = du.prompt_output_to_eval_dict(qdata, qprompt, res_text, manual_check=True)

Res text:  composition.
Question 1: What movies were written by Charles Edward Pogue?
Question 2: Who directed [ANS1]? 
------------
>> Pred type correct: composition

Question: Who directed the movie that was written by Charles Edward Pogue?
Question Type: composition.
Question 1: What movies were written by Charles Edward Pogue?
Question 2: Who directed [ANS1]?



Is the decomposition correct? ('T' for yes) >> 
Any notes on this example? >> 


----------------------
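
The completion above has a regular shape: the predicted type on the first line, then numbered sub-questions. A minimal parsing sketch under that assumption follows; `parse_decomposition` is a hypothetical name and is not the logic inside `du.prompt_output_to_eval_dict`.

```python
import re


def parse_decomposition(res_text):
    """Split a completion into the predicted question type and its sub-questions."""
    lines = [line.strip() for line in res_text.strip().split("\n") if line.strip()]
    if not lines:
        return {"question_type": None, "subquestions": []}
    # The first line holds the type, e.g. " composition."
    qtype = lines[0].rstrip(".").lower()
    subquestions = []
    for line in lines[1:]:
        match = re.match(r"Question \d+:\s*(.+)", line)
        if match:
            subquestions.append(match.group(1))
    return {"question_type": qtype, "subquestions": subquestions}


example = (
    " composition.\n"
    "Question 1: What movies were written by Charles Edward Pogue?\n"
    "Question 2: Who directed [ANS1]? "
)
parsed = parse_decomposition(example)
print(parsed)
```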



In [24]:
### Let's compare between codex and text-davinci-003

In [30]:
qmp_dev = qu.load_dev_data()

In [31]:
len(qmp_dev)

1000

In [None]:
du.decompose_with_prompt(
    data_to_decompose=qmp_dev,
    outfile=f"{output_dir}decomposed_codex_dev_full_v1.jsonl",
    engine="code-davinci-002",
    progress_increment=10,
    rate_limit=5,
)

In [3]:
qmp_train_simple = [d for d in qu.load_train_data() if 'simple' in d['qid']]

In [54]:
prompt_ex = ou.get_simple_prompt_v1(qmp_train_simple[0])
_, res_text = ou.prompt_openai(prompt_ex, engine='code-davinci-002')

In [56]:
print(res_text)

 To Let (film), Kalloori, Thenmerku Paruvakaatru, A Little Dream, Paradesi (2013 film)
Answer Type: movie, clip, TV show
Pages: Chezhiyan


In [57]:
q_output = du.prompt_output_to_eval_dict(qmp_train_simple[0], prompt_ex, res_text, manual_check=True)

>> Pred type wrong: intersection when gt was simple

Question: Which movie, clip, TV show etc. had Chezhiyan as director of photography?
Sampled Answers: To Let (film), Kalloori, Thenmerku Paruvakaatru, A Little Dream, Paradesi (2013 film)
Answer Type: movie, clip, TV show
Pages: Chezhiyan



Any notes on this example? >> 


----------------------



In [4]:
results = du.loadjsonl(f"{output_dir}ans_type_codex_train_full_v1.jsonl")
len(results)

## This is what we're currently using

In [3]:
ou.setup_apikey()

>> API key set


In [4]:
qmp_train_simple = [d for d in qu.load_train_data() if 'simple' in d['qid']]

In [6]:
ou.process_with_prompt(
    data_to_process=qmp_train_simple,
    outfile=f"{output_dir}ans_type_codex_train_full_v1.jsonl",
    engine="code-davinci-002",
    progress_increment=10,
    rate_limit=5,
)

Initial data len: 28574
  - after loading: 8711 new len: 19863
>> [0.0] elem 0 / 19,863
Raw time: 0.9112896919250488
  -> Total time: 15.92078185081482
Raw time: 0.7652661800384521
  -> Total time: 15.77501916885376
Raw time: 0.8572874069213867
  -> Total time: 15.864985942840576
Raw time: 1.4910893440246582
  -> Total time: 16.504010915756226
Raw time: 1.6692101955413818
  -> Total time: 16.676008462905884
Raw time: 1.0338151454925537
  -> Total time: 16.04398536682129
Raw time: 0.8742802143096924
  -> Total time: 15.877997875213623
Raw time: 0.6670444011688232
  -> Total time: 15.678998470306396
Raw time: 1.0471737384796143
  -> Total time: 16.055996417999268
Raw time: 1.112807273864746
  -> Total time: 16.126007080078125
>> [2.7] elem 10 / 19,863
Raw time: 0.7657840251922607
  -> Total time: 15.777994871139526
Raw time: 1.0000569820404053
  -> Total time: 16.012028217315674
Raw time: 0.46544814109802246
  -> Total time: 15.478972911834717
Raw time: 0.5580527782440186
  -> Total time

KeyboardInterrupt: 
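
The log above shows the run skipping items that were already in the output file and pacing its requests, which is what makes it safe to interrupt. A rough sketch of that pattern, assuming one JSON object per line keyed by `qid`; `process_resumable` and `min_interval_s` are hypothetical names, and the real `ou.process_with_prompt` may work differently.

```python
import json
import os
import tempfile
import time


def process_resumable(data, outfile, fn, min_interval_s=0.0):
    """Apply fn to each unfinished item, appending one JSON line per result."""
    done = set()
    if os.path.exists(outfile):
        with open(outfile) as f:
            done = {json.loads(line)["qid"] for line in f if line.strip()}
    todo = [d for d in data if d["qid"] not in done]
    with open(outfile, "a") as f:
        for item in todo:
            start = time.time()
            f.write(json.dumps({"qid": item["qid"], "output": fn(item)}) + "\n")
            f.flush()  # keep finished work on disk if the run is interrupted
            # Wait so that consecutive calls start at least min_interval_s apart.
            remaining = min_interval_s - (time.time() - start)
            if remaining > 0:
                time.sleep(remaining)
    return len(todo)


# Demo with a stand-in for the API call: a second run only handles new items.
demo_data = [{"qid": f"{i}__wikidata_simple__train"} for i in range(3)]
demo_path = os.path.join(tempfile.mkdtemp(), "demo.jsonl")
first_run = process_resumable(demo_data[:2], demo_path, lambda d: d["qid"].upper())
second_run = process_resumable(demo_data, demo_path, lambda d: d["qid"].upper())
print(first_run, second_run)
```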

**And finally we can put it all together to evaluate a tiny test set**

In [83]:
decomp_dev_100random_file = f"{output_dir}decomposed_dev_100random_prompt_v1.jsonl"

In [84]:
ddtiny_file = decomp_dev_100random_file + "__tiny_test"

In [85]:
to_eval_dev_inds_tiny = {k: v[:2] for k, v in to_eval_dev_inds.items()}
to_eval_dev_inds_tiny

{'wikidata_simple': [29, 108],
 'wikidata_comp': [606, 352],
 'wikidata_intersection': [296, 773]}

In [86]:
du.decompose_indmap_with_prompt(
    to_eval_dev_inds=to_eval_dev_inds_tiny,
    outfile=ddtiny_file,
    progress_increment=1,
)

Starting to decompose wikidata_simple, 2 queries to go
>> elem 0
>> elem 1
Starting to decompose wikidata_comp, 2 queries to go
>> elem 0
>> elem 1
Starting to decompose wikidata_intersection, 2 queries to go
>> elem 0
>> elem 1
Fnished Decomp & Wrote: /scratch/ddr8143/multiqa/qampari_data/decomposition_v0/decomposed_dev_100random_prompt_v1.jsonl__tiny_test


In [106]:
stats, new_out = du.process_prompt_outputs(ddtiny_file, manual_check=False)

In [107]:
print("And finally, the result will be:")
stats

And finally, the result will be:


{'incorrect_ids': {},
 'num_total': {'simple': 2, 'composition': 2, 'intersection': 2},
 'num_correct_type': {'simple': 2, 'composition': 2, 'intersection': 2},
 'correct_type_percent': {'simple': 100.0,
  'composition': 100.0,
  'intersection': 100.0}}
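
The per-type accuracy in `stats` can be thought of as a simple tally over the evaluated records. The sketch below assumes each record carries `gt_type` and `pred_type_correct` fields (both names appear in the annotation loop later in this notebook); `type_accuracy` is a hypothetical helper, not the code inside `du.process_prompt_outputs`.

```python
from collections import Counter


def type_accuracy(records):
    """Return the percentage of records with a correctly predicted type, per GT type."""
    total, correct = Counter(), Counter()
    for record in records:
        total[record["gt_type"]] += 1
        correct[record["gt_type"]] += int(record["pred_type_correct"])
    return {qtype: 100.0 * correct[qtype] / total[qtype] for qtype in total}


# Toy records (made up for illustration).
toy_records = [
    {"gt_type": "simple", "pred_type_correct": True},
    {"gt_type": "simple", "pred_type_correct": True},
    {"gt_type": "composition", "pred_type_correct": True},
    {"gt_type": "composition", "pred_type_correct": False},
]
toy_accuracy = type_accuracy(toy_records)
print(toy_accuracy)
```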

### Now let's annotate the decomposition of the full output of the 100 samples

In [111]:
# First get the overall stats
stats, _ = du.process_prompt_outputs(decomp_dev_100random_file, manual_check=False)
stats['correct_type_percent']

{'simple': 96.0, 'composition': 77.0, 'intersection': 61.61616161616162}

In [115]:
stats_full.keys()

dict_keys(['failed_queries', 'incorrect_ids', 'num_total', 'num_correct_type', 'correct_type_percent', 'num_correct_decomp', 'correct_decomp_percent'])

In [116]:
stats_full['correct_type_percent']

{'simple': 96.0, 'composition': 77.0, 'intersection': 61.61616161616162}

In [117]:
stats_full['correct_decomp_percent']

{'simple': 0.0, 'composition': 75.0, 'intersection': 61.61616161616162}

In [123]:
ma_updates = {k: 0 for k in stats_full['correct_decomp_percent'].keys()}
for ma in new_out_decomp_dev_100random:
    if not ma['pred_type_correct']:
        if ma['note'] == "GT Wrong":
            ma_updates[ma['gt_type']] += 1

In [132]:
for k, v in ma_updates.items():
    total_right_type = ma_updates[k] + stats_full['num_correct_type'][k]
    if k == "simple":
        total_right_decomp = stats_full['num_total']['simple']
    else:
        total_right_decomp = ma_updates[k] + stats_full['num_correct_decomp'][k]
    total = stats_full['num_total'][k]
    print(f"{k:25} | Type Correct: {total_right_type * 100.0 / total:0.2f} Decomp Correct: {total_right_decomp * 100.0 / total:0.2f} | Auto Type Correct: {stats_full['correct_type_percent'][k]:0.2f}")

simple                    | Type Correct: 96.00 Decomp Correct: 100.00 | Auto Type Correct: 96.00
composition               | Type Correct: 85.00 Decomp Correct: 83.00 | Auto Type Correct: 77.00
intersection              | Type Correct: 80.81 Decomp Correct: 80.81 | Auto Type Correct: 61.62


## Explore Prompt Formats Using Dev Data

**Directly Use 4-shot Self-Ask Prompt**

In [None]:
# Taken from: https://github.com/ofirpress/self-ask/blob/main/self-ask_plus_search-engine_demo.ipynb
base_prompt = ['''Question: Who lived longer, Muhammad Ali or Alan Turing?
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali 

Question: When was the founder of craigslist born?
Are follow up questions needed here: Yes.
Follow up: Who was the founder of craigslist?
Intermediate answer: Craigslist was founded by Craig Newmark.
Follow up: When was Craig Newmark born?
Intermediate answer: Craig Newmark was born on December 6, 1952.
So the final answer is: December 6, 1952

Question: Who was the maternal grandfather of George Washington?
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball 

Question: Are both the directors of Jaws and Casino Royale from the same country? 
Are follow up questions needed here: Yes. 
Follow up: Who is the director of Jaws? 
Intermediate Answer: The director of Jaws is Steven Spielberg. 
Follow up: Where is Steven Spielberg from? 
Intermediate Answer: The United States. 
Follow up: Who is the director of Casino Royale? 
Intermediate Answer: The director of Casino Royale is Martin Campbell. 
Follow up: Where is Martin Campbell from? 
Intermediate Answer: New Zealand. 
So the final answer is: No

Question: ''', 
'''
Are follow up questions needed here:''', ]

In [None]:
# Also taken from: https://github.com/ofirpress/self-ask/blob/main/self-ask_plus_search-engine_demo.ipynb
# But then modified

#def promptf(question, prompt, intermediate = "\nIntermediate answer:", followup = "Follow up:", finalans= '\nSo the final answer is:'):
INTERMEDIATE = "\nIntermediate answer:"
FOLLOWUP = "Follow up:"
FINALANS = "\nSo the final answer is:"
def printprompt(qid, devqs, prompt):
    question = devqs[qid]['question_text']
    cur_prompt = prompt[0] +  question + prompt[1]

    print(cur_prompt, end ='')

    """
    ret_text = call_gpt(cur_prompt, intermediate)

    while followup in get_last_line(ret_text):

      
      cur_prompt += ret_text
      question = extract_question(ret_text)
      external_answer = get_answer(question)

      if external_answer is not None:
        cur_prompt += intermediate + ' ' + external_answer + '.'
        print(intermediate + ' ' + yellowfy(external_answer) + '.', end='' )
        ret_text = call_gpt(cur_prompt, intermediate)
      else:
        #We only get here in the very rare case that Google returns no answer.
        cur_prompt += intermediate
        print(intermediate + ' ')
        gpt_answer = call_gpt(cur_prompt, ['\n'+followup, finalans])
        cur_prompt += gpt_answer

    
    if finalans not in ret_text:
      cur_prompt += finalans
      print(finalans, end = '')
      ret_text = call_gpt(cur_prompt, '\n')

    return cur_prompt + ret_text
    """

In [None]:
printprompt(0, qmp_comp_devd, base_prompt)
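
The disabled block inside `printprompt` is the self-ask loop: the model proposes a follow-up question, an external source answers it, and the answer is appended before the model continues. A simplified, runnable sketch of that loop follows. `call_lm` and `lookup` are injected stand-ins (assumptions) for the model call and the search engine, and the model is assumed to stop generating just before `Intermediate answer:`.

```python
FOLLOWUP = "Follow up:"
INTERMEDIATE = "\nIntermediate answer:"


def last_line(text):
    return text.rstrip().split("\n")[-1]


def self_ask(question, prompt, call_lm, lookup, max_steps=4):
    """Alternate between model-generated follow-ups and externally supplied answers."""
    cur_prompt = prompt[0] + question + prompt[1]
    ret_text = call_lm(cur_prompt)
    for _ in range(max_steps):
        if FOLLOWUP not in last_line(ret_text):
            break  # the model produced a final answer instead of a follow-up
        sub_question = last_line(ret_text).split(FOLLOWUP, 1)[1].strip()
        cur_prompt += ret_text + INTERMEDIATE + " " + lookup(sub_question) + "."
        ret_text = call_lm(cur_prompt)
    return cur_prompt + ret_text


# Scripted stand-ins so the sketch runs without an API key or search engine.
scripted = iter([
    " Yes.\nFollow up: Who was the founder of craigslist?",
    "\nFollow up: When was Craig Newmark born?",
    "\nSo the final answer is: December 6, 1952",
])
facts = {
    "Who was the founder of craigslist?": "Craigslist was founded by Craig Newmark",
    "When was Craig Newmark born?": "Craig Newmark was born on December 6, 1952",
}
transcript = self_ask(
    "When was the founder of craigslist born?",
    ["Question: ", "\nAre follow up questions needed here:"],
    call_lm=lambda cur_prompt: next(scripted),
    lookup=lambda sub_question: facts[sub_question],
)
print(transcript)
```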

**Now Choose a Subset of Questions and Try Them**

In [None]:
for i in range(30):
    offset = 200
    if "wikidata_comp" not in qmp_comp_devd[i+offset]['qid']:
        continue
    print(i+offset, qmp_comp_devd[i+offset]['question_text'])

In [None]:
inds_to_try = [
    0,  # 0. screenwriter and director
    3,  # 1. where taught
    10, # 2. competed and won
    27, # 3. film Steinbeck wrote
    32, # 4. institution where educated
    44, # 5. graduated from two places
    59, # 6. which one did person win
    232,# 7. company produced written
    255,# 8. composer for movie produced
    347,# 9. objects person designed depicted what
]

In [None]:
for i in inds_to_try:
    print(qmp_comp_devd[i]['question_text'])

In [None]:
test_ind = 0
print(qmp_comp_devd[inds_to_try[test_ind]]['question_text'])
print([a['answer_text'] for a in qmp_comp_devd[inds_to_try[test_ind]]['answer_list']])

print()

printprompt(inds_to_try[0], qmp_comp_devd, base_prompt)

In [6]:
inds_to_type = {
    "composition": [228, 229, 230, 231, 23, 24, 27, 28, 50, 51, 53, 54, 57],
    "intersection": [222, 226, 20, 21, 22, 25, 26, 55, 56, 58],
    "filter": [223, 225, 227, 29, 52, 59]
}

**Make a better prompt**

In [7]:
# Can we make a prompt out of these?
for ii in sorted(inds_to_type['filter']):#range(10):
    i = ii + 0
    print(i, qmp_comp_devd[i]['qid'])
    print(qmp_comp_devd[i]['question_text'])
    print([a['answer_text'] for a in qmp_comp_devd[i]['answer_list']])
    print()

29 374__wikidata_intersection__dev
Which PGA Champsionship did Jack Nicklaus win?
['1980 PGA Championship', '1975 PGA Championship', '1971 PGA Championship', '1973 PGA Championship', '1963 PGA Championship']

52 418__wikidata_intersection__dev
Which Monaco Grand Prix was won by Michael Schumacher?
['1999 Monaco Grand Prix', '1994 Monaco Grand Prix', '1997 Monaco Grand Prix', '2001 Monaco Grand Prix', '1995 Monaco Grand Prix']

59 103__wikidata_intersection__dev
Which Monaco Grand Prix did Ayrton Senna win?
['1993 Monaco Grand Prix', '1989 Monaco Grand Prix', '1987 Monaco Grand Prix', '1990 Monaco Grand Prix', '1991 Monaco Grand Prix', '1992 Monaco Grand Prix']

223 129__wikidata_intersection__dev
In which FA Cup Final did Blackburn Rovers F.C. compete?
['1882 FA Cup Final', '1960 FA Cup Final', '1891 FA Cup Final', '1890 FA Cup Final', '1884 FA Cup Final', '1885 FA Cup Final', '1928 FA Cup Final', '1886 FA Cup Final']

225 329__wikidata_intersection__dev
Which Super Bowl did the San Fr

In [54]:
"""
[{
        'question_text': '',
        'question_type': '',
        'subquestions': [
            '',
            '',
        ]
    }],
"""
decomposed_qs_intersection = {
    0: [{
        'question_text': 'Harmony Korine was both screenwriter and director of what movie?',
        'question_type': 'intersection',
        'subquestions': [
            'Harmony Korine was the screenwriter of what movie?',
            'Harmony Korine was the director of what movie?',
        ]
    }],
    22: [{ # intersection
        "question_text": "Who was both a graduate from Ananda College and University of Ceylon?",
        "question_type": "intersection",
        "subquestions": [
            "Who graduated from Ananda College?",
            "Who graduated from University of Ceylon?",
        ],
    }],
    25: [{
        'question_text': 'Which movie had K. S. L. Swamy as its director and Vijaya Bhaskar as its musical composer?',
        'question_type': 'intersection',
        'subquestions': [
            'Which movie has K. S. L. Swamy as its director?',
            'Which movie has Vijaya Bhaskar as its musical composer?',
        ]
    }],
    29: [{
        "question_text": "Which PGA Champsionship did Jack Nicklaus win?",
        "question_type": "intersection",
        "subquestions": [
            "Which PGA Champsionship have been played?",
            "What games has Jack Nicklaus won?"
        ]
    }],
    90: [{ # intersection (but I see filter too)
        "question_text": "What Superbowls did the Washington Football Team play in?",
        "question_type": "intersection",
        "subquestions": [
            "Which Superbowls were played?",
            "Which games did the Washington Football Team play in?"
        ]
    }],
    95: [{
        'question_text': 'What music was composed by Devi Sri Prasad and produced by Dil Raju?',
        'question_type': 'intersection',
        'subquestions': [
            'What music was composed by Devi Sri Prasad?',
            'What music was produced by Dil Raju?',
        ]
    }],
    380: [{ # intersection
        "question_text": "Which competition had Vitória F.C. and S.L. Benfica as participants?",
        "question_type": "intersection",
        "subquestions": [
            "Which competition had Vitória F.C. as a participant?",
            "Which competition had S.L. Benfica as a participant?",

        ],
    }],
    385: [{
        'question_text': 'What movie did Irwin Allen both direct and produce?',
        'question_type': 'intersection',
        'subquestions': [
            'What movie did Irwin Allen direct?',
            'What movie did Irwin Allen produce?',
        ]
    }],
}

"""
[{
            'question_text': '',
            'question_type': '',
            'subquestions': [
                '',
                '',
            ]
        }],
"""
decomposed_qs_composition = {
    11: [{
        'question_text': 'Joe Pasternak produced a motion picture that was directed by who?',
        'question_type': 'composition',
        'subquestions': [
            'What motion pictures did Joe Pasternak produce?',
            'Who directed it?',
        ]
    }],
    110: [{
        'question_text': 'Who directed the TV show that Tom Kauffman worked on as a screenwriter?',
        'question_type': 'composition',
        'subquestions': [
            'What TV shows did Tom Kauffman work on as a screenwriter?',
            'Who directed it?',
        ]
    }],
    120: [{
        'question_text': 'Where did former Bishops of Warrington go to school?',
        'question_type': 'composition',
        'subquestions': [
            'Who were the former Bishops of Warrington?',
            'Where did they go to school?',
        ]
    }],
    150: [{ # compositional
        "question_text": "Who was credited as director for a movie penned by Peter Baynham?",
        "question_type": "composition",
        "subquestions": [
            "What movies were penned by Peter Baynham?",
            "Who was credited the director for it?"
        ],
    }],
    221: [{
        'question_text': 'Where did an employee of University of Pennsylvania Law School receive their education?',
        'question_type': 'composition',
        'subquestions': [
            'Who was an employee of University of Pennsylvania Law School?',
            'Where did they receive their education?',
        ]
    }],
}

# many of these are intersections that are better answered as filters
decomposed_qs_filter = {
    29: [{
        "question_text": "Which PGA Champsionship did Jack Nicklaus win?",
        "question_type": "filter",
        "subquestions": [
            "Which PGA Champsionship have been played?",
            "Did Jack Nicklaus win it?"
        ]
    }],
    52: [{
        "question_text": "Which Monaco Grand Prix was won by Michael Schumacher?",
        "question_type": "filter",
        "subquestions": [
            "Which Monaco Grand Prix have been held?",
            "Did Michael Schumacher win it?"
        ]
    }],
    59: [{
        "question_text": "Which Monaco Grand Prix did Ayrton Senna win?",
        "question_type": "filter",
        "subquestions": [
            "Which Monaco Grand Prix have been held?",
            "Did Ayrton Senna win it?"
        ]
    }],
    90 : [{ # intersection (but I see filter too)
        "question_text": "What Superbowls did the Washington Football Team play in?",
        "question_type": "filter",
        "subquestions": [
            "Which Superbowls have been played?",
            "Did the Washington Football Team play in it?"
        ]
    }],
    155: [{
        'question_text': 'Which FA Cup Final featured the Tottenham Hotspurs as competitors?',
        'question_type': 'filter',
        'subquestions': [
            'Which FA Cup Finals were played?',
            'Did it feature the Tottenham Hotspurs as a competitor?',
        ]
    }],
    223: [{
        "question_text": "In which FA Cup Final did Blackburn Rovers F.C. compete?",
        "question_type": "filter",
        "subquestions": [
            "Which FA Cup Finals have been played?",
            "Did Blackburn Rovers F.C. compete in it?"
        ]
    }],
}
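
The three question types differ in how the sub-question answers are combined. The sketch below illustrates one reading of each type using made-up data; the function names and the toy films, directors, and races are all hypothetical.

```python
def answer_intersection(answers_1, answers_2):
    """Intersection: answer both sub-questions independently, keep the overlap."""
    return sorted(set(answers_1) & set(answers_2))


def answer_composition(first_answers, follow_up):
    """Composition: ask the second question about every answer to the first."""
    results = set()
    for answer in first_answers:
        results.update(follow_up(answer))
    return sorted(results)


def answer_filter(candidates, keep):
    """Filter: enumerate candidates, then keep those passing a yes/no question."""
    return [c for c in candidates if keep(c)]


# Toy data (hypothetical, for illustration only).
directed = ["Film A", "Film B", "Film C"]
wrote = ["Film B", "Film C", "Film D"]
directors = {"Film A": ["Dir X"], "Film B": ["Dir Y", "Dir X"]}
winners = {"Race 1": "Driver P", "Race 2": "Driver Q", "Race 3": "Driver P"}

both = answer_intersection(directed, wrote)
composed = answer_composition(["Film A", "Film B"], lambda film: directors[film])
filtered = answer_filter(list(winners), lambda race: winners[race] == "Driver P")
print(both, composed, filtered)
```

A filter is attractive when the first sub-question has a small, enumerable answer set (such as every Monaco Grand Prix), while the second sub-question of the intersection form ("What games has Jack Nicklaus won?") may have a very large one.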

In [None]:
qmp_comp_devd[155]['qid']

In [50]:
# Then, let's focus on pure intersections for now as the first trial q is a pure intersection
"""
Question: Who lived longer, Muhammad Ali or Alan Turing?
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali 
"""
#Is this a composition, filter or intersection question: {qtype}

def qdata_to_prompt(qdata, qdecompose):
    #Question 2: {subqs2}""".format(
    return """
Question: {init_q}
Can this be decomposed: Yes.
Is this a composition or intersection question: {qtype}.
Question 1: {subqs1}
Question 2: {subqs2}
So the final answers are: {answer_list}.""".format(
        init_q=qdata['question_text'],
        qtype=qdecompose['question_type'],
        subqs1=qdecompose['subquestions'][0],
        subqs2=qdecompose['subquestions'][1],
        answer_list=", ".join(list(set([a['answer_text'] for a in qdata['answer_list']]))[:5]),
    )

In [None]:
qind = 22
print(qdata_to_prompt(qmp_comp_devd[qind], decomposed_qs_intersection[qind][0]))

In [None]:
# Intersection Prompt
test_qind = 385
for qind in [
    0, 
    22, 
    25, 
    95,
    385
]:
    if qind == test_qind:
        continue
    print(qdata_to_prompt(qmp_comp_devd[qind], decomposed_qs_intersection[qind][0]))
print()
print("Question: " + qmp_comp_devd[test_qind]['question_text'])
print("Can this be decomposed:")

In [None]:
# Composition Prompt
test_qind = 11
for qind in [
    11, 
    110, 
    120, 
    150,
    221
]:
    if qind == test_qind:
        continue
    print(qdata_to_prompt(qmp_comp_devd[qind], decomposed_qs_composition[qind][0]))
print()
print("Question: " + qmp_comp_devd[test_qind]['question_text'])
print("Can this be decomposed:")

In [None]:
# Combined: get q type too
test_qind = 25
curr_qs_comp = {**decomposed_qs_composition, **decomposed_qs_intersection}
for qind in [
    0, 
    11, 
    110, 
    22, 
    120, 
    25, 
]:
    if qind == test_qind:
        continue
    print(qdata_to_prompt(qmp_comp_devd[qind], curr_qs_comp[qind][0]))
print()
print("Question: " + qmp_comp_devd[test_qind]['question_text'])
print("Can this be decomposed:")
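
The prompt cells in this section all follow the same leave-one-out pattern: print every worked example except the held-out one, then end with the held-out question left open. A sketch that returns the prompt as a string instead of printing it; `build_leave_one_out_prompt` and the toy example blocks are hypothetical.

```python
def build_leave_one_out_prompt(example_blocks, test_id, test_question):
    """Join every formatted example except test_id, then append the open question."""
    shots = [block for qid, block in example_blocks.items() if qid != test_id]
    tail = f"Question: {test_question}\nCan this be decomposed:"
    return "\n\n".join(shots + [tail])


# Toy, already formatted examples keyed by dataset index (made up for illustration).
toy_blocks = {
    0: "Question: Q0?\nCan this be decomposed: Yes.\nQuestion 1: Q0a?\nQuestion 2: Q0b?",
    11: "Question: Q11?\nCan this be decomposed: Yes.\nQuestion 1: Q11a?\nQuestion 2: Q11b?",
    25: "Question: Q25?\nCan this be decomposed: Yes.\nQuestion 1: Q25a?\nQuestion 2: Q25b?",
}
toy_prompt = build_leave_one_out_prompt(toy_blocks, test_id=25, test_question="Q25?")
print(toy_prompt)
```

Returning a string makes it straightforward to pass the same prompt to the completion call rather than copying printed output.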

In [31]:
# Filter prompt
test_qind = 223
curr_qs_comp = {**decomposed_qs_composition, **decomposed_qs_intersection, **decomposed_qs_filter}
for qind in [
    29,
    52,
    90,
    155,
    223,
]:
    if qind == test_qind:
        continue
    print(qdata_to_prompt(qmp_comp_devd[qind], curr_qs_comp[qind][0]))
print()
print("Question: " + qmp_comp_devd[test_qind]['question_text'])
print("Can this be decomposed:")


Question: Which PGA Champsionship did Jack Nicklaus win?
Can this be decomposed: Yes.
Is this a composition, filter or intersection question: filter.
Question 1: Which PGA Champsionship have been played?
Question 2: Did Jack Nicklaus win it?
So the final answers are: 1980 PGA Championship, 1971 PGA Championship, 1975 PGA Championship, 1963 PGA Championship, 1973 PGA Championship.

Question: Which Monaco Grand Prix was won by Michael Schumacher?
Can this be decomposed: Yes.
Is this a composition, filter or intersection question: filter.
Question 1: Which Monaco Grand Prix have been held?
Question 2: Did Michael Schumacher win it?
So the final answers are: 1999 Monaco Grand Prix, 1997 Monaco Grand Prix, 1994 Monaco Grand Prix, 2001 Monaco Grand Prix, 1995 Monaco Grand Prix.

Question: What Superbowls did the Washington Football Team play in?
Can this be decomposed: Yes.
Is this a composition, filter or intersection question: filter.
Question 1: Which Superbowls have been played?
Questio

In [55]:
# Combined: get q type too
test_qind = 40
curr_qs_comp = {**decomposed_qs_composition, **decomposed_qs_intersection}#, **decomposed_qs_filter}
for qind in sorted([
    # Intersection
    0, 
    22, 
    29, # FILTER
    #25, 
    95,
    385,
    # Composition
    11, 
    110, 
    120, 
    150,
    221,
    # Filter
    #29,
    #52,
    #90,
    #155,
    #223,
]):
    if qind == test_qind:
        continue
    print(qdata_to_prompt(qmp_comp_devd[qind], curr_qs_comp[qind][0]))
    
for test_qind in [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 52, 74, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 184, 205, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 360, 361, 383]:
    #test_qind = i+40
    #if "int" not in qmp_comp_devd[test_qind]['qid']:
    #    continue
    print()
    print(test_qind)
    print("Question: " + qmp_comp_devd[test_qind]['question_text'])
    print("Can this be decomposed:")


Question: Harmony Korine was both screenwriter and director of what movie?
Can this be decomposed: Yes.
Is this a composition or intersection question: intersection.
Question 1: Harmony Korine was the screenwriter of what movie?
Question 2: Harmony Korine was the director of what movie?
So the final answers are: Gummo, The Beach Bum, Mister Lonely, Spring Breakers, Julien Donkey-Boy.

Question: Joe Pasternak produced a motion picture that was directed by who?
Can this be decomposed: Yes.
Is this a composition or intersection question: composition.
Question 1: What motion pictures did Joe Pasternak produce?
Question 2: Who directed it?
So the final answers are: Richard Wallace, Erich Schönfelder, Norman Taurog, Charles Walters, Richard Thorpe.

Question: Who was both a graduate from Ananda College and University of Ceylon?
Can this be decomposed: Yes.
Is this a composition or intersection question: intersection.
Question 1: Who graduated from Ananda College?
Question 2: Who graduated f

In [None]:
test_qind = 221
for qind in [
    11, 
    110, 
    120, 
    150,
    221
]:
    if qind == test_qind:
        continue
    print(qdata_to_prompt(qmp_comp_devd[qind], decomposed_qs_composition[qind][0]))
print()
print("Question: " + qmp_comp_devd[test_qind]['question_text'])
print("Can this be decomposed:")

In [None]:
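def comp_ans(gta_str, preda_str):
    """Print how many ground-truth (GT) answers appear among the predicted answers."""
    # Normalize by dropping spaces and lowercasing. Note that splitting on ", " also
    # splits answers that contain a comma (e.g. "University of California, Berkeley").
    gta = {a.replace(' ', '').lower() for a in gta_str.split(", ")}
    preda = {a.replace(' ', '').lower() for a in preda_str.split(", ")}
    print("Num pred answers:", len(preda))
    num_in = len(gta & preda)

    print(f"Num GT pred: {num_in} / {len(gta)}")
    print(f"Num pred not GT: {len(preda) - num_in}")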

In [None]:
comp_ans(
    gta_str="University of Pennsylvania, University of Oklahoma, New College of Florida, University of Pennsylvania Law School, University of Toronto, Alfred University, Southwestern College, University of Pennsylvania, Stanford Law School, Merton College, St Antony's College, Harvard Law School, Harvard University, Yale University, Reading High School, Smith College, Harvard University, Yale Law School, University of Pennsylvania Law School, University of Chicago Law School, University of Pennsylvania Law School, Lower Merion High School, Massachusetts Institute of Technology, Islamic Azad University, University of Pennsylvania Law School, Northwestern University School of Law, Harvard University, University of California, Harvard Law School, University of North Carolina at Chapel Hill, University of Chicago, Yale Law School, University of Pennsylvania Law School, Yale Law School, Princeton University, Somerville College, Cornell University, University of Oklahoma, University of Michigan Law School, University of Pennsylvania Law School, University of Michigan, University of Pennsylvania Law School, University of Pennsylvania Law School",
    preda_str="Stanford University, Harvard University, Yale University, University of California, Berkeley, University of Virginia, Columbia University, University of Michigan, Duke University, University of Pennsylvania, Georgetown University, Cornell University, Northwestern University",
)
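
The counts printed by `comp_ans` map directly onto set-level precision and recall. A sketch that takes answer lists rather than comma-joined strings, which avoids splitting answers that themselves contain a comma; `answer_set_scores` is a hypothetical helper and the toy answers are for illustration only.

```python
def normalize(answer):
    """Same normalization as comp_ans: drop spaces and lowercase."""
    return answer.replace(" ", "").lower()


def answer_set_scores(gt_answers, pred_answers):
    """Return precision, recall and F1 of predicted answers against the GT set."""
    gt = {normalize(a) for a in gt_answers}
    pred = {normalize(a) for a in pred_answers}
    hits = len(gt & pred)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(gt) if gt else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


scores = answer_set_scores(
    gt_answers=["Harvard University", "Yale University"],
    pred_answers=["harvard university", "University of California, Berkeley"],
)
print(scores)
```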