In [11]:
from tqdm import tqdm
import argparse
from utils import *
from freebase import *
from propagation import *


parser = argparse.ArgumentParser()
parser.add_argument("--dataset", type=str,
                    default="cwq", help="choose the dataset from {cwq, webqsp, grailqa, simpleqa, webquestions}.")
parser.add_argument("--limit", type=int,
                    default=10000, help="the approximate maximum length of the LLM input.")
parser.add_argument("--max_length", type=int,
                    default=1000, help="the maximum length of the LLM output.")
parser.add_argument("--max_retry", type=int,
                    default=10, help="the maximum number of retries on failure.")
parser.add_argument("--temperature", type=float,
                    default=0., help="the sampling temperature of the LLM.")
parser.add_argument("--depth", type=int,
                    default=3, help="the depth of propagation.")
parser.add_argument("--width", type=int,
                    default=3, help="the number of relations kept.")
parser.add_argument("--llm", type=str,
                    default="llama-3", help="choose base LLM model from {llama-2, llama-3, gpt-3.5-turbo, gpt-4}.")
parser.add_argument("--openai_api_key", type=str,
                    default="", help="if the LLM is gpt-3.5-turbo or gpt-4, you need to add your own OpenAI API key.")
parser.add_argument('--verbose', action='store_true', help="print LLM input and output.")
args = parser.parse_args(["--verbose"])
# args = parser.parse_args([])  # use the defaults only


datas, question_string = prepare_dataset(args.dataset)
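
A note on the `parse_args` call in this cell: inside a notebook, `sys.argv` holds the kernel's own arguments, so passing an explicit list keeps argparse from trying to parse them. A minimal standalone sketch, using a reduced parser (`demo_parser` is illustrative and carries only two of the options defined above):

```python
import argparse

# Reduced, illustrative parser with two of the options defined above.
demo_parser = argparse.ArgumentParser()
demo_parser.add_argument("--depth", type=int, default=3)
demo_parser.add_argument("--verbose", action="store_true")

# An explicit list replaces sys.argv, so the kernel's arguments are ignored.
defaults = demo_parser.parse_args([])
custom = demo_parser.parse_args(["--depth", "2", "--verbose"])

print(defaults.depth, defaults.verbose)
print(custom.depth, custom.verbose)
```

Changing the list is then the notebook equivalent of changing the command line.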

In [15]:
# data = datas[119]
# question = data[question_string]
# topics = get_topics(data['topic_entity'])
question = "What is the state where the team whose fight song is \"Renegade\" is from?"
topics = {'m.06c78r': 'Renegade'}
paths = {topics[topic]: {} for topic in topics}
print(question)

What is the state where the team whose fight song is "Renegade" is from?


In [16]:
for topic in topics:
    topic_name = topics[topic]
    for hop in range(1, args.depth + 1):
        if hop == 1:
            relations = get_relations(question, topic, topic_name, args)
            entities = get_entities({topic: topic_name}, relations, topic)
        else:
            relations = get_relations_distant(question, topic, topic_name, relations, paths[topic_name], args)
            entities = get_entities_distant(paths[topic_name], relations, topic)
        # record the entities reached through each selected relation
        for i, r in enumerate(relations):
            paths[topic_name][r] = {"entities": entities[i]}
        paths = propagate(question, topic_name, relations, paths, args)
    # clean paths: keep only the fact for each relation
    for r in paths[topic_name]:
        paths[topic_name][r] = paths[topic_name][r]['fact']

Given the question, we have the topic of the question and its relations.

question: What is the state where the team whose fight song is "Renegade" is from?
topic: Renegade

Based on the question, please select top 3 relations from the options below to explore about the topic to answer the question and just return top 3 selected relations in a numbered list without explanation.
options: music.composition.composer, music.composition.recordings, music.composition.recorded_as_album, sports.fight_song.sports_team


Only return relations from the ones in the options given.
Here are the top 3 relations to explore:

1. sports.fight_song.sports_team
2. music.composition.recordings
3. music.composition.recorded_as_album
Given the question, we have 3 facts about its topic and related relation that may helpful to answer the question.

question: What is the state where the team whose fight song is "Renegade" is from?
topic: Renegade

Based on the question, please summarize each following fact whil
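
The relation-selection prompt in the output above asks for a numbered list drawn only from the given options. One way such a response could be turned back into a list of relations is sketched below. `parse_selected_relations` is a hypothetical helper written for this illustration; the parsing done inside `get_relations` is not shown in this notebook and may differ.

```python
import re

def parse_selected_relations(response, options, width):
    """Extract relations from a numbered list, keeping only known options."""
    selected = []
    for line in response.splitlines():
        # Match lines such as "1. sports.fight_song.sports_team".
        match = re.match(r"^\s*\d+[.)]\s*(\S+)\s*$", line)
        if not match:
            continue
        relation = match.group(1)
        # Drop anything outside the options and avoid duplicates.
        if relation in options and relation not in selected:
            selected.append(relation)
    return selected[:width]

options = [
    "music.composition.composer",
    "music.composition.recordings",
    "music.composition.recorded_as_album",
    "sports.fight_song.sports_team",
]
response = """1. sports.fight_song.sports_team
2. music.composition.recordings
3. music.composition.recorded_as_album"""

selected = parse_selected_relations(response, options, width=3)
print(selected)
```

Filtering against `options` guards against relations the model made up, and the `width` cut mirrors the "top 3" requested in the prompt.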

In [17]:
facts = construct_facts(paths, topics, args, True)
prompt = question_prompt.format(facts, question)
response = run_llm(prompt, args)
output = {"question": question, "result": response, "paths": paths}

Based on the given the facts and your own knowledge, please the answer the question as simple as possible and only return all the possible answers in a numbered list. 

facts: 
Here are some facts about topic Renegade that may related to the question.
1. The Renegade has multiple recordings.

2. The Renegade is an album.

3. The Renegade is the fight song of the Pittsburgh Steelers.
	3.1. The Pittsburgh Steelers' fight song is not "Black and Yellow", "Here We Go", or "Steelers Polka".
		3.1.1. (Not relevant to the question, so omitted)
	3.2. The Pittsburgh Steelers is located in Pittsburgh.
		3.2.1. The Pittsburgh Steelers is in Pittsburgh.
		3.2.2. Pittsburgh is in Pennsylvania.
	3.3. (Not relevant to the question)

4. "Renegade" is a 1979 hit song recorded by the American rock band Styx. It was on their Pieces of Eight album. It reached #16 on the Billboard Hot 100 in the spring of 1979.
The song is a first-person narrative of an outlaw, captured for a bounty, who recognizes that he 

In [None]:
save_2_jsonl("lmp_{}_{}_{}hop.jsonl".format(args.dataset, args.llm, args.depth), output)
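
`save_2_jsonl` comes from `utils` and its body is not shown here. As a rough sketch of the JSON Lines idea its name suggests, a helper could append one JSON object per line. The names `append_jsonl` and `read_jsonl` are hypothetical, and the real helper may behave differently:

```python
import json
import os
import tempfile

def append_jsonl(path, record):
    """Append one record as a single JSON line (hypothetical helper)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

def read_jsonl(path):
    """Read every non-empty line back as a JSON object."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

# Write two toy records to a temporary file and read them back.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.jsonl")
append_jsonl(demo_path, {"question": "q1", "result": "a1", "paths": {}})
append_jsonl(demo_path, {"question": "q2", "result": "a2", "paths": {}})
records = read_jsonl(demo_path)
print(len(records))
```

Appending keeps results from earlier questions intact when the notebook is run over many questions in turn.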

In [None]:
paths
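
To make the shape of `paths` easier to picture, here is a toy reconstruction of the cleanup step from the propagation cell. The entity lists and the flat string facts are simplifying assumptions for illustration; the values produced by `propagate` may be nested or structured differently.

```python
# Toy stand-in for `paths` before the cleanup step; the entity lists
# and the flat string facts are simplifying assumptions.
toy_paths = {
    "Renegade": {
        "sports.fight_song.sports_team": {
            "entities": ["Pittsburgh Steelers"],
            "fact": "The Renegade is the fight song of the Pittsburgh Steelers.",
        },
        "music.composition.recordings": {
            "entities": [],
            "fact": "The Renegade has multiple recordings.",
        },
    }
}

# Same idea as the cleanup step: keep only the fact for each relation.
for topic_name, relations in list(toy_paths.items()):
    toy_paths[topic_name] = {r: info["fact"] for r, info in relations.items()}

for relation, fact in toy_paths["Renegade"].items():
    print(relation, "->", fact)
```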

In [None]:
get_propagate_list(topic_name, paths[topic_name], args.limit)