# TRACE 

TRACE is a reader model enhanced with knowledge graphs (KGs) to improve generation performance: https://arxiv.org/abs/2406.11460.

In [1]:
%pip install -q python-terrier accelerate pyterrier_t5
%pip install -q pyterrier_dr 
%pip install -q pyterrier_caching 
# %pip install -q pyterrier-rag

Note: you may need to restart the kernel to use updated packages.


In [2]:
import pyterrier as pt
import pyterrier_rag as ptr 



## Retrieval Setup

Let's get a BM25 retriever from PyTerrier. We could also have used a fast PISA retriever via [PyTerrier_Pisa](https://github.com/terrierteam/pyterrier_pisa).

In [None]:
sparse_index = pt.Artifact.from_hf('pyterrier/raghpq-terrier')
bm25 = sparse_index.bm25(include_fields=['docno', 'text', 'title'])




Java started (triggered by TerrierIndex.index_ref) and loaded: pyterrier.java.colab, pyterrier.java, pyterrier.java.24, pyterrier.terrier.java [version=5.11 (build: craig.macdonald 2025-01-13 21:29), helper_version=0.0.8]


15:48:58.386 [main] WARN org.terrier.structures.BaseCompressingMetaIndex -- Structure meta reading data file directly from disk (SLOW) - try index.meta.data-source=fileinmem in the index properties file. 1.1 GiB of memory would be required.


## Reader Setup

We will use a Llama3-8B-Instruct model, accessed through the Launcher API, as the reader model that generates the final answer.

In [None]:
from fastchat.model import get_conversation_template
from pyterrier_rag.backend import OpenAIBackend 
from pyterrier_rag.prompt import PromptTransformer
from pyterrier_rag.readers import Reader

LAUNCHER_API_KEY = "put_your_key_here"  # replace with your key 

# let us define the prompt 
system_message = r"""
You are a question answering system that answers based strictly on the provided structured knowledge triples, without using any prior knowledge.

You will be given a list of reasoning paths (each is a sequence of knowledge triples), which together support answering the question.

Instructions:
1. Combine relevant facts across the triples logically.
2. If multiple paths are available, prioritize the top-3 scoring ones.
3. Provide a concise, factual answer. Do NOT explain your reasoning.
4. Do NOT say things like "Based on the context..." or "The chain indicates..."
5. Never hallucinate - only answer what is directly entailed by the reasoning paths.
"""
prompt_text = """
Question: {{ query }}

Reasoning Paths:
{{ qcontext }}

Answer:"""

template = get_conversation_template("meta-llama-3.1-sp")
prompt = PromptTransformer(
    conversation_template=template,
    system_message=system_message,
    instruction=prompt_text,
    api_type="openai"
)

# Now we define the LLM backend used in the reader model 
llm_backend = OpenAIBackend(
    model_id="llama-3-8b-instruct", 
    api_key=LAUNCHER_API_KEY,
    generation_args={"temperature":0.0, "max_tokens":256},
    base_url="http://api.terrier.org/v1"
)

# finally we can obtain a reader 
reader = Reader(llm_backend, prompt)
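
To make the two placeholders in the instruction concrete, here is a minimal sketch of how `{{ query }}` and `{{ qcontext }}` might be filled in. The `render` helper and the layout of the example reasoning paths are illustrative assumptions: `PromptTransformer` does its own templating, and the real `qcontext` format produced upstream is not shown in this notebook.

```python
import re

PROMPT_TEXT = """
Question: {{ query }}

Reasoning Paths:
{{ qcontext }}

Answer:"""


def render(template, **fields):
    """Hypothetical stand-in for the templating step: fill each {{ name }} slot."""
    def substitute(match):
        # match.group(1) is the placeholder name, e.g. "query"
        return str(fields[match.group(1)])
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


# Made-up reasoning paths, written as chains of (subject; relation; object) triples.
example_paths = "\n".join([
    "1. (Film A; directed by; Person B) -> (Person B; born in; City C)",
    "2. (Film A; released in; 1999)",
])

rendered = render(
    PROMPT_TEXT,
    query="Where was the director of Film A born?",
    qcontext=example_paths,
)
print(rendered)
```

The rendered text ends with `Answer:`, so the model's completion is read directly as the answer.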

## TRACE model

TRACE converts each retrieved document into a set of knowledge triples, then constructs KG-based reasoning chains from these triples to identify the useful information. We use `KnowledgeGraphExtractor` and `ReasoningChainGenerator` for these two steps.

In [5]:
from pyterrier_rag import KnowledgeGraphExtractor, ReasoningChainGenerator 
from pyterrier_dr import E5 
from pyterrier_caching import ScorerCache # optional 

kg_extractor = KnowledgeGraphExtractor(llm_backend) 
reasoning_chain_generator = ReasoningChainGenerator(
    llm_backend, 
    E5(), 
    dataset="hotpotqa",
    verbose=True
)

cache_path="/nfs/pyterrier/cache" 
kg_cache = ScorerCache(cache_path, kg_extractor, group=None, key="docno", value="knowledge_graph", pickle=True)  # Cache the extracted KG triples for later use
trace = kg_cache >> reasoning_chain_generator >> reader  

bm25_trace = (bm25%10) >> trace # TRACE pipeline with BM25 retrieval 

Initialized ReasoningChainGenerator with E5.base() ranking model
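
As a rough illustration of what a reasoning chain over triples looks like, the toy sketch below links triples by following entities from one hop to the next. Everything in it is an assumption made for illustration: the triples are made up, and `build_chain` is a simple heuristic, whereas the `ReasoningChainGenerator` above is configured with an LLM backend and an E5 ranking model.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)

# Made-up triples standing in for what a KG extractor might produce.
triples: List[Triple] = [
    ("Film A", "directed by", "Person B"),
    ("Film A", "released in", "1999"),
    ("Person B", "born in", "City C"),
    ("City C", "located in", "Country D"),
]


def build_chain(start: str, triples: List[Triple], max_hops: int = 3) -> List[Triple]:
    """Toy chain builder: at each hop, follow the first unused triple whose
    head entity matches the entity reached so far."""
    chain: List[Triple] = []
    used = set()
    current = start
    for _ in range(max_hops):
        candidates = [t for t in triples if t[0] == current and t not in used]
        if not candidates:
            break  # dead end: no triple continues from the current entity
        chosen = candidates[0]
        chain.append(chosen)
        used.add(chosen)
        current = chosen[2]  # move to the tail entity for the next hop
    return chain


chain = build_chain("Film A", triples, max_hops=2)
print(" -> ".join(f"({h}; {r}; {t})" for h, r, t in chain))
```

A two-hop chain like this one connects the question entity to the answer entity through an intermediate entity, which is the kind of evidence multi-hop datasets such as HotpotQA require.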


In [6]:
trace 

[Output: rendered schematic of the `trace` pipeline; the component parameter tables are summarised below.]

ScorerCache: path=/nfs/pyterrier/cache, group=query, key=docno, value=knowledge_graph, pickle=True, verbose=False
KnowledgeGraphExtractor (cached): columns docno, text -> docno, text, knowledge_graph
ReasoningChainGenerator: dataset=hotpotqa, batch_size=10, num_exemplars=3, verbose=False
E5: model_name=intfloat/e5-base-v2, batch_size=32, text_field=text, verbose=False, device=cpu
PromptTransformer: api_type=openai, input_fields=['query', 'qcontext'], output_field=prompt, expects_logprobs=False, raw_instruction=False; conversation_template is the fastchat Conversation named 'one_shot' (its built-in example dialogue is omitted here)
Backend: input_field=prompt, output_field=qanswer, batch_size=4, num_responses=1
Reader: output_field=qanswer


## Evaluation

Let's build a simple RAG pipeline for comparison.

In [7]:
from pyterrier_rag.prompt import Concatenator

system_message2 = r"""You are an expert Q&A system that is trusted around the world.
        Always answer the query using the provided context information,
        and not prior knowledge.
        rules to follow:
        1. Not directly reference the given context in your answer
        2. Avoid statements like 'Based on the context, ...' or
        'The context information ...' or anything along those lines."""

prompt_text2 = """
Question: {{ query }}

Context information is:{{ qcontext }}

Answer:"""

template = get_conversation_template("meta-llama-3.1-sp")
prompt = PromptTransformer(
    conversation_template=template,
    system_message=system_message2,
    instruction=prompt_text2,
    api_type="openai"
)

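# Vanilla RAG baseline: BM25 top-10 -> concatenate passages into qcontext -> reader.
# Note: this rebinds `bm25` from the retriever to the full pipeline, so
# re-create the retriever before re-running this cell.
bm25_reader = Reader(llm_backend, prompt=prompt)
bm25 = (bm25 % 10) >> Concatenator() >> bm25_reader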
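
For intuition about the concatenation step in this baseline, the sketch below merges the top-ranked passages of each query into a single `qcontext` string. It is a plain-Python approximation under stated assumptions: the rows are made up, `concatenate_context` is a hypothetical helper, and the actual formatting that `Concatenator` applies (titles, separators, truncation) is not shown in this notebook.

```python
from collections import defaultdict

# Made-up retrieval results: one row per (query, document) pair.
results = [
    {"qid": "q1", "query": "Who wrote Book X?", "docno": "d2", "rank": 1, "text": "It was written by Author Y."},
    {"qid": "q1", "query": "Who wrote Book X?", "docno": "d1", "rank": 0, "text": "Book X is a novel."},
    {"qid": "q2", "query": "Where is City C?", "docno": "d7", "rank": 0, "text": "City C is in Country D."},
]


def concatenate_context(rows, top_k=10, sep="\n"):
    """Hypothetical helper: join the top_k passages of each query, in rank order."""
    by_query = defaultdict(list)
    for row in sorted(rows, key=lambda r: (r["qid"], r["rank"])):
        if len(by_query[row["qid"]]) < top_k:
            by_query[row["qid"]].append(row)
    return [
        {"qid": qid, "query": docs[0]["query"], "qcontext": sep.join(d["text"] for d in docs)}
        for qid, docs in by_query.items()
    ]


contexts = concatenate_context(results)
for row in contexts:
    print(row["qid"], repr(row["qcontext"]))
```

The output has one row per query, which is the shape a prompt with `{{ query }}` and `{{ qcontext }}` placeholders needs.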

Now let's run a quick experiment on HotpotQA, comparing the vanilla RAG pipeline with TRACE.

In [9]:
dataset = pt.get_dataset('rag:hotpotqa')
pt.Experiment(
    [bm25, bm25_trace],
    dataset.get_topics('dev').head(50),
    dataset.get_answers('dev'),
    [ptr.measures.F1, ptr.measures.EM],
    batch_size=25,
    verbose=True,
    precompute_prefix=True,
    names=['RAG', 'TRACE'],
    baseline=0
)

Precomputing results of 50 topics on shared pipeline component TerrierRetr(BM25)
  warn("precompute_prefix with batch_size is very experimental. Please report any problems")
pt.Experiment precomputation: 100%|██████████| 2/2 [00:03<00:00,  1.68s/batches]
pt.Experiment: 100%|██████████| 4/4 [04:24<00:00, 66.08s/batches]


| | name | F1 | EM | F1 + | F1 - | F1 p-value | EM + | EM - | EM p-value |
|---|---|---|---|---|---|---|---|---|---|
| 0 | RAG | 0.164343 | 0.02 | | | | | | |
| 1 | TRACE | 0.281303 | 0.18 | 13.0 | 11.0 | 0.055328 | 8.0 | 0.0 | 0.003635 |
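
To help read these numbers, here is a minimal sketch of the usual SQuAD-style definitions of exact match and token-level F1. It assumes that `ptr.measures.EM` and `ptr.measures.F1` follow this convention, which this notebook does not confirm; the `normalize` rules (lowercasing, dropping punctuation and articles) are part of that assumption.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The Eiffel Tower.", "eiffel tower"))
print(token_f1("Paris, France", "Paris"))
```

Under these definitions, an answer that contains the gold string plus extra words earns partial F1 but zero EM, which is one plausible reason both systems score much higher on F1 than on EM. In the table, the `+` and `-` columns count the queries on which TRACE scored higher or lower than the baseline, and the p-values come from the paired significance test that `pt.Experiment` runs when `baseline` is set.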
