# Experiment with retrieval

## Stages of querying

There is more to querying than initially meets the eye. Querying consists of three distinct stages:

- **Retrieval** is when you find and return the most relevant documents for your query from your Index. As previously discussed in indexing, the most common type of retrieval is "top-k" semantic retrieval, but there are many other retrieval strategies.
- **Postprocessing** is when the Nodes retrieved are optionally reranked, transformed, or filtered, for instance by requiring that they have specific metadata such as keywords attached.
- **Response synthesis** is when your query, your most-relevant data and your prompt are combined and sent to your LLM to return a response.
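
The three stages can be sketched without any library. The snippet below is a toy illustration, not LlamaIndex's implementation: the two-dimensional vectors, the chunk texts, the `Chunk` class, the `retrieve`, `postprocess`, and `synthesize_prompt` helpers, and the cutoff value are all made up for the example, and the last stage stops at building the prompt instead of calling an LLM.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    vector: tuple  # toy embedding, hand-written for the example


def cosine(a, b):
    """Cosine similarity between two equally sized vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def retrieve(query_vec, chunks, top_k):
    """Stage 1: score every chunk and keep the top_k most similar."""
    scored = [(cosine(query_vec, c.vector), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def postprocess(scored, cutoff):
    """Stage 2: drop chunks whose score falls below the cutoff."""
    return [(s, c) for s, c in scored if s >= cutoff]


def synthesize_prompt(query, scored):
    """Stage 3 (prompt only): combine query and chunks into one prompt."""
    context = "\n".join(c.text for _, c in scored)
    return f"Context:\n{context}\n\nQuestion: {query}"


chunks = [
    Chunk("The clutch shares a fluid reservoir with the brakes.", (1.0, 0.1)),
    Chunk("Tire pressure is listed on the door jamb.", (0.0, 1.0)),
    Chunk("A worn clutch makes gears hard to engage.", (0.9, 0.3)),
]
query_vec = (1.0, 0.2)  # pretend embedding of the query

top = retrieve(query_vec, chunks, top_k=2)
kept = postprocess(top, cutoff=0.9)
prompt = synthesize_prompt("Why is shifting hard?", kept)
print(prompt)
```

In the cells that follow, the retrievers play the role of `retrieve`, and the prompt template plus the LLM call play the role of response synthesis.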

https://docs.llamaindex.ai/en/stable/examples/retrievers/ensemble_retrieval/


In [1]:
from llama_index.core import get_response_synthesizer, VectorStoreIndex
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.postprocessor import SimilarityPostprocessor
from llama_index.core import StorageContext, load_index_from_storage

from llama_index.core import Settings
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
import tiktoken
enc = tiktoken.get_encoding("cl100k_base")  # encoding used by gpt-3.5-turbo and gpt-4

llm = OpenAI(model="gpt-3.5-turbo")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-large", embed_batch_size=512)

In [2]:
# rebuild the storage context for the user manuals
storage_context_user_manuals = StorageContext.from_defaults(
    persist_dir="data/embeddings/user_manuals_512_text_splitter_oAI_large_embed"
)
# load index
index_user_manuals = load_index_from_storage(storage_context_user_manuals)

# rebuild the storage context for the forum content
storage_context_forum_content = StorageContext.from_defaults(
    persist_dir="data/embeddings/512_text_splitter_oAI_large_embed"
)
# load index
index_forum_content = load_index_from_storage(storage_context_forum_content)

In [3]:
# configure retriever
retriever_user_manuals = VectorIndexRetriever(
    index=index_user_manuals,
    similarity_top_k=5,
)

retriever_forum_content = VectorIndexRetriever(
    index=index_forum_content,
    similarity_top_k=5,
)

In [4]:
from llama_index.core import PromptTemplate

template = (
    "You are an expert for BMW Cars. You will get technical questions regarding the 3er and 4er series, or you will solve the Problem, the User has regarding their BMW Car. \n"
    "You will ground your answer based on Chunks of Information from the User Manual and the Forum Content to answer the User's Question. \n"
    "---------------------\n"
    "Chunks of content from the User Manual (The information may be presented in a bad formatting with mistakes in the words): \n"
    "{user_maual_content}\n"
    "---------------------\n"
    "Chunks of content from the Forum Content (This Information is based on content from a Car Forum, so the Information may be incorrect and presented with bad grammar): \n"
    "{forum_content}\n"
    "---------------------\n"
    'Given this information, please answer the question or ask for the specific Information you need: """{query_str}"""\n'
)
prompt_tmpl = PromptTemplate(template)

## Make my own query pipeline

I build the pipeline by hand because the QueryPipeline of LlamaIndex is too confusing:

- first get the top 5 chunks from both indexes
- create a prompt from the prompt template
- generate the answer with the LLM


In [5]:
def create_string_from_nodes(nodes):
    content = ""
    for i, n in enumerate(nodes):
        content += f"<START_CHUNK_{i}>"
        content += n.node.text
        content += f"<END_CHUNK_{i}>"
    return content
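
To see the string this helper produces without loading an index, it can be fed stand-in objects. The sketch below repeats the function so it runs on its own; the `SimpleNamespace` stubs only mimic the `.node.text` attribute that the helper reads and are not real LlamaIndex `NodeWithScore` objects.

```python
from types import SimpleNamespace


def create_string_from_nodes(nodes):
    """Wrap each node's text in numbered start/end tags and concatenate."""
    content = ""
    for i, n in enumerate(nodes):
        content += f"<START_CHUNK_{i}>"
        content += n.node.text
        content += f"<END_CHUNK_{i}>"
    return content


# Stand-ins exposing only the attribute the helper touches.
fake_nodes = [
    SimpleNamespace(node=SimpleNamespace(text="first chunk")),
    SimpleNamespace(node=SimpleNamespace(text="second chunk")),
]

result = create_string_from_nodes(fake_nodes)
print(result)
# <START_CHUNK_0>first chunk<END_CHUNK_0><START_CHUNK_1>second chunk<END_CHUNK_1>
```

The chunks are concatenated without separators, so the numbered tags are the only boundary the LLM sees between them.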

In [16]:
# make my own
user_query = """# Trouble with manual transmission

## Question of username#2110

Posted on 2023-03-15 12:58:00-04:00

Hi everyone, my 1996 320i e36 manual transmission is slightly hard to operate
and i recently found some golden dust in the gear oil (atf dexron II + bardhal
t&d), probably coming from the synchronizer rings.  
I have the feeling it might be the clutch not declutching properly.  
How can i test it?  
Opinions?  
What could cause the clutch malfunction?  
In the past I renewed the input and output cylinders, but for some time,
before realizing the failure, I might have drive not desengaging totally the
clutch, could the dust be coming from that period?  
Another minor malfunction appears while starting the car, i mean when I start
moving, it shakes for a little while.  
I also noticed that when I'm still, engine running, when engaging the first
gear i can hear a little bump from the differencial, kinda motorbike like...
it's always been like that, Is that normal?"""

user_manual_nodes = retriever_user_manuals.retrieve(user_query)
forum_content_nodes = retriever_forum_content.retrieve(user_query)

# create a text prompt (for the completion API);
# the keyword names must match the placeholders in the template
prompt = prompt_tmpl.format(
    user_maual_content=create_string_from_nodes(user_manual_nodes),
    forum_content=create_string_from_nodes(forum_content_nodes),
    query_str=user_query,
)
print("the scores of the retrieved nodes are:")
for node in user_manual_nodes:
    print(node.score)
for node in forum_content_nodes:
    print(node.score)
print(f"Total prompt length is {len(enc.encode(prompt))} tokens")
res = llm.complete(prompt)
print(res.text)

the scores of the retrieved nodes are:
0.572191879507313
0.5596483450904651
0.5537543356519133
0.5518693217530177
0.5470154969750757
0.7149026234089527
0.7017397502556473
0.7001378485854379
0.6964518343737596
0.6934841027005193
Total prompt length is 5160 tokens
Based on the information provided, it seems like there could be multiple issues affecting the operation of your manual transmission in your 1996 320i e36. Here are some insights based on the chunks of information:

1. **Clutch Malfunction**: The golden dust found in the gear oil could indeed be coming from the synchronizer rings, indicating potential wear. The feeling that the clutch is not declutching properly could be a sign of clutch issues. To test the clutch, you can perform a simple test by engaging the clutch and trying to shift gears smoothly. If there is resistance or grinding, it could indicate a problem with the clutch.

2. **Clutch Engagement**: The fact that you might have driven without fully disengaging the clutc

In [17]:
print(forum_content_nodes[0].text)

Thus, I bought the right fluid and filled it into the right reservoir,
correct? (Please see attached image).  
Please correct me if I'm wrong.  
  
Thank you in advance for trying to help, 82Eye!

### Post of username#6

  
it's good to make sure it is filled but it has nothing to do with either the
clutch or the transmission.

### Post of username#513

Thanks for your quick response, 82eye!  
Thus, what could be the cause of the problem of hard to shift gears after the
engine is started?

### Post of username#6

check your brake res off the brake booster. it takes the dot 4. the brakes and
clutch share a res. it's possible you are low on fluid there, but more likely
you are in need of a new clutch.  
  
the trans oil is filled from underneath the car, there are two plugs in the
side of the trans to drain and fill. i would expect more trouble than hard
shifting if it's low.

### Post of username#1616

Clutch pilot bearing is siezed/bound up and dragging so input shaft is not
disengagin

## Try out the QueryPipeline

I was not able to make this work.

In [33]:
# configure response synthesizer
response_synthesizer = get_response_synthesizer()
response_synthesizer.get_prompts()

{'text_qa_template': SelectorPromptTemplate(metadata={'prompt_type': <PromptType.QUESTION_ANSWER: 'text_qa'>}, template_vars=['context_str', 'query_str'], kwargs={}, output_parser=None, template_var_mappings={}, function_mappings={}, default_template=PromptTemplate(metadata={'prompt_type': <PromptType.QUESTION_ANSWER: 'text_qa'>}, template_vars=['context_str', 'query_str'], kwargs={}, output_parser=None, template_var_mappings=None, function_mappings=None, template='Context information is below.\n---------------------\n{context_str}\n---------------------\nGiven the context information and not prior knowledge, answer the query.\nQuery: {query_str}\nAnswer: '), conditionals=[(<function is_chat_model at 0x000001A81F732F20>, ChatPromptTemplate(metadata={'prompt_type': <PromptType.CUSTOM: 'custom'>}, template_vars=['context_str', 'query_str'], kwargs={}, output_parser=None, template_var_mappings=None, function_mappings=None, message_templates=[ChatMessage(role=<MessageRole.SYSTEM: 'system'>

In [34]:
print(response_synthesizer.get_prompts()["text_qa_template"].get_template())

Context information is below.
---------------------
{context_str}
---------------------
Given the context information and not prior knowledge, answer the query.
Query: {query_str}
Answer: 


In [35]:
print(response_synthesizer.get_prompts()["refine_template"].get_template())

The original query is as follows: {query_str}
We have provided an existing answer: {existing_answer}
We have the opportunity to refine the existing answer (only if needed) with some more context below.
------------
{context_msg}
------------
Given the new context, refine the original answer to better answer the query. If the context isn't useful, return the original answer.
Refined Answer: 


In [36]:
# assemble query engine
query_engine = RetrieverQueryEngine(
    retriever=retriever_user_manuals,
    response_synthesizer=response_synthesizer,
    node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.7)],
)
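
Whether a cutoff of 0.7 leaves anything to synthesize from depends on the score range of the index. As a rough check, the sketch below applies the same threshold to the scores printed earlier for the transmission question. It assumes the postprocessor keeps a node only when its score is at least the cutoff, and `apply_cutoff` is a hypothetical helper, not a LlamaIndex function.

```python
# Scores copied from the retrieval output above (one query only).
user_manual_scores = [
    0.572191879507313,
    0.5596483450904651,
    0.5537543356519133,
    0.5518693217530177,
    0.5470154969750757,
]
forum_scores = [
    0.7149026234089527,
    0.7017397502556473,
    0.7001378485854379,
    0.6964518343737596,
    0.6934841027005193,
]


def apply_cutoff(scores, cutoff):
    """Keep only the scores at or above the cutoff."""
    return [s for s in scores if s >= cutoff]


kept_manual = apply_cutoff(user_manual_scores, 0.7)
kept_forum = apply_cutoff(forum_scores, 0.7)
print(len(kept_manual), len(kept_forum))  # 0 3
```

For that query, none of the user-manual chunks would pass, so an engine built on `retriever_user_manuals` with this cutoff would likely have no context to answer from. A lower cutoff may suit that index better, though a single query is too little to pick a value.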

## Build a Query Pipeline

To chain two different indexes together and to customize the prompts, a custom query pipeline is used.

In [37]:
from llama_index.core import PromptTemplate

template = (
    "We have provided context information below. \n"
    "---------------------\n"
    "{context_str}"
    "\n---------------------\n"
    "Given this information, please answer the question: {query_str}\n"
)
prompt_tmpl = PromptTemplate(template)

# you can create text prompt (for completion API)
prompt = prompt_tmpl.format(context_str=..., query_str=...)

# or easily convert to message prompts (for chat API)
messages = prompt_tmpl.format_messages(context_str=..., query_str=...)

In [38]:
from llama_index.core.query_pipeline import QueryPipeline
from llama_index.llms.openai import OpenAI
llm = OpenAI(model="gpt-3.5-turbo")

# sequential chain
p = QueryPipeline(chain=[prompt_tmpl, llm], verbose=True)

# DAG
p = QueryPipeline(verbose=True)
p.add_modules({"prompt_tmpl": prompt_tmpl, "llm": llm})
p.add_link("prompt_tmpl", "llm")

# run pipeline
p.run(prompt_key1="<input1>", ...)

SyntaxError: positional argument follows keyword argument (2256392507.py, line 14)
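
The `SyntaxError` comes from the `...` placeholder, which Python parses as a positional argument (the `Ellipsis` literal) placed after a keyword argument. The sketch below reproduces that with `compile` and then shows the keyword-only form. `run` is a hypothetical stand-in that only echoes its arguments, not `QueryPipeline.run`, and the argument values are made up.

```python
def run(**kwargs):
    """Hypothetical stand-in for a pipeline's run method."""
    return kwargs


# The placeholder line fails at compile time, before anything executes.
bad_call = 'run(prompt_key1="<input1>", ...)'
try:
    compile(bad_call, "<cell>", "exec")
    error_message = None
except SyntaxError as exc:
    error_message = exc.msg
print(error_message)

# Passing every template variable by keyword parses and runs.
result = run(context_str="some retrieved text", query_str="some question")
print(sorted(result))
```

For the template in this section, the keywords would be its two variables, `context_str` and `query_str`.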

In [49]:
from llama_index.core.query_pipeline import QueryPipeline
from llama_index.llms.openai import OpenAI

# define modules
template = (
    "We have provided context information below. \n"
    "---------------------\n"
    "{context_str}"
    "\n---------------------\n"
    "Given this information, please answer the question: {query_str}\n"
)
prompt_tmpl = PromptTemplate(template)
llm = OpenAI(model="gpt-3.5-turbo")


# define query pipeline
p = QueryPipeline(verbose=True)
p.add_modules(
    {
        "retriever_forum_content": retriever_forum_content,
        "prompt_tmpl": prompt_tmpl,
        "llm": llm,
    }
)
p.add_link("retriever_forum_content", "prompt_tmpl", dest_key="context_str")
p.add_link("prompt_tmpl", "llm")

In [60]:
print(retriever_forum_content.as_query_component().input_keys)
print(retriever_forum_content.as_query_component().output_keys)
print(prompt_tmpl.as_query_component().input_keys)
print(prompt_tmpl.as_query_component().output_keys)
print(llm.as_query_component().input_keys)
print(llm.as_query_component().output_keys)

required_keys={'input'} optional_keys=set()
required_keys={'output'}
required_keys={'query_str', 'context_str'} optional_keys=set()
required_keys={'prompt'}
required_keys={'messages'} optional_keys=set()
required_keys={'output'}


In [50]:
p.run(input="The alternator of my E36 is not working. What should I do?")

[1;3;38;2;155;135;227m> Running module retriever_forum_content with input: 
input: The alternator of my E36 is not working. What should I do?

[0m[1;3;38;2;155;135;227m> Running module prompt_tmpl with input: 
context_str: [NodeWithScore(node=TextNode(id_='612e444b-c2b2-4dd2-ba79-ce08884598f6', embedding=None, metadata={'thread_id': '2477166', 'category': '14-1991-1999-(E36)', 'thread_title': 'Replaced my alternator and...

[0m

ValueError: Required keys {'query_str', 'context_str'} are not present in input keys {'context_str'}
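
The `ValueError` says that `prompt_tmpl` needs both `query_str` and `context_str`, but the only link into it supplies `context_str`; the question passed to `p.run` goes to the retriever and nowhere else. The toy model below is not LlamaIndex code: `missing_keys`, the `pipeline_input` name, and the link tuples are hypothetical, written only to show how routing the pipeline input to both the retriever and the prompt closes the gap. In LlamaIndex itself this kind of fan-out is usually done with an `InputComponent` module linked to both destinations, which is an assumption about the installed version and worth checking against its documentation.

```python
# Required input keys per module, as printed by the key-inspection cell above.
required = {
    "retriever_forum_content": {"input"},
    "prompt_tmpl": {"query_str", "context_str"},
}


def missing_keys(module, links):
    """Return the required keys of `module` that no link delivers.

    `links` is a list of (source, destination, destination_key) tuples.
    """
    supplied = {key for _, dest, key in links if dest == module}
    return required[module] - supplied


# Wiring from the failing cell: only the retriever feeds the prompt.
original_links = [
    ("retriever_forum_content", "prompt_tmpl", "context_str"),
]
print(missing_keys("prompt_tmpl", original_links))  # {'query_str'}

# Adding a link from the pipeline input covers query_str as well.
fixed_links = [
    ("pipeline_input", "retriever_forum_content", "input"),
    ("pipeline_input", "prompt_tmpl", "query_str"),
    ("retriever_forum_content", "prompt_tmpl", "context_str"),
]
print(missing_keys("prompt_tmpl", fixed_links))  # set()
```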