LLM application with query routing and long context: a reflection-style flow driven by the user query
===

RAG routing depends on the complexity of the query and on whether the retrieved context can answer it. The routing follows these rules:

1. Simple query: answer directly with top-k RAG, no further routing.
2. Answerable but complex query: route to query decomposition (multi-step query).
3. Question that cannot be answered from the retrieved chunks: route to a long-context query over the full document.
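
The three rules can be read as a small decision function. The sketch below is only an illustration: `Route`, `choose_route`, and the two boolean inputs are hypothetical names, and in this notebook the judgment is made by an LLM selector rather than by hand-set flags.

```python
from enum import Enum


class Route(Enum):
    """Hypothetical labels for the three query methods."""

    TOP_K_RAG = "top_k_rag"
    MULTI_STEP = "multi_step"
    LONG_CONTEXT = "long_context"


def choose_route(answerable: bool, complex_query: bool) -> Route:
    """Map the two judgments about a query onto the three routing rules."""
    if not answerable:
        # Rule 3: retrieved chunks are not enough, use the full document.
        return Route.LONG_CONTEXT
    if complex_query:
        # Rule 2: answerable, but needs to be broken into steps.
        return Route.MULTI_STEP
    # Rule 1: simple query, answer directly.
    return Route.TOP_K_RAG


print(choose_route(answerable=True, complex_query=False))
```

Checking answerability first means that a complex but unanswerable query still goes to the long-context route.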

Original paper: [Retrieval Augmented Generation or Long-Context LLMs? A Comprehensive Study and Hybrid Approach](https://arxiv.org/abs/2407.16833)

Implementation: LlamaIndex


In [100]:
from rich.pretty import pprint as pp
from icecream import ic
import nest_asyncio
nest_asyncio.apply()

In [101]:
llm="models/gemini-1.5-flash"
embeds="models/text-embedding-004"
chunk_size=1024
data_url="https://hermesworld.com/de/karriere/jobs/Junior-Manager-mwd-HR-Controlling-de-j4364.html"
rerank_top_k=5
similarity_top_k=5
num_multi_steps=4
verbose=True
streaming=False

## Token counting

In [103]:
from llama_index.core.callbacks import TokenCountingHandler
import tiktoken

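# Note: tiktoken's gpt-3.5-turbo encoding only approximates the token counts of
# Gemini models, which use a different tokenizer.
token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode
)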

In [104]:
def print_token_counter(counter: TokenCountingHandler):
    pp(
        (
            "Embedding Tokens: ",
            counter.total_embedding_token_count,
            "LLM Prompt Tokens: ",
            counter.prompt_llm_token_count,
            "LLM Completion Tokens: ",
            counter.completion_llm_token_count,
            "Total LLM Token Count: ",
            counter.total_llm_token_count,
        )
    )

    counter.reset_counts()


print_token_counter(token_counter)

## LlamaIndex settings

In [105]:
from llama_index.core import Settings
from llama_index.embeddings.gemini import GeminiEmbedding
from llama_index.llms.gemini import Gemini
from llama_index.core.callbacks import CallbackManager

Settings.llm = Gemini(model=llm, temperature=0)
Settings.embed_model = GeminiEmbedding(model_name=embeds)
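# The global callback manager is set via `callback_manager`; this is what lets the token counter see LLM and embedding calls.
Settings.callback_manager = CallbackManager([token_counter])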

In [106]:
from llama_index.readers.web import SimpleWebPageReader

documents = SimpleWebPageReader(html_to_text=True).load_data([data_url])
pp(documents)

## Set up the query methods

### Top K RAG

In [107]:
from llama_index.core import VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.postprocessor import SentenceTransformerRerank

splitter = SentenceSplitter(chunk_size=chunk_size)
vector_index = VectorStoreIndex.from_documents(documents, transformations=[splitter])
# Retrieval background: indexing splits the documents into chunks (nodes) and embeds them.
# At query time the retriever returns the top-k nodes with similarity scores (NodeWithScore),
# and the reranker then re-scores those nodes.
vector_query_engine = vector_index.as_query_engine(
    similarity_top_k=similarity_top_k,
    node_postprocessors=[
        SentenceTransformerRerank(top_n=rerank_top_k, model="BAAI/bge-reranker-base")
    ],
)
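
To make the retrieve-then-rerank idea concrete, here is a minimal pure-Python sketch. It is not how LlamaIndex implements it: the two-dimensional vectors are made-up toy data, and `word_overlap` is a hypothetical stand-in for a real cross-encoder reranker such as `BAAI/bge-reranker-base`.

```python
import math
from typing import Callable, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_top_k(
    query_vec: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k chunks most similar to the query as (text, score) pairs."""
    scored = [(text, cosine(query_vec, vec)) for text, vec in chunks]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def rerank(
    query: str,
    candidates: List[Tuple[str, float]],
    score_fn: Callable[[str, str], float],
    top_n: int,
) -> List[Tuple[str, float]]:
    """Re-score the retrieved candidates with score_fn and keep the best top_n."""
    rescored = [(text, score_fn(query, text)) for text, _ in candidates]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)[:top_n]


def word_overlap(query: str, text: str) -> float:
    """Toy relevance score: fraction of query words that occur in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


# Toy data: (chunk text, made-up embedding).
chunks = [
    ("flexible work hours and training", [1.0, 0.0]),
    ("HR controlling reports", [0.8, 0.6]),
    ("parcel shop partners", [0.0, 1.0]),
]
candidates = retrieve_top_k([1.0, 0.2], chunks, k=2)
reranked = rerank("HR controlling", candidates, word_overlap, top_n=2)
print(reranked)
```

Because `rerank_top_k` equals `similarity_top_k` in this notebook, the reranker only reorders the retrieved nodes; a smaller `top_n` would also filter them.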

### Multi-step query

In [108]:
from llama_index.core.indices.query.query_transform.base import (
    StepDecomposeQueryTransform,
)
from llama_index.core.query_engine import MultiStepQueryEngine
from llama_index.core import get_response_synthesizer

synthesizer = get_response_synthesizer(
    response_mode="tree_summarize", streaming=streaming
)
step_decompose_transform = StepDecomposeQueryTransform(verbose=verbose)
multi_steps_query_engine = MultiStepQueryEngine(
    query_engine=vector_query_engine,
    query_transform=step_decompose_transform,
    response_synthesizer=synthesizer,
    num_steps=num_multi_steps,
)
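
The multi-step idea can be sketched as a loop that keeps asking for the next sub-question until none is left or the step budget is used up. This is a simplified illustration, not the library's implementation: `multi_step_query`, `next_sub_question`, and `answer` are hypothetical names, and the final synthesis of the collected answers is left out.

```python
from typing import Callable, List, Optional, Tuple

SubQA = List[Tuple[str, str]]


def multi_step_query(
    question: str,
    next_sub_question: Callable[[str, SubQA], Optional[str]],
    answer: Callable[[str], str],
    num_steps: int,
) -> SubQA:
    """Collect (sub-question, answer) pairs for up to num_steps rounds."""
    sub_qa: SubQA = []
    for _ in range(num_steps):
        # The decomposer sees the original question and what is known so far.
        sub_question = next_sub_question(question, sub_qa)
        if sub_question is None:
            break  # nothing left to ask
        sub_qa.append((sub_question, answer(sub_question)))
    return sub_qa


# Scripted stand-ins for the LLM decomposer and the underlying query engine.
script = ["What is the job description?", "What is the company information?"]


def scripted_decomposer(question: str, sub_qa: SubQA) -> Optional[str]:
    return script[len(sub_qa)] if len(sub_qa) < len(script) else None


result = multi_step_query(
    "Give a briefing of the job description and company information.",
    scripted_decomposer,
    answer=lambda q: f"answer to: {q}",
    num_steps=4,
)
print(result)
```

The step budget plays the role of `num_steps` above: it bounds the number of LLM calls even if the decomposer never signals that it is done.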

### Long context query

In [109]:
from llama_index.core.llms.llm import LLM
from llama_index.core.query_engine import CustomQueryEngine
from typing import Any
from langchain import hub
from llama_index.core import PromptTemplate
from llama_index.core import get_response_synthesizer

prompt = PromptTemplate(hub.pull("hwchase17/llama-rag").template)


class VanillaQueryEngine(CustomQueryEngine):
    """Long-context query engine: sends the whole document to the LLM as context."""

    llm: LLM
    context: str

    def __call__(self, *args: Any, **kwds: Any) -> str:
        return self.custom_query(*args, **kwds)

    def custom_query(self, query_str: str) -> str:
        full_query = prompt.format(context=self.context, question=query_str)
        return str(self.llm.complete(full_query))


synthesizer = get_response_synthesizer(
    response_mode="tree_summarize", streaming=streaming
)
lc_query_engine = VanillaQueryEngine(response_synthesizer=synthesizer, llm=Settings.llm, context=documents[0].text)
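
Stripped of the framework, the long-context approach is just prompt construction: the full document text goes into the prompt, with no retrieval step. A minimal sketch follows, assuming a hypothetical `PROMPT_TEMPLATE` (not the prompt pulled from the hub) and a `complete` callable that stands in for the LLM.

```python
from typing import Callable

# Hypothetical template for illustration only.
PROMPT_TEMPLATE = (
    "Use the context below to answer the question.\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def long_context_answer(
    question: str,
    document_text: str,
    complete: Callable[[str], str],
) -> str:
    """Put the entire document into the prompt and ask the model once."""
    prompt = PROMPT_TEMPLATE.format(context=document_text, question=question)
    return complete(prompt)


# A fake "LLM" that echoes the prompt, so the sketch runs without an API key.
echoed = long_context_answer(
    "Where is the job located?",
    "Hermes Germany GmbH is hiring in Hamburg.",
    complete=lambda prompt: prompt,
)
print(echoed)
```

The trade-off is cost: every query pays for the full document in prompt tokens, which is why it is reserved here for questions the chunk-based methods cannot answer.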

## Query with each method

### A simple query to try

In [110]:
query = "Give a briefing of the job description and company information."

### Top K RAG

In [111]:
vc_res = vector_query_engine.query(query)
ic(vc_res.response)

'The job description is for a (Junior) Manager (m/w/d) HR Controlling position at Hermes Germany GmbH in Hamburg. The role involves creating reports and presentations for internal and external stakeholders, supporting HR projects, and contributing to the development of quality control processes. The ideal candidate will have a degree in business, social sciences, or human resource management, as well as experience in internships, work-study programs, or part-time jobs. They should also be comfortable working with data and have a strong interest in modern HR practices. Hermes Germany is a leading logistics company that offers a variety of benefits, including flexible work hours, a comprehensive training program, and discounts on various products and services. \n'

In [112]:
print_token_counter(token_counter)

### Multi-step query

In [113]:
ms_res = multi_steps_query_engine.query(query)
ic(ms_res.response)

> Current query: Give a briefing of the job description and company information.
> New query: New question: **What is the job description and company information?**

> Current query: Give a briefing of the job description and company information.
> New query: New question: **What is the company information for Hermes Germany GmbH?**

> Current query: Give a briefing of the job description and company information.
> New query: None

'The job is for a management position in human resources controlling at a subsidiary of the Hermes Group. The position requires experience in finance, controlling, and human resources. The team focuses on data and analytics related to employees. The company offers a variety of services, including package shop partnerships and becoming a contract partner. \n'

In [114]:
def show_multi_steps(ms_res):
    sub_qa = ms_res.metadata["sub_qa"]
    tuples = [(t[0], t[1].response) for t in sub_qa]
    pp(tuples)

show_multi_steps(ms_res)

### Long context query

In [115]:
lc_res = lc_query_engine(query)
ic(lc_res)

'Hermes Germany GmbH is looking for a (Junior) Manager (m/w/d) HR Controlling to join their team in Hamburg. The role involves creating reports and presentations for internal and external stakeholders, supporting HR processes and projects, and contributing new ideas and improvements. The ideal candidate will have a degree in business, social sciences, or human resource management, as well as some experience in internships, work-study programs, or part-time jobs. They should also be comfortable working with data and large numbers, have an interest in modern HR practices, and be strong in problem-solving, self-organization, and goal-orientation. Hermes offers a variety of benefits, including leading technologies, flexible work hours, a comprehensive training program, and discounts on otto.de and other shops. \n'

## Set up the router query engine

### Tools

In [116]:
from llama_index.core.tools import QueryEngineTool

vector_query_engine_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description="Useful for questions that can be easily answered.",
)

multi_steps_query_engine_tool = QueryEngineTool.from_defaults(
    query_engine=multi_steps_query_engine,
    description="Useful for answerable but difficult questions that can be addressed by breaking down the question into multiple steps.",
)

lc_query_engine_tool = QueryEngineTool.from_defaults(
    query_engine=lc_query_engine,
    description="Useful for the unanswerable questions that require full context to obtain results.",
)

query_engine_tools = [
    vector_query_engine_tool,
    multi_steps_query_engine_tool,
    lc_query_engine_tool,
]

### Final router query engine

In [117]:
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors.llm_selectors import LLMSingleSelector
from llama_index.core.response_synthesizers import TreeSummarize

router_query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=query_engine_tools,
    summarizer=TreeSummarize(
        streaming=streaming,
        use_async=True,
        verbose=verbose,
    ),
    verbose=verbose,
)
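
Conceptually, a single-selector router shows the tool descriptions to a selector, gets back one index, and forwards the query to that tool. The sketch below illustrates only that dispatch step: `Tool`, `route`, and the `select` callable are hypothetical names, and the selector is a plain function here instead of an LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class Tool:
    """A query method plus the description the selector chooses from."""

    description: str
    run: Callable[[str], str]


def route(
    query: str,
    tools: Sequence[Tool],
    select: Callable[[str, Sequence[str]], int],
) -> Tuple[int, str]:
    """Ask the selector for a tool index, then run the query on that tool."""
    index = select(query, [tool.description for tool in tools])
    if not 0 <= index < len(tools):
        raise IndexError(f"selector returned invalid tool index {index}")
    return index, tools[index].run(query)


tools = [
    Tool("easily answered questions", lambda q: "rag: " + q),
    Tool("answerable but difficult questions", lambda q: "steps: " + q),
    Tool("unanswerable questions that need full context", lambda q: "lc: " + q),
]
# A fixed selector that always picks the long-context tool.
print(route("What do you know?", tools, select=lambda q, descriptions: 2))
```

Since the selector sees only the descriptions, their wording is effectively the routing policy; changing a description changes which queries reach that tool.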

#### Simple question

In [118]:
simply_query = "What kind of content do you have?"

In [119]:
route_res = router_query_engine.query(simply_query)
ic(route_res.response)

Selecting query engine 0: The question 'What kind of content do you have?' can be answered directly with a list of content types.

'The content includes information about services, career opportunities, the company, and sustainability. \n'

In [120]:
lc_res = lc_query_engine.query(simply_query)
ic(lc_res)

Response(response='The content is about a job opening for a (Junior) Manager (m/w/d) HR Controlling at Hermes Germany GmbH. \n', source_nodes=[], metadata=None)

#### Normal question

In [121]:
normal_query = "Give a briefing of the job description and the company information."

In [122]:
route_res = router_query_engine.query(normal_query)
ic(route_res.response)

Selecting query engine 1: The question requires breaking down into two parts: job description and company information. While both are answerable, they might require some research or effort.
> Current query: Give a briefing of the job description and the company information.
> New query: New question: **What is the job description and company information?**

> Current query: Give a briefing of the job description and the company information.
> New query: New question: **What are the responsibilities of the (Junior) Manager (m/w/d) HR Controlling position at Hermes Germany GmbH?**

> Current query: Give a briefing of the job description and the company information.
> New query: New question: **What are the requirements for the (Junior) Manager (m/w/d) HR Controlling position at Hermes Germany GmbH?**

> Current query: Give a briefing of the job description and t

In [None]:
lc_res = lc_query_engine.query(normal_query)
ic(lc_res)

Response(response='Hermes Germany GmbH is looking for a (Junior) Manager (m/w/d) HR Controlling to join their team in Hamburg. The role involves creating reports and presentations for internal and external stakeholders, supporting HR processes and projects, and contributing new ideas. The ideal candidate will have a degree in business, social sciences, or human resource management, as well as some work experience. Hermes offers a variety of benefits, including flexible work hours, a comprehensive training program, and discounts on otto.de and other shops. \n', source_nodes=[], metadata=None)

#### Long and complex question

In [None]:
long_complex_query='What do you know? Please give me a briefing based on your knowledge, avoiding prior information, provide details as much as possible.'

In [None]:
route_res = router_query_engine.query(long_complex_query)
ic(route_res.response)

Selecting query engine 2: The question asks for a briefing based on knowledge, avoiding prior information, and providing details. This suggests a need for a comprehensive understanding of the context, which aligns with choice 3.

'Hermes Germany GmbH is looking for a (Junior) Manager (m/w/d) HR Controlling to join their team in Hamburg. The position is full-time and requires experience in Finance, Controlling, and Human Resources. The team focuses on data and analytics related to employees, such as analyzing the gender quota in leadership positions and tracking sick leave trends. The role involves creating reports, presentations, and supporting HR projects. The ideal candidate has a degree in business, social sciences, or human resource management, and prior experience in internships, work-study programs, or part-time jobs. They should be comfortable working with data, interested in modern HR practices, and possess strong problem-solving, self-organization, and goal-oriented skills. Hermes offers a variety of benefits, including leading technologies, flexible work hours, a comprehensive training program, discounts on otto.de and other shops, and subsidized public transportation. \n'

In [None]:
lc_res = lc_query_engine.query(long_complex_query)
ic(lc_res)

Response(response='Hermes Germany GmbH is looking for a (Junior) Manager (m/w/d) HR Controlling to join their team in Hamburg. The position is full-time and requires experience in Finance, Controlling, and Human Resources. The team focuses on data and analytics related to employees, such as analyzing the gender quota in leadership positions and tracking sick leave trends. The role involves creating reports, presentations, and supporting HR projects. The ideal candidate has a degree in business, social sciences, or human resource management, and prior experience in internships, work-study programs, or part-time jobs. They should be comfortable working with data, interested in modern HR practices, and possess strong problem-solving, self-organization, and goal-oriented skills. Hermes offers a variety of benefits, including leading technologies, flexible work hours, a comprehensive training program, discounts on otto.de and other shops, and subsidized public transportation. \n', source_no