# Building a Router from Scratch

In this tutorial, we show you how to build an LLM-powered router module that can route a user query to submodules.

Routers are a simple but effective form of automated decision making that allows you to perform dynamic retrieval/querying over your data.

In LlamaIndex, this is abstracted away with our [Router Modules](https://gpt-index.readthedocs.io/en/latest/core_modules/query_modules/router/root.html).

To build a router, we'll walk through the following steps:
- Crafting an initial prompt to select a set of choices
- Enforcing structured output (for text completion endpoints)
- Integrating with a native function calling endpoint

Finally, we'll plug this into a RAG pipeline to dynamically choose between QA and summarization.

## 1. Set Up a Basic Router Prompt

At its core, a router is a module that takes in a set of choices. Given a user query, it "selects" a relevant choice.

For simplicity, we'll start with the choices as a set of strings.

In [1]:
from llama_index import PromptTemplate

choices = [
    "Useful for questions related to apples", 
    "Useful for questions related to oranges"
]

choices_str = "\n\n".join(choices)

router_prompt0 = PromptTemplate(
    "Some choices are given below. It is provided in a numbered "
    "list (1 to {num_choices}), "
    "where each item in the list corresponds to a summary.\n"
    "---------------------\n"
    "{context_list}"
    "\n---------------------\n"
    "Using only the choices above and not prior knowledge, return the top choices "
    "(no more than {max_outputs}, but only select what is needed) that "
    "are most relevant to the question: '{query_str}'\n"
)
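
The prompt describes a numbered list, but `choices_str` joins the raw strings without numbers. Below is a minimal sketch of one way to number them so the text matches the prompt wording. `build_numbered_choices` and the `(i)` prefix format are hypothetical and not part of the original notebook; the outputs shown later in this section were produced without numbering.

```python
from typing import List


def build_numbered_choices(choices: List[str]) -> str:
    """Hypothetical helper: prefix each choice with its 1-based index."""
    return "\n\n".join(
        f"({i}) {choice}" for i, choice in enumerate(choices, start=1)
    )


choices = [
    "Useful for questions related to apples",
    "Useful for questions related to oranges",
]

numbered_choices_str = build_numbered_choices(choices)
print(numbered_choices_str)
```

Passing a string like this as `context_list` may make it easier for the model to refer to a choice by number, though that is not guaranteed.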

Let's try this prompt on a set of toy questions and see what the model returns.

In [2]:
from llama_index.llms import OpenAI

llm = OpenAI(model="gpt-3.5-turbo")

In [3]:
def get_formatted_prompt(query_str):
    fmt_prompt = router_prompt0.format(
        num_choices=len(choices),
        max_outputs=2,
        context_list=choices_str,
        query_str=query_str
    )
    return fmt_prompt

In [4]:
query_str = "Can you tell me more about the amount of Vitamin C in apples"
fmt_prompt = get_formatted_prompt(query_str)
response = llm.complete(fmt_prompt)

In [5]:
print(str(response))

1. Useful for questions related to apples


In [6]:
query_str = "What are the health benefits of eating orange peels?"
fmt_prompt = get_formatted_prompt(query_str)
response = llm.complete(fmt_prompt)

In [7]:
print(str(response))

Useful for questions related to oranges


In [8]:
query_str = "Can you tell me more about the amount of Vitamin C in apples and oranges."
fmt_prompt = get_formatted_prompt(query_str)
response = llm.complete(fmt_prompt)

In [9]:
print(str(response))

1. Useful for questions related to apples
2. Useful for questions related to oranges


**Observation**: While each response corresponds to the correct choice, it is hard to parse into a structured output. For instance, the second query doesn't even return a number corresponding to the choice (while the first and third queries do).
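
To make the parsing problem concrete, here is a small sketch that runs a naive regex parser over the three raw responses shown above. The helper `extract_choice_numbers` is hypothetical and only illustrates why free-text output is fragile.

```python
import re
from typing import List

# The three raw responses from the cells above, copied verbatim.
raw_responses = [
    "1. Useful for questions related to apples",
    "Useful for questions related to oranges",
    "1. Useful for questions related to apples\n"
    "2. Useful for questions related to oranges",
]


def extract_choice_numbers(text: str) -> List[int]:
    """Naive parser: collect the integer that starts each 'N. ...' line."""
    return [int(m) for m in re.findall(r"^\s*(\d+)\.", text, flags=re.MULTILINE)]


parsed = [extract_choice_numbers(raw) for raw in raw_responses]
print(parsed)  # [[1], [], [1, 2]]
```

The second response yields an empty list even though the model picked the right choice, so the caller cannot tell "no match" apart from "unnumbered answer".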

## 2. A Router Prompt That Can Generate Structured Outputs

Therefore the next step is to try to prompt the model to output a more structured representation (JSON). 

We define an output parser class (`RouterOutputParser`). This output parser is responsible for both formatting the prompt and parsing the result into structured objects (a list of `Answer` instances).

We then apply the `format` and `parse` methods of the output parser around the LLM call using the router prompt to generate a structured output.

### 2.a Define the Answer Class

We define the `Answer` class. It's a very simple Pydantic model with two fields: `choice` and `reason`.

In [22]:
import json

from pydantic import BaseModel

In [23]:
class Answer(BaseModel):
    choice: int
    reason: str

In [24]:
print(json.dumps(Answer.schema(), indent=2))

{
  "title": "Answer",
  "type": "object",
  "properties": {
    "choice": {
      "title": "Choice",
      "type": "integer"
    },
    "reason": {
      "title": "Reason",
      "type": "string"
    }
  },
  "required": [
    "choice",
    "reason"
  ]
}


### 2.b Define Router Output Parser

In [11]:
from llama_index.types import BaseOutputParser

In [12]:
FORMAT_STR = """The output should be formatted as a JSON instance that conforms to 
the JSON schema below. 

Here is the output schema:
{
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "choice": {
        "type": "integer"
      },
      "reason": {
        "type": "string"
      }
    },
    "required": [
      "choice",
      "reason"
    ],
    "additionalProperties": false
  }
}
"""

If we want to include `FORMAT_STR` in a prompt template, we need to escape its curly braces so that they aren't treated as template variables.

In [13]:
def _escape_curly_braces(input_string: str) -> str:
    # Replace '{' with '{{' and '}' with '}}' to escape curly braces
    escaped_string = input_string.replace("{", "{{").replace("}", "}}")
    return escaped_string
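
As a quick illustration of why the escaping matters, the sketch below uses plain `str.format` as a stand-in for prompt template formatting (an assumption; the template class may differ in details). `escape_curly_braces` mirrors the helper above, and `schema_snippet` is a made-up one-line schema.

```python
def escape_curly_braces(text: str) -> str:
    # Same idea as the helper above: double every brace.
    return text.replace("{", "{{").replace("}", "}}")


schema_snippet = '{"type": "integer"}'
template = "Question: {query_str}\n" + escape_curly_braces(schema_snippet)

# str.format fills {query_str} and turns '{{' / '}}' back into '{' / '}'.
formatted = template.format(query_str="apples?")
print(formatted)

# Without escaping, str.format treats '{"type": ...}' as a field and fails.
try:
    ("Question: {query_str}\n" + schema_snippet).format(query_str="apples?")
    unescaped_ok = True
except (KeyError, ValueError, IndexError):
    unescaped_ok = False
print(unescaped_ok)  # False
```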
    

We now define a simple parsing function to extract out the JSON string from the LLM response (by searching for square brackets)

In [14]:
def _marshal_output_to_json(output: str) -> str:
    output = output.strip()
    left = output.find("[")
    right = output.find("]")
    output = output[left : right + 1]
    return output
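
Searching for the first `]` can cut the JSON short when a `reason` string itself contains brackets. The following is a sketch of a slightly more defensive variant that slices up to the last `]` and raises when no array is found. `extract_json_array` and the sample `llm_text` are hypothetical, and this variant still assumes no stray `]` appears after the array.

```python
import json


def extract_json_array(output: str) -> str:
    """Hypothetical variant: slice from the first '[' to the last ']'."""
    left = output.find("[")
    right = output.rfind("]")
    if left == -1 or right < left:
        raise ValueError("no JSON array found in LLM output")
    return output[left : right + 1]


llm_text = (
    "Here you go:\n"
    '[{"choice": 1, "reason": "matches item [1] about apples"}]\n'
    "Done."
)

# Slicing at the first ']' stops inside the reason string.
naive = llm_text[llm_text.find("[") : llm_text.find("]") + 1]
try:
    json.loads(naive)
    naive_ok = True
except json.JSONDecodeError:
    naive_ok = False

parsed = json.loads(extract_json_array(llm_text))
print(naive_ok, parsed)
```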

We put these together in our `RouterOutputParser`

In [54]:
from typing import List

class RouterOutputParser(BaseOutputParser):
    def parse(self, output: str) -> List[Answer]:
        """Parse string."""
        json_output = _marshal_output_to_json(output)
        json_dicts = json.loads(json_output)
        # Pydantic models have no `from_dict`; build each Answer from its fields.
        answers = [Answer(**json_dict) for json_dict in json_dicts]
        return answers

    def format(self, prompt_template: str) -> str:
        return prompt_template + "\n\n" + _escape_curly_braces(FORMAT_STR)

### 2.c Give it a Try

We create a function called `route_query` that takes in the query string, the list of choices, and the output parser, and returns a structured answer. It uses the `llm` and `router_prompt0` defined earlier.

In [55]:
output_parser = RouterOutputParser()

In [16]:
from typing import List

def route_query(
    query_str: str, 
    choices: List[str],
    output_parser: RouterOutputParser
):
    choices_str = "\n\n".join(choices)
    
    fmt_base_prompt = router_prompt0.format(
        num_choices=len(choices),
        max_outputs=len(choices),
        context_list=choices_str,
        query_str=query_str
    )
    fmt_json_prompt = output_parser.format(fmt_base_prompt)
    
    raw_output = llm.complete(fmt_json_prompt)
    parsed = output_parser.parse(str(raw_output))

    return parsed
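
Once the router returns structured answers, the `choice` integers still need to be mapped back to the original choices. Below is a minimal sketch, assuming the model numbers choices from 1. `ParsedAnswer` is a dataclass stand-in for `Answer` so the snippet runs without Pydantic, and `resolve_choices` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ParsedAnswer:
    # Stand-in for the tutorial's Answer model; same two fields.
    choice: int
    reason: str


def resolve_choices(answers: List[ParsedAnswer], choices: List[str]) -> List[str]:
    """Map 1-based choice numbers to choice strings, skipping invalid indices."""
    resolved = []
    for answer in answers:
        idx = answer.choice - 1  # assumes the model numbers choices from 1
        if 0 <= idx < len(choices):
            resolved.append(choices[idx])
    return resolved


choices = [
    "Useful for questions related to apples",
    "Useful for questions related to oranges",
]
answers = [
    ParsedAnswer(choice=2, reason="question is about oranges"),
    ParsedAnswer(choice=5, reason="out-of-range index"),
]

selected = resolve_choices(answers, choices)
print(selected)
```

Skipping out-of-range indices is one design choice; raising an error instead may be preferable when a bad index should stop the pipeline.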


## 3. Perform Routing with a Function Calling Endpoint

In the previous section, we showed how to build a router with a text completion endpoint. This included formatting the prompt to encourage the model to output structured JSON, and writing a parse function to load that JSON.

This process can feel a bit messy. Function calling endpoints (e.g. OpenAI) abstract away this complexity by allowing the model to natively output structured function calls. This obviates the need to manually prompt for and parse the outputs.

LlamaIndex offers an abstraction called a `PydanticProgram` that integrates with a function endpoint to produce a structured Pydantic object. We integrate with OpenAI and Guidance.

We redefine our `Answer` class with annotations, as well as an `Answers` class containing a list of answers.

In [27]:
from pydantic import Field

class Answer(BaseModel):
    """Represents a single choice with a reason."""
    choice: int
    reason: str

class Answers(BaseModel):
    """Represents a list of answers."""
    answers: List[Answer]

In [26]:
Answers.schema()

{'title': 'Answers',
 'description': 'Represents a list of answers.',
 'type': 'object',
 'properties': {'answers': {'title': 'Answers',
   'type': 'array',
   'items': {'$ref': '#/definitions/Answer'}}},
 'required': ['answers'],
 'definitions': {'Answer': {'title': 'Answer',
   'description': 'Represents a single choice with a reason.',
   'type': 'object',
   'properties': {'choice': {'title': 'Choice', 'type': 'integer'},
    'reason': {'title': 'Reason', 'type': 'string'}},
   'required': ['choice', 'reason']}}}
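
For intuition, here is a rough sketch of how a JSON schema like the one above can be wrapped as a function definition in the shape OpenAI's function calling API accepts (`name`, `description`, `parameters`). Whether the program builds exactly this payload internally is an assumption, and `answers_schema` is a hand-trimmed copy of the printed schema.

```python
import json

# Hand-trimmed copy of the schema printed above.
answers_schema = {
    "title": "Answers",
    "description": "Represents a list of answers.",
    "type": "object",
    "properties": {
        "answers": {"type": "array", "items": {"$ref": "#/definitions/Answer"}}
    },
    "required": ["answers"],
    "definitions": {
        "Answer": {
            "type": "object",
            "properties": {
                "choice": {"type": "integer"},
                "reason": {"type": "string"},
            },
            "required": ["choice", "reason"],
        }
    },
}

# A function definition: a name, a description, and a JSON schema for the args.
function_def = {
    "name": answers_schema["title"],
    "description": answers_schema["description"],
    "parameters": answers_schema,
}
print(json.dumps(function_def, indent=2))

# The model replies with the function's arguments as a JSON string, which
# can be loaded directly instead of being scraped out of free text.
call_args = json.loads('{"answers": [{"choice": 2, "reason": "about oranges"}]}')
first_choice = call_args["answers"][0]["choice"]
print(first_choice)  # 2
```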

In [31]:
from llama_index.program import OpenAIPydanticProgram

In [37]:
router_prompt1 = router_prompt0.partial_format(
    num_choices=len(choices),
    max_outputs=len(choices),
)

In [38]:
program = OpenAIPydanticProgram.from_defaults(
    output_cls=Answers,
    prompt=router_prompt1,
    verbose=True,
)

In [39]:
query_str = "What are the health benefits of eating orange peels?"
output = program(
    context_list=choices_str,
    query_str=query_str
)

Function call: Answers with args: {
  "answers": [
    {
      "choice": 2,
      "reason": "Useful for questions related to oranges"
    }
  ]
}


In [40]:
output

Answers(answers=[Answer(choice=2, reason='Useful for questions related to oranges')])

## 4. Plug the Router Module into a RAG Pipeline

In this section we'll put the router module to use in a RAG pipeline. We'll use it to dynamically decide whether to perform question-answering or summarization. Question-answering is performed through vector index retrieval, while summarization is performed through our summary index.
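
The dispatch idea can be sketched without any index at all. In the snippet below, `answer_question` and `summarize` are hypothetical stand-ins for the vector-index and summary-index query paths, and `dispatch` assumes the router returns 1-based choice numbers in the same order as the choices it was given.

```python
from typing import Callable, Dict, List


# Hypothetical stand-ins for the two query paths; a real pipeline would call
# a vector index (QA) and a summary index (summarization) here.
def answer_question(query: str) -> str:
    return f"[qa] {query}"


def summarize(query: str) -> str:
    return f"[summary] {query}"


# 1-based choice number -> handler, mirroring the order of the router choices.
handlers: Dict[int, Callable[[str], str]] = {1: answer_question, 2: summarize}


def dispatch(query: str, selected_choices: List[int]) -> List[str]:
    """Run the query through every handler the router selected."""
    return [handlers[c](query) for c in selected_choices if c in handlers]


results = dispatch("Give me a summary of the document", [2])
print(results)
```

Returning a list keeps the door open for queries where the router selects more than one choice.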