# Building pipelines that perform routing



## Conditional routing

This example routes a query to one of two processing paths, depending on whether it contains a specific keyword.

Scenario: keyword detection routing

* If the query contains the keyword "capital", the router passes it to a prompt builder and an OpenAI generator, which answers it.
* If the query doesn't contain "capital", the router emits it on a `general_query` output. Nothing is connected to that output in this example, so the pipeline returns the router's message directly; in a fuller system it could feed a general information retrieval component that provides a broader response.

We'll build this example using a `ConditionalRouter` with one route per case.
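Before involving Haystack, the scenario above can be sketched in plain Python. This is only an illustration under one assumption: routes are checked in order and the first condition that holds decides the output. The `ROUTES` structure and the `route_query` helper are hypothetical and not part of the Haystack API.

```python
from typing import Callable, List, Tuple

# Each route pairs a condition with an output name and an output builder.
# Hypothetical structure for illustration only.
Route = Tuple[Callable[[str], bool], str, Callable[[str], str]]

ROUTES: List[Route] = [
    (lambda q: "capital" in q, "capital_related_query", lambda q: q),
    (lambda q: "capital" not in q, "general_query", lambda q: f"This is a general query: {q}"),
]


def route_query(query: str) -> Tuple[str, str]:
    """Return (output_name, output_value) for the first route whose condition holds."""
    for condition, output_name, build_output in ROUTES:
        if condition(query):
            return output_name, build_output(query)
    raise ValueError(f"No route matched query: {query!r}")


print(route_query("What is the capital of France?"))
print(route_query("Berlin"))
```

The keyword check is a case-sensitive substring match, so a query such as "Capital cities" would fall through to the general route.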

In [9]:
from haystack import Pipeline
from haystack.components.routers import ConditionalRouter
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.components.generators import OpenAIGenerator
from haystack.utils import Secret
from dotenv import load_dotenv

# Load OPENAI_API_KEY from a local .env file into the environment;
# Secret.from_env_var below reads it from there
load_dotenv('.env')

# Example of a pipeline with a ConditionalRouter that routes
# queries based on keyword presence

# Define the routes based on a keyword check
routes = [

    {
        "condition": "{{'capital' in query}}",  # Check if the query contains the keyword "capital"
        "output": "{{query}}",  # Proceed with the query if "capital" is in the query
        "output_name": "capital_related_query",
        "output_type": str,
    },
    {
        "condition": "{{'capital' not in query}}",  # Otherwise, handle general queries
        "output": "This is a general query: {{query}}",
        "output_name": "general_query",
        "output_type": str,
    }
]

# Create the router
router = ConditionalRouter(routes=routes)

# Create the pipeline with components
pipe = Pipeline()

# Add the router, prompt builder, and generator
pipe.add_component("router", router)
pipe.add_component("prompt_builder", PromptBuilder("Answer the following query: {{query}}"))
pipe.add_component("generator", OpenAIGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY")))

# Connect the components
pipe.connect("router.capital_related_query", "prompt_builder.query")
pipe.connect("prompt_builder", "generator")


<haystack.core.pipeline.pipeline.Pipeline object at 0x14fbb7f50>
🚅 Components
  - router: ConditionalRouter
  - prompt_builder: PromptBuilder
  - generator: OpenAIGenerator
🛤️ Connections
  - router.capital_related_query -> prompt_builder.query (str)
  - prompt_builder.prompt -> generator.prompt (str)

In [10]:
# Example 1: A query without the keyword "capital"
result_short = pipe.run(data={"router": {"query": "Berlin"}})
print(result_short)
# Expected output: {'router': {'general_query': 'This is a general query: Berlin'}}


{'router': {'general_query': 'This is a general query: Berlin'}}


In [11]:
# Example 2: A query containing the keyword "capital"
result_long_capital = pipe.run(data={"router": {"query": "What is the capital of France?"}})
print(result_long_capital)
# Expected output (metadata omitted): {'generator': {'replies': ['The capital of France is Paris.']}}


{'generator': {'replies': ['The capital of France is Paris.'], 'meta': [{'model': 'gpt-4o-mini-2024-07-18', 'index': 0, 'finish_reason': 'stop', 'usage': {'completion_tokens': 7, 'prompt_tokens': 19, 'total_tokens': 26, 'completion_tokens_details': CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), 'prompt_tokens_details': PromptTokensDetails(audio_tokens=0, cached_tokens=0)}}]}}


In [12]:
# Example 3: A longer query without the keyword "capital"
result_long_general = pipe.run(data={"router": {"query": "Tell me about the Eiffel Tower."}})
print(result_long_general)
# Expected output: {'router': {'general_query': 'This is a general query: Tell me about the Eiffel Tower.'}}


{'router': {'general_query': 'This is a general query: Tell me about the Eiffel Tower.'}}
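Because the two branches surface under different keys in the result, downstream code has to check which one fired. A minimal sketch, assuming the result dictionaries have the shapes printed above; `extract_answer` is a hypothetical helper, not a Haystack function.

```python
from typing import Any, Dict


def extract_answer(result: Dict[str, Any]) -> str:
    """Return the text produced by whichever branch of the pipeline ran."""
    if "generator" in result:
        # The "capital" branch ran: the LLM replies are under 'generator'.
        return result["generator"]["replies"][0]
    # Otherwise the router's unconnected 'general_query' output was returned.
    return result["router"]["general_query"]


# Hand-written dictionaries shaped like the outputs printed above.
capital_result = {"generator": {"replies": ["The capital of France is Paris."]}}
general_result = {"router": {"general_query": "This is a general query: Berlin"}}

print(extract_answer(capital_result))
print(extract_answer(general_result))
```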


## Routing based on file type

`FileTypeRouter` routes files of different types (e.g., plain text, PDF, images, audio) to different components based on their MIME types. In this example:

* Plain text files are routed to a converter and a splitter, then written to a document store.
* PDF files get their own router output (`application/pdf`), which could be connected to a PDF converter. No PDF converter is added in this example.
* Unclassified files (files that don't match any of the MIME types provided) are emitted on an `unclassified` output.

Scenario: we have the following files:

* A plain text file: example.txt
* An image file: image.jpeg


The objective is to:

* Convert the plain text file into a document.
* Skip the image file as it is unclassified and doesn't match any of the specified MIME types.

We will use the `FileTypeRouter` to route these files to their respective processing paths.
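The grouping idea can be approximated with the standard library alone. The sketch below guesses each file's MIME type from its extension and collects anything not on the allowed list under `unclassified`. It is an illustration only: `group_by_mime_type` is a hypothetical helper, and `FileTypeRouter`'s actual implementation may differ.

```python
import mimetypes
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def group_by_mime_type(sources: List[str], mime_types: List[str]) -> Dict[str, List[Path]]:
    """Group paths under their guessed MIME type, or 'unclassified' if not listed."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for source in sources:
        path = Path(source)
        # guess_type only inspects the file name; the file does not need to exist.
        guessed, _ = mimetypes.guess_type(path.name)
        key = guessed if guessed in mime_types else "unclassified"
        groups[key].append(path)
    return dict(groups)


groups = group_by_mime_type(["example.txt", "image.jpeg"], ["text/plain", "application/pdf"])
print(groups)
```

In this sketch a MIME type with no matching files, such as `application/pdf` here, simply has no entry in the result.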

In [13]:
from haystack import Pipeline
from haystack.components.routers import FileTypeRouter
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.converters import TextFileToDocument
from haystack.components.preprocessors import DocumentSplitter
from haystack.components.writers import DocumentWriter

# Create an in-memory document store to hold the processed documents
document_store = InMemoryDocumentStore()

# Initialize the pipeline
pipeline = Pipeline()

# Add the FileTypeRouter that routes only 'text/plain' and 'application/pdf'
pipeline.add_component(instance=FileTypeRouter(mime_types=["text/plain", "application/pdf"]), name="file_type_router")

# Add the component for text file conversion (no PDF converter is added in this example)
pipeline.add_component(instance=TextFileToDocument(), name="text_file_converter")

# Add components for splitting and writing documents
pipeline.add_component(instance=DocumentSplitter(), name="splitter")
pipeline.add_component(instance=DocumentWriter(document_store=document_store), name="writer")

# Connect components in the pipeline
pipeline.connect("file_type_router.text/plain", "text_file_converter.sources")
pipeline.connect("text_file_converter.documents", "splitter.documents")
pipeline.connect("splitter.documents", "writer.documents")

<haystack.core.pipeline.pipeline.Pipeline object at 0x14fc01910>
🚅 Components
  - file_type_router: FileTypeRouter
  - text_file_converter: TextFileToDocument
  - splitter: DocumentSplitter
  - writer: DocumentWriter
🛤️ Connections
  - file_type_router.text/plain -> text_file_converter.sources (List[Union[str, Path, ByteStream]])
  - text_file_converter.documents -> splitter.documents (List[Document])
  - splitter.documents -> writer.documents (List[Document])

In [14]:
# Run the pipeline with a list of file paths
result = pipeline.run({"file_type_router": {"sources": ["example.txt", "image.jpeg"]}})

print(result)

{'file_type_router': {'unclassified': [PosixPath('image.jpeg')]}, 'writer': {'documents_written': 1}}


## Routers for text classification


In [20]:
from haystack.components.routers import TransformersTextRouter
import pandas as pd

### Use case 1: classifying whether a query is a statement or question

Using the `shahrukhx01/question-vs-statement-classifier` model from Hugging Face, we can classify whether a query is phrased as a question or as a statement.

In [21]:
text_router = TransformersTextRouter(model="shahrukhx01/question-vs-statement-classifier")
text_router.warm_up()

queries = [
    "Who was the father of Arya Stark",  # Interrogative Query
    "Lord Eddard was the father of Arya Stark",  # Statement Query
]

results = {"Query": [], "Output Branch": [], "Class": []}

for query in queries:
    result = text_router.run(text=query)
    results["Query"].append(query)
    results["Output Branch"].append(next(iter(result)))
    results["Class"].append("Question" if next(iter(result)) == "LABEL_1" else "Statement")

pd.DataFrame.from_dict(results)



| | Query | Output Branch | Class |
|---|---|---|---|
| 0 | Who was the father of Arya Stark | LABEL_1 | Question |
| 1 | Lord Eddard was the father of Arya Stark | LABEL_0 | Statement |


### Use case 2: classifying sentiment

Using the `cardiffnlp/twitter-roberta-base-sentiment` model from Hugging Face, we will classify the sentiment of each query.

In [23]:
text_router = TransformersTextRouter(model="cardiffnlp/twitter-roberta-base-sentiment")
text_router.warm_up()

In [24]:

queries = [
    "What's the answer?",  # neutral query
    "Would you be so lovely to tell me the answer?",  # positive query
    "Can you give me the damn right answer for once??",  # negative query
]

sent_results = {"Query": [], "Output Branch": [], "Class": []}

for query in queries:
    result = text_router.run(text=query)
    sent_results["Query"].append(query)
    sent_results["Output Branch"].append(next(iter(result)))
    sent_results["Class"].append({"LABEL_0": "negative", "LABEL_1": "neutral", "LABEL_2":"positive"}.get(next(iter(result)), "Unknown"))

pd.DataFrame.from_dict(sent_results)



| | Query | Output Branch | Class |
|---|---|---|---|
| 0 | What's the answer? | LABEL_1 | neutral |
| 1 | Would you be so lovely to tell me the answer? | LABEL_2 | positive |
| 2 | Can you give me the damn right answer for once?? | LABEL_0 | negative |
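The loop above reads only the key of the router's result. A minimal sketch of turning that result into a readable class, assuming `run` returns a single-entry dictionary that maps the predicted label to the input text (which is what `next(iter(result))` relies on). `SENTIMENT_NAMES` and `readable_class` are hypothetical names, and the router output is simulated so that no model download is needed.

```python
from typing import Dict, Tuple

# Mapping from the model's raw labels to readable class names, as used in the loop above.
SENTIMENT_NAMES = {"LABEL_0": "negative", "LABEL_1": "neutral", "LABEL_2": "positive"}


def readable_class(result: Dict[str, str]) -> Tuple[str, str, str]:
    """Return (branch, class name, text) from a single-entry router result."""
    branch, text = next(iter(result.items()))
    return branch, SENTIMENT_NAMES.get(branch, "Unknown"), text


# Simulated router output, shaped like the results in the table above.
simulated = {"LABEL_2": "Would you be so lovely to tell me the answer?"}
print(readable_class(simulated))
```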


### Zero-Shot Classification with `TransformersZeroShotTextRouter`

`TransformersZeroShotTextRouter` lets you perform zero-shot classification by providing a suitable base transformer model and defining the classes the model should predict.

In [25]:
from haystack.components.routers import TransformersZeroShotTextRouter

text_router = TransformersZeroShotTextRouter(
    model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0",
    labels=["spam", "not spam"])
text_router.warm_up()

In [26]:
queries = [
    "This is an urgent request to transfer funds as the CEO is facing an emergency, click on this link to proceed",  # spam
    "This is an urgent request to resolve the a power outage at the building, please respond as soon as possible",  # not spam
]

results = {"Query": [], "Output Branch": []}

for query in queries:
    result = text_router.run(text=query)
    results["Query"].append(query)
    results["Output Branch"].append(next(iter(result)))

pd.DataFrame.from_dict(results)

| | Query | Output Branch |
|---|---|---|
| 0 | This is an urgent request to transfer funds as... | spam |
| 1 | This is an urgent request to resolve the a pow... | not spam |


## More on routers

https://docs.haystack.deepset.ai/docs/routers