# Building Fallbacks to Websearch with Conditional Routing

The retrieval step is crucial in RAG applications. If the retriever does not have access to enough information or documents, the answers it can support are limited. A practical solution is to fall back to web search. Conditional routing makes this possible: the pipeline uses the web as a data source only when certain conditions are met.

We will create a pipeline that has conditional routing that directs the query to a `web-based RAG` route if the answer is not found in the initially given documents.

In [13]:
import os
from dotenv import load_dotenv

load_dotenv()

if os.getenv("OPENAI_API_KEY") is None:
    raise ValueError("OPENAI_API_KEY is not present")


if os.getenv("SERPERDEV_API_KEY") is None:
    raise ValueError("SERPERDEV_API_KEY is not present")

#### Creating a Document

This is a document about Munich. The pipeline first looks for the answer to the question in this document.

In [14]:
from haystack.dataclasses import Document

documents = [
    Document(content="""
             Munich, the vibrant capital of Bavaria in southern Germany, exudes a perfect blend of rich cultural
                                heritage and modern urban sophistication. Nestled along the banks of the Isar River, Munich is renowned
                                for its splendid architecture, including the iconic Neues Rathaus (New Town Hall) at Marienplatz and
                                the grandeur of Nymphenburg Palace. The city is a haven for art enthusiasts, with world-class museums like the
                                Alte Pinakothek housing masterpieces by renowned artists. Munich is also famous for its lively beer gardens, where
                                locals and tourists gather to enjoy the city's famed beers and traditional Bavarian cuisine. The city's annual
                                Oktoberfest celebration, the world's largest beer festival, attracts millions of visitors from around the globe.
                                Beyond its cultural and culinary delights, Munich offers picturesque parks like the English Garden, providing a
                                serene escape within the heart of the bustling metropolis. Visitors are charmed by Munich's warm hospitality,
                                making it a must-visit destination for travelers seeking a taste of both old-world charm and contemporary allure.
             """)
]

#### Creating Initial Pipeline Components

Below, we instruct the model to reply with `no_answer` if the documents do not contain the answer to the question. This works well with `gpt-3.5-turbo`. With a different Generator, make sure the prompt is explicit enough for the model to follow this instruction.

In [15]:
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.components.generators import OpenAIGenerator

prompt_template = """
Answer the following query given the documents.
If the answer is not contained within the documents, reply with 'no_answer'.
Query: {{query}}
Documents:
{% for document in documents %}
    {{ document.content }}
{% endfor %}
"""

prompt_builder = PromptBuilder(template=prompt_template)
llm = OpenAIGenerator(model="gpt-3.5-turbo")
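
Because the fallback hinges on the model emitting the exact keyword, it can help to check how a reply would be recognized before wiring up the pipeline. The sketch below is a hypothetical helper (`is_no_answer` is not part of Haystack). It assumes a model might vary the casing, wrap the keyword in quotes, or add trailing punctuation. It is stricter than a plain substring test because it requires the whole reply to be the keyword.

```python
import string

NO_ANSWER = "no_answer"  # sentinel keyword requested in the prompt


def is_no_answer(reply: str) -> bool:
    """Return True if the whole reply is the 'no_answer' sentinel.

    Hypothetical helper: tolerates case differences, surrounding
    whitespace, quotes, punctuation, and a space or hyphen used
    instead of the underscore.
    """
    # Drop surrounding whitespace and punctuation such as quotes or a final period
    normalized = reply.strip().strip(string.punctuation + string.whitespace).lower()
    # Treat "no answer" and "no-answer" like "no_answer"
    normalized = normalized.replace(" ", "_").replace("-", "_")
    return normalized == NO_ANSWER
```

A check like this could be run on a few sample replies when trying out a different Generator, to see whether the model follows the instruction closely enough for the fallback to trigger.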

#### Initializing the Web Search Components

In [16]:
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.components.generators import OpenAIGenerator
from haystack.components.websearch.serper_dev import SerperDevWebSearch

prompt_for_websearch = """
Answer the following query given the documents retrieved from the web.
Your answer should indicate that your answer was generated from websearch.

Query: {{ query }}
Documents:
{% for document in documents %}
    {{ document.content }}
{% endfor %}
"""

websearch = SerperDevWebSearch()
prompt_builder_for_websearch = PromptBuilder(template=prompt_for_websearch)
llm_for_websearch = OpenAIGenerator(model="gpt-3.5-turbo")

#### Creating the ConditionalRouter

`ConditionalRouter` routes data based on specific conditions. Each condition is a Jinja2 string expression that determines whether its route is selected.

In this example, we have two routes:
1. If the LLM replies with the `no_answer` keyword, the pipeline performs a web search.
2. Otherwise, the given documents are enough for an answer and pipeline execution ends here.

In [17]:
from haystack.components.routers import ConditionalRouter

routes = [
    {
        "condition": "{{ 'no_answer' in replies[0] }}",
        "output": "{{ query }}", # we pass the query as the input of the next pipeline
        "output_name": "go_to_websearch",
        "output_type": str
    },
    {
        "condition": "{{ 'no_answer' not in replies[0] }}",
        "output": "{{ replies[0] }}", # we pass the answer instead
        "output_name": "answer",
        "output_type": str
    }
]

router = ConditionalRouter(routes)
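
To make the branching explicit, the two routes can be mirrored in plain Python. This is only an illustrative sketch of the logic the conditions express, not how `ConditionalRouter` is implemented. The function name `select_route` is hypothetical, and the returned dictionary simply pairs each route's `output_name` with its `output`.

```python
from typing import Dict, List


def select_route(replies: List[str], query: str) -> Dict[str, str]:
    """Mirror the two routes: return {output_name: output}."""
    if "no_answer" in replies[0]:
        # Route 1: forward the original query to the web search branch
        return {"go_to_websearch": query}
    # Route 2: the first reply already is the answer
    return {"answer": replies[0]}


print(select_route(["no_answer"], "How many people live in Munich?"))
# {'go_to_websearch': 'How many people live in Munich?'}
print(select_route(["Munich is located in southern Germany."], "Where is Munich?"))
# {'answer': 'Munich is located in southern Germany.'}
```

The two conditions are complementary, so exactly one route matches any given reply.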

#### Building the Pipeline

Add all components to the pipeline and connect them!

The `go_to_websearch` output of the router must be connected to `websearch`, to retrieve documents from the web, and to `prompt_builder_for_websearch`, to include the query in the prompt.

In [18]:
from haystack import Pipeline

pipe = Pipeline()
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)
pipe.add_component("router", router)
pipe.add_component("websearch", websearch)
pipe.add_component("prompt_builder_for_websearch", prompt_builder_for_websearch)
pipe.add_component("llm_for_websearch", llm_for_websearch)

pipe.connect("prompt_builder", "llm")
pipe.connect("llm.replies", "router.replies")
pipe.connect("router.go_to_websearch", "websearch.query")
pipe.connect("router.go_to_websearch", "prompt_builder_for_websearch.query")
pipe.connect("websearch.documents", "prompt_builder_for_websearch.documents")
pipe.connect("prompt_builder_for_websearch", "llm_for_websearch")

<haystack.core.pipeline.pipeline.Pipeline object at 0x7d548b152c90>
🚅 Components
  - prompt_builder: PromptBuilder
  - llm: OpenAIGenerator
  - router: ConditionalRouter
  - websearch: SerperDevWebSearch
  - prompt_builder_for_websearch: PromptBuilder
  - llm_for_websearch: OpenAIGenerator
🛤️ Connections
  - prompt_builder.prompt -> llm.prompt (str)
  - llm.replies -> router.replies (List[str])
  - router.go_to_websearch -> websearch.query (str)
  - router.go_to_websearch -> prompt_builder_for_websearch.query (str)
  - websearch.documents -> prompt_builder_for_websearch.documents (List[Document])
  - prompt_builder_for_websearch.prompt -> llm_for_websearch.prompt (str)

##### Visualize the Pipeline
This creates a `pipe.png` file in the current working directory.

In [19]:
pipe.draw("pipe.png")

#### Running the Pipeline

In [27]:
query = "Where is Munich?"

result = pipe.run({
    "prompt_builder": {
        "query": query,
        "documents": documents
    },
    "router": {
        "query": query
    }
})

# Print the `answer` coming from the ConditionalRouter
# The result has a `router` key, so the answer is found there
print(result['router']['answer'])

Munich is located in southern Germany.


In [28]:
result

{'llm': {'meta': [{'model': 'gpt-3.5-turbo-0125',
    'index': 0,
    'finish_reason': 'stop',
    'usage': {'completion_tokens': 9,
     'prompt_tokens': 270,
     'total_tokens': 279}}]},
 'router': {'answer': 'Munich is located in southern Germany.'},
 'llm_for_websearch': {'replies': ['I am happy to help with your query. Could you please provide me with the documents retrieved from the web so I can generate an answer for you?'],
  'meta': [{'model': 'gpt-3.5-turbo-0125',
    'index': 0,
    'finish_reason': 'stop',
    'usage': {'completion_tokens': 30,
     'prompt_tokens': 39,
     'total_tokens': 69}}]}}

In [32]:
query = "How many people live in Munich?"

result = pipe.run({
    "prompt_builder": {
        "query": query,
        "documents": documents
    },
    "router": {
        "query": query
    }
})



# Print the `replies` generated from the documents retrieved by web search
# The `websearch` key is present and the `router` key is absent
print(result['llm_for_websearch']['replies'])

['According to the documents retrieved from the web, the population of Munich varies slightly depending on the source and year, but the estimates range from around 1.35 million to 1.59 million as of 2024.']
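
The two runs above place the final answer under different keys, so calling code needs to check which branch produced it. The sketch below is a hypothetical helper (`final_answer` is not a Haystack function) that assumes the result dictionaries have the shape shown in the outputs above. It checks `router` first because, in the first run's output, an `llm_for_websearch` entry appears alongside the `router` answer.

```python
from typing import Any, Dict


def final_answer(result: Dict[str, Any]) -> str:
    """Hypothetical helper: return the answer from whichever branch produced one."""
    router_output = result.get("router", {})
    if "answer" in router_output:
        # The initial documents were enough; the router returned the answer
        return router_output["answer"]
    # Otherwise use the first reply generated from the web search documents
    return result["llm_for_websearch"]["replies"][0]
```

With this helper, `print(final_answer(result))` could replace the two branch-specific `print` calls.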


In [33]:
result

{'llm': {'meta': [{'model': 'gpt-3.5-turbo-0125',
    'index': 0,
    'finish_reason': 'stop',
    'usage': {'completion_tokens': 2,
     'prompt_tokens': 273,
     'total_tokens': 275}}]},
 'websearch': {'links': ['https://en.wikipedia.org/wiki/Munich',
   'https://worldpopulationreview.com/world-cities/munich-population',
   'https://www.macrotrends.net/global-metrics/cities/204371/munich/population',
   'https://en.wikipedia.org/wiki/Demographics_of_Munich',
   'https://www.statista.com/statistics/505774/munich-population/',
   'https://www.britannica.com/place/Munich-Bavaria-Germany',
   'https://www.statista.com/statistics/519723/munich-population-by-age-group/',
   'https://eurocities.eu/cities/munich/',
   'https://www.ricksteves.com/watch-read-listen/read/articles/munich-a-metropolis-with-smalltown-charm',
   'https://www.quora.com/How-many-people-live-in-Munich']},
 'llm_for_websearch': {'replies': ['According to the documents retrieved from the web, the population of Munich v