<a href="https://colab.research.google.com/github/vblagoje/notebooks/blob/main/haystack2x-demos/haystack_rag_firecrawl_demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Introduction

This notebook shows how to use the upcoming Haystack 2.3 and the Firecrawl scrape service to implement an OpenAPI service-based Retrieval-Augmented Generation (RAG) workflow. It processes user queries by scraping web pages and using the retrieved text to answer questions, a classic RAG-on-the-web use case.

The pipeline starts by fetching the OpenAPI specification for Firecrawl and converting it into OpenAI function-calling definitions. Firecrawl scrapes relevant information from the web based on user input, providing structured markdown for easy integration with LLMs.
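To make the conversion step concrete, here is a minimal sketch of how an OpenAPI operation could be mapped to an OpenAI-style function-calling definition. It is not the converter Haystack uses internally. `SAMPLE_SPEC` is a hypothetical, heavily trimmed fragment (including the `scrape` operation ID), and `openapi_to_functions` is an illustrative helper name.

```python
from typing import Any, Dict, List

# Hypothetical, heavily trimmed spec fragment for illustration only;
# the real Firecrawl specification is larger and may use different names.
SAMPLE_SPEC: Dict[str, Any] = {
    "paths": {
        "/scrape": {
            "post": {
                "operationId": "scrape",
                "summary": "Scrape a single URL",
                "requestBody": {
                    "content": {
                        "application/json": {
                            "schema": {
                                "type": "object",
                                "properties": {
                                    "url": {"type": "string", "description": "The URL to scrape"}
                                },
                                "required": ["url"],
                            }
                        }
                    }
                },
            }
        }
    }
}

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def openapi_to_functions(spec: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Map each operation in the spec to an OpenAI-style function definition."""
    functions = []
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            # Reuse the JSON request-body schema as the function's parameter schema.
            body = operation.get("requestBody", {}).get("content", {}).get("application/json", {})
            schema = body.get("schema", {"type": "object", "properties": {}})
            functions.append({
                "type": "function",
                "function": {
                    "name": operation.get("operationId", f"{method}_{path.strip('/')}"),
                    "description": operation.get("summary", ""),
                    "parameters": schema,
                },
            })
    return functions


functions = openapi_to_functions(SAMPLE_SPEC)
print(functions[0]["function"]["name"])
```

A function-calling model that receives such a definition can return the operation name together with a JSON object of arguments, which is what the pipeline later turns into a service request.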

You'll be guided through installing the necessary libraries, selecting an LLM provider, and constructing a Haystack 2.x pipeline that orchestrates function calling, service requests, and LLM response generation.

Note: A Firecrawl account is required to run this pipeline. Signing up is straightforward and provides access to their scraping capabilities.

For more details on Firecrawl and its features, refer to the [Firecrawl documentation](https://docs.firecrawl.dev/introduction).

## 1. Setup

Let's install the necessary libraries and import the key modules used in the subsequent steps.

In [1]:
!pip install -q cohere-haystack git+https://github.com/deepset-ai/haystack-experimental.git@openapi


In [2]:
import getpass
import os
import json
import requests

from typing import Dict, Any, List

from haystack import Pipeline
from haystack.components.generators.utils import print_streaming_chunk
from haystack.components.converters import OutputAdapter
from haystack.dataclasses import ChatMessage, ByteStream
from haystack.utils import Secret

from haystack_integrations.components.generators.cohere import CohereChatGenerator
from haystack_experimental.components.tools.openapi import OpenAPITool, LLMProvider

## 2. Enter the API keys for the LLM provider and Firecrawl

In [3]:
llm_api_key = getpass.getpass("Enter Cohere API key:")
firecrawl_dev_key = getpass.getpass("Enter Firecrawl API key:")

Enter Cohere API key:··········
Enter Firecrawl API key:··········
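When running the notebook non-interactively, prompting for keys can be inconvenient. One possible alternative, sketched below, is to check an environment variable first and fall back to `getpass` only when it is unset. The helper `read_api_key` and the variable names `COHERE_API_KEY` and `FIRECRAWL_API_KEY` are assumptions made for this example, not part of the original notebook.

```python
import getpass
import os


def read_api_key(env_var: str, prompt: str) -> str:
    """Return the key from the environment if set, otherwise prompt for it."""
    value = os.environ.get(env_var)
    if value:
        return value
    return getpass.getpass(prompt)


# Example usage (variable names are assumptions for illustration):
# llm_api_key = read_api_key("COHERE_API_KEY", "Enter Cohere API key:")
# firecrawl_dev_key = read_api_key("FIRECRAWL_API_KEY", "Enter Firecrawl API key:")
```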


## 3. Build our RAG Web QA pipeline



In [10]:
pipe = Pipeline()

# OpenAPITool loads the Firecrawl OpenAPI spec, lets the LLM choose the call parameters,
# and invokes the service with the supplied credentials.
pipe.add_component("firecrawl", OpenAPITool(generator_api=LLMProvider.COHERE,
                                            generator_api_params={"model": "command-r",
                                                                  "api_key": Secret.from_token(llm_api_key)},
                                            spec="https://raw.githubusercontent.com/mendableai/firecrawl/main/apps/api/openapi.json",
                                            credentials=Secret.from_token(firecrawl_dev_key)))

# Prepend the question (system_message) to the content returned by Firecrawl.
pipe.add_component("final_prompt_adapter", OutputAdapter("{{system_message + service_response}}", List[ChatMessage]))

# The LLM that writes the final answer; replies are streamed to stdout as they arrive.
pipe.add_component("llm", CohereChatGenerator(api_key=Secret.from_token(llm_api_key), model="command-r",
                                              generation_kwargs={"max_tokens": 1024},
                                              streaming_callback=print_streaming_chunk))

pipe.connect("firecrawl", "final_prompt_adapter.service_response")
pipe.connect("final_prompt_adapter", "llm.messages")

<haystack.core.pipeline.pipeline.Pipeline object at 0x79fdbec05b70>
🚅 Components
  - firecrawl: OpenAPITool
  - final_prompt_adapter: OutputAdapter
  - llm: CohereChatGenerator
🛤️ Connections
  - firecrawl.service_response -> final_prompt_adapter.service_response (List[ChatMessage])
  - final_prompt_adapter.output -> llm.messages (List[ChatMessage])

As the pipeline summary above shows, for a given Firecrawl request our pipeline follows these steps:

1. **Fetch OpenAPI Spec**: Retrieve the OpenAPI specification for Firecrawl and convert it into OpenAI function-calling definitions.

2. **Determine Parameters**: Use a function-calling model to identify the necessary parameters for the Firecrawl service.

3. **Invoke Firecrawl**: Dispatch the request to Firecrawl with the determined parameters and gather the responses.

4. **Compile Results**: Organize and format the scraped content returned by Firecrawl.

5. **Generate Response**: Combine the prompt, which carries the user's question, with the compiled scrape results, then pass them to the LLM to generate the final response.
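Steps 3 and 5 above can be sketched in plain Python. This is an illustration under stated assumptions, not how `OpenAPITool` or `OutputAdapter` are implemented. The base URL, the bearer-token header, the `scrape` function name, and the helpers `build_request` and `assemble_messages` are all hypothetical. Messages are represented as plain strings rather than `ChatMessage` objects to keep the example dependency-free.

```python
import json
from typing import Any, Dict, List


def build_request(base_url: str, path: str, method: str,
                  arguments: Dict[str, Any], api_key: str) -> Dict[str, Any]:
    """Describe the HTTP request for a function call without sending it."""
    return {
        "method": method.upper(),
        "url": base_url.rstrip("/") + path,
        # Bearer authentication is an assumption made for this sketch.
        "headers": {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
        "body": json.dumps(arguments),
    }


def assemble_messages(system_message: List[str], service_response: List[str]) -> List[str]:
    """Mimic the "{{system_message + service_response}}" template as list concatenation."""
    return system_message + service_response


# A function call as a function-calling model might return it (illustrative values).
function_call = {
    "name": "scrape",
    "arguments": {"url": "https://haystack.deepset.ai/blog/rag-evaluation-with-prometheus-2"},
}

# Step 3: turn the chosen arguments into a request description (placeholder URL and key).
request = build_request("https://api.example.com/v0", "/scrape", "post",
                        function_call["arguments"], "fc-placeholder")

# Step 5: the question comes first, followed by whatever the service returned.
question = "What's the main takeaway from this article?"
scraped_markdown = "# Placeholder article text"
messages = assemble_messages([question], [scraped_markdown])

print(request["method"], request["url"])
print(len(messages))
```

Placing the question before the scraped content follows the order of the adapter template used in the pipeline.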

In [16]:
user_prompt = "Given the article below, answer: What's the main takeaway from this article? Be insightful and elaborate, base your answer on the article only."

In [17]:
result = pipe.run(data={"firecrawl": {"messages": [ChatMessage.from_user("Scrape https://haystack.deepset.ai/blog/rag-evaluation-with-prometheus-2")]},
                        "final_prompt_adapter": {"system_message": [ChatMessage.from_user(user_prompt)]}})


The main takeaway from the article is the introduction of Prometheus 2, a cutting-edge open-source model specifically designed for evaluating the output of language models. What sets Prometheus 2 apart is its ability to bridge the gap between proprietary models and open LMs for evaluation purposes. The model is trained to perform direct assessments and pairwise rankings, which makes it a versatile tool for evaluating language models.

The article also serves as a comprehensive guide on how to leverage Prometheus 2 within the Haystack framework. Haystack users can now create custom evaluators using Prometheus 2, enabling them to assess their models' performance along various dimensions, such as correctness, response relevance, and logical robustness. This enhancement empowers developers to fine-tune their language models and RAG pipelines, ensuring they deliver accurate and meaningful responses.

By employing Prometheus 2 as an evaluator, users can benefit from a robust evaluation proce