# Session 3: Haystack Continued

This notebook is designed for **VS Code** and uses **Ollama** to run local LLM models.

**What you'll do**
- Get an OpenSearch instance running locally with Podman
- Build a RAG pipeline with OpenSearch
- Build a Streamlit front-end and connect it to the pipeline for queries
- Explore hybrid retrieval with OpenSearch


Make sure to run your Ollama models in a terminal window:
```bash
ollama start
ollama pull llama3.2
ollama pull nomic-embed-text
```

## 1. Installing Podman

**For macOS:** run one of the following in a terminal window:
```bash
brew install --cask podman-desktop  # for the GUI application
# or
brew install podman  # for the terminal version
```


**For Windows:** download the installer from the following link: https://podman-desktop.io/downloads/windows

## 2. Starting up an OpenSearch Cluster

Run the following commands in the terminal:
```bash
# Start up the Podman machine for the first time
podman machine init
podman machine set --rootful  # allows port binding without restrictions inside the VM
podman machine start
```

Run the following command to start up your OpenSearch cluster:

```bash
podman run \
  -p 9200:9200 \
  -p 9600:9600 \
  -e "discovery.type=single-node" \
  -e "OPENSEARCH_INITIAL_ADMIN_PASSWORD=OSPassword246" \
  --name opensearch \
  docker.io/opensearchproject/opensearch:latest
```


Open a new terminal window and run the following curl command to check that your OpenSearch cluster is running and reachable:

```bash
curl -k -u admin:OSPassword246 https://localhost:9200
```
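
The same check can be done from Python, which is handy inside the notebook. The sketch below uses only the standard library. The helper names `build_request` and `check_cluster` are hypothetical (they are not part of Haystack or OpenSearch), and skipping certificate verification mirrors curl's `-k` flag, so it is only appropriate for this local demo cluster.

```python
import base64
import json
import ssl
import urllib.request


def build_request(url: str, user: str, password: str) -> urllib.request.Request:
    """Build a GET request carrying an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def check_cluster(url: str, user: str, password: str, timeout: float = 5.0) -> dict:
    """Return the JSON document that the cluster serves at its root endpoint."""
    # Equivalent of curl's -k flag: do not verify the self-signed certificate.
    # Assumption: this is a throwaway local cluster, never a production one.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    request = build_request(url, user, password)
    with urllib.request.urlopen(request, timeout=timeout, context=context) as response:
        return json.load(response)


# Example call (needs the cluster from the previous step to be running):
# print(check_cluster("https://localhost:9200", "admin", "OSPassword246"))
```

If the call raises a connection error, the container is probably still starting or the port mapping did not take effect.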

## 3. Creating a RAG Ingestion Pipeline Connected to OpenSearch
Use some Wikipedia pages as content to ingest:

In [None]:
# you can swap out these URLs with any public text URLs you like
PUBLIC_URLS = [
    "https://en.wikipedia.org/wiki/Yellow_warbler",
    "https://en.wikipedia.org/wiki/Natural_language_processing",
    "https://en.wikipedia.org/wiki/Bioluminescence"
]


The code below builds the ingestion pipeline and prints the inputs it requires:

In [None]:
# We are going to use the LinkContentFetcher and HTMLToDocument components to fetch the pages and convert them into Haystack Documents
from haystack import Pipeline
from haystack.components.preprocessors import DocumentSplitter
from haystack.components.writers import DocumentWriter
from haystack_integrations.document_stores.opensearch import OpenSearchDocumentStore
from haystack.components.fetchers import LinkContentFetcher
from haystack.components.converters import HTMLToDocument
from haystack.components.preprocessors import DocumentCleaner
# OllamaDocumentEmbedder embeds each chunk so the dense retriever in section 4 has vectors to search
from haystack_integrations.components.embedders.ollama import OllamaDocumentEmbedder

from haystack.document_stores.types import DuplicatePolicy


# Initialise all the components here:
# --- OpenSearch DocumentStore (local) ---
document_store = OpenSearchDocumentStore(
    hosts="https://localhost:9200",  # https, matching the curl check above
    index="public_texts",
    use_ssl=True,
    verify_certs=False,
    http_auth=("admin", "OSPassword246"),
)
fetcher = LinkContentFetcher( user_agents=["ai-mutual-mentorship/0.1 (https://github.com/larry6point6/ai-mutual-mentorship-scheme)"]) # takes a list of URLs as input and outputs streams (a list of ByteStream objects)
# https://docs.haystack.deepset.ai/docs/linkcontentfetcher
converter = HTMLToDocument() # takes a list of ByteStream objects and outputs a list of Haystack Documents
# https://docs.haystack.deepset.ai/docs/htmltodocument
splitter = DocumentSplitter(split_by="word", split_length=200, split_overlap=15) # takes a list of Haystack Documents and splits them into smaller chunks
# https://docs.haystack.deepset.ai/docs/documentsplitter
cleaner = DocumentCleaner() # removes empty lines, extra whitespace and repeated substrings by default (you can add custom cleaning parameters like remove_regex)
# https://docs.haystack.deepset.ai/docs/documentcleaner
# Assumes the nomic-embed-text model pulled at the start and Ollama's default local endpoint
doc_embedder = OllamaDocumentEmbedder(model="nomic-embed-text", url="http://localhost:11434")
writer = DocumentWriter(document_store=document_store, policy=DuplicatePolicy.SKIP)


# Initialise the ingestion pipeline
ingestion_pipeline = Pipeline()

# Add all the components to the pipeline
ingestion_pipeline.add_component(instance=fetcher, name="fetcher")
ingestion_pipeline.add_component(instance=converter, name="converter")
ingestion_pipeline.add_component(instance=cleaner, name="cleaner")
ingestion_pipeline.add_component(instance=splitter, name="splitter")
ingestion_pipeline.add_component(instance=doc_embedder, name="doc_embedder")
ingestion_pipeline.add_component(instance=writer, name="writer")

# Connect the inputs and outputs of the components together
ingestion_pipeline.connect("fetcher.streams", "converter") # When there is only one correct type of input/output these can be inferred
ingestion_pipeline.connect("converter", "cleaner")
ingestion_pipeline.connect("cleaner", "splitter")
ingestion_pipeline.connect("splitter", "doc_embedder")
ingestion_pipeline.connect("doc_embedder.documents", "writer.documents")

# Print out the list of required inputs using the following command
ingestion_pipeline.inputs()
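
To build intuition for the `split_length` and `split_overlap` settings passed to `DocumentSplitter`, here is a simplified sketch of word-based splitting with overlap. It is an illustration only, not Haystack's implementation, and the function name `split_by_words` is hypothetical. Each window starts `split_length - split_overlap` words after the previous one, so consecutive chunks share `split_overlap` words.

```python
def split_by_words(text: str, split_length: int, split_overlap: int) -> list[str]:
    """Split text into chunks of at most `split_length` words, where each chunk
    repeats the last `split_overlap` words of the previous chunk."""
    if split_overlap >= split_length:
        raise ValueError("split_overlap must be smaller than split_length")
    words = text.split()
    step = split_length - split_overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + split_length]))
        if start + split_length >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Toy example: 10 "words", windows of 4 words with an overlap of 1 word
sample = " ".join(f"w{i}" for i in range(10))
for chunk in split_by_words(sample, split_length=4, split_overlap=1):
    print(chunk)
```

With the notebook's values (200 words per chunk, 15 words of overlap), a sentence cut at a chunk boundary still appears in part at the start of the next chunk, which helps retrieval find it.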


Run the ingestion pipeline:

In [None]:
# Run the ingestion pipeline on the URLs defined above
result = ingestion_pipeline.run({"fetcher": {"urls": PUBLIC_URLS}})

print("Ingestion result:", result)
print("Documents in store:", document_store.count_documents())

## 4. Creating the RAG Retrieval Pipeline


We will now create the second part of the RAG system: a retrieval pipeline that runs every time a user submits a query. This example uses only the dense (embedding) retriever.
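
As a rough picture of what dense retrieval does, the sketch below ranks documents by the cosine similarity between a query vector and each document vector. The three-dimensional vectors and the helper names are made up for illustration. Real embeddings come from the embedding model, and the similarity metric OpenSearch actually applies depends on how the index is configured.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_by_similarity(
    query_vec: list[float], doc_vecs: dict[str, list[float]], top_k: int
) -> list[tuple[str, float]]:
    """Return the `top_k` (doc_id, score) pairs, most similar first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings, standing in for the model's output
toy_docs = {
    "bioluminescence": [0.9, 0.1, 0.0],
    "yellow_warbler": [0.1, 0.9, 0.1],
    "nlp": [0.0, 0.2, 0.9],
}
toy_query = [0.8, 0.2, 0.1]
print(top_k_by_similarity(toy_query, toy_docs, top_k=2))
```

The `top_k` argument plays the same role as the `TOP_K` setting passed to the retriever: it caps how many chunks are handed to the prompt.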

In [None]:
query = "What is bioluminescence?"

prompt_template = """
Given the following information, answer the question.

Context:
{% for document in documents %}
    {{ document.content }}
{% endfor %}

Question: {{ query }}
"""

# See the prompt builder component here to learn more about prompt templates: https://docs.haystack.deepset.ai/docs/promptbuilder

In [None]:

from haystack import Pipeline
from haystack_integrations.document_stores.opensearch import OpenSearchDocumentStore
from haystack_integrations.components.retrievers.opensearch import (
    OpenSearchEmbeddingRetriever,
)
# Import OllamaTextEmbedder to embed the user query
from haystack_integrations.components.embedders.ollama import OllamaTextEmbedder
# Import the OllamaGenerator to generate the response to the user query with the retrieved documents as context
from haystack_integrations.components.generators.ollama import OllamaGenerator
# Import PromptBuilder to construct the new prompt with context for the generator
from haystack.components.builders import PromptBuilder

# Re-initialise the OpenSearch document store as was done previously
document_store = OpenSearchDocumentStore(
    hosts="https://localhost:9200",  # https, matching the curl check in section 2
    index="public_texts",
    use_ssl=True,
    verify_certs=False,
    http_auth=("admin", "OSPassword246"),
)

# Model names, Ollama endpoint and number of documents to retrieve
EMBED_MODEL = "nomic-embed-text"
GENERATION_MODEL = "llama3.2"
OLLAMA_ENDPOINT = "http://localhost:11434"
TOP_K = 5

query_embedder = OllamaTextEmbedder(
    model=EMBED_MODEL,
    url=OLLAMA_ENDPOINT,
)

# Initialise the Ollama text generator component
response_generator = OllamaGenerator(
    model=GENERATION_MODEL,
    url=OLLAMA_ENDPOINT,
)

# Initialise the embedding retriever 
embedding_retriever = OpenSearchEmbeddingRetriever(
    document_store=document_store,
    top_k=TOP_K,
)


#Initialise the prompt builder
prompt_builder = PromptBuilder(
    template=prompt_template,
    required_variables=["query", "documents"],
)

In [None]:
# Create the RAG Pipeline and connect the components for the dense retrieval approach (query embedding + OpenSearch embedding retriever)
retrieval_pipeline = Pipeline()
retrieval_pipeline.add_component("query_embedder", query_embedder)
retrieval_pipeline.add_component("retriever", embedding_retriever)
retrieval_pipeline.add_component("prompt_builder", prompt_builder)
retrieval_pipeline.add_component("response_generator", response_generator)
retrieval_pipeline.connect("query_embedder.embedding", "retriever.query_embedding")
retrieval_pipeline.connect("retriever", "prompt_builder")
retrieval_pipeline.connect("prompt_builder", "response_generator")

# Print out the list of required inputs for the retrieval pipeline
retrieval_pipeline.inputs()

In [None]:
# Run the retrieval pipeline with the user query as input.
# The query has to go to both the query embedder and the prompt builder.
result = retrieval_pipeline.run(
    {"query_embedder": {"text": query}, "prompt_builder": {"query": query}},
    include_outputs_from={"retriever"},  # set of component names whose outputs are also returned
)

print(result)

In [None]:
# Pretty print just the response and retrieved documents
print("Generated response:", result["response_generator"]["replies"][0])

print("Retrieved documents:")
for doc in result["retriever"]["documents"]:
    print(doc)

## 5. Connecting a simple front-end

We will now export all of the pipeline code into `retriever.py`, then run the Streamlit app in `app.py` to send queries through our pipeline.
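
The contents of `retriever.py` and `app.py` are not shown here, so the following is only a sketch of one way to expose the pipeline to a front-end. The function name `answer_query` and the shape of its return value are assumptions. The component names (`query_embedder`, `prompt_builder`, `retriever`, `response_generator`) are the ones used in section 4.

```python
from typing import Any


def answer_query(pipeline: Any, query: str) -> dict:
    """Run the retrieval pipeline for one query and return the reply plus its sources.

    `pipeline` is expected to be the retrieval pipeline built in section 4.
    """
    result = pipeline.run(
        # The query is needed by both the embedder and the prompt builder
        {"query_embedder": {"text": query}, "prompt_builder": {"query": query}},
        include_outputs_from={"retriever"},
    )
    return {
        "answer": result["response_generator"]["replies"][0],
        "sources": [doc.content for doc in result["retriever"]["documents"]],
    }
```

In `app.py`, the value of a Streamlit `st.text_input` could be passed to a helper like this, with the answer and sources displayed via `st.write`.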

## Bonus/Take Home Task

See if you can change the retrieval from dense embedding retrieval to sparse BM25 or even hybrid retrieval.
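
For background on the hybrid option: hybrid retrieval has to merge two ranked lists, one from BM25 and one from the embedding retriever. One common way to do this is reciprocal rank fusion (RRF), sketched below in plain Python. This is a conceptual illustration and not the solution to the task. The document IDs are made up, and `k = 60` is a conventional constant chosen here as an assumption.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one list, best first.

    Each document scores 1 / (k + rank) for every list it appears in,
    with ranks starting at 1.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical rankings from a sparse and a dense retriever
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
dense_ranking = ["doc_b", "doc_d", "doc_a"]
for doc_id, score in reciprocal_rank_fusion([bm25_ranking, dense_ranking]):
    print(doc_id, round(score, 4))
```

Documents that appear near the top of both lists end up ahead of documents that only one retriever found.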

HINT: The BM25 retriever does not need a query embedding, and hybrid retrieval works slightly differently because it is its own pipeline. See the docs for both below: