# RAG Workflow Demonstration

After running the [Website Chatbot](https://docs.aperturedata.io/workflows/crawl_to_rag) workflow, you can use this notebook to test it by sending natural-language questions and inspecting the answers.

## Import some modules we will need

In [2]:
import requests
import json
import IPython.display as display
from ipywidgets import Textarea, Button, VBox, Checkbox, Output
import os
import getpass
import time

## Work out how to contact the RAG workflow

This assumes that you are running this notebook as a workflow on the same instance.
You will be asked to enter the token that you set for your RAG endpoint.

In [None]:
API_URL = 'https://<DB_HOST>/rag'

print(f"Using API URL: {API_URL}")

# Prompt for the API key without echoing it to the screen
API_KEY = getpass.getpass('Enter your API key: ')
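If you rerun the notebook often, typing the token each time gets tedious. The following is a minimal sketch of one alternative: look for the token in an environment variable first and only prompt when it is missing. The function name `resolve_api_key` and the variable name `RAG_API_KEY` are assumptions made for illustration; they are not defined by the workflow.

```python
import getpass
import os


def resolve_api_key(env_var="RAG_API_KEY", prompt=getpass.getpass):
    """Return the token from the environment if set, otherwise prompt for it.

    `env_var` is a hypothetical variable name; use whatever your setup exports.
    """
    key = os.environ.get(env_var, "").strip()
    if key:
        return key
    # Fall back to an interactive prompt that does not echo the input.
    return prompt("Enter your API key: ")
```

With this helper in place, the assignment above could become `API_KEY = resolve_api_key()`.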

## Ensure the server is ready

It may take a few minutes for the workflow to crawl the website, extract and segment the text, generate embeddings, and be ready to respond to queries.
This step checks to see if the RAG service is ready and prints out the configuration.

In [63]:
def wait_until_ready(api_base: str, api_key: str, delay: float = 5.0, max_tries: int = 60):
    print("Waiting for server to become ready...")
    tries = 0
    headers = {
        'Authorization': f'Bearer {api_key}',
        'Content-Type': 'application/json'
    }
    while tries < max_tries:
        try:
            url = f"{api_base}/config"
            print(f"Attempt {tries+1}: Checking server readiness at {url}")
            r = requests.get(url, timeout=2, headers=headers)
            if r.status_code == 200:
                config = r.json()
                if config.get("ready", True):  # a missing 'ready' key is treated as ready
                    print("Server is ready.")
                    return config  # returned so the caller can print or inspect it
                else:
                    print(f"Server reports not ready (try {tries+1})")
            else:
                print(f"Received status code {r.status_code}")
        except (requests.RequestException, ValueError) as e:  # network errors or a non-JSON body
            print(f"Error contacting server (try {tries+1}): {e}")
        tries += 1
        time.sleep(delay)

    raise RuntimeError("Server did not become ready in time.")


config = wait_until_ready(API_URL, API_KEY)
print("Configuration:", json.dumps(config, indent=2))

Waiting for server to become ready...
Attempt 1: Checking server readiness at http://localhost:8000/rag/config
Server is ready.
Configuration: {
  "llm_provider": "cohere",
  "llm_model": "command-r-plus",
  "embedding_model": "openclip ViT-B-32 laion2b_s34b_b79k",
  "input": "crawl-to-rag7",
  "n_documents": 4,
  "host": "workflow-test-i8ewnkjd.farm0000.cloud.aperturedata.io",
  "count": 766,
  "ready": true
}
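The readiness rule used by `wait_until_ready` (a configuration without a `ready` key counts as ready) can be pulled out into a small helper and paired with a one-line summary of the returned configuration. This is only a sketch: `is_ready` and `summarize_config` are hypothetical names, and the field names are taken from the sample output above without assuming what each value means.

```python
def is_ready(config: dict) -> bool:
    """Mirror the notebook's rule: a missing 'ready' key counts as ready."""
    return config.get("ready", True) is True


def summarize_config(config: dict) -> str:
    """Build a one-line summary from fields seen in the sample output."""
    return (
        f"llm={config.get('llm_provider', '?')}/{config.get('llm_model', '?')} "
        f"n_documents={config.get('n_documents', '?')} "
        f"count={config.get('count', '?')} "
        f"ready={is_ready(config)}"
    )
```

Calling `print(summarize_config(config))` after the readiness check gives a compact line that is easier to scan than the full JSON dump.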


## Now ask questions

This demonstration uses the non-streaming API, so nothing appears until the LLM has finished generating its full answer.
Expect a delay of a few moments between clicking the button and seeing the result.

In [None]:
summary_history = ""

query_input = Textarea(placeholder="Enter your question here...", layout={'width': '100%', 'height': '100px'})
run_button = Button(description="Ask", button_style='primary')
output_area = Output()

def render_markdown_result(query, answer, rewritten_query, documents, old_history, new_history):
    doc_md = ""
    for i, doc in enumerate(documents):
        url = doc.get("url", f"Document {i+1}")
        text = doc.get("page_content", "")[:1000]
        doc_md += f"<details><summary><b>{url}</b></summary>\n\n```\n{text}\n```\n</details>\n\n"
    
    return f"""
## User Query
{query}

## Previous Summary
{old_history}

## Rewritten Query

{rewritten_query}

## Answer
{answer}

## Source Documents
{doc_md}

## Updated Summary
{new_history}
"""


def on_button_click(_):
    global summary_history
    output_area.clear_output()
    query = query_input.value.strip()

    if not query:
        with output_area:
            print("Please enter a query.")
        return

    payload = {
        "query": query,
        "history": summary_history
    }

    headers = {
        'Authorization': f'Bearer {API_KEY}',
        'Content-Type': 'application/json'
    }


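    try:
        # The call blocks until the full answer is generated; the 120 s timeout is an arbitrary upper bound.
        response = requests.post(f"{API_URL}/ask", json=payload, headers=headers, timeout=120)
    except requests.RequestException as e:
        with output_area:
            print(f"Request failed: {e}")
        return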

    if response.status_code != 200:
        with output_area:
            print(f"Error: {response.status_code}\n{response.text}")
        return

    data = response.json()
    old_history = summary_history
    summary_history = data.get("history", summary_history)

    markdown_output = render_markdown_result(
        query=query,
        old_history=old_history or "None",
        answer=data.get("answer", "—"),
        rewritten_query=data.get("rewritten_query", "—"),
        documents=data.get("documents", []),
        new_history=summary_history
    )

    with output_area:
        display.display(display.Markdown(markdown_output))


run_button.on_click(on_button_click)
ui = VBox([query_input, run_button, output_area])
display.display(ui)

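The widget is convenient for interactive use, but the same exchange can be scripted. The sketch below posts to the `/ask` endpoint with the `query` and `history` fields used in the cell above and feeds the `history` returned by each call into the next one. The names `ask` and `converse`, the `session` parameter, and the 120-second timeout are assumptions made for illustration rather than part of the workflow's API.

```python
def ask(query, history="", *, api_url, api_key, session=None, timeout=120):
    """POST one question to /ask and return (answer, new_history, documents)."""
    if session is None:
        import requests  # imported lazily so a stand-in session can be passed instead
        session = requests
    response = session.post(
        f"{api_url}/ask",
        json={"query": query, "history": history},
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        timeout=timeout,
    )
    response.raise_for_status()
    data = response.json()
    # Keep the previous history if the server does not return a new one.
    return data.get("answer", ""), data.get("history", history), data.get("documents", [])


def converse(questions, **kwargs):
    """Ask questions in order, carrying the returned history into each next call."""
    history = ""
    answers = []
    for question in questions:
        answer, history, _documents = ask(question, history, **kwargs)
        answers.append(answer)
    return answers, history
```

For example, `converse(["What is this site about?", "Tell me more."], api_url=API_URL, api_key=API_KEY)` would send two questions in sequence, with the second one carrying the summary produced by the first.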