# From RAGs to riches: Build an AI document interrogation app in 30 mins

Dharhas Pothina | PyData NYC 2023

---

## Hi! I'm Dharhas from Quansight👋

<!-- ![caption](images/viking_boat_desk.jpg) -->


<img src="images/quansight.png"/>

## Today's goal: Build a Retrieval Augmented Generation (RAG) based Document Query AI Assistant

<img src="images/ragna-web-ui.gif"/>

## We are not building ☝️ but you can try it out.

## visit: https://ragna.quansight.dev

## username: `enter a name or email`

## password: `tryragna`

## This is a fully-featured UI built with Panel and Ragna

This uses **Ragna's REST API**, which is better suited to building production applications, together with **Panel's ChatInterface** widget.

**Note: This is a demo website that will be taken down after this talk.**

## Retrieval-Augmented Generation (RAG): Make LLMs more useful

### LLMs are trained on vast but static datasets.

<img src="images/chatgpt-what-is-ragna-framework.png"/>

### Google indexes the web and caught up pretty quickly.

<img src="images/google-what-is-ragna-framework.png"/>

### RAG is a method to augment foundational LLMs with fresh data, reduce hallucinations, and get around the limited space available in an LLM prompt (around 3,000 words for ChatGPT 3.5)

<img src="images/RAG.png"/>
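To make the prompt-space constraint concrete, here is a minimal sketch of the retrieval idea in plain Python. It is not how Ragna implements RAG: `chunk_words`, `score`, and `build_prompt` are hypothetical helpers, relevance is a toy word-overlap count rather than an embedding search, and the 3,000-word budget is simply the figure quoted above.

```python
import re


def chunk_words(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into overlapping chunks of at most chunk_size words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def score(chunk: str, question: str) -> int:
    """Toy relevance: number of distinct question words that occur in the chunk."""
    chunk_tokens = set(re.findall(r"\w+", chunk.lower()))
    question_tokens = set(re.findall(r"\w+", question.lower()))
    return len(chunk_tokens & question_tokens)


def build_prompt(question: str, chunks: list[str], word_budget: int = 3000) -> str:
    """Add the most relevant chunks to the prompt until the word budget is spent."""
    ranked = sorted(chunks, key=lambda chunk: score(chunk, question), reverse=True)
    context = []
    used = len(question.split())
    for chunk in ranked:
        size = len(chunk.split())
        if used + size > word_budget:
            break
        context.append(chunk)
        used += size
    return "Context:\n" + "\n---\n".join(context) + f"\n\nQuestion: {question}"


print(build_prompt(
    "What is Ragna?",
    ["Ford sells trucks", "Ragna is a RAG framework"],
    word_budget=10,
))
```

With the small budget used here, only the chunk that mentions Ragna fits into the prompt; the unrelated chunk is left out.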

## Let's get started!

### We'll be using:

<img src="images/ragna-logo.png" width=55%/>

RAG orchestration framework designed to scale from research to production.

<br>

<img src="images/panel-logo.png" width=50%/>

Powerful interactive dashboard and application development framework.

### 1. Provide relevant data

10-K reports from Ford and GM, as well as a file describing Ragna:

In [None]:
documents = [
    "files/what-is-ragna.txt",
    "files/ford-10k-2022.pdf",
    "files/gm-10k-2022.pdf",
]

# Show the short text file that describes Ragna
with open(documents[0], "r") as f:
    print(f.read())

### 2. Preliminary setup and configuration

```bash
export OPENAI_API_KEY=XXX # Export relevant API keys

ragna init # Create ragna.toml config-file using CLI wizard
``` 

Using the configuration file, you can set the assistants, source storages, API endpoints, etc.
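Before loading the configuration, it can help to confirm that the exported keys are actually visible to Python. The sketch below is an assumption-laden convenience, not part of Ragna: `missing_api_keys` is a hypothetical helper, and `OPENAI_API_KEY` is the only variable name taken from this talk.

```python
import os
from collections.abc import Mapping


def missing_api_keys(required: list[str], env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required variables that are unset or blank in env."""
    return [name for name in required if not env.get(name, "").strip()]


# OPENAI_API_KEY is the variable exported above; extend the list for other providers.
missing = missing_api_keys(["OPENAI_API_KEY"])
if missing:
    print(f"Set these environment variables before starting a chat: {', '.join(missing)}")
```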

Create configuration using the file:

In [None]:
from dotenv import load_dotenv
from ragna import Config
import warnings

warnings.filterwarnings('ignore')
load_dotenv()
config = Config.from_file('ragna.toml')
config

### 3. Select assistants & source storage:

- LLMs
    - OpenAI GPT 3.5 Turbo 16k (API)
    - OpenAI GPT 4 (API)
    - Airoboros L2 7B 2.2 GPTQ (Local LLM)
- Vector Databases
    - Chroma
    - LanceDB
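A source storage such as Chroma or LanceDB keeps a vector for every document chunk and returns the chunks closest to the query. The sketch below is an in-memory stand-in that shows the idea only: `ToyVectorStore` is hypothetical and is not the API of either database, and word counts stand in for learned embeddings.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Stand-in embedding: lowercase word counts (real stores use learned vectors)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store: keeps (chunk, vector) pairs and ranks them per query."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def store(self, chunks: list[str]) -> None:
        self._items.extend((chunk, embed(chunk)) for chunk in chunks)

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        query_vector = embed(query)
        ranked = sorted(
            self._items,
            key=lambda item: cosine(query_vector, item[1]),
            reverse=True,
        )
        return [chunk for chunk, _ in ranked[:k]]


store = ToyVectorStore()
store.store([
    "Ragna is a RAG orchestration framework.",
    "A 10-K is an annual report filed by public companies.",
    "Panel builds interactive dashboards.",
])
print(store.retrieve("What is Ragna?", k=1))
```

Swapping the storage backend changes how vectors are indexed and searched, not this store-then-retrieve contract, which is why the two databases can be compared side by side later.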

In [None]:
from ragna.assistants import Gpt4, Gpt35Turbo16k 
from local_llm import Airoboros
from ragna.source_storages import Chroma, LanceDB

from ragna.core import Rag

# also import our chat interface builder tools for later
import panel as pn
pn.extension()

In [None]:
rag = Rag(config)

### 4. Start a chat

In [None]:
# Note: Embedding documents takes a few minutes

chat_gpt = rag.chat(
    documents=documents[:1],
    source_storage=Chroma,
    assistant=Gpt35Turbo16k,
)

chat_local = rag.chat(
    documents=documents[:1],
    source_storage=LanceDB,
    assistant=Airoboros,
)

await chat_gpt.prepare() # Ragna is async by design
await chat_local.prepare()
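Top-level `await` works in Jupyter and IPython. In a plain Python script, the same calls have to live inside a coroutine that is driven by `asyncio.run`. The sketch below shows that shape with `FakeChat`, a hypothetical stand-in that is not Ragna's class, so it runs without Ragna or API keys.

```python
import asyncio


class FakeChat:
    """Hypothetical stand-in for a chat object; not Ragna's class."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.prepared = False

    async def prepare(self) -> None:
        await asyncio.sleep(0.01)  # pretend to embed and store the documents
        self.prepared = True

    async def answer(self, prompt: str) -> str:
        if not self.prepared:  # rule of this stub only
            raise RuntimeError("call prepare() before answer()")
        return f"[{self.name}] echo: {prompt}"


async def main() -> list[str]:
    chats = [FakeChat("gpt"), FakeChat("local")]
    # Prepare both chats concurrently instead of one after the other.
    await asyncio.gather(*(chat.prepare() for chat in chats))
    return [await chat.answer("What is Ragna?") for chat in chats]


if __name__ == "__main__":
    print(asyncio.run(main()))
```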

### 5. Ask your questions!

In [None]:
answer = await chat_gpt.answer("What is Ragna?")
print(f"\nRagna GPT 3.5 Response: \n\n{answer.content}")

In [None]:
answer = await chat_local.answer("What is Ragna?")
print(f"\nRagna Airoboros Response: \n\n{answer.content}")

## Let's look at the sources used:

In [None]:
print(answer.sources[0])

## Let's make this into an actual chat experience with Panel

We will use Panel's ChatInterface widget

https://panel.holoviz.org/reference/chat/ChatInterface.html

### We need to define a callback function to receive a query and return a response

In [None]:
async def callback(contents: str, user: str, instance: pn.chat.ChatInterface):
    # Only respond to messages from the human user, so the replies sent below
    # do not trigger further (and costly) queries to the assistants.
    if user != 'dharhas':
        return
    answer_gpt = await chat_gpt.answer(contents)
    answer_local = await chat_local.answer(contents)
    instance.send({'user': 'openai gpt 3.5', 'object': answer_gpt.content})
    instance.send({'user': 'airoboros L2 7B', 'object': answer_local.content})
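A callback like this can be exercised without a browser by passing in stand-ins. In the sketch below, `StubChat`, `Recorder`, and `make_callback` are hypothetical; they rely only on what the callback above uses, namely an `answer()` coroutine that returns an object with `.content` and an `instance.send()` method. As a variation, the sketch queries the chats concurrently with `asyncio.gather`.

```python
import asyncio
from types import SimpleNamespace


class StubChat:
    """Hypothetical chat: answer() returns an object with a .content attribute."""

    def __init__(self, label: str) -> None:
        self.label = label

    async def answer(self, prompt: str) -> SimpleNamespace:
        await asyncio.sleep(0.01)  # simulate model latency
        return SimpleNamespace(content=f"{self.label}: {prompt}")


class Recorder:
    """Hypothetical stand-in for the chat widget; it only records what is sent."""

    def __init__(self) -> None:
        self.sent: list[dict] = []

    def send(self, message: dict) -> None:
        self.sent.append(message)


def make_callback(chats: dict[str, StubChat]):
    """Build a callback that queries every chat concurrently and sends each reply."""

    async def callback(contents: str, user: str, instance: Recorder) -> None:
        answers = await asyncio.gather(*(chat.answer(contents) for chat in chats.values()))
        for name, answer in zip(chats, answers):
            instance.send({"user": name, "object": answer.content})

    return callback


recorder = Recorder()
stub_callback = make_callback({"assistant A": StubChat("A"), "assistant B": StubChat("B")})
asyncio.run(stub_callback("What is Ragna?", "dharhas", recorder))
print(recorder.sent)
```

Because `asyncio.gather` returns results in the order the coroutines were passed, each reply is paired with the right assistant name even though the calls overlap in time.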

### Let's set up the chat widget

In [None]:
chat_interface = pn.chat.ChatInterface(
    callback = callback, 
    callback_user = "Ragna",
    user = "dharhas",
    avatar = "images/dharhas_avatar.png",
    show_clear = False,
    show_undo = False,
)

### Start up the chat widget

In [None]:
chat_interface.send(
    "Send a message to get a reply from Ragna!", 
    user="Ragna", 
    avatar = "images/ragna-avatar.png", 
    respond=False
)

chat_interface.servable()

Some more prompts :)

### Aside: We can quickly compare assistants & source storages

In [None]:
import asyncio
import itertools
from pprint import pprint

from ragna.assistants import Gpt4, Gpt35Turbo16k
from local_llm import Airoboros
from ragna.source_storages import Chroma, LanceDB

source_storages = [Chroma, LanceDB]
# List each assistant once: a duplicate would create a coroutine that is never awaited
assistants = [Airoboros, Gpt35Turbo16k, Gpt4]

prompt = "How much did GM and Ford earn"

async def answer_prompt(source_storage, assistant):
    async with rag.chat(
        documents=documents,
        source_storage=source_storage,
        assistant=assistant,
    ) as chat:
        message = await chat.answer(prompt)
        return message.content

experiments = {
    (source_storage.display_name(), assistant.display_name()): answer_prompt(
        source_storage, assistant
    )
    for source_storage, assistant in itertools.product(source_storages, assistants)
}

pprint(experiments)

In [None]:
results = dict(zip(experiments.keys(), await asyncio.gather(*experiments.values())))
pprint(results)
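The dictionary comprehension above only creates coroutine objects, which is why the first `pprint` shows coroutines rather than answers; nothing runs until `asyncio.gather` awaits them. The sketch below isolates that pattern with `fake_answer`, a hypothetical stand-in for `answer_prompt` that needs no LLM, and plain strings in place of the assistant and source storage classes.

```python
import asyncio
import itertools


async def fake_answer(storage: str, assistant: str) -> str:
    """Hypothetical stand-in for answer_prompt; returns a label instead of calling an LLM."""
    await asyncio.sleep(0.01)
    return f"answer from {assistant} using {storage}"


async def run_experiments(
    storages: list[str], assistants: list[str]
) -> dict[tuple[str, str], str]:
    # Creating the coroutines does not start them.
    pending = {
        (storage, assistant): fake_answer(storage, assistant)
        for storage, assistant in itertools.product(storages, assistants)
    }
    # gather runs them concurrently and returns results in the order given,
    # so zipping with the keys pairs each answer with its combination.
    answers = await asyncio.gather(*pending.values())
    return dict(zip(pending.keys(), answers))


results = asyncio.run(run_experiments(["Chroma", "LanceDB"], ["GPT-3.5", "GPT-4"]))
for (storage, assistant), answer in results.items():
    print(f"{storage:8} | {assistant:8} | {answer}")
```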

## Thank you! Questions?

### Learn more: [**ragna.chat**](https://ragna.chat/)

Please share your thoughts and feedback!

Contact me: dharhas@quansight.com

<img src="images/viking_boat_desk.jpg"/>