<img src="images/ragna-logo.png" width="200px" align="right"/>

# RAG and LLM Experiments

<hr>

## Explore the Web UI

Ragna ships with a Panel-based chat application, sometimes also referred to as the web UI. You can use this directly, or as an example to build your own applications.

Before you can run the Ragna UI, you need to create a `ragna.toml` config file. You can do this with an interactive wizard by running `ragna init` in a terminal.

In a cloud environment like Nebari, a few extra configuration items are needed. For this tutorial, you can create the config file by running the next cell.

This writes a `ragna.toml` file to your home directory.

In [None]:
# Create a Ragna config file for Nebari and place it at ~/ragna.toml

import os
with open(f"{os.environ['HOME']}/ragna.toml", "w") as f:
    f.write(f"""
# The username in the URLs below is filled in automatically
# from the JUPYTERHUB_USER environment variable

local_root = "./.cache/ragna"
authentication = "ragna.deploy.InJupyterHubAuthentication"
document = "ragna.core.LocalDocument"
source_storages = [
    "ragna.source_storages.RagnaDemoSourceStorage",
    "ragna.source_storages.Chroma",
    "ragna.source_storages.LanceDB",
]
assistants = [
    "ragna.assistants.RagnaDemoAssistant",
    "ragna.assistants.Gpt35Turbo16k",
    "ragna.assistants.Gpt4",
    "local_llm.Llama38BInstruct",
]

[api]
hostname = "127.0.0.1"
port = 31476
root_path = "/user/{os.environ['JUPYTERHUB_USER']}/proxy/31476/"
url = "https://pycon-tutorial.quansight.dev/user/{os.environ['JUPYTERHUB_USER']}/proxy/31476/"
database_url = "sqlite:///./.cache/ragna/ragna.db"
origins = [
    "https://pycon-tutorial.quansight.dev",
]

[ui]
hostname = "127.0.0.1"
port = 31477
origins = [
    "https://pycon-tutorial.quansight.dev",
]
""")

### Run the Ragna UI from Nebari

Open a terminal window and run the following commands.

1. Activate the conda environment
   
```bash
conda activate pycon-ragna
```

2. Start the UI (this may take a few minutes to launch)

```bash
dotenv --file ~/shared/pycon/.env run -- python -m ragna ui --config ~/ragna.toml
```

3. Run the cell below to generate the link to visit the Web UI

In [None]:
# Web UI Link
import os 
print(f"https://pycon-tutorial.quansight.dev/user/{os.environ['JUPYTERHUB_USER']}/proxy/31477/")
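
Both the API and the UI are reached through the same `/user/<username>/proxy/<port>/` pattern, so the two links differ only in the port. As a small sketch, the hypothetical helper `proxy_url` below builds either link; the fallback username is a placeholder for when `JUPYTERHUB_USER` is not set.

```python
import os

BASE_URL = "https://pycon-tutorial.quansight.dev"  # host used in this tutorial


def proxy_url(user, port, base_url=BASE_URL):
    """Build the proxied URL for a service listening on `port`."""
    return f"{base_url}/user/{user}/proxy/{port}/"


# Fall back to a placeholder when not running under JupyterHub.
user = os.environ.get("JUPYTERHUB_USER", "<your-username>")
print("API:", proxy_url(user, 31476))
print("UI: ", proxy_url(user, 31477))
```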

### Side note: Local setup instructions 💻

On your personal computer, you can run `ragna ui` directly to start the UI, then go to `http://localhost:31477` to use it.
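
Whether on Nebari or locally, the UI can take a while to come up. As a rough check, and only a sketch, the hypothetical helper `is_port_open` below tests whether anything is accepting TCP connections on the UI port. It assumes you run it on the same machine as the UI, and it does not confirm that the listener is actually Ragna.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


# 31477 is the UI port configured in ragna.toml above.
print(is_port_open("127.0.0.1", 31477))
```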

## Compare LLMs

Orchestration tools like Ragna can be useful for comparing and experimenting with LLMs quickly. 

In the following cells, let's see how our local LLM, Llama3-8B, compares to OpenAI's GPT-3.5 and GPT-4.

In [None]:
import asyncio
import itertools
from pathlib import Path
from pprint import pprint

from dotenv import load_dotenv

from ragna import Rag
from ragna.assistants import Gpt4, Gpt35Turbo16k
from ragna.source_storages import Chroma

from local_llm import Llama38BInstruct

In [None]:
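# Load environment variables from the shared .env file and fail early if none were loaded
dotenv_path = Path.home() / "shared/pycon/.env"
assert load_dotenv(dotenv_path=dotenv_path), f"Could not load {dotenv_path}"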
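
`load_dotenv` only reports whether it set at least one variable, not which ones. If you want to confirm that a specific variable is present, a sketch like the following can help. `missing_env_vars` is a hypothetical helper, and `OPENAI_API_KEY` is an assumed variable name for the OpenAI assistants; adjust it to whatever your `.env` file defines.

```python
import os


def missing_env_vars(names):
    """Return the names that are unset or empty in the current environment."""
    return [name for name in names if not os.environ.get(name)]


# OPENAI_API_KEY is an assumed variable name; adjust to what your .env defines.
print(missing_env_vars(["OPENAI_API_KEY"]))
```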

Let's ask about the PSF's annual reports again.

In [None]:
documents = [
    "files/psf-report-2021.pdf",
    "files/psf-report-2022.pdf",
    "files/psf-report-2023.pdf",
]

source_storages = [Chroma]
assistants = [Gpt35Turbo16k, Gpt4, Llama38BInstruct]

prompt = "What was PSF's net income in 2021, 2022, and 2023?"
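
The document paths are relative to the notebook's working directory, so a missing file only surfaces later when the chat is prepared. A quick check up front, sketched here with the hypothetical helper `missing_files`, lists any path that does not resolve to a file.

```python
from pathlib import Path


def missing_files(paths):
    """Return the paths that do not point to an existing file."""
    return [str(p) for p in paths if not Path(p).is_file()]


documents = [
    "files/psf-report-2021.pdf",
    "files/psf-report-2022.pdf",
    "files/psf-report-2023.pdf",
]

# An empty list means every document was found.
print(missing_files(documents))
```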

In [None]:
rag = Rag()

async def answer_prompt(source_storage, assistant):
    async with rag.chat(
        documents=documents,
        source_storage=source_storage,
        assistant=assistant,
    ) as chat:
        message = await chat.answer(prompt)
        return message.content

In [None]:
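# Build one experiment per (source storage, assistant) pair.
# Each value is a coroutine object that has not run yet, so the
# output below shows coroutine objects rather than answers.
experiments = {
    (source_storage.display_name(), assistant.display_name()): answer_prompt(
        source_storage, assistant
    )
    for source_storage, assistant in itertools.product(source_storages, assistants)
}

pprint(experiments)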
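
Calling an `async` function does not run it; it returns a coroutine object that only executes once it is awaited, for example through `asyncio.gather`. The following sketch shows the same pattern without Ragna. `fake_answer` and `run_experiments` are hypothetical stand-ins, and the names and answers are placeholders.

```python
import asyncio
import itertools


async def fake_answer(source_storage, assistant):
    """Stand-in for a real chat; sleeps briefly instead of calling an LLM."""
    await asyncio.sleep(0.01)
    return f"{assistant} via {source_storage}"


async def run_experiments(source_storages, assistants):
    # Calling fake_answer() only creates coroutine objects; nothing runs yet.
    experiments = {
        (s, a): fake_answer(s, a)
        for s, a in itertools.product(source_storages, assistants)
    }
    # gather() runs the coroutines concurrently and returns the results
    # in the same order as its arguments, so zip() pairs them correctly.
    answers = await asyncio.gather(*experiments.values())
    return dict(zip(experiments.keys(), answers))


results = asyncio.run(run_experiments(["Store"], ["A", "B"]))
print(results)
```

In a notebook, an event loop is already running, so use `await run_experiments(...)` instead of `asyncio.run(...)`.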

In [None]:
# Run all experiments concurrently and pair each answer with its key
results = dict(zip(experiments.keys(), await asyncio.gather(*experiments.values())))
pprint(results)
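
Long answers are hard to compare in `pprint` output. As one possible way to lay them out side by side, the hypothetical helper `format_results` below prints each answer under a header naming the assistant and source storage. The sample data is a placeholder for illustration only, not real model output.

```python
def format_results(results):
    """Render {(source_storage, assistant): answer} as readable text blocks."""
    blocks = []
    for (source_storage, assistant), answer in results.items():
        header = f"{assistant} (source storage: {source_storage})"
        blocks.append(f"{header}\n{'-' * len(header)}\n{answer.strip()}")
    return "\n\n".join(blocks)


# Placeholder data for illustration only; not real model output.
sample = {
    ("Chroma", "Assistant A"): "placeholder answer A",
    ("Chroma", "Assistant B"): "placeholder answer B\n",
}
print(format_results(sample))
```

In the notebook, you would call `print(format_results(results))` with the dictionary from the previous cell.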

<hr>

_❗️ **Warning:** Make sure to stop the Jupyter Kernel (in the JupyterLab Menu Bar, click on "Kernel" -> "Shut down Kernel") before proceeding to prevent the "insufficient VRAM" error._

<br>

**✨ Next: [Conclusion](07-conclusion.ipynb) →**

💬 _Wish to continue discussions after the tutorial? Contact the presenters: [@pavithraes](https://github.com/pavithraes), [@dharhas](https://github.com/dharhas), [@ahuang11](https://github.com/ahuang11)_

<hr>