<img src="images/ragna-logo.png" width=20% align="right"/>

# RAG and LLM Experiments

<hr>

## Explore the Web UI

Ragna ships with a Panel-based chat application, sometimes also referred to as the web UI. You can use this directly, or as an example to build your own applications.

To run the Ragna UI from Nebari, open a terminal window and run the following commands.

1. Activate the conda environment
   
```bash
conda activate global-pycon-de
```

2. Start the UI

```bash
PYTHONPATH="${HOME}:${PYTHONPATH}" \
  dotenv --file ~/shared/analyst/.env run -- \
    python -m \
        ragna ui --config ragna.toml
```

3. Go to https://pycon-tutorial.quansight.dev/user/{USER}/proxy/31477/ (replace {USER} with your Nebari username)

### Side note: Local setup instructions 💻

On your personal computer, you can start the UI by running `ragna ui` in a terminal.

## Advanced configuration

`rag.chat()` takes the following keyword arguments to help you optimize the quality of answers:

* `chunk_size` - Size of each chunk (sections of the document that contain context) to use.
* `chunk_overlap` - Size of the overlap between a chunk and its previous and next chunks, which preserves context that spans a chunk boundary.
* `num_tokens` - Maximum number of context tokens, and in turn the number of document chunks, pulled out of the vector database.

You can also set these options in the web UI.
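To build intuition for how these three settings interact, here is a minimal, self-contained sketch. It is not Ragna's implementation: the helpers `split_into_chunks` and `take_within_budget` are hypothetical, whitespace-separated words stand in for real tokens, and chunks are taken in document order rather than ranked by relevance.

```python
def split_into_chunks(tokens, chunk_size, chunk_overlap):
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the document
    return chunks


def take_within_budget(chunks, num_tokens):
    """Keep chunks, in the given order, while the total stays within num_tokens."""
    selected, used = [], 0
    for chunk in chunks:
        if used + len(chunk) > num_tokens:
            break
        selected.append(chunk)
        used += len(chunk)
    return selected


# Hypothetical ten-word "document"; words stand in for tokens.
tokens = "the quick brown fox jumps over the lazy dog today".split()
chunks = split_into_chunks(tokens, chunk_size=4, chunk_overlap=1)
context = take_within_budget(chunks, num_tokens=8)

for chunk in chunks:
    print(" ".join(chunk))
print(f"{len(context)} of {len(chunks)} chunks fit into the context budget")
```

In this toy setup, a larger `chunk_size` gives each chunk more surrounding text, a larger `chunk_overlap` repeats more text across neighbouring chunks, and `num_tokens` caps how many chunks reach the LLM.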

## Compare LLMs

Orchestration tools like Ragna can be useful for comparing and experimenting with LLMs quickly.

In the following cells, let's see how our local LLM, Mistral 7B, compares to OpenAI's GPT-3.5 and GPT-4.

In [None]:
import asyncio
import itertools
from pathlib import Path
from pprint import pprint

from dotenv import load_dotenv
from local_llm import Mistral7BInstruct
from ragna import Rag
from ragna.assistants import Gpt4, Gpt35Turbo16k
from ragna.source_storages import Chroma

In [None]:
# Load environment variables from the shared .env file
dotenv_path = Path.home() / "shared" / "analyst" / ".env"
load_dotenv(dotenv_path=dotenv_path)

Let's ask about the PSF's annual reports again.

In [None]:
documents = [
    "files/psf-report-2022.pdf",
    "files/psf-report-2021.pdf",
]

source_storages = [Chroma]
assistants = [Gpt35Turbo16k, Gpt4, Mistral7BInstruct()]

prompt = "What was PSF's net income in 2021 and 2022?"

In [None]:
rag = Rag()


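async def answer_prompt(source_storage, assistant):
    """Ask `prompt` about `documents` using the given source storage and assistant."""
    # Entering the chat context prepares the chat before the first question
    async with rag.chat(
        documents=documents,
        source_storage=source_storage,
        assistant=assistant,
    ) as chat:
        message = await chat.answer(prompt)
        return message.content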

In [None]:
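# One entry per (source storage, assistant) combination.
# Calling answer_prompt() only creates a coroutine object here;
# nothing runs until the coroutines are awaited in the next cell.
experiments = {
    (source_storage.display_name(), assistant.display_name()): answer_prompt(
        source_storage, assistant
    )
    for source_storage, assistant in itertools.product(source_storages, assistants)
}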

pprint(experiments)
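The output of the cell above shows coroutine objects rather than answers, because calling an `async` function does not run it. The following stand-alone sketch illustrates the pattern with a hypothetical `fake_answer` coroutine that sleeps instead of calling an LLM. It is written as a plain script, so it uses `asyncio.run()`; inside a notebook you would `await main()` instead.

```python
import asyncio


async def fake_answer(name, delay):
    # Hypothetical stand-in for answer_prompt(): sleeps instead of calling an LLM.
    await asyncio.sleep(delay)
    return f"answer from {name}"


async def main():
    # Calling an async function only creates a coroutine object; nothing runs yet.
    pending = {
        "slow": fake_answer("slow", 0.02),
        "fast": fake_answer("fast", 0.01),
    }
    # gather() runs the coroutines concurrently and returns the results in
    # input order, so zipping with the keys pairs each answer with its label.
    answers = await asyncio.gather(*pending.values())
    return dict(zip(pending.keys(), answers))


demo_results = asyncio.run(main())
print(demo_results)
```

Although the "fast" coroutine finishes first, `asyncio.gather()` keeps the results in the order the coroutines were passed in, which is why pairing them back up with the dictionary keys is safe.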

In [None]:
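# Run all experiments concurrently; gather() returns the answers in input order,
# so they can be paired back up with the experiment keys.
results = dict(zip(experiments.keys(), await asyncio.gather(*experiments.values())))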
pprint(results)
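Long answers can be hard to compare in `pprint` output. As an optional sketch, the hypothetical helper `format_results` below renders a mapping with the same shape as `results` as a plain-text report. The sample data is placeholder text, not real model output.

```python
def format_results(results):
    """Render a {(source_storage, assistant): answer} mapping as a text report."""
    sections = []
    for (source_storage, assistant), answer in results.items():
        header = f"{assistant} (source storage: {source_storage})"
        underline = "-" * len(header)
        sections.append(f"{header}\n{underline}\n{answer.strip()}")
    return "\n\n".join(sections)


# Placeholder data with the same shape as `results`; not real model outputs.
sample_results = {
    ("Chroma", "Assistant A"): "  placeholder answer A\n",
    ("Chroma", "Assistant B"): "placeholder answer B",
}
report = format_results(sample_results)
print(report)
```

In the notebook, you could call `print(format_results(results))` after the cell above to read the answers one below the other.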

<hr>

_❗️ **Warning:** Make sure to stop the Jupyter Kernel (in the JupyterLab Menu Bar, click on "Kernel" -> "Interrupt Kernel") and close this notebook before proceeding._

<br>

**✨ Next: [Conclusion](05-conclusion.ipynb) →**

<hr>