<img src="images/ragna-logo.png" width=15% align="right"/>

# Basics of a RAG-powered chat app

<hr>

## What is RAG?

LLMs are trained on vast but static datasets. This means they can answer many general questions, like:

<img src="images/chatgpt-what-is-pyconde.png" width=60%/>

But for recent events or specific topics, they either can't answer or hallucinate an answer:

<img src="images/chatpgt-when-is-pyconde.png" width=60%/>

Retrieval-augmented generation (RAG) is a method to augment foundational LLMs with contextual data (documents). It reduces hallucinations and works around the limited space available in an LLM prompt (around 3,000 words for ChatGPT 3.5).

<img src="images/RAG-new.png" width=70%/>
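To make the retrieve-then-augment idea concrete, here is a toy sketch in plain Python. It is not how Ragna works internally: real source storages typically rank chunks by embedding similarity, while this sketch uses simple word overlap so that it runs without extra dependencies. The `chunks` list, the helper names, and the prompt wording are all made up for illustration.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks that share the most words with the question."""
    question_tokens = tokenize(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: len(question_tokens & tokenize(chunk)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, sources: list[str]) -> str:
    """Put the retrieved sources in front of the question (the 'augmentation')."""
    context = "\n".join(f"- {source}" for source in sources)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Made-up chunks, standing in for text extracted from the documents.
chunks = [
    "The conference venue has a cloakroom near the entrance.",
    "Lightning talks: sign up on the whiteboard next to the registration desk.",
    "Lunch is served every day in the main hall.",
]

question = "How do I sign up for lightning talks?"
sources = retrieve(question, chunks)
prompt = build_prompt(question, sources)
print(prompt)
```

Only the retrieved chunk is sent to the LLM along with the question, which is how RAG keeps the prompt small even when the document collection is large.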

## What is Ragna?

Ragna is an open source library for RAG orchestration with a Python API, REST API, and web UI.

<img src="images/ragna-architecture.png" width=80%/>

## Build a chat function with Ragna

### Step 0: Ragna configuration

Create a `ragna.toml` file and copy in the content of the `shared/analyst/ragna.toml.tpl` file. Then update the following fields with your Nebari username:

* `local_root`
* `api.root_path`
* `database_url`
* `ui.root_path`

Using the configuration file, you can set the assistants, source storages, API endpoints, etc.

To use the default LLMs, such as OpenAI's models, you will need API keys. For this tutorial, we have included a key for you.

In [None]:
from pathlib import Path

from dotenv import load_dotenv

# Load the API keys provided for this tutorial into environment variables
dotenv_path = Path.home() / "shared/analyst/api-keys.env"
load_dotenv(dotenv_path=dotenv_path)
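Before moving on, it can help to confirm that the key actually reached the environment without printing its value. A minimal sketch, assuming the env file defines `OPENAI_API_KEY` (the variable name used in the local setup notes); `missing_keys` is a hypothetical helper, not part of Ragna or python-dotenv:

```python
import os


def missing_keys(names, env=os.environ):
    """Return the variable names that are unset or empty in env."""
    return [name for name in names if not env.get(name)]


# Assumption: the tutorial's env file provides OPENAI_API_KEY.
missing = missing_keys(["OPENAI_API_KEY"])
print("Missing keys:", missing or "none")
```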

#### Aside: Local setup

On local computers, follow these steps to get started with Ragna:

1. Install Ragna with pip or conda.
2. Run `ragna init` to create the `ragna.toml` config file with a guided CLI.
3. Get an OpenAI API key at https://platform.openai.com/api-keys, and set the environment variable with `export OPENAI_API_KEY="XXX"`.

### Step 1: Select relevant documents

Let's use the [FAQ](https://2024.pycon.de/frequently-asked-questions/#live-streams) and [PyLadies](https://2024.pycon.de/blog/pyladies-at-pyconde-pydata/) pages of PyCon DE & PyData Berlin 2024.

Note: There are more documents in the `files/` directory that you can explore.

In [3]:
documents = [
    "files/pycon-de-faqs.pdf",
    "files/pycon-de-pyladies.pdf",
]
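Since the paths are relative to the notebook's working directory, a quick existence check can catch typos before any chat is prepared. A small sketch using only the standard library; `missing_documents` is a hypothetical helper name:

```python
from pathlib import Path


def missing_documents(paths):
    """Return the paths that do not point to an existing file."""
    return [path for path in paths if not Path(path).is_file()]


documents = [
    "files/pycon-de-faqs.pdf",
    "files/pycon-de-pyladies.pdf",
]

missing = missing_documents(documents)
if missing:
    print("Missing documents:", missing)
else:
    print("All documents found.")
```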

### Step 2: Select assistants and source storage

🔗 [Check the available assistants in the docs](https://ragna.chat/en/stable/generated/tutorials/gallery_python_api/#step-3-select-an-assistant)

We import OpenAI's GPT-3.5 and GPT-4 assistants, and the Chroma and LanceDB source storages. The chat in the next step uses GPT-4 with Chroma.

In [4]:
from ragna import Rag
from ragna.assistants import Gpt4, Gpt35Turbo16k
from ragna.source_storages import Chroma, LanceDB

In [5]:
rag = Rag()

### Step 3: Start chat

In [6]:
chat_gpt = rag.chat(
    documents=documents,
    source_storage=Chroma,
    assistant=Gpt4,
)

await chat_gpt.prepare()

Message(content=How can I help you with the documents?, role=MessageRole.SYSTEM, sources=[])
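The `await` calls in this tutorial work at the top level because Jupyter supports top-level `await`. In a plain Python script, the same calls need to live inside a coroutine that is started with `asyncio.run`. A minimal sketch of that pattern, where `fake_answer` is a hypothetical stand-in for `chat_gpt.answer` so the example runs without Ragna or an API key:

```python
import asyncio


async def fake_answer(prompt: str) -> str:
    """Hypothetical stand-in for an awaitable call such as chat_gpt.answer()."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return f"(answer to: {prompt})"


async def main() -> str:
    # In a real script, the chat_gpt.prepare() and chat_gpt.answer()
    # awaits would go here instead of the stand-in.
    return await fake_answer("When is PyCon DE 2024?")


# asyncio.run creates an event loop, runs main() to completion, and closes the loop.
result = asyncio.run(main())
print(result)
```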

### Step 4: Ask questions

In [7]:
answer = await chat_gpt.answer("When is PyCon DE 2024?")
print(f"\nLLM Response: \n\n{answer.content}")


LLM Response: 

The PyCon DE & PyData Berlin 2024 conference begins on April 22, 2024.


Let's look at the sources used:

In [8]:
print(answer.sources[0])

id='4047c99f-246e-40a8-8c8a-b17662679593' document=<ragna.core.LocalDocument object at 0x7cff38275790> location='1, 2' content="\uf0c9\nPyLadies at PyCon DE &\nPyData Berlin 2024\nMAR 12, 2024 BY ORGANIZERS & PYLADIES\nBecause diversity isn't supplementary, it's foundational.\nThe PyLadies segment at this year's PyCon DE & PyData Berlin\nconference is a collaboration with the PySV (Python Software Verband\ne.V.) to foment diversity, inclusion, and belonging within the tech\ncommunity. PyLadies is a global volunteer organization dedicated to\nsupporting underestimated and underrepresented genders in tech,\nincluding women, non-binary, and trans people, encouraging their\nactive participation and leadership within the Python open-source\necosystem. We aim to offer a safe, welcoming environment for all skill\nlevels to learn, share, grow, and become today's and tomorrow's\nleaders.\n\uf077\nWe focus on uplifting women and gender-underrepresented groups.\nFor those who do not identify with
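The raw source printout is hard to scan. A short helper can reduce each source to its location and a one-line preview. This is a sketch that assumes each source exposes `location` and `content` attributes, as the printed output suggests; `summarize_source` is a hypothetical helper, and the `SimpleNamespace` object stands in for a real entry of `answer.sources`:

```python
from types import SimpleNamespace


def summarize_source(source, max_chars=80):
    """Return the source location plus a one-line preview of its content."""
    preview = " ".join(source.content.split())  # collapse newlines and repeated spaces
    if len(preview) > max_chars:
        preview = preview[:max_chars].rstrip() + "..."
    return f"[{source.location}] {preview}"


# Stand-in for an entry of answer.sources, with content shortened from the output above.
example = SimpleNamespace(
    location="1, 2",
    content="PyLadies at PyCon DE &\nPyData Berlin 2024\nMAR 12, 2024 BY ORGANIZERS & PYLADIES",
)
print(summarize_source(example, max_chars=40))
```

With a real answer, the same helper could be applied in a loop over `answer.sources`.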

In [9]:
answer = await chat_gpt.answer("How to participate in lightning talks at PyCon DE?")
print(f"\nLLM Response: \n\n{answer.content}")


LLM Response: 

To participate in the lightning talks at PyCon DE, you need to sign up by putting your name and topic on the whiteboard next to the registration desk. The sign-up process is on a first-come-first-serve basis. The queue is reset every day in the morning. You can talk about almost anything, but promotions for products or companies, and 'we are hiring' calls are not allowed. Conference announcements are limited to one minute only. Each speaker is allowed one lightning talk per day. Any speaker who hasn't given a lightning talk at the conference is prioritized over those who have already given a talk.


In [10]:
answer = await chat_gpt.answer("When is the PyLadies lunch?")
print(f"\nLLM Response: \n\n{answer.content}")


LLM Response: 

The PyLadies Lunch is scheduled for Tuesday, April 23, from 12:10 to 13:10.


<hr>

**✨ Next: [Use Local LLM with Ragna](03-RAG-local-llm.ipynb) →**

<hr>