# Gai/Gen: Retrieval-Augmented Generation (RAG)

## 1. Note

The following examples have been tested in this environment:

-   NVIDIA GeForce RTX 2060 6GB
-   Windows 11 + WSL2
-   Ubuntu 22.04
-   Python 3.10
-   CUDA Toolkit 11.8

## 2. Create Virtual Environment and Install Dependencies

We will create a separate virtual environment for RAG to avoid conflicts between the dependencies that each underlying model requires.

```sh
sudo apt update -y && sudo apt install ffmpeg git git-lfs -y
conda create -n RAG python=3.10.10 -y
conda activate RAG
pip install -e ".[RAG]"
```
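After installation, a quick check can confirm that the system packages installed above are on the `PATH` and that the interpreter is the expected version. This is a minimal sketch using only the standard library; `missing_tools` is a hypothetical helper, not part of gai.

```python
import shutil
import sys


def missing_tools(tools=("ffmpeg", "git", "git-lfs")):
    """Return the names of required command-line tools not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Report the interpreter version (the examples were tested on Python 3.10)
print("Python version:", ".".join(map(str, sys.version_info[:3])))
print("Missing tools:", missing_tools() or "none")
```

Run it inside the activated `RAG` environment so that `sys.version_info` reflects the conda interpreter.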

## 3. Install Model

In [None]:
%%bash
huggingface-cli download hkunlp/instructor-large \
        --local-dir ~/gai/models/instructor-large \
        --local-dir-use-symlinks False
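
Once the download finishes, it can help to confirm that files actually landed in the target directory. This is a minimal sketch using only the standard library; `model_files` is a hypothetical helper, and the path is the `--local-dir` used above.

```python
from pathlib import Path


def model_files(model_dir):
    """List files under model_dir (relative paths), or [] if the directory is missing."""
    root = Path(model_dir).expanduser()
    if not root.is_dir():
        return []
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


files = model_files("~/gai/models/instructor-large")
print(f"{len(files)} file(s) found")
```

An empty list means the directory is missing or empty, in which case the download should be repeated.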

## 4. Example

### Index and Retrieve Text File

In [1]:
# Step 1: Reset 'demo' collection
from gai.gen.rag import RAG
rag = RAG()
rag.delete_collection("demo")
rag.list_collections()

2024-02-22 20:55:13 INFO gai.gen.rag.RAG:Deleting demo...


[]

In [2]:
# Step 2: Index a text file

path="./pm_long_speech_2023.txt"
rag.load()
doc_id = await rag.index_async(
    collection_name='demo',
    file_path=path,
    file_type='txt',
    source="https://www.pmo.gov.sg/Newsroom/2023-National-Day-Rally-Speech",
    title="2023 National Day Rally Speech",
    )

load INSTRUCTOR_Transformer
max_seq_length  512

100%|██████████| 29/29 [00:00<00:00, 681.09it/s]
2024-02-22 20:55:17 INFO gai.gen.rag.RAG:RAG.index_async: Begin indexing...
0it [00:00, ?it/s]
2024-02-22 20:55:19 DEBUG gai.gen.rag.RAG:RAG.index_async: Indexed 1/29 chunk 5d98b767-67d4-4058-b6b2-cd18b452d741 into collection demo
1it [00:01,  1.64s/it]
2024-02-22 20:55:19 DEBUG gai.gen.rag.RAG:RAG.index_async: Indexed 2/29 chunk 6980715e-c444-4533-94f1-d05f6d91bb2a into collection demo
2024-02-22 20:55:19 DEBUG gai.gen.rag.RAG:RAG.index_async: Indexed 3/29 chunk 8d7af7f5-e57a-4b37-b2be-135ae5afbb1e into collection demo
3it [00:01,  2.07it/s]
2024-02-22 20:55:19 DEBUG gai.gen.rag.RAG:RAG.index_async: Indexed 4/29 chunk 0580bbe1-1448-47e4-876a-d91c2fcdf5e8 into collection demo
2024-02-22 20:55:19 DEBUG gai.gen.rag.RAG:RAG.index_async: Indexed 5/29 chunk be0811e0-64eb-4d3a-95d1-28a880b657bc into collection demo
5it [00:01,  3.61it/s]
2024-02-22 20:55:19 DEBUG gai.gen.rag.RAG:RAG.index_
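
The Step 2 cell uses top-level `await`, which works in Jupyter but is a syntax error in a plain Python script. The following is a minimal sketch of running the same indexing call from a script, assuming `RAG.load` and `RAG.index_async` behave as they do in the notebook; `index_speech` and `main` are hypothetical helper names.

```python
import asyncio


async def index_speech(rag, path="./pm_long_speech_2023.txt"):
    """Load the model and index the speech into the 'demo' collection."""
    rag.load()
    return await rag.index_async(
        collection_name="demo",
        file_path=path,
        file_type="txt",
        source="https://www.pmo.gov.sg/Newsroom/2023-National-Day-Rally-Speech",
        title="2023 National Day Rally Speech",
    )


def main():
    # Same import as in Step 1; kept inside main() so that importing this
    # module does not require gai to be installed.
    from gai.gen.rag import RAG

    doc_id = asyncio.run(index_speech(RAG()))
    print(doc_id)
```

Call `main()` from the script's entry point; `asyncio.run` creates the event loop that the notebook otherwise provides.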

In [3]:
# Step 3: View doc summary
from pprint import pprint
doc = rag.get_document(doc_id)
pprint({
    "Id":doc.Id,
    "Title":doc.Title,
    "FileName":doc.FileName,
    "File":doc.FileType,
    "Source":doc.Source,
    "ByteSize":doc.ByteSize,
    "Collection":doc.CollectionName,
    "ChunkSize":doc.ChunkGroups[0].ChunkSize,
    "Chunks": len(doc.ChunkGroups[0].Chunks)
    })

{'ByteSize': 43352,
 'ChunkSize': 2000,
 'Chunks': 29,
 'Collection': 'demo',
 'File': 'txt',
 'FileName': 'pm_long_speech_2023.txt',
 'Id': '3f047a5665ea01fc239fce1933a75e3ec2d64e574fbfc70d4b974449896c6365',
 'Source': 'https://www.pmo.gov.sg/Newsroom/2023-National-Day-Rally-Speech',
 'Title': '2023 National Day Rally Speech'}
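
The summary reports a `ChunkSize` of 2000 and 29 chunks for a 43,352-byte file. The source does not show how gai splits text, so the following is only an illustrative sketch of fixed-size chunking with overlap; `split_text` and its `overlap` parameter are hypothetical and not part of the gai API.

```python
def split_text(text, size=2000, overlap=0):
    """Split text into chunks of at most `size` characters.

    Consecutive chunks share `overlap` characters (assumed scheme, for illustration).
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    stride = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last chunk reached the end of the text
        start += stride
    return chunks


sample = "x" * 43352  # stand-in with the same length as the reported ByteSize
print(len(split_text(sample, size=2000, overlap=0)))    # 22
print(len(split_text(sample, size=2000, overlap=500)))  # 29
```

An overlap of 500 happens to reproduce the reported 29 chunks, but this may be coincidental: the real splitter may count characters rather than bytes, or break on separators instead of fixed offsets.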


In [4]:
# Step 4: Retrieve answers
rag.retrieve(collection_name="demo",query_texts="Who are the young seniors?")

2024-02-22 20:55:30 INFO gai.gen.rag.RAG:Retrieving by query Who are the young seniors?...
2024-02-22 20:55:30 INFO gai.gen.rag.dalc.RAGVSRepository:Retrieving by query Who are the young seniors?...
2024-02-22 20:55:31 DEBUG gai.gen.rag.dalc.RAGVSRepository:result=[['3843666a-69fa-464d-9406-140862d4aa04', 'e9f4b4f1-4291-42d4-8c9e-4e51ead578fa', 'ea43c5c8-5e28-47d0-bd84-eacf4e120673']]
2024-02-22 20:55:31 DEBUG gai.gen.rag.RAG:result={"documents":{"0":"The seniors looked happy, but some of them were not so well. A few were wheelchair-bound, but they still joined in the activities. This cheerful lady told me she hoped to joget again! Why not, even in a wheelchair? Other seniors were using the health services at the AAC. Some were getting their vital signs checked so that doctors could follow up if something was amiss. One was having a teleconsultation \u2013 with nurses physically there to help him, and a doctor calling in on Zoom from the polyclinic. It w

|          | documents | metadatas | distances | ids |
| --- | --- | --- | --- | --- |
| Abstract | The seniors looked happy, but some of them wer... | | 0.121018 | 3843666a-69fa-464d-9406-140862d4aa04 |
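
The `distances` column ranks chunks by how close their embeddings are to the query embedding, with smaller values meaning a closer match. The source does not state which metric gai uses, so the sketch below assumes cosine distance and uses toy three-dimensional vectors in place of real instructor-large embeddings; `cosine_distance` and `retrieve` are hypothetical names.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def retrieve(query_vec, chunk_vecs, n_results=3):
    """Return the n_results closest chunks as (distance, chunk_id), nearest first."""
    scored = [
        (cosine_distance(query_vec, vec), chunk_id)
        for chunk_id, vec in chunk_vecs.items()
    ]
    return sorted(scored)[:n_results]


# Toy embeddings standing in for indexed chunks
chunks = {
    "chunk-a": [0.9, 0.1, 0.0],
    "chunk-b": [0.0, 1.0, 0.0],
    "chunk-c": [0.7, 0.7, 0.1],
}
for distance, chunk_id in retrieve([1.0, 0.0, 0.0], chunks, n_results=2):
    print(f"{chunk_id}: {distance:.4f}")
```

Here `chunk-a` ranks first because its vector points almost in the same direction as the query.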


In [5]:
# Index and Retrieve PDF
path = "./attention-is-all-you-need.pdf"
rag.unload()
rag.load()
doc_id = await rag.index_async(
    collection_name='demo',
    file_path=path,
    file_type='pdf',
    source="arxiv.org",
    title="Attention is All You Need",
    )
rag.retrieve(collection_name="demo",query_texts="How is the transformer different from RNN?")


load INSTRUCTOR_Transformer
max_seq_length  512

100%|██████████| 22/22 [00:00<00:00, 487.99it/s]
2024-02-22 20:55:42 INFO gai.gen.rag.RAG:RAG.index_async: Begin indexing...
0it [00:00, ?it/s]
2024-02-22 20:55:43 DEBUG gai.gen.rag.RAG:RAG.index_async: Indexed 1/22 chunk dacf656c-b424-4a0f-b192-40ab630bc84c into collection demo
1it [00:00,  1.88it/s]
2024-02-22 20:55:43 DEBUG gai.gen.rag.RAG:RAG.index_async: Indexed 2/22 chunk d82f6aa6-3974-46fb-8107-691e7cc6c5dc into collection demo
2024-02-22 20:55:43 DEBUG gai.gen.rag.RAG:RAG.index_async: Indexed 3/22 chunk 889ab410-3a3f-4881-8e60-23f51483adf9 into collection demo
3it [00:00,  4.78it/s]
2024-02-22 20:55:43 DEBUG gai.gen.rag.RAG:RAG.index_async: Indexed 4/22 chunk 0d343170-c4e1-42bc-8e05-768af1fbf882 into collection demo
4it [00:00,  5.68it/s]
2024-02-22 20:55:43 DEBUG gai.gen.rag.RAG:RAG.index_async: Indexed 5/22 chunk a9394ed3-afbd-4d64-95f7-540396de2c57 into collection demo
2024-02-22 20:55:43 DEBUG gai.gen.rag.RAG:RAG.index_

|          | documents | metadatas | distances | ids |
| --- | --- | --- | --- | --- |
| Abstract | This inherently sequential nature precludes pa... | | 0.127805 | da98e055-40d4-4efd-ad13-4fa26b4184d8 |


---
## 5. Running as a Service

In this example, we will start two services: the RAG API and the RAG listener.
We will then index a document using curl and watch the progress through the listener.

### Step 1: Start the API service

#### Option A: Run in a Docker container (Recommended)

```bash
docker run -d \
    --name gai-rag \
    -p 12031:12031 \
    --gpus all \
    -v ~/gai/models:/app/models \
    kakkoii1337/gai-rag:latest
```

Wait for the model to load. You can follow the progress in the container logs:

```bash
docker logs gai-rag
```

When loading completes, the logs should show:

```bash
INFO:     Started server process [1]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:12031 (Press CTRL+C to quit)
```

#### Option B: Run from Terminal

```bash
cd /gai-gen/gai/api/
python rag_api.py
```
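
With either option, the API is reachable once Uvicorn is listening on port 12031. The following is a minimal sketch for waiting on that port from Python using only the standard library; `wait_for_port` is a hypothetical helper. An open port only shows that something accepts TCP connections. With a published Docker port this may happen before the application has finished starting, so the log output remains the authoritative check.

```python
import socket
import time


def wait_for_port(host="localhost", port=12031, timeout=120.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Succeeds as soon as something is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port(timeout=300)` blocks for up to five minutes before giving up.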

### Step 2: Start the Listener Service

The listener is a useful companion to the API: it monitors indexing progress over a WebSocket connection.
This is especially useful when indexing large files.

```python
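import asyncio
import logging

import websockets

# Configure logging so that received messages are printed to the console
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

async def listen():
    # WebSocket endpoint on which the API reports indexing progress
    ws_uri = "ws://localhost:12031/api/v1/rag/index-file/ws"
    async with websockets.connect(ws_uri) as websocket:
        # Log every message until the connection closes or the script is stopped
        while True:
            message = await websocket.recv()
            logger.info(f"Received: {message}")

asyncio.run(listen())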
```

The code above is saved as `tests/integration_tests/rag/rag_listener.py`. Run it with:

```bash
cd tests/integration_tests/rag
python rag_listener.py
```
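
If the listener starts before the API is ready, the connection attempt is refused and the script exits. The following is a minimal sketch of retrying with exponential backoff, assuming the failure surfaces as an `OSError` such as `ConnectionRefusedError`; `run_with_retry` is a hypothetical helper, and its argument would be a coroutine function like the listener's `listen`.

```python
import asyncio


async def run_with_retry(connect_and_listen, retries=5, base_delay=1.0):
    """Await connect_and_listen(); on OSError, wait and retry with exponential backoff."""
    for attempt in range(retries):
        try:
            return await connect_and_listen()
        except OSError as exc:  # e.g. ConnectionRefusedError while the API is loading
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * 2 ** attempt
            print(f"Connection failed ({exc!r}); retrying in {delay:.1f}s")
            await asyncio.sleep(delay)
```

In the listener script, this would replace the final line with `asyncio.run(run_with_retry(listen))`.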

If the listener starts successfully, you should see the following message in the API server logs:

![rag-listener-connected](./imgs/rag-listener-connected.png)


### Step 3: Test RAG

**Send Request**

```bash
cd tests/integration_tests/rag
```

The following example uses the curl script `tests/integration_tests/rag/3_curl_index.sh` to index a file.

```bash
curl -X POST 'http://localhost:12031/gen/v1/rag/index-file' \
    -H 'accept: application/json' \
    -H 'Content-Type: multipart/form-data' \
    -s \
    -F 'collection_name=demo' \
    -F 'file=@./pm_long_speech_2023.txt' \
    -F 'metadata={"source": "https://www.pmo.gov.sg/Newsroom/National-Day-Rally-2023#:~:text=COVID%2D19%20was%20the%20most,indomitable%20spirit%20of%20our%20nation."}'
```
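
The same request can be sent from Python. This is a minimal sketch that assumes the third-party `requests` package is installed and reuses the form field names from the curl command (`collection_name`, `file`, `metadata`); `build_index_form` and `index_file` are hypothetical helpers, and the response format is not documented here, so the raw response object is returned.

```python
import json
from pathlib import Path

API_URL = "http://localhost:12031/gen/v1/rag/index-file"


def build_index_form(collection_name, source):
    """Build the non-file form fields; metadata is sent as a JSON string."""
    return {
        "collection_name": collection_name,
        "metadata": json.dumps({"source": source}),
    }


def index_file(path, collection_name, source, url=API_URL):
    """POST the file as multipart/form-data and return the HTTP response."""
    import requests  # third-party; imported lazily so the helpers above work without it

    file_path = Path(path)
    with file_path.open("rb") as fh:
        response = requests.post(
            url,
            data=build_index_form(collection_name, source),
            files={"file": (file_path.name, fh)},
            timeout=600,  # indexing large files can take a while
        )
    response.raise_for_status()
    return response
```

For example: `index_file("./pm_long_speech_2023.txt", "demo", "https://www.pmo.gov.sg/Newsroom/National-Day-Rally-2023")`. The `Content-Type` header is left to `requests`, which sets the multipart boundary itself.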

**NOTE**: Indexing may fail if the file has already been indexed. To re-index it, delete the `demo` collection first.

```bash
curl -X DELETE 'http://localhost:12031/gen/v1/rag/collection/demo'
```
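
The delete call can also be issued from Python using only the standard library. This is a minimal sketch that mirrors the curl command above; `delete_collection_request` and `delete_collection` are hypothetical helpers.

```python
import urllib.parse
import urllib.request


def delete_collection_request(name, base_url="http://localhost:12031"):
    """Build the DELETE request for a collection without sending it."""
    quoted = urllib.parse.quote(name, safe="")
    url = f"{base_url}/gen/v1/rag/collection/{quoted}"
    return urllib.request.Request(url, method="DELETE")


def delete_collection(name, base_url="http://localhost:12031"):
    """Send the DELETE request and return the HTTP status code."""
    request = delete_collection_request(name, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status
```

`urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so a returned status code indicates that the server accepted the request.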



### Video

![gai-gen-rag](../doc/docs/gai-gen/imgs/gai-gen-rag.gif)