# Gai/Gen: Retrieval-Augmented-Generation (RAG)

## 1. Note

The following examples have been tested in this environment:

-   NVIDIA GeForce RTX 2060 6GB
-   Windows 11 + WSL2
-   Ubuntu 22.04
-   Python 3.10
-   CUDA Toolkit 11.8
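
To compare a machine against the list above, a small script can report what it finds. This is only a sketch: the helper name `describe_environment` is hypothetical, and it checks only whether the `nvidia-smi` and `nvcc` executables are on the `PATH`, not their versions.

```python
import platform
import shutil
import sys


def describe_environment():
    """Collect a few facts that can be compared against the tested setup."""
    return {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "system": platform.system(),
        # nvidia-smi ships with the NVIDIA driver; None means it is not on PATH
        "nvidia_smi": shutil.which("nvidia-smi"),
        # nvcc ships with the CUDA Toolkit; None means it is not on PATH
        "nvcc": shutil.which("nvcc"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

A `None` for `nvidia_smi` or `nvcc` does not prove the GPU stack is missing, only that the tool was not found on the `PATH` of the current shell.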

## 2. Create Virtual Environment and Install Dependencies

We will create a separate virtual environment to avoid conflicts between the dependencies that each underlying model requires.

```sh
sudo apt update -y && sudo apt install ffmpeg git git-lfs -y
conda create -n RAG python=3.10.10 -y
conda activate RAG
pip install -e ".[RAG]"
```

## 3. Install Model

```sh
huggingface-cli download hkunlp/instructor-large \
        --local-dir ~/gai/models/instructor-large \
        --local-dir-use-symlinks False
```
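
After the download finishes, it can be worth confirming that the target directory exists and is not empty before running the examples. The helper below is a hypothetical sketch (`model_is_downloaded` is not part of `gai`), and it does not verify that the files inside are complete.

```python
from pathlib import Path


def model_is_downloaded(model_dir):
    """Return True if model_dir exists and contains at least one entry."""
    path = Path(model_dir).expanduser()
    return path.is_dir() and any(path.iterdir())


if __name__ == "__main__":
    # Same directory as the --local-dir argument used in the download command
    print(model_is_downloaded("~/gai/models/instructor-large"))
```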

## 4. Example

Delete the `demo` collection and list the collections that remain:

```python
from gai.gen.rag import RAG

RAG.delete_collection("demo")
RAG.list_collections()
```

Output:

```
[]
```

Fetching the `demo` collection makes it appear in the list again:

```python
from gai.gen.rag import RAG

RAG.get_collection("demo")
RAG.list_collections()
```

Output:

```
[Collection(name=demo)]
```

Index a document into the `demo` collection:

```python
from gai.gen.Gaigen import Gaigen

gen = Gaigen.GetInstance().load('rag')

# Read the whole speech into memory
with open("../tests/gen/rag/pm_long_speech_2023.txt") as f:
    text = f.read()

# Index the text under a readable name and record where it came from
gen.index(
    collection_name="demo",
    text=text,
    path_or_url="2023 National Day Speech",
    metadata={"source": "https://www.pmo.gov.sg/Newsroom/2023-National-Day-Rally-Speech"},
)
```

Output:

```
load INSTRUCTOR_Transformer
max_seq_length  512
100%|██████████| 29/29 [00:04<00:00,  5.98it/s]
```
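
The progress bar and the `chunks_dir` entry in the retrieved metadata suggest that the text is split into chunks before it is embedded. How `gen.index` actually splits the text is not shown here, so the following is only a sketch of one common approach, overlapping word windows. The function name `chunk_text` and its `chunk_size` and `overlap` parameters are hypothetical.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into chunks of up to chunk_size words that share overlap words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(10))
    for chunk in chunk_text(sample, chunk_size=4, overlap=1):
        print(chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some words twice.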


Retrieve the chunks closest to a query:

```python
from gai.gen.Gaigen import Gaigen

gen = Gaigen.GetInstance().load('rag')
result = gen.retrieve(collection_name="demo", query_texts="Who are the young seniors?")
print(result)
```

Output:

```
                                           documents  \
0  The seniors looked happy, but some of them wer...
1  What I found most encouraging was that many se...
2  SECTION 3: AGEING\nI want to talk about two ot...

                                           metadatas  distances  \
0  {'chunks_dir': '/home/roylai/gai/chunks/2023 N...   0.242035
1  {'chunks_dir': '/home/roylai/gai/chunks/2023 N...   0.247290
2  {'chunks_dir': '/home/roylai/gai/chunks/2023 N...   0.254954

                                                 ids
0  94f0f70f5a0ec555696a1bac479d55533734d89fcfa491...
1  d040a9e16a818fd6598483721a043a3a22bf9dca24bf35...
2  f5b54668d2357185abda3d81ceda8d1218b230100a483e...
```
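
The rows above are ordered by ascending `distances`, so the first row is the closest match to the query. The metric used by `gen.retrieve` is not stated here. The sketch below assumes cosine distance and uses toy two-dimensional vectors in place of real embeddings, purely to illustrate how such a ranking works. The names `cosine_distance` and `rank_by_distance` are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank_by_distance(query_vec, doc_vecs, n_results=3):
    """Return the n_results (id, distance) pairs with the smallest distance."""
    scored = [(doc_id, cosine_distance(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1])  # smaller distance means closer match
    return scored[:n_results]


if __name__ == "__main__":
    # Toy vectors standing in for chunk embeddings (assumption, not real data)
    docs = {"a": [1.0, 0.0], "b": [0.7, 0.7], "c": [0.0, 1.0]}
    for doc_id, distance in rank_by_distance([1.0, 0.1], docs, n_results=2):
        print(doc_id, round(distance, 4))
```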
