# RAGatouille Retriever Llama Pack

RAGatouille is a [library](https://github.com/bclavie/RAGatouille) that lets you use ColBERT and other state-of-the-art (SOTA) retrieval models in your RAG pipeline. You can use it either to run inference with ColBERT or to train and fine-tune models.

This LlamaPack shows an easy way to bundle RAGatouille into your RAG pipeline. We use RAGatouille to index a corpus of documents (by default with colbertv2.0), then combine the resulting retriever with LlamaIndex query modules to synthesize an answer with an LLM.
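
For intuition about what a ColBERT-style retriever computes, here is a minimal sketch of late-interaction (MaxSim) scoring in plain Python. The function names and the toy 2-D vectors are illustrative assumptions, not RAGatouille's API; the real library works with learned per-token embeddings and a compressed index.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def maxsim_score(query_embs, doc_embs):
    """Late-interaction score: for each query token, take the best cosine
    similarity against any document token, then sum over query tokens."""
    q = [normalize(v) for v in query_embs]
    d = [normalize(v) for v in doc_embs]
    return sum(
        max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
        for qv in q
    )


# Toy data (hypothetical): one embedding per token.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]  # has a close match for each query token
doc_b = [[1.0, 1.0]]                          # only a partial match for each query token

scores = {
    "doc_a": maxsim_score(query, doc_a),
    "doc_b": maxsim_score(query, doc_b),
}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Because every query token looks for its own best match, `doc_a` outranks `doc_b` even though `doc_b` is somewhat similar to both query tokens.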

In [1]:
# Option: if developing with the llama_hub package
from llama_hub.llama_packs.ragatouille_retriever.base import RAGatouilleRetrieverPack

# Option: download_llama_pack
# from llama_index.llama_pack import download_llama_pack

# RAGatouilleRetrieverPack = download_llama_pack(
#     "RAGatouilleRetrieverPack",
#     "./ragatouille_pack",
#     skip_load=True,
#     # leave the below line commented out if using the notebook on main
#     # llama_hub_url="https://raw.githubusercontent.com/run-llama/llama-hub/jerry/add_llm_compiler_pack/llama_hub"
# )

## Load Documents

Here we load the ColBERTv2 paper: https://arxiv.org/pdf/2112.01488.pdf.

In [1]:
!wget "https://arxiv.org/pdf/2112.01488.pdf" -O colbertv2.pdf

--2024-01-04 15:15:56--  https://arxiv.org/pdf/2112.01488.pdf
Resolving arxiv.org (arxiv.org)... 151.101.131.42, 151.101.195.42, 151.101.3.42, ...
Connecting to arxiv.org (arxiv.org)|151.101.131.42|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1081592 (1.0M) [application/pdf]
Saving to: ‘colbertv2.pdf’


2024-01-04 15:15:56 (17.5 MB/s) - ‘colbertv2.pdf’ saved [1081592/1081592]



In [6]:
from llama_index import SimpleDirectoryReader
from llama_index.llms import OpenAI

reader = SimpleDirectoryReader(input_files=["colbertv2.pdf"])
docs = reader.load_data()

In [8]:
index_name = "my_index"
ragatouille_pack = RAGatouilleRetrieverPack(
    docs,
    llm=OpenAI(model="gpt-3.5-turbo"),
    index_name=index_name
)

artifact.metadata: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.63k/1.63k [00:00<00:00, 3.22MB/s]




[Jan 04, 15:19:14] #> Creating directory .ragatouille/colbert/indexes/my_index 


#> Starting...


config.json: 100%|██████████| 743/743 [00:00<00:00, 2.58MB/s]
pytorch_model.bin: 100%|██████████| 438M/438M [00:10<00:00, 41.0MB/s] 
tokenizer_config.json: 100%|██████████| 405/405 [00:00<00:00, 1.48MB/s]
vocab.txt: 100%|██████████| 232k/232k [00:00<00:00, 10.3MB/s]
tokenizer.json: 100%|██████████| 466k/466k [00:00<00:00, 31.2MB/s]
special_tokens_map.json: 100%|██████████| 112/112 [00:00<00:00, 148kB/s]


[Jan 04, 15:19:28] Loading segmented_maxsim_cpp extension (set COLBERT_LOAD_TORCH_EXTENSION_VERBOSE=True for more info)...




[Jan 04, 15:19:34] [0] 		 #> Encoding 129 passages..


100%|██████████| 3/3 [00:07<00:00,  2.40s/it]


[Jan 04, 15:19:42] [0] 		 avg_doclen_est = 171.97674560546875 	 len(local_sample) = 129
[Jan 04, 15:19:42] [0] 		 Creating 2,048 partitions.
[Jan 04, 15:19:42] [0] 		 *Estimated* 22,185 embeddings.
[Jan 04, 15:19:42] [0] 		 #> Saving the indexing plan to .ragatouille/colbert/indexes/my_index/plan.json ..
Clustering 21076 points in 128D to 2048 clusters, redo 1 times, 20 iterations
  Preprocessing in 0.00 s

0it [00:00, ?it/s]
  0%|          | 0/3 [00:00<?, ?it/s]
 33%|███▎      | 1/3 [00:03<00:06,  3.18s/it]
100%|██████████| 3/3 [00:06<00:00,  2.15s/it]
1it [00:06,  6.53s/it]
100%|██████████| 1/1 [00:00<00:00, 3890.82it/s]
100%|██████████| 2048/2048 [00:00<00:00, 356354.89it/s]


[Jan 04, 15:19:49] #> Optimizing IVF to store map from centroids to list of pids..
[Jan 04, 15:19:49] #> Building the emb2pid mapping..
[Jan 04, 15:19:49] len(emb2pid) = 22185
[Jan 04, 15:19:49] #> Saved optimized IVF to .ragatouille/colbert/indexes/my_index/ivf.pid.pt

#> Joined...
Done indexing!


TypeError: RetrieverQueryEngine.__init__() got an unexpected keyword argument 'service_context'

In [None]:
# get the custom retriever (which just uses RAG.search)

In [None]:
retriever = ragatouille_pack.get_modules()["retriever"]
nodes = retriever.retrieve("How does ColBERTv2 compare with SPLADEv2?")

# even lower-level: query the underlying RAGatouille model directly
RAG = ragatouille_pack.get_modules()["RAG"]
results = RAG.search("How does ColBERTv2 compare with SPLADEv2?", index_name=index_name)
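
To eyeball what the lower-level search returns, a small formatting helper can be handy. The sketch below assumes each hit is a dict with `content`, `score`, and `rank` keys; check this against the RAGatouille version you have installed. The helper name `format_results` and the sample hits are made up for illustration.

```python
def format_results(results, max_chars=80):
    """Return one summary line per hit, ordered by rank.

    Assumes (hypothetically) that each hit is a dict with
    "content", "score", and "rank" keys.
    """
    lines = []
    for hit in sorted(results, key=lambda h: h.get("rank", 0)):
        # Collapse whitespace so multi-line passages fit on one line.
        snippet = " ".join(str(hit.get("content", "")).split())[:max_chars]
        score = float(hit.get("score", 0.0))
        lines.append(f"{hit.get('rank', '?')}. score={score:.2f} {snippet}")
    return lines


# Placeholder hits standing in for the output of RAG.search(...).
sample_results = [
    {"content": "second   passage\ntext", "score": 18.25, "rank": 2},
    {"content": "first passage text", "score": 21.5, "rank": 1},
]

formatted = format_results(sample_results)
for line in formatted:
    print(line)
```

With real data you would pass `results` from the cell above instead of `sample_results`.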

In [None]:
# run the pack end to end, including the full query engine backed by an OpenAI LLM
response = ragatouille_pack.run("How does ColBERTv2 compare with SPLADEv2?")
print(str(response))