This example demonstrates **Retrieval-Augmented Generation (RAG)** in Python using Hugging Face's `transformers` library.

We'll use pre-trained **DPR (Dense Passage Retriever)** encoders for retrieval and a pre-trained T5 model (`t5-small`) for generation.

In [None]:
pip install transformers datasets faiss-cpu

Collecting datasets
  Downloading datasets-3.2.0-py3-none-any.whl.metadata (20 kB)
Collecting faiss-cpu
  Downloading faiss_cpu-1.9.0.post1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.4 kB)
Collecting dill<0.3.9,>=0.3.0 (from datasets)
  Downloading dill-0.3.8-py3-none-any.whl.metadata (10 kB)
Collecting xxhash (from datasets)
  Downloading xxhash-3.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (12 kB)
Collecting multiprocess<0.70.17 (from datasets)
  Downloading multiprocess-0.70.16-py310-none-any.whl.metadata (7.2 kB)
Collecting fsspec<=2024.9.0,>=2023.1.0 (from fsspec[http]<=2024.9.0,>=2023.1.0->datasets)
  Downloading fsspec-2024.9.0-py3-none-any.whl.metadata (11 kB)
Downloading datasets-3.2.0-py3-none-any.whl (480 kB)
Downloading faiss_cpu-1.9.0.post1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64

In [None]:
from transformers import DPRContextEncoder, DPRContextEncoderTokenizer, DPRQuestionEncoder, DPRQuestionEncoderTokenizer
from transformers import T5ForConditionalGeneration, T5Tokenizer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np


In [None]:
# Sample knowledge base
knowledge_base = [
    "The Eiffel Tower is located in Paris, France.",
    "Python is a popular programming language used for AI and machine learning.",
    "The human heart has four chambers: two atria and two ventricles.",
    "Albert Einstein developed the theory of relativity.",
    "RAG stands for Retrieval-Augmented Generation in AI.",
    "Damon Salvatore, played by Ian Somerhalder, is a main character in The Vampire Diaries.",
    "He's a centuries-old vampire and the older brother of Stefan Salvatore.",
    "Damon is initially portrayed as the show's villain but later becomes a complex and multidimensional character.",
    "His love for Elena Gilbert, played by Nina Dobrev, is a central theme throughout the series.",
    "Damon's charming, manipulative, and sometimes ruthless nature makes him a compelling and intriguing character.",
    "Stefan Salvatore, played by Paul Wesley, is a main character in The Vampire Diaries. He's a centuries-old vampire and the younger brother of Damon Salvatore. Stefan is initially portrayed as the show's hero but later becomes a complex and multidimensional character. His love for Elena Gilbert is a central theme throughout the series. Stefan's compassionate, responsible, and sometimes brooding nature makes him a relatable and endearing character.",
]

In [None]:
# Step 1: Encode the Knowledge Base with DPR Context Encoder
context_tokenizer = DPRContextEncoderTokenizer.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")
context_encoder = DPRContextEncoder.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")

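# Tokenize all passages as one padded batch, then keep one pooled embedding per passage
context_inputs = context_tokenizer(knowledge_base, return_tensors="pt", padding=True, truncation=True)
encoded_knowledge_base = context_encoder(**context_inputs).pooler_output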


The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'DPRQuestionEncoderTokenizer'. 
The class this function is called from is 'DPRContextEncoderTokenizer'.
Some weights of the model checkpoint at facebook/dpr-ctx_encoder-single-nq-base were not used when initializing DPRContextEncoder: ['ctx_encoder.bert_model.pooler.dense.bias', 'ctx_encoder.bert_model.pooler.dense.weight']
- This IS expected if you are initializing DPRContextEncoder from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DPRContextEncoder from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification mod

In [None]:
# Step 2: Query Encoding with DPR Question Encoder
query = input("Enter your query: ")
question_tokenizer = DPRQuestionEncoderTokenizer.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
question_encoder = DPRQuestionEncoder.from_pretrained("facebook/dpr-question_encoder-single-nq-base")

# Tokenize the query and keep its pooled embedding
query_inputs = question_tokenizer(query, return_tensors="pt", truncation=True)
query_embedding = question_encoder(**query_inputs).pooler_output

Enter your query: who is Damon's brother


Some weights of the model checkpoint at facebook/dpr-question_encoder-single-nq-base were not used when initializing DPRQuestionEncoder: ['question_encoder.bert_model.pooler.dense.bias', 'question_encoder.bert_model.pooler.dense.weight']
- This IS expected if you are initializing DPRQuestionEncoder from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DPRQuestionEncoder from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.


In [None]:
# Step 3: Find the Most Relevant Context Using Cosine Similarity
# similarities has shape (1, number of passages): one row for the single query
similarities = cosine_similarity(query_embedding.detach().numpy(), encoded_knowledge_base.detach().numpy())
most_relevant_index = int(np.argmax(similarities[0]))

retrieved_context = knowledge_base[most_relevant_index]
print(f"Retrieved Context: {retrieved_context}")


Retrieved Context: Stefan Salvatore, played by Paul Wesley, is a main character in The Vampire Diaries. He's a centuries-old vampire and the younger brother of Damon Salvatore. Stefan is initially portrayed as the show's hero but later becomes a complex and multidimensional character. His love for Elena Gilbert is a central theme throughout the series. Stefan's compassionate, responsible, and sometimes brooding nature makes him a relatable and endearing character.
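
The retrieval step above keeps only the single best passage and scores it with cosine similarity. DPR's encoders were trained to rank passages by the inner (dot) product, so the two scores can order passages differently when embedding norms vary. The sketch below is a NumPy-only illustration of both scorings with top-k selection. `top_k_passages` and the toy 3-dimensional vectors are hypothetical stand-ins for the `pooler_output` rows and are not part of the original notebook.

```python
import numpy as np

def top_k_passages(query_vec, passage_vecs, k=3, metric="dot"):
    """Return (indices, scores) of the k best-scoring passages, best first."""
    query_vec = np.asarray(query_vec, dtype=np.float32).reshape(-1)
    passage_vecs = np.asarray(passage_vecs, dtype=np.float32)
    if metric == "cosine":
        # Normalise both sides so the dot product equals cosine similarity
        query_vec = query_vec / np.linalg.norm(query_vec)
        passage_vecs = passage_vecs / np.linalg.norm(passage_vecs, axis=1, keepdims=True)
    elif metric != "dot":
        raise ValueError(f"unknown metric: {metric}")
    scores = passage_vecs @ query_vec
    order = np.argsort(-scores)[:min(k, len(scores))]
    return order, scores[order]

# Toy embeddings (assumption): passage 0 is long but only partly aligned with
# the query, passage 1 is short and perfectly aligned, passage 2 is unrelated.
toy_passages = np.array([[3.0, 3.0, 0.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
toy_query = np.array([0.0, 1.0, 0.0])

dot_idx, dot_scores = top_k_passages(toy_query, toy_passages, k=2, metric="dot")
cos_idx, cos_scores = top_k_passages(toy_query, toy_passages, k=2, metric="cosine")
print(dot_idx, cos_idx)  # [0 1] [1 0]
```

With real DPR embeddings, `query_embedding.detach().numpy()` and `encoded_knowledge_base.detach().numpy()` could be passed in place of the toy arrays.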


In [None]:
# Step 4: Use T5 for Generation
t5_tokenizer = T5Tokenizer.from_pretrained("t5-small")
t5_model = T5ForConditionalGeneration.from_pretrained("t5-small")

In [None]:

# Combine query and retrieved context
input_text = f"question: {query} context: {retrieved_context}"
input_ids = t5_tokenizer(input_text, return_tensors="pt").input_ids
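
The prompt above carries exactly one retrieved passage. If several passages were retrieved, they would have to be merged into the `context:` field without overflowing the model's input. The helper below is a minimal sketch of that idea. `build_t5_input` and its `max_words` budget are hypothetical, and a word count is only a rough stand-in for the tokenizer's actual token count.

```python
def build_t5_input(query, passages, max_words=200):
    """Join passages (best first) into one T5 prompt, stopping at a word budget."""
    kept, used = [], 0
    for passage in passages:
        n_words = len(passage.split())
        # Always keep the best passage; drop later ones that would exceed the budget
        if kept and used + n_words > max_words:
            break
        kept.append(passage.strip())
        used += n_words
    context = " ".join(kept)
    return f"question: {query} context: {context}"

ranked_passages = [
    "He's a centuries-old vampire and the older brother of Stefan Salvatore.",
    "The Eiffel Tower is located in Paris, France.",
]
prompt = build_t5_input("who is Damon's brother", ranked_passages, max_words=15)
print(prompt)
```

Here the first passage has 11 words and the second has 8, so with a budget of 15 only the first one ends up in the prompt.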

In [None]:

# Generate the final answer
output_ids = t5_model.generate(input_ids, max_length=50, num_beams=2)
generated_answer = t5_tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(f"Generated Answer: {generated_answer}")

Generated Answer: Stefan Salvatore
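
The four steps above can be composed into a single function. The sketch below shows one possible shape for it, assuming the encoders and the generator are passed in as plain callables. `rag_answer`, `toy_embed`, and `toy_generate` are hypothetical names, and the toy stand-ins exist only so the example runs without downloading DPR or T5.

```python
import numpy as np

def rag_answer(query, passages, embed_query, embed_passages, generate, k=1):
    """Retrieve the k best passages by dot product, then pass them to the generator."""
    passage_vecs = np.asarray(embed_passages(passages))     # shape (n_passages, dim)
    query_vec = np.asarray(embed_query(query)).reshape(-1)  # shape (dim,)
    scores = passage_vecs @ query_vec
    best = np.argsort(-scores)[:k]
    context = " ".join(passages[i] for i in best)
    return generate(f"question: {query} context: {context}"), context

# Hypothetical stand-ins for the DPR encoders and the T5 generator
VOCAB = ["paris", "python", "heart", "einstein"]

def toy_embed(texts):
    # Bag-of-words counts over a tiny fixed vocabulary
    if isinstance(texts, str):
        texts = [texts]
    return np.array([[t.lower().count(w) for w in VOCAB] for t in texts], dtype=float)

def toy_generate(prompt):
    # Echoes the context part of the prompt instead of calling a model
    return prompt.split("context: ", 1)[1]

docs = [
    "The Eiffel Tower is located in Paris, France.",
    "Python is a popular programming language used for AI and machine learning.",
    "Albert Einstein developed the theory of relativity.",
]
answer, used_context = rag_answer("Where is Paris?", docs, toy_embed, toy_embed, toy_generate)
print(answer)
```

In the notebook's setting, `embed_passages` and `embed_query` would wrap the DPR context and question encoders, and `generate` would wrap the T5 tokenizer and `t5_model.generate` call.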
