<img src="images/dspy_img.png" height="35%" width="65%">

## Naive Retrieval-Augmented Generation (RAG)

DSPy modules are easy to work with and modular: they can be chained or stacked to create 
a pipeline. In our case, building a naive RAG means combining `dspy.Signature` and `dspy.ChainOfThought` inside our own module class, `RAG` (see the implementation in [dspy_utils](dspy_utils.py)). 

Out of the box, DSPy supports a set of [retriever clients](https://dspy-docs.vercel.app/api/category/retrieval-model-clients). For this example,
we will use `dspy.ColBERTv2`.

<img src="images/dspy_rag_pipeline.png">
<img src="images/dspy_rag_flow.png">

[source](https://towardsdatascience.com/intro-to-dspy-goodbye-prompting-hello-programming-4ca1c6ce3eb9)
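
The retrieve-then-generate flow in the diagrams can be illustrated without DSPy at all. The following is a minimal sketch under stated assumptions, not the `RAG` class from `dspy_utils.py`: the toy `CORPUS`, the word-overlap `retrieve` function, the `NaiveRAG` and `Prediction` classes, and the `echo_lm` stand-in are all hypothetical names introduced here purely to show how a retriever and a generator compose into one callable module.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical toy corpus standing in for the ColBERTv2 index.
CORPUS = [
    "Dodoma is the capital of Tanzania.",
    "The blue whale is the largest mammal.",
    "George Orwell wrote the novel 1984.",
]


def retrieve(question: str, k: int) -> List[str]:
    """Rank passages by the number of lowercase words shared with the question."""
    q_words = set(question.lower().replace("?", "").split())

    def overlap(passage: str) -> int:
        return len(q_words & set(passage.lower().replace(".", "").split()))

    return sorted(CORPUS, key=overlap, reverse=True)[:k]


@dataclass
class Prediction:
    context: List[str]
    answer: str


class NaiveRAG:
    """Step 1: retrieve passages. Step 2: ask the LM with those passages as context."""

    def __init__(self, lm: Callable[[str], str], num_passages: int = 3):
        self.lm = lm
        self.num_passages = num_passages

    def __call__(self, question: str) -> Prediction:
        context = retrieve(question, self.num_passages)
        prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {question}\nAnswer:"
        return Prediction(context=context, answer=self.lm(prompt))


def echo_lm(prompt: str) -> str:
    """Stand-in for a real LM: returns the top-ranked context line."""
    return prompt.splitlines()[1]


rag_sketch = NaiveRAG(lm=echo_lm, num_passages=2)
result = rag_sketch("What is the capital of Tanzania?")
print(result.answer)
```

Swapping `echo_lm` for a real model call and `retrieve` for a real index leaves `NaiveRAG` unchanged, which is the kind of modularity the pipeline relies on.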

In [1]:
_file_ = "10_dspy_naive_rag.ipynb"

In [2]:
import warnings
import dspy
from dspy_utils import RAG, BOLD_BEGIN, BOLD_END

# Filter out warnings
warnings.filterwarnings("ignore")

In [3]:
%env OPENAI_API_KEY=
openai_api_key = %env OPENAI_API_KEY

env: OPENAI_API_KEY=


In [4]:
from columbus_api import Columbus
columbus = Columbus()
llm = columbus.get_llm_for_DSPy("gpt-4-turbo", openai_api_key=openai_api_key)
llm.kwargs['max_tokens']=1500

2024-06-13 11:24:28,418 - INFO - Start
2024-06-13 11:24:30,425 - ERROR - Error! can not get Columbus access token. Chech columbus_api.json.
2024-06-13 11:24:30,437 - INFO - Columbus class ready
2024-06-13 11:24:31,740 - ERROR - Columbus access token error: check your evns.
2024-06-13 11:24:31,749 - INFO - get_llm_for_DSPy modelname=gpt-4-turbo, apikey=''
2024-06-13 11:24:31,753 - INFO - get_llm_for_DSPy OpenAI


In [5]:
# llm.kwargs['extra_headers']['Authorization'] = f"Bearer {columbus.get_access_token()}"    
llm("Hello World!")

['Hello! How can I assist you today?']

### Questions to ask the RAG model

In [6]:

QUESTIONS = [
    "What is the capital of Tanzania?",
    "Who was the president of the United States in 1960?",
    "What is the largest mammal?",
    "What is the most populous country?",
    "What is the most widely spoken language?",
    "Which country won the FIFA Football World Cup in 1970?",
    "Which country has won the most FIFA Football World Cups?",
    "Who is the author of the book '1984'?",
    "What is the most popular programming language?",
    "What is dark matter in physics?",
]

### Instantiate the retriever and configure DSPy

In [7]:
# Optional: use a local Ollama model instead of the LM created above
# ollama_mistral = dspy.OllamaLocal(model='mistral',
#                                     max_tokens=2500)
# Instantiate ColBERTv2 as the retrieval module
colbert_rm = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')

# Configure the settings
# dspy.settings.configure(lm=ollama_mistral, rm=colbert_rm)
dspy.settings.configure(lm=llm, rm=colbert_rm)

### Query the RAG 

In [8]:
# Instantiate the RAG module
rag = RAG(num_passages=5)
for idx, question in enumerate(QUESTIONS):
    print(f"{BOLD_BEGIN}Question {idx + 1}: {BOLD_END}{question}")
    response = rag(question=question)
    print(f"{BOLD_BEGIN}Answer    : {BOLD_END}{response.answer}")
    print("-----------------------------\n")

Question 1: What is the capital of Tanzania?


2024-06-13 11:25:36,738 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


Answer    : Dodoma
-----------------------------

Question 2: Who was the president of the United States in 1960?


2024-06-13 11:25:41,737 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


Answer    : Dwight D. Eisenhower
-----------------------------

Question 3: What is the largest mammal?


2024-06-13 11:25:54,221 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


Answer    : The blue whale.
-----------------------------

Question 4: What is the most populous country?


2024-06-13 11:26:03,533 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


Answer    : China
-----------------------------

Question 5: What is the most widely spoken language?


2024-06-13 11:26:15,926 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


Answer    : English
-----------------------------

Question 6: Which country won the FIFA Football World Cup in 1970?


2024-06-13 11:26:23,720 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


Answer    : Brazil
-----------------------------

Question 7: Which country has won the most FIFA Football World Cups?


2024-06-13 11:26:28,973 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


Answer    : Brazil
-----------------------------

Question 8: Who is the author of the book '1984'?


2024-06-13 11:26:34,573 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


Answer    : George Orwell
-----------------------------

Question 9: What is the most popular programming language?


2024-06-13 11:26:41,713 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


Answer    : Java
-----------------------------

Question 10: What is dark matter in physics?


2024-06-13 11:26:52,364 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


Answer    : Dark matter is a hypothetical type of matter distinct from ordinary matter such as protons and neutrons, neutrinos, and dark energy. It does not emit or interact with electromagnetic radiation, making it invisible to the entire electromagnetic spectrum. Its existence and properties are inferred from its gravitational effects on visible matter, gravitational lensing, and its influence on the universe's large-scale structure, galaxies, and the cosmic microwave background.
-----------------------------



### Inspect history of the prompts

In [9]:
# print(ollama_mistral.inspect_history(n=1))
print(llm.inspect_history(n=1))




Given a context, question, answer the question.

---

Follow the following format.

Context: ${context}

Question: ${question}

Reasoning: Let's think step by step in order to ${produce the answer}. We ...

Answer: ${answer}

---

Context:
[1] «Dark matter in fiction | Dark matter is defined as hypothetical matter that is undetectable by its emitted radiation, but whose presence can be inferred from gravitational effects on visible matter. It has been used in a variety of fictional media, including computer and video games and books. In such cases, dark matter is usually attributed extraordinary physical or magical properties. Such descriptions are often inconsistent with the known properties of dark matter proposed in physics and cosmology. For example in computer games, it is often used as material for making weapons and items, and is usually depicted as black or a similar color.»
[2] «Dark Matter (disambiguation) | Dark matter is matter that is undetectable by its emitted radiati
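
The history output suggests how the prompt is laid out: an instruction line, the retrieved passages numbered and wrapped in «», then the question and a reasoning cue. Below is a rough reconstruction of that layout as a sketch only. `render_prompt` is a hypothetical helper written for illustration, not DSPy's own rendering code, and the passages in the example are made up.

```python
from typing import List


def render_prompt(passages: List[str], question: str) -> str:
    """Approximate the prompt layout seen in the inspected history."""
    # Number each passage from 1 and wrap it in guillemets, as in the output above.
    context = "\n".join(f"[{i}] «{p}»" for i, p in enumerate(passages, start=1))
    return (
        "Given a context, question, answer the question.\n\n"
        "---\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n\n"
        "Reasoning: Let's think step by step in order to"
    )


example_prompt = render_prompt(
    ["Title A | first passage", "Title B | second passage"],
    "What is dark matter in physics?",
)
print(example_prompt)
```

Seeing the layout this way makes it easier to check, when inspecting history, whether the retrieved passages actually contain the facts the answer depends on.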

## All this is amazing! 😜 Feel the wizardry in DSPy Modularity 🧙‍♀️