### Retrieval Augmented Generation
Retrieval-augmented generation, or RAG for short, is a framework that combines elements of retrieval-based and generation-based models in NLP. It aims to leverage the strengths of both approaches to improve the quality and relevance of the response.

A generation-based approach, such as a sequence-to-sequence model, is limited to what it learned from its training data. For example, the model may lack knowledge of specific domains such as health or insurance.

RAG helps address this limitation by providing context to the LLM, where the context is retrieved based on the query (the user's input question). A vector store is used to search for relevant content, which may be internal company documents or domain-specific documents. The prompt is then modified to include the retrieved content, and the LLM uses it to answer the user.

The following diagram gives an overview of RAG:
![Retrieval Augmented Generation](images/rag.png)
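
The flow described above (retrieve, build the prompt, generate) can be sketched in a few lines of plain Python. This is only an illustration: `rag_answer`, `fake_retrieve`, and `fake_generate` are hypothetical names, and the retriever and model are stubs rather than the vector store and LLM used later in this notebook.

```python
from typing import Callable, List


def rag_answer(
    question: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str], str],
) -> str:
    """Retrieve context for the question, build a prompt, and call the model."""
    documents = retrieve(question)
    context = "\n".join(documents)
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)


# Hypothetical stand-ins so the sketch runs without any external service.
def fake_retrieve(question: str) -> List[str]:
    return ["agent_dim has columns agent_num and agent_name."]


def fake_generate(prompt: str) -> str:
    # Report the prompt length instead of calling a real LLM.
    return f"(model saw {len(prompt)} characters of prompt)"


print(rag_answer("Columns in agent_dim?", fake_retrieve, fake_generate))
```

Passing the retriever and the generator as callables keeps the orchestration independent of any particular vector store or model provider.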

### Components
RAG has three main components: a text embedding model, a vector store, and an LLM. The text embedding model generates embeddings from text, and those embeddings can be compared with one another. The vector store keeps the embeddings together with the raw text and can perform similarity search (cosine similarity, Euclidean distance, ...). The LLM generates a response given the __context__ and the user's __question__ as input.
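
To make the similarity search concrete, here is a minimal in-memory sketch using cosine similarity. It assumes hand-written 3-dimensional vectors in place of real embeddings, and `ToyVectorStore` is a hypothetical class for illustration, not the `VectorStore` used later in this notebook.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Keeps (embedding, text) pairs in memory and ranks them by similarity."""

    def __init__(self) -> None:
        self._items: List[Tuple[List[float], str]] = []

    def add(self, embedding: Sequence[float], text: str) -> None:
        self._items.append((list(embedding), text))

    def search(self, query_embedding: Sequence[float], k: int = 2) -> List[str]:
        # Sort all stored items by similarity to the query, highest first.
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(query_embedding, item[0]),
            reverse=True,
        )
        return [text for _, text in ranked[:k]]


store = ToyVectorStore()
# Hand-written vectors stand in for real embeddings.
store.add([1.0, 0.0, 0.0], "profile of a data analytics manager")
store.add([0.0, 1.0, 0.0], "schema of the agent_dim table")
store.add([0.9, 0.1, 0.0], "profile of a data engineer")

print(store.search([1.0, 0.0, 0.0], k=2))
```

A production vector store would use an index rather than sorting every item, but the ranking idea is the same.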

### Prompt Engineering
__TODO:__ Add some prompt templates
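
As a starting point, a RAG prompt template could look like the sketch below. The wording of `RAG_TEMPLATE` and the helper `build_prompt` are assumptions made for illustration. Plain `str.format` is used here as a stand-in for a templating class such as the `ChatPromptTemplate` linked in the references.

```python
from typing import List

# Hypothetical template wording; tune it for the target model.
RAG_TEMPLATE = (
    "Answer the question using only the context below. "
    "If the context does not contain the answer, say you do not know.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(question: str, documents: List[str]) -> str:
    """Number the retrieved documents and fill them into the template."""
    context = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(documents, start=1))
    return RAG_TEMPLATE.format(context=context, question=question)


print(build_prompt("What is agent_num?", ["agent_num (INT) identifies an agent."]))
```

Numbering the documents makes it possible to ask the model to cite which passage it relied on, and the explicit "say you do not know" instruction is one common way to discourage answers that go beyond the retrieved context.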

### References
- https://towardsdatascience.com/retrieval-augmented-generation-rag-from-theory-to-langchain-implementation-4e9bd5f6a4f2
- https://api.python.langchain.com/en/latest/prompts/langchain_core.prompts.chat.ChatPromptTemplate.html

In [1]:
%pip install transformers sentence-transformers langchain-community cohere pymongo langchain-mongodb

Note: you may need to restart the kernel to use updated packages.


In [1]:
%load_ext autoreload
%autoreload 2

In [1]:
import os
import json

config_path = "/home/jovyan/work/computer-science/config.json"
# Read the config once and close the file handle promptly.
with open(config_path) as f:
    config = json.load(f)
os.environ["COHERE_API_KEY"] = config["cohere_api_key"]
os.environ["MONGO_CLUSTER_URI"] = config["mongo_cluster_uri"]

In [2]:
from langchain_core.globals import set_debug
set_debug(False)

### Insert Data

In [3]:
from rag.vector_store import VectorStore

In [4]:
vector_store = VectorStore()

In [15]:
# Add the five profile documents (profile1.txt through profile5.txt).
for i in range(1, 6):
    vector_store.add_document(f"rag/docs/profile{i}.txt")

### Inference

In [6]:
from rag.inference_service import InferenceService

In [7]:
inference_service = InferenceService(vector_store.vector_search)

In [11]:
print(inference_service.invoke("Write SQL to retrieve user").content)

Here's the SQL query to retrieve a user:
```sql
SELECT * FROM user_profile WHERE user_id = [user ID];
```
Replace [user ID] with the actual ID of the user you wish to retrieve. This query will fetch all the columns for the user with the specified ID from the user_profile table.


In [14]:
print(inference_service.invoke("Introduce Pham Phu Tony who is a Data Analytics Manager").content)

Hello! I'd be happy to introduce Pham Phu Tony, who is a Data Analytics Manager. 

Tony is an experienced data analytics manager at Prudential, who is skilled in designing data warehouses for sales analytics purposes. He is also responsible for source code quality control and collaborates with his team to implement system monitoring. Tony leads a team of data engineers in the development of cutting-edge, end-to-end data processing solutions, encompassing automatic maintenance, data backup, retention, and even system rollback and recovery. 

He is a well-rounded and proactive manager when it comes to data analytics!


In [16]:
print(inference_service.invoke("Columns in agent_dim?").content)

The table agent_dim has the following columns: 
1. agent_num (INT)
2. agent_name (VARCHAR(30))
3. date_of_birth (DATETIME)
4. office_code (VARCHAR(5))
5. ga_code (VARCHAR(5))


In [17]:
print(inference_service.invoke("What is agent_num").content)

The agent_num is an integer value that serves as a unique identifier for each row in the agent_dim table. It appears to be a primary key for the table, helping to distinguish and access specific agents' information.
