# Introduction to 5G RAG

Retrieval Augmented Generation (RAG) is the adding of context from user files to an interaction with an LLM. In this example, the files are PDF files from David Cherney's Introduction to 5G confluence folder in the MSS space. 

## Files to Documents

I have saved the files locally in nested directories. Each file needs to bo leaded and split into smaller documents that are (well) below the token limit for the LLM. That token limit for ChatGPT is 4096, which comes out to roughly 12,000 characters. (based on the estimate that 1 token is about 3 characters.)


In [5]:
from langchain.document_loaders.pdf import PyPDFDirectoryLoader, PyPDFLoader
from dotenv import load_dotenv 
from langchain.text_splitter import CharacterTextSplitter 
import glob

load_dotenv()

# Create a list of all PDF files. 
# Brute force through directories for now.
files = []
files += list(glob.iglob("I5G/*.pdf"))
files += list(glob.iglob("I5G/*/*.pdf"))
files += list(glob.iglob("I5G/*/*.pdf"))
# files

In [6]:
# Load docs one at a time. 
text_splitter = CharacterTextSplitter(
    chunk_size = 500,
    # Separate by sentences as indicated by period marks.  
    separator = ".", 
    chunk_overlap = 0 
)


# Join the Lists of documents from each file.
docs = []
for file in files:
    loader = PyPDFLoader(
        file_path = file 
    )
    docs += loader.load_and_split(
        text_splitter=text_splitter,
    )

Created a chunk of size 802, which is longer than the specified 500
Created a chunk of size 703, which is longer than the specified 500
Created a chunk of size 789, which is longer than the specified 500
Created a chunk of size 739, which is longer than the specified 500
Created a chunk of size 725, which is longer than the specified 500
Created a chunk of size 788, which is longer than the specified 500
Created a chunk of size 518, which is longer than the specified 500
Created a chunk of size 552, which is longer than the specified 500
Created a chunk of size 518, which is longer than the specified 500
Created a chunk of size 552, which is longer than the specified 500


In [8]:
# Check that the list is not empty
for doc in docs[:3]:
    print(doc.metadata)

{'source': 'I5G/6 Slicing.pdf', 'page': 0}
{'source': 'I5G/6 Slicing.pdf', 'page': 1}
{'source': 'I5G/6 Slicing.pdf', 'page': 1}


## Vector Store

A function maps the documents to vectors. The vector store is a database whose entries have the following features
- id (under the key `ids`)
- metadata about source file and page number (under the key `metadatas`)
- document as a string (under the key `documents`)
- the vector (undr the unfortunate key `embeddings`). 
- `uris` which I'm not using
- `data` which I'm not using

In [13]:
db_dir = "I5GVecStore"

In [12]:
# import os 
# os.system(f"rm -r {db_dir}")

0

In [14]:
from langchain.vectorstores.chroma import Chroma
# from langchain.embeddings import OpenAIEmbeddings # depricated
from langchain_openai import OpenAIEmbeddings


embeddings = OpenAIEmbeddings()
db = Chroma.from_documents(
    documents = docs,
    embedding=embeddings,
    # Name the directory for the vector store.
    persist_directory=db_dir
)

In [15]:
print(f"The databse is of type {type(db)} \n" 
      f"and has {db._collection.count()} entries.")

The databse is of type <class 'langchain_community.vectorstores.chroma.Chroma'> 
and has 721 entries.


# Natural language response

In [17]:
from langchain.chains import RetrievalQA 
# from langchain.chat_models import ChatOpenAI # depricated
from langchain_openai import ChatOpenAI

chat = ChatOpenAI()
retriever = db.as_retriever()

chain = RetrievalQA.from_chain_type( #will this accept a memory?
    llm=chat,
    retriever=retriever,
    # This is the most basic chain type. 
    chain_type="stuff" 
    #chain_type="map_reduce" # takes top 4 similar documents and individually stuffs them into a SystemMessagePromptTemplate, then takes results of all into another template for ChatGPT.
    #map_rerank # for each of the top 4 similar documents, asks LLM to generate answer and a score for how completely the answer from the document answers the question/use query.
    #refine # for each 4 similar documents, in series (parallel above 2), ask LLM to refine answer by considering the document and its previous answer. 
)

query = "Are there type of connections to 5G other than new radio (5G NR)?"
result = chain.run(query)



In [18]:
from textwrap import wrap 

for line in wrap(result, width=75):
    print(line)

Yes, there are types of connections to 5G other than new radio (5G NR). In
addition to 5G NR, there are also wireline connections that apply to both
5G capable and non-standalone 5G core (N5GC) user equipment (UE). These
wireline connections include wireline 5G broadband access network (W-5GBAN)
and wireline 5G cable access network (W-5GCAN). These wireline connections
provide connectivity through different access points and facilitate service
continuity when the UE moves.


In [19]:
type(chat)

langchain_openai.chat_models.base.ChatOpenAI

## PromptTemplate + Memory

I'd like to add memory and a prompt to be able to say something like "you are an expert in 5G."

RetrievalQA does not have a parameter `prompt`. Thus, I suspect that to pass context like "you are helpful" one must use extra inputs.

However, Adding a context field with the query does not seem to change the way the LLM responds. 

In [106]:
from langchain.chains import RetrievalQA 
# from langchain.chat_models import ChatOpenAI # depricated
from langchain_openai import ChatOpenAI

chat = ChatOpenAI()
our_retriever = db.as_retriever() # does not accept a prompt argument. Thus, context.

chain = RetrievalQA.from_chain_type(
    llm=chat,
    retriever=retriever,
    chain_type="stuff",
    chain_type_kwargs={ 
        # "verbose": True,
    }
)

query = "Are there type of connections to 5G other than new radio (5G NR)?"
context = "You are a pirate that tells jokes."
def run_our_chain(query):
    our_chain = chain(inputs = {
        "context:":context, 
        "query": query
        })
    
    for k,v in our_chain.items():
        print(f"{k}")
        print(f">> {v}\n")
    return None

In [105]:
run_our_chain("How are electromagnetic waves related to radio waves?")

context:
>> You are a pirate that tells jokes.

query
>> How are electromagnetic waves related to radio waves?

result
>> Radio waves are a type of electromagnetic wave. Electromagnetic waves encompass a wide range of frequencies and include radio waves, microwaves, infrared radiation, visible light, ultraviolet radiation, X-rays, and gamma rays. The main difference between these waves lies in their frequency and wavelength. Radio waves have the lowest frequency and longest wavelength among the electromagnetic spectrum, while gamma rays have the highest frequency and shortest wavelength. So, in summary, radio waves are a specific type of electromagnetic wave.



I suspect that RetrievalQA is not conversational and so does not have a prompt capacity.

### prompt in ConversationalRetrievalChain

I see people using prompts in this kind of chain. 

In [153]:
from langchain.prompts import PromptTemplate
from langchain.chains import ConversationalRetrievalChain

chat = ChatOpenAI()
our_retriever = db.as_retriever() 

# custom_prompt = PromptTemplate(
#     input_variables=["context","question"], 
#     template="{context}. So answer this question: {question}"
# )

prompt_template = """You are a chatbot that responds 'no' to all questions.

{chat_history}
Human: {question}
Chatbot:"""

custom_prompt = PromptTemplate.from_template(
            # input_variables=["chat_history","question"], # complains that two tings were passed for input variables?!
            template=prompt_template
)
    
memory = ConversationBufferMemory(
    memory_key='chat_history', return_messages=True, output_key='answer'
)

conversational_chain = ConversationalRetrievalChain.from_llm(
    llm=chat, 
    memory=memory,
    retriever=our_retriever, 
    # condense_question_prompt=custom_prompt # this seems to make memory not function
    # prompt = custom_prompt # there is no parameter `prompt` in CRC
    # return_source_documents=True
)

for k,v in conversational_chain(inputs = {
    # "context" : "You always answer every question by saying 'why?'", 
    # "question":"What is 5G?"}).items():
    "question":"What came before it?"}).items():    
        print(k)
        print(f">> {v}\n" )

question
>> What came before it?

chat_history
>> [HumanMessage(content='What came before it?'), AIMessage(content='Before 5G, there were several previous generations of mobile communication technology. These generations are commonly referred to as 1G, 2G, 3G, and 4G. Each generation brought advancements in terms of data speeds, network capacity, and communication capabilities.')]

answer
>> Before 5G, there were several previous generations of mobile communication technology. These generations are commonly referred to as 1G, 2G, 3G, and 4G. Each generation brought advancements in terms of data speeds, network capacity, and communication capabilities.



In [154]:
for k,v in conversational_chain(inputs = {
    "question":"What came before it?"}).items():
    print(k)
    print(f">> {v}\n" )

question
>> What came before it?

chat_history
>> [HumanMessage(content='What came before it?'), AIMessage(content='Before 5G, there were several previous generations of mobile communication technology. These generations are commonly referred to as 1G, 2G, 3G, and 4G. Each generation brought advancements in terms of data speeds, network capacity, and communication capabilities.'), HumanMessage(content='What came before it?'), AIMessage(content='The previous generations of mobile communication technology before 5G are as follows:\n\n1G: Analog Voice Technology, introduced in 1973.\n2G: Digital voice technology, introduced in 1991.\n3G: Internet Access, introduced in 1998, which enabled mobile web browsing.\n4G: Broadband Internet Access, introduced in 2009, which enabled smartphone features like streaming video.')]

answer
>> The previous generations of mobile communication technology before 5G are as follows:

1G: Analog Voice Technology, introduced in 1973.
2G: Digital voice technolog

In [70]:
our_memory.clear