# Retrieval Augmented Generation (RAG)
We first retrieve relevant documents with a search engine built on our own data, and then have an LLM generate an answer based on those documents.
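As a rough illustration of that retrieve-then-generate flow, here is a minimal sketch that runs without any search library or model. Everything in it is made up for the example: the two FAQ entries, the word-overlap ranking in `toy_retrieve`, and `echo_llm`, which only echoes the prompt back where a real LLM call would go.

```python
# Hypothetical toy FAQ entries; the notebook itself loads the real ones from documents.json.
FAQ = [
    {"question": "Can I still enroll after the start date?",
     "text": "Yes, late enrollment is allowed."},
    {"question": "Where are the lecture videos?",
     "text": "They are published on the course playlist."},
]


def toy_retrieve(query, docs, k=1):
    """Rank docs by how many query words also appear in the question field."""
    words = set(query.lower().split())

    def overlap(doc):
        return len(words & set(doc["question"].lower().split()))

    return sorted(docs, key=overlap, reverse=True)[:k]


def toy_rag(query, generate):
    """Retrieve first, then hand the retrieved text to the generator as context."""
    hits = toy_retrieve(query, FAQ)
    context = "\n".join(doc["text"] for doc in hits)
    prompt = f"QUESTION: {query}\nCONTEXT: {context}"
    return generate(prompt)


def echo_llm(prompt):
    """Stand-in for a real LLM call: it just returns what it was given."""
    return "LLM saw -> " + prompt


print(toy_rag("Can I enroll late?", echo_llm))
```

The rest of the notebook follows the same shape, with minsearch doing the retrieval and a model served by Ollama doing the generation.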

In [60]:
!wget https://raw.githubusercontent.com/alexeygrigorev/minsearch/main/minsearch.py

--2024-09-14 23:44:22--  https://raw.githubusercontent.com/alexeygrigorev/minsearch/main/minsearch.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.111.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3832 (3.7K) [text/plain]
Saving to: ‘minsearch.py.6’


2024-09-14 23:44:22 (34.2 MB/s) - ‘minsearch.py.6’ saved [3832/3832]



In [61]:
import minsearch  # Alexey's small and fast search engine
import requests

from dotenv import load_dotenv
from openai import OpenAI


In [62]:
# set up the LLM client; Ollama must be running locally (it exposes an OpenAI-compatible API)
client = OpenAI(
    base_url='http://localhost:11434/v1/',
    api_key='ollama',
)

## Train search engine

In [63]:
# load json data directly from the url 
docs_url = 'https://github.com/DataTalksClub/llm-zoomcamp/blob/main/01-intro/documents.json?raw=1'
docs_response = requests.get(docs_url)
documents_raw = docs_response.json()

# rearrange data a bit (add course type to each faq)
documents = []
for course_dict in documents_raw:
    for doc in course_dict['documents']:
        doc['course'] = course_dict['course'] #adding it to every faq
        documents.append(doc)

# initialize class, tell the search engine what is searchable and what are keywords
index = minsearch.Index(
    text_fields=['text','section','question'],
    keyword_fields=['course']
)

#actually train the search engine
index.fit(docs=documents)

<minsearch.Index at 0x17bc37970>
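To make the rearranging step easier to picture, here is a small sketch on made-up data. `sample_raw` is a hypothetical stand-in with the same nested layout the loop above relies on (a list of courses, each with a `documents` list); the course names and texts are invented. Unlike the loop above, this variant builds new dicts instead of modifying the loaded ones, which is just one possible choice.

```python
# Hypothetical sample mirroring the nested layout: one entry per course,
# each holding its own list of FAQ documents.
sample_raw = [
    {
        "course": "course-a",
        "documents": [
            {"text": "Answer 1", "section": "General", "question": "Q1"},
            {"text": "Answer 2", "section": "General", "question": "Q2"},
        ],
    },
    {
        "course": "course-b",
        "documents": [
            {"text": "Answer 3", "section": "Setup", "question": "Q3"},
        ],
    },
]

# Flatten to one dict per FAQ entry and copy the course name onto each of them,
# so the course can later be used as a keyword filter.
flat_docs = [
    {**doc, "course": course["course"]}
    for course in sample_raw
    for doc in course["documents"]
]

print(len(flat_docs))  # 3
print(flat_docs[0])
```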

## Generate an answer with the LLM

In [64]:
# helper functions
def search(query,filter_dict={'course': 'data-engineering-zoomcamp'}):
    '''  
    This function runs the already trained search engine and retrieves the top 5 results.
    By default only FAQs from the data-engineering-zoomcamp course are considered.
    '''
    boost = {'question': 3.0, 'section': 0.5} # field weights: matches in 'question' count more, matches in 'section' count less
    results = index.search(
        query=query,
        filter_dict=filter_dict,
        boost_dict=boost,
        num_results=5
    )

    return results
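To give a feel for what the boost weights and the course filter do, here is a toy scorer written from scratch. It is not how minsearch computes its scores; it simply counts matching words per field, multiplies each count by that field's weight, and drops documents whose keyword fields do not match exactly. The function `toy_search`, the default weight of 1.0 for unlisted fields, and the three sample documents are all assumptions made for this illustration.

```python
def toy_search(query, docs, boost, filter_dict, num_results=5):
    """Toy ranking: weighted word-match counts per text field, exact match on keyword fields."""
    words = set(query.lower().split())
    scored = []
    for doc in docs:
        # Keyword fields act as hard filters: any mismatch excludes the document.
        if any(doc.get(field) != value for field, value in filter_dict.items()):
            continue
        score = 0.0
        for field in ("text", "section", "question"):
            matches = len(words & set(doc[field].lower().split()))
            score += boost.get(field, 1.0) * matches  # assumed default weight of 1.0
        scored.append((score, doc))
    # Sort by score only, highest first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:num_results]]


# Hypothetical documents: the same word matches in different fields.
toy_docs = [
    {"course": "c1", "section": "General", "question": "How do I enroll", "text": "Fill in the form"},
    {"course": "c1", "section": "Enroll help", "question": "Where is the form", "text": "See the website"},
    {"course": "c2", "section": "General", "question": "How do I enroll", "text": "Different course"},
]

hits = toy_search(
    "enroll",
    toy_docs,
    boost={"question": 3.0, "section": 0.5},
    filter_dict={"course": "c1"},
)
for hit in hits:
    print(hit["question"])
```

With these weights, the document that matches in `question` ranks above the one that matches only in `section`, and the `c2` document never shows up because of the filter.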

In [65]:
def build_prompt(query, search_results):
    ''' 
    This function starts with a prompt template, and given the query fills the template out with the results from the search engine
    '''
    # we will give the llm some context
    # Alexey mentions that this is a bit of art and science because you somewhat iterate until you find something that works for you.
    prompt_template =  """ 

    You are a course teaching assistant. Answer the QUESTION based on the CONTEXT from the FAQ database. 
    Use only the facts from the CONTEXT when answering the QUESTION.
    If the CONTEXT does not contain the answer, output NONE.

    QUESTION: {question}

    CONTEXT: {context}

    """.strip()  # trim leading and trailing whitespace
    
    # convert search results into a properly formatted context string
    context = ""
    for doc in search_results:
        context = context + f"section: {doc['section']}\nquestion: {doc['question']}\nanswer: {doc['text']}\n\n"


    # fill the template with the question and the context
    return prompt_template.format(question=query,context=context).strip()
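To see what the model actually receives, it helps to render a prompt for a fake search result and print it. The sketch below is a standalone variant, not the function above: `render_prompt`, `TEMPLATE`, and `fake_results` are names invented here. It also uses `textwrap.dedent` so the indentation of the triple-quoted string does not leak into the prompt, which is an optional tweak.

```python
import textwrap

# dedent removes the common leading indentation of the template lines;
# strip then drops the blank lines at both ends.
TEMPLATE = textwrap.dedent("""
    You are a course teaching assistant. Answer the QUESTION based on the CONTEXT from the FAQ database.
    Use only the facts from the CONTEXT when answering the QUESTION.
    If the CONTEXT does not contain the answer, output NONE.

    QUESTION: {question}

    CONTEXT:
    {context}
""").strip()


def render_prompt(query, search_results):
    """Format each search result as a small block and fill the template."""
    context = "\n\n".join(
        f"section: {doc['section']}\nquestion: {doc['question']}\nanswer: {doc['text']}"
        for doc in search_results
    )
    return TEMPLATE.format(question=query, context=context)


# Hypothetical search result with the same keys as the FAQ documents.
fake_results = [
    {"section": "General", "question": "Can I join late?", "text": "Yes."},
]

prompt = render_prompt("Can I still enroll?", fake_results)
print(prompt)
```

Printing the prompt like this is a cheap way to iterate on the template before spending time on model calls.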
    

In [66]:
def llm(prompt,model='phi3'):
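    ''' 
    This function sends our prompt (which already contains the search engine results) to the LLM and returns its reply.
    Nothing is trained here: the model only reads the prompt. The default model is phi3, served locally by Ollama.
    '''
    # send the prompt as a single user message through the OpenAI-compatible chat API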
    response = client.chat.completions.create(
        model = model,
        messages=[{'role':'user','content': prompt}]
    )   

    return response.choices[0].message.content
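Since the prompt template asks the model to output NONE when the context has no answer, the caller may want to detect that case instead of showing the raw reply. The helpers below are one possible sketch; `is_unanswered`, `present`, and the fallback message are hypothetical and not part of the notebook. The cleanup is deliberately lenient because a small model may wrap the sentinel in quotes or add a period.

```python
def is_unanswered(answer):
    """Return True if the reply is just the NONE sentinel requested in the prompt."""
    # Drop surrounding whitespace, then quotes, backticks and periods, then compare case-insensitively.
    cleaned = answer.strip().strip("\"'`.").strip().upper()
    return cleaned == "NONE"


def present(answer, fallback="Sorry, the FAQ does not cover that."):
    """Swap the sentinel for a friendlier message; pass real answers through."""
    return fallback if is_unanswered(answer) else answer.strip()


print(present(" None.\n"))
print(present("Yes, you can still enroll."))
```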

In [67]:
def rag(query):
    '''  
    This function, given a question, finds the best matches in the search engine, passes them to the LLM as context, and returns the answer
    '''
    # search for the question on the search engine
    results = search(query)
    # we create the prompt by basically stringing together the question and the answers from the search engine
    prompt = build_prompt(query, results)
    # we send the prompt to the LLM (in this case a model served by Ollama), which returns a user-friendly answer
    answer = llm(prompt)

    return answer



In [71]:
q = 'The course just started. Can I still enroll? Give me an answer in 1 or 2 sentences please.'
answer = rag(q)
print(answer)

Yes, even if classes have commenced, new students who register before the end of module one can still join without penalties but must complete homework/assignments as specified in course materials by specific deadlines after a brief orientation and introduction to class content during office hours (live or recorded from previous session).
