## Operating a RAG pipeline using locally hosted small language models

A small language model (SLM) in this case is defined as a language model that is less than 10B parameters.  Such models usually take up about 5GB of disk space and can be executed locally on any reasonable spec machine.

In this notebook, we set up the following RAG architecture with the NY Times comment set using a local ChromaDB vector database to handle the vector embeddings of the comment set, and using a few options for SLMs involved using ollama. 

![Simple RAG Architecture](../rag_architecture.jpg) 

Note that to use ollama you need to install it on your machine.  See [the ollama website](https://ollama.com/) for details.  This workflow also assumes that you have already created the Chroma vector database (see the `chromadb_prep` folder for instructions).

In [2]:
# packages
from langchain_community.llms import Ollama
import chromadb
import pandas as pd

# location of pre-built local chromadb
CHROMA_DATA_PATH = "../chroma_data/"
collection_db="article_comments"

# chroma client 
chroma_client = chromadb.PersistentClient(path=CHROMA_DATA_PATH)

# SLM options via Ollama
mistral = Ollama(model="mistral")
llama3 = Ollama(model="llama3:70b")
gemma = Ollama(model="gemma")

# helper function for prompt construction
def construct_prompt(docs: dict, question: str) -> str:
    # convert the docs into a numbered list of comments
    results_df = pd.DataFrame(docs['documents']).transpose()
    results_df.columns = ['Comment']
    results_df['ComNum'] = [str(i) for i in range(1, len(results_df) + 1)]
    results_df['Numbered Comments'] = results_df['ComNum'] + '. ' + results_df['Comment']

    # Collect the results in a context
    context = "\n".join([r for r in results_df['Numbered Comments']])

    # construct prompt
    prompt = f"""
        Answer the following question: {question}.  
        Refer only to the following numbered list of comments from NY Times readers when answering: {context}.
        Check each numbered comment very carefully and ignore it if it does not contain language that is a close match to the original question.
        Provide as much information as possible in the summary, subject to the conditions already given.
        Begin your answer with 'Based on the responses from selected NY Times readers', and try to give a sense of majority and minority opinions on the topic, but only if there is an identifiable majority opinion.
        If there is not enough information provided to give a summarized opinion, indicate that this is the case.
        """

    return prompt

# RAG pipeline function
def ask_question_local(question:str, llm: Ollama() = llama3, 
                       collection: chromadb.PersistentClient() = collection_db, n_docs:int = 50, 
                       filters: dict ={}) -> str:
    
    # Find close documents in chromadb
    collection = chroma_client.get_collection(collection)
    results = collection.query(
       query_texts=[question],
       n_results=n_docs,
       where=filters
    )

    prompt = construct_prompt(results, question)
    
    # generate response
    print(llm.invoke(prompt))

In [4]:
# test function
ask_question_local("What do readers think about US foreign policy towards North Korea?", llm = llama3, n_docs = 100)

Based on the responses from selected NY Times readers, many are skeptical about the effectiveness of US foreign policy towards North Korea. A significant number of respondents believe that North Korea's leadership will never willingly give up its nuclear weapons or affiliated programs, citing the regime's existential need for a nuclear capability and its history of belligerence.

Some readers suggest that the onus is on the rest of the civilized world to contain North Korea through means such as developing anti-missile systems, squeezing Pyongyang financially, humiliating its leaders with sanctions, and developing cyber capabilities. A few respondents also propose more drastic measures, including assassinating NK scientists associated with the missile program and sending Navy SEALs to find secret tunnels and labs.

However, a minority opinion suggests that South Korea appears to be taking a more level-headed approach to negotiations, with some readers praising the country's leadership 