# RAG with LangChain and Llama
<hr style="border:2px solid black">

## 1. Introduction

#### Limitations of LLMs

- know nothing outside training data, e.g., up-to-date information, classified/private data
- not specialized in specific use cases
- tend to hallucinate confidently, possibly leading to misinformation
- produce black box output: do not clarify what has led to the generation of particular content

#### Fine Tuning

- enhances model performance for specific use case through Transfer Learning with additional data
- changes model parameters, enhancing speed and reducing cost for specific task
- powerful tool for:
  + incorporating non-dynamic or past data
  + specific industries with nuances in writing style, vocabulary, terminology
- knowledge cutoff persists: the model still lacks up-to-date information

#### Retrieval Augmented Generation (RAG)

- increases model capabilities through:
  + **retrieving** external and up-to-date information
  + **augmenting** the original prompt given to the model
  + **generating** response using context plus information
- base LLM parameters remain unchanged (no Transfer Learning)
- powerful tool for making use of dynamic up-to-date information
- white box output: answers can be traced back to the retrieved sources, which adds transparency and reduces (but does not eliminate) hallucination
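
The three RAG steps listed above can be sketched in plain Python. This is a toy illustration only: `bag_of_words` stands in for a real embedding model, `echo_llm` stands in for a real LLM, and the two sample sentences are made up for the example. All function names here are hypothetical.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Lowercase word counts; a crude stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, documents, k=1):
    """Step 1: pick the k documents most similar to the question."""
    q = bag_of_words(question)
    ranked = sorted(documents, key=lambda d: cosine(q, bag_of_words(d)), reverse=True)
    return ranked[:k]

def augment(question, context_docs):
    """Step 2: put the retrieved text into the prompt."""
    context = "\n".join(context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

def generate(prompt, llm):
    """Step 3: hand the augmented prompt to the model."""
    return llm(prompt)

def echo_llm(prompt):
    """Hypothetical placeholder for a real LLM call."""
    return f"[model reply to {len(prompt)} characters of prompt]"

# made-up sample documents
documents = [
    "The iPhone 6 battery is held in place with adhesive strips.",
    "The MacBook Air uses pentalobe screws on the lower case.",
]
question = "How is the iPhone 6 battery attached?"

top_docs = retrieve(question, documents)
prompt = augment(question, top_docs)
answer = generate(prompt, echo_llm)
print(prompt)
print(answer)
```

The model parameters are never touched; only the prompt changes, which is the key difference from fine tuning.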

#### RAG Framework

<img src="../../images/api_test/rag.png" width="950"/>


### Technology Stack

#### [LangChain](https://python.langchain.com/docs/introduction/)

> framework for developing applications powered by LLMs

#### [FAISS (Facebook AI Similarity Search)](https://ai.meta.com/tools/faiss/)

> library for storing embedding vectors in a vector index and running similarity search over them

#### [Groq](https://groq.com/about-us/)

> cloud service providing fast AI inference (running a trained model on new input data)

<hr style="border:2px solid black">

## 2. Warm Up

#### load credentials

In [57]:
from dotenv import load_dotenv
import os
from langchain_community.document_loaders import IFixitLoader
load_dotenv()

True

In [58]:
groq_key = os.getenv('GROQ_KEY')
USER_AGENT = os.getenv("USER_AGENT")

#### define llm

In [59]:
import warnings
warnings.filterwarnings("ignore")
from langchain_groq import ChatGroq

llm = ChatGroq(
    model="llama3-8b-8192",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
    api_key=groq_key
)

#### define prompt template

**What is a Prompt?**
>- set of instructions or input for an LLM provided by a user to guide its response
>- helps it understand the context and generate relevant and coherent language-based output

In [60]:
from langchain.prompts.prompt import PromptTemplate

In [61]:
query = """
    given the information {information} about a device to fix:
    1. Name the device, make, model and problem
    2. two steps of repairing
    """

In [62]:
prompt_template = PromptTemplate(
    input_variables=["information"],
    template=query
)

#### define Chain

**What is a Chain?**

> - allows the output of one LLM call to be used as the input of another

In [63]:
chain = prompt_template | llm

**Note:**
The `|` symbol chains together the different components, feeding the output from one component as input into the next component.
In this chain the user input is passed to the prompt template, then the prompt template output is passed to the model. 
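
To make the `|` behaviour concrete, here is a minimal sketch of how such an operator can be built in plain Python with `__or__`. It is conceptually similar to what the note describes, but it is not LangChain's implementation; the `Step` class, `fill_prompt` and `fake_llm` are hypothetical names used only for illustration.

```python
class Step:
    """Wraps a function so that steps can be chained with `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # a | b builds a new Step that runs a, then feeds its result to b
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)

# stand-in for a prompt template: fills {information} from the input dict
fill_prompt = Step(lambda inputs: "Describe the repair for: {information}".format(**inputs))
# stand-in for a model: just upper-cases the prompt it receives
fake_llm = Step(lambda prompt: prompt.upper())

toy_chain = fill_prompt | fake_llm
result = toy_chain.invoke({"information": "iPhone 6"})
print(result)
```

The input dictionary goes into the first step, and each later step only ever sees the output of the step before it.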

#### invoke Chain

In [None]:
# this is the point where we have to replace the URL with a device information data set that can be queried
# or pose a question to the chatbot that leads to the replacement of this URL


# loader = IFixitLoader(
#     "https://www.ifixit.com/Answers/View/318583/My+iPhone+6+is+typing+and+opening+apps+by+itself"
# )
# data = loader.load()

In [103]:
data = IFixitLoader.load_suggestions("iPhone 6", doc_type = 'guide')
data

[Document(metadata={'source': 'https://www.ifixit.com/Guide/iPhone+6+Battery+Replacement/29363', 'title': 'iPhone 6 Battery Replacement'}, page_content="# iPhone 6 Battery Replacement\nBring life back to your iPhone 6 with [product|IF268-002|a new replacement battery|new_window=true]—it’s easy and will have a big impact! If your battery is swollen, [[What to do with a swollen battery|take appropriate precautions|new_window=true]].\n\nThis guide instructs you to remove the front panel assembly; this is intended to prevent damage to the display cables. If you feel comfortable supporting the display carefully while peeling the battery out of the iPhone, you can skip the display removal and go directly to the battery removal steps.\n\n***For optimal performance, after completing this guide, [[Battery Calibration|calibrate]] your newly installed battery:*** Charge it to 100% and keep charging it for at least two more hours. Then use your iPhone until it shuts off due to low battery. Finally

In [None]:
# output = chain.invoke(input={"information": data})

In [88]:
# print(output.content)

In [89]:
# output = chain.invoke(input={"information": data})
# print(output.content) # Interactive query not possible at this stage

In [92]:
# Library
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter

def split_documents(documents, chunk_size=800, chunk_overlap=80):  # tune chunk size and overlap for our purpose
    """
    this function splits documents into chunks of given size and overlap
    """
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap
    )
    chunks = text_splitter.split_documents(documents=documents)
    return chunks
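
What `chunk_size` and `chunk_overlap` mean can be shown with a simplified sliding-window splitter. This is a sketch under a strong assumption: it cuts at fixed character positions, whereas `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraphs, newlines and spaces. The function name `sliding_window_chunks` is hypothetical.

```python
def sliding_window_chunks(text, chunk_size=800, chunk_overlap=80):
    """Cut text into pieces of chunk_size characters; neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
pieces = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=3)
print(pieces)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

The overlap means a sentence cut at a chunk boundary still appears in full in at least one neighbouring chunk, at the cost of storing some text twice.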

In [98]:
# split into chunks
data_chunks = split_documents(data)
data_chunks

[Document(metadata={'source': 'https://www.ifixit.com/Guide/iPhone+6+Battery+Replacement/29363', 'title': 'iPhone 6 Battery Replacement'}, page_content='# iPhone 6 Battery Replacement\nBring life back to your iPhone 6 with [product|IF268-002|a new replacement battery|new_window=true]—it’s easy and will have a big impact! If your battery is swollen, [[What to do with a swollen battery|take appropriate precautions|new_window=true]].\n\nThis guide instructs you to remove the front panel assembly; this is intended to prevent damage to the display cables. If you feel comfortable supporting the display carefully while peeling the battery out of the iPhone, you can skip the display removal and go directly to the battery removal steps.'),
 Document(metadata={'source': 'https://www.ifixit.com/Guide/iPhone+6+Battery+Replacement/29363', 'title': 'iPhone 6 Battery Replacement'}, page_content='***For optimal performance, after completing this guide, [[Battery Calibration|calibrate]] your newly inst

In [99]:
print(f"number of chunks created: {len(data_chunks)}")

number of chunks created: 182


In [114]:
### converting the chunks to vectors

from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
import os

def create_embedding_vector_db(chunks, db_name, target_directory="../vector_databases"):
    """
    this function uses the open-source embedding model HuggingFaceEmbeddings 
    to create embeddings and store those in a vector database called FAISS, 
    which allows for efficient similarity search
    """
    # instantiate embedding model
    embedding = HuggingFaceEmbeddings(
        model_name='sentence-transformers/all-mpnet-base-v2' # embedding model converts text to vectors (use the same model when loading the database)
    )
    # create the vector store 
    vectorstore = FAISS.from_documents( # stores embeddings
        documents=chunks,
        embedding=embedding
    )
    # save vector database locally
    os.makedirs(target_directory, exist_ok=True)
    vectorstore.save_local(f"{target_directory}/{db_name}_vector_db")
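
The idea behind similarity search over stored embeddings can be sketched with a brute-force nearest-neighbour lookup. Everything here is an assumption for illustration: the 3-dimensional vectors are invented (a real embedding model produces far longer vectors), Euclidean distance is just one possible metric, and FAISS uses optimized index structures rather than this linear scan.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(query_vector, stored, k=2):
    """Return the k (distance, text) pairs closest to query_vector."""
    scored = [(l2_distance(query_vector, vector), text) for text, vector in stored]
    return sorted(scored)[:k]

# made-up chunk texts with made-up embedding vectors
stored = [
    ("battery replacement", [0.9, 0.1, 0.0]),
    ("screen replacement", [0.1, 0.9, 0.0]),
    ("speaker repair", [0.0, 0.2, 0.9]),
]

hits = nearest([0.8, 0.2, 0.1], stored, k=2)
for distance, text in hits:
    print(f"{distance:.3f}  {text}")
```

Chunks whose vectors lie closest to the query vector are treated as the most relevant context for the question.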

In [115]:
create_embedding_vector_db(chunks=data_chunks, db_name="test")

# it's now created and available for next time, because it's saved locally

# retrieve from the vector database when continuing your work (see below)

In [109]:
# Retrieve from Vector Database

from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains.retrieval import create_retrieval_chain
from langchain import hub

def retrieve_from_vector_db(vector_db_path):
    """
    this function returns a retriever object from a local vector database
    """
    # instantiate embedding model
    embeddings = HuggingFaceEmbeddings(
        model_name='sentence-transformers/all-mpnet-base-v2'
    )
    react_vectorstore = FAISS.load_local(
        folder_path=vector_db_path,
        embeddings=embeddings,
        allow_dangerous_deserialization=True # required because part of the saved index is a pickle file; only enable for files you trust
    )
    retriever = react_vectorstore.as_retriever()
    return retriever

def connect_chains(retriever):
    """
    this function connects stuff_documents_chain with retrieval_chain
    """
    stuff_documents_chain = create_stuff_documents_chain(
        llm=llm,
        prompt=hub.pull("langchain-ai/retrieval-qa-chat")
    )
    retrieval_chain = create_retrieval_chain(
        retriever=retriever,
        combine_docs_chain=stuff_documents_chain
    )
    return retrieval_chain
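
The "stuff" strategy simply places all retrieved chunks into a single prompt. The sketch below shows that idea with plain dictionaries; the prompt wording, the `stuff_documents` helper and the sample chunks are hypothetical and do not reproduce the prompt pulled from the hub.

```python
def stuff_documents(docs, question):
    """Join every retrieved chunk into one context block, then append the question."""
    context = "\n\n".join(doc["page_content"] for doc in docs)
    return f"Context:\n{context}\n\nQuestion: {question}"

# made-up retrieved chunks
retrieved = [
    {"page_content": "Remove the two pentalobe screws."},
    {"page_content": "Lift the front panel carefully."},
]

stuffed_prompt = stuff_documents(retrieved, "How do I open the phone?")
print(stuffed_prompt)
```

Because every chunk ends up in one prompt, the number and size of retrieved chunks must fit within the model's context window.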



In [110]:
test_retriever = retrieve_from_vector_db("../vector_databases/test_vector_db")

In [111]:
# generation
test_retrieval_chain = connect_chains(test_retriever)
test_retrieval_chain

RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableBinding(bound=RunnableLambda(lambda x: x['input'])
           | VectorStoreRetriever(tags=['FAISS', 'HuggingFaceEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x00000163000F9550>, search_kwargs={}), kwargs={}, config={'run_name': 'retrieve_documents'}, config_factories=[])
})
| RunnableAssign(mapper={
    answer: RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
              context: RunnableLambda(format_docs)
            }), kwargs={}, config={'run_name': 'format_inputs'}, config_factories=[])
            | ChatPromptTemplate(input_variables=['context', 'input'], optional_variables=['chat_history'], input_types={'chat_history': list[typing.Annotated[typing.Union[typing.Annotated[langchain_core.messages.ai.AIMessage, Tag(tag='ai')], typing.Annotated[langchain_core.messages.human.HumanMessage, Tag(tag='human')], typing.Annotated[langchain_core.messages.chat.ChatMessage

In [112]:
def print_output(
    inquiry,
    retrieval_chain=test_retrieval_chain
):
    result = retrieval_chain.invoke({"input": inquiry})
    print(result['answer'].strip("\n"))

In [113]:
print_output("give me a step by step guide to repair my iPhone 6. Only give me 5 steps")

Based on the provided context, here's a 5-step guide to repair your iPhone 6:

**Step 1: Disassemble the iPhone**
Power off your iPhone and discharge the battery below 25%. Remove the two 3.6 mm P2 Pentalobe screws next to the Lightning connector.

**Step 2: Open the iPhone**
Open the iPhone by swinging the home button end of the front panel assembly away from the rear case, using the top of the phone as a hinge. Several clips along the top edge of the front panel form a partial hinge. During reassembly, align the clips just below the top edge of the rear case. Then, slide the front panel upward until its top edge is flush with that of the rear case.

**Step 3: Prop up the Display**
Open the display to about a 90º angle, and lean it against something to keep it propped up while you're working on the phone. You can use an unopened canned beverage to hold it in place. Add a rubber band to keep the display securely in place while you work.

**Step 4: Remove the Old Screen**
Remove the old

<hr style="border:2px solid black">