### Generative AI at the Edge

##### This notebook walks through the steps required to load and chunk a document, generate embeddings, and answer user prompts

<font size="4">1. Import relevant libraries</font>

<font size="2">We use LangChain as our main orchestrator.</font>

In [None]:
# LLM and embedding wrappers around llama.cpp
from langchain.llms import LlamaCpp
from langchain.embeddings import LlamaCppEmbeddings
# Document loading and chunking
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
# Vector store and retrieval chain
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA

<font size="4">2. Import relevant Large Language Models</font>

<font size="2">We use a quantized build of Llama 2, meaning its weights are stored at reduced numerical precision so that the model is smaller and cheaper to run. Llama 2 is a family of generative text models: the chat-tuned variants are optimized for assistant-like use cases, while the pretrained models can be adapted to a variety of natural language generation tasks.

This cell can take a few minutes to run because the model file is roughly 4 GB in size.
</font>

In [None]:
print("Creating model...")

model = LlamaCpp(model_path="./models/llama-2-7b.Q4_K_M.gguf", n_threads=4)
embeddings = LlamaCppEmbeddings(model_path="./models/llama-2-7b.Q4_K_M.gguf", n_threads=4)

print("Model created.")
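
To make the effect of quantization concrete, here is a back-of-the-envelope estimate of storage size at different precisions. This is only a sketch: the parameter count is the nominal round figure implied by the "7b" in the file name, and `estimated_size_gb` is a hypothetical helper, not part of LangChain or llama.cpp. Real quantized files mix precisions and carry metadata, so their actual size will differ somewhat.

```python
# Rough estimate of model storage size from parameter count and precision.
# Assumption: 7e9 is a nominal parameter count taken from the "7b" in the
# file name; real quantized files are not exactly this size.

def estimated_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate storage size in gigabytes (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS = 7e9  # nominal figure, for illustration only

fp16_gb = estimated_size_gb(N_PARAMS, 16)  # unquantized half precision
q4_gb = estimated_size_gb(N_PARAMS, 4)     # idealized 4 bits per weight

print(f"16-bit weights: ~{fp16_gb:.1f} GB")
print(f" 4-bit weights: ~{q4_gb:.1f} GB")
```

The roughly fourfold reduction is what makes it practical to run a model of this size on modest edge hardware.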

<font size="4">3. Load in texts</font>

<font size="2">Rather than fine-tuning the model, we give it access to a more specific piece of text. Here we load that text so it can be chunked and turned into embeddings, which are later used to look up passages relevant to a question.
</font>

In [None]:
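text_file = "./data/codereviewer-short.txt"

def load_file(path):
    # if the file extension is .txt, load as text
    if path.endswith(".txt"):
        loader = TextLoader(path)
        documents = loader.load()
    # if the file extension is .pdf, load as pdf
    elif path.endswith(".pdf"):
        # local import keeps the PDF dependency optional
        from langchain.document_loaders import PyPDFLoader
        loader = PyPDFLoader(path)
        documents = loader.load_and_split()
    # fail clearly instead of returning an undefined variable
    else:
        raise ValueError(f"Unsupported file type: {path}")

    return documents

documents = load_file(text_file)
print(documents)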

<font size="4">4. Chunk texts</font>

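<font size="2">We split the loaded documents into chunks of roughly 400 characters, with 50 characters of overlap between neighbouring chunks so that context is not lost at chunk boundaries.
</font>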

In [None]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=400, chunk_overlap=50)
texts = text_splitter.split_documents(documents)
print("text was split")

print(f"Found {len(texts)} chunks")
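
To illustrate what `chunk_size` and `chunk_overlap` mean, the following is a simplified sketch of fixed-window chunking with overlap. It is not how `RecursiveCharacterTextSplitter` works internally (that class prefers to split on separators such as paragraph and line breaks); `chunk_text` is a hypothetical helper written only for this example.

```python
# Simplified fixed-window chunking with overlap (illustration only).

def chunk_text(text: str, chunk_size: int = 400, chunk_overlap: int = 50) -> list[str]:
    """Cut text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

# Synthetic 1000-character input, used only to show the window arithmetic
sample = "".join(str(i % 10) for i in range(1000))
chunks = chunk_text(sample)
print(len(chunks), [len(c) for c in chunks])
```

Each window starts 350 characters after the previous one, so the last 50 characters of one chunk reappear as the first 50 of the next.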

<font size="4">5. Generate Embeddings</font>

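<font size="2">We embed each chunk and store the resulting vectors in a FAISS index, then build a retrieval QA chain that looks up relevant chunks for a question and passes them to the model.
</font>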

In [None]:
docSearch = FAISS.from_documents(texts, embeddings)

qaChain = RetrievalQA.from_chain_type(llm=model, chain_type="stuff", retriever=docSearch.as_retriever())
print(qaChain)
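
Conceptually, the retriever compares the embedding of the question with the embedding of every chunk and returns the closest ones. The sketch below shows that idea with plain Python and hand-made three-dimensional vectors. The vectors, the chunk labels, and the `top_k` helper are all made up for illustration, and cosine similarity is an assumption here: the metric a FAISS index actually uses depends on how it is configured and may be Euclidean distance instead.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def top_k(query_vec, indexed, k=2):
    """Return the k texts whose vectors are most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]

# Toy index: (chunk label, made-up embedding). Real embeddings have
# thousands of dimensions and come from the embedding model.
toy_index = [
    ("chunk about code review goals", [0.9, 0.1, 0.0]),
    ("chunk about deployment", [0.0, 0.2, 0.9]),
    ("chunk about reviewer etiquette", [0.7, 0.6, 0.1]),
]
query = [1.0, 0.2, 0.0]  # made-up embedding of the question

results = top_k(query, toy_index, k=2)
print(results)
```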

<font size="4">6. Run the prompt on the model</font>

<font size="2">First we ask the model directly, without any retrieved context, to get a baseline answer.
</font>

In [None]:
prompt = "What is the goal of a code review?"
response = model(prompt)
print(response)

<font size="4">7. Run the prompt on the model with embeddings</font>

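<font size="2">Now we ask the same question through the retrieval chain, so the model answers with the relevant chunks as context.
</font>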

In [None]:
prompt = "What is the goal of a code review?"
response = qaChain.run(prompt)
print(response)
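
The `"stuff"` chain type places all retrieved chunks into a single prompt alongside the question. The sketch below shows the general shape of that step. The prompt wording, the `build_stuffed_prompt` helper, and the example chunks are hypothetical placeholders, not LangChain's actual template or the contents of the data file.

```python
def build_stuffed_prompt(question: str, chunks: list[str]) -> str:
    """Concatenate retrieved chunks and the question into one prompt string."""
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Placeholder chunks standing in for what a retriever might return
retrieved = [
    "(placeholder chunk 1 returned by the retriever)",
    "(placeholder chunk 2 returned by the retriever)",
]

stuffed = build_stuffed_prompt("What is the goal of a code review?", retrieved)
print(stuffed)
```

Because every retrieved chunk is pasted into the prompt, the chunk size and the number of retrieved chunks together have to fit within the model's context window.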

<font size="4">8. Explore which docs were relevant from the embeddings</font>

<font size="2">We query the retriever directly to see which chunks it returns for the prompt.
</font>

In [None]:
# Retrieve the chunks most similar to the prompt (a vector database could be used here instead of FAISS)
retriever = docSearch.as_retriever()
docs = retriever.get_relevant_documents(prompt)
for doc in docs:
    print("###")
    print(doc.page_content)