# RAG with Weaviate and local embedding model

## Overview
In this chapter we will:
1. Replace OpenAI's embedding model with a local one called `nomic-embed-text`.
2. Load the embeddings into a new vector database (with the same structure).
3. Query the database.
4. Pass the result along with the query to the LLM.

We can then check whether the results are any worse than those of the OpenAI embedding model.

### A local embedding model
As we saw in the previous chapter, OpenAI throttles our embedding requests, which slows the process down. The rate appeared to be 5 embeddings per second. Not quick. On top of that, OpenAI charges us for the pleasure.
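
To get a feel for what that rate means in practice, here is a back-of-the-envelope estimate. The chunk count is a made-up figure for illustration, not the size of our corpus, and `estimated_embedding_seconds` is a hypothetical helper.

```python
def estimated_embedding_seconds(num_chunks: int, rate_per_second: float = 5.0) -> float:
    """Rough time needed to embed num_chunks at a fixed throughput."""
    if rate_per_second <= 0:
        raise ValueError("rate_per_second must be positive")
    return num_chunks / rate_per_second


# Hypothetical chunk count, for illustration only
example_chunks = 10_000
minutes = estimated_embedding_seconds(example_chunks) / 60
print(f"{example_chunks} chunks at 5 per second: about {minutes:.0f} minutes")
```

At that rate, 10,000 chunks would take roughly half an hour of waiting, before counting any retries.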

### Getting going
[Ollama](https://ollama.com/) allows you to run LLMs locally. I run on a 6-year-old Linux machine with an ancient AMD GPU, so I am going to see if that's enough to host a small embedding model like Nomic.

To get started, install Ollama and [follow the instructions here](https://ollama.com/library/nomic-embed-text) to pull Nomic using Ollama's command line.

### Create a new database
We will start by creating a new collection/database in Weaviate. It will have the same structure as before, since we will still use LangChain's tools to read and split the PDFs.

In [4]:
import weaviate.classes.config as wc
import weaviate
import os

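# Read the OpenAI API key from the environment.
# Note: this dict is not passed to the connection below, so it is unused in this cell.
headers = {
    "X-OpenAI-Api-Key": os.getenv("OPENAI_API_KEY")
}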

client = weaviate.connect_to_local()

client.collections.create(
    name="ADI_DOCS_TOO",
    properties=[
        wc.Property(name="chunk_content", data_type=wc.DataType.TEXT),
        wc.Property(name="chunk_document_name", data_type=wc.DataType.TEXT),
        wc.Property(name="chunk_document_page", data_type=wc.DataType.INT),
    ],
    # Define the vectorizer module (not invoked for the inserts below, where we pass our own vectors)
    vectorizer_config=wc.Configure.Vectorizer.text2vec_openai(),
    # Define the generative module
    generative_config=wc.Configure.Generative.openai()
)

client.close()

### Extract text from the PDFs
As in the previous chapter, we start by extracting the text from the PDFs.

In [5]:
from langchain.document_loaders.pdf import PyPDFDirectoryLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.schema.document import Document

# load the documents
def load_documents():
    document_loader = PyPDFDirectoryLoader("docs")
    return document_loader.load()

# Split documents into manageable chunks
def split_documents(documents: list[Document]):
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=800,
        chunk_overlap=80,
        length_function=len,
        is_separator_regex=False,
    )
    return text_splitter.split_documents(documents)

documents = load_documents()
chunks = split_documents(documents)
print(chunks[0])

page_content='aBlackfin® A-V EZ-Extender®
Manual
Revision 2.1, July 2012
Part Number
82-000870-01
Analog Devices, Inc.
One T echnology Way
Norwood, Mass. 02062-9106' metadata={'source': 'docs/AV_Blkf_EZ_extender_man_rev.2.1.pdf', 'page': 0}


Because this takes a while, let's save the chunks to a file.

In [6]:
import pickle

with open("docs/text_chunks.pkl", "wb") as file:  # 'wb' means write in binary mode
    pickle.dump(chunks, file)


In [8]:
!ls -alh "docs/text_chunks.pkl"

-rw-r--r-- 1 yuvalzukerman yuvalzukerman 45M Oct 21 19:56 docs/text_chunks.pkl
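
In a later session we can skip the extraction and read the chunks back from that file. A minimal sketch, assuming the same file path as above; `load_or_build` is a hypothetical helper, not part of LangChain. Only unpickle files you created yourself, since `pickle.load` can execute arbitrary code.

```python
import os
import pickle
from typing import Any, Callable


def load_or_build(path: str, build: Callable[[], Any]) -> Any:
    """Return the object cached at `path`, or build it and cache it first."""
    if os.path.exists(path):
        with open(path, "rb") as file:  # 'rb' means read in binary mode
            return pickle.load(file)
    result = build()
    with open(path, "wb") as file:  # 'wb' means write in binary mode
        pickle.dump(result, file)
    return result


# Usage in the notebook (assumes load_documents and split_documents from above):
# chunks = load_or_build("docs/text_chunks.pkl",
#                        lambda: split_documents(load_documents()))
```

The build function only runs when the file is missing, so deleting the pickle is enough to force a fresh extraction.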


### Load the database with our local embedding model
To try things out, let's start by adding a single chunk to Weaviate, using a Nomic embedding generated via Ollama.

In [9]:
from weaviate.util import generate_uuid5
import ollama

try:
    # Connect to Weaviate
    client = weaviate.connect_to_local()
    # Get the collection
    adi_docs = client.collections.get("ADI_DOCS_TOO")

    # Properties stored alongside the vector
    chunk_obj = {
        "chunk_content": chunks[0].page_content,
        "chunk_document_name": chunks[0].metadata['source'],
        "chunk_document_page": chunks[0].metadata['page'],
    }

    # Create a UUID seed from the document name, the page, and a trailing index (0 here)
    cur_doc = chunks[0].metadata['source']
    cur_page = chunks[0].metadata['page']
    seed = cur_doc + ":" + str(cur_page) + ":0"

    # Embed the chunk text locally with Nomic via Ollama
    response = ollama.embeddings(model="nomic-embed-text",
                                 prompt=chunks[0].page_content)
    chunk_vector = response["embedding"]

    # Insert the object together with our own vector
    uuid = adi_docs.data.insert(
        properties=chunk_obj,
        uuid=generate_uuid5(seed),
        vector=chunk_vector
    )

    print(uuid)

finally:
    client.close()

3a0fe015-49b5-55aa-8d72-c1abbbb2b499
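
The seed above ends in a hard-coded `:0`, which is fine for a single chunk. When several chunks come from the same page, each needs its own index, otherwise their seeds (and therefore their UUIDs) would collide. One possible way to number them is sketched below. `build_seeds` is a hypothetical helper, and the file name in the example is made up.

```python
from collections import defaultdict


def build_seeds(chunk_locations: list[tuple[str, int]]) -> list[str]:
    """Return one 'source:page:index' seed per chunk.

    The index counts chunks within the same (source, page) pair, in input order.
    """
    counters: dict[tuple[str, int], int] = defaultdict(int)
    seeds = []
    for source, page in chunk_locations:
        index = counters[(source, page)]
        seeds.append(f"{source}:{page}:{index}")
        counters[(source, page)] += 1
    return seeds


# Two chunks from page 0 and one from page 1 of a made-up file
print(build_seeds([("docs/a.pdf", 0), ("docs/a.pdf", 0), ("docs/a.pdf", 1)]))
```

Each seed can then be passed to `generate_uuid5`, as in the cell above.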


That looks like it worked, but let's verify by searching for the chunk. Since we supplied our own embedding, we also need to embed the query ourselves.

In [11]:
import weaviate.classes.query as wq
from weaviate.classes.query import MetadataQuery

try:
    # Connect to Weaviate
    client = weaviate.connect_to_local()
    # Get the collection
    adi_docs = client.collections.get("ADI_DOCS_TOO")

    # our query
    query="EZ-Extender"

    # Get query embedding
    response = ollama.embeddings(model="nomic-embed-text", 
                                     prompt=query)

    query_vector = response["embedding"]
    
    # Perform query
    response = adi_docs.query.near_vector(
        near_vector = query_vector, 
        limit=5, # maximum number of results
        return_metadata=MetadataQuery(distance=True)
    )

    # Inspect the response
    for o in response.objects:
        print(
            o.properties["chunk_content"], o.uuid
        )  # Print the chunk text and its UUID
        print(
            f"Distance to query: {o.metadata.distance:.3f}\n"
        )  # Print the distance of the object from the query

finally:
    client.close()

aBlackfin® A-V EZ-Extender®
Manual
Revision 2.1, July 2012
Part Number
82-000870-01
Analog Devices, Inc.
One T echnology Way
Norwood, Mass. 02062-9106 3a0fe015-49b5-55aa-8d72-c1abbbb2b499
Distance to query: 0.285
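
For reference, Weaviate's default distance metric is cosine distance. Assuming the collection kept that default, the number above is one minus the cosine similarity between the query vector and the chunk vector. A small pure-Python sketch of that calculation (an illustration, not Weaviate's implementation):

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # same direction
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal
```

A distance of 0 means the vectors point in the same direction, and the value grows as they diverge, so smaller is better when ranking results.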



#### Scaling that to the remaining chunks...
Let's try a simplistic approach:
1. Iterate over the chunk list
2. Create a list of chunk objects (like we did above to hold the text, page, and source document)
3. Create a list of corresponding embeddings with matching position IDs

The goal will be to iterate over the two lists and batch insert the chunks into Weaviate. We will do that in the next step.
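
Steps 1 to 3 could be sketched as below. To keep the example runnable without Ollama or the PDFs, it uses a stand-in `Chunk` class and takes the embedding function as a parameter. `Chunk`, `build_batches`, and `fake_embed` are hypothetical names. In the notebook, `embed` would wrap the `ollama.embeddings` call shown earlier.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Chunk:
    """Stand-in for a LangChain Document (page_content plus metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def build_batches(chunks, embed: Callable[[str], list[float]]):
    """Return (chunk_objs, vectors): two lists aligned by position."""
    chunk_objs, vectors = [], []
    for chunk in chunks:
        chunk_objs.append({
            "chunk_content": chunk.page_content,
            "chunk_document_name": chunk.metadata["source"],
            "chunk_document_page": chunk.metadata["page"],
        })
        vectors.append(embed(chunk.page_content))
    return chunk_objs, vectors


# Toy embedding so the sketch runs offline. In the notebook, something like:
# embed = lambda text: ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"]
def fake_embed(text: str) -> list[float]:
    return [float(len(text))]


demo = [
    Chunk("hello", {"source": "docs/a.pdf", "page": 0}),
    Chunk("hi", {"source": "docs/a.pdf", "page": 1}),
]
objs, vecs = build_batches(demo, fake_embed)
print(len(objs), len(vecs))
```

Appending to both lists inside the same loop iteration is what keeps them aligned, so `objs[i]` always belongs with `vecs[i]`.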