# RAG with Weaviate

A second attempt at a program that populates Weaviate using a different embedding model and then retrieves data from it. The embedding model I want to use is bge-base.

In [1]:
import dotenv
import weaviate
from weaviate.embedded import EmbeddedOptions
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Weaviate
# from langchain_community.embeddings.azure_openai import AzureOpenAIEmbeddings
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
from langchain_community.chat_models import AzureChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser


In [2]:
# Load the speech as LangChain documents. TextLoader reads the file itself,
# so no separate open()/read() is needed.
loader = TextLoader('./state_of_the_union.txt')
documents = loader.load()

In [3]:
dotenv.load_dotenv()

True

In [4]:
content = dotenv.dotenv_values()


In [5]:
azure_open_ai_key = content['AZURE_OPENAI_KEY']
azure_open_ai_endpoint = content['AZURE_OPENAI_ENDPOINT']   
azure_open_ai_api_version = content['AZURE_OPENAI_API_VERSION']

In [6]:
splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(documents)

In [None]:
# for chunk in chunks:
#     print(chunk)
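
To make the effect of `chunk_size=500` and `chunk_overlap=50` concrete, here is a minimal pure-Python sketch of fixed-size chunking with overlap. It is not the algorithm `CharacterTextSplitter` uses (that class splits on a separator first and then merges the pieces, so its chunk sizes are targets rather than hard limits); `chunk_text` and `sample` are hypothetical names introduced only for illustration.

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


sample = "abcdefghij" * 120  # 1200 characters of stand-in text
pieces = chunk_text(sample)
print(len(pieces), [len(p) for p in pieces])
```

The last 50 characters of each window reappear at the start of the next one, which keeps a sentence that straddles a boundary retrievable from at least one chunk.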

Connect to Weaviate. The first call creates the data folder and files such as:
- classification.db
- migration1.19.filter2search.skip.flag
- migration1.19.filter2search.state
- migration1.22.fs.hierarchy
- modules.db
- schema.db
- tx.db

In [8]:
client = weaviate.Client(embedded_options=EmbeddedOptions(persistence_data_path='./my_db/weaviate_data'))

Started /home/alois/.cache/weaviate-embedded: process ID 96938


            Consider upgrading to the new and improved v4 client instead!
            See here for usage: https://weaviate.io/developers/weaviate/client-libraries/python
            
{"action":"config_load","config_file_path":"./weaviate.conf.json","level":"info","msg":"Usage of the weaviate.conf.json file is deprecated and will be removed in the future. Please use environment variables.","time":"2024-05-10T16:53:32+02:00"}
{"action":"startup","default_vectorizer_module":"none","level":"info","msg":"the default vectorizer modules is set to \"none\", as a result all new schema classes without an explicit vectorizer setting, will use this vectorizer","time":"2024-05-10T16:53:32+02:00"}
{"action":"startup","auto_schema_enabled":true,"level":"info","msg":"auto schema enabled setting is set to \"true\"","time":"2024-05-10T16:53:32+02:00"}
{"level":"info","msg":"No resource limits set, weaviate will use all available memory and CPU. To limit resources, set LIMIT_RESOURCES=true","time":"2024-
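
To check what the embedded instance actually wrote to disk, a small standard-library sketch like the following can list the files below the persistence folder. The helper name `list_data_files` is hypothetical; the path is the `persistence_data_path` passed to `EmbeddedOptions` above.

```python
from pathlib import Path


def list_data_files(data_path: str) -> list[str]:
    """Return the relative paths of all files below data_path, sorted."""
    root = Path(data_path)
    if not root.is_dir():
        return []  # folder not created yet, e.g. before the first client start
    return sorted(str(p.relative_to(root)) for p in root.rglob("*") if p.is_file())


# Path taken from the EmbeddedOptions call above
for name in list_data_files("./my_db/weaviate_data"):
    print(name)
```

Running it once before and once after the import step shows which files and folders the import added.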

Now store the chunks in Weaviate. This creates a folder:
- ./data/langchain_53625abcc2aa4ababde0b100bf9d69b8

That folder in turn contains many further folders and files.

In [10]:
vector_store = Weaviate.from_documents(
    client=client,
    documents=chunks,
    # The main change here is to replace the embedding with HuggingFaceBgeEmbeddings,
    # which also requires a slight change in the import statement.
    # model_name is set explicitly because the class default is BAAI/bge-large-en, not bge-base.
    embedding=HuggingFaceBgeEmbeddings(model_name="BAAI/bge-base-en"),
    by_text=False,
)

  from .autonotebook import tqdm as notebook_tqdm
{"level":"info","msg":"Created shard langchain_24cbd8e54e6549c19fc80cc9e218c527_958lqiD5RYSo in 1.0791ms","time":"2024-05-10T16:53:38+02:00"}
{"action":"hnsw_vector_cache_prefill","count":1000,"index_id":"main","level":"info","limit":1000000000000,"msg":"prefilled vector cache","time":"2024-05-10T16:53:38+02:00","took":59400}


In [12]:
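# client2 = weaviate.WeaviateClient()
# This expects a standalone Weaviate server on localhost:8080. If none is running
# (the embedded instance above is reached through `client`), it raises a ConnectionError.
client2 = weaviate.Client("http://localhost:8080")

for chunk in chunks:
    # Store each chunk as an object; properties must be a plain dict,
    # so pass the chunk's text and metadata rather than the Document itself
    client2.data_object.create(
        data_object={"text": chunk.page_content, **chunk.metadata},
        class_name="YourClassName",  # Replace with your class name
    )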

            Consider upgrading to the new and improved v4 client instead!
            See here for usage: https://weaviate.io/developers/weaviate/client-libraries/python
            


ConnectionError: HTTPConnectionPool(host='localhost', port=8080): Max retries exceeded with url: /v1/.well-known/openid-configuration (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f5c499c74d0>: Failed to establish a new connection: [Errno 111] Connection refused'))

## Step 1: Retrieve
Once the vector database is populated, you can define it as the retriever component, which fetches the additional context based on the semantic similarity between the user query and the embedded chunks.

In [11]:
retriever = vector_store.as_retriever()
retriever


VectorStoreRetriever(tags=['Weaviate', 'HuggingFaceBgeEmbeddings'], vectorstore=<langchain_community.vectorstores.weaviate.Weaviate object at 0x7ffa507ca3f0>)
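
Conceptually, retrieval by semantic similarity ranks the stored chunk vectors by how close they are to the query vector. The sketch below shows that idea with cosine similarity in plain Python. The three-dimensional vectors and chunk labels are made-up stand-ins, not bge embeddings, and `top_k` is a hypothetical helper rather than part of LangChain or Weaviate.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], store: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the texts of the k stored chunks most similar to the query vector."""
    ranked = sorted(store, key=lambda item: cosine_similarity(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy store: (chunk text, embedding) pairs with invented vectors
store = [
    ("chunk about police officers", [0.9, 0.1, 0.0]),
    ("chunk about the economy", [0.1, 0.9, 0.1]),
    ("chunk about health care", [0.0, 0.2, 0.9]),
]
query_vec = [0.8, 0.2, 0.1]  # stand-in for the embedded user query
print(top_k(query_vec, store, k=2))
```

A real vector database avoids scoring every stored vector by using an approximate index such as HNSW, which is what the `hnsw_vector_cache_prefill` log line above refers to.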

## Step 2: Augment

Next, to augment the prompt with the additional context, you need to prepare a prompt template. The prompt can be easily customized from a prompt template, as shown below.

In [15]:

template = """You are an assistant for question-answering tasks. 
Use the following pieces of retrieved context to answer the question. 
If you don't know the answer, just say that you don't know. 
Use three sentences maximum and keep the answer concise.
Question: {question} 
Context: {context} 
Answer:
"""
prompt = ChatPromptTemplate.from_template(template)

print(prompt)

input_variables=['context', 'question'] messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template="You are an assistant for question-answering tasks. \nUse the following pieces of retrieved context to answer the question. \nIf you don't know the answer, just say that you don't know. \nUse three sentences maximum and keep the answer concise.\nQuestion: {question} \nContext: {context} \nAnswer:\n"))]
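
To see what the augmented prompt looks like once the placeholders are filled, the following sketch does the substitution with plain `str.format`. It only mimics what the prompt template does; `build_prompt` and the stand-in chunks are hypothetical, and joining the chunks with blank lines is an assumption about how the context might be laid out.

```python
template = """You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the question.
If you don't know the answer, just say that you don't know.
Use three sentences maximum and keep the answer concise.
Question: {question}
Context: {context}
Answer:
"""


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Fill the template with the question and the retrieved chunks."""
    context = "\n\n".join(context_chunks)  # one blank line between chunks
    return template.format(question=question, context=context)


filled = build_prompt(
    "Who was the other officer besides officer Mora that was killed?",
    ["stand-in chunk A", "stand-in chunk B"],
)
print(filled)
```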


## Step 3: Generate
Finally, you can build a chain for the RAG pipeline, chaining together the retriever, the prompt template and the LLM. Once the RAG chain is defined, you can invoke it.

In [16]:
# llm = ChatOpenAI(model_name="gpt-3.5-turbo", openai_api_key=OPENAI_API_KEY, temperature=0)
llm = AzureChatOpenAI(model_name="mcc_chat",
                        api_key=azure_open_ai_key,
                        api_version=azure_open_ai_api_version,
                        # the endpoint is read from AZURE_OPENAI_ENDPOINT (loaded by dotenv above);
                        # to pass it explicitly, use: azure_endpoint=azure_open_ai_endpoint,
                        temperature=0)

rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt 
    | llm
    | StrOutputParser() 
)

query = "Who was the other officer besides officer Mora that was killed?"
rag_chain.invoke(query)

  warn_deprecated(
/home/alois/SAPDevelop/git_intern/mcc-ml-weaviate-experiments/.venv/lib/python3.10/site-packages/pydantic/main.py:1051: PydanticDeprecatedSince20: The `dict` method is deprecated; use `model_dump` instead. Deprecated in Pydantic V2.0 to be removed in V3.0. See Pydantic V2 Migration Guide at https://errors.pydantic.dev/2.7/migration/


'Officer Jason Rivera was the other officer killed besides Officer Mora.'
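
The data flow of the chain can be sketched without Weaviate or Azure OpenAI: the first step sends the same input to the retriever and passes it through unchanged as the question, and each later step consumes the previous step's output. Everything below is a hypothetical stand-in (`build_rag_chain`, `fake_retrieve`, `fake_llm`, and so on), meant only to mirror the shape of `rag_chain`, not LangChain's implementation.

```python
def build_rag_chain(retrieve, render_prompt, llm, parse):
    """Compose retrieve -> prompt -> llm -> parse into one callable."""
    def invoke(question: str) -> str:
        # Same input goes to the retriever and, unchanged, to the prompt
        inputs = {"context": retrieve(question), "question": question}
        return parse(llm(render_prompt(inputs)))
    return invoke


# Stand-ins so the sketch runs without any external service
def fake_retrieve(question: str) -> list[str]:
    return ["stand-in chunk A", "stand-in chunk B"]


def render_prompt(inputs: dict) -> str:
    return f"Question: {inputs['question']}\nContext: {inputs['context']}\nAnswer:"


def fake_llm(prompt: str) -> dict:
    return {"content": prompt}  # echoes the prompt instead of calling a model


def parse(message: dict) -> str:
    return message["content"]  # plays the role of the string output parser


chain = build_rag_chain(fake_retrieve, render_prompt, fake_llm, parse)
print(chain("Who was the other officer besides officer Mora that was killed?"))
```

Because the fake model echoes its input, the printed result is the exact prompt a real model would receive, which is a quick way to inspect how the retrieved context is rendered.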