# Weaviate

- Author: [Haseom Shin](https://github.com/IHAGI-c)
- Design: []()
- Peer Review: []()
- This is a part of [LangChain Open Tutorial](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial)

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/13-LangChain-Expression-Language/11-Fallbacks.ipynb) [![Open in GitHub](https://img.shields.io/badge/Open%20in%20GitHub-181717?style=flat-square&logo=github&logoColor=white)](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/13-LangChain-Expression-Language/11-Fallbacks.ipynb)

## Overview

This notebook covers how to get started with the Weaviate vector store in LangChain, using the `langchain-weaviate` package.

> [Weaviate](https://weaviate.io/) is an open-source vector database. It allows you to store data objects and vector embeddings from your favorite ML-models, and scale seamlessly into billions of data objects.

To use this integration, you need to have a running Weaviate database instance.

### Table of Contents

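- [Overview](#overview)
- [Environment Setup](#environment-setup)
- [What is Weaviate?](#what-is-weaviate)
- [Why Use Weaviate?](#why-use-weaviate)
- [Connecting to Weaviate](#connecting-to-weaviate)
- [Finding Objects by Similarity](#finding-objects-by-similarity)
- [Search mechanism](#search-mechanism)
- [Persistence](#persistence)
- [Multi-tenancy](#multi-tenancy)
- [Retriever options](#retriever-options)
- [Use with LangChain](#use-with-langchain)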


### Key Concepts


### References
- [LangChain-Weaviate](https://python.langchain.com/docs/integrations/providers/weaviate/)
- [Weaviate Documentation](https://weaviate.io/developers/weaviate)

---

## Environment Setup

Set up the environment. You may refer to [Environment Setup](https://wikidocs.net/257836) for more details.

**[Note]**
- `langchain-opentutorial` is a package that provides easy-to-use environment setup, along with useful functions and utilities for the tutorials.
- You can check out [`langchain-opentutorial`](https://github.com/LangChain-OpenTutorial/langchain-opentutorial-pypi) for more details.

In [2]:
%%capture --no-stderr
%pip install langchain-opentutorial

In [3]:
# Install required packages
from langchain_opentutorial import package

package.install(
    [
        "openai",
        "langsmith",
        "langchain",
        "tiktoken",
        "langchain-weaviate",
        "langchain-openai",
    ],
    verbose=False,
    upgrade=False,
)




In [4]:
# Set environment variables
from langchain_opentutorial import set_env

set_env(
    {
        "OPENAI_API_KEY": "",
        "WEAVIATE_API_KEY": "",
        "LANGCHAIN_API_KEY": "",
        "LANGCHAIN_TRACING_V2": "true",
        "LANGCHAIN_ENDPOINT": "https://api.smith.langchain.com",
        "LANGCHAIN_PROJECT": "Weaviate",
    }
)

Environment variables have been set successfully.


Alternatively, you can set `OPENAI_API_KEY` in a `.env` file and load it.

**[Note]** This is not necessary if you've already set `OPENAI_API_KEY` in the previous steps.

In [5]:
from dotenv import load_dotenv

load_dotenv(override=True)

True

## What is Weaviate?

Weaviate is an open-source vector database that stores data objects alongside their vector embeddings. It combines traditional database capabilities with machine learning features, allowing you to:

- Store both JSON documents and their vector embeddings in a unified system
- Perform lightning-fast semantic searches across billions of data objects
- Utilize built-in machine learning modules or bring your own vectors
- Access data through an intuitive GraphQL API

> 💡 **Key Feature**: Weaviate achieves millisecond-level query performance, making it suitable for production environments.

## Why Use Weaviate?

Weaviate stands out for several reasons:

1. **Versatility**: Supports multiple media types (text, images, etc.)
2. **Advanced Features**:
   - Semantic Search
   - Question-Answer Extraction
   - Classification
   - Custom ML Model Integration
3. **Production-Ready**: Built in Go for high performance and scalability
4. **Developer-Friendly**: Multiple access methods through GraphQL, REST, and various client libraries


## Connecting to Weaviate

There are three main ways to connect to Weaviate:

1. **Local Connection**: Connect to a Weaviate instance running locally through Docker
2. **Weaviate Cloud Services (WCS)**: Use Weaviate's managed cloud service
3. **Custom Deployment**: Deploy Weaviate on Kubernetes or other custom configurations

For this notebook, we'll use Weaviate Cloud Services (WCS) as it provides the easiest way to get started without any local setup.

### Setting up Weaviate Cloud Services

1. First, sign up for a free account at [Weaviate Cloud Console](https://console.weaviate.cloud)
2. Create a new cluster and get your API key
3. Connect to your WCS cluster

In [7]:
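import os

from weaviate import connect_to_wcs
from weaviate.auth import AuthApiKey

# Connect to the Weaviate Cloud cluster using the API key set earlier.
# Replace cluster_url with the URL of your own cluster.
# Note: newer weaviate-client releases deprecate connect_to_wcs in favor of
# connect_to_weaviate_cloud, which accepts the same cluster_url and
# auth_credentials arguments.
weaviate_client = connect_to_wcs(
    auth_credentials=AuthApiKey(api_key=os.getenv("WEAVIATE_API_KEY")),
    cluster_url="https://6s4qfg5urvg10hhctpsktg.c0.us-west3.gcp.weaviate.cloud",
)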
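
The connection cell hard-codes the cluster URL. As a minimal sketch of an alternative, both settings could be read from the environment. Note that `WEAVIATE_URL` and the helper `load_weaviate_settings` are hypothetical names introduced here for illustration; only `WEAVIATE_API_KEY` is set earlier in this notebook.

```python
import os


def load_weaviate_settings(env=None):
    """Return (cluster_url, api_key) from an environment-style mapping.

    WEAVIATE_URL is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    required = ("WEAVIATE_URL", "WEAVIATE_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env["WEAVIATE_URL"], env["WEAVIATE_API_KEY"]


# Use an explicit mapping so the sketch runs without real credentials
url, key = load_weaviate_settings(
    {"WEAVIATE_URL": "https://example.weaviate.cloud", "WEAVIATE_API_KEY": "demo-key"}
)
print(url)
```

The returned values could then be passed as `cluster_url` and `auth_credentials=AuthApiKey(api_key=...)`, which keeps cluster-specific details out of the notebook.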

## Finding Objects by Similarity

Weaviate allows you to find objects that are semantically similar to your query. Let's walk through a complete example, from importing data to executing similarity searches.

### Step 1: Preparing Your Data

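Before we can perform similarity searches, we need to populate our Weaviate instance with data. Here we wrap a handful of short texts about world monuments in `Document` objects, each tagged with a `region` metadata field. The sample texts are short enough to store whole; longer sources would first be split into chunks.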

> 💡 **Tip**: Breaking down large texts into smaller chunks helps optimize vector search performance and relevance.
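
To make the chunking idea in the tip concrete, here is a minimal pure-Python sketch of fixed-size chunking with overlap. It is an illustration only and does not reproduce how `CharacterTextSplitter` works; `chunk_text` and its parameters are hypothetical names.

```python
def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into character chunks of at most chunk_size, sharing `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


sample = "The Eiffel Tower in Paris stands 324 meters tall and was completed in 1889."
chunks = chunk_text(sample)
for chunk in chunks:
    print(repr(chunk))
```

The overlap means that a sentence cut at a chunk boundary still appears intact in at least one neighboring chunk, at the cost of storing some text twice.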

In [40]:
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter
from langchain_weaviate.vectorstores import WeaviateVectorStore
from langchain_core.documents import Document

# Sample texts about world monuments; a region is attached to each as metadata below
raw_texts = [
    "The Eiffel Tower in Paris stands 324 meters tall and was completed in 1889.",
    "The Great Wall of China is over 21,000 kilometers long and was built over several centuries.",
    "The Taj Mahal in India was built by Emperor Shah Jahan as a tomb for his beloved wife.",
    "Machu Picchu in Peru was built by the Inca Empire in the 15th century at an altitude of 2,430 meters.",
    "The Pyramids of Giza in Egypt were built over 4,500 years ago as tombs for pharaohs.",
    "The Colosseum in Rome could hold up to 50,000 spectators for gladiatorial contests.",
    "Petra in Jordan was carved into rose-colored rock faces and served as a trading center.",
    "Angkor Wat in Cambodia is the world's largest religious monument, built in the 12th century."
]

# Region for each text, in the same order as raw_texts
regions = [
    "Europe",    # Eiffel Tower
    "Asia",      # Great Wall
    "Asia",      # Taj Mahal
    "South America",  # Machu Picchu
    "Africa",    # Pyramids
    "Europe",    # Colosseum
    "Asia",      # Petra
    "Asia"       # Angkor Wat
]

docs = [
    Document(page_content=text, metadata={"region": region}) 
    for text, region in zip(raw_texts, regions)
]

embeddings = OpenAIEmbeddings()

db = WeaviateVectorStore.from_documents(docs, embeddings, client=weaviate_client)

### Step 2: Performing the Search

We can now perform a similarity search. This will return the most similar documents to the query text, based on the embeddings stored in Weaviate and an equivalent embedding generated from the query text.

In [42]:
query = "What is Petra?"
docs = db.similarity_search(query, k=1)

for i, doc in enumerate(docs):
    print(f"\nDocument {i+1}:")
    print(doc.page_content)


Document 1:
Petra in Jordan was carved into rose-colored rock faces and served as a trading center.
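
The vector side of this lookup can be sketched in plain Python: embed the query, score every stored vector by cosine similarity, and keep the top `k`. The three-dimensional vectors below are made-up toy values, not real OpenAI embeddings, and `top_k` is a hypothetical helper rather than part of `langchain-weaviate`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=1):
    """Return the k (score, text) pairs most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy "embeddings": invented values for illustration only
toy_store = [
    ("Petra", [0.9, 0.1, 0.0]),
    ("Eiffel Tower", [0.1, 0.9, 0.1]),
    ("Machu Picchu", [0.2, 0.1, 0.9]),
]
toy_query = [1.0, 0.2, 0.0]

best = top_k(toy_query, toy_store, k=1)
print(best[0][1])
```

A real vector database avoids scoring every stored vector by using an approximate nearest-neighbor index, but the ranking idea is the same.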


You can also add filters, which will either include or exclude results based on the filter conditions. (See [more filter examples](https://weaviate.io/developers/weaviate/search/filters).)

In [44]:
from weaviate.classes.query import Filter

for region in regions:
    search_filter = Filter.by_property("region").equal(region)
    filtered_results = db.similarity_search(query, filters=search_filter, k=4)
    
    print(f"\n=== Monuments in {region} ===")
    print(f"Found {len(filtered_results)} results:")
    for i, doc in enumerate(filtered_results, 1):
        print(f"\nDocument {i}:")
        print(f"Content: {doc.page_content}")
        print(f"Region: {doc.metadata['region']}")


=== Monuments in Europe ===
Found 2 results:

Document 1:
Content: The Colosseum in Rome could hold up to 50,000 spectators for gladiatorial contests.
Region: Europe

Document 2:
Content: The Eiffel Tower in Paris stands 324 meters tall and was completed in 1889.
Region: Europe

=== Monuments in Asia ===
Found 4 results:

Document 1:
Content: Petra in Jordan was carved into rose-colored rock faces and served as a trading center.
Region: Asia

Document 2:
Content: Angkor Wat in Cambodia is the world's largest religious monument, built in the 12th century.
Region: Asia

Document 3:
Content: The Taj Mahal in India was built by Emperor Shah Jahan as a tomb for his beloved wife.
Region: Asia

Document 4:
Content: The Great Wall of China is over 21,000 kilometers long and was built over several centuries.
Region: Asia

=== Monuments in Asia ===
Found 4 results:

Document 1:
Content: Petra in Jordan was carved into rose-colored rock faces and served as a trading center.
Region: Asia

... (output truncated)

The `k` parameter sets an upper limit on the number of results to return. A filtered search returns fewer than `k` results when fewer objects match the filter.

In [59]:
# Using the k parameter to limit the number of results
print("\n=== Limiting Results with k parameter ===")
search_filter = Filter.by_property("region").equal(regions[0])  # Europe
filtered_search_results = db.similarity_search(query, filters=search_filter, k=3)
print(f"\nSearching for monuments in {regions[0]} with k=3:")
print(f"Number of results: {len(filtered_search_results)}")
for i, doc in enumerate(filtered_search_results, 1):
    print(f"\nResult {i}:")
    print(f"Content: {doc.page_content}")

# Check if the number of results is k or less
assert len(filtered_search_results) <= 3, f"Expected 3 or fewer results, but got {len(filtered_search_results)}"
print("\nVerification: ✓ Number of results is correctly limited by k parameter")


=== Limiting Results with k parameter ===

Searching for monuments in Europe with k=3:
Number of results: 2

Result 1:
Content: The Colosseum in Rome could hold up to 50,000 spectators for gladiatorial contests.

Result 2:
Content: The Eiffel Tower in Paris stands 324 meters tall and was completed in 1889.

Verification: ✓ Number of results is correctly limited by k parameter
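
The interaction between a filter and `k` can be sketched without a database: filter first, then cut the ranked list at `k`. The records and the helper `filtered_search` are hypothetical stand-ins for this illustration, and the list is assumed to be already ordered by relevance.

```python
def filtered_search(ranked_records, region, k):
    """Keep records whose region matches, then return at most k of them."""
    matches = [record for record in ranked_records if record["region"] == region]
    return matches[:k]


# Assumed to be sorted from most to least relevant
ranked_records = [
    {"name": "Petra", "region": "Asia"},
    {"name": "Angkor Wat", "region": "Asia"},
    {"name": "Colosseum", "region": "Europe"},
    {"name": "Taj Mahal", "region": "Asia"},
    {"name": "Eiffel Tower", "region": "Europe"},
    {"name": "Great Wall", "region": "Asia"},
]

europe = filtered_search(ranked_records, "Europe", k=3)
asia = filtered_search(ranked_records, "Asia", k=3)
print(len(europe), len(asia))
```

Only two records match `Europe`, so `k=3` returns two results, while the four `Asia` matches are cut down to three.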


### Quantify Result Similarity

When performing similarity searches, you might want to know not just which documents are similar, but how similar they are. `similarity_search_with_score` returns each document together with a relevance score.

> 💡 **Tip**: The relevance score helps you understand the relative similarity between search results.

In [62]:
scored_docs = db.similarity_search_with_score("What monuments are in Asia?", k=5)

# Each result is a (Document, score) tuple
for doc, score in scored_docs:
    print(f"{score:.3f}", ":", doc.page_content)

1.000 : Angkor Wat in Cambodia is the world's largest religious monument, built in the 12th century.
0.729 : The Taj Mahal in India was built by Emperor Shah Jahan as a tomb for his beloved wife.
0.527 : Petra in Jordan was carved into rose-colored rock faces and served as a trading center.
0.510 : The Great Wall of China is over 21,000 kilometers long and was built over several centuries.
0.304 : The Pyramids of Giza in Egypt were built over 4,500 years ago as tombs for pharaohs.


## Search mechanism

`similarity_search` uses Weaviate's [hybrid search](https://weaviate.io/developers/weaviate/api/graphql/search-operators#hybrid).

A hybrid search combines a vector search and a keyword search, with `alpha` as the weight of the vector search: `alpha=1` is a pure vector search and `alpha=0` is a pure keyword search. The `similarity_search` function allows you to pass additional arguments as kwargs. See this [reference doc](https://weaviate.io/developers/weaviate/api/graphql/search-operators#hybrid) for the available arguments.

For example, you can perform a pure keyword search by adding `alpha=0` as shown below:

In [63]:
docs = db.similarity_search(query, alpha=0)
docs[0]

Document(metadata={'region': 'Asia'}, page_content='Petra in Jordan was carved into rose-colored rock faces and served as a trading center.')
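
To see how `alpha` shifts the ranking, here is a simplified sketch of weighted score fusion. It is an assumption-laden illustration, not Weaviate's actual fusion algorithm: each score set is min-max normalized and then combined as `alpha * vector + (1 - alpha) * keyword`. All scores and function names are invented for this example.

```python
def min_max(scores):
    """Rescale a {key: score} mapping to the 0..1 range."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 1.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def hybrid_scores(vector_scores, keyword_scores, alpha=0.5):
    """Weighted sum of normalized vector and keyword scores."""
    v = min_max(vector_scores)
    kw = min_max(keyword_scores)
    keys = set(v) | set(kw)
    return {
        key: alpha * v.get(key, 0.0) + (1 - alpha) * kw.get(key, 0.0)
        for key in keys
    }


# Invented raw scores for three documents
vector_scores = {"Petra": 0.80, "Angkor Wat": 0.90, "Taj Mahal": 0.70}
keyword_scores = {"Petra": 3.0, "Angkor Wat": 0.5, "Taj Mahal": 1.0}

pure_keyword = hybrid_scores(vector_scores, keyword_scores, alpha=0)
pure_vector = hybrid_scores(vector_scores, keyword_scores, alpha=1)
print(max(pure_keyword, key=pure_keyword.get))
print(max(pure_vector, key=pure_vector.get))
```

With these toy numbers the keyword-only ranking favors the document that literally contains the query term, while the vector-only ranking favors the closest embedding.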

## Persistence

Any data added through `langchain-weaviate` will persist in Weaviate according to its configuration. 

WCS instances, for example, are configured to persist data indefinitely, and Docker instances can be set up to persist data in a volume. Read more about [Weaviate's persistence](https://weaviate.io/developers/weaviate/configuration/persistence).

## Multi-tenancy

[Multi-tenancy](https://weaviate.io/developers/weaviate/concepts/data#multi-tenancy) allows you to have a high number of isolated collections of data, with the same collection configuration, in a single Weaviate instance. This is great for multi-user environments such as building a SaaS app, where each end user will have their own isolated data collection.

To use multi-tenancy, the vector store needs to be aware of the `tenant` parameter.

When adding any data, provide the `tenant` parameter as shown below.

In [64]:
db_with_mt = WeaviateVectorStore.from_documents(
    docs, embeddings, client=weaviate_client, tenant="Foo"
)

2025-Jan-07 09:20 PM - langchain_weaviate.vectorstores - INFO - Tenant Foo does not exist in index LangChain_36336a662af7405b89d8d1e60ab90c5c. Creating tenant.


When performing queries, pass the same `tenant` parameter.

In [65]:
db_with_mt.similarity_search(query, tenant="Foo")

[Document(metadata={'region': 'Asia'}, page_content='Petra in Jordan was carved into rose-colored rock faces and served as a trading center.')]
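
The isolation that multi-tenancy provides can be pictured with a toy in-memory store that keeps one separate list per tenant. This is a conceptual sketch only; `TenantStore` is a hypothetical class and says nothing about how Weaviate implements tenants internally.

```python
from collections import defaultdict


class TenantStore:
    """Toy store: each tenant gets its own isolated list of texts."""

    def __init__(self):
        self._data = defaultdict(list)

    def add(self, tenant, text):
        self._data[tenant].append(text)

    def search(self, tenant, keyword):
        # Only the named tenant's texts are scanned
        texts = self._data.get(tenant, [])
        return [text for text in texts if keyword.lower() in text.lower()]


store = TenantStore()
store.add("Foo", "Petra in Jordan was carved into rose-colored rock faces.")
store.add("Bar", "The Colosseum in Rome could hold up to 50,000 spectators.")

print(store.search("Foo", "petra"))
print(store.search("Bar", "petra"))
```

The second search returns an empty list because tenant `Bar` never sees data that belongs to tenant `Foo`.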

## Retriever options

Weaviate can also be used as a retriever.

### Maximal marginal relevance search (MMR)

In addition to using `similarity_search` in the retriever object, you can also use `mmr`, which balances relevance to the query against diversity among the returned documents.

In [66]:
retriever = db.as_retriever(search_type="mmr")
retriever.invoke(query)[0]

Document(metadata={'region': 'Asia'}, page_content='Petra in Jordan was carved into rose-colored rock faces and served as a trading center.')
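
The idea behind MMR can be sketched in a few lines: repeatedly pick the candidate that is most relevant to the query while being least similar to what has already been selected. This is a generic textbook-style sketch with made-up two-dimensional vectors, not the code path used by `langchain-weaviate`; `mmr` and `lambda_mult` here are names chosen for the illustration.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr(query_vec, candidates, k=2, lambda_mult=0.5):
    """Select k labels from (label, vector) candidates, trading relevance for diversity."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:

        def score(item):
            relevance = cosine(query_vec, item[1])
            # Highest similarity to anything already chosen (0 if nothing chosen yet)
            redundancy = max((cosine(item[1], s[1]) for s in selected), default=0.0)
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return [label for label, _ in selected]


# "B" is a near-duplicate of "A"; "C" is less relevant but points elsewhere
candidates = [("A", [1.0, 0.05]), ("B", [1.0, 0.06]), ("C", [0.8, -0.6])]
query_vec = [1.0, 0.0]

print(mmr(query_vec, candidates, k=2, lambda_mult=0.5))
print(mmr(query_vec, candidates, k=2, lambda_mult=1.0))
```

With `lambda_mult=0.5` the near-duplicate `B` is skipped in favor of `C`, while `lambda_mult=1.0` ignores diversity and returns the two most relevant candidates.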

## Use with LangChain

A known limitation of large language models (LLMs) is that their training data can be outdated, or not include the specific domain knowledge that you require.

Take a look at the example below:

In [68]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
# invoke() returns a message object; .content holds the generated text
llm.invoke("What is Eiffel Tower?").content

"The Eiffel Tower is a famous landmark in Paris, France. It is a wrought iron lattice tower that was built for the 1889 World's Fair and has since become a global symbol of France and one of the most recognizable structures in the world. It stands at 1,063 feet tall and is one of the most visited tourist attractions in the world."

Vector stores complement LLMs by providing a way to store and retrieve relevant information. This allows you to combine the strengths of both: the LLM's reasoning and linguistic capabilities, and the vector store's ability to retrieve relevant information.

Two well-known applications for combining LLMs and vector stores are:
- Question answering
- Retrieval-augmented generation (RAG)

### Question Answering with Sources

Question answering in LangChain can be enhanced by the use of vector stores. Let's see how this can be done.

This section uses the `RetrievalQAWithSourcesChain`, which looks up documents from an index and reports which sources were used.

First, we import the sample texts into a new Weaviate vector store, attaching a `source` identifier to each one.

In [69]:
from langchain.chains import RetrievalQAWithSourcesChain
from langchain_openai import OpenAI

In [85]:
docsearch = WeaviateVectorStore.from_texts(
    raw_texts,
    embeddings,
    client=weaviate_client,
    metadatas=[{"source": f"{i}-pl"} for i in range(len(raw_texts))],
)

Now we can construct the chain, with the retriever specified:

In [86]:
chain = RetrievalQAWithSourcesChain.from_chain_type(
    OpenAI(temperature=0), chain_type="stuff", retriever=docsearch.as_retriever()
)

In [87]:
# Calling the chain object directly is deprecated; use invoke() instead
chain.invoke(
    {"question": "What is Eiffel Tower?"},
    return_only_outputs=True,
)

{'answer': ' The Eiffel Tower is a 324-meter tall structure located in Paris, France, completed in 1889.\n',
 'sources': '0-pl'}

### Retrieval-Augmented Generation

Another very popular application of combining LLMs and vector stores is retrieval-augmented generation (RAG). This is a technique that uses a retriever to find relevant information from a vector store, and then uses an LLM to provide an output based on the retrieved data and a prompt.

We begin with a similar setup:


In [88]:
docsearch = WeaviateVectorStore.from_texts(
    raw_texts,
    embeddings,
    client=weaviate_client,
    metadatas=[{"source": f"{i}-pl"} for i in range(len(raw_texts))],
)

retriever = docsearch.as_retriever()

Next, we define a prompt template with `{question}` and `{context}` placeholders. The retrieved documents fill `{context}`, and the user's query fills `{question}`.

In [89]:
from langchain_core.prompts import ChatPromptTemplate

template = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question}
Context: {context}
Answer:
"""
prompt = ChatPromptTemplate.from_template(template)

print(prompt)

input_variables=['context', 'question'] input_types={} partial_variables={} messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: {question}\nContext: {context}\nAnswer:\n"), additional_kwargs={})]


In [95]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

In [96]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

rag_chain.invoke("What is Petra?")

'Petra is an archaeological site in Jordan, known for its stunning architecture carved into rose-colored rock faces. It served as a significant trading center in ancient times.'
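
To show what the model receives at the prompt step of the chain, here is a minimal sketch that fills a shortened version of the template by hand. `Doc` is a hypothetical stand-in for a LangChain `Document`, and `format_docs` and `build_prompt` are illustrative helpers, not part of the chain above.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Minimal stand-in for a retrieved document (hypothetical, for this sketch only)."""
    page_content: str


# Shortened version of the notebook's template
TEMPLATE = "Question: {question}\nContext: {context}\nAnswer:"


def format_docs(docs):
    """Join the text of the retrieved documents into one context string."""
    return "\n\n".join(doc.page_content for doc in docs)


def build_prompt(question, docs):
    return TEMPLATE.format(question=question, context=format_docs(docs))


retrieved = [
    Doc("Petra in Jordan was carved into rose-colored rock faces and served as a trading center."),
    Doc("The Taj Mahal in India was built by Emperor Shah Jahan as a tomb for his beloved wife."),
]
prompt_text = build_prompt("What is Petra?", retrieved)
print(prompt_text)
```

Joining only the `page_content` of each document keeps metadata and object representations out of the prompt, so the model sees plain text as context.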