## Installation

In [None]:
!pip install weaviate-client llama-index==0.8.10

## Connect to Weaviate

In [None]:
import weaviate

client = weaviate.Client(
    embedded_options=weaviate.embedded.EmbeddedOptions()
)

client.schema.get()  # Get the schema to test connection

## Create Schema

In [3]:
schema = {
    "classes": [
        {
            "class": "BlogPost",
            "description": "Blog post from the Weaviate website.",
            "vectorizer": "text2vec-openai",
            "properties": [
                {
                    "name": "content",
                    "dataType": ["text"],
                    "description": "Content from the blog post",
                }
            ],
        }
    ]
}

client.schema.create(schema)

print("Schema was created.")

Schema was created.
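Re-running the schema cell against an instance that already holds the `BlogPost` class will typically fail, because the class already exists. One way to guard against that is to check the current schema first. The sketch below is a minimal, hypothetical helper (`class_exists` is not part of the Weaviate client); it assumes the schema is a dict shaped like `{"classes": [...]}`, which is the shape of the `schema` dict defined above.

```python
def class_exists(schema: dict, class_name: str) -> bool:
    """Return True if class_name appears in a schema dict shaped like {"classes": [...]}."""
    return any(c.get("class") == class_name for c in schema.get("classes", []))


# A literal dict stands in here for the schema returned by the running instance.
existing = {"classes": [{"class": "BlogPost", "properties": []}]}

has_blog_post = class_exists(existing, "BlogPost")
has_article = class_exists(existing, "Article")
has_any_in_empty = class_exists({}, "BlogPost")

print(has_blog_post, has_article, has_any_in_empty)
```

In the notebook, the same check could wrap the creation call, for example `if not class_exists(client.schema.get(), "BlogPost"): client.schema.create(schema)`, assuming `client.schema.get()` returns the full schema in that shape.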


## Load in Data

In [5]:
from llama_index import SimpleWebPageReader

# Fetch the blog post and convert its HTML to plain text
# (html_to_text=True relies on the html2text package)
loader = SimpleWebPageReader(html_to_text=True)
blog = loader.load_data(urls=['https://weaviate.io/blog/llamaindex-and-weaviate'])

## Construct Vector Store

In [6]:
import os

import openai
from llama_index import VectorStoreIndex, StorageContext
from llama_index.vector_stores import WeaviateVectorStore

# Read the OpenAI API key from the environment instead of hard-coding it
openai.api_key = os.environ["OPENAI_API_KEY"]

# Construct the vector store on top of the "BlogPost" class
vector_store = WeaviateVectorStore(weaviate_client=client, index_name="BlogPost", text_key="content")

# Set up the storage context that holds the embeddings
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Build the index from the loaded documents
index = VectorStoreIndex(blog, storage_context=storage_context)

query_engine = index.as_query_engine()

[nltk_data] Downloading package punkt to /tmp/llama_index...
[nltk_data]   Unzipping tokenizers/punkt.zip.


## Set up Sub Question Query Engine

In [7]:
from llama_index.tools import QueryEngineTool, ToolMetadata
from llama_index.query_engine import SubQuestionQueryEngine

query_engine_tools = [
    QueryEngineTool(
        query_engine=query_engine,
        metadata=ToolMetadata(
            name='BlogPost',
            description='Blog post about the integration of LlamaIndex and Weaviate',
        ),
    )
]

query_engine = SubQuestionQueryEngine.from_defaults(query_engine_tools=query_engine_tools)
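The general pattern behind a sub-question engine is: split the original question into smaller questions, route each one to a named tool, then combine the answers. The toy sketch below illustrates only that pattern and is not LlamaIndex's implementation. Every name in it (`answer_with_sub_questions`, `toy_decompose`, `toy_tools`) is hypothetical, and the decomposition is hard-coded where a real engine would ask an LLM.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in: a "tool" is just a function from question to answer.
Tool = Callable[[str], str]


def answer_with_sub_questions(
    question: str,
    decompose: Callable[[str], List[Tuple[str, str]]],
    tools: Dict[str, Tool],
) -> str:
    """Answer each (tool_name, sub_question) pair and join the results."""
    parts = []
    for tool_name, sub_question in decompose(question):
        answer = tools[tool_name](sub_question)
        parts.append(f"Q: {sub_question}\nA: {answer}")
    # A real engine would hand these pairs to an LLM for a final synthesis step;
    # here they are simply concatenated.
    return "\n".join(parts)


def toy_decompose(question: str) -> List[Tuple[str, str]]:
    # Hard-coded for illustration only.
    return [
        ("BlogPost", "What is LlamaIndex?"),
        ("BlogPost", "What is Weaviate?"),
    ]


toy_tools: Dict[str, Tool] = {"BlogPost": lambda q: f"(answer to: {q})"}

result = answer_with_sub_questions(
    "How does LlamaIndex help data indexing in Weaviate?", toy_decompose, toy_tools
)
print(result)
```

The tool name plays the same role as the `name` given to `ToolMetadata` above: it tells the engine which query engine should receive a given sub-question.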

## Query Time

In [8]:
response = await query_engine.aquery('How does LlamaIndex help data indexing in Weaviate?')

Generated 3 sub questions.
[BlogPost] Q: What is LlamaIndex?
[BlogPost] Q: What is Weaviate?
[BlogPost] Q: How does LlamaIndex integrate with Weaviate?
[BlogPost] A: Weaviate is a software system that combines a language model with an external storage provider to create a "chat with your data" experience. It can be used to build powerful and reliable retrieval-augmented generation (RAG) systems, which enable the language model to access and retrieve specific facts, figures, or contextually relevant information. Weaviate can be used in various applications such as search engines and chatbots. It provides a vector database for storing and indexing data, and it can be integrated with other data frameworks like LlamaIndex to facilitate data management and query modules.
[BlogPost] A: LlamaIndex integrates with Weaviate by providing the critical components needed to easily set up a powerful and reliable

In [9]:
print(response)

LlamaIndex helps with data indexing in Weaviate by providing tools and capabilities for data ingestion, management, and querying. It offers connectors to various data sources, allowing users to easily integrate data from existing files and applications into Weaviate. LlamaIndex supports indexing unstructured, semi-structured, and structured data, enabling users to split source documents into text chunks and store them in Weaviate's vector database. This facilitates efficient and effective indexing of data in Weaviate, making it easier to retrieve specific facts, figures, or contextually relevant information when building applications such as search engines and chatbots.
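
The top-level `await query_engine.aquery(...)` used in the query cell works in Jupyter because the notebook already runs an event loop. In a plain Python script the coroutine has to be driven explicitly, for example with `asyncio.run`. The sketch below shows that shape with a hypothetical `StubQueryEngine` standing in for the real engine, so it runs without Weaviate or an API key.

```python
import asyncio


class StubQueryEngine:
    """Hypothetical stand-in that exposes an async `aquery` method."""

    async def aquery(self, question: str) -> str:
        await asyncio.sleep(0)  # yield to the event loop, as a real network call would
        return f"stub answer to: {question}"


async def main() -> str:
    engine = StubQueryEngine()
    return await engine.aquery("How does LlamaIndex help data indexing in Weaviate?")


# asyncio.run creates an event loop, runs the coroutine to completion, and closes the loop.
response = asyncio.run(main())
print(response)
```

In a script, the stub would be replaced by the sub-question query engine built earlier. Note that `asyncio.run` cannot be called from inside an already running loop, which is why the notebook uses a bare `await` instead.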
