# Querying Stage
In this stage, the RAG pipeline retrieves the context most relevant to a user’s query and passes it, together with the query, to the LLM to generate a response. This gives the LLM access to current knowledge that was not part of its original training data. It also reduces the likelihood of hallucinations, where an LLM invents answers about topics it was insufficiently trained on. The main challenges in this stage are retrieval, coordination, and analysis across one or more knowledge bases.
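
To make the retrieve-then-generate flow concrete, here is a minimal, library-free sketch. The two example documents, the word-overlap `score` function, and the `retrieve` and `build_prompt` helpers are all hypothetical stand-ins; a real pipeline would use embedding similarity against a vector store instead.

```python
# Toy illustration of the querying stage: retrieve, then build the LLM prompt.
# The documents and the word-overlap scoring are hypothetical stand-ins for a
# real vector store and embedding similarity.

def score(query: str, doc: str) -> int:
    """Count how many distinct query words also appear in the document."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    return len(query_words & doc_words)

def retrieve(query: str, docs: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k documents with the highest overlap score."""
    ranked = sorted(docs, key=lambda d: score(query, d), reverse=True)
    return ranked[:top_k]

def build_prompt(query: str, context_chunks: list[str]) -> str:
    """Combine the retrieved context and the query into one prompt string."""
    context = "\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {query}\n"

docs = [
    "Weaviate is a vector database",
    "LlamaIndex connects data sources to LLMs",
]
query = "what is a vector database"
top_chunks = retrieve(query, docs)
prompt = build_prompt(query, top_chunks)
print(prompt)
```

The resulting `prompt` string is what would be sent to the LLM, so the model answers from the retrieved context rather than from its training data alone.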

In [1]:
# Suppress Pydantic deprecation warnings raised through llama_index, which depends on Pydantic
import warnings
warnings.filterwarnings('ignore', category=DeprecationWarning)

## Some hard-coded stuff in this cell
* OpenAI key
* Weaviate IP address

In [2]:
import os
# Set the OpenAI key and current Weaviate IP to run this notebook
OPENAI_KEY = "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"
os.environ["OPENAI_API_KEY"] = OPENAI_KEY

WEAVIATE_IP_ADDRESS = "34.133.13.119"

import weaviate
from weaviate import Client
from llama_index import VectorStoreIndex
from llama_index.storage import StorageContext
from llama_index.vector_stores import WeaviateVectorStore
from llama_index.vector_stores.types import ExactMatchFilter, MetadataFilters

In [5]:
# Custom prompt that tells the LLM not to answer questions outside the provided context
from llama_index.prompts import PromptTemplate

template = ("We have provided context information below. If the answer to a query is not contained in this context, "
            "please only reply that it is not in the context."
            "\n---------------------\n"
            "{context_str}"
            "\n---------------------\n"
            "Given this information, please answer the question: {query_str}\n"
)
qa_template = PromptTemplate(template)
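
As a rough illustration of what the LLM ends up seeing, the sketch below fills the same template text with plain `str.format`. The context passage and question are made-up placeholders, and the assumption is that the query engine performs an equivalent substitution of `{context_str}` and `{query_str}` internally.

```python
# Same template text as in the cell above, filled in with plain str.format.
template = (
    "We have provided context information below. If the answer to a query is not contained in this context, "
    "please only reply that it is not in the context."
    "\n---------------------\n"
    "{context_str}"
    "\n---------------------\n"
    "Given this information, please answer the question: {query_str}\n"
)

# Hypothetical retrieved chunk and query, for illustration only
filled = template.format(
    context_str="Example retrieved passage.",
    query_str="What does the passage say?",
)
print(filled)
```

Printing the filled prompt like this is a convenient way to check the wording of the instruction before running real queries.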

## Some hard-coded stuff in this cell
* Weaviate IP address
* The `websiteAddress`, which will probably come from a dropdown in the frontend menu
* The `timestamp`, which will probably come from a dropdown in the frontend menu
* The query

In [7]:
# client setup
client = weaviate.Client(url="http://" + WEAVIATE_IP_ADDRESS + ":8080")

# construct vector store
vector_store = WeaviateVectorStore(weaviate_client=client, index_name="Pages", text_key="text")

# wrap the vector store in a storage context
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# set up an index on top of the existing vector store
index = VectorStoreIndex.from_vector_store(vector_store, storage_context=storage_context)

# Create exact match filters for websiteAddress and timestamp
website_address_filter = ExactMatchFilter(key="websiteAddress", value="ai21.com")
timestamp_filter = ExactMatchFilter(key="timestamp", value="2023-10-06T18-11-24")

# Create a metadata filters instance with the above filters
metadata_filters = MetadataFilters(filters=[website_address_filter, timestamp_filter])

# Create a query engine with the custom prompt and filters
query_engine = index.as_query_engine(text_qa_template=qa_template, filters=metadata_filters)

# Execute the query
query_str = "How was AI21 Studio a game changer?"
response = query_engine.query(query_str)

# Print the response
print(response)

AI21 Studio was a game changer because it helped Verb.ai create a revolutionary writing tool for authors. It improved brainstorming and expression, making the process of completing long-form narratives faster, easier, and more fun. AI21 Studio assisted with all key stages of creation, including brainstorming, writing, and editing.
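
The metadata filters restrict retrieval to one scraped site at one scrape time. The sketch below mimics that behaviour in plain Python, assuming the two exact-match filters are combined with a logical AND. The `records` list and the `exact_match` helper are hypothetical, and only the key names and the first record's values come from the cell above.

```python
# Hypothetical records; the key names mirror the filters used above,
# while the other values are made up for illustration.
records = [
    {"websiteAddress": "ai21.com", "timestamp": "2023-10-06T18-11-24", "text": "chunk A"},
    {"websiteAddress": "ai21.com", "timestamp": "2023-09-01T00-00-00", "text": "chunk B"},
    {"websiteAddress": "example.com", "timestamp": "2023-10-06T18-11-24", "text": "chunk C"},
]

def exact_match(records: list[dict], filters: dict) -> list[dict]:
    """Keep only the records whose metadata equals every filter value (logical AND)."""
    return [r for r in records if all(r.get(k) == v for k, v in filters.items())]

wanted = {"websiteAddress": "ai21.com", "timestamp": "2023-10-06T18-11-24"}
matches = exact_match(records, wanted)
print([r["text"] for r in matches])
```

Only the record matching both the site and the timestamp survives, which is why answers stay scoped to a single snapshot of a single website.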


In [8]:
query_str = "Who is Kim Kardashian?"
response = query_engine.query(query_str)

# Print the response
print(response)

The information provided does not contain any information about Kim Kardashian.
