# Querying Stage

This is a working notebook for writing and testing the code used in our Google Cloud Function.

In this stage, the RAG pipeline retrieves the context most relevant to a user's query and passes it, along with the query, to the LLM to generate a response. This gives the LLM access to current knowledge that was not in its original training data. It also reduces the likelihood of hallucinations, where an LLM invents answers about topics its training data covered poorly. The main challenges in this stage are retrieval, coordination, and analysis across one or more knowledge bases.
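
As a rough illustration of the retrieve-then-generate flow described above, the sketch below ranks a few text chunks against a query and builds the prompt that would be sent to the LLM. It is a toy, not the pipeline used in this notebook: word overlap stands in for the embedding similarity a vector store computes, and the helper names (`tokenize`, `retrieve`, `build_prompt`) and sample chunks are made up for the example.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, chunks, top_k=2):
    """Return up to top_k chunks that share the most words with the query.

    Word overlap is only a stand-in for vector similarity search.
    """
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(chunk)), chunk) for chunk in chunks]
    # Drop chunks with no shared words, then sort best matches first.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [chunk for _, chunk in ranked[:top_k]]

def build_prompt(context_chunks, query):
    """Combine the retrieved context and the query into one prompt string."""
    context = "\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {query}\n"

# Illustrative sample data only.
chunks = [
    "Hume AI is a research lab and technology company.",
    "Weaviate is a vector database.",
    "The platform exposes models through APIs.",
]
question = "What is Hume AI?"
prompt = build_prompt(retrieve(question, chunks, top_k=1), question)
print(prompt)
```

In the real pipeline, the retrieval step is handled by Weaviate and the prompt assembly by the query engine, as the cells below show.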

In [1]:
# Suppress deprecation warnings raised by Pydantic, which llama_index is built on
import warnings
warnings.filterwarnings('ignore', category=DeprecationWarning)

## Hard-coded stuff in this cell that will be replaced in the cloud function
* The OpenAI API key will be an environment variable
* The Weaviate IP address will be looked up programmatically (still to be worked out)

In [2]:
import os
from dotenv import load_dotenv
import weaviate
from weaviate import Client
from llama_index import VectorStoreIndex
from llama_index.storage import StorageContext
from llama_index.vector_stores import WeaviateVectorStore
from llama_index.vector_stores.types import ExactMatchFilter, MetadataFilters

# Load the .env file
load_dotenv()

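# Retrieve the OpenAI API key from the environment variables
OPENAI_KEY = os.getenv("OPENAI_KEY")
if OPENAI_KEY is None:
    # os.environ only accepts strings, so fail early with a clear message
    raise RuntimeError("OPENAI_KEY is not set; add it to .env or the environment")

# Expose the key as OPENAI_API_KEY, the name the OpenAI client reads (for when it's run on GCS)
os.environ["OPENAI_API_KEY"] = OPENAI_KEY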

# Current Weaviate IP
WEAVIATE_IP_ADDRESS = "34.145.246.242"
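
One way the hard-coded values above could be replaced in the cloud function is to read everything from environment variables. The sketch below is an assumption about how that might look, not the deployed code: `load_settings` is a hypothetical helper, and `WEAVIATE_HOST` is a made-up variable name standing in for whatever mechanism ends up supplying the Weaviate address.

```python
import os

def load_settings(env=os.environ):
    """Read the OpenAI key and Weaviate host from a mapping of environment variables."""
    openai_key = env.get("OPENAI_KEY")
    if not openai_key:
        raise RuntimeError("OPENAI_KEY is not set")
    # WEAVIATE_HOST is a hypothetical variable name; the notebook hard-codes the IP instead.
    weaviate_host = env.get("WEAVIATE_HOST", "localhost")
    return {
        "openai_key": openai_key,
        "weaviate_url": f"http://{weaviate_host}:8080",
    }

# Example with placeholder values passed in directly rather than read from the real environment.
settings = load_settings({"OPENAI_KEY": "sk-placeholder", "WEAVIATE_HOST": "10.0.0.5"})
print(settings["weaviate_url"])
```

Passing the mapping as an argument keeps the helper easy to test without touching the real environment.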

In [3]:
# Custom prompt that tells the LLM not to answer questions the context does not cover
from llama_index.prompts import PromptTemplate

# template = ("We have provided context information below. If the answer to a query is not contained in this context, "
# #             "please only reply that it is not in the context."
# #             "please only reply that it is not in the context. If your response will be financial in nature, "
#             "make the first character of the completion 1, and if it is not financial, make the first character 0."
#             "After this initial number, 0 or 1, please continue your response as instructed previously."
#             "\n---------------------\n"
#             "{context_str}"
#             "\n---------------------\n"
#             "Given this information, please answer the question: {query_str}\n"
# )

template = ("We have provided context information below. If the answer to a query is not contained in this context, "
            "please explain that the context does not include the information. If the information IS included in the context, "
            "please answer the question using the context provided. However, do not refer to the context specifically by "
            "saying something like 'According to the context'."
            "\n---------------------\n"
            "{context_str}"
            "\n---------------------\n"
            "Given this information, please answer the question: {query_str}\n"
)

qa_template = PromptTemplate(template)
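
To make the two placeholders concrete, the sketch below fills a shortened stand-in for the template with plain `str.format`. This only approximates what the query engine does at query time: how llama_index joins the retrieved chunks into `{context_str}` is an assumption here, and `demo_template` and the sample chunks are illustrative.

```python
# Shortened stand-in for the notebook's template; only the two placeholders matter here.
demo_template = (
    "We have provided context information below."
    "\n---------------------\n"
    "{context_str}"
    "\n---------------------\n"
    "Given this information, please answer the question: {query_str}\n"
)

# Assumption: retrieved chunks are joined into a single string before substitution.
retrieved_chunks = [
    "Hume AI is a research lab and technology company.",
    "Hume AI provides datasets, models, and APIs.",
]

filled_prompt = demo_template.format(
    context_str="\n\n".join(retrieved_chunks),
    query_str="What is Hume?",
)
print(filled_prompt)
```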

## Hard-coded stuff in this cell that will be replaced in the cloud function
* The websiteAddress will be from the query string of the https request
* The timestamp will be from the query string of the https request
* The query will be from the query string of the https request
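
The three values listed above could be pulled from the request URL along these lines. This is a minimal sketch using only the standard library: the parameter name `query`, the helper `parse_request_params`, and the example endpoint are assumptions, while `websiteAddress` and `timestamp` mirror the filter keys used in this notebook.

```python
from urllib.parse import urlparse, parse_qs

def parse_request_params(url):
    """Extract websiteAddress, timestamp, and query from a request URL."""
    params = parse_qs(urlparse(url).query)
    required = ("websiteAddress", "timestamp", "query")
    missing = [name for name in required if name not in params]
    if missing:
        raise ValueError(f"Missing query parameters: {', '.join(missing)}")
    # parse_qs returns a list of values per key; take the first of each.
    return {name: params[name][0] for name in required}

# Hypothetical endpoint, used only to show the expected URL shape.
example_url = (
    "https://example.com/ask"
    "?websiteAddress=hume.ai&timestamp=2024-02-05T22-28-03&query=What%20is%20Hume%3F"
)
request_params = parse_request_params(example_url)
print(request_params)
```

Inside a cloud function, the framework's request object would normally expose these parameters directly, so the URL parsing step may not be needed there.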

In [4]:
# client setup
client = weaviate.Client(url="http://" + WEAVIATE_IP_ADDRESS + ":8080")

# Construct the vector store on top of the existing "Pages" class in Weaviate
vector_store = WeaviateVectorStore(weaviate_client=client, index_name="Pages", text_key="text")

# Wrap the vector store in a storage context
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Build an index over the data already held in the vector store
index = VectorStoreIndex.from_vector_store(vector_store, storage_context=storage_context)

# Create exact match filters for websiteAddress and timestamp
website_address_filter = ExactMatchFilter(key="websiteAddress", value="hume.ai")
timestamp_filter = ExactMatchFilter(key="timestamp", value="2024-02-05T22-28-03")

# Create a metadata filters instance with the above filters
metadata_filters = MetadataFilters(filters=[website_address_filter, timestamp_filter])

# Create a query engine with the custom prompt and filters
query_engine = index.as_query_engine(text_qa_template=qa_template,
                                     streaming=True,
                                     filters=metadata_filters)



In [5]:
import asyncio

# Execute the query
query_str = "What is Hume?"
streaming_response = query_engine.query(query_str)

def process_streaming_response(streaming_response):
    """Print each chunk of the streamed response as it arrives."""
    try:
        for text in streaming_response.response_gen:
            if text:   # Skip empty or None chunks
                print(text, end="", flush=True)
    except asyncio.CancelledError:
        print('Streaming cancelled', flush=True)

process_streaming_response(streaming_response)

Hume AI is a research lab and technology company that aims to pave the way for an ethical, human-centric future for technology that understands how we express ourselves. They provide experimentally derived datasets, models, and APIs for technology that is guided by empathy and the pursuit of human well-being. Hume AI's solutions are based on cutting-edge research published in top scientific journals.

In [6]:
def extract_document_urls(streaming_response):
    urls = []
    for node_with_score in streaming_response.source_nodes:
        relationships = node_with_score.node.relationships
        for related_node_info in relationships.values():
            if related_node_info.node_type == "4":  # Corresponds to ObjectType.DOCUMENT
                urls.append(related_node_info.node_id)
    return urls

extracted_urls = extract_document_urls(streaming_response)
print("The following websites were used as references:\n")
print(extracted_urls)

The following websites were used as references:

['https://dev.hume.ai/docs/hume', 'https://dev.hume.ai/docs/introduction']


In [7]:
query_str = "Who is Kim Kardashian?"
streaming_response = query_engine.query(query_str)

# Print the response as it arrives
# streaming_response.print_response_stream()
process_streaming_response(streaming_response)

The context does not provide any information about Kim Kardashian.

In [8]:
extracted_urls = extract_document_urls(streaming_response)
print("The following websites were used as references:\n")
print(extracted_urls)

The following websites were used as references:

['https://dev.hume.ai/docs/too-many-face-identifiers', 'https://dev.hume.ai/docs/the-platform']
