# Using Redis and OpenAI to chat with PDF documents

This notebook demonstrates how to use Redis and (Azure) OpenAI to chat with PDF documents. The PDF included is
an informational brochure about the Chevy Colorado pickup truck.

In this notebook, we will use LlamaIndex to chunk, vectorize, and store the PDF document in Redis as vectors
alongside the associated text. The query interface provided by LlamaIndex will then be used to search for
relevant information in response to user queries.

In [1]:
# Install the Python requirements
%pip install -q -r requirements.txt

Note: you may need to restart the kernel to use updated packages.


In [2]:
import os
import sys
import logging

logging.basicConfig(
    stream=sys.stdout, level=logging.INFO
) # logging.DEBUG for more verbose output
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

import textwrap
import openai

from langchain.llms import AzureOpenAI, OpenAI
from langchain.embeddings import OpenAIEmbeddings
from llama_index.vector_stores import RedisVectorStore
from llama_index import LangchainEmbedding
from llama_index import (
    GPTVectorStoreIndex,
    SimpleDirectoryReader,
    LLMPredictor,
    PromptHelper,
    ServiceContext,
    StorageContext
)

# Azure OpenAI | OpenAI (direct)

The notebook allows the user to choose between the OpenAI and Azure OpenAI endpoints. Make sure to follow the instructions in the README and set the `.env` file correctly for whichever API you are using.

NOTE: ONLY ONE API CAN BE USED AT A TIME.

- **[Use Azure OpenAI](#Azure-OpenAI)**
- **[Use OpenAI](#OpenAI) (direct)**
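
Before running either path, it can help to confirm that the expected variables are actually set. The following is a minimal sketch, not part of the original notebook: the helper name `missing_env_vars` is hypothetical, and the variable lists simply mirror the names read with `os.getenv` in the setup cells below. Which of them are strictly required may depend on your account and library versions.

```python
import os

# Variable names mirror those read with os.getenv in the setup cells below.
COMMON_VARS = [
    "OPENAI_API_KEY",
    "OPENAI_API_VERSION",
    "OPENAI_EMBEDDING_MODEL",
    "OPENAI_TEXT_MODEL",
]
AZURE_VARS = [
    "AZURE_OPENAI_API_BASE",
    "AZURE_EMBED_MODEL_DEPLOYMENT_NAME",
    "AZURE_TEXT_MODEL_DEPLOYMENT_NAME",
]
OPENAI_DIRECT_VARS = ["OPENAI_API_BASE"]


def missing_env_vars(use_azure, env=None):
    """Return the names of expected variables that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    expected = COMMON_VARS + (AZURE_VARS if use_azure else OPENAI_DIRECT_VARS)
    return [name for name in expected if not env.get(name)]


# Report what is missing for the Azure path (an empty list means all are set).
print("Missing for Azure OpenAI:", missing_env_vars(use_azure=True))
```

Passing a plain dictionary as `env` makes the check easy to try without touching the real environment.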

## Azure OpenAI 

Here we set up the AzureOpenAI models and API keys by reading them from the environment. The ``PromptHelper`` sets the parameters for the OpenAI model. The classes defined here are used together to provide a Q&A interface between the user and the LLM.

In [4]:
# set up LlamaIndex to use Azure OpenAI
openai.api_type = "azure"
openai.api_base = os.getenv("AZURE_OPENAI_API_BASE")
openai.api_version = os.getenv("OPENAI_API_VERSION")
openai.api_key = os.getenv("OPENAI_API_KEY")


# Get the OpenAI model names ex. "text-embedding-ada-002"
embedding_model = os.getenv("OPENAI_EMBEDDING_MODEL")
text_model = os.getenv("OPENAI_TEXT_MODEL")
# get the Azure Deployment name for the model
embedding_model_deployment = os.getenv("AZURE_EMBED_MODEL_DEPLOYMENT_NAME")
text_model_deployment = os.getenv("AZURE_TEXT_MODEL_DEPLOYMENT_NAME")

print(f"Using OpenAI models: {embedding_model} and {text_model}")
print(f"Using Azure deployments: {embedding_model_deployment} and {text_model_deployment}")


Using models: text-embedding-ada-002 and gpt-35-turbo


In [5]:

llm = AzureOpenAI(deployment_name=text_model_deployment, model_kwargs={
    "api_key": openai.api_key,
    "api_base": openai.api_base,
    "api_type": openai.api_type,
    "api_version": openai.api_version,
})
llm_predictor = LLMPredictor(llm=llm)

embedding_llm = LangchainEmbedding(
    OpenAIEmbeddings(
        model=embedding_model,
        deployment=embedding_model_deployment,
        openai_api_key=openai.api_key,
        openai_api_base=openai.api_base,
        openai_api_type=openai.api_type,
        openai_api_version=openai.api_version,
    ),
    embed_batch_size=1,
)

## OpenAI

The ``OpenAI`` class provides a simple interface to the OpenAI API.


In [3]:
# set up LlamaIndex to use OpenAI directly
openai.api_type = "openai"
openai.api_version = os.getenv("OPENAI_API_VERSION")
openai.api_base = os.getenv("OPENAI_API_BASE")
openai.api_key = os.getenv("OPENAI_API_KEY")

# Get the OpenAI model names ex. "text-embedding-ada-002"
embedding_model = os.getenv("OPENAI_EMBEDDING_MODEL")
text_model = os.getenv("OPENAI_TEXT_MODEL")


print(f"Using OpenAI models: {embedding_model} and {text_model}")

Using OpenAI models: text-embedding-ada-002 and gpt-35-turbo


In [4]:
# Set up LLM
llm = OpenAI(model_kwargs={
    "api_key": openai.api_key,
    "api_base": openai.api_base,
    "api_type": openai.api_type,
    "api_version": openai.api_version,
})
llm_predictor = LLMPredictor(llm=llm)

# Set up Embedding model
embedding_llm = LangchainEmbedding(
    OpenAIEmbeddings(
        model=embedding_model,
        openai_api_version=openai.api_version,
        openai_api_key=openai.api_key,
        openai_api_base=openai.api_base,
        openai_api_type=openai.api_type,
    ),
    embed_batch_size=1,
)

### LlamaIndex

[LlamaIndex](https://github.com/jerryjliu/llama_index) (GPT Index) is a project that provides a central interface to connect your LLMs with external data sources. It provides a simple interface to vectorize and store embeddings in Redis, create search indices using Redis, and perform vector search to find context for generative models like GPT.

Here we will use it to load in the documents (Chevy Colorado Brochure).

In [5]:
# load documents
documents = SimpleDirectoryReader('./docs').load_data()
print('Document ID:', documents[0].doc_id)

Document ID: 00952df8-749c-4aea-8ed6-4436dc931244


LlamaIndex also works with frameworks like LangChain to make prompting and other aspects of a chat-based application easier. Here we can use the ``PromptHelper`` class to help us generate prompts for the (Azure) OpenAI model. Passing it to the service context is optional, as it can be tricky to set up correctly.

In [6]:
# set number of output tokens
num_output = int(os.getenv("OPENAI_MAX_TOKENS"))
# max LLM token input size
max_input_size = int(os.getenv("CHUNK_SIZE"))
# set maximum chunk overlap
max_chunk_overlap = float(os.getenv("CHUNK_OVERLAP"))

prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)
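
To make the chunk size and overlap settings more concrete, here is a toy sketch of overlapping windows over a list of words. It is an illustration only: `chunk_tokens` is a hypothetical helper, words stand in for tokens, and the overlap is treated as a ratio of the chunk size because `CHUNK_OVERLAP` is read as a float above. LlamaIndex's own splitting and ``PromptHelper`` logic work on real tokens and are more involved.

```python
def chunk_tokens(tokens, chunk_size, overlap_ratio):
    """Split `tokens` into windows of `chunk_size` that share an overlap.

    Hypothetical helper: `overlap_ratio` is the fraction of each chunk
    that is repeated at the start of the next one (0 <= ratio < 1).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap_ratio < 1:
        raise ValueError("overlap_ratio must be in [0, 1)")

    overlap = int(chunk_size * overlap_ratio)
    step = chunk_size - overlap  # always >= 1 because overlap < chunk_size

    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end
    return chunks


words = "the quick brown fox jumps over the lazy dog".split()
for chunk in chunk_tokens(words, chunk_size=4, overlap_ratio=0.25):
    print(" ".join(chunk))
```

A larger overlap repeats more text across neighbouring chunks, which tends to preserve context at chunk boundaries at the cost of storing and embedding more data.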

In [7]:
# define the service we will use to answer questions
# if you executed the Azure OpenAI code above, your Azure models and credentials will be used; otherwise the OpenAI ones
service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor,
    embed_model=embedding_llm,
    prompt_helper=prompt_helper  # optional: remove this argument to use the default prompt settings
)

[nltk_data] Downloading package punkt to /tmp/llama_index...
[nltk_data]   Unzipping tokenizers/punkt.zip.


## Initialize Redis as a Vector Database

Now that we have read in our documents, we can initialize the ``RedisVectorStore``. This will allow us to store our vectors in Redis and create an index.

The ``GPTVectorStoreIndex`` will then create the embeddings from the text chunks by calling out to OpenAI's API. The embeddings will be stored in Redis and an index will be created.

NOTE: If you didn't set the ``OPENAI_API_KEY`` environment variable, you will get an error here.

In [8]:
def format_redis_conn_from_env(using_ssl=False):
    start = "rediss://" if using_ssl else "redis://"
    # if using RBAC
    password = os.getenv("REDIS_PASSWORD", None)
    username = os.getenv("REDIS_USERNAME", "default")
    if password is not None:
        start += f"{username}:{password}@"

    return start + f"{os.getenv('REDIS_HOST')}:{os.getenv('REDIS_PORT')}"

# set using_ssl=True to use SSL with ACRE (Azure Cache for Redis Enterprise)
redis_url = format_redis_conn_from_env(using_ssl=False)
print(f"Using Redis address: {redis_url}")


Using Redis address: redis://default:@redis:6379
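
The helper above interpolates the username and password into the URL verbatim. If a password contains reserved characters such as `@` or `:`, URL parsers generally expect them to be percent-encoded. The sketch below shows one way to do that with the standard library; `build_redis_url` is a hypothetical variant written for illustration, not the notebook's own function, and it takes its values as arguments instead of reading the environment.

```python
from urllib.parse import quote


def build_redis_url(host, port, username="default", password=None, using_ssl=False):
    """Build a Redis URL, percent-encoding the credentials.

    Hypothetical helper for illustration. As in the notebook's function,
    credentials are included whenever `password` is not None.
    """
    scheme = "rediss" if using_ssl else "redis"
    credentials = ""
    if password is not None:
        # safe="" forces characters such as "@", ":" and "/" to be encoded
        credentials = f"{quote(username, safe='')}:{quote(password, safe='')}@"
    return f"{scheme}://{credentials}{host}:{port}"


print(build_redis_url("redis", 6379, password="p@ss:word"))
```

With an empty password the result matches the address printed above, so the encoding only changes URLs whose credentials contain reserved characters.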


In [9]:
# Create VectorStore
vector_store = RedisVectorStore(
    index_name="chevy_docs",
    index_prefix="blog",
    redis_url=redis_url,
    overwrite=True
)

# access the underlying client in the RedisVectorStore implementation to ping the redis instance
vector_store.client.ping()

True

In [10]:
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = GPTVectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    service_context=service_context
)

INFO:llama_index.vector_stores.redis:Deleting index chevy_docs
Deleting index chevy_docs
INFO:llama_index.vector_stores.redis:Creating index chevy_docs
Creating index chevy_docs
INFO:llama_index.vector_stores.redis:Added 27 documents to index chevy_docs
Added 27 documents to index chevy_docs


## Test the RAG pipeline!

Now that we have our document stored in the index, we can ask questions against it. For each question, the index retrieves the most relevant stored chunks and passes them to the LLM as its knowledge base.
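
Conceptually, a query embeds the question, ranks the stored chunks by vector similarity, and hands the best matches to the LLM as context. The toy sketch below illustrates that flow under stated assumptions: the three-dimensional vectors and chunk texts are made up, the helper names (`cosine_similarity`, `top_k`, `build_prompt`) are hypothetical, and `k=2` is chosen only because the logs in this notebook report two results per query. It is not how LlamaIndex or Redis implement the search.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the k stored items most similar to the query vector."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item["vector"]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, chunks):
    """Assemble a simple prompt from the retrieved chunks."""
    context = "\n".join(f"- {chunk['text']}" for chunk in chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Made-up vectors standing in for real embeddings of document chunks.
store = [
    {"text": "chunk about towing", "vector": [0.9, 0.1, 0.0]},
    {"text": "chunk about engines", "vector": [0.7, 0.6, 0.1]},
    {"text": "chunk about interior colors", "vector": [0.0, 0.2, 0.9]},
]
query_vec = [1.0, 0.2, 0.0]  # stand-in for the embedded question

matches = top_k(query_vec, store, k=2)
print(build_prompt("How much can it tow?", matches))
```

In the real pipeline the similarity search runs inside Redis and the prompt template is supplied by LlamaIndex; only the overall retrieve-then-prompt shape carries over.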

In [11]:
query_engine = index.as_query_engine()
response = query_engine.query("What types of variants are available for the Chevrolet Colorado?")
print("\n", textwrap.fill(str(response), 100))

INFO:llama_index.vector_stores.redis:Using filters: *
Using filters: *
INFO:llama_index.vector_stores.redis:Querying index chevy_docs
Querying index chevy_docs
INFO:llama_index.vector_stores.redis:Found 2 results for query with id ['blog/vector_79481d33-7016-4e1c-a8a3-d522997a4bc5', 'blog/vector_f803f1d5-cd34-4158-89d9-e379c1b2c7b5']
Found 2 results for query with id ['blog/vector_79481d33-7016-4e1c-a8a3-d522997a4bc5', 'blog/vector_f803f1d5-cd34-4158-89d9-e379c1b2c7b5']

  The Chevrolet Colorado is available in Extended Cab, Crew Cab Short Box, and Crew Cab Long Box
variants. It comes with standard mechanical features such as a 2.5L DOHC 4-cylinder with Variable
Valve Timing (VVT) and Direct Injection, a 3.6L DOHC V6 with Variable Valve Timing (VVT) and Direct
Injection (Crew Cab 4x4 and Crew Cab Long Box 2WD models), a 2-speed, electronic Autotrac® with
rotary controls; includes Neutral position for dinghy towing (4x4 models), and either a 6-speed
automatic, electronically controlled 

In [12]:
response = query_engine.query("What is the maximum towing capacity of the chevy colorado?")
print("\n", textwrap.fill(str(response), 100))

INFO:llama_index.vector_stores.redis:Using filters: *
Using filters: *
INFO:llama_index.vector_stores.redis:Querying index chevy_docs
Querying index chevy_docs
INFO:llama_index.vector_stores.redis:Found 2 results for query with id ['blog/vector_dc44fe57-b138-41aa-af20-4859dbe1cac4', 'blog/vector_f803f1d5-cd34-4158-89d9-e379c1b2c7b5']
Found 2 results for query with id ['blog/vector_dc44fe57-b138-41aa-af20-4859dbe1cac4', 'blog/vector_f803f1d5-cd34-4158-89d9-e379c1b2c7b5']

  The maximum towing capacity of the 2022 Chevy Colorado is up to 7,700 lbs with the available GM-
exclusive Duramax® 2.8L Turbo-Diesel engine. The Crew Cab ZR2 version of the Colorado has an
additional towing capacity of up to 7,000 lbs when equipped with the available diesel engine.
Additionally, the ZR2 Bison Edition comes with 17-inch AEV-designed aluminum wheels, AEV front and
rear bumpers with recovery points, five AEV hot-stamped boron steel skid plates, AEV fender flares,
fog lamps, front and rear floor liners 

In [13]:
response = query_engine.query("What are the main differences between the three engine types available for the Chevy Colorado?")
print("\n", textwrap.fill(str(response), 100))

INFO:llama_index.vector_stores.redis:Using filters: *
Using filters: *
INFO:llama_index.vector_stores.redis:Querying index chevy_docs
Querying index chevy_docs
INFO:llama_index.vector_stores.redis:Found 2 results for query with id ['blog/vector_668f9add-14cc-4e56-b06d-22ac8d6d6c76', 'blog/vector_f803f1d5-cd34-4158-89d9-e379c1b2c7b5']
Found 2 results for query with id ['blog/vector_668f9add-14cc-4e56-b06d-22ac8d6d6c76', 'blog/vector_f803f1d5-cd34-4158-89d9-e379c1b2c7b5']

  The main differences between the three engine types available for the Chevy Colorado are the amount
of horsepower and torque, the displacement, the bore and stroke, the compression ratio, the fuel
delivery system, the max payload rating, the max trailering weight rating, and the EPA-estimated
fuel economy. Additionally, some models can be equipped with an AEV rear bumper with recovery
points, five AEV hot-stamped boron steel skid plates, AEV fender flares, a bumper with winch
provisions, front and rear floor liners w