# Demo - RAG with LlamaIndex

This demo shows how to use LlamaIndex to load local files, pass them to an OpenAI model, and ask questions about your data. Its purpose is to show that an OpenAI model on its own has no information about your private data and therefore cannot answer questions about it. LlamaIndex helps connect your private data to OpenAI models using retrieval-augmented generation (RAG) and different types of document indexing.

Install requirements

In [2]:
%pip install llama-index --quiet
%pip install openai --quiet
%pip install docx2txt --quiet
%pip install pypdf --quiet
# using s3fs to get public bucket contents
%pip install fs_s3fs --quiet
%pip install s3fs --quiet

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
spyder 5.3.3 requires pyqt5<5.16, which is not installed.
spyder 5.3.3 requires pyqtwebengine<5.16, which is not installed.
distributed 2022.7.0 requires tornado<6.2,>=6.0.3, but you have tornado 6.3.3 which is incompatible.
notebook 6.5.6 requires jupyter-client<8,>=5.3.4, but you have jupyter-client 8.4.0 which is incompatible.
notebook 6.5.6 requires pyzmq<25,>=17, but you have pyzmq 25.1.1 which is incompatible.
panel 0.13.1 requires bokeh<2.5.0,>=2.4.0, but you have bokeh 3.3.0 which is incompatible.
sagemaker-datawrangler 0.4.3 requires sagemaker-data-insights==0.4.0, but you have sagemaker-data-insights 0.3.3 which is incompatible.
sparkmagic 0.20.4 requires nest-asyncio==1.5.5, but you have nest-asyncio 1.5.8 which is incompatible.
spyder 5.3.3 requires ipython<8.0.0,>=7.31.1, but you have ipython 8.1

In [None]:

# %pip install awscli --quiet

Set up the OpenAI API key as an environment variable

In [3]:
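from openai import OpenAI
import os
import s3fs  # so we can access public S3 buckets

api_key = ""  # paste your OpenAI API key here

# Expose the key as an environment variable so LlamaIndex can pick it up
os.environ['OPENAI_API_KEY'] = api_key

# openai>=1.0 uses a client object instead of the module-level openai.api_key
client = OpenAI(api_key=api_key)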
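
Hardcoding the key in a notebook makes it easy to leak. A minimal sketch of an alternative, assuming the key is either already exported as `OPENAI_API_KEY` or typed in interactively; the helper name `resolve_api_key` is hypothetical and not part of the OpenAI or LlamaIndex libraries:

```python
import os
from getpass import getpass


def resolve_api_key(env=os.environ, prompt_fn=getpass):
    """Return the OpenAI API key from the environment, prompting only if it is missing."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        # Hypothetical fallback: ask interactively instead of hardcoding the key.
        key = prompt_fn("OpenAI API key: ").strip()
    if not key:
        raise ValueError("No OpenAI API key provided")
    return key
```

The result could then be assigned to `api_key` in the cell above, for example `api_key = resolve_api_key()`.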

Download Private-Data locally using the following command:

In [4]:
#old
# ! aws s3 cp s3://webage-genai-data/Private-Data/ Private-Data --recursive

!wget https://btcampdata.s3.amazonaws.com/Private-Data.zip
!unzip Private-Data.zip

--2023-11-30 21:02:08--  https://btcampdata.s3.amazonaws.com/Private-Data.zip
Resolving btcampdata.s3.amazonaws.com (btcampdata.s3.amazonaws.com)... 52.219.102.35, 52.219.107.28, 52.219.106.204, ...
Connecting to btcampdata.s3.amazonaws.com (btcampdata.s3.amazonaws.com)|52.219.102.35|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 96732 (94K) [application/zip]
Saving to: ‘Private-Data.zip’


2023-11-30 21:02:08 (1.01 MB/s) - ‘Private-Data.zip’ saved [96732/96732]

Archive:  Private-Data.zip
  inflating: Private-Data/CV2.pdf    
  inflating: Private-Data/CV1.pdf    


## Check OpenAI model

Check how the OpenAI model answers a specific question about our private data.

In [5]:

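def get_completion(prompt, model="gpt-3.5-turbo"):
    """Send a single user message to the chat model and return the reply message."""
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0,  # keep answers as deterministic as possible
    )
    # Returns a ChatCompletionMessage; use .content for just the text
    return response.choices[0].message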

The question we will ask:

In [6]:
prompt = "For what companies did Susan work in the past?"

In [7]:
response = get_completion(prompt)

print(response)

ChatCompletionMessage(content="I'm sorry, but as an AI language model, I don't have access to personal data about individuals unless it has been shared with me in the course of our conversation. I can't provide information about Susan's past work or any other personal details unless they have been explicitly provided to me.", role='assistant', function_call=None, tool_calls=None)


**We see that ChatGPT does not have information about Susan at all.**

## Connect custom data sources to your LLM.

We can use RAG to load our own data and feed our documents to the LLM as context.

LlamaIndex comes with many ready-made readers for sources such as databases, Discord, Slack, Google Docs, Notion, GitHub repos, etc.

Full list can be found here: https://llamahub.ai/
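
In outline, RAG retrieves the text chunks most relevant to a question and places them in the prompt ahead of the question. The sketch below illustrates only that prompt-assembly step. The function `build_rag_prompt` and its wording are assumptions made for illustration, not LlamaIndex's actual prompt template:

```python
def build_rag_prompt(question, retrieved_chunks):
    """Combine retrieved text chunks and the user question into one prompt."""
    # Separate chunks visibly so the model can tell where each one ends.
    context = "\n---\n".join(chunk.strip() for chunk in retrieved_chunks)
    return (
        "Context information is below.\n"
        f"{context}\n"
        "Given the context information and not prior knowledge, "
        f"answer the question: {question}"
    )


# Placeholder chunks stand in for text retrieved from your own documents.
example_prompt = build_rag_prompt(
    "For what companies did Susan work in the past?",
    ["<text of retrieved chunk 1>", "<text of retrieved chunk 2>"],
)
```

The resulting string is what would be sent to the model in place of the bare question.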

**SimpleDirectoryReader**

`SimpleDirectoryReader` is used for reading data locally.
To use it, pass in an input directory or a list of files. Our **Private-Data** folder contains a couple of CVs as PDF files.

`SimpleDirectoryReader` selects the appropriate file reader (for CSV, PDF, etc.) based on the file extension.
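
The extension-based selection can be pictured as a lookup table keyed by file suffix. This is a simplified sketch of the idea, not the library's implementation. The mapping `READERS_BY_SUFFIX` and the reader labels are hypothetical:

```python
from pathlib import Path

# Hypothetical reader labels; the real reader classes live inside LlamaIndex.
READERS_BY_SUFFIX = {
    ".pdf": "pdf reader",
    ".csv": "csv reader",
    ".docx": "docx reader",
}


def pick_reader(path, default="plain-text reader"):
    """Choose a reader label from the file extension, case-insensitively."""
    suffix = Path(path).suffix.lower()
    return READERS_BY_SUFFIX.get(suffix, default)


def plan_directory(paths):
    """Map each file path to the reader that would handle it."""
    return {str(p): pick_reader(p) for p in paths}


plan = plan_directory(["Private-Data/CV1.pdf", "Private-Data/CV2.pdf"])
```

Files with an unrecognized extension fall back to the default reader in this sketch.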

Load CV files using SimpleDirectoryReader

In [8]:
from llama_index import TreeIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader('Private-Data').load_data()

This demo assumes the LlamaIndex default LLM is the OpenAI GPT-3 **text-davinci-003** model; the actual default depends on the installed LlamaIndex version.

Create **TreeIndex** using loaded documents

In [9]:
index = TreeIndex.from_documents(documents)

Creating query engine

In [10]:
query_engine = index.as_query_engine()

Getting response

In [11]:
prompt = "For what companies did Susan work in the past?"
response = query_engine.query(prompt)

print(response)

Susan worked for Luxury Car Center and Japan Car Center in the past.


If the model did not know the answer, we could try a different type of indexing, change the OpenAI model, or make the prompt more specific.

Let's try a different OpenAI model first.

### Changing the underlying LLM / Embeddings

Sometimes you want to use a different LLM than the default one.

In this example, we use a **gpt-3.5-turbo** variant (**gpt-3.5-turbo-16k-0613**) instead of **text-davinci-003**. Other available models include gpt-3.5-turbo-16k, gpt-4, gpt-4-32k, text-davinci-003, and text-davinci-002.

In [12]:
from llama_index.llms import OpenAI
from llama_index import ServiceContext

Define a new LLM

In [13]:
llm = OpenAI(temperature=0.1, model="gpt-3.5-turbo-16k-0613")

Create the **service_context**

The ServiceContext is a bundle of commonly used resources used during the indexing and querying stage in a LlamaIndex pipeline/application. You can use it to set the global configuration, as well as local configurations at specific parts of the pipeline.

In [14]:
service_context = ServiceContext.from_defaults(llm=llm)

Create the **TreeIndex** again

In [15]:
from llama_index import TreeIndex

index = TreeIndex.from_documents(documents, service_context=service_context)

Start query engine

In [16]:
query_engine = index.as_query_engine()

Run the same prompt as before.

In [17]:
response = query_engine.query(prompt)
print(response)

Susan worked for Luxury Car Center and Japan Car Center in the past.


Let's try another type of indexing

In [18]:
from llama_index import VectorStoreIndex

index = VectorStoreIndex.from_documents(documents, service_context=service_context)

In [19]:
query_engine = index.as_query_engine()
response = query_engine.query(prompt)

print(response)

Susan worked for LUXURY CAR CENTER and JAPAN CAR CENTER in the past.


If the model still does not know the answer, we can make the prompt more specific. Here the model already answers correctly, so the prompt is left unchanged:

In [20]:
prompt = "For what companies did Susan work in the past?"

In [21]:
response = query_engine.query(prompt)
print(response)

Susan worked for LUXURY CAR CENTER and JAPAN CAR CENTER in the past.


### Try another prompt

In [22]:
prompt2 = "Give me two names for software engineering position?"

In [23]:
response = query_engine.query(prompt2)
print(response)

Christopher Morgan, Senior Web Developer
Susan Williams, Store Manager


### Let's try another type of indexing to see the answer

### GPTVectorStoreIndex

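GPTVectorStoreIndex creates numerical vectors (embeddings) from the text and retrieves the relevant chunks based on the similarity between their vectors and the query vector. In this version of LlamaIndex it is the legacy name for `VectorStoreIndex`, so the answers are expected to match those above.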
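
To make similarity-based retrieval concrete, here is a toy sketch. It assumes word counts as a stand-in for a real embedding model, which would return dense vectors instead. The functions `embed`, `cosine`, and `top_k` are hypothetical helpers, not LlamaIndex APIs:

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts stand in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k(query, chunks, k=1):
    """Return the k chunks whose vectors are most similar to the query vector."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

Only the top-ranked chunks are sent to the LLM as context, which keeps the prompt small even when the document collection is large.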

In [24]:
from llama_index import GPTVectorStoreIndex

index2 = GPTVectorStoreIndex.from_documents(documents, service_context=service_context)
query_engine2 = index2.as_query_engine()

In [25]:
response = query_engine2.query(prompt)
print(response)

Susan worked for LUXURY CAR CENTER and JAPAN CAR CENTER in the past.


In [26]:
response = query_engine2.query(prompt2)
print(response)

Christopher Morgan, Senior Web Developer
Susan Williams, Store Manager


### GPTListIndex

The GPTListIndex is a good fit when you don’t have many documents. Instead of trying to find the relevant data, the index concatenates all chunks and sends them all to the LLM. If the resulting text is too long, the index splits the text and asks the LLM to refine the answer.

Since we do not have many documents, let's see the results!
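
The concatenate-then-refine behaviour can be sketched as follows. This is an illustration under assumptions, not the library's code: `llm` is any callable that takes a prompt string and returns text, the character limit stands in for a token limit, and the prompt wording is made up:

```python
def split_into_batches(chunks, max_chars):
    """Greedily pack chunks into batches whose combined chunk length stays
    within max_chars (a single oversized chunk still gets its own batch)."""
    batches, current, size = [], [], 0
    for chunk in chunks:
        if current and size + len(chunk) > max_chars:
            batches.append("\n".join(current))
            current, size = [], 0
        current.append(chunk)
        size += len(chunk)
    if current:
        batches.append("\n".join(current))
    return batches


def answer_with_refine(question, chunks, llm, max_chars=2000):
    """Ask once with the first batch, then ask the LLM to refine with each later batch."""
    answer = None
    for batch in split_into_batches(chunks, max_chars):
        if answer is None:
            prompt = f"Context:\n{batch}\n\nQuestion: {question}"
        else:
            prompt = (
                f"Existing answer: {answer}\nNew context:\n{batch}\n\n"
                f"Refine the answer to: {question}"
            )
        answer = llm(prompt)
    return answer
```

Every chunk reaches the LLM, so the number of LLM calls grows with the size of the collection.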

In [27]:
from llama_index.indices.list import GPTListIndex

index3 = GPTListIndex.from_documents(documents, service_context=service_context)
query_engine3 = index3.as_query_engine()

In [28]:
response = query_engine3.query(prompt)
print(response)

Susan worked for Luxury Car Center and Japan Car Center in the past.


In [29]:
response = query_engine3.query(prompt2)
print(response)

Christopher Morgan
Susan Williams


### GPTKeywordTableIndex

The GPTKeywordTableIndex implementation extracts the keywords from indexed nodes and uses them to find relevant documents. When we ask a question, first, the implementation will generate keywords from the question. Next, the index searches for the relevant documents and sends them to the LLM.

**IMPORTANT**

With keyword indexing, every node is sent to the LLM to generate keywords. Sending every document to an LLM can increase the cost of indexing substantially!
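
The keyword lookup itself can be sketched without an LLM. The version below assumes a simple regex and a small hand-written stopword list in place of LLM keyword extraction. All names here are hypothetical and the chunk texts are placeholders:

```python
import re
from collections import defaultdict

# Small illustrative stopword list; a real implementation would use a fuller one.
STOPWORDS = {"the", "a", "an", "for", "did", "in", "what", "of", "and", "to", "me", "is", "who", "at"}


def extract_keywords(text):
    """Lowercase the text and keep the words that are not stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def build_keyword_table(chunks):
    """Map each keyword to the indices of the chunks that contain it."""
    table = defaultdict(set)
    for i, chunk in enumerate(chunks):
        for kw in extract_keywords(chunk):
            table[kw].add(i)
    return table


def lookup(table, question):
    """Return the sorted indices of chunks sharing at least one keyword with the question."""
    hits = set()
    for kw in extract_keywords(question):
        hits |= table.get(kw, set())
    return sorted(hits)
```

When the question shares no keyword with any chunk, the lookup returns nothing and the LLM receives no context. That is one possible reason a keyword-based query can come back empty.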

In [30]:
from llama_index.indices.keyword_table import GPTKeywordTableIndex

index4 = GPTKeywordTableIndex.from_documents(documents, service_context=service_context)
query_engine4 = index4.as_query_engine()

[nltk_data] Downloading package stopwords to /tmp/llama_index...
[nltk_data]   Unzipping corpora/stopwords.zip.


In [31]:
response = query_engine4.query(prompt)
print(response)

Empty Response


In [33]:
response = query_engine4.query(prompt2)
response

Response(response='Empty Response', source_nodes=[], metadata=None)

### Saving and Loading indexed documents

By default, data is stored in-memory. To persist to disk (under ./storage):

In [None]:
index4.storage_context.persist()


To reload from disk:

In [None]:
from llama_index import StorageContext, load_index_from_storage

# rebuild the storage context from the persisted directory
storage_context = StorageContext.from_defaults(persist_dir="./storage")
# load the index
loaded_index = load_index_from_storage(storage_context)

In [None]:
query_engine_loaded = loaded_index.as_query_engine()


response_loaded_index = query_engine_loaded.query(prompt2)
print(response_loaded_index)
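
Because building an index costs LLM and embedding calls, it can help to rebuild only when nothing has been persisted yet. A minimal sketch of that pattern, assuming `build_fn` and `load_fn` are callables you supply; `load_or_build` is a hypothetical helper, not a LlamaIndex function:

```python
from pathlib import Path


def load_or_build(persist_dir, build_fn, load_fn):
    """Reuse a persisted index when the directory has files, otherwise build one."""
    path = Path(persist_dir)
    if path.is_dir() and any(path.iterdir()):
        return load_fn(path)
    return build_fn(path)
```

In this notebook, `load_fn` could wrap the `load_index_from_storage` call shown above, and `build_fn` could create the index and persist it to the same directory.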