# Improving RAG quality in LLM apps while minimizing vector search costs via summarization

### Environment setup

Let's set up our environment, which includes installing dependencies and obtaining API keys.
> We'll take a few shortcuts here; for more thorough instructions see [First steps with Pinecone DB](https://www.ninetack.io/post/first-steps-with-pinecone-db#viewer-7cp5r)

#### Install dependencies

We install `pinecone-client` along with the `openai` package, because we'll use the `text-embedding-ada-002` embedding model [from OpenAI](https://platform.openai.com/docs/guides/embeddings/what-are-embeddings).

In [3]:
! python -m pip install -qU \
    langchain==0.0.225 \
    pinecone-client==2.2.2 \
    openai==0.27.8 \
    pandas==2.0.3 \
    python-dotenv \
    tqdm

#### Environment variables

We need to set three environment variables. You can edit the code below to set them directly.

- `PINECONE_ENVIRONMENT` - The Pinecone environment where your index resides
- `PINECONE_API_KEY` - Your Pinecone API key
- `OPENAI_API_KEY` - Your OpenAI API key

If a local `.env` file exists, load the env vars from it.

In [4]:
from dotenv import load_dotenv
load_dotenv()

True

Check the environment config output below, and edit if necessary with your variables.

In [5]:
import os

print("Check environment\n---------------------")

pinecone_env = os.environ.get('PINECONE_ENVIRONMENT') or "YOUR PINECONE ENVIRONMENT"
pinecone_api_key = os.environ.get('PINECONE_API_KEY') or "YOUR PINECONE API KEY"
openai_api_key = os.environ.get('OPENAI_API_KEY') or "YOUR OPENAI API KEY"

print("pinecone_env:", pinecone_env)
print("pinecone_api_key:", pinecone_api_key[:5], "...")
print("openai_api_key:", openai_api_key[:5], "...")

Check environment
---------------------
pinecone_env: us-west4-gcp-free
pinecone_api_key: 05131 ...
openai_api_key: sk-7w ...
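The placeholder fallbacks above make a missing variable easy to overlook until the first API call fails. A minimal sketch of a fail-fast check follows; `require_env` and `mask` are hypothetical helpers introduced here for illustration and are not part of the original notebook.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


def mask(secret: str, visible: int = 5) -> str:
    """Show only the first few characters of a secret, as the cell above does."""
    return secret[:visible] + "..."
```

Calling `require_env('PINECONE_API_KEY')` instead of `os.environ.get(...) or "..."` would surface a configuration problem at setup time rather than at query time.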


Source articles:

- Before the announcement (rumors roundup): https://www.theverge.com/23678497/apple-iphone-15-news-rumors-release-date-specs-features
- After the announcement (event coverage): https://www.theverge.com/2023/9/12/23862837/iphone-15-event-apple-watch-ultra-airpods-usb-c

Outline:

- Set up the problem
  - Introduce the use case
  - Index the data with normal chunking
  - Show it (kind of) working
- Introduce summarization
  - Use an LLM to summarize larger chunks
  - Index the summaries
  - Keep the larger chunks on S3
- Results
  - Show better search outcomes
- Wrap-up and next steps

#### Initial load -- the simple way:

In [144]:
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

chunk_size = 500
chunk_overlap = chunk_size // 10

def load_document(file_path, source_tag):
  loader = TextLoader(file_path=file_path)
  doc = loader.load()[0]
  doc.metadata["source"] = source_tag
  return doc

source_files = {
  "Rumor": "./text/iphone15_rumors.txt",
  "Announcement": "./text/iphone15_announcements.txt"
}
documents = [load_document(file_path=file_path, source_tag=tag) for tag, file_path in source_files.items()]

text_splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size,
                                               chunk_overlap=chunk_overlap,
                                               separators=["\n\n", "\n", ".", " ", ""])
texts = text_splitter.split_documents(documents)
print(f"Split text from {len(source_files)} source files into {len(texts)} chunks of text")

Split text from 2 source files into 52 chunks of text
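To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch of overlapping chunking. It is not how `RecursiveCharacterTextSplitter` works internally (that class prefers to break on the listed separators); this hypothetical `sliding_window_chunks` function cuts at fixed character positions purely to show how consecutive chunks share text.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size chunks where neighbors share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
demo_chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=2)
print(demo_chunks)  # ['abcdefghij', 'ijklmnopqr', 'qrstuvwxyz']
```

Each chunk repeats the last two characters of the previous one, which is the same idea as the 50-character overlap used with 500-character chunks above.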


Create the Pinecone index. This takes a couple of minutes. We set `dimension` to `1536` because that is the length of the vectors produced by the `text-embedding-ada-002` embedding model [from OpenAI](https://platform.openai.com/docs/guides/embeddings/what-are-embeddings).

In [162]:
import pinecone

pinecone.init(api_key=pinecone_api_key, environment=pinecone_env)

In [11]:
pinecone.create_index("iphone-index", dimension=1536, metric="cosine")

IndexDescription(name='iphone-index', metric='cosine', replicas=1, dimension=1536.0, shards=1, pods=1, pod_type='p1', status={'ready': True, 'state': 'Ready'}, metadata_config=None, source_collection='')

In [163]:
pinecone.describe_index("iphone-index")

IndexDescription(name='iphone-index', metric='cosine', replicas=1, dimension=1536.0, shards=1, pods=1, pod_type='p1', status={'ready': True, 'state': 'Ready'}, metadata_config=None, source_collection='')
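The index uses `metric="cosine"`, so matches are ranked by the cosine of the angle between the query vector and each stored vector. As a rough illustration of that score (a plain-Python sketch, not Pinecone's implementation), a hypothetical `cosine_similarity` helper could look like this:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Vectors pointing in the same direction score close to 1 regardless of their length, and orthogonal vectors score 0.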

In [70]:
pinecone_index = pinecone.Index(index_name="iphone-index")

Define a function to create embeddings, and then create embeddings for text chunks.

In [147]:
import openai
openai.api_key = openai_api_key

def create_embeddings(batch: list[str]):
  model_id = 'text-embedding-ada-002'
  embedding_resp = openai.Embedding.create(input=batch, model=model_id)
  return [emb['embedding'] for emb in embedding_resp['data']]

embeddings = create_embeddings([doc.page_content for doc in texts])
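Sending all 52 chunks in one request works at this scale. For larger corpora it is common to split the inputs into batches. The sketch below assumes an embedding function with the same shape as `create_embeddings` (a list of strings in, a list of vectors out); `batched`, `embed_in_batches`, and the default batch size of 100 are illustrative choices, not values taken from the OpenAI documentation.

```python
from typing import Callable, Iterator


def batched(items: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of items, each with at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(
    inputs: list[str],
    embed_fn: Callable[[list[str]], list[list[float]]],
    batch_size: int = 100,  # illustrative default, tune for your workload
) -> list[list[float]]:
    """Call embed_fn once per batch and return the vectors in input order."""
    vectors: list[list[float]] = []
    for batch in batched(inputs, batch_size):
        vectors.extend(embed_fn(batch))
    return vectors
```

In this notebook, `embed_fn` would be the `create_embeddings` function defined above.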

Upload the embeddings to Pinecone in the `direct` namespace, storing each chunk's source tag and text as metadata.

In [148]:
to_upload = [{
    'id': f"item-{i}",
    'values': emb,
    'metadata': {
      'source': texts[i].metadata['source'],
      'text': texts[i].page_content,
    }
  } for i, emb in enumerate(embeddings)]
response = pinecone_index.upsert(vectors=to_upload, namespace="direct")
response

{'upserted_count': 52}

Create an embedding for the query string.

In [149]:
query_str = "When the iphone 15 is released, is it supposed to still have a mute button?"
query_emb = create_embeddings([query_str])[0]
len(query_emb)

1536

Run the query. As you can see, we're asking about what was actually announced, but the top match comes from the rumors document, and the small chunk is pulled out of its surrounding context.

In [150]:
response = pinecone_index.query(vector=query_emb, namespace="direct", top_k=1, include_metadata=True)

for match in response['matches']:
  print("Source:", match['metadata']['source'])
  print("Text:", match['metadata']['text'])
  print()

formatted_context = ""
for match in response['matches']:
  formatted_context += match['metadata']['text'] + "\n\n"

Source: Rumor
Text: More evidence points toward an action button on the iPhone 15 Pro:
As spotted by 9to5Mac, the iOS 17 beta 7 has a new haptic feedback pattern that makes the “phone vibrate more prominently” to signal when silent mode is on or off.
That would make sense if Apple ends up replacing the mute switch with a solid-state action button, as the vibration could help users determine which mode it's in.
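The loop above concatenates only the chunk text, so the model never sees whether a passage came from the rumors or the announcements document. One possible variation, sketched here under the assumption that each match is a dict with `metadata.source` and `metadata.text` as in the response above, is to label each passage with its source. The `format_context` helper is hypothetical and not part of the original notebook.

```python
def format_context(matches: list[dict], separator: str = "\n\n") -> str:
    """Join matched passages, prefixing each with its source tag in brackets."""
    parts = []
    for match in matches:
        metadata = match["metadata"]
        parts.append(f"[{metadata['source']}] {metadata['text'].strip()}")
    return separator.join(parts)
```

Whether the labels actually improve answers depends on the model and prompt, so treat this as something to experiment with rather than a guaranteed fix.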



Let's see how it does in answering the question.

In [151]:
from langchain import PromptTemplate

qa_template_str = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

{context}

Question: {question}
Helpful Answer:"""
qa_template = PromptTemplate(template=qa_template_str, input_variables=["context", "question"])

In [152]:
qa_prompt =  qa_template.format(context=formatted_context, question=query_str)
print("Prompt:")
print(qa_prompt)

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[{"role": "user", "content": qa_prompt}]
)

print("\n\nResponse:")
print(response['choices'][0]['message']['content'])

Prompt:
Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

More evidence points toward an action button on the iPhone 15 Pro:
As spotted by 9to5Mac, the iOS 17 beta 7 has a new haptic feedback pattern that makes the “phone vibrate more prominently” to signal when silent mode is on or off.
That would make sense if Apple ends up replacing the mute switch with a solid-state action button, as the vibration could help users determine which mode it's in.



Question: When the iphone 15 is released, is it supposed to still have a mute button?
Helpful Answer:


Response:
I don't know.


Not great. Maybe we just need to include more search results with `top_k=2`?

In [166]:
response = pinecone_index.query(vector=query_emb, namespace="direct", top_k=2, include_metadata=True)

for match in response['matches']:
  print("Source:", match['metadata']['source'])
  print("Text:", match['metadata']['text'])
  print()

formatted_context = ""
for match in response['matches']:
  formatted_context += match['metadata']['text'] + "\n\n"

Source: Rumor
Text: More evidence points toward an action button on the iPhone 15 Pro:
As spotted by 9to5Mac, the iOS 17 beta 7 has a new haptic feedback pattern that makes the “phone vibrate more prominently” to signal when silent mode is on or off.
That would make sense if Apple ends up replacing the mute switch with a solid-state action button, as the vibration could help users determine which mode it's in.

Source: Rumor
Text: The iPhone 15 might be a little dull on the outside:
We could be in for a muted iPhone color selection this year. YouTuber Jon Rettinger has gotten his hands on dummy units in what appear to be the iPhone 15 and 15 Pro’s launch colors. Those are black, white, light blue, light yellow, and a soft pink for the 15, while the 15 Pro could come in blue, dark gray, light gray, and white.



Hmm, both matches come from the rumors document, and the second one is about a "muted" color selection rather than the mute switch. Let's see what happens anyway.

In [167]:
qa_prompt =  qa_template.format(context=formatted_context, question=query_str)
print("Prompt:")
print(qa_prompt)

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[{"role": "user", "content": qa_prompt}]
)

print("\n\nResponse:")
print(response['choices'][0]['message']['content'])

Prompt:
Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

More evidence points toward an action button on the iPhone 15 Pro:
As spotted by 9to5Mac, the iOS 17 beta 7 has a new haptic feedback pattern that makes the “phone vibrate more prominently” to signal when silent mode is on or off.
That would make sense if Apple ends up replacing the mute switch with a solid-state action button, as the vibration could help users determine which mode it's in.

The iPhone 15 might be a little dull on the outside:
We could be in for a muted iPhone color selection this year. YouTuber Jon Rettinger has gotten his hands on dummy units in what appear to be the iPhone 15 and 15 Pro’s launch colors. Those are black, white, light blue, light yellow, and a soft pink for the 15, while the 15 Pro could come in blue, dark gray, light gray, and white.



Question: When the iphone 15 is released, is 

A little better, but it doesn't really inspire confidence. Let's try a different approach.

### Alternate approach -- summaries

In [153]:
documents

[Document(page_content="In September, Apple is widely expected to reveal its iPhone 15 lineup of flagship smartphones. Like last year, 2023’s range is expected to consist of four models: the base iPhone 15, an iPhone 15 Max, an iPhone 15 Pro, and an iPhone 15 Pro Max. There’s a chance that the latter device might be branded as the iPhone 15 Ultra, although a more recent report suggests this branding might not be used until 2024.\n2023 is shaping up to be an interesting year for Apple’s bestselling device. Rumors suggest the company could finally make the switch from Lightning to USB-C while the Dynamic Island, which was exclusive to the Pro models in 2022, might trickle down to the non-Pro iPhone 15 and iPhone 15 Max.\nRead on for all our coverage of the latest leaks and rumors about this year’s iPhones.\n\n\nThe iPhone 15 might be a little dull on the outside:\nWe could be in for a muted iPhone color selection this year. YouTuber Jon Rettinger has gotten his hands on dummy units in wh

Let's chunk our documents using a larger chunk size.

In [154]:
# create large chunks of source text

large_chunk_text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000,
                                               chunk_overlap=100,
                                               separators=["\n\n", "\n", ".", " ", ""])
large_chunks = large_chunk_text_splitter.split_documents(documents)
large_chunks

[Document(page_content='In September, Apple is widely expected to reveal its iPhone 15 lineup of flagship smartphones. Like last year, 2023’s range is expected to consist of four models: the base iPhone 15, an iPhone 15 Max, an iPhone 15 Pro, and an iPhone 15 Pro Max. There’s a chance that the latter device might be branded as the iPhone 15 Ultra, although a more recent report suggests this branding might not be used until 2024.\n2023 is shaping up to be an interesting year for Apple’s bestselling device. Rumors suggest the company could finally make the switch from Lightning to USB-C while the Dynamic Island, which was exclusive to the Pro models in 2022, might trickle down to the non-Pro iPhone 15 and iPhone 15 Max.\nRead on for all our coverage of the latest leaks and rumors about this year’s iPhones.\n\n\nThe iPhone 15 might be a little dull on the outside:\nWe could be in for a muted iPhone color selection this year. YouTuber Jon Rettinger has gotten his hands on dummy units in wh

Then for each large chunk, let's create a summarized version. 

In [128]:
from langchain import PromptTemplate

create_summary_prompt = """Create a short summary of the block of text below.

Text:
------------------------------------------
{text}
------------------------------------------

Your summary:"""
prompt_template = PromptTemplate(input_variables=["text"], template=create_summary_prompt)


In [155]:
from langchain.docstore.document import Document

summary_documents = []
for doc in large_chunks:
  to_summarize = doc.page_content
  print("--- Summarizing chunk: -------------")
  print(f"{to_summarize[0:40]}... ({len(to_summarize)}) total length")
  response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt_template.format(text=to_summarize)}]
  )
  summary = response['choices'][0]['message']['content']
  summary_documents.append(Document(page_content=summary, metadata=doc.metadata))
  print("--- Summary: -----------------------")
  print(summary, "\n")


--- Summarizing chunk: -------------
In September, Apple is widely expected t... (1622) total length
--- Summary: -----------------------
In September, Apple is expected to announce the iPhone 15 lineup consisting of four models: iPhone 15, iPhone 15 Max, iPhone 15 Pro, and iPhone 15 Pro Max or Ultra. There are rumors of a switch from Lightning to USB-C and the possibility of the Dynamic Island feature being included in the non-Pro models. There may be a limited color selection for the iPhone 15 and 15 Pro, with pastel options and titanium frames rumored. Production issues have been reported due to the use of titanium frames. 

--- Summarizing chunk: -------------
This might be the iPhone 15’s USB-C port... (1720) total length
--- Summary: -----------------------
The rumored iPhone 15 may have a USB-C port in light blue and green colors, according to leaked images. However, the presence of a green USB-C port does not necessarily mean there will be a green iPhone at launch. The iPhone 1
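The summarization loop makes one API call per large chunk, so a single transient failure would abort the whole run. A generic retry wrapper with exponential backoff is one way to make it more forgiving. This is a sketch under stated assumptions: `with_retries` is a hypothetical helper, it retries on any exception, and the attempt count and delays are arbitrary.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    fn: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn until it succeeds, doubling the wait after each failed attempt."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts, surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

In the loop above, the `openai.ChatCompletion.create(...)` call could be wrapped in a `lambda` and passed as `fn`. In real use you would likely narrow the `except` clause to the error types you consider retryable.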

Now let's create an embedding for each summary and upload the embeddings to Pinecone in the `summaries` namespace.

Note that we're going to prepend each summary with either "`Rumor:`" or "`Announcement:`". This helps semantic search distinguish queries about rumors from queries about announcements.


In [156]:
to_embed = [f"{doc.metadata['source']}: {doc.page_content}" for doc in summary_documents]
to_embed


['Rumor: In September, Apple is expected to announce the iPhone 15 lineup consisting of four models: iPhone 15, iPhone 15 Max, iPhone 15 Pro, and iPhone 15 Pro Max or Ultra. There are rumors of a switch from Lightning to USB-C and the possibility of the Dynamic Island feature being included in the non-Pro models. There may be a limited color selection for the iPhone 15 and 15 Pro, with pastel options and titanium frames rumored. Production issues have been reported due to the use of titanium frames.',
 'Rumor: The rumored iPhone 15 may have a USB-C port in light blue and green colors, according to leaked images. However, the presence of a green USB-C port does not necessarily mean there will be a green iPhone at launch. The iPhone 15 Pro is expected to come in space black, silver, a new gray color resembling titanium, and dark blue, with no gold or red option. The iPhone 15 non-Pro may be available in black, green, blue, yellow, and pink. There is also evidence suggesting that the iPho

In [157]:
summary_embeddings = create_embeddings(to_embed)

In [158]:
to_upload = [{
    'id': f"item-{i}",
    'values': summary_embeddings[i],
    'metadata': {
      'source': summary_doc.metadata['source'],
      'source_id': f"{i}",
    }
  } for i, summary_doc in enumerate(summary_documents)]
response = pinecone_index.upsert(vectors=to_upload, namespace="summaries")
response

{'upserted_count': 11}

Now let's re-run our query against the `summaries` namespace and see what comes back. When we find a match, we use its `source_id` to look up the original large chunk and put that text, rather than the summary, in the prompt that answers the user's question.

In [160]:
response = pinecone_index.query(vector=query_emb, namespace="summaries", top_k=1, include_metadata=True)

for match in response['matches']:
  print("Source:", match['metadata']['source'])
  print("Source ID:", int(match['metadata']['source_id']))
  print()

context = ""
for match in response['matches']:
  context += large_chunks[int(match['metadata']['source_id'])].page_content + "\n\n"

print("Original large context:")
print(context)

Source: Rumor
Source ID: 3

Original large context:
iPhone 15 Pro might get a titanium frame, thinner bezels, and a price hike:
Some big changes are expected to come to this year’s iPhone 15 Pro lineup — but they might come alongside a price hike, too. In Bloomberg this morning, reporter Mark Gurman confirmed a handful of details that have been floating around all year about what to expect when the next iPhone lineup is announced in just over a month.
The new Pro models will both come with titanium frames, instead of stainless steel, making them stronger and lighter, according to Gurman. Their screens will also have thinner bezels, thanks to a new display technology, shrinking the size of the black border by about a third. (Earlier leaks show what that might look like.) And as previously reported, expect the mute switch to be swapped out for a customizable button and the Lightning port to be replaced by USB-C.


The iPhone 15 Pro’s rumored action button sounds pretty useful:
The upcomi
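The lookup above indexes into the in-memory `large_chunks` list, which only works while the notebook session is alive. In a deployed app, the original chunks would live in external storage keyed by `source_id`. The sketch below uses a dictionary as a stand-in for that storage; `ChunkStore` and `build_context` are hypothetical names, and a real version would swap the dictionary for calls to whatever object store you use.

```python
class ChunkStore:
    """In-memory stand-in for external storage of the original large chunks."""

    def __init__(self) -> None:
        self._chunks: dict[str, str] = {}

    def put(self, source_id: str, text: str) -> None:
        self._chunks[source_id] = text

    def get(self, source_id: str) -> str:
        return self._chunks[source_id]


def build_context(matches: list[dict], store: ChunkStore, separator: str = "\n\n") -> str:
    """Swap each matched summary for its original chunk, skipping repeated ids."""
    ordered_ids: list[str] = []
    for match in matches:
        source_id = match["metadata"]["source_id"]
        if source_id not in ordered_ids:
            ordered_ids.append(source_id)
    return separator.join(store.get(source_id) for source_id in ordered_ids)
```

The vector index then only needs to hold the summary embedding and a small `source_id`, while the full text is fetched on demand.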

In [161]:
qa_prompt =  qa_template.format(context=context, question=query_str)
print("Prompt:")
print(qa_prompt)

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[{"role": "user", "content": qa_prompt}]
)

print("\n\nResponse:")
print(response['choices'][0]['message']['content'])

Prompt:
Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

iPhone 15 Pro might get a titanium frame, thinner bezels, and a price hike:
Some big changes are expected to come to this year’s iPhone 15 Pro lineup — but they might come alongside a price hike, too. In Bloomberg this morning, reporter Mark Gurman confirmed a handful of details that have been floating around all year about what to expect when the next iPhone lineup is announced in just over a month.
The new Pro models will both come with titanium frames, instead of stainless steel, making them stronger and lighter, according to Gurman. Their screens will also have thinner bezels, thanks to a new display technology, shrinking the size of the black border by about a third. (Earlier leaks show what that might look like.) And as previously reported, expect the mute switch to be swapped out for a customizable button and 

Much better!

So not only are we getting better output, we're also using significantly less vector storage to do it: the `summaries` namespace holds 11 vectors, compared with 52 in the `direct` namespace.
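As a back-of-the-envelope check, the upsert counts reported earlier (52 vectors in `direct`, 11 in `summaries`) can be turned into raw vector sizes. The calculation assumes each of the 1536 dimensions is stored as a 4-byte float and ignores metadata and index overhead, so the byte totals are only indicative.

```python
DIMENSION = 1536       # vector length used for the index above
BYTES_PER_FLOAT = 4    # assumption: values stored as 32-bit floats


def vector_bytes(vector_count: int) -> int:
    """Raw size of the vector values alone, excluding metadata and overhead."""
    return vector_count * DIMENSION * BYTES_PER_FLOAT


direct_bytes = vector_bytes(52)
summary_bytes = vector_bytes(11)
reduction = 1 - summary_bytes / direct_bytes

print(f"direct:    {direct_bytes:,} bytes")   # 319,488 bytes
print(f"summaries: {summary_bytes:,} bytes")  # 67,584 bytes
print(f"reduction: {reduction:.1%}")          # 78.8%
```

The ratio depends only on the vector counts, so the roughly 79% reduction holds whatever the per-value storage size turns out to be.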

### Cleaning up

Selectively run these as needed to clean up and start a section over, or to remove the index completely when you're done.

In [145]:
pinecone_index.delete(delete_all=True, namespace="direct")

{}

In [146]:
pinecone_index.delete(delete_all=True, namespace="summaries")

{}

In [7]:
pinecone.delete_index("iphone-index")