# Retrieval-Augmented Generation using Anthropic Claude

This notebook demonstrates how to implement retrieval-augmented generation (RAG), connecting Claude with the data in your Pinecone vector database. We will cover the following steps:

1. Setup: Install the libraries and set the Pinecone and Anthropic API keys
2. Ingestion: Embedding and upserting data into Pinecone using integrated inference
3. Retrieval: Querying a dense and a sparse index from Pinecone to retrieve results using hybrid search
4. Augmentation: Prepare the prompt
5. Generation: Using Claude to answer questions with information from the database

This notebook accompanies this [Retrieval-Augmented Generation article](https://www.pinecone.io/learn/retrieval-augmented-generation/).

## 1. Setup
First, let's install the necessary libraries and set the API keys we will need to use in this notebook.

In [None]:
%pip install -qU \
     anthropic==0.54.0 \
     pinecone==7.0.2 \
     pinecone-notebooks==0.1.1 \
     pandas==2.2.2 \
     tqdm==4.67.1

Note: you may need to restart the kernel to use updated packages.


### Get and set the Pinecone API key

We will need a free [Pinecone API key](https://docs.pinecone.io/guides/get-started/quickstart). The code below will either authenticate you and set the API key as an environment variable or will prompt you to enter the API key and then set it in the environment.

In [6]:
import os
from getpass import getpass

def get_pinecone_api_key():
    """
    Get Pinecone API key from environment variable or prompt user for input.
    Returns the API key as a string.

    Only necessary for notebooks. When using Pinecone yourself, 
    you can use environment variables or the like to set your API key.
    """
    api_key = os.environ.get("PINECONE_API_KEY")
    
    if api_key is None:
        try:
            # Try Colab authentication if available
            from pinecone_notebooks.colab import Authenticate
            Authenticate()
            # If successful, key will now be in environment
            api_key = os.environ.get("PINECONE_API_KEY")
        except ImportError:
            # If not in Colab or authentication fails, prompt user for API key
            print("Pinecone API key not found in environment.")
            api_key = getpass("Please enter your Pinecone API key: ")
            # Save to environment for future use in session
            os.environ["PINECONE_API_KEY"] = api_key
            print("Pinecone API key saved to environment.")
            
    return api_key

PINECONE_API_KEY = get_pinecone_api_key()

Pinecone API key not found in environment.


### Set the Anthropic API key

Next, we'll need to get a [Claude API key](https://docs.anthropic.com/en/api/overview). The code below will prompt you to enter it and then set it in the environment.

In [7]:
def get_anthropic_api_key():
    """
    Get Anthropic API key from environment variable or prompt user for input.
    Returns the API key as a string.
    """

    api_key = os.environ.get("ANTHROPIC_API_KEY")
    
    if api_key is None:
        try:
            api_key = getpass("Please enter your Anthropic API key: ")
            # Save to environment for future use in session
            os.environ["ANTHROPIC_API_KEY"] = api_key
        except Exception as e:
            print(f"Error getting Anthropic API key: {e}")
            return None
    
    return api_key

ANTHROPIC_API_KEY = get_anthropic_api_key()
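
Both helpers above follow the same pattern: read an environment variable, and if it is missing, prompt for the value and cache it for the rest of the session. A minimal sketch of a shared helper is shown below. The name `get_api_key` and the injectable `prompt_fn` parameter are assumptions made for illustration and are not part of either SDK.

```python
import os
from getpass import getpass
from typing import Callable


def get_api_key(env_var: str, prompt_fn: Callable[[str], str] = getpass) -> str:
    """Return the key stored in `env_var`, prompting for it once if it is missing.

    `get_api_key` and `prompt_fn` are hypothetical names used for illustration.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        api_key = prompt_fn(f"Please enter a value for {env_var}: ").strip()
        if not api_key:
            raise ValueError(f"No value provided for {env_var}")
        # Cache the key so later cells and SDK clients can read it from the environment
        os.environ[env_var] = api_key
    return api_key
```

Passing a custom `prompt_fn` (for example a lambda that returns a fixed string) makes the helper easy to exercise without an interactive prompt.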

## 2. Ingestion

### Download the dataset
Now let's download the Amazon products dataset, which contains over 10,000 Amazon product descriptions, and load it into a DataFrame.

In [9]:

!curl -0 "https://www-cdn.anthropic.com/48affa556a5af1de657d426bcc1506cdf7e2f68e/amazon-products.jsonl" > amazon-products.jsonl


  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 5559k  100 5559k    0     0   508k      0  0:00:10  0:00:10 --:--:--  517k


In [10]:
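import json

import pandas as pd

data = []
with open('amazon-products.jsonl', 'r') as file:
    for line in file:
        try:
            # Each line of a JSONL file is one JSON object; parse it without eval()
            data.append(json.loads(line))
        except json.JSONDecodeError:
            # Skip malformed lines
            continue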

df = pd.DataFrame(data)
display(df.head())
len(df)

Unnamed: 0,text
0,Product Name: DB Longboards CoreFlex Crossbow ...
1,Product Name: Electronic Snap Circuits Mini Ki...
2,Product Name: 3Doodler Create Flexy 3D Printin...
3,Product Name: Guillow Airplane Design Studio w...
4,Product Name: Woodstock- Collage 500 pc Puzzle...


10002

### Pinecone vector database and integrated inference

We'll use hybrid search to implement retrieval. We'll create separate dense and sparse indexes, upsert dense vectors into the dense index and sparse vectors into the sparse index, and search each index separately. Then we'll combine and deduplicate the results, use one of Pinecone’s hosted reranking models to rerank them based on a unified relevance score, and then return the most relevant matches.

We'll use integrated inference, so when creating the indexes, we'll specify a Pinecone-hosted model to use for embedding queries and documents. Pinecone handles the embedding for us,
so we can pass it text directly. Learn more about hybrid search [here](https://docs.pinecone.io/guides/search/hybrid-search) and integrated inference [here](https://docs.pinecone.io/guides/index-data/indexing-overview#integrated-embedding).
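
To make the dense/sparse distinction concrete, here is a toy sketch of the two scoring styles. The index stats later in this notebook report the `cosine` metric for the dense index and `dotproduct` for the sparse index. Everything else below is an assumption for illustration: the vectors, token names, and weights are invented, and real sparse embeddings use numeric token indices rather than words.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Dense scoring: compare the direction of two fixed-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def sparse_dot_product(a: dict[str, float], b: dict[str, float]) -> float:
    """Sparse scoring: only tokens present in both vectors contribute."""
    return sum(weight * b[token] for token, weight in a.items() if token in b)


# Invented 3-dimensional dense vectors (the dense index here has 1024 dimensions)
query_dense = [0.9, 0.1, 0.0]
doc_dense = [0.8, 0.2, 0.1]

# Invented token weights standing in for a sparse embedding
query_sparse = {"science": 1.2, "gift": 0.8}
doc_sparse = {"science": 1.0, "kit": 0.7}

print(round(cosine_similarity(query_dense, doc_dense), 3))
print(sparse_dot_product(query_sparse, doc_sparse))  # only "science" overlaps
```

The dense score rewards overall similarity in meaning, while the sparse score is driven entirely by shared terms, which is why the two searches can surface different products for the same question.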

Now, we can initialize the dense index, using the `llama-text-embed-v2` embedding model.

In [11]:
from pinecone import Pinecone

pc = Pinecone()

dense_index_name = "traditional-rag-with-claude-dense"
if not pc.has_index(dense_index_name):
    pc.create_index_for_model(
        name=dense_index_name,
        cloud="aws",
        region="us-east-1",
        # Chunk text will be the field we embed from our documents
        embed={
            "model":"llama-text-embed-v2",
            "field_map":{"text": "chunk_text"}
        }
    )

dense_index = pc.Index(dense_index_name)
dense_index.describe_index_stats()


{'dimension': 1024,
 'index_fullness': 0.0,
 'metric': 'cosine',
 'namespaces': {},
 'total_vector_count': 0,
 'vector_type': 'dense'}

Next, we'll initialize the sparse index, using the `pinecone-sparse-english-v0` sparse embedding model.

In [12]:
sparse_index_name = "traditional-rag-with-claude-sparse"
if not pc.has_index(sparse_index_name):
    pc.create_index_for_model(
        name=sparse_index_name,
        cloud="aws",
        region="us-east-1",
        # Chunk text will be the field we embed from our documents
        embed={
            "model":"pinecone-sparse-english-v0",
            "field_map":{"text": "chunk_text"}
        }
    )

sparse_index = pc.Index(sparse_index_name)
sparse_index.describe_index_stats()


{'index_fullness': 0.0,
 'metric': 'dotproduct',
 'namespaces': {},
 'total_vector_count': 0,
 'vector_type': 'sparse'}

We should see that the two new Pinecone indexes both have a `total_vector_count` of 0, as we haven't added any vectors yet.

### Embedding and upserting data to Pinecone 

With our indexes set up, we can now take our product descriptions, embed them, and upsert them to each index.

In [13]:
from tqdm import tqdm

descriptions = df["text"].tolist()
batch_size = 96  # how many embeddings we create and insert at once via integrated embedding

def upsert_and_embed_into_index(index, namespace, descriptions, batch_size=96):
    for i in tqdm(range(0, len(descriptions), batch_size)):
        # Iterate over descriptions in batches of 96, the max number of docs that can be embedded each call
        # find end of batch
        i_end = min(len(descriptions), i+batch_size)
        descriptions_batch = descriptions[i:i_end]

        records = [
            {
                "id": f"description_{i+idx}",
                "chunk_text": description,
            }
            for idx, description in enumerate(descriptions_batch)
        ]

        # embed and upsert into Pinecone. This operation does both!
        index.upsert_records(namespace=namespace, records=records)


# We specify a namespace to upsert our data into.
# We only upsert if the index holds fewer vectors than we have descriptions
# (this also covers the case where the index is empty).
if dense_index.describe_index_stats()["total_vector_count"] < len(descriptions):
    upsert_and_embed_into_index(dense_index, "amz-products", descriptions, batch_size)

if sparse_index.describe_index_stats()["total_vector_count"] < len(descriptions):
    upsert_and_embed_into_index(sparse_index, "amz-products", descriptions, batch_size)


100%|██████████| 105/105 [01:24<00:00,  1.24it/s]
100%|██████████| 105/105 [00:41<00:00,  2.51it/s]


{'index_fullness': 0.0,
 'metric': 'dotproduct',
 'namespaces': {'amz-products': {'vector_count': 9792}},
 'total_vector_count': 9792,
 'vector_type': 'sparse'}
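
The progress bars above show 105 iterations because 10,002 descriptions split into batches of 96 give 104 full batches plus one final partial batch. A minimal sketch of the same batching arithmetic follows, using a hypothetical `iter_batches` helper and placeholder strings in place of the real descriptions.

```python
import math
from typing import Iterator


def iter_batches(items: list[str], batch_size: int) -> Iterator[tuple[int, list[str]]]:
    """Yield (start_offset, batch) pairs; the last batch may be shorter."""
    for start in range(0, len(items), batch_size):
        yield start, items[start:start + batch_size]


# Placeholder strings standing in for the real product descriptions
placeholder_descriptions = [f"product {n}" for n in range(10002)]
batches = list(iter_batches(placeholder_descriptions, batch_size=96))

print(len(batches))            # number of upsert calls
print(len(batches[-1][1]))     # size of the final, partial batch
print(math.ceil(10002 / 96))   # same count, computed directly
```

The start offset of each batch plays the same role as `i` in the upsert function: adding it to the position within the batch keeps record IDs unique across batches.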

In [17]:
print("dense index stats:")
dense_index.describe_index_stats()

dense index stats:


{'dimension': 1024,
 'index_fullness': 0.0,
 'metric': 'cosine',
 'namespaces': {'amz-products': {'vector_count': 10002}},
 'total_vector_count': 10002,
 'vector_type': 'dense'}

In [18]:
print("sparse index stats:")
sparse_index.describe_index_stats()

sparse index stats:


{'index_fullness': 0.0,
 'metric': 'dotproduct',
 'namespaces': {'amz-products': {'vector_count': 10002}},
 'total_vector_count': 10002,
 'vector_type': 'sparse'}
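
The stats printed immediately after the upsert show 9,792 sparse vectors (exactly 102 batches of 96), while the later call shows all 10,002. This is consistent with recently upserted records taking a moment to show up in the index stats. A minimal sketch of a polling helper is shown below. `wait_for_count` is a hypothetical name, not part of the Pinecone SDK, and it takes a caller-supplied `get_count` callable. In this notebook that could be `lambda: dense_index.describe_index_stats()["total_vector_count"]`.

```python
import time
from typing import Callable


def wait_for_count(
    get_count: Callable[[], int],
    expected: int,
    timeout_s: float = 60.0,
    poll_interval_s: float = 2.0,
) -> int:
    """Poll `get_count` until it reaches `expected` or the timeout elapses.

    Returns the last observed count. `wait_for_count` is a hypothetical helper,
    not part of the Pinecone SDK.
    """
    deadline = time.monotonic() + timeout_s
    count = get_count()
    while count < expected and time.monotonic() < deadline:
        time.sleep(poll_interval_s)
        count = get_count()
    return count
```

Returning the last observed count, rather than raising on timeout, leaves it to the caller to decide whether a partially visible index is acceptable before querying.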

## 3. Retrieval using hybrid search

### Perform semantic search

With our indexes populated, we can start making queries to get results.

We'll first query the dense index to find the 5 records most semantically related to the natural language query. Because the index is integrated with an embedding model, you provide the query as text and Pinecone converts the text to a dense vector automatically.

In [74]:
USER_QUESTION = "I want to get my daughter more interested in science. What kind of gifts should I get her?"

def search_dense_index(question):
    # search embeds the query text and queries the Pinecone index in one step
    dense_results = dense_index.search(
        namespace="amz-products",
        query={
            # specifies number of results to return
            "top_k": 5,
            # specifies the query to embed and search for
            "inputs": {
                "text": question
            }
        }
    )

    return dense_results["result"]["hits"]

dense_results = search_dense_index(USER_QUESTION)

for num, result in enumerate(dense_results):
    # Return result and score
    print(f"Result {num+1}:")
    print(result["_id"])
    print(result["fields"]["chunk_text"], result["_score"])
    print("\n")



Result 1:
description_5128
Product Name: Hey! Play! Kids Science Kit-Lab Set to Create Solutions, Litmus Paper, & More-Great Fun & Educational Stem Learning Activity for Boys & Girls

About Product: Hands on learning- equipped with 4 test tubes and a holding rack, 2 beakers, dropper, measuring spoon, funnel, 3 grams of purple sweet potato powder and 10 sheets of paper filter, This is an excellent Basic starter science kit for kids! | Uses household items- The items needed for experiments that are not included with the kit are everyday items, that are easily found around the house, like scissors, plastic wrap, vinegar, baking soda, and water. | Stem activity- The science kit is a fantastic STEM (science, technology, engineering, Math) learning toy that will help your kids understand the concepts of mixing substances like acid and alkaline liquids and making things like litmus paper. | Hours of fun- this set is a wonderful gift for birthdays, holidays, or any occasion! Your little girl o

### Perform lexical search

Now we'll perform a lexical search by searching the sparse index for the 5 records that most closely match the words in the query. This is an advanced form of keyword search that uses the `pinecone-sparse-english-v0` sparse embedding model, which is optimized for keyword search.

In [75]:
def search_sparse_index(question):
    # search embeds the query text and queries the Pinecone index in one step
    sparse_results = sparse_index.search(
        namespace="amz-products",
        query={
            # specifies number of results to return
            "top_k": 5,
            # specifies the query to embed and search for
            "inputs": {
                "text": question
            }
        }
    )

    return sparse_results["result"]["hits"]

sparse_results = search_sparse_index(USER_QUESTION)

for num, result in enumerate(sparse_results):
    # Return result and score
    print(f"Result {num+1}:")
    print(result["_id"])
    print(result["fields"]["chunk_text"], result["_score"])
    print("\n")



Result 1:
description_7342
Product Name: Be Amazing! Toys Get Slimed! Science Kit

About Product: Explore the world of polymers by mixing different liquids together and watch them gel, harden, and expand as you create slimey science fun. | Some kid-favorite activities from this kit include Lumpy Slime, making Insta-Worms, making Worm Eggs, and Rainbow Worms. | The polymers and chemicals in this kit have been throughly tested and are safe for kids to use and we recommend adult supervision. | What is S.T.E.M.? STEM stands for Science, Technology, Engineering, and Math, which constitutes many of the areas educators look to cover for science based activities. We are proud to say that his kit has a strong focus on STEM. | The Amazing Science line of products has been designed to peak kid's curiosity for the world around them. These kits encourage kids to wonder, discover and explore in a way that will get the science to the dinner table. Our goal is to teach kids how to be amazing as they s
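
Both search cells read hits from the same structure: each hit carries an `_id`, a `_score`, and a `fields` dict holding `chunk_text`. A small sketch of a shared printing helper follows. It runs against hand-written hits shaped like the ones above. The IDs, scores, and text are invented, and `summarize_hits` is a hypothetical name.

```python
def summarize_hits(hits: list[dict], max_chars: int = 60) -> list[str]:
    """Return one line per hit: rank, ID, score, and a shortened text preview."""
    lines = []
    for rank, hit in enumerate(hits, start=1):
        text = hit["fields"]["chunk_text"].replace("\n", " ")
        preview = text if len(text) <= max_chars else text[:max_chars] + "..."
        lines.append(f"{rank}. {hit['_id']} ({hit['_score']:.3f}) {preview}")
    return lines


# Invented hits that mimic the shape of results["result"]["hits"]
example_hits = [
    {"_id": "description_1", "_score": 0.4212, "fields": {"chunk_text": "Product Name: Example Science Kit"}},
    {"_id": "description_2", "_score": 0.3987, "fields": {"chunk_text": "Product Name: Example Telescope"}},
]

for line in summarize_hits(example_hits):
    print(line)
```

A compact one-line-per-hit view makes it easier to compare the dense and sparse result lists side by side than printing each full product description.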

### Merge and deduplicate the results

Next, we'll merge the dense and sparse results and deduplicate them based on the field we used to link sparse and dense vectors, in our case `_id`.

In [76]:
def merge_chunks(h1, h2):
    """Merge two lists of search hits, deduplicate them by '_id', and return a list of {'_id', 'chunk_text'} dicts sorted by score (descending)."""
    # Deduplicate by _id
    deduped_hits = {hit['_id']: hit for hit in h1 + h2}.values()
    # Sort by _score descending
    sorted_hits = sorted(deduped_hits, key=lambda x: x['_score'], reverse=True)
    # Transform to format for reranking
    result = [{'_id': hit['_id'], 'chunk_text': hit['fields']['chunk_text']} for hit in sorted_hits]
    return result

merged_results = merge_chunks(sparse_results, dense_results)

print('[\n   ' + ',\n   '.join(str(obj) for obj in merged_results) + '\n]')

[
   {'_id': 'description_7342', 'chunk_text': "Product Name: Be Amazing! Toys Get Slimed! Science Kit\n\nAbout Product: Explore the world of polymers by mixing different liquids together and watch them gel, harden, and expand as you create slimey science fun. | Some kid-favorite activities from this kit include Lumpy Slime, making Insta-Worms, making Worm Eggs, and Rainbow Worms. | The polymers and chemicals in this kit have been throughly tested and are safe for kids to use and we recommend adult supervision. | What is S.T.E.M.? STEM stands for Science, Technology, Engineering, and Math, which constitutes many of the areas educators look to cover for science based activities. We are proud to say that his kit has a strong focus on STEM. | The Amazing Science line of products has been designed to peak kid's curiosity for the world around them. These kits encourage kids to wonder, discover and explore in a way that will get the science to the dinner table. Our goal is to teach kids how 
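
One detail of the dict comprehension in `merge_chunks` is worth spelling out: when the same `_id` appears in both lists, the hit from the second argument overwrites the one from the first. Because the function is called as `merge_chunks(sparse_results, dense_results)`, a duplicate keeps its dense score for sorting. Dense and sparse scores are also on different scales, so this sort gives only a rough ordering ahead of reranking. A toy sketch with invented hits:

```python
# Invented hits: "b" appears in both result lists with different scores
sparse_hits = [
    {"_id": "a", "_score": 3.1, "fields": {"chunk_text": "sparse-only hit"}},
    {"_id": "b", "_score": 2.4, "fields": {"chunk_text": "shared hit"}},
]
dense_hits = [
    {"_id": "b", "_score": 0.45, "fields": {"chunk_text": "shared hit"}},
    {"_id": "c", "_score": 0.40, "fields": {"chunk_text": "dense-only hit"}},
]

# Same deduplication step as merge_chunks: later entries overwrite earlier ones
deduped = {hit["_id"]: hit for hit in sparse_hits + dense_hits}
ranked = sorted(deduped.values(), key=lambda hit: hit["_score"], reverse=True)

for hit in ranked:
    print(hit["_id"], hit["_score"])
```

Here the shared hit `b` ends up with its dense score of 0.45 rather than its sparse score of 2.4, which changes where it lands in the merged list.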

### Rerank the results

We'll use one of Pinecone’s hosted reranking models, `bge-reranker-v2-m3`, to rerank the merged and deduplicated results based on a unified relevance score.

In [77]:
def rerank_results(question, results):
    result = pc.inference.rerank(
        model="bge-reranker-v2-m3",
        query=question,
        documents=results,
        rank_fields=["chunk_text"],
        top_n=5,
        return_documents=True,
        parameters={
            "truncate": "END"
        }
    )
    return result.data

reranked_results = rerank_results(USER_QUESTION, merged_results)

print("Query", USER_QUESTION)
print('-----')
for row in reranked_results:
    print(f"{row['document']['_id']} - {round(row['score'], 2)} - {row['document']['chunk_text']}")

Query I want to get my daughter more interested in science. What kind of gifts should I get her?
-----
description_5128 - 0.12 - Product Name: Hey! Play! Kids Science Kit-Lab Set to Create Solutions, Litmus Paper, & More-Great Fun & Educational Stem Learning Activity for Boys & Girls

About Product: Hands on learning- equipped with 4 test tubes and a holding rack, 2 beakers, dropper, measuring spoon, funnel, 3 grams of purple sweet potato powder and 10 sheets of paper filter, This is an excellent Basic starter science kit for kids! | Uses household items- The items needed for experiments that are not included with the kit are everyday items, that are easily found around the house, like scissors, plastic wrap, vinegar, baking soda, and water. | Stem activity- The science kit is a fantastic STEM (science, technology, engineering, Math) learning toy that will help your kids understand the concepts of mixing substances like acid and alkaline liquids and making things like litmus paper. | H

## 4. Augmentation

Next, we'll prepare the prompt, using the search results as context for the generation step. We'll wrap the results in XML tags, following Anthropic's [prompt engineering techniques](https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview), and add the formatted descriptions to the prompt.

### Format the search results and build the prompt

In [78]:
# Formatting search results
def format_results(extracted: list[dict]) -> str:
    # Wrap each reranked document's text in indexed <item> tags
    result = "\n".join(
        [
            f'<item index="{i+1}">\n<page_content>\n{r["document"]["chunk_text"]}\n</page_content>\n</item>'
            for i, r in enumerate(extracted)
        ]
    )

    return f"\n<search_results>\n{result}\n</search_results>"

def create_answer_prompt(results_list, question):
    return f"""\n\nHuman: {format_results(results_list)} Using the search results provided within the <search_results></search_results> tags, please answer the following question <question>{question}</question>. Do not reference the search results in your answer.\n\nAssistant:"""


In [79]:
print(create_answer_prompt(reranked_results, USER_QUESTION))



Human: 
<search_results>
<item index="1">
<page_content>
Product Name: Hey! Play! Kids Science Kit-Lab Set to Create Solutions, Litmus Paper, & More-Great Fun & Educational Stem Learning Activity for Boys & Girls

About Product: Hands on learning- equipped with 4 test tubes and a holding rack, 2 beakers, dropper, measuring spoon, funnel, 3 grams of purple sweet potato powder and 10 sheets of paper filter, This is an excellent Basic starter science kit for kids! | Uses household items- The items needed for experiments that are not included with the kit are everyday items, that are easily found around the house, like scissors, plastic wrap, vinegar, baking soda, and water. | Stem activity- The science kit is a fantastic STEM (science, technology, engineering, Math) learning toy that will help your kids understand the concepts of mixing substances like acid and alkaline liquids and making things like litmus paper. | Hours of fun- this set is a wonderful gift for birthdays, holidays, or 
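
The `\n\nHuman:` and `\n\nAssistant:` markers in the prompt above come from the older text-completion prompt style. With the Messages API, the speaker is conveyed by the `role` field, so the markers are not required. A minimal sketch of a variant without them follows. The helper names `format_search_results` and `build_user_prompt` and the example row are assumptions for illustration. The tag layout and instruction text mirror the cell above.

```python
def format_search_results(rows: list[dict]) -> str:
    """Wrap each reranked row's text in indexed <item> tags."""
    items = "\n".join(
        f'<item index="{i}">\n<page_content>\n{row["document"]["chunk_text"]}\n</page_content>\n</item>'
        for i, row in enumerate(rows, start=1)
    )
    return f"<search_results>\n{items}\n</search_results>"


def build_user_prompt(rows: list[dict], question: str) -> str:
    """Build the user-turn text: search results first, then the instruction and question."""
    return (
        f"{format_search_results(rows)}\n\n"
        "Using the search results provided within the <search_results></search_results> tags, "
        f"please answer the following question <question>{question}</question>. "
        "Do not reference the search results in your answer."
    )


# Invented row shaped like the reranker output used above
example_rows = [{"document": {"_id": "description_1", "chunk_text": "Product Name: Example Science Kit"}}]
print(build_user_prompt(example_rows, "What should I buy?"))
```

Placing the search results before the question keeps the long context at the top of the turn and the instruction close to the end, which matches the ordering used in the original prompt.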

## 5. Generation: Answering with Claude

Finally, let's ask the user's original question and get our answer from Claude.

In [80]:
import anthropic

client = anthropic.Anthropic()
model = "claude-3-5-haiku-latest"

def get_completion(prompt):
    message = client.messages.create(
        model=model,
        max_tokens=1000,
        temperature=1,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": prompt
                    }
                ]
            }
        ]
    )
    # message.content is a list of content blocks; callers read the text from the first block
    return message.content

In [None]:
answer = get_completion(create_answer_prompt(reranked_results, USER_QUESTION))

print(answer[0].text)
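
`message.content` is a list of content blocks, which is why the cell above indexes `answer[0]`. A minimal sketch of a more defensive extraction joins every text block instead of assuming a single one. It uses `SimpleNamespace` stand-ins rather than a real API response, and `extract_text` is a hypothetical helper, not part of the Anthropic SDK.

```python
from types import SimpleNamespace


def extract_text(content_blocks) -> str:
    """Concatenate the text of all blocks whose type is "text"."""
    return "".join(
        block.text for block in content_blocks if getattr(block, "type", None) == "text"
    )


# Stand-ins that mimic the attributes read above; not real SDK objects
fake_content = [
    SimpleNamespace(type="text", text="Consider a beginner chemistry kit. "),
    SimpleNamespace(type="text", text="A slime-making kit is another option."),
]

print(extract_text(fake_content))
```

With a real response, `extract_text(get_completion(prompt))` could replace `answer[0].text`, and it returns an empty string instead of raising if no text block is present.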


## Putting it all together

In [84]:
def answer_query_with_enriched_search(question):
    # Hybrid search

    ## Search dense index
    dense_results = search_dense_index(question)

    ## Search sparse index
    sparse_results = search_sparse_index(question)

    ## Merge and deduplicate results
    merged_results = merge_chunks(sparse_results, dense_results)
    
    ## Rerank results
    reranked_results = rerank_results(question, merged_results)
    
    ## Augment prompt with search results
    prompt = create_answer_prompt(reranked_results, question)

    ## Answer the question
    answer = get_completion(prompt)
    return answer[0].text


print(answer_query_with_enriched_search(USER_QUESTION))

Based on the search results, here are some great science-related gift ideas to help spark your daughter's interest in science:

1. Science Kits: These are excellent educational toys that make learning fun and interactive. Look for STEM (Science, Technology, Engineering, and Math) kits that offer hands-on experiments and activities. Some options include:
- Chemistry sets with test tubes and experiment materials
- Slime-making kits that explore polymers
- Plant growing kits that teach biology
- Survival science labs that show practical scientific skills

2. Key features to look for:
- Age-appropriate difficulty levels
- Engaging, colorful equipment
- Step-by-step instruction cards
- Experiments that demonstrate scientific concepts in an exciting way

3. Benefits of these science kits:
- Develop problem-solving skills
- Improve fine motor skills
- Encourage curiosity and exploration
- Make learning science enjoyable
- Build confidence in scientific thinking

4. Additional tips:
- Choose k

## Clean up the indexes

Run these when you're done experimenting, to delete the indexes.

In [127]:
pc.delete_index(name=sparse_index_name)
pc.delete_index(name=dense_index_name)
