# LAB | Extractive Question Answering

This notebook demonstrates how Pinecone helps you build an extractive question-answering application. To build an extractive question-answering system, we need three main components:

- A vector index to store and run semantic search
- A retriever model for embedding context passages
- A reader model to extract answers

We will use the SQuAD dataset, which consists of **questions** and **context** paragraphs containing question **answers**. We generate embeddings for the context passages using the retriever, index them in the vector database, and query with semantic search to retrieve the top k most relevant contexts containing potential answers to our question. We then use the reader model to extract the answers from the returned contexts.

Let's get started by installing the packages the notebook needs:

In [None]:
!pip install python-dotenv



In [4]:
# import os
# from dotenv import load_dotenv, find_dotenv
# _ = load_dotenv(find_dotenv())

# OPENAI_API_KEY  = os.getenv('OPENAI_API_KEY')
# PINECONE_API_KEY= os.getenv('PINECONE_API_KEY')

In [5]:
# from google.colab import userdata
# userdata.get('OPENAI_API_KEY')

In [6]:
# from google.colab import userdata
# userdata.get('PINECONE_API_KEY')

In [None]:
import os
from getpass import getpass
os.environ["OPENAI_API_KEY"] = getpass("OpenAI API Key: ")

os.environ["PINECONE_API_KEY"] = getpass("PINECONE_API_KEY:")

# Install Dependencies

In [1]:
!pip install -qU datasets pinecone-client sentence-transformers torch

# Load Dataset

Now let's load the SQuAD dataset from the Hugging Face Hub. We load the dataset into a pandas DataFrame, keep the title and context columns, and drop any duplicate context passages.

In [3]:
from datasets import load_dataset

# load the squad dataset into a pandas dataframe
df = load_dataset("squad", split="train").to_pandas()

In [4]:
df = df[['title', 'context']]

# Drop rows with duplicate context passages
df = df.drop_duplicates(subset='context')

# Show the resulting DataFrame
df

| | title | context |
|---|---|---|
| 0 | University_of_Notre_Dame | Architecturally, the school has a Catholic cha... |
| 5 | University_of_Notre_Dame | As at most other universities, Notre Dame's st... |
| 10 | University_of_Notre_Dame | The university is the major seat of the Congre... |
| 15 | University_of_Notre_Dame | The College of Engineering was established in ... |
| 20 | University_of_Notre_Dame | All of Notre Dame's undergraduate students are... |
| ... | ... | ... |
| 87574 | Kathmandu | Institute of Medicine, the central college of ... |
| 87579 | Kathmandu | Football and Cricket are the most popular spor... |
| 87584 | Kathmandu | The total length of roads in Nepal is recorded... |
| 87589 | Kathmandu | The main international airport serving Kathman... |


# Initialize Pinecone Index

The Pinecone index stores vector representations of our context passages, which we can retrieve using another vector (the query vector). We first need to initialize our connection to Pinecone to create our vector index. For this, we need a free [API key](https://app.pinecone.io/), and then we initialize the connection like so:

In [5]:
!pip install -qU langchain-pinecone pinecone-notebooks

In [7]:
import os
from pinecone import Pinecone, ServerlessSpec

# Serverless deployment target for the index
spec = ServerlessSpec(cloud="aws", region="us-east-1")

# Connect to Pinecone using the API key stored in the environment earlier
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

Now we create a new index called "extractive-question-answering" (we can name the index anything we want). We set the metric to "cosine" and the dimension to 384 because the retriever we use to generate context embeddings is optimized for cosine similarity and outputs 384-dimensional vectors.

In [9]:
# Set the index name
index_name = "extractive-question-answering"

# Create the index only if it does not already exist
if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=384,  # must match the retriever's 384-dimensional output
        metric="cosine",  # the retriever is optimized for cosine similarity
        spec=ServerlessSpec(cloud="aws", region="us-east-1")  # serverless deployment on AWS
    )

# Connect to the 'extractive-question-answering' index
index = pc.Index(index_name)

# Initialize Retriever

Next, we need to initialize our retriever. The retriever will mainly do two things:

- Generate embeddings for all context passages (context vectors/embeddings)
- Generate embeddings for our questions (query vector/embedding)

The retriever will generate embeddings in a way that the questions and context passages containing answers to our questions are nearby in the vector space. We can use cosine similarity to calculate the similarity between the query and context embeddings to find the context passages that contain potential answers to our question.
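To make the similarity ranking concrete, here is a minimal sketch in plain Python. The three-dimensional vectors and the passage names are made up for illustration; real retriever embeddings have 384 dimensions, and Pinecone computes the similarity for us at query time.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings (hypothetical values, not produced by the retriever)
query = [0.9, 0.1, 0.0]
contexts = {
    "passage_a": [0.8, 0.2, 0.1],
    "passage_b": [0.0, 0.1, 0.9],
}

# Rank passages from most to least similar to the query
ranked = sorted(
    contexts,
    key=lambda name: cosine_similarity(query, contexts[name]),
    reverse=True,
)
print(ranked)
```

Because `passage_a` points in nearly the same direction as the query, it ranks first; the magnitude of the vectors does not affect the result.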

We will use a SentenceTransformer model named ``multi-qa-MiniLM-L6-cos-v1`` designed for semantic search and trained on 215M (question, answer) pairs from diverse sources as our retriever.

In [10]:
import torch
from sentence_transformers import SentenceTransformer

# Set device to GPU if available
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load the retriever model from HuggingFace model hub
retriever = SentenceTransformer('multi-qa-MiniLM-L6-cos-v1', device=device)

# Print the model to verify
retriever


SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)

# Generate Embeddings and Upsert

Next, we need to generate embeddings for the context passages. We will do this in batches to help us more quickly generate embeddings and upload them to the Pinecone index. When passing the documents to Pinecone, we need an id (a unique value), context embedding, and metadata for each document representing context passages in the dataset. The metadata is a dictionary containing data relevant to our embeddings, such as the article title, context passage, etc.
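The record layout described above can be sketched without Pinecone or the retriever. In this hedged example, `iter_batches` and `fake_embed` are hypothetical helpers introduced only for illustration; `fake_embed` stands in for the retriever and returns tiny made-up vectors.

```python
def iter_batches(passages, embed, batch_size=64):
    """Yield lists of (id, vector, metadata) tuples, one list per batch."""
    for start in range(0, len(passages), batch_size):
        batch = passages[start:start + batch_size]
        vectors = embed([p["context"] for p in batch])
        yield [
            (str(start + offset), vector, {"title": p["title"], "context": p["context"]})
            for offset, (p, vector) in enumerate(zip(batch, vectors))
        ]

# Stand-in embedder for illustration only: one 2-number vector per text
def fake_embed(texts):
    return [[float(len(t)), 0.0] for t in texts]

passages = [{"title": f"T{i}", "context": f"passage {i}"} for i in range(5)]
batches = list(iter_batches(passages, fake_embed, batch_size=2))

# Five passages in batches of two give batches of size 2, 2, and 1
print([len(b) for b in batches])
print(batches[0][0])
```

Each batch is a list of tuples in the (id, vector, metadata) order, so it could be passed as the `vectors` argument of an upsert call.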

In [12]:
# Recreate the index with dimension 384 to match the retriever's output
if index_name in pc.list_indexes().names():
    pc.delete_index(index_name)

pc.create_index(
    name=index_name,
    dimension=384,  # use dimension 384 instead of 1536
    metric="cosine",  # cosine similarity metric
    spec=ServerlessSpec(cloud="aws", region="us-east-1")  # deploy the index serverless on AWS
)

{
    "name": "extractive-question-answering",
    "metric": "cosine",
    "host": "extractive-question-answering-93gzhxy.svc.aped-4627-b74a.pinecone.io",
    "spec": {
        "serverless": {
            "cloud": "aws",
            "region": "us-east-1"
        }
    },
    "status": {
        "ready": true,
        "state": "Ready"
    },
    "vector_type": "dense",
    "dimension": 384,
    "deletion_protection": "disabled",
    "tags": null
}

In [13]:
import time

# Wait until the index reports that it is ready
while not pc.describe_index(index_name).status["ready"]:
    time.sleep(1)
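A polling loop like this one runs forever if the index never becomes ready. One possible variant, sketched here with a hypothetical `wait_until_ready` helper and a stub readiness check, adds a timeout:

```python
import time

def wait_until_ready(is_ready, timeout=60.0, interval=1.0):
    """Poll is_ready() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while not is_ready():
        if time.monotonic() >= deadline:
            raise TimeoutError("index was not ready before the timeout")
        time.sleep(interval)

# Stub readiness check for illustration: reports ready on the third call
calls = {"count": 0}

def fake_is_ready():
    calls["count"] += 1
    return calls["count"] >= 3

wait_until_ready(fake_is_ready, timeout=5.0, interval=0.01)
print(calls["count"])
```

In the notebook, the readiness check could be `lambda: pc.describe_index(index_name).status["ready"]`, the same expression the loop above uses.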

In [14]:
index = pc.Index(index_name)

In [15]:
from tqdm.auto import tqdm

# Embed and upsert the passages in batches of 64
batch_size = 64

for i in tqdm(range(0, len(df), batch_size)):
    # Find the end of the batch
    end = min(i + batch_size, len(df))

    # Select the batch by position (the index has gaps after drop_duplicates)
    batch = df.iloc[i:end]

    # Generate embeddings for the batch and convert them to plain lists
    emb = retriever.encode(batch['context'].tolist(), show_progress_bar=False).tolist()

    # Store the title and passage as metadata so queries can return the text
    meta = batch[['title', 'context']].to_dict(orient='records')

    # Create unique IDs from the row positions
    ids = [str(idx) for idx in range(i, end)]

    # Upsert (id, vector, metadata) tuples to Pinecone
    _ = index.upsert(vectors=list(zip(ids, emb, meta)))

# Check that we have all vectors in the index
index.describe_index_stats()

{'dimension': 384,
 'index_fullness': 0.0,
 'metric': 'cosine',
 'namespaces': {'': {'vector_count': 18891}},
 'total_vector_count': 18891,
 'vector_type': 'dense'}

# Initialize Reader

We use the `deepset/electra-base-squad2` model from the HuggingFace model hub as our reader model. We load this model into a "question-answering" pipeline from HuggingFace transformers and feed it our questions and context passages individually. The model gives a prediction for each context we pass through the pipeline.
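Each prediction is a dictionary with `answer`, `score`, `start`, and `end` keys, where `start` and `end` are character offsets into the context. The sketch below uses a hand-built prediction (not real model output) and a hypothetical `highlight_answer` helper to show how the offsets map back to the passage.

```python
def highlight_answer(context, prediction):
    """Wrap the predicted answer span in brackets inside its context."""
    start, end = prediction["start"], prediction["end"]
    return context[:start] + "[" + context[start:end] + "]" + context[end:]

context = "The retriever finds passages and the reader extracts spans."
answer_text = "the reader"

# Hand-built prediction in the same shape the pipeline returns;
# the score is a made-up value for illustration
start = context.find(answer_text)
prediction = {
    "answer": answer_text,
    "start": start,
    "end": start + len(answer_text),
    "score": 0.87,
}

# The offsets slice the answer straight out of the context
print(context[prediction["start"]:prediction["end"]])
print(highlight_answer(context, prediction))
```

This is what makes the approach extractive: the answer is always a literal substring of the retrieved passage rather than generated text.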

In [18]:
from transformers import pipeline

model_name = 'deepset/electra-base-squad2'
# load the reader model into a question-answering pipeline
reader = pipeline(tokenizer=model_name, model=model_name, task='question-answering', device=device)
reader

Device set to use cuda


<transformers.pipelines.question_answering.QuestionAnsweringPipeline at 0x7bb3d6d90310>

Now all the components we need are ready. Let's write some helper functions to execute our queries. The `get_context` function retrieves the context passages most likely to contain answers to our question from the Pinecone index, and the `extract_answer` function extracts the answers from these context passages.

In [19]:
# Function to get context passages from the Pinecone index
def get_context(question, top_k):
    # Generate the embedding for the question as a flat list of floats
    xq = retriever.encode(question).tolist()
    # Search the Pinecone index for context passages likely to contain the answer
    xc = index.query(vector=xq, top_k=top_k, include_metadata=True)

    # Extract the context passages from the search result
    c = [item['metadata']['context'] for item in xc['matches']]

    return c

In [29]:
# Inspect the query embedding the retriever produces for a sample question
question = "What are the latest advancements in AI?"
xq = retriever.encode([question])

# One row per question, one column per embedding dimension
print("Embedding shape:", xq.shape)

In [30]:
from pprint import pprint

# Extract an answer from each context passage
def extract_answer(question, context):
    results = []
    for c in context:
        # feed the reader the question and a context passage to extract an answer
        answer = reader(question=question, context=c)
        # add the context to the answer dict so both print together
        answer["context"] = c
        results.append(answer)
    # sort the results by the reader's score, highest first
    sorted_result = sorted(results, key=lambda x: x['score'], reverse=True)
    pprint(sorted_result)
    return sorted_result
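The two helpers can be chained into a single retrieve-then-read step. The sketch below shows one way to do that; `answer_question`, `fake_retrieve`, and `fake_read` are hypothetical names used only here, and the stubs stand in for the Pinecone query and the reader model so the flow can run on its own.

```python
def answer_question(question, retrieve, read, top_k=3):
    """Retrieve top_k contexts, run the reader on each, and sort by score."""
    results = []
    for context in retrieve(question, top_k):
        prediction = dict(read(question=question, context=context))
        prediction["context"] = context
        results.append(prediction)
    return sorted(results, key=lambda r: r["score"], reverse=True)

# Stubs for illustration only; in the notebook these roles are played by
# get_context and reader
def fake_retrieve(question, top_k):
    return ["alpha passage", "beta passage", "gamma passage"][:top_k]

def fake_read(question, context):
    first_word = context.split()[0]
    # The score is just the context length here, so longer passages rank higher
    return {"answer": first_word, "score": len(context) / 100, "start": 0, "end": len(first_word)}

results = answer_question("any question", fake_retrieve, fake_read, top_k=2)
best = results[0]
print(best["answer"], best["score"])
```

Passing the retriever and reader in as arguments keeps the control flow easy to test without a live index or a downloaded model.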

In [31]:
question = "How much oil is Egypt producing in a day?"
get_context(question, top_k = 1)



The retriever returns the context passage most likely to contain the answer to our question. Now let's use the reader to extract the exact answer from the context passage.

In [39]:
from transformers import pipeline

# Load a question-answering model
qa_pipeline = pipeline("question-answering")

# Define the question and the context
question = "What are the latest advancements in AI?"
context = "Artificial Intelligence (AI) has made significant strides in various fields such as healthcare, robotics, and natural language processing. For example, AI is now capable of diagnosing diseases, automating complex tasks, and generating human-like text."

# Call the pipeline to extract the answer
answer = qa_pipeline(question=question, context=context)

# Print the extracted answer
print(f"Answer: {answer['answer']}")


No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 564e9b5 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.



Answer: healthcare, robotics, and natural language processing


In [40]:
# extract_answer expects a list of context passages, so wrap the single string
extract_answer(question, [context])

The reader model extracted the correct answer, *691,000 bbl/d*, from the context passage with a confidence score of about 0.99. Let's run a few more queries.

In [47]:
# Retrieve the context from get_context
question = "What are the first names of the men that invented youtube?"
top_k = 1  # retrieve the single best result
context = get_context(question, top_k)

print("Context:", context)  # check that context is text

In [50]:
# Retrieve the context from get_context
question = "What is Albert Einstein famous for?"
top_k = 1  # retrieve the single best result
context = get_context(question, top_k)

# Check that context is text rather than a vector representation
print("Context:", context)

Let's try another question, this time retrieving the top 3 context passages from the retriever.

In [51]:
# Retrieve the context from get_context
question = "Who was the first person to step foot on the moon?"
top_k = 3  # retrieve the top 3 results
context = get_context(question, top_k)

# Check that context is text rather than a vector representation
print("Context:", context)

The result looks pretty good.

### Add a few more questions. What did you observe?

In [53]:
# Retrieve the context from get_context
question = "Who invented the light bulb?"
top_k = 3  # retrieve the top 3 results
context = get_context(question, top_k)

# Check that context is text rather than a vector representation
print("Context:", context)

In [52]:
# Delete the index once all queries are done
pc.delete_index(index_name)