# Lesson 2: Iterative Retrieval for Enhanced RAG Pipelines


Welcome back to this second lesson in the **Beyond Basic RAG: Improving our Pipeline** course! In the previous lesson, we explored ways to ensure that your language model stays grounded by responding only with information taken from retrieved context. That approach helps us avoid hallucinations and keeps the output reliable.

In this lesson, we’ll improve the pipeline even further by making the retrieval process more **iterative**. Instead of collecting chunks of text just once before moving on to generation, we’ll refine our queries step by step. This multi-stage retrieval can pinpoint the most relevant information and produce a more helpful final context.

---

## The Concept of Iterative Retrieval

Imagine a scenario where a user asks:

> “Tell me about the regulations for staff members.”

The question might be too broad. A typical one-shot retrieval might find some relevant chunks, but you might also want to narrow in on “internal policies” or “mandatory forms” for more precision.

**Iterative retrieval** does exactly that:

1. **Retrieve** an initial chunk based on the user’s query.
2. **Refine** that query with a new keyword from the retrieved chunk (e.g., “internal” or “policies”).
3. **Repeat** until you’ve gathered a set of chunks that thoroughly answers the question—or until improvements level off.

This multi-pass approach can drastically improve the depth and breadth of the retrieved information, making your final context more complete.

---

## Practical Example: Iterative Retrieval in Action

**User query:**

> “Tell me about the regulations for staff members.”

**Sample chunks in DB:**

* **Chunk 1:**

  > “Our company requires that all staff members adhere to internal policies such as punctuality, dress code, and ethical behavior...”

* **Chunk 2:**

  > “Regulations for staff emphasize adherence to both internal policies and government standards, covering conduct, reporting, ...”

**Iteration 1**

* **Query:** “Tell me about the regulations for staff members”
* **Best match:** Chunk 1 (score: 0.87)
* **Extracted keyword:** “internal”

**Iteration 2**

* **Updated Query:** “Tell me about the regulations for staff members internal”
* **Best match:** Chunk 2 (score: 0.93)
* Since further refinement doesn’t significantly improve the score, the process stops.

The system then uses these accumulated chunks to generate a grounded and comprehensive answer.

---

### Retrieving the Best Chunk

Define a function to fetch the best matching chunk given a query:

```python
def retrieve_best_chunk(query_text, collection, n_results=1):
    """
    Retrieve the best matching chunk from the collection based on the given query.
    Returns:
      best_chunk_text, best_chunk_score, best_chunk_metadata
      (or None, None, None if retrieval fails)
    """
    # Perform a similarity search for the provided query
    retrieval = collection.query(query_texts=[query_text], n_results=n_results)
    
    # If nothing is found, return None
    if not retrieval['documents'][0]:
        return None, None, None

    # Extract the best match from the results
    best_chunk_text = retrieval['documents'][0][0]
    best_distance   = retrieval['distances'][0][0]
    
    # Convert 'distance' to a simple similarity score
    best_chunk_score    = 1 / (1 + best_distance)
    best_chunk_metadata = retrieval['metadatas'][0][0]
    
    return best_chunk_text, best_chunk_score, best_chunk_metadata
```

**Key points:**

* We query our collection (chunks in a vector DB) for `query_text`.
* If no chunks are returned, we exit gracefully with `None`.
* Otherwise, we pick the top chunk and compute a similarity score via `1 / (1 + distance)`.
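
To try the function without a running vector database, here is a minimal sketch that uses a stand-in collection. `FakeCollection` is hypothetical and not part of any library; it only mimics the nested-list layout that `retrieve_best_chunk` indexes into (`documents`, `distances`, and `metadatas`, each holding one inner list per query). The chunk text, distance, and metadata are made up, and the function is repeated in compact form so the block runs on its own.

```python
class FakeCollection:
    """Hypothetical stand-in for a vector DB collection (not a real client)."""

    def __init__(self, rows):
        # rows: list of (text, distance, metadata), assumed pre-sorted by distance
        self.rows = rows

    def query(self, query_texts, n_results=1):
        top = self.rows[:n_results]
        # One inner list per query; this sketch only ever handles a single query
        return {
            "documents": [[text for text, _, _ in top]],
            "distances": [[dist for _, dist, _ in top]],
            "metadatas": [[meta for _, _, meta in top]],
        }


def retrieve_best_chunk(query_text, collection, n_results=1):
    retrieval = collection.query(query_texts=[query_text], n_results=n_results)
    if not retrieval["documents"][0]:
        return None, None, None
    best_distance = retrieval["distances"][0][0]
    return (
        retrieval["documents"][0][0],
        1 / (1 + best_distance),          # smaller distance -> higher score
        retrieval["metadatas"][0][0],
    )


collection = FakeCollection([("Staff must follow internal policies.", 0.25, {"doc_id": 1})])
text, score, meta = retrieve_best_chunk("staff regulations", collection)
print(text, score, meta)                  # score is 1 / 1.25 = 0.8

empty = FakeCollection([])
print(retrieve_best_chunk("anything", empty))   # (None, None, None)
```

Callers should unpack all three values and check the text for `None` before using the score, which is what the iterative loop later in this lesson does.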

---

### Extracting and Refining Queries

Once you’ve retrieved a chunk, extract a refinement keyword to make the next query more specific:

```python
import re  # used for tokenizing; STOPWORDS is a module-level set (see the full exercise listing)

def extract_refinement_keyword(chunk_text, current_query):
    """
    Extract a single keyword from the chunk that is not already in the current query.
    - Ignores stopwords and short words.
    - Picks the longest remaining candidate.
    """
    # Convert text to lowercase words
    chunk_words = re.findall(r'\b\w+\b', chunk_text.lower())
    query_words = set(re.findall(r'\b\w+\b', current_query.lower()))

    # Filter out stopwords, words already in query, or very short words
    candidate_words = [
        w for w in chunk_words
        if w not in STOPWORDS and w not in query_words and len(w) > 4
    ]

    if not candidate_words:
        return ""

    # Pick the longest candidate word
    refine_word = max(candidate_words, key=len)
    return refine_word

def refine_query(current_query, refine_word):
    """
    Append the chosen refine_word to the current query if it exists.
    """
    if not refine_word:
        return current_query
    return f"{current_query} {refine_word}"
```

**Why this helps:**

* **Filtering stopwords** reduces noise.
* **Skipping existing words** avoids repetition.
* **Choosing longer words** often yields more meaningful refinements.
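
As a quick check of the heuristic, the sketch below runs the extractor on the text of Chunk 1 from the walkthrough. The stopword set is abbreviated for brevity (an assumption; the exercise files use a longer list), and the function is restated compactly so the block is self-contained.

```python
import re

# Abbreviated stopword set for this sketch; the exercise files use a longer list.
STOPWORDS = {"the", "and", "that", "for", "to", "as", "all", "our", "about"}


def extract_refinement_keyword(chunk_text, current_query):
    chunk_words = re.findall(r"\b\w+\b", chunk_text.lower())
    query_words = set(re.findall(r"\b\w+\b", current_query.lower()))
    candidates = [
        w for w in chunk_words
        if w not in STOPWORDS and w not in query_words and len(w) > 4
    ]
    # max() returns the first of several equally long words
    return max(candidates, key=len) if candidates else ""


chunk = ("Our company requires that all staff members adhere to internal "
         "policies such as punctuality, dress code, and ethical behavior")
query = "Tell me about the regulations for staff members"

keyword = extract_refinement_keyword(chunk, query)
print(keyword)                  # punctuality
print(f"{query} {keyword}")     # the refined query for the next pass

# Every word is already in the query, so there is nothing to add
print(repr(extract_refinement_keyword("staff members", query)))   # ''
```

Note that the longest-word rule selects `punctuality` here, whereas the walkthrough uses `internal` for illustration. Longest is not always most relevant, which is one reason to treat this heuristic as a starting point.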

---

### Putting It All Together: Iterative Retrieval

Here’s a condensed multi-step loop:

```python
def iterative_retrieval(query, collection, steps=3):
    """
    Multi-step retrieval with a simple query refinement approach:
      1) Retrieve the best chunk for the current query.
      2) Extract and append one new keyword.
      3) Repeat until no improvements or max steps reached.
    """
    accumulated_chunks  = []
    current_query       = query
    best_score_so_far   = 0.0

    for step in range(steps):
        best_chunk_text, best_chunk_score, metadata = retrieve_best_chunk(current_query, collection)
        
        # Stop if no improvement
        if not best_chunk_text or best_chunk_score <= best_score_so_far + IMPROVEMENT_THRESHOLD:
            break

        best_score_so_far = best_chunk_score
        accumulated_chunks.append({
            'step': step + 1,
            'query': current_query,
            'retrieved_chunk': best_chunk_text,
            'score': best_chunk_score
        })

        refine_word   = extract_refinement_keyword(best_chunk_text, current_query)
        current_query = refine_query(current_query, refine_word)
    
    return accumulated_chunks
```

* We limit to a fixed number of iterations (e.g., 3).
* Each pass retrieves, scores, and—if improved—refines the query.
* The result is a list of chunks offering richer context than a single retrieval.
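
The stopping rule is easiest to see in isolation. The following sketch traces it over scripted scores instead of real retrievals; `accepted_steps` is a hypothetical helper written for this illustration, and the score lists are invented.

```python
IMPROVEMENT_THRESHOLD = 0.02  # same value the exercise files use


def accepted_steps(scores, threshold=IMPROVEMENT_THRESHOLD):
    """Return the scores the loop would keep before the stop rule fires."""
    kept = []
    best_so_far = 0.0
    for score in scores:
        # Same test as in iterative_retrieval: stop unless the gain exceeds the threshold
        if score <= best_so_far + threshold:
            break
        best_so_far = score
        kept.append(score)
    return kept


print(accepted_steps([0.87, 0.93, 0.94]))   # [0.87, 0.93]
print(accepted_steps([0.87, 0.80, 0.99]))   # [0.87]
```

The second call shows a consequence of this design: a single dip ends the loop, so a later, better match is never reached. Whether that is acceptable depends on how noisy your scores are.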

---

## Practical Considerations

* **Heuristics:** Longest-word extraction and score thresholds are simple; adjust them to your needs.
* **Performance Trade-offs:** Each extra iteration adds one more vector search, so deeper refinement can enrich the context at the price of higher latency and cost.
* **Limitations:** If the initial chunk is irrelevant, further steps might not help much.
* **Advanced Refinements:** Consider NLP techniques (e.g., keyphrase extraction) for smarter query expansion.
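
As one small step beyond the longest-word rule, the sketch below ranks candidates by how often they occur in the chunk and uses length only as a tie-breaker. This is a hypothetical variant for illustration, not a real keyphrase-extraction method; the function name, the abbreviated stopword set, and the sample text are all assumptions.

```python
import re
from collections import Counter

STOPWORDS = {"the", "and", "to", "for", "of", "all", "must", "with"}  # abbreviated


def extract_frequent_keyword(chunk_text, current_query, min_len=5):
    """Hypothetical variant: prefer the most frequent candidate, break ties by length."""
    chunk_words = re.findall(r"\b\w+\b", chunk_text.lower())
    query_words = set(re.findall(r"\b\w+\b", current_query.lower()))
    counts = Counter(
        w for w in chunk_words
        if w not in STOPWORDS and w not in query_words and len(w) >= min_len
    )
    if not counts:
        return ""
    # Sort key: higher count first, then longer word, then alphabetical for determinism
    return min(counts, key=lambda w: (-counts[w], -len(w), w))


chunk = ("Internal policies cover conduct and reporting. "
         "Internal policies must align with government standards.")
print(extract_frequent_keyword(chunk, "regulations for staff"))   # internal
```

On this sample the longest-word rule would pick `government`, while the frequency rule picks `internal`, a term the chunk actually repeats.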

---

## Conclusion and Next Steps

You’ve now seen how **iterative retrieval** can make your RAG pipeline more robust by gradually homing in on the most relevant information. This lesson builds on our previous work grounding generation, and now you have an even better way to gather the right context in the first place.

Coming up next, you’ll get hands-on practice implementing and tweaking iterative retrieval strategies. Experiment with different thresholds, iteration counts, or NLP-based refiners, and compare the retrieved context after each change to see what actually helps your RAG system.

Stay curious and keep refining! You’re making great strides toward a complete, high-performing Retrieval-Augmented Generation pipeline.


## Inverted Distance Similarity Score

In this activity, you'll dive into the heart of the `retrieve_best_chunk` function. Your mission is to complete the calculation of the inverted-distance similarity score. This score is crucial as it transforms the distance metric into a similarity measure, helping us identify the most relevant text chunk for a given query.

Your task today is to fill in the missing part of the formula to compute the similarity score using the inverted distance method. Remember, the formula should convert a smaller distance into a higher similarity score.

By completing this exercise, you'll gain a deeper understanding of how similarity scores are derived and their role in refining query results. Once you're done, test your code to see how well it retrieves the best matching chunk. Happy coding!

```python
import re
from data import load_and_chunk_corpus
from vector_db import build_chroma_collection

# Optional: set this to 0.0 if you don't need to enforce a minimum improvement.
IMPROVEMENT_THRESHOLD = 0.02

# Common stopwords to ignore when extracting keywords
STOPWORDS = set([
    "the","and","is","in","of","to","a","that","for","on","with","as","it","by",
    "this","are","was","at","from","or","be","which","not","can","also","have",
    "has","had","we","they","you","he","she","his","her","its","our","us","their",
    "them","i","do","does","did","just","so","if","may","will","shall","more","most",
    "some","many","any","all","what","about","would","could","should","where","when",
    "why","how"
])

def retrieve_best_chunk(query_text, collection, n_results=1):
    """
    Retrieve the best matching chunk from the collection based on the given query.
    Returns:
      best_chunk_text, best_chunk_score, best_chunk_metadata
      (or None, None, None if retrieval fails)
    """
    retrieval = collection.query(query_texts=[query_text], n_results=n_results)
    if not retrieval['documents'][0]:
        return None, None, None

    best_chunk_text = retrieval['documents'][0][0]
    best_distance = retrieval['distances'][0][0]
    # TODO: Complete the inverted-distance similarity score formula.
    # Hint: Use 1 / (1 + best_distance) to convert distance to similarity.
    best_chunk_score = _____
    best_chunk_metadata = retrieval['metadatas'][0][0]
    return best_chunk_text, best_chunk_score, best_chunk_metadata

def extract_refinement_keyword(chunk_text, current_query):
    """
    Extract a single keyword from the chunk that is not in the current query.
    - Ignores stopwords and short words.
    - Chooses the longest candidate word found in the chunk.
    If no suitable keyword is found, returns an empty string.
    """
    chunk_words = re.findall(r'\b\w+\b', chunk_text.lower())
    query_words = set(re.findall(r'\b\w+\b', current_query.lower()))

    # Filter out stopwords, words already in the query, or very short words
    candidate_words = [
        w for w in chunk_words
        if w not in STOPWORDS and w not in query_words and len(w) > 4
    ]

    if not candidate_words:
        return ""

    # Pick the longest candidate
    refine_word = max(candidate_words, key=len)
    return refine_word

def refine_query(current_query, refine_word):
    """
    Append the chosen refine_word to the current query if it exists.
    """
    if not refine_word:
        return current_query
    return f"{current_query} {refine_word}"

def iterative_retrieval(query, collection, steps=3):
    """
    Multi-step retrieval with a simple query refinement approach:
      1) Retrieve the best chunk for the current query.
      2) Extract one new keyword from that chunk and add it to the query (optional).
      3) Stop if no improvement in similarity or no new keyword is found.
      4) Collect the chunks from each step for final context.

    This version uses a global IMPROVEMENT_THRESHOLD to decide if we should keep refining.
    """
    accumulated_chunks = []
    current_query = query
    best_score_so_far = 0.0

    for step in range(steps):
        print(f"Iteration {step+1}, current query: '{current_query}'")

        best_chunk_text, best_chunk_score, best_chunk_metadata = retrieve_best_chunk(current_query, collection)
        if not best_chunk_text:
            print("No chunks found at this step. Ending refinement.")
            break

        print(f"Best chunk text (first 50 chars): '{best_chunk_text[:50]}...' | Score: {best_chunk_score:.4f}")

        # Stop when the gain over the best score so far falls below the threshold
        if best_chunk_score - best_score_so_far < IMPROVEMENT_THRESHOLD:
            print("Improvement threshold not met. Stopping refinements.")
            break

        best_score_so_far = best_chunk_score

        # Store the chunk information
        accumulated_chunks.append({
            'step': step + 1,
            'query': current_query,
            'retrieved_chunk': {
                'text': best_chunk_text,
                'metadata': best_chunk_metadata
            },
            'score': best_chunk_score
        })

        # Attempt to refine the query by extracting a new keyword
        refine_word = extract_refinement_keyword(best_chunk_text, current_query)
        if not refine_word:
            print("No suitable keyword found for further refinement.")
            break

        # Update the current query with the chosen keyword
        print(f"Refining query with keyword: {refine_word}")
        current_query = refine_query(current_query, refine_word)

    return accumulated_chunks

def build_final_context(iteration_results):
    """
    Combine all retrieved chunks from each iteration into one context block.
    Return a fallback message if no chunks were retrieved.
    """
    if not iteration_results:
        return "No relevant information was found after iterative retrieval."

    lines = []
    for result in iteration_results:
        lines.append(
            f"- Step {result['step']} (Score={result['score']:.4f}): {result['retrieved_chunk']['text']}"
        )
    return "\n".join(lines)

if __name__ == "__main__":
    # Small demonstration with a sample corpus
    chunked_docs = load_and_chunk_corpus("data/corpus.json", chunk_size=40)
    collection = build_chroma_collection(chunked_docs, collection_name="iterative_collection")

    initial_query = "What internal policies apply specifically to employees?"
    iteration_results = iterative_retrieval(initial_query, collection, steps=3)

    final_context = build_final_context(iteration_results)
    print(f"\nFinal combined context:\n{final_context}")
    # Next step could be feeding 'final_context' to your LLM with the refined or original query.

```


Here’s the updated `retrieve_best_chunk` with the inverted‐distance similarity score filled in:

```python
def retrieve_best_chunk(query_text, collection, n_results=1):
    """
    Retrieve the best matching chunk from the collection based on the given query.
    Returns:
      best_chunk_text, best_chunk_score, best_chunk_metadata
      (or None, None, None if retrieval fails)
    """
    retrieval = collection.query(query_texts=[query_text], n_results=n_results)
    if not retrieval['documents'][0]:
        return None, None, None

    best_chunk_text = retrieval['documents'][0][0]
    best_distance   = retrieval['distances'][0][0]
    # Convert distance into similarity: smaller distance → higher score
    best_chunk_score = 1 / (1 + best_distance)
    best_chunk_metadata = retrieval['metadatas'][0][0]
    return best_chunk_text, best_chunk_score, best_chunk_metadata
```

With that in place, you can run your iterative workflow and you’ll see that chunks with smaller distances now yield higher similarity scores. Happy coding!
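
To get a feel for the formula, this short sketch prints the score for a few sample distances. The distances are arbitrary and chosen only for illustration. It assumes non-negative distances, for which the score falls in the range (0, 1] and decreases as the distance grows.

```python
def inverted_distance(distance):
    """Map a non-negative distance to a similarity score in (0, 1]."""
    return 1 / (1 + distance)


for d in (0.0, 0.5, 1.0, 3.0):
    print(f"distance={d:.1f} -> score={inverted_distance(d):.4f}")
# distance=0.0 -> score=1.0000
# distance=0.5 -> score=0.6667
# distance=1.0 -> score=0.5000
# distance=3.0 -> score=0.2500
```

Because the mapping is non-linear, equal changes in distance do not produce equal changes in score, which matters when you tune `IMPROVEMENT_THRESHOLD`.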


## Limit Chunks in Iterative Retrieval

In the previous exercise, you successfully refined queries by prioritizing keywords. Now, let's elevate your skills by adding a new feature to our iterative retrieval process! Your mission is to introduce a maximum chunk limit to the `iterative_retrieval` function, ensuring the retrieval process remains efficient and manageable.

Here's what you need to do:

* Expand the `iterative_retrieval` function to accept a new parameter, `max_chunks`.
* Modify the loop so that it stops once the number of retrieved chunks reaches the `max_chunks` limit.

Once you've made the changes, test your code to see how it gracefully handles the chunk limit. Enjoy the challenge, and happy coding!

```python
import re
from data import load_and_chunk_corpus
from vector_db import build_chroma_collection

# Optional: set this to 0.0 if you don't need to enforce a minimum improvement.
IMPROVEMENT_THRESHOLD = 0.02

# Common stopwords to ignore when extracting keywords
STOPWORDS = set([
    "the","and","is","in","of","to","a","that","for","on","with","as","it","by",
    "this","are","was","at","from","or","be","which","not","can","also","have",
    "has","had","we","they","you","he","she","his","her","its","our","us","their",
    "them","i","do","does","did","just","so","if","may","will","shall","more","most",
    "some","many","any","all","what","about","would","could","should","where","when",
    "why","how"
])

def retrieve_best_chunk(query_text, collection, n_results=1):
    """
    Retrieve the best matching chunk from the collection based on the given query.
    Returns:
      best_chunk_text, best_chunk_score, best_chunk_metadata
      (or None, None, None if retrieval fails)
    """
    retrieval = collection.query(query_texts=[query_text], n_results=n_results)
    if not retrieval['documents'][0]:
        return None, None, None

    best_chunk_text = retrieval['documents'][0][0]
    best_distance = retrieval['distances'][0][0]
    best_chunk_score = 1 / (1 + best_distance)  # Simple inverted distance
    best_chunk_metadata = retrieval['metadatas'][0][0]
    return best_chunk_text, best_chunk_score, best_chunk_metadata

def extract_refinement_keyword(chunk_text, current_query):
    """
    Extract a single keyword from the chunk that is not in the current query.
    - Ignores stopwords and short words.
    - Chooses the longest candidate word found in the chunk.
    If no suitable keyword is found, returns an empty string.
    """
    chunk_words = re.findall(r'\b\w+\b', chunk_text.lower())
    query_words = set(re.findall(r'\b\w+\b', current_query.lower()))

    candidate_words = [
        w for w in chunk_words
        if w not in STOPWORDS and w not in query_words and len(w) > 4
    ]

    if not candidate_words:
        return ""
    refine_word = max(candidate_words, key=len)
    return refine_word

def refine_query(current_query, refine_word):
    """
    Append the chosen refine_word to the current query if it exists.
    """
    if not refine_word:
        return current_query
    return f"{current_query} {refine_word}"

# TODO: Add the max_chunks parameter to control the maximum number of chunks to retrieve
def iterative_retrieval(query, collection, steps=3):
    """
    Multi-step retrieval with a simple query refinement approach:
      1) Retrieve the best chunk for the current query.
      2) Extract one new keyword from that chunk and add it to the query (optional).
      3) Stop if no improvement in similarity or no new keyword is found.
      4) Collect the chunks from each step for final context.

    This version uses a global IMPROVEMENT_THRESHOLD to decide if we should keep refining.
    """
    accumulated_chunks = []
    current_query = query
    best_score_so_far = 0.0

    for step in range(steps):
        print(f"Iteration {step+1}, current query: '{current_query}'")

        best_chunk_text, best_chunk_score, best_chunk_metadata = retrieve_best_chunk(current_query, collection)
        if not best_chunk_text:
            print("No chunks found at this step. Ending refinement.")
            break

        print(f"Best chunk text (first 50 chars): '{best_chunk_text[:50]}...' | Score: {best_chunk_score:.4f}")

        if best_chunk_score - best_score_so_far < IMPROVEMENT_THRESHOLD:
            print("Improvement threshold not met. Stopping refinements.")
            break

        best_score_so_far = best_chunk_score

        # Store the chunk information
        accumulated_chunks.append({
            'step': step + 1,
            'query': current_query,
            'retrieved_chunk': {
                'text': best_chunk_text,
                'metadata': best_chunk_metadata
            },
            'score': best_chunk_score
        })

        # TODO: Add a check here to stop the loop if we've reached max_chunks

        refine_word = extract_refinement_keyword(best_chunk_text, current_query)
        if not refine_word:
            print("No suitable keyword found for further refinement.")
            break

        print(f"Refining query with keyword: {refine_word}")
        current_query = refine_query(current_query, refine_word)

    return accumulated_chunks

def build_final_context(iteration_results):
    """
    Combine all retrieved chunks from each iteration into one context block.
    Return a fallback message if no chunks were retrieved.
    """
    if not iteration_results:
        return "No relevant information was found after iterative retrieval."

    lines = []
    for result in iteration_results:
        lines.append(
            f"- Step {result['step']} (Score={result['score']:.4f}): {result['retrieved_chunk']['text']}"
        )
    return "\n".join(lines)

if __name__ == "__main__":
    # Small demonstration with a sample corpus
    chunked_docs = load_and_chunk_corpus("data/corpus.json", chunk_size=40)
    collection = build_chroma_collection(chunked_docs, collection_name="iterative_collection")

    initial_query = "What internal policies apply specifically to employees?"
    # TODO: Update the function call to include the max_chunks parameter
    iteration_results = iterative_retrieval(initial_query, collection, steps=5)

    final_context = build_final_context(iteration_results)
    print(f"\nFinal combined context:\n{final_context}")


```

Here’s the `iterative_retrieval` with a new `max_chunks` parameter and a stop‐condition once that limit is reached:

```python
def iterative_retrieval(query, collection, steps=3, max_chunks=None):
    """
    Multi-step retrieval with optional chunk‐limit:
      1) Retrieve the best chunk for the current query.
      2) Extract one new keyword and refine the query.
      3) Stop if no improvement, no new keyword, max steps, or max_chunks reached.
    """
    accumulated_chunks = []
    current_query      = query
    best_score_so_far  = 0.0

    for step in range(steps):
        print(f"Iteration {step+1}, current query: '{current_query}'")

        best_chunk_text, best_chunk_score, best_chunk_metadata = retrieve_best_chunk(current_query, collection)
        if not best_chunk_text:
            print("No chunks found at this step. Ending refinement.")
            break

        print(f"Best chunk (score={best_chunk_score:.4f}): '{best_chunk_text[:50]}...'")

        # Stop if improvement threshold not met
        if best_chunk_score - best_score_so_far < IMPROVEMENT_THRESHOLD:
            print("Improvement threshold not met. Stopping refinements.")
            break

        best_score_so_far = best_chunk_score

        # Record this chunk
        accumulated_chunks.append({
            'step': step + 1,
            'query': current_query,
            'retrieved_chunk': {
                'text':     best_chunk_text,
                'metadata': best_chunk_metadata
            },
            'score': best_chunk_score
        })

        # New: stop if we've collected enough chunks
        if max_chunks is not None and len(accumulated_chunks) >= max_chunks:
            print(f"Reached max_chunks={max_chunks}. Ending retrieval.")
            break

        # Try to extract a refinement keyword
        refine_word = extract_refinement_keyword(best_chunk_text, current_query)
        if not refine_word:
            print("No suitable keyword found for further refinement.")
            break

        print(f"Refining query with keyword: {refine_word}")
        current_query = refine_query(current_query, refine_word)

    return accumulated_chunks
```

And in your `__main__`, you’d call it like this:

```python
if __name__ == "__main__":
    chunked_docs = load_and_chunk_corpus("data/corpus.json", chunk_size=40)
    collection   = build_chroma_collection(chunked_docs, collection_name="iterative_collection")

    initial_query    = "What internal policies apply specifically to employees?"
    # Set steps=5 but stop as soon as max_chunks=3 is reached
    iteration_results = iterative_retrieval(
        initial_query,
        collection,
        steps=5,
        max_chunks=3
    )

    final_context = build_final_context(iteration_results)
    print(f"\nFinal combined context:\n{final_context}")
```

This way, no matter how many `steps` you allow, the loop will exit once you’ve gathered `max_chunks` chunks.
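
The interaction between `steps` and `max_chunks` can be checked without a vector database. The sketch below is a simplified stand-in for `iterative_retrieval` that replaces real retrievals with a scripted list of scores; `collect` and the score values are hypothetical and exist only for this illustration.

```python
def collect(scores, steps=3, max_chunks=None, threshold=0.02):
    """Simplified stand-in for iterative_retrieval driven by scripted scores."""
    accumulated = []
    best_so_far = 0.0
    for step, score in zip(range(steps), scores):
        if score - best_so_far < threshold:
            break
        best_so_far = score
        accumulated.append({"step": step + 1, "score": score})
        # Same placement as above: the limit is checked after the append
        if max_chunks is not None and len(accumulated) >= max_chunks:
            break
    return accumulated


scores = [0.50, 0.60, 0.70, 0.80, 0.90]
print(len(collect(scores, steps=5)))                 # 5
print(len(collect(scores, steps=5, max_chunks=3)))   # 3
print(len(collect(scores, steps=2, max_chunks=3)))   # 2
print(len(collect(scores, steps=5, max_chunks=0)))   # 1
```

The last call highlights an edge case: because the check runs after the append, `max_chunks=0` still returns one chunk. If zero should mean "retrieve nothing", the check would need to move to the top of the loop.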


## Enhance Iterative Retrieval Process

Welcome back! You've done a fantastic job refining queries by prioritizing keywords in the previous exercise. Now, let's elevate your RAG pipeline by enhancing the iterative retrieval process to handle multiple keywords.

Your objective is to modify the `iterative_retrieval` function to extract and append up to two new keywords from the best chunk in each iteration. This will make your retrieval process more robust and precise.

Here's what you need to consider:

* The current setup only extracts a single keyword per iteration. Adjust the logic to gather multiple keywords.
* Ensure that these keywords are meaningful and not already part of the current query.
* Append these keywords to the query to refine it further for the next retrieval step.

By implementing this enhancement, you'll gain valuable experience in building a more sophisticated retrieval system. Once you've made the changes, observe how the retrieval process becomes more effective with each iteration. Enjoy the challenge and happy coding!

```python
import re
from data import load_and_chunk_corpus
from vector_db import build_chroma_collection

# Optional: set this to 0.0 if you don't need to enforce a minimum improvement.
IMPROVEMENT_THRESHOLD = 0.02

# Common stopwords to ignore when extracting keywords
STOPWORDS = set([
    "the","and","is","in","of","to","a","that","for","on","with","as","it","by",
    "this","are","was","at","from","or","be","which","not","can","also","have",
    "has","had","we","they","you","he","she","his","her","its","our","us","their",
    "them","i","do","does","did","just","so","if","may","will","shall","more","most",
    "some","many","any","all","what","about","would","could","should","where","when",
    "why","how"
])

def retrieve_best_chunk(query_text, collection, n_results=1):
    """
    Retrieve the best matching chunk from the collection based on the given query.
    Returns:
      best_chunk_text, best_chunk_score, best_chunk_metadata
      (or None, None, None if retrieval fails)
    """
    retrieval = collection.query(query_texts=[query_text], n_results=n_results)
    if not retrieval['documents'][0]:
        return None, None, None

    best_chunk_text = retrieval['documents'][0][0]
    best_distance = retrieval['distances'][0][0]
    best_chunk_score = 1 / (1 + best_distance)  # Simple inverted distance
    best_chunk_metadata = retrieval['metadatas'][0][0]
    return best_chunk_text, best_chunk_score, best_chunk_metadata

# TODO: Rename this function to extract_refinement_keywords and modify it to:
# 1. Accept a max_keywords parameter
# 2. Return a list of keywords instead of a single word
# 3. Sort candidates by length and return the top max_keywords
def extract_refinement_keyword(chunk_text, current_query):
    """
    Extract a single keyword from the chunk that is not in the current query.
    - Ignores stopwords and short words.
    - Chooses the longest candidate word found in the chunk.
    If no suitable keyword is found, returns an empty string.
    """
    chunk_words = re.findall(r'\b\w+\b', chunk_text.lower())
    query_words = set(re.findall(r'\b\w+\b', current_query.lower()))

    candidate_words = [
        w for w in chunk_words
        if w not in STOPWORDS and w not in query_words and len(w) > 4
    ]

    if not candidate_words:
        return ""

    # Pick the longest candidate
    refine_word = max(candidate_words, key=len)
    return refine_word

def refine_query(current_query, refine_word):
    """
    Append the chosen refine_word to the current query if it exists.
    """
    if not refine_word:
        return current_query
    return f"{current_query} {refine_word}"

def iterative_retrieval(query, collection, steps=3):
    """
    Multi-step retrieval with a simple query refinement approach:
      1) Retrieve the best chunk for the current query.
      2) Extract one new keyword from that chunk and add it to the query (optional).
      3) Stop if no improvement in similarity or no new keyword is found.
      4) Collect the chunks from each step for final context.

    This version uses a global IMPROVEMENT_THRESHOLD to decide if we should keep refining.
    """
    accumulated_chunks = []
    current_query = query
    best_score_so_far = 0.0

    for step in range(steps):
        print(f"Iteration {step+1}, current query: '{current_query}'")

        best_chunk_text, best_chunk_score, best_chunk_metadata = retrieve_best_chunk(current_query, collection)
        if not best_chunk_text:
            print("No chunks found at this step. Ending refinement.")
            break

        print(f"Best chunk text (first 50 chars): '{best_chunk_text[:50]}...' | Score: {best_chunk_score:.4f}")

        if best_chunk_score - best_score_so_far < IMPROVEMENT_THRESHOLD:
            print("Improvement threshold not met. Stopping refinements.")
            break

        best_score_so_far = best_chunk_score

        accumulated_chunks.append({
            'step': step + 1,
            'query': current_query,
            'retrieved_chunk': {
                'text': best_chunk_text,
                'metadata': best_chunk_metadata
            },
            'score': best_chunk_score
        })

        # TODO: Update this section to:
        # 1. Call the renamed extract_refinement_keywords function
        # 2. Loop through the returned keywords and add each to the query
        refine_word = extract_refinement_keyword(best_chunk_text, current_query)
        if not refine_word:
            print("No suitable keyword found for further refinement.")
            break

        print(f"Refining query with keyword: {refine_word}")
        current_query = refine_query(current_query, refine_word)

    return accumulated_chunks

def build_final_context(iteration_results):
    """
    Combine all retrieved chunks from each iteration into one context block.
    Return a fallback message if no chunks were retrieved.
    """
    if not iteration_results:
        return "No relevant information was found after iterative retrieval."

    lines = []
    for result in iteration_results:
        lines.append(
            f"- Step {result['step']} (Score={result['score']:.4f}): {result['retrieved_chunk']['text']}"
        )
    return "\n".join(lines)

if __name__ == "__main__":
    # Small demonstration with a sample corpus
    chunked_docs = load_and_chunk_corpus("data/corpus.json", chunk_size=40)
    collection = build_chroma_collection(chunked_docs, collection_name="iterative_collection")

    initial_query = "What internal policies apply specifically to employees?"
    iteration_results = iterative_retrieval(initial_query, collection, steps=3)

    final_context = build_final_context(iteration_results)
    print(f"\nFinal combined context:\n{final_context}")
```
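
The stopping rule can be tried in isolation, without a vector store. The following is a minimal sketch, assuming the same `1 / (1 + distance)` score conversion and `IMPROVEMENT_THRESHOLD = 0.02` used in this lesson's code. The helper `steps_kept` and the distance values are hypothetical and exist only for illustration.

```python
IMPROVEMENT_THRESHOLD = 0.02

def distance_to_score(distance):
    # Same conversion as the lesson's retriever: smaller distance, higher score.
    return 1 / (1 + distance)

def steps_kept(distances, threshold=IMPROVEMENT_THRESHOLD):
    """Hypothetical helper: replay the threshold check over a list of distances."""
    kept = []
    best_score_so_far = 0.0
    for step, distance in enumerate(distances, start=1):
        score = distance_to_score(distance)
        # Stop as soon as a step fails to beat the previous best by the threshold.
        if score - best_score_so_far < threshold:
            break
        best_score_so_far = score
        kept.append((step, round(score, 4)))
    return kept

# Made-up distances for four successive retrievals.
print(steps_kept([1.0, 0.6, 0.58, 0.3]))  # [(1, 0.5), (2, 0.625)]
```

Step 3 improves the score by less than 0.02, so the loop stops there and step 4 is never attempted, even though it would have scored higher. That is the trade-off of an early-stopping rule like this one.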

Here’s how you can extend your pipeline to pull up to two new keywords each iteration:

1. **Rename & expand** `extract_refinement_keyword` → `extract_refinement_keywords` to accept `max_keywords` and return a list.
2. **Sort** candidates by length (longer → shorter) and pick the top N.
3. In **`iterative_retrieval`**, call the new extractor, bail out if it returns nothing, then loop over each returned keyword to refine the query.

```python
import re
from data import load_and_chunk_corpus
from vector_db import build_chroma_collection

IMPROVEMENT_THRESHOLD = 0.02

STOPWORDS = set([
    "the","and","is","in","of","to","a","that","for","on","with","as","it","by",
    "this","are","was","at","from","or","be","which","not","can","also","have",
    "has","had","we","they","you","he","she","his","her","its","our","us","their",
    "them","i","do","does","did","just","so","if","may","will","shall","more","most",
    "some","many","any","all","what","about","would","could","should","where","when",
    "why","how"
])

def retrieve_best_chunk(query_text, collection, n_results=1):
    retrieval = collection.query(query_texts=[query_text], n_results=n_results)
    if not retrieval['documents'][0]:
        return None, None, None

    best_chunk_text = retrieval['documents'][0][0]
    best_distance = retrieval['distances'][0][0]
    # Map a non-negative distance to a score in (0, 1]: smaller distance, higher score.
    best_chunk_score = 1 / (1 + best_distance)
    best_chunk_metadata = retrieval['metadatas'][0][0]
    return best_chunk_text, best_chunk_score, best_chunk_metadata

def extract_refinement_keywords(chunk_text, current_query, max_keywords=2):
    """
    Extract up to `max_keywords` new keywords from `chunk_text`:
    - Excludes stopwords, words ≤4 chars, and anything already in `current_query`.
    - Returns the longest candidates first.
    """
    chunk_words = re.findall(r'\b\w+\b', chunk_text.lower())
    query_words = set(re.findall(r'\b\w+\b', current_query.lower()))

    candidates = {
        w for w in chunk_words
        if w not in STOPWORDS and w not in query_words and len(w) > 4
    }
    # Longest first; break ties alphabetically so the result is reproducible
    # (iteration order of a set of strings can differ between runs).
    sorted_candidates = sorted(candidates, key=lambda w: (-len(w), w))
    return sorted_candidates[:max_keywords]

def refine_query(current_query, refine_word):
    if not refine_word:
        return current_query
    return f"{current_query} {refine_word}"

def iterative_retrieval(query, collection, steps=3, max_new_keywords=2):
    accumulated_chunks = []
    current_query = query
    best_score_so_far = 0.0

    for step in range(steps):
        print(f"Iteration {step+1}, current query: '{current_query}'")
        best_chunk_text, best_chunk_score, best_chunk_metadata = retrieve_best_chunk(current_query, collection)
        if not best_chunk_text:
            print("No chunks found. Ending.")
            break

        print(f" → Chunk: '{best_chunk_text[:50]}...' | Score: {best_chunk_score:.4f}")
        if best_chunk_score - best_score_so_far < IMPROVEMENT_THRESHOLD:
            print("No significant improvement. Stopping.")
            break
        best_score_so_far = best_chunk_score

        accumulated_chunks.append({
            'step': step + 1,
            'query': current_query,
            'retrieved_chunk': {
                'text': best_chunk_text,
                'metadata': best_chunk_metadata
            },
            'score': best_chunk_score
        })

        # NEW: grab up to `max_new_keywords` new keywords from the best chunk
        new_keywords = extract_refinement_keywords(best_chunk_text, current_query, max_new_keywords)
        if not new_keywords:
            print("No new keywords found. Ending.")
            break

        for kw in new_keywords:
            print(f"Refining query with: {kw}")
            current_query = refine_query(current_query, kw)

    return accumulated_chunks

def build_final_context(iteration_results):
    if not iteration_results:
        return "No relevant information found."

    return "\n".join(
        f"- Step {r['step']} (Score={r['score']:.4f}): {r['retrieved_chunk']['text']}"
        for r in iteration_results
    )

if __name__ == "__main__":
    chunked_docs = load_and_chunk_corpus("data/corpus.json", chunk_size=40)
    collection = build_chroma_collection(chunked_docs, collection_name="iterative_collection")

    initial_query = "What internal policies apply specifically to employees?"
    results = iterative_retrieval(initial_query, collection, steps=3, max_new_keywords=2)
    print(f"\nFinal context:\n{build_final_context(results)}")
```

**What changed?**

* **`extract_refinement_keywords`** now returns a list of up to `max_keywords` new terms, longest first.
* **`iterative_retrieval`** calls the extractor once per iteration and appends every returned keyword to the query before the next retrieval.
* Everything else (threshold check, chunk collection) remains the same. Adding up to `max_new_keywords` terms per step should narrow the query in fewer iterations than adding one term at a time.
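
To see what the longest-first heuristic actually selects, here is a small self-contained sketch that runs the extractor on Chunk 1 from the earlier example. It assumes an abbreviated stopword list (the lesson's full list is longer) and breaks length ties alphabetically so the output is reproducible; both choices are for this demo only.

```python
import re

# Abbreviated stopword list for this demo; the lesson's STOPWORDS set is longer.
STOPWORDS = {"the", "and", "is", "in", "of", "to", "a", "that", "for", "all", "as", "our"}

def extract_refinement_keywords(chunk_text, current_query, max_keywords=2):
    chunk_words = re.findall(r"\b\w+\b", chunk_text.lower())
    query_words = set(re.findall(r"\b\w+\b", current_query.lower()))
    candidates = {
        w for w in chunk_words
        if w not in STOPWORDS and w not in query_words and len(w) > 4
    }
    # Longest first, alphabetical on ties (the tie-break is an assumption of this demo).
    return sorted(candidates, key=lambda w: (-len(w), w))[:max_keywords]

chunk = ("Our company requires that all staff members adhere to internal "
         "policies such as punctuality, dress code, and ethical behavior")
query = "Tell me about the regulations for staff members."

keywords = extract_refinement_keywords(chunk, query)
refined_query = " ".join([query, *keywords])

print(keywords)        # ['punctuality', 'behavior']
print(refined_query)   # Tell me about the regulations for staff members. punctuality behavior
```

Note that word length is only a rough proxy for relevance: here it picks "punctuality" and "behavior" ahead of "internal" and "policies", which a reader might consider the more useful refinements. A frequency-based or embedding-based ranking could be substituted if that matters for your corpus.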
