# Delegating to different tools

In this notebook we try three different search tools:

* An embedding search
* A keyword search
* A keyword search with category filter

We delegate to different sub agents that do these searches and aggregate the results.
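
One way to picture the aggregation step: each sub agent returns a ranked list of product ids, and something has to merge them. The sketch below uses reciprocal rank fusion (RRF), which is an assumption made for illustration; this notebook does not say the results are combined this way. The function name `fuse_rankings` and the sample rankings are hypothetical.

```python
from collections import defaultdict


def fuse_rankings(rankings: dict[str, list[int]], k: int = 60) -> list[int]:
    """Merge ranked id lists with reciprocal rank fusion (RRF)."""
    scores: dict[int, float] = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, doc_id in enumerate(ranked_ids, start=1):
            # Each list contributes more for items it ranks near the top
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Made-up rankings from three hypothetical sub agents
rankings = {
    "embedding": [27548, 42254, 824],
    "keyword": [6346, 39358, 824],
    "keyword_with_category": [824, 23758, 27548],
}
fused = fuse_rankings(rankings)
print(fused)
```

Products that several backends agree on float to the top, even when no single backend ranks them first.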

In [None]:
!pip install git+https://github.com/softwaredoug/cheat-at-search.git
from cheat_at_search.data_dir import mount
mount(use_gdrive=True)    # colab, share data across notebook runs on gdrive
# mount(use_gdrive=False) # <- colab without gdrive
# mount(use_gdrive=False, manual_path="/path/to/directory")  # <- force data path to specific directory, ie you're running locally.


Collecting git+https://github.com/softwaredoug/cheat-at-search.git
  Cloning https://github.com/softwaredoug/cheat-at-search.git to /tmp/pip-req-build-wok5i1bq
  Running command git clone --filter=blob:none --quiet https://github.com/softwaredoug/cheat-at-search.git /tmp/pip-req-build-wok5i1bq
  Resolved https://github.com/softwaredoug/cheat-at-search.git to commit 6a08d097f1d6eaa068fb61af47c621df1682f5e2
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## Get an OpenAI Key

This will prompt you for an OpenAI Key to interact with GPT-5

In [None]:
from cheat_at_search.data_dir import key_for_provider
from openai import OpenAI

OPENAI_KEY = key_for_provider("openai")

openai = OpenAI(api_key=OPENAI_KEY)

## Load the Wayfair corpus

We'll recommend products only from this corpus

In [None]:
from cheat_at_search.wands_data import corpus, product_embeddings, judgments

corpus['category'] = corpus['category'].str.strip()
corpus['sub_category'] = corpus['sub_category'].str.strip()

corpus

Unnamed: 0,product_id,product_name,product_class,category hierarchy,product_description,product_features,rating_count,average_rating,review_count,features,doc_id,title,description,category,sub_category,cat_subcat,title_snowball,description_snowball,category_snowball
0,0,solid wood platform bed,Beds,Furniture / Bedroom Furniture / Beds & Headboa...,"good , deep sleep can be quite difficult to ha...",overallwidth-sidetoside:64.7|dsprimaryproducts...,15.0,4.5,15.0,"[overallwidth-sidetoside:64.7, dsprimaryproduc...",0,solid wood platform bed,"good , deep sleep can be quite difficult to ha...",Furniture,Bedroom Furniture,Furniture / Bedroom Furniture,"Terms({'bed', 'platform', 'wood', 'solid'})","Terms({'act', 'frame', 'for', 'includ', 'natur...",Terms({'furnitur'})
1,1,all-clad 7 qt . slow cooker,Slow Cookers,Kitchen & Tabletop / Small Kitchen Appliances ...,"create delicious slow-cooked meals , from tend...",capacityquarts:7|producttype : slow cooker|pro...,100.0,2.0,98.0,"[capacityquarts:7, producttype : slow cooker, ...",1,all-clad 7 qt . slow cooker,"create delicious slow-cooked meals , from tend...",Kitchen & Tabletop,Small Kitchen Appliances,Kitchen & Tabletop / Small Kitchen Appliances,"Terms({'qt', '7', 'cooker', 'clad', 'slow', 'a...","Terms({'base', 'your', 'make', 'cook', 'meat',...","Terms({'tabletop', 'kitchen'})"
2,2,all-clad electrics 6.5 qt . slow cooker,Slow Cookers,Kitchen & Tabletop / Small Kitchen Appliances ...,prepare home-cooked meals on any schedule with...,features : keep warm setting|capacityquarts:6....,208.0,3.0,181.0,"[features : keep warm setting, capacityquarts:...",2,all-clad electrics 6.5 qt . slow cooker,prepare home-cooked meals on any schedule with...,Kitchen & Tabletop,Small Kitchen Appliances,Kitchen & Tabletop / Small Kitchen Appliances,"Terms({'qt', 'electr', 'cooker', '5', '6', 'cl...","Terms({'ani', 'on', 'and', 'home', 'cook', 'me...","Terms({'tabletop', 'kitchen'})"
3,3,all-clad all professional tools pizza cutter,"Slicers, Peelers And Graters",Browse By Brand / All-Clad,this original stainless tool was designed to c...,overallwidth-sidetoside:3.5|warrantylength : l...,69.0,4.5,42.0,"[overallwidth-sidetoside:3.5, warrantylength :...",3,all-clad all professional tools pizza cutter,this original stainless tool was designed to c...,Browse By Brand,All-Clad,Browse By Brand / All-Clad,"Terms({'profession', 'cutter', 'pizza', 'clad'...","Terms({'and', 'cutter', 'crust', 'cookwar', 'p...","Terms({'brows', 'by', 'brand'})"
4,4,baldwin prestige alcott passage knob with roun...,Door Knobs,Home Improvement / Doors & Door Hardware / Doo...,the hardware has a rich heritage of delivering...,compatibledoorthickness:1.375 '' |countryofori...,70.0,5.0,42.0,"[compatibledoorthickness:1.375 '' , countryofo...",4,baldwin prestige alcott passage knob with roun...,the hardware has a rich heritage of delivering...,Home Improvement,Doors & Door Hardware,Home Improvement / Doors & Door Hardware,"Terms({'prestig', 'alcott', 'with', 'round', '...","Terms({'inspir', 'can', 'ani', 'offer', 'an', ...","Terms({'improv', 'home'})"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
42989,42989,malibu pressure balanced diverter fixed shower...,Shower Panels,Home Improvement / Bathroom Remodel & Bathroom...,the malibu pressure balanced diverter fixed sh...,producttype : shower panel|spraypattern : rain...,3.0,4.5,2.0,"[producttype : shower panel, spraypattern : ra...",42989,malibu pressure balanced diverter fixed shower...,the malibu pressure balanced diverter fixed sh...,Home Improvement,Bathroom Remodel & Bathroom Fixtures,Home Improvement / Bathroom Remodel & Bathro...,"Terms({'divert', 'fix', 'malibu', 'panel', 'sh...","Terms({'includ', 'sleek', 'ani', 'an', 'and', ...","Terms({'improv', 'home'})"
42990,42990,emmeline 5 piece breakfast dining set,Dining Table Sets,Furniture / Kitchen & Dining Furniture / Dinin...,,basematerialdetails : steel| : gray wood|ofhar...,1314.0,4.5,864.0,"[basematerialdetails : steel, : gray wood, of...",42990,emmeline 5 piece breakfast dining set,,Furniture,Kitchen & Dining Furniture,Furniture / Kitchen & Dining Furniture,"Terms({'piec', 'dine', 'set', '5', 'breakfast'...",Terms(set()),Terms({'furnitur'})
42991,42991,maloney 3 piece pub table set,Dining Table Sets,Furniture / Kitchen & Dining Furniture / Dinin...,this pub table set includes 1 counter height t...,additionaltoolsrequirednotincluded : power dri...,49.0,4.0,41.0,[additionaltoolsrequirednotincluded : power dr...,42991,maloney 3 piece pub table set,this pub table set includes 1 counter height t...,Furniture,Kitchen & Dining Furniture,Furniture / Kitchen & Dining Furniture,"Terms({'piec', 'pub', 'maloney', 'tabl', 'set'...","Terms({'includ', 'for', 'ani', 'candlelit', 'a...",Terms({'furnitur'})
42992,42992,fletcher 27.5 '' wide polyester armchair,Teen Lounge Furniture|Accent Chairs,Furniture / Living Room Furniture / Chairs & S...,"bring iconic , modern style to your space in a...",legmaterialdetails : rubberwood|backheight-sea...,1746.0,4.5,1226.0,"[legmaterialdetails : rubberwood, backheight-s...",42992,fletcher 27.5 '' wide polyester armchair,"bring iconic , modern style to your space in a...",Furniture,Living Room Furniture,Furniture / Living Room Furniture,"Terms({'polyest', '27', 'wide', '5', 'fletcher...","Terms({'flare', 'frame', 'for', 'fill', 'showc...",Terms({'furnitur'})


### Index the furniture

We'll index title and description with basic stemming to be able to retrieve them
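
To make "index with basic stemming" concrete, here is a toy inverted index. It is only a sketch: `crude_stem` is a stand-in suffix stripper, not the Snowball algorithm, and `build_index` is a hypothetical helper. `SearchArray` does the real work in the next cell.

```python
from collections import defaultdict


def crude_stem(token: str) -> str:
    """Strip a few common English suffixes (a stand-in, NOT Snowball)."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def tokenize(text: str) -> list[str]:
    """Lowercase, split on whitespace, and stem each token."""
    return [crude_stem(tok) for tok in text.lower().split()]


def build_index(docs: list[str]) -> dict[str, set[int]]:
    """Map each stemmed term to the set of document positions containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


titles = ["solid wood platform bed", "platform beds with storage", "slow cooker"]
index = build_index(titles)

# Stem the query term the same way, so "beds" finds both "bed" and "beds"
matches = index[crude_stem("beds")]
print(sorted(matches))
```

The point of stemming at both index time and query time is that singular and plural forms land on the same term.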

In [None]:
from searcharray import SearchArray
from cheat_at_search.tokenizers import snowball_tokenizer

corpus['title_snowball'] = SearchArray.index(corpus['title'].fillna(''), snowball_tokenizer)
corpus['description_snowball'] = SearchArray.index(corpus['description'].fillna(''), snowball_tokenizer)
corpus['category_snowball'] = SearchArray.index(corpus['category'].fillna(''), snowball_tokenizer)

2026-02-11 03:53:28,798 - searcharray.indexing - INFO - Indexing begins w/ 4 workers
2026-02-11 03:53:28,829 - searcharray.indexing - INFO - 0 Batch Start tokenization
2026-02-11 03:53:28,836 - searcharray.indexing - INFO - Tokenizing 42994 documents
2026-02-11 03:53:29,575 - searcharray.indexing - INFO - Tokenized 10000 (23.259059403637718%)
2026-02-11 03:53:30,108 - searcharray.indexing - INFO - Tokenized 20000 (46.518118807275435%)
2026-02-11 03:53:30,758 - searcharray.indexing - INFO - Tokenized 30000 (69.77717821091315%)
2026-02-11 03:53:31,329 - searcharray.indexing - INFO - Tokenized 40000 (93.03623761455087%)
2026-02-11 03:53:31,889 - searcharray.indexing - INFO - Tokenization -- vstacking
2026-02-11 03:53:31,898 - searcharray.indexing - INFO - Tokenization -- DONE
2026-02-11 03:53:31,908 - searcharray.indexing - INFO - Inverting docs->terms
2026-02-11 03:53:32,032 - searcharray.indexing - INFO - Encoding positions to bit array
2026-02-11 03:53:32,114 - searcharray.indexing - INFO - Batch tokenization complete
2026-02-11 03:53:32,119 - searcharray.indexing - INFO - (main thread) Processing 1 batch results
2026-02-11 03:53:32,230 - searcharray.indexing - INFO - Indexing from tokenization complete

2026-02-11 03:53:32,296 - searcharray.indexing - INFO - Indexing begins w/ 4 workers
2026-02-11 03:53:32,315 - searcharray.indexing - INFO - 0 Batch Start tokenization
2026-02-11 03:53:32,338 - searcharray.indexing - INFO - Tokenizing 42994 documents
2026-02-11 03:53:36,347 - searcharray.indexing - INFO - Tokenized 10000 (23.259059403637718%)
2026-02-11 03:53:38,089 - searcharray.indexing - INFO - Tokenized 20000 (46.518118807275435%)
2026-02-11 03:53:39,122 - searcharray.indexing - INFO - Tokenized 30000 (69.77717821091315%)
2026-02-11 03:53:40,195 - searcharray.indexing - INFO - Tokenized 40000 (93.03623761455087%)
2026-02-11 03:53:40,766 - searcharray.indexing - INFO - Tokenization -- vstacking
2026-02-11 03:53:40,967 - searcharray.indexing - INFO - Tokenization -- DONE
2026-02-11 03:53:41,028 - searcharray.indexing - INFO - Inverting docs->terms
2026-02-11 03:53:41,727 - searcharray.indexing - INFO - Encoding positions to bit array
2026-02-11 03:53:41,971 - searcharray.indexing - INFO - Batch tokenization complete
2026-02-11 03:53:41,974 - searcharray.indexing - INFO - (main thread) Processing 1 batch results
2026-02-11 03:53:42,144 - searcharray.indexing - INFO - Indexing from tokenization complete

2026-02-11 03:53:42,290 - searcharray.indexing - INFO - Indexing begins w/ 4 workers
2026-02-11 03:53:42,298 - searcharray.indexing - INFO - 0 Batch Start tokenization
2026-02-11 03:53:42,300 - searcharray.indexing - INFO - Tokenizing 42994 documents
2026-02-11 03:53:42,467 - searcharray.indexing - INFO - Tokenized 10000 (23.259059403637718%)
2026-02-11 03:53:42,661 - searcharray.indexing - INFO - Tokenized 20000 (46.518118807275435%)
2026-02-11 03:53:42,823 - searcharray.indexing - INFO - Tokenized 30000 (69.77717821091315%)
2026-02-11 03:53:43,012 - searcharray.indexing - INFO - Tokenized 40000 (93.03623761455087%)
2026-02-11 03:53:43,177 - searcharray.indexing - INFO - Tokenization -- vstacking
2026-02-11 03:53:43,179 - searcharray.indexing - INFO - Tokenization -- DONE
2026-02-11 03:53:43,183 - searcharray.indexing - INFO - Inverting docs->terms
2026-02-11 03:53:43,190 - searcharray.indexing - INFO - Encoding positions to bit array
2026-02-11 03:53:43,196 - searcharray.indexing - INFO - Batch tokenization complete
2026-02-11 03:53:43,197 - searcharray.indexing - INFO - (main thread) Processing 1 batch results
2026-02-11 03:53:43,217 - searcharray.indexing - INFO - Indexing from tokenization complete


## Add a search tracker

Here we track which searches we've already run, so the agent doesn't repeat near-identical queries.
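
The idea can be sketched without an embedding model: keep the vectors of past queries, and reject a new query whose cosine similarity to any of them reaches a threshold. The 2-d vectors below are made up for illustration, and `ToyTracker` is a hypothetical name; the notebook's tracker embeds queries with MiniLM instead.

```python
import math
from typing import Optional


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


class ToyTracker:
    def __init__(self, threshold: float = 0.95):
        self.seen: list[tuple[str, list[float]]] = []
        self.threshold = threshold

    def similar_search(self, query: str, vector: list[float]) -> Optional[str]:
        """Return the earlier near-duplicate query, or None and remember this one."""
        for past_query, past_vector in self.seen:
            if cosine(vector, past_vector) >= self.threshold:
                return past_query
        self.seen.append((query, vector))
        return None


tracker = ToyTracker()
first = tracker.similar_search("red couch", [1.0, 0.0])     # new query
second = tracker.similar_search("red sofa", [0.99, 0.05])   # nearly parallel to the first
third = tracker.similar_search("blue lamp", [0.0, 1.0])     # orthogonal, so new
print(first, second, third)
```

Only queries that pass the check are remembered, so a rejected near-duplicate never widens the set of blocked queries.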

## Create a furniture products search function

Here is a function that searches the Wayfair product dataset. It's just a Python function that returns the top_k best-matching products (5 by default).

For now we'll call it directly; soon we'll help ChatGPT interact with it.

In [None]:
import numpy as np
from typing import Optional
from pydantic import BaseModel, Field

from sentence_transformers import SentenceTransformer

minilm = SentenceTransformer('all-MiniLM-L6-v2')


class SearchTracker:
    def __init__(self, similarity_threshold=0.95):
        self.queries = []
        self.query_embeddings = []
        self.query_log = []
        self.similarity_threshold = similarity_threshold

    def similar_search(self, new_query: str) -> Optional[str]:
        if not new_query:
            # Nothing to compare against; None means "no similar search found"
            return None

        new_embedding = minilm.encode(new_query, convert_to_numpy=True)

        for i, existing_embedding in enumerate(self.query_embeddings):
            # Cosine similarity: dot product divided by the product of the norms
            similarity = np.dot(new_embedding, existing_embedding) / (np.linalg.norm(new_embedding) * np.linalg.norm(existing_embedding))
            if similarity >= self.similarity_threshold:
                return self.queries[i]

        self.queries.append(new_query)
        self.query_embeddings.append(new_embedding)
        return None

# Example usage (not wired up to the main agent logic yet)
tracker = SearchTracker()
print(tracker.similar_search("red couch"))
print(tracker.similar_search("red couch"))
# print(tracker.similar_search("blue chair"))
# print(tracker.similar_search("large red couch"))

Loading weights:   0%|          | 0/103 [00:00<?, ?it/s]

BertModel LOAD REPORT from: sentence-transformers/all-MiniLM-L6-v2
Key                     | Status     |  | 
------------------------+------------+--+-
embeddings.position_ids | UNEXPECTED |  | 

Notes:
- UNEXPECTED	:can be ignored when loading from different task/architecture; not ok if you expect identical arch.


None
red couch


In [None]:
import numpy as np
from typing import Union

from pydantic import BaseModel, Field
from typing import Optional, Literal


Categories = Literal['Furniture', 'Kitchen & Tabletop', 'Browse By Brand',
                     'Home Improvement', 'Décor & Pillows', 'Outdoor',
                     'Storage & Organization', 'Bed & Bath', 'Baby & Kids',
                     'Pet', 'Lighting', 'Rugs', 'School Furniture and Supplies',
                     'Commercial Business Furniture', 'Holiday Décor', 'Fountains',
                     'Contractor', 'Appliances', 'Sale', 'Reception Area',
                     'Foodservice', 'Institutional Furniture Parts & Accessories',
                     'Landscaping Screens & Bridges', 'Shop Product Type', 'Clips',
                     'Slicers, Peelers And Graters', 'Bed Accessories',
                     'Accommodations', 'Buffet Accessories', 'Specialty Serving',
                     'Display Cases', 'Key Organizers', 'Ergonomic Accessories',
                     'Slow Cookers', 'Bath Rugs & Mats', 'Furniture Cushions',
                     'Early Education', 'Learning Resources',
                     'Physical Education Equipment', 'Faux Plants and Trees',
                     'Desk Parts', 'Serving Dishes & Platters', 'Water Filter Pitchers',
                     'Shower Curtain Rods', 'Table Accessories',
                     'Sandboxes & Sand Toys', 'Meeting & Collaborative Spaces',
                     'Desktop Organizers & Desk Pads',
                     'Napkin Rings, Place Card Holders & Food Markers',
                     'Partition & Panel Hardware Accessories', 'Cash Handling', 'Hooks',
                     'Novelty Lighting', 'Protection Plans',
                     'Stages, Risers and Accessories']


class ToolSearchResult(BaseModel):
    id: int = Field(description="The id of the product")
    title: str = Field(description="The title of the product")
    description: str = Field(description="The description of the product")
    category: str = Field(description="The category of the product")
    score: float = Field(description="The score of the product")


class ToolSearchResults(BaseModel):
    search_results: list[ToolSearchResult] = Field(description="The search results")
    error: Optional[str] = Field(description="Any error message", default=None)



def _track_search(keywords, category, agent_state) -> Optional[ToolSearchResults]:
    """Log the search; return an error result if a near-duplicate already ran in this category."""
    # Without shared state there is nothing to track (the search tools default to agent_state=None)
    if agent_state is None:
        return None

    query_log = agent_state.setdefault('log', [])
    search_trackers = agent_state.setdefault('search_tracker', {})

    category_lookup = "no_category"
    if category is not None:
        category_lookup = category.lower()

    # Get (or create) the tracker for this category
    if category_lookup not in search_trackers:
        search_trackers[category_lookup] = SearchTracker()
    search_tracker = search_trackers[category_lookup]

    duplicate_search = search_tracker.similar_search(keywords)
    if duplicate_search is not None:
        error_msg = f"""
                You searched '{keywords}', but you already searched for very similar '{duplicate_search}' in the past in category {category_lookup}

                Try exploring a different query to explore more of the space
                (different / no categories, or different keywords in this category)
            """
        print(error_msg)
        query_log.append((keywords, category, True))
        return ToolSearchResults(search_results=[],
                                 error=error_msg)
    query_log.append((keywords, category, False))
    return None



def search_with_category(keywords: str,
                         category: Optional[Categories] = None,
                         top_k: int = 5,
                         agent_state: Optional[dict] = None
                         ) -> ToolSearchResults:
    """Search the wayfair home goods + furniture catalog, get top_k results

    This is direct keyword search along with optional category filtering.

    Args:
        keywords: The search query string.
        category: category to filter products by, unfiltered when not present
        top_k: The number of top results to return.

    Returns:
        Search results or error message

    """
    print("search", keywords, category)
    # Check search tracker and reject if too similar
    result = _track_search(keywords, category, agent_state)
    if result is not None:
        return result
    # BM25 search

    bm25_scores = np.zeros(len(corpus))
    for term in snowball_tokenizer(keywords):
        bm25_scores += corpus['title_snowball'].array.score(term) * 10
        bm25_scores += corpus['description_snowball'].array.score(term) * 1

    # Filter by category
    if category:
        print("Filtering by category:", category)
        cat_tokenized = snowball_tokenizer(category)
        category_mask = corpus['category_snowball'].array.score(cat_tokenized) > 0
        bm25_scores = bm25_scores * category_mask


    top_k_indices = np.argsort(bm25_scores)[-top_k:][::-1]
    bm25_scores = bm25_scores[top_k_indices]
    top_products = corpus.iloc[top_k_indices].copy()
    top_products.loc[:, 'score'] = bm25_scores

    results = []
    for _, row in top_products.iterrows():
        results.append(ToolSearchResult(
            id=row['doc_id'],
            title=row['title'],
            description=row['description'],
            category=row['category'],
            score=row['score']
        ))
    search_results = ToolSearchResults(search_results=results, error=None)
    return search_results




In [None]:
agent_state={}
search_with_category("geometric style couch", top_k=5, agent_state=agent_state,
                     category="Furniture")

search geometric style couch Furniture
Filtering by category: Furniture


ToolSearchResults(search_results=[ToolSearchResult(id=824, title='double chaise lounge floor couch', description="the multi-functional lazy sofa is becoming a popular trend for people to enjoy themselves after their tiring work every day . it is great for almost every place , such as living room , bedroom , home office , dorm room , balcony , and outdoor space , and suitable for watching tv , play games , working on a laptop , or take a nap in it . you wo n't be disappointed with this purchase . lean back and get comfortable after a long day with the leisure sofa bed with a built-in 5 gear back adjuster system and take floor comfort to the next level . the built-in back adjuster system allows the chair to easily take on 5 different angled positions from 90 degrees in an upright chair position to 180 degrees flatbed position , satisfying any and all posture requirements for a customized seating experience .", category='Furniture', score=36.46202087402344), ToolSearchResult(id=23758, tit

In [None]:
search_with_category("geometric style couch", top_k=5, agent_state=agent_state,
                     category="Furniture")

search geometric style couch Furniture

                You searched 'geometric style couch', but you already searched for very similar 'geometric style couch' in the past in category furniture

                Try exploring a different query to explore more of the space
                (different / no categories, or different keywords in this category)
            


ToolSearchResults(search_results=[], error="\n                You searched 'geometric style couch', but you already searched for very similar 'geometric style couch' in the past in category furniture\n\n                Try exploring a different query to explore more of the space\n                (different / no categories, or different keywords in this category)\n            ")
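
The scoring in `search_with_category` boils down to three steps: weight title matches 10x over description matches, zero out products outside the category, and keep the top_k. Here is a sketch of those steps with made-up per-field scores (the real ones come from `SearchArray`'s BM25 scoring). Unlike the function above, this sketch also drops zero-score products, so a category with fewer than top_k matches does not get padded; that is a design choice of the sketch, and `top_k_in_category` is a hypothetical name.

```python
def top_k_in_category(title_scores, desc_scores, categories, category=None, top_k=5):
    """Combine field scores, filter by category, return (index, score) pairs."""
    # Title matches count 10x as much as description matches
    combined = [10 * t + 1 * d for t, d in zip(title_scores, desc_scores)]
    if category is not None:
        # Zero out everything outside the requested category
        combined = [s if c == category else 0.0 for s, c in zip(combined, categories)]
    ranked = sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)
    # Keep only products that actually matched
    return [(i, combined[i]) for i in ranked[:top_k] if combined[i] > 0]


title_scores = [2.0, 0.0, 3.0, 0.0]
desc_scores = [1.0, 4.0, 0.0, 0.0]
categories = ["Furniture", "Furniture", "Lighting", "Rugs"]

hits = top_k_in_category(title_scores, desc_scores, categories,
                         category="Furniture", top_k=3)
print(hits)
```

Product 2 has the highest raw score but sits in another category, so it is filtered out even though top_k=3 leaves room for it.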

## Create some alternative search functions

Alternative search strategies for exploring the data.

In [None]:
def search_with_keywords(keywords: str,
                         top_k: int = 5,
                         agent_state: Optional[dict] = None
                         ) -> ToolSearchResults:
    """Search the wayfair home goods + furniture catalog, get top_k results

    This is direct keyword search

    Args:
        keywords: The search query string.
        top_k: The number of top results to return.

    Returns:
        Search results or error message

    """
    print("search", keywords)
    # Check search tracker and reject if too similar
    result = _track_search(keywords, "no_category", agent_state)
    if result is not None:
        return result
    # BM25 search

    bm25_scores = np.zeros(len(corpus))
    for term in snowball_tokenizer(keywords):
        bm25_scores += corpus['title_snowball'].array.score(term) * 10
        bm25_scores += corpus['description_snowball'].array.score(term) * 1
    top_k_indices = np.argsort(bm25_scores)[-top_k:][::-1]
    bm25_scores = bm25_scores[top_k_indices]
    top_products = corpus.iloc[top_k_indices].copy()
    top_products.loc[:, 'score'] = bm25_scores

    results = []
    for _, row in top_products.iterrows():
        results.append(ToolSearchResult(
            id=row['doc_id'],
            title=row['title'],
            description=row['description'],
            category=row['category'],
            score=row['score']
        ))
    search_results = ToolSearchResults(search_results=results, error=None)
    return search_results


def search_embeddings(keywords: str,
                      top_k: int = 5,
                      agent_state: Optional[dict] = None) -> ToolSearchResults:
    """Search the wayfair home goods + furniture catalog, get top_k results

    This is an embedding search: the keywords are embedded and compared against embeddings of product title + description

    Args:
        keywords: The search query string.
        top_k: The number of top results to return.

    Returns:
        Search results or error message

    """
    print("search emb", keywords)
    # Check search tracker and reject if too similar
    result = _track_search(keywords, "no_category", agent_state)
    if result is not None:
        return result

    # Score every product by dot product with the query embedding (numpy)
    query_embedding = minilm.encode(keywords, convert_to_numpy=True)
    scores = np.dot(query_embedding, product_embeddings.T)
    top_k_indices = np.argsort(scores)[-top_k:][::-1]
    scores = scores[top_k_indices]
    top_products = corpus.iloc[top_k_indices].copy()
    top_products.loc[:, 'score'] = scores

    results = []
    for _, row in top_products.iterrows():
        results.append(ToolSearchResult(
            id=row['doc_id'],
            title=row['title'],
            description=row['description'],
            category=row['category'],
            score=row['score']
        ))
    search_results = ToolSearchResults(search_results=results, error=None)
    return search_results


agent_state={}
search_embeddings("geometric style couch", top_k=5, agent_state=agent_state)

search emb geometric style couch


ToolSearchResults(search_results=[ToolSearchResult(id=27548, title="jarrett 112 '' wide sofa & chaise", description="gathering everyone for a movie marathon , or hosting a game night ? whatever your living room needs , it 's hard to beat sectional sofas when it comes to giving everyone a seat ( and your living room an on-trend look ) . take this one , for example , ideal for anchoring your space with a touch of mid-century-inspired style , this piece features a streamlined silhouette , recessed arms , and dowel feet . crafted with a solid wood frame , this piece features foam and fiber filling for an inviting feel , and is wrapped in neutral polyester blend upholstery that allows it o easily join a variety of color schemes .", category='Furniture', score=0.71204674243927), ToolSearchResult(id=42254, title="hillcrest 85 '' linen square arm sofa bed", description='the classic design of this sofa gives it a distinctive look that elevates your well-curated collection . the back cushion can
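
The ranking step of `search_embeddings` is a dot product between the query vector and every product vector, followed by a top_k cut. The sketch below shows that step with tiny hand-written unit vectors standing in for MiniLM embeddings, so the numbers and labels are illustrative only and `embedding_top_k` is a hypothetical name.

```python
def embedding_top_k(query_vec, product_vecs, top_k=2):
    """Rank products by dot product with the query vector."""
    scores = [sum(q * p for q, p in zip(query_vec, vec)) for vec in product_vecs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in ranked[:top_k]]


# Pretend embeddings; with unit-length vectors the dot product equals cosine similarity
product_vecs = [
    [1.0, 0.0],   # pretend: a sofa
    [0.0, 1.0],   # pretend: a chandelier
    [0.6, 0.8],   # pretend: something in between
]
query_vec = [0.8, 0.6]

top = embedding_top_k(query_vec, product_vecs)
print([i for i, _ in top])
```

No query term has to appear in the product text for it to rank well; closeness in the vector space is all that counts.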

In [None]:
agent_state = {}
search_with_keywords("geometric style couch", top_k=5, agent_state=agent_state)

search geometric style couch


ToolSearchResults(search_results=[ToolSearchResult(id=6346, title='kaat 3 - light candle style geometric chandelier', description='welcome guests to your home with a splash of statement lighting , illuminate your bedroom or light up your dining room table with this charismatic geometric chandelier . made from steel in a handsome metallic finish , this alluring design showcases a simple circular canopy , a straight down rod , and a distinctive openwork geometric frame around a contemporary candelabra-style base . this hardwired modern luminary accommodates three lightbulbs of up to 60 w each ( bulbs not included ) .', category='Lighting', score=45.974431455135345), ToolSearchResult(id=39358, title='deloris 4 - light candle style geometric chandelier', description='this 4-light pendant features a unique design that enhances the contemporary . it also adds a modern style atmosphere to your home for a more fashionable feel .', category='Lighting', score=43.722705125808716), ToolSearchResul

In [None]:
def call_tool(tool_info, agent_state, item) -> dict:

    # Lookup how the agent wants to call the tool
    tool_name = item.name
    tool = tool_info[tool_name]
    ToolArgsModel = tool[0]
    tool_fn = tool[2]
    fn_args: ToolArgsModel = ToolArgsModel.model_validate_json(item.arguments)

    print(f"Calling {tool_name} with args {fn_args}")
    # The tool call function itself (ie search)
    # wrapped in something helping with serialization
    py_resp, json_resp = tool_fn(fn_args, agent_state=agent_state)
    print("output", py_resp)

    # Provide function call results to the model
    return {
        "type": "function_call_output",
        "call_id": item.call_id,
        "output": json_resp,
    }
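
`call_tool` indexes each entry of `tool_info` as a 3-tuple: an arguments model, the tool spec sent to the model, and a callable that returns both a Python and a JSON response. The sketch below mimics that shape with the standard library only. The dataclass stands in for the pydantic model, and `SearchArgs`, `fake_search`, `make_entry`, and the spec dictionary are all hypothetical; the real entries come from `make_tool_adapter`, whose return shape is assumed here from how `call_tool` uses it.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class SearchArgs:
    keywords: str
    top_k: int = 5


def fake_search(keywords: str, top_k: int = 5, agent_state=None) -> list[str]:
    """Stand-in search tool that records the query in shared state."""
    if agent_state is not None:
        agent_state.setdefault("log", []).append(keywords)
    return [f"{keywords} result {i}" for i in range(top_k)]


def make_entry(args_model, fn):
    """Build an (args model, tool spec, wrapped fn) tuple like call_tool indexes."""
    spec = {"type": "function", "name": fn.__name__}

    def wrapped(args, agent_state=None):
        py_resp = fn(**asdict(args), agent_state=agent_state)
        # Return both the Python object and its JSON form
        return py_resp, json.dumps(py_resp)

    return args_model, spec, wrapped


tool_info = {"fake_search": make_entry(SearchArgs, fake_search)}

# Simulate the model requesting a tool call with JSON-encoded arguments
raw_arguments = '{"keywords": "red couch", "top_k": 2}'
args_model, _, tool_fn = tool_info["fake_search"]
state = {}
py_resp, json_resp = tool_fn(args_model(**json.loads(raw_arguments)), agent_state=state)
print(json_resp)
```

Passing `agent_state` through the wrapper is what lets a tool remember earlier calls, while the model only ever sees the JSON response.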


## The full agentic loop

The main difference from previous notebooks: here we're experimenting with the orchestrator. There are a couple of different patterns to explore:

* Restart the agent each time, but with feedback?
* Share results between runs of each search backend?
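
The first pattern (restart with feedback) needs the previous run's results turned into the emoji feedback that the system prompt describes. Here is a minimal sketch, assuming relevance grades of 0, 1, and 2 map onto the three emoji. That mapping, the function name `format_feedback`, and the sample grades are assumptions, not taken from the notebook.

```python
# Assumed grade scale: 0 = frustrating, 1 = meh, 2 = solves the problem
GRADE_TO_EMOJI = {0: "☹️", 1: "😑", 2: "😃"}


def format_feedback(ranked_ids: list[int], grades: dict[int, int]) -> str:
    """Render one feedback line per returned product id."""
    lines = []
    for doc_id in ranked_ids:
        grade = grades.get(doc_id)
        # Unjudged products get no emoji rather than a guessed label
        emoji = GRADE_TO_EMOJI.get(grade, "(no feedback)")
        lines.append(f"{doc_id}: {emoji}")
    return "\n".join(lines)


feedback = format_feedback([824, 6346, 27548], {824: 2, 6346: 0})
print(feedback)
```

The resulting text could be appended to the inputs of the next run, so the agent knows which results to keep and which to replace.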

In [None]:

import textwrap
from time import sleep
from cheat_at_search.wands_data import labeled_query_products
from cheat_at_search.agent.pydantize import make_tool_adapter
from typing import Tuple


system_prompt = """
    You take user search queries and use a search tool to find furniture / home goods products.

    Look at the search tools you have, their limitations, how they work, etc when forming your plan.

    Finally return results to the user per the SearchResults schema, ranked best to worst.

    Gather results until you have 10 best matches you can find. It's important to return at least 10.

    You'll get feedback from the user on whether the results satisfy them:

    ☹️ - ACTIVELY FRUSTRATES USER, do not include unless absolutely necessary
    😑 - Meh results. OK in a pinch. But there could be better.
    😃 - Solves the user's problem. Good job! Rank this highest

    If you get this feedback, try again to:

    * Find results that can replace the ☹️ / 😑 with 😃
    * Hypothesize new queries that might find 😃
    * Use the content in known 😃 results to create new queries to the search tool

    Consider possibly

    * Not searching categories if no relevant results found

    It's very important you consider carefully the correct ranking as you'll be evaluated on
    how close that is to the average furniture shopper's ideal ranking.

"""
# system_prompt = build_few_shot_prompt(system_prompt)



class SearchResults(BaseModel):
    """The ranked, top 10 search results ordered most relevant to least."""
    results_summary: str = Field(description="The message from you summarizing what you found")

    ranked_results: list[int] = Field(description="Top ranked search results (their doc_ids)")


def agent_run(tool_info,
              text_format,
              inputs,
              agent_state,
              model='gpt-5-mini',
              summary=True) -> tuple:

    assert inputs
    tool_calls = True
    resp = None
    while tool_calls:
        failing = True
        num_failures = 0
        while failing:
            try:
                # print(inputs)
                resp = openai.responses.parse(
                    model=model,
                    input=inputs,
                    tools=[tool[1] for tool in tool_info.values()],
                    reasoning={
                        "effort": "medium",
                        "summary": "auto" if summary else "none"
                    },
                    text_format=text_format
                )
                failing = False
            except Exception:
                failing = True
                num_failures += 1
                if num_failures > 3:
                    raise
                sleep(1)
        inputs += resp.output
        if summary:
            usage = resp.usage
            print("--")
            print(f"InpTok: {usage.input_tokens}")
            print(f"OutTok: {usage.output_tokens}")
            for item in resp.output:
                if item.type == "reasoning":
                    print("Reasoning:")
                    for summary_item in item.summary:
                        print(textwrap.fill(summary_item.text, 80), "\n")
                    item.summary = []

        # Reset once per response; any function_call below keeps the loop going
        tool_calls = False
        for item in resp.output:
            if item.type == "function_call":
                tool_calls = True
                # *** Get the tool, and package
                # up the call to the tool (our python function)
                tool_response = call_tool(tool_info,
                                          agent_state=agent_state,
                                          item=item)

                # 4. Provide function call results to the model
                inputs.append(tool_response)
    return resp, inputs


def _error_msg(error):
    print(error)
    return {"role": "user", "content": f"Oh this isn't good, it turns out: {error}. Please try again"}


def _grade_to_emoji(grade):
    if grade == 0:
        return '☹️'
    elif grade == 1:
        return '😑'
    elif grade == 2:
        return '😃'
    return '☹️'


def grades(query, search_results: SearchResults):
    """Get array of grades for each search result."""
    query_judgments = labeled_query_products[labeled_query_products['query'] == query]
    r_value = []
    for doc_id in search_results.ranked_results:
        title = corpus[corpus['doc_id'] == doc_id]['title'].iloc[0]
        doc_judgments = query_judgments[query_judgments['doc_id'] == doc_id]
        if len(doc_judgments) == 0:
            r_value.append((title, doc_id, _grade_to_emoji(None)))
        else:
            r_value.append((title, doc_id, _grade_to_emoji(int(doc_judgments['grade'].values[0]))))
    return r_value


def search(query,
           inputs,
           search_tool_fn) -> Tuple[SearchResults, list[tuple]]:
    """A little search harness."""

    error = True
    resp = None
    search_tool = make_tool_adapter(search_tool_fn)

    agent_state = {'log': []}
    tool_info = {search_tool_fn.__name__: search_tool}
    while error:
        resp, inputs = agent_run(tool_info,
                                 text_format=SearchResults,
                                 inputs=inputs,
                                 agent_state=agent_state)
        # Validate what came back, ensure it fits
        # Lookup each doc id in the response
        num_results = len(resp.output_parsed.ranked_results)
        if num_results != 10:
            inputs.append(_error_msg(f"Expected 10 ranked_results, got {num_results}"))
            error = True
            continue
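        missing = [doc_id for doc_id in resp.output_parsed.ranked_results
                   if doc_id not in corpus['doc_id'].values]
        if missing:
            # Retry the whole agent run; a `continue` inside a for loop over
            # doc ids would only skip to the next id, not restart the while loop
            inputs.append(_error_msg(f"Doc ids {missing} are not in corpus"))
            error = True
            continue
        graded_results = grades(query, resp.output_parsed)
        error = False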

    return resp.output_parsed, graded_results



### The orchestrator

Here we orchestrate search carefully: we share context between runs when we want to, and isolate each task when we don't want to share context.
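
As a toy illustration of the difference between isolating and sharing context, the sketch below uses a hypothetical `fake_subagent` that stands in for a real search subagent and only appends one assistant turn to the message list. `fresh_inputs`, `PROMPT`, `QUERY`, and `BACKENDS` are likewise made up for illustration; no model or search tool is called.

```python
PROMPT = "You are a search agent."   # hypothetical system prompt
QUERY = "queen wingback chair"
BACKENDS = ["category", "keywords", "embeddings"]  # hypothetical backend names


def fresh_inputs(system_prompt, query):
    """Start a brand-new conversation: system prompt plus the user's query."""
    return [{"role": "system", "content": system_prompt},
            {"role": "user", "content": query}]


def fake_subagent(name, inputs):
    """Stand-in for a subagent run: appends a single assistant turn."""
    inputs.append({"role": "assistant", "content": f"results from {name}"})
    return inputs


# Pattern 1: isolated context. Each backend starts from a fresh message list,
# so no backend sees what the others found.
isolated = {name: fake_subagent(name, fresh_inputs(PROMPT, QUERY))
            for name in BACKENDS}

# Pattern 2: shared context. One message list is threaded through every
# backend, so later backends see earlier results (and pay for those tokens).
shared = fresh_inputs(PROMPT, QUERY)
for name in BACKENDS:
    shared = fake_subagent(name, shared)

print({name: len(msgs) for name, msgs in isolated.items()})
print(len(shared))
```

The trade-off this is meant to show: isolated runs keep each prompt small and independent, while a shared history lets later searches build on earlier ones at the cost of a growing input.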

In [None]:

def merge_results(labeled1, labeled2):
    """Combine labeled1 and labeled2 into a single result list"""
    doc_ids = set()
    merged = []

    for labeled in [labeled1, labeled2]:

        for title, doc_id, grade in labeled:
            if doc_id not in doc_ids:
                merged.append((title, doc_id, grade))
                doc_ids.add(doc_id)

    return merged


def orchestrate_search(keywords, top_k=10):
    """Essentially orchestrate each strategy as a subagent."""

    def _reset_inputs(system_prompt, keywords):
        return [{"role": "system", "content": system_prompt},
                {"role": "user", "content": keywords}]


    def _label_resp(inputs, labeled):
        if not labeled:
            return inputs
        merged = labeled
        graded_summary = "Here are your results reflecting whether they satisfy the user:\n"
        for title, doc_id, grade in merged:
            graded_summary += f"\n{title} (doc_id:{doc_id}) {grade}"
        graded_summary += "\n\n Given this, try some additional searches to find more candidates and rerank using this information."
        print(graded_summary)
        inputs.append({"role": "user", "content": graded_summary})
        return inputs

    labeled_cat = None
    labeled_lex = None
    labeled_emb = None

    # Below, we've isolated each to have its own context
    inputs = _reset_inputs(system_prompt, keywords)
    for attempt in range(5):
        inputs = _label_resp(inputs, labeled_cat)

        resp_cat, labeled_cat = search(keywords,
                                       inputs,
                                       search_with_category)

    inputs = _reset_inputs(system_prompt, keywords)
    for attempt in range(5):
        inputs = _label_resp(inputs, labeled_lex)
        resp_lex, labeled_lex = search(keywords,
                                       inputs,
                                       search_with_keywords)

    inputs = _reset_inputs(system_prompt, keywords)
    for attempt in range(5):
        inputs = _label_resp(inputs, labeled_emb)
        resp_emb, labeled_emb = search(keywords,
                                       inputs,
                                       search_embeddings)

    # Concat them
    ranked = resp_cat.ranked_results + resp_lex.ranked_results + resp_emb.ranked_results
    labeled = labeled_cat + labeled_lex + labeled_emb

    # Sort on labels, happiest first
    def to_numerical_score(emoji):
        if emoji == '☹️':
            return 0
        elif emoji == '😑':
            return 1
        elif emoji == '😃':
            return 2
        return 0
    scored = [to_numerical_score(label) for _, _, label in labeled]

    # Dedup
    together = {doc_id: score for doc_id, score in zip(ranked, scored)}

    # Sort together based on score, happiest first
    together = sorted(together.items(), key=lambda x: x[1], reverse=True)

    # Limit to top_k
    together = together[:top_k]

    sorted_scores = [item[1] for item in together]
    sorted_ranked = [item[0] for item in together]

    final_ranked_results = SearchResults(ranked_results=sorted_ranked,
                                         results_summary="")
    return final_ranked_results, sorted_scores



resp = orchestrate_search("queen wingback chair")
resp

--
InpTok: 843
OutTok: 197
Reasoning:
**Searching for wingback chairs**  I need to use the search tool to find the
best matches for "queen wingback chair." It’s a little unclear if the user means
a queen-sized version or just a color related to queens. Maybe they want a
royal-style chair? I'll focus on the keywords "wingback chair queen" and
categorize it under Furniture. Since I only need one tool, I’ll call the
function directly to get the top 10 results. 

Calling search_with_category with args keywords='queen wingback chair' category='Furniture' top_k=10
search queen wingback chair Furniture
Filtering by category: Furniture
output {'search_results': [{'id': 25898, 'title': 'wingback chair', 'description': "this classic wing chair provides the perfect refuge for relaxation with a tall back and side panels to cradle a sleeping head . not only is this piece comfortable , but also elegant ; fully upholstered with top grain leather and beautifully detailed with an intricate -head . this

(SearchResults(results_summary='', ranked_results=[9265, 1514, 1572, 22017, 20408, 29950, 21295, 23749, 39544, 4506]),
 [1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
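
The final merge step in `orchestrate_search` (dedup, sort by grade, truncate) can be seen in isolation with a small sketch. The doc ids and scores below are invented for illustration, and `merge_and_rank` is a hypothetical helper name, not part of `cheat_at_search`.

```python
def merge_and_rank(ranked, scored, top_k=10):
    """Dedup doc ids, then order by score (highest first) and truncate."""
    # A dict keeps one entry per doc_id, at the position it was first seen
    together = {doc_id: score for doc_id, score in zip(ranked, scored)}
    # Python's sort is stable, so equal scores keep their first-seen order
    ordered = sorted(together.items(), key=lambda x: x[1], reverse=True)
    return ordered[:top_k]


# Hypothetical concatenated output of several subagents; doc 102 appears twice
ranked = [101, 102, 103, 102, 104]
scored = [1, 2, 0, 2, 2]

top = merge_and_rank(ranked, scored, top_k=3)
print(top)
```

Because ties fall back to first-seen order, the order in which subagent results are concatenated decides which equally graded documents survive the `top_k` cut.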

## Compare to BM25 baseline

In [None]:
import numpy as np
from cheat_at_search.strategy import SearchStrategy
from cheat_at_search.search import run_strategy
from cheat_at_search.wands_data import judgments

class AgenticSearchStrategy(SearchStrategy):
    def __init__(self,
                 corpus,
                 workers=4):
        super().__init__(corpus, workers=workers)

    def search(self, query, k):
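        # NOTE: this prefixed string is also what orchestrate_search passes on to
        # grades(), whose exact match on the raw query then likely finds no labels
        agentic_query = "Find me: " + query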
        print("_________---___________")
        print(agentic_query)
        print("_________---___________")
        top_k, scores = orchestrate_search(agentic_query, top_k=k)
        return top_k.ranked_results, scores


# Get 8 random queries from judgments
seed = 1234
np.random.seed(seed)
random_queries = np.random.choice(judgments['query'].unique(), 8)
selected_judgments = judgments[judgments['query'].isin(random_queries)]
selected_judgments

strategy = AgenticSearchStrategy(corpus, workers=1)
graded_agentic = run_strategy(strategy, selected_judgments)

_________---___________
Find me: carolyn console table
_________---___________


Searching:   0%|          | 0/8 [00:00<?, ?it/s]

--
InpTok: 847
OutTok: 89
Reasoning:
**Searching for console table**  I need to search the Wayfair catalog for the
"carolyn console table." I’ll use the search tool and pick the Furniture
category since it fits best. I’m thinking to query for this specific item
without specifying any additional categories. I’ll set the top_k to around 10 to
get a decent number of results. Let’s go ahead and call the search function now! 

Calling search_with_category with args keywords='carolyn console table' category='Furniture' top_k=10
search carolyn console table Furniture
Filtering by category: Furniture
output {'search_results': [{'id': 24572, 'title': "carolyn 48 '' solid wood console table", 'description': "offering up space for staging , serving , and stowing , three-tier tables are ideal for any living room or entryway ensemble . take this console table for example : pairing a dark brown-finished metal frame with a trio of pinewood levels , it 's sure to fit right into industrial and farmhous

Searching:  12%|█▎        | 1/8 [05:36<39:17, 336.73s/it]

--
InpTok: 846
OutTok: 91
Reasoning:
**Searching for furniture items**  I need to use the
functions.search_with_category tool to find the top results for "queen wingback
chair." The user might be looking for a queen-size wingback, but I want to
confirm that the furniture category is indeed "Furniture." I plan to gather 10
of the best matches using the top_k parameter set to 10. Okay, time to call the
tool and see what I can find! 

Calling search_with_category with args keywords='queen wingback chair' category='Furniture' top_k=10
search queen wingback chair Furniture
Filtering by category: Furniture
output {'search_results': [{'id': 25898, 'title': 'wingback chair', 'description': "this classic wing chair provides the perfect refuge for relaxation with a tall back and side panels to cradle a sleeping head . not only is this piece comfortable , but also elegant ; fully upholstered with top grain leather and beautifully detailed with an intricate -head . this chair 's design and functio

Searching:  25%|██▌       | 2/8 [12:16<37:22, 373.78s/it]

--
InpTok: 29315
OutTok: 815
Reasoning:
**Asking for clarification**  It seems the user was expecting to see 10 ranked
results, and now they're feeling disappointed with the previous outputs. They've
provided a list of items but are still uncertain about their request related to
"queen wingback chair." I should ask a direct question: Do they mean a queen-
sized wingback chair or perhaps a "Queen Anne" style chair? I’ll also clarify
materials, tufting, color preferences, maximum width, reclining options, and
price. This will help guide any additional searches more effectively. 

**Clarifying user preferences**  I should reply to the user by asking them to
clarify what they mean and their preferences regarding the chair. I can propose
options and ask for confirmation before continuing with searches. If they want
"Queen Anne style" chairs, I could look for ones with Queen Anne backs, or if
they're interested in something more "oversized throne-like," I could search for
"throne," "oversize

Searching:  38%|███▊      | 3/8 [19:58<34:29, 413.90s/it]

--
InpTok: 29422
OutTok: 1026
Reasoning:
**Searching for alternatives**  I'm noticing that the searches are returning a
lot of irrelevant results, and the catalog appears sparse for this product. The
user has asked for additional searches and reranking. I need to be transparent:
we haven't found exact matches for Wishbone chairs. Still, I can provide some
alternatives that mention "Y chair," "Ming chairs," or "paper cord seat."
While I found specific items that include paper cord, we must still present the
ten best matches. I'll suggest searching other retailers like Amazon or Wayfair
for further options. 

**Reviewing past results**  I’m scanning through earlier results, noting the
items we looked at, like 40170 Donnellson and 33336 Handcraft Tufted Parsons. It
seems we've identified ten different items, but the user keeps marking them as
unsatisfactory. They've asked for more searches, but I feel we've exhausted the
catalog. I need to communicate that we've searched thoroughly withou

Searching:  50%|█████     | 4/8 [26:46<27:27, 411.87s/it]

--
InpTok: 40824
OutTok: 956
Reasoning:
_________---___________
Find me: light wood nightstand with silver accents
_________---___________
--
InpTok: 849
OutTok: 98
Reasoning:
**Searching for furniture**  I need to use the search function specifically for
categories. The query is "light wood nightstand with silver accents," and I’m
thinking the category should be Furniture. Since the default tool gives me 5
results, I should request at least 10 this time to get a broader selection.
Alright, let’s go ahead and call the search function now! This should help me
find exactly what I’m looking for. 

Calling search_with_category with args keywords='light wood nightstand with silver accents' category='Furniture' top_k=10
search light wood nightstand with silver accents Furniture
Filtering by category: Furniture
output {'search_results': [{'id': 9005, 'title': 'dolo 1 - drawer solid wood nightstand in silver', 'description': 'simple , straight lines give our nightstand its appealing , straight

Searching:  62%|██████▎   | 5/8 [34:25<21:26, 428.84s/it]

--
InpTok: 37518
OutTok: 617
Reasoning:
**Assembling hardware IDs**  I'm looking at IDs that feature silver hardware or
light wood. I have several, like Jerry 22910 (silver hardware), Emily 1484
(brushed nickel), and Rae 36396 (silver handles), among others. I want to create
a top 10 list of these IDs, ensuring to include those with consistent silver
finishes.  Since many were marked negatively by the user, I need to suggest
alternative options, like searching external stores or recommending swapping
hardware. Alright, let’s put together that final SearchResults JSON. 

_________---___________
Find me: small woven pouf
_________---___________
--
InpTok: 846
OutTok: 170
Reasoning:
**Searching for poufs**  The user is asking for a "small woven pouf." I’ll need
to use the search tool to look through the Wayfair catalog and gather at least
10 of the best matches. The default setting gives me five options, but I can
adjust that to get up to ten. Since a pouf fits in decor, I should categori

Searching:  75%|███████▌  | 6/8 [41:44<14:24, 432.06s/it]

--
InpTok: 25463
OutTok: 731
Reasoning:
**Considering user feedback**  I see that the user isn't satisfied and has given
all sad faces. It's clear I need to rethink my approach here. I should strive to
replace those sad emojis with happy ones! My next step is to ask for
clarification on specific details like size, material, shape, style, or price
since the user didn’t respond earlier. Also, I'll need to ensure I provide
ranked results as required, but it seems the searches didn't yield much. Let's
refine my queries! 

**Clarifying user preferences**  I need to keep this simple: let's ask the user
if they want a pouf under a certain size or a specific material. It’s important
to be concise! I remember that the user flagged an error previously when I
returned zero results, so I must include 10 ranked results even if some are
repeats from earlier searches. I’ll use the doc IDs from promising items and
provide a summary asking for details on size and material while ensuring
everything foll

Searching:  88%|████████▊ | 7/8 [47:13<06:38, 398.51s/it]

--
InpTok: 26434
OutTok: 737
Reasoning:
**Gathering recent document IDs**  I need to include the 10 most recent document
IDs we searched. They are: 41040, 17269, 23917, 16024, 28565, 30214, 27469,
29265, 30860, and 35987. I'll summarize this information and ask a clarifying
question about relevance. My goal is to ensure I capture the most pertinent
document IDs from our searches, then return them in a JSON format while asking
for any options or further input. 

_________---___________
Find me: cliffside 5 light candle style drum chandelier
_________---___________
--
InpTok: 851
OutTok: 97
Reasoning:
**Searching Wayfair catalog**  I need to use functions.search_with_category to
find items in the Wayfair catalog. The query I’ll use is "cliffside 5 light
candle style drum chandelier," and I think the category should be Lighting since
that fits. I know that it's valid because Lighting is one of the available
categories. I’ll set the top_k parameter to 10 to retrieve 10 results. Now it's
ti

Searching: 100%|██████████| 8/8 [53:04<00:00, 398.10s/it]

--
InpTok: 18814
OutTok: 772
Reasoning:
**Searching for chandelier options**  I’m continuing my search because the user
seems disappointed with the results. It looks like they want a specific product
called the "Cliffside 5 light candle style drum chandelier." Since my search
tool didn't find "Cliffside," I should ask if they want the exact model.
However, I should also explore broader queries. Maybe "Cliffside" refers to a
brand like "Possini"? I could ask if they want me to explore outside the catalog
for more options, too. 

**Clarifying search results**  The developer mentioned that I should return
results using the SearchResults format, including a summary and a ranked list of
IDs. However, since I need to clarify the user's needs, I think it might be
smarter to ask a clarifying question instead of performing more searches that
haven’t worked. They've requested additional searches and I’ve already explored
many. So, I could ask if they want me to look outside the catalog, or if th




In [None]:
from cheat_at_search.search import ndcgs, graded_bm25, vs_ideal
# 5,5,4,7
ndcgs(graded_agentic), ndcgs(graded_agentic).mean()

(query
 wishbone chair                                    0.751253
 small woven pouf                                  0.437655
 cliffside 5 light candle style drum chandelier    0.333333
 led 60                                            0.333333
 queen wingback chair                              0.333333
 light wood nightstand with silver accents         0.333333
 carolyn console table                             0.309308
 dull bed with shirt head board                    0.035406
 Name: ndcg, dtype: float64,
 np.float64(0.3583694621325023))

In [None]:
ndcgs(graded_bm25[graded_bm25['query'].isin(random_queries)]), ndcgs(graded_bm25[graded_bm25['query'].isin(random_queries)]).mean()

(query
 wishbone chair                                    0.674750
 small woven pouf                                  0.468003
 carolyn console table                             0.333333
 cliffside 5 light candle style drum chandelier    0.333333
 led 60                                            0.333333
 queen wingback chair                              0.307727
 light wood nightstand with silver accents         0.288624
 dull bed with shirt head board                    0.000000
 Name: ndcg, dtype: float64,
 np.float64(0.3423881136250733))
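
To make the comparison easier to read, here is a small sketch that lines up the two outputs above query by query. The NDCG values are copied from the printed results (rounded to six decimals); the variable names are just for illustration.

```python
# NDCG per query, copied from the agentic output above
agentic = {
    "wishbone chair": 0.751253,
    "small woven pouf": 0.437655,
    "cliffside 5 light candle style drum chandelier": 0.333333,
    "led 60": 0.333333,
    "queen wingback chair": 0.333333,
    "light wood nightstand with silver accents": 0.333333,
    "carolyn console table": 0.309308,
    "dull bed with shirt head board": 0.035406,
}

# NDCG per query, copied from the BM25 output above
bm25 = {
    "wishbone chair": 0.674750,
    "small woven pouf": 0.468003,
    "carolyn console table": 0.333333,
    "cliffside 5 light candle style drum chandelier": 0.333333,
    "led 60": 0.333333,
    "queen wingback chair": 0.307727,
    "light wood nightstand with silver accents": 0.288624,
    "dull bed with shirt head board": 0.000000,
}

# Positive delta means the agentic strategy beat BM25 on that query
deltas = {q: agentic[q] - bm25[q] for q in agentic}
wins = sum(d > 0 for d in deltas.values())
losses = sum(d < 0 for d in deltas.values())
ties = sum(d == 0 for d in deltas.values())
mean_delta = sum(deltas.values()) / len(deltas)

for q, d in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{q:<50} {d:+.4f}")
print(f"wins={wins} losses={losses} ties={ties} mean_delta={mean_delta:+.4f}")
```

On these 8 queries the agentic strategy is ahead on 4, behind on 2, and tied on 2, with a mean NDCG gain of roughly 0.016. With so few queries that gap is small, so it is probably best read as a hint rather than a conclusion.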