## Testing WANDS in Quepid via API

Objective:

Set up a local Quepid instance and programmatically load the WANDS dataset into Quepid as one or more cases.

## Initial setup of Quepid and the custom API

`docker compose build`

`cp .env.example .env`

`vi .env`

`docker compose run quepid-api-quepid bin/rake db:migrate`

`docker compose run quepid-api-quepid bin/rake db:seed`

`docker compose run quepid-api-quepid bundle exec thor user:create -a admin@example.com "Admin User" supersecret`

`docker compose run quepid-api-quepid bundle exec thor user:add_api_key admin@example.com`

Or follow the more detailed instructions in the [README](https://github.com/frutik/quepid-api-unofficial?tab=readme-ov-file#run-locally-connecting-to-local-quepid).

When you create the API key, store it; you will need it later in this notebook.

- API docs: http://localhost:8081/api/docs
- Quepid: http://localhost:3000/

#### Config

In [1]:
WANDS_INDEX = 'http://localhost:9200/wands'
QUEPID_TOKEN = 'aead3c86f788688d86e06ebc9c5ed24e717d72922a35b231a189cf6847f4a1b3'  # paste the token you created earlier
AUTH = {
    "Authorization": f"Bearer {QUEPID_TOKEN}"
}
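
Hard-coding the token makes it easy to leak when the notebook is shared. As a minimal sketch, the header could instead be built from an environment variable. The variable name `QUEPID_TOKEN` and the helper `auth_header_from_env` are illustrative assumptions, not part of Quepid or the API wrapper.

```python
import os


def auth_header_from_env(var_name="QUEPID_TOKEN"):
    """Build the Bearer header from an environment variable (hypothetical helper)."""
    token = os.environ.get(var_name, "").strip()
    if not token:
        # Fail early instead of sending requests with an empty token.
        raise RuntimeError(f"Set {var_name} to the API key created earlier")
    return {"Authorization": f"Bearer {token}"}
```

With the variable exported in the shell that starts Jupyter, `AUTH = auth_header_from_env()` would replace the literal above.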

## Python dependencies

In [3]:
!pip install pandas requests tqdm



In [2]:
import requests
import json

from tqdm import tqdm
import pandas as pd

## WANDS

WANDS is a human-annotated dataset from Wayfair for evaluating product search relevance. It includes 480 queries, ~43K products, and 233K query-product relevance labels (Exact, Partial, Irrelevant), plus rich product metadata, which makes it well suited to training and benchmarking search models.

In [3]:
!git clone https://github.com/wayfair/WANDS.git

fatal: destination path 'WANDS' already exists and is not an empty directory.


In [3]:
query_df = pd.read_csv("WANDS/dataset/query.csv", sep='\t')
query_df

| | query_id | query | query_class |
|---|---|---|---|
| 0 | 0 | salon chair | Massage Chairs |
| 1 | 1 | smart coffee table | Coffee & Cocktail Tables |
| 2 | 2 | dinosaur | Kids Wall Décor |
| 3 | 3 | turquoise pillows | Accent Pillows |
| 4 | 4 | chair and a half recliner | Recliners |
| ... | ... | ... | ... |
| 475 | 483 | rustic twig | Faux Plants and Trees |
| 476 | 484 | nespresso vertuo next premium by breville with... | Espresso Machines |
| 477 | 485 | pedistole sink | Kitchen Sinks |
| 478 | 486 | 54 in bench cushion | Furniture Cushions |


In [4]:
product_df = pd.read_csv("WANDS/dataset/product.csv", sep='\t')
product_df

| | product_id | product_name | product_class | category hierarchy | product_description | product_features | rating_count | average_rating | review_count |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | solid wood platform bed | Beds | Furniture / Bedroom Furniture / Beds & Headboa... | good , deep sleep can be quite difficult to ha... | overallwidth-sidetoside:64.7\|dsprimaryproducts... | 15.0 | 4.5 | 15.0 |
| 1 | 1 | all-clad 7 qt . slow cooker | Slow Cookers | Kitchen & Tabletop / Small Kitchen Appliances ... | create delicious slow-cooked meals , from tend... | capacityquarts:7\|producttype : slow cooker\|pro... | 100.0 | 2.0 | 98.0 |
| 2 | 2 | all-clad electrics 6.5 qt . slow cooker | Slow Cookers | Kitchen & Tabletop / Small Kitchen Appliances ... | prepare home-cooked meals on any schedule with... | features : keep warm setting\|capacityquarts:6.... | 208.0 | 3.0 | 181.0 |
| 3 | 3 | all-clad all professional tools pizza cutter | Slicers, Peelers And Graters | Browse By Brand / All-Clad | this original stainless tool was designed to c... | overallwidth-sidetoside:3.5\|warrantylength : l... | 69.0 | 4.5 | 42.0 |
| 4 | 4 | baldwin prestige alcott passage knob with roun... | Door Knobs | Home Improvement / Doors & Door Hardware / Doo... | the hardware has a rich heritage of delivering... | compatibledoorthickness:1.375 '' \|countryofori... | 70.0 | 5.0 | 42.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 42989 | 42989 | malibu pressure balanced diverter fixed shower... | Shower Panels | Home Improvement / Bathroom Remodel & Bathroom... | the malibu pressure balanced diverter fixed sh... | producttype : shower panel\|spraypattern : rain... | 3.0 | 4.5 | 2.0 |
| 42990 | 42990 | emmeline 5 piece breakfast dining set | Dining Table Sets | Furniture / Kitchen & Dining Furniture / Dinin... | | basematerialdetails : steel\| : gray wood\|ofhar... | 1314.0 | 4.5 | 864.0 |
| 42991 | 42991 | maloney 3 piece pub table set | Dining Table Sets | Furniture / Kitchen & Dining Furniture / Dinin... | this pub table set includes 1 counter height t... | additionaltoolsrequirednotincluded : power dri... | 49.0 | 4.0 | 41.0 |
| 42992 | 42992 | fletcher 27.5 '' wide polyester armchair | Teen Lounge Furniture\|Accent Chairs | Furniture / Living Room Furniture / Chairs & S... | bring iconic , modern style to your space in a... | legmaterialdetails : rubberwood\|backheight-sea... | 1746.0 | 4.5 | 1226.0 |


In [5]:
labels_df = pd.read_csv("WANDS/dataset/label.csv", sep='\t')
labels_df

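| | id | query_id | product_id | label |
|---|---|---|---|---|
| 0 | 0 | 0 | 25434 | Exact |
| 1 | 1 | 0 | 12088 | Irrelevant |
| 2 | 2 | 0 | 42931 | Exact |
| 3 | 3 | 0 | 2636 | Exact |
| 4 | 4 | 0 | 42923 | Exact |
| ... | ... | ... | ... | ... |
| 233443 | 234010 | 478 | 15439 | Partial |
| 233444 | 234011 | 478 | 451 | Partial |
| 233445 | 234012 | 478 | 30764 | Irrelevant |
| 233446 | 234013 | 478 | 16796 | Partial |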


## Set up Elasticsearch

### Create index

In [6]:
idx = requests.put(
    WANDS_INDEX,
    json={
        "mappings": {
            "properties": {
                "name": {
                    "type": "text"
                },
                "description": {
                    "type": "text"
                }
            }
        }
    }
)
idx.json()

{'acknowledged': True, 'shards_acknowledged': True, 'index': 'wands'}

### Index products

In [7]:
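def index_record(doc_id, name, description):
    # Skip rows with a missing name or description (pandas reads blank cells as NaN).
    # Checking with pd.isna instead of truthiness also keeps the product with id 0.
    if pd.isna(doc_id) or pd.isna(name) or pd.isna(description):
        return None
    try:
        return requests.post(
            f"{WANDS_INDEX}/_doc/{doc_id}",
            json={
                'name': name,
                'description': description
            }
        )
    except requests.RequestException:
        # Best effort: a failed request skips this record instead of aborting the loop.
        return None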

In [8]:
for index, row in tqdm(product_df.iterrows(), total=len(product_df)):
    _ = index_record(row['product_id'], row['product_name'], row['product_description'])

100%|████████████████████████████████████████████████████████████████████████████████████████████| 42994/42994 [02:25<00:00, 296.28it/s]
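
Indexing one document per request took about two and a half minutes here. Elasticsearch also offers a `_bulk` endpoint that accepts many documents as newline-delimited JSON in a single request. The sketch below only builds such a payload; `build_bulk_body` is a hypothetical helper, and the record layout mirrors the `name`/`description` fields used above.

```python
import json


def build_bulk_body(records):
    """Build an NDJSON body for POST <index>/_bulk (hypothetical helper).

    `records` is an iterable of (doc_id, name, description) tuples.
    """
    lines = []
    for doc_id, name, description in records:
        # Action line: index the following document under this _id.
        lines.append(json.dumps({"index": {"_id": str(doc_id)}}))
        # Source line: the document itself.
        lines.append(json.dumps({"name": name, "description": description}))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


body = build_bulk_body([(1, "slow cooker", "7 qt"), (2, "pizza cutter", "steel")])
print(body)
```

The body could then be sent in batches with `requests.post(f"{WANDS_INDEX}/_bulk", data=body.encode("utf-8"), headers={"Content-Type": "application/x-ndjson"})`. Rows with missing fields would still need to be filtered out first.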


### Search in data

In [9]:
response = requests.post(
    f"{WANDS_INDEX}/_search",
    json={
        "size": 0,
        "track_total_hits": True
    }
)
response.json()

{'took': 172,
 'timed_out': False,
 'terminated_early': False,
 '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0},
 'hits': {'total': {'value': 36985, 'relation': 'eq'},
  'max_score': None,
  'hits': []}}
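
The index reports 36,985 documents, fewer than the 42,994 rows iterated during indexing. A likely explanation, not verified against the data here, is that rows whose name or description is blank never end up in the index (pandas reads blank CSV cells as `NaN`). The hypothetical helpers below sketch how such rows could be counted in plain Python.

```python
import math


def is_missing(value):
    """True for None, blank strings, and float NaN (pandas' marker for empty cells)."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return isinstance(value, str) and not value.strip()


def count_incomplete(rows):
    """Count (name, description) pairs where either field is missing."""
    return sum(
        1 for name, description in rows
        if is_missing(name) or is_missing(description)
    )


# Made-up sample rows: one complete, one with a NaN description, one with a blank name.
sample = [("bed", "solid wood"), ("dining set", float("nan")), ("", "no name")]
print(count_incomplete(sample))  # 2
```

On the real data, a comparable count should be available as `product_df[['product_name', 'product_description']].isna().any(axis=1).sum()`.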

In [10]:
def search_query(query='#$query##'):
    return {
      "query": {
        "multi_match": {
          "query": query,
          "fields": ["name", "description"]
        }
      }
    }


def search_query_boosted(query='#$query##'):
    return {
      "query": {
        "multi_match": {
          "query": query,
          "fields": ["name^2", "description"]
        }
      }
    }


def search(query):
    response = requests.post(
        f"{WANDS_INDEX}/_search",
        json=search_query(query)
    )
    return response.json()
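
The default argument `'#$query##'` is Quepid's placeholder for the query text, so calling `search_query()` with no argument yields a query template and not a concrete query. The snippet below repeats the function from the cell above so that it runs on its own, and shows both uses.

```python
import json


def search_query(query='#$query##'):
    # Copy of the notebook's function, repeated so this snippet is self-contained.
    return {
        "query": {
            "multi_match": {
                "query": query,
                "fields": ["name", "description"]
            }
        }
    }


template = json.dumps(search_query())   # template string, as stored on a Quepid case
concrete = search_query("dinosaur")     # body for a direct Elasticsearch request

print(template)
print(concrete["query"]["multi_match"]["query"])
```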

In [11]:
search('dinosaur')

{'took': 39,
 'timed_out': False,
 '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0},
 'hits': {'total': {'value': 131, 'relation': 'eq'},
  'max_score': 11.643906,
  'hits': [{'_index': 'wands',
    '_id': '24094',
    '_score': 11.643906,
    '_source': {'name': 'dinosaur throw pillow',
     'description': 'this dinosaur throw pillow is the plushiest and friendliest stuffed dinosaur you can find . dinosaur features bright and vibrant chenille accents .'}},
   {'_index': 'wands',
    '_id': '14418',
    '_score': 11.423598,
    '_source': {'name': 'dinosaur institute diy assemble dinosaur home toy appliance set',
     'description': 'dinosaur institute diy assemble dinosaur home toy set perfect party supplies to feature : dinosaur playset : this great value dinosaur play is perfectly matched with the simulated dinosaur figurines , and the rich jurassic dinosaur period is immersed in a fascinating past . this is a complete set of 35pcs dinosaur action figures and scen
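
The raw response is verbose. A small helper can reduce it to the fields that matter when eyeballing relevance. `top_hits` is a hypothetical name, and the field names follow the response shown above.

```python
def top_hits(response_json, n=5):
    """Return (id, score, name) tuples for the first n hits of a search response."""
    hits = response_json.get("hits", {}).get("hits", [])
    return [
        (hit["_id"], hit["_score"], hit["_source"].get("name"))
        for hit in hits[:n]
    ]


# Trimmed-down response with the same shape as the output above.
example = {
    "hits": {
        "hits": [
            {"_id": "24094", "_score": 11.643906,
             "_source": {"name": "dinosaur throw pillow"}},
            {"_id": "14418", "_score": 11.423598,
             "_source": {"name": "dinosaur institute diy assemble dinosaur home toy appliance set"}},
        ]
    }
}
print(top_hits(example, n=1))  # [('24094', 11.643906, 'dinosaur throw pillow')]
```

In the notebook this would be called as `top_hits(search('dinosaur'))`.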

## Loading data into Quepid

### Create team

In [12]:
team = requests.post(
    'http://localhost:8081/api/teams/', 
    headers = AUTH,
    json={
        "name": "wands"
    }   
)

team = team.json()

In [13]:
team

{'id': 1,
 'name': 'wands',
 'created_at': '2025-07-20T15:42:44.041Z',
 'updated_at': '2025-07-20T15:42:44.041Z'}

### Create search endpoint

In [14]:
endpoint = requests.post(
    'http://localhost:8081/api/search_endpoints/', 
    headers = AUTH,
    json={
        "name": "wands",
        "endpoint_url": "http://quepid-api-elasticsearch:9200/wands/_search",
        "search_engine": "es",
        "api_method": "POST",
        "proxy_requests": 1,   
    }   
)

endpoint = endpoint.json()


In [59]:
endpoint

{'id': 1,
 'name': 'wands',
 'owner': 1,
 'search_engine': 'es',
 'endpoint_url': 'http://quepid-api-elasticsearch:9200/wands/_search',
 'api_method': 'POST',
 'custom_headers': None,
 'archived': 0,
 'created_at': '2025-07-20T11:34:17.238Z',
 'updated_at': '2025-07-20T11:34:17.238Z',
 'basic_auth_credential': None,
 'mapper_code': None,
 'proxy_requests': 1,
 'options': None}

### Create cases

In [15]:
# list scorers
scorers = requests.get(
    'http://localhost:8081/api/scorers/', 
    headers = AUTH
)
{s['id']: s['name'] for s in scorers.json()['items']}

{1: 'nDCG@10',
 2: 'DCG@10',
 3: 'CG@10',
 4: 'P@10',
 5: 'AP@10',
 6: 'RR@10',
 7: 'ERR@10'}
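
Both cases below use scorer 1, nDCG@10. For orientation, here is a textbook version of that metric with exponential gain. It is an illustration only: Quepid's built-in scorer may differ in details such as the gain function or how the ideal ranking is derived.

```python
import math


def dcg(ratings, k=10):
    """Discounted cumulative gain over the first k ratings, in ranked order."""
    return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(ratings[:k]))


def ndcg(ratings, k=10):
    """DCG divided by the DCG of the same ratings sorted best-first."""
    ideal = dcg(sorted(ratings, reverse=True), k)
    return dcg(ratings, k) / ideal if ideal > 0 else 0.0


# Ratings on the 3/2/0 scale used for WANDS labels in this notebook.
print(ndcg([3, 2, 0]))  # 1.0, already in ideal order
print(ndcg([0, 2, 3]))  # lower, because the best result is ranked last
```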

In [16]:
case1 = requests.post(
    'http://localhost:8081/api/case/', 
    headers = AUTH,
    json={
        "name": "wands",
        "scorer_id": 1,
        "book_id": 0,
        "search_endpoint_id": endpoint.get('id'),
        "search_query": json.dumps(search_query())
    }   
)

case2 = requests.post(
    'http://localhost:8081/api/case/', 
    headers = AUTH,
    json={
        "name": "wands boosted",
        "scorer_id": 1,
        "book_id": 0,
        "search_endpoint_id": endpoint.get('id'),
        "search_query": json.dumps(search_query_boosted())
    }   
)

case1 = case1.json()
case2 = case2.json()

In [17]:
print(case1)
print(case2)

{'id': 1, 'case_name': 'wands', 'last_try_number': 1, 'owner': 1, 'archived': 0, 'scorer_id': 1, 'created_at': '2025-07-20T15:43:01.072Z', 'updated_at': '2025-07-20T15:43:01.072Z', 'book_id': None, 'public': None, 'options': None, 'nightly': 1}
{'id': 2, 'case_name': 'wands boosted', 'last_try_number': 1, 'owner': 1, 'archived': 0, 'scorer_id': 1, 'created_at': '2025-07-20T15:43:01.102Z', 'updated_at': '2025-07-20T15:43:01.102Z', 'book_id': None, 'public': None, 'options': None, 'nightly': 1}


### Load queries and judgements

In [18]:
def add_query(case, query):
    quepid_query = requests.post(
        f'http://localhost:8081/api/query/{case.get("id")}/', 
        headers = AUTH,
        json={
            "query_text": query
        }   
    )
    if quepid_query.status_code == 200:
        return quepid_query.json() 


def add_label(query_id, doc_id, label):
    # doc_id is sent as a string, matching the Elasticsearch `_id` values seen above.
    return requests.post(
        f'http://localhost:8081/api/rating/query/{query_id}/rating/', 
        headers = AUTH,
        json={
            "doc_id": str(doc_id),
            "rating": label_to_rating(label)
        }   
    )


def label_to_rating(label):
    if label == 'Partial':
        return 2
    if label == 'Exact':
        return 3
    return 0


def add_labels(query_id, query_labels):
    # query_id is the id of the query in Quepid, not the WANDS query_id.
    for _, label in query_labels.iterrows():
        add_label(query_id, label['product_id'], label['label'])


In [19]:
for index, row in tqdm(query_df.iterrows(), total=len(query_df)):
    query_labels_df = labels_df[labels_df['query_id'] == row['query_id']]
    for case in [case1, case2]:
        if quepid_query := add_query(case, row['query']):
            add_labels(quepid_query.get('id'), query_labels_df)

100%|███████████████████████████████████████████████████████████████████████████████████████████████| 480/480 [1:04:42<00:00,  8.09s/it]
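
`label_to_rating` above maps any label it does not recognise to 0, so a typo or an unexpected label value would silently be stored as irrelevant. A stricter variant, sketched here with the same 3/2/0 scale, uses a lookup table and fails loudly. `RATING_BY_LABEL` and `strict_label_to_rating` are hypothetical names.

```python
RATING_BY_LABEL = {"Exact": 3, "Partial": 2, "Irrelevant": 0}


def strict_label_to_rating(label):
    """Map a WANDS label to the numeric rating used above, rejecting unknown labels."""
    try:
        return RATING_BY_LABEL[label]
    except KeyError:
        raise ValueError(f"Unknown WANDS label: {label!r}") from None


print([strict_label_to_rating(name) for name in ("Exact", "Partial", "Irrelevant")])  # [3, 2, 0]
```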


What is the latest score?

In [21]:
requests.get(
    'http://localhost:8081/api/case/1/', 
    headers = AUTH   
).json()

{'id': 1,
 'case_name': 'wands',
 'last_try_number': 1,
 'owner': 1,
 'archived': 0,
 'scorer_id': 1,
 'created_at': '2025-07-20T15:43:01Z',
 'updated_at': '2025-07-20T16:32:43Z',
 'book_id': None,
 'public': None,
 'options': None,
 'nightly': 1}

![Title](quepid-wands.png)