# Query -> Full classification

<small>
(from <a href="http://maven.com/softwaredoug/cheat-at-search">Cheat at Search with LLMs</a> training course by Doug Turnbull.)
</small>

In this refinement of [past examples](https://colab.research.google.com/drive/1AfK3uGV3Lbrv5henj995YV4XEpVmLpBf), we move away from classifying category and subcategory independently. Instead, we classify each query to a single full classification, i.e. one of the following:


```
Furniture / Living Room Furniture / Chairs & Seating / Accent Chairs,
Rugs / Area Rugs,
...
```

We will recompute the previous stats to see how well this works.
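
Because a full classification is a ` / `-separated path, the category and subcategory from the earlier examples can still be recovered from it. A minimal sketch of that idea, where `split_classification` is a hypothetical helper for illustration and not part of the course library:

```python
from typing import Optional, Tuple


def split_classification(path: str) -> Tuple[str, Optional[str]]:
    """Return (category, sub_category) from a ' / '-separated path.

    sub_category is None when the path has only one level.
    """
    parts = [p.strip() for p in path.split(" / ")]
    category = parts[0]
    sub_category = parts[1] if len(parts) > 1 else None
    return category, sub_category


print(split_classification(
    "Furniture / Living Room Furniture / Chairs & Seating / Accent Chairs"
))
print(split_classification("Rugs / Area Rugs"))
```

This is why the earlier category and subcategory stats can be recomputed from a single predicted label.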

## Boilerplate

Install dependencies, mount Google Drive, and prompt for your OpenAI key (stored in your Google Drive).

In [None]:
!pip install git+https://github.com/softwaredoug/cheat-at-search.git
from cheat_at_search.data_dir import mount
mount(use_gdrive=True)
from cheat_at_search.search import run_strategy, graded_bm25, ndcgs, ndcg_delta, vs_ideal
from cheat_at_search.wands_data import products

products

Collecting git+https://github.com/softwaredoug/cheat-at-search.git
  Cloning https://github.com/softwaredoug/cheat-at-search.git to /tmp/pip-req-build-870ksit2
  Running command git clone --filter=blob:none --quiet https://github.com/softwaredoug/cheat-at-search.git /tmp/pip-req-build-870ksit2
  Resolved https://github.com/softwaredoug/cheat-at-search.git to commit 8b45aa193bb58c42c89e1aed1c6279be0be54afa
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting pystemmer<4.0.0,>=3.0.0 (from cheat_at_search==0.1.0)
  Downloading PyStemmer-3.0.0-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.8 kB)
Collecting searcharray<0.0.74,>=0.0.73 (from cheat_at_search==0.1.0)
  Downloading searcharray-0.0.73-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (13 kB)
Downloading PyStemmer-3.0.0-cp3

Unnamed: 0,product_id,product_name,product_class,category hierarchy,product_description,product_features,rating_count,average_rating,review_count,features,category,sub_category,cat_subcat
0,0,solid wood platform bed,Beds,Furniture / Bedroom Furniture / Beds & Headboa...,"good , deep sleep can be quite difficult to ha...",overallwidth-sidetoside:64.7|dsprimaryproducts...,15.0,4.5,15.0,"[overallwidth-sidetoside:64.7, dsprimaryproduc...",Furniture,Bedroom Furniture,Furniture / Bedroom Furniture
1,1,all-clad 7 qt . slow cooker,Slow Cookers,Kitchen & Tabletop / Small Kitchen Appliances ...,"create delicious slow-cooked meals , from tend...",capacityquarts:7|producttype : slow cooker|pro...,100.0,2.0,98.0,"[capacityquarts:7, producttype : slow cooker, ...",Kitchen & Tabletop,Small Kitchen Appliances,Kitchen & Tabletop / Small Kitchen Appliances
2,2,all-clad electrics 6.5 qt . slow cooker,Slow Cookers,Kitchen & Tabletop / Small Kitchen Appliances ...,prepare home-cooked meals on any schedule with...,features : keep warm setting|capacityquarts:6....,208.0,3.0,181.0,"[features : keep warm setting, capacityquarts:...",Kitchen & Tabletop,Small Kitchen Appliances,Kitchen & Tabletop / Small Kitchen Appliances
3,3,all-clad all professional tools pizza cutter,"Slicers, Peelers And Graters",Browse By Brand / All-Clad,this original stainless tool was designed to c...,overallwidth-sidetoside:3.5|warrantylength : l...,69.0,4.5,42.0,"[overallwidth-sidetoside:3.5, warrantylength :...",Browse By Brand,All-Clad,Browse By Brand / All-Clad
4,4,baldwin prestige alcott passage knob with roun...,Door Knobs,Home Improvement / Doors & Door Hardware / Doo...,the hardware has a rich heritage of delivering...,compatibledoorthickness:1.375 '' |countryofori...,70.0,5.0,42.0,"[compatibledoorthickness:1.375 '' , countryofo...",Home Improvement,Doors & Door Hardware,Home Improvement / Doors & Door Hardware
...,...,...,...,...,...,...,...,...,...,...,...,...,...
42989,42989,malibu pressure balanced diverter fixed shower...,Shower Panels,Home Improvement / Bathroom Remodel & Bathroom...,the malibu pressure balanced diverter fixed sh...,producttype : shower panel|spraypattern : rain...,3.0,4.5,2.0,"[producttype : shower panel, spraypattern : ra...",Home Improvement,Bathroom Remodel & Bathroom Fixtures,Home Improvement / Bathroom Remodel & Bathro...
42990,42990,emmeline 5 piece breakfast dining set,Dining Table Sets,Furniture / Kitchen & Dining Furniture / Dinin...,,basematerialdetails : steel| : gray wood|ofhar...,1314.0,4.5,864.0,"[basematerialdetails : steel, : gray wood, of...",Furniture,Kitchen & Dining Furniture,Furniture / Kitchen & Dining Furniture
42991,42991,maloney 3 piece pub table set,Dining Table Sets,Furniture / Kitchen & Dining Furniture / Dinin...,this pub table set includes 1 counter height t...,additionaltoolsrequirednotincluded : power dri...,49.0,4.0,41.0,[additionaltoolsrequirednotincluded : power dr...,Furniture,Kitchen & Dining Furniture,Furniture / Kitchen & Dining Furniture
42992,42992,fletcher 27.5 '' wide polyester armchair,Teen Lounge Furniture|Accent Chairs,Furniture / Living Room Furniture / Chairs & S...,"bring iconic , modern style to your space in a...",legmaterialdetails : rubberwood|backheight-sea...,1746.0,4.5,1226.0,"[legmaterialdetails : rubberwood, backheight-s...",Furniture,Living Room Furniture,Furniture / Living Room Furniture


## Query -> Full classification

We'll first set up the model that maps a query to its full classification.

Here we provide a long list of valid full classifications for the LLM to use.

**Warning:** this gets expensive. It'll cost $5 or so to run. We'll quickly switch to some ways of saving costs, so no worries.
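
To see how a list of valid labels can constrain an answer, here is a minimal sketch in plain Python. It assumes a hypothetical three-label subset (`DemoClassifications`) and a hypothetical `coerce_label` helper; the notebook itself relies on a pydantic field typed with the full `Literal` instead, so this only illustrates the idea.

```python
from typing import Literal, get_args

# Hypothetical three-label subset of the full list, for illustration only.
DemoClassifications = Literal[
    "Rugs / Area Rugs",
    "Furniture / Office Furniture / Desks",
    "No Classification Fits",
]

# get_args unpacks the Literal into a tuple of its allowed string values.
ALLOWED = set(get_args(DemoClassifications))


def coerce_label(label: str) -> str:
    """Keep an on-list label; fall back to the catch-all label otherwise."""
    return label if label in ALLOWED else "No Classification Fits"


print(coerce_label("Rugs / Area Rugs"))
print(coerce_label("Rugs / Shag Rugs"))
```

Every allowed label has to be described to the model on each call, which is one plausible reason a long list drives up cost.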

In [None]:
from typing import List, Literal, get_args

from pydantic import BaseModel, Field
from cheat_at_search.enrich import AutoEnricher

# Ameer says to try:
# 'Furniture / Bedroom Furniture / Beds & Headboards / Beds',
#-> ' Beds / Beds & Headboards / Bedroom Furniture / Furniture'
FullyQualifiedClassifications = Literal[
 'Furniture / Bedroom Furniture / Beds & Headboards / Beds',
 'Furniture / Living Room Furniture / Chairs & Seating / Accent Chairs',
 'Rugs / Area Rugs',
 'Furniture / Office Furniture / Desks',
 'Furniture / Living Room Furniture / Coffee Tables & End Tables / Coffee Tables',
 'Furniture / Living Room Furniture / Coffee Tables & End Tables / End & Side Tables',
 'Décor & Pillows / Decorative Pillows & Blankets / Throw Pillows',
 'Furniture / Bedroom Furniture / Dressers & Chests',
 'Outdoor / Outdoor & Patio Furniture / Patio Furniture Sets / Patio Conversation Sets',
 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Bathroom Vanities / All Bathroom Vanities',
 'Furniture / Living Room Furniture / Console Tables',
 'Décor & Pillows / Art / All Wall Art',
 'Furniture / Kitchen & Dining Furniture / Bar Furniture / Bar Stools & Counter Stools / All Bar Stools & Counter Stools',
 'Furniture / Kitchen & Dining Furniture / Dining Tables & Seating / Kitchen & Dining Chairs',
 'Furniture / Office Furniture / Office Chairs',
 'Décor & Pillows / Mirrors / All Mirrors',
 'Bed & Bath / Bedding / All Bedding',
 'Décor & Pillows / Wall Décor / Wall Accents',
 'Furniture / Living Room Furniture / Chairs & Seating / Recliners',
 'Furniture / Kitchen & Dining Furniture / Dining Tables & Seating / Kitchen and Dining Sets',
 'Décor & Pillows / Window Treatments / Curtains & Drapes',
 'Furniture / Living Room Furniture / Sectionals',
 'Baby & Kids / Toddler & Kids Bedroom Furniture / Kids Beds',
 'Furniture / Living Room Furniture / TV Stands & Media Storage Furniture / TV Stands & Entertainment Centers',
 'Lighting / Ceiling Lights / Chandeliers',
 'Furniture / Bedroom Furniture / Nightstands',
 'Baby & Kids / Toddler & Kids Bedroom Furniture / Kids Desks',
 'Décor & Pillows / Home Accessories / Decorative Objects',
 'Furniture / Bedroom Furniture / Beds & Headboards / Headboards',
 'Furniture / Living Room Furniture / Sofas',
 'Furniture / Living Room Furniture / Cabinets & Chests',
 'Décor & Pillows / Clocks / Wall Clocks',
 'Storage & Organization / Bathroom Storage & Organization / Bathroom Cabinets & Shelving',
 'Lighting / Table & Floor Lamps / Table Lamps',
 'Furniture / Living Room Furniture / Ottomans & Poufs',
 'Furniture / Kitchen & Dining Furniture / Kitchen Islands & Carts',
 'Furniture / Living Room Furniture / Bookcases',
 'Outdoor / Outdoor & Patio Furniture / Outdoor Seating & Patio Chairs / Patio Seating / Patio Sofas & Sectionals',
 'Furniture / Office Furniture / Office Storage Cabinets',
 'Furniture / Kitchen & Dining Furniture / Dining Tables & Seating / Kitchen & Dining Tables',
 'Contractor / Entry & Hallway / Coat Racks & Umbrella Stands',
 'Bed & Bath / Bedding Essentials / Mattress Pads & Toppers',
 'Home Improvement / Hardware / Home Hardware / Switch Plates',
 'Baby & Kids / Toddler & Kids Playroom / Playroom Furniture / Toddler & Kids Chairs & Seating',
 'Storage & Organization / Garage & Outdoor Storage & Organization / Outdoor Covers / Patio Furniture Covers',
 'Rugs / Doormats',
 'Rugs / Kitchen Mats',
 'Furniture / Bedroom Furniture / Beds & Headboards / Beds / Queen Size Beds',
 'Furniture / Bedroom Furniture / Daybeds',
 'Furniture / Living Room Furniture / Living Room Sets',
 'Outdoor / Outdoor & Patio Furniture / Patio Furniture Sets / Patio Dining Sets',
 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Bathroom Sinks & Faucet Components / Bathroom Sink Faucets / Single Hole Bathroom Sink Faucets',
 'Outdoor / Outdoor Décor / Statues & Sculptures',
 'Décor & Pillows / Art / All Wall Art / Green Wall Art',
 'Furniture / Living Room Furniture / Coffee Tables & End Tables / Coffee Table Sets',
 'Furniture / Living Room Furniture / Chairs & Seating / Chaise Lounge Chairs',
 'Storage & Organization / Wall Shelving & Organization / Wall and Display Shelves',
 'Furniture / Living Room Furniture / Coffee Tables & End Tables / Coffee Tables / Rectangle Coffee Tables',
 'Décor & Pillows / Art / All Wall Art / Brown Wall Art',
 'Furniture / Kitchen & Dining Furniture / Bar Furniture / Bar Stools & Counter Stools / All Bar Stools & Counter Stools / Counter (24-27) Bar Stools & Counter Stools',
 'Furniture / Living Room Furniture / Coffee Tables & End Tables / Plant Stands & Tables',
 'Décor & Pillows / Window Treatments / Curtain Hardware & Accessories',
 'Furniture / Kitchen & Dining Furniture / Dining Tables & Seating / Kitchen & Dining Chairs / Side Kitchen & Dining Chairs',
 'Outdoor / Outdoor & Patio Furniture / Outdoor Seating & Patio Chairs / Patio Seating / Outdoor Club Chairs',
 'Furniture / Living Room Furniture / Chairs & Seating / Benches',
 'Home Improvement / Kitchen Remodel & Kitchen Fixtures / Kitchen Sinks & Faucet Components / Kitchen Sinks / Farmhouse & Apron Kitchen Sinks',
 'Kitchen & Tabletop / Kitchen Organization / Food Pantries',
 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Towel Storage / Towel & Robe Hooks / Black Towel & Robe Hooks',
 'Storage & Organization / Garage & Outdoor Storage & Organization / Deck Boxes & Patio Storage',
 'Outdoor / Garden / Planters',
 'Lighting / Wall Lights / Bathroom Vanity Lighting',
 'Furniture / Kitchen & Dining Furniture / Sideboards & Buffets',
 'Storage & Organization / Garage & Outdoor Storage & Organization / Storage Racks & Shelving Units',
 'Home Improvement / Hardware / Cabinet Hardware / Cabinet & Drawer Pulls / Bronze Cabinet & Drawer Pulls',
 'Storage & Organization / Storage Containers & Drawers / All Storage Containers',
 'Bed & Bath / Shower Curtains & Accessories / Shower Curtains & Shower Liners',
 'Storage & Organization / Bathroom Storage & Organization / Hampers & Laundry Baskets',
 'Lighting / Light Bulbs & Hardware / Light Bulbs / All Light Bulbs / LED Light Bulbs',
 'Décor & Pillows / Art / All Wall Art / Blue Wall Art',
 'Bed & Bath / Mattresses & Foundations / Innerspring Mattresses',
 'Lighting / Outdoor Lighting / Outdoor Wall Lighting',
 'Storage & Organization / Garage & Outdoor Storage & Organization / Natural Material Storage / Log Storage',
 'Bed & Bath / Bathroom Accessories & Organization / Countertop Bath Accessories',
 'Storage & Organization / Shoe Storage / All Shoe Storage',
 'Home Improvement / Flooring, Walls & Ceiling / Floor Tiles & Wall Tiles / Ceramic Floor Tiles & Wall Tiles',
 'Home Improvement / Hardware / Cabinet Hardware / Cabinet & Drawer Pulls / Black Cabinet & Drawer Pulls',
 'Bed & Bath / Mattresses & Foundations / Adjustable Beds',
 "Rugs / Area Rugs / 2' x 3' Area Rugs",
 'Commercial Business Furniture / Commercial Office Furniture / Office Storage & Filing / Office Carts & Stands / All Carts & Stands',
 'Furniture / Bedroom Furniture / Beds & Headboards / Beds / Twin Beds',
 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Bathroom Sinks & Faucet Components / Bathroom Sink Faucets / Widespread Bathroom Sink Faucets',
 "Rugs / Area Rugs / 4' x 6' Area Rugs",
 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Bathroom Sinks & Faucet Components / Bathroom Sink Faucets',
 'Kitchen & Tabletop / Tableware & Drinkware / Table & Kitchen Linens / All Table Linens',
 'Kitchen & Tabletop / Kitchen Organization / Food Storage & Canisters / Food Storage Containers',
 'Décor & Pillows / Flowers & Plants / Faux Flowers',
 'Bed & Bath / Bedding / All Bedding / Twin Bedding',
 'Furniture / Bedroom Furniture / Dressers & Chests / White Dressers & Chests',
 'Home Improvement / Flooring, Walls & Ceiling / Floor Tiles & Wall Tiles / Porcelain Floor Tiles & Wall Tiles',
 'Home Improvement / Flooring, Walls & Ceiling / Flooring Installation & Accessories / Molding & Millwork / Wall Molding & Millwork',
 'Home Improvement / Doors & Door Hardware / Door Hardware & Accessories / Barn Door Hardware',
 'Bed & Bath / Bedding / Sheets & Pillowcases',
 'Furniture / Office Furniture / Chair Mats / Hard Floor Chair Mats',
 'Outdoor / Outdoor Fencing & Flooring / All Fencing',
 'Storage & Organization / Closet Storage & Organization / Clothes Racks & Garment Racks',
 'Kitchen & Tabletop / Kitchen Utensils & Tools / Colanders, Strainers, & Salad Spinners',
 'Outdoor / Hot Tubs & Saunas / Saunas',
 'Décor & Pillows / Decorative Pillows & Blankets / Throw Pillows / Blue Throw Pillows',
 'Bed & Bath / Bedding Essentials / Bed Pillows',
 'Lighting / Wall Lights / Wall Sconces',
 'Outdoor / Front Door Décor & Curb Appeal / Mailboxes',
 'Outdoor / Garden / Greenhouses',
 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Showers & Bathtubs / Showers & Bathtubs Plumbing / Shower Faucets & Systems',
 'Bed & Bath / Mattresses & Foundations / Queen Mattresses',
 'Furniture / Bedroom Furniture / Jewelry Armoires',
 'Outdoor / Outdoor Shades / Awnings',
 'Baby & Kids / Nursery Bedding / Crib Bedding Sets',
 'Home Improvement / Hardware / Cabinet Hardware / Cabinet & Drawer Knobs / Brass Cabinet & Drawer Knobs',
 'Décor & Pillows / Art / All Wall Art / Red Wall Art',
 'Lighting / Ceiling Lights / All Ceiling Lights',
 'Lighting / Light Bulbs & Hardware / Lighting Components',
 'Furniture / Game Tables & Game Room Furniture / Poker & Card Tables',
 'Appliances / Kitchen Appliances / Range Hoods / All Range Hoods',
 'Home Improvement / Flooring, Walls & Ceiling / Floor Tiles & Wall Tiles / Natural Stone Floor Tiles & Wall Tiles',
 'Furniture / Kitchen & Dining Furniture / Bar Furniture / Bar Stools & Counter Stools / All Bar Stools & Counter Stools / Bar (28-33) Bar Stools & Counter Stools',
 'Outdoor / Outdoor Cooking & Tableware / Outdoor Serving & Tableware / Coolers, Baskets & Tubs / Picnic Baskets & Backpacks',
 'Décor & Pillows / Picture Frames & Albums / All Picture Frames',
 'Bed & Bath / Shower Curtains & Accessories / Shower Curtain Hooks',
 'Outdoor / Outdoor Shades / Outdoor Umbrellas / Patio Umbrella Stands & Bases',
 'Outdoor / Outdoor & Patio Furniture / Patio Bar Furniture / Patio Bar Stools',
 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Toilets & Bidets / Toilet Paper Holders / Free Standing Toilet Paper Holders',
 'Storage & Organization / Garage & Outdoor Storage & Organization / Bike & Sport Racks',
 'Appliances / Kitchen Appliances / Refrigerators & Freezers / All Refrigerators / French Door Refrigerators',
 'Décor & Pillows / Home Accessories / Decorative Trays',
 'School Furniture and Supplies / School Spaces / Computer Lab Furniture / Podiums & Lecterns',
 'Lighting / Light Bulbs & Hardware / Lighting Shades',
 'Furniture / Kitchen & Dining Furniture / Bar Furniture / Home Bars & Bar Sets',
 'Lighting / Table & Floor Lamps / Floor Lamps',
 'Décor & Pillows / Wall Décor / Wall Accents / Brown Wall Accents',
 'Kitchen & Tabletop / Small Kitchen Appliances / Pressure & Slow Cookers / Slow Cookers / Slow Slow Cookers',
 'Décor & Pillows / Window Treatments / Curtains & Drapes / 90 Inch Curtains & Drapes',
 'Furniture / Bedroom Furniture / Armoires & Wardrobes',
 'Kitchen & Tabletop / Tableware & Drinkware / Flatware & Cutlery / Serving Utensils',
 'Baby & Kids / Baby & Kids Décor & Lighting / All Baby & Kids Wall Art',
 'Furniture / Office Furniture / Desks / Writing Desks',
 'Furniture / Office Furniture / Office Chairs / Task Office Chairs',
 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Showers & Bathtubs / Shower & Bathtub Doors',
 'Outdoor / Outdoor & Patio Furniture / Outdoor Seating & Patio Chairs / Patio Seating / Patio Rocking Chairs & Gliders',
 'Home Improvement / Flooring, Walls & Ceiling / Walls & Ceilings / Wall Paneling',
 'Outdoor / Garden / Plant Stands & Accessories',
 'Furniture / Kitchen & Dining Furniture / Dining Tables & Seating / Kitchen & Dining Tables / 4 Seat Kitchen & Dining Tables',
 'Décor & Pillows / Home Accessories / Vases, Urns, Jars & Bottles',
 'Lighting / Wall Lights / Under Cabinet Lighting / Strip Under Cabinet Lighting',
 'Furniture / Bedroom Furniture / Bedroom and Makeup Vanities',
 'Pet / Dog / Dog Bowls & Feeding Supplies / Pet Bowls & Feeders',
 'Décor & Pillows / Candles & Holders / Candle Holders',
 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Showers & Bathtubs / Shower & Bathtub Accessories',
 'Furniture / Office Furniture / Office Chair Accessories / Seat Cushion Office Chair Accessories',
 'Furniture / Office Furniture / Chair Mats',
 'Furniture / Living Room Furniture / Chairs & Seating / Massage Chairs',
 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Bathroom Vanities / All Bathroom Vanities / Modern & Contemporary Bathroom Vanities',
 'Lighting / Ceiling Fans / All Ceiling Fans',
 'Home Improvement / Kitchen Remodel & Kitchen Fixtures / Kitchen Sinks & Faucet Components / Kitchen Faucets / Black Kitchen Faucets',
 'Lighting / Light Bulbs & Hardware / Light Bulbs / All Light Bulbs / Incandescent Light Bulbs',
 'Home Improvement / Flooring, Walls & Ceiling / Flooring Installation & Accessories / Molding & Millwork',
 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Showers & Bathtubs / Bathtubs',
 'Décor & Pillows / Art / All Wall Art / Yellow Wall Art',
 'Pet / Dog / Pet Gates, Fences & Doors / Pet Gates',
 'Furniture / Bedroom Furniture / Beds & Headboards / Bed Frames / Twin Bed Frames',
 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Towel Storage / Towel Bars, Racks, and Stands / Metal Towel Bars, Racks, and Stands',
 'Décor & Pillows / Art / All Wall Art / Pink Wall Art',
 'Home Improvement / Kitchen Remodel & Kitchen Fixtures / Smoke Detectors / Wall & Ceiling Mounted Smoke Detectors',
 'Outdoor / Garden / Planters / Plastic Planters',
 'Décor & Pillows / Mirrors / All Mirrors / Accent Mirrors',
 'Appliances / Kitchen Appliances / Range Hoods / All Range Hoods / Wall Mount Range Hoods',
 'Outdoor / Garden / Garden Décor / Lawn & Garden Accents',
 'Furniture / Living Room Furniture / Coffee Tables & End Tables / Coffee Tables / Round Coffee Tables',
 'Kitchen & Tabletop / Tableware & Drinkware / Dinnerware / Dining Bowls',
 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Showers & Bathtubs / Showers & Bathtubs Plumbing / Shower Heads / Dual Shower Heads',
 'Home Improvement / Flooring, Walls & Ceiling / Floor Tiles & Wall Tiles / Glass Floor Tiles & Wall Tiles',
 'School Furniture and Supplies / Facilities & Maintenance / Trash & Recycling',
 'Home Improvement / Hardware / Cabinet Hardware / Cabinet & Drawer Pulls / Nickel Cabinet & Drawer Pulls',
 'Storage & Organization / Closet Storage & Organization / Closet Systems',
 'Furniture / Bedroom Furniture / Beds & Headboards / Beds / Full & Double Beds',
 'Commercial Business Furniture / Commercial Office Furniture / Office Storage & Filing / Office Carts & Stands / All Carts & Stands / Printer Carts & Stands',
 'Storage & Organization / Closet Storage & Organization / Closet Accessories',
 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Bathroom Vanities / All Bathroom Vanities / Traditional Bathroom Vanities',
 'Home Improvement / Plumbing / Core Plumbing / Parts & Components',
 'Holiday Décor / Christmas / Christmas Trees / All Christmas Trees',
 'Décor & Pillows / Decorative Pillows & Blankets / Throw Pillows / Black Throw Pillows',
 'Furniture / Game Tables & Game Room Furniture / Sports Team Fan Shop & Memorabillia / Life Size Cutouts',
 'Lighting / Ceiling Lights / Pendant Lighting',
 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Towel Storage / Towel & Robe Hooks',
 'Appliances / Washers & Dryers / Dryers / All Dryers / Gas Dryers',
 'Outdoor / Outdoor Recreation / Backyard Play / Kids Cars & Ride-On Toys',
 'Kitchen & Tabletop / Small Kitchen Appliances / Coffee, Espresso, & Tea / Coffee Makers',
 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Showers & Bathtubs / Showers & Bathtubs Plumbing / Shower Heads',
 'Outdoor / Outdoor & Patio Furniture / Outdoor Seating & Patio Chairs / Patio Seating / Patio Sofas & Sectionals / Sectional Patio Sofas & Sectionals',
 'Lighting / Wall Lights / Under Cabinet Lighting',
 'Foodservice / Foodservice Tables / Table Parts',
 'Lighting / Outdoor Lighting / Landscape Lighting / All Landscape Lighting / Fence Post Cap Landscape Lighting',
 'Lighting / Outdoor Lighting / Landscape Lighting / All Landscape Lighting',
 'Outdoor / Outdoor & Patio Furniture / Outdoor Tables / All Patio Tables',
 'Commercial Business Furniture / Commercial Office Furniture / Office Storage & Filing / Office Carts & Stands / All Carts & Stands / Utility Carts & Stands',
 'Outdoor / Outdoor & Patio Furniture / Outdoor Seating & Patio Chairs / Outdoor Chaise & Lounge Chairs',
 'Furniture / Living Room Furniture / Chairs & Seating / Recliners / Brown Recliners',
 'Pet / Bird / Bird Perches & Play Gyms',
 'Décor & Pillows / Picture Frames & Albums / All Picture Frames / Single Picture Picture Frames',
 'Lighting / Outdoor Lighting / Outdoor Lanterns & Lamps',
 'Home Improvement / Hardware / Cabinet Hardware / Cabinet & Drawer Pulls',
 'Bed Accessories',
 'Clips/Clamps',
 'Décor & Pillows / Wall Décor / Wall Decals',
 'Home Improvement / Flooring, Walls & Ceiling / Floor Tiles & Wall Tiles',
 'Bed & Bath / Bedding / Sheets & Pillowcases / Twin XL Sheets & Pillowcases',
 'Kitchen & Tabletop / Tableware & Drinkware / Serveware / Serving Trays & Boards / Serving Trays & Platters / Serving Serving Trays & Platters',
 'Holiday Décor / Holiday Lighting',
 'Décor & Pillows / Wall Décor / Memo Boards',
 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Toilets & Bidets / Toilet Paper Holders / Wall Mounted Toilet Paper Holders',
 'Décor & Pillows / Window Treatments / Curtains & Drapes / 63 Inch and Less Curtains & Drapes',
 'Home Improvement / Doors & Door Hardware / Door Hardware & Accessories / Door Knobs / Egg Door Knobs',
 'Décor & Pillows / Clocks / Wall Clocks / Analog Wall Clocks',
 'Home Improvement / Doors & Door Hardware / Interior Doors / Sliding Interior Doors',
 'Outdoor / Outdoor Recreation / Outdoor Games / All Outdoor Games',
 'Home Improvement / Doors & Door Hardware / Door Hardware & Accessories / Door Levers / Round Door Levers',
 'Storage & Organization / Garage & Outdoor Storage & Organization / Sheds / Storage Sheds',
 'Home Improvement / Doors & Door Hardware / Door Hardware & Accessories / Door Levers',
 'School Furniture and Supplies / School Furniture / School Tables / Folding Tables / Wood Folding Tables',
 'Décor & Pillows / Wall Décor / Wall Accents / Green Wall Accents',
 'School Furniture and Supplies / Facilities & Maintenance / Commercial Signage',
 'Storage & Organization / Garage & Outdoor Storage & Organization / Garage Storage Cabinets',
 'Furniture / Bedroom Furniture / Dressers & Chests / Beige Dressers & Chests',
 'Storage & Organization / Wall Shelving & Organization / Wall & Display Shelves',
 'Furniture / Game Tables & Game Room Furniture / Dartboards & Cabinets',
 'Outdoor / Outdoor Décor / Outdoor Pillows & Cushions / Patio Furniture Cushions / Lounge Chair Patio Furniture Cushions',
 'Outdoor / Outdoor & Patio Furniture / Patio Furniture Sets / Patio Dining Sets / Two Person Patio Dining Sets',
 'Décor & Pillows / Decorative Pillows & Blankets / Throw Pillows / Ivory & Cream Throw Pillows',
 'Appliances / Washers & Dryers / Washer & Dryer Sets / Black Washer & Dryer Sets',
 'School Furniture and Supplies / School Furniture / School Chairs & Seating / Stackable Chairs',
 'Home Improvement / Hardware / Cabinet Hardware / Cabinet & Drawer Pulls / Brass Cabinet & Drawer Pulls',
 'School Furniture and Supplies / School Boards & Technology / AV, Mounts & Tech Accessories / Electronic Mounts & Stands / Computer Mounts',
 'Furniture / Living Room Furniture / Chairs & Seating / Accent Chairs / Papasan Accent Chairs',
 'Storage & Organization / Shoe Storage / All Shoe Storage / Rack Shoe Storage',
 'Storage & Organization / Shoe Storage / All Shoe Storage / Cabinet Shoe Storage',
 'Storage & Organization / Storage Containers & Drawers / Storage Drawers',
 'Appliances / Kitchen Appliances / Wine & Beverage Coolers / Water Coolers',
 'Furniture / Living Room Furniture / Chairs & Seating / Rocking Chairs',
 'Kitchen & Tabletop / Tableware & Drinkware / Serveware / Serving Bowls & Baskets / Serving Bowls / NA Serving Bowls',
 'Furniture / Living Room Furniture / TV Stands & Media Storage Furniture / Projection Screens / Inflatable Projection Screens',
 'Appliances / Kitchen Appliances / Large Appliance Parts & Accessories',
 'Storage & Organization / Bathroom Storage & Organization / Hampers & Laundry Baskets / Laundry Hampers & Laundry Baskets',
 'Furniture / Office Furniture / Office Stools',
 'Outdoor / Outdoor & Patio Furniture / Outdoor Seating & Patio Chairs / Patio Seating / Outdoor Club Chairs / Metal Outdoor Club Chairs',
 'School Furniture and Supplies / School Furniture / School Tables / Folding Tables',
 'Lighting / Wall Lights / Bathroom Vanity Lighting / Traditional Bathroom Vanity Lighting',
 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Bathroom Sinks & Faucet Components / Bathroom Sink Faucets / Centerset Bathroom Sink Faucets',
 'Décor & Pillows / Flowers & Plants / Faux Flowers / Orchid Faux Flowers',
 'Home Improvement / Flooring, Walls & Ceiling / Floor Tiles & Wall Tiles / Metal Floor Tiles & Wall Tiles',
 'Home Improvement / Kitchen Remodel & Kitchen Fixtures / Kitchen Sinks & Faucet Components / Kitchen Sinks',
 'Storage & Organization / Garage & Outdoor Storage & Organization / Outdoor Covers / Grill Covers / Charcoal Grill Grill Covers',
 'Outdoor / Outdoor Décor / Outdoor Wall Décor',
 'Storage & Organization / Cleaning & Laundry Organization / Laundry Room Organizers',
 'Reception Area / Reception Seating / Reception Sofas & Loveseats',
 'Kitchen & Tabletop / Cookware & Bakeware / Baking Sheets & Pans / Bread & Loaf Pans / Steel Bread & Loaf Pans',
 'Furniture / Living Room Furniture / Chairs & Seating / Accent Chairs / Wingback Accent Chairs',
 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Showers & Bathtubs / Showers & Bathtubs Plumbing / Shower Heads / Fixed Shower Heads',
 'Kitchen & Tabletop / Kitchen Utensils & Tools / Kitchen Gadgets / Pasta Makers & Accessories',
 'School Furniture and Supplies / School Furniture / School Chairs & Seating / Classroom Chairs / High School & College Classroom Chairs',
 'Furniture / Living Room Furniture / Sectionals / Stationary Sectionals',
 'Furniture / Kitchen & Dining Furniture / Sideboards & Buffets / Drawer Equipped Sideboards & Buffets',
 'Kitchen & Tabletop / Cookware & Bakeware / Baking Sheets & Pans / Bread & Loaf Pans',
 'Kitchen & Tabletop / Kitchen Utensils & Tools / Cooking Utensils / All Cooking Utensils / Kitchen Cooking Utensils',
 'Décor & Pillows / Flowers & Plants / Live Plants',
 'Furniture / Living Room Furniture / TV Stands & Media Storage Furniture / Projection Screens / Folding Frame Projection Screens',
 'Kitchen & Tabletop / Kitchen Organization / Food Storage & Canisters / Kitchen Canisters & Jars / Metal Kitchen Canisters & Jars',
 'Outdoor / Outdoor Décor / Outdoor Fountains',
 'Outdoor / Outdoor Shades / Pergolas / Wood Pergolas',
 'Décor & Pillows / Candles & Holders / Candle Holders / Sconce Candle Holders',
 'Kitchen & Tabletop / Tableware & Drinkware / Serveware / Cake & Tiered Stands',
 'Home Improvement / Kitchen Remodel & Kitchen Fixtures / Kitchen Sinks & Faucet Components / Kitchen Faucets / Chrome Kitchen Faucets',
 'Décor & Pillows / Decorative Pillows & Blankets / Throw Pillows / White Throw Pillows',
 'Outdoor / Outdoor Fencing & Flooring / Turf',
 'Décor & Pillows / Window Treatments / Valances & Kitchen Curtains',
 'Home Improvement / Hardware / Cabinet Hardware / Cabinet & Drawer Knobs / Black Cabinet & Drawer Knobs',
 'Home Improvement / Kitchen Remodel & Kitchen Fixtures / Kitchen Sinks & Faucet Components / Kitchen Faucets / Bronze Kitchen Faucets',
 'Appliances / Washers & Dryers / Washer & Dryer Sets',
 'Décor & Pillows / Clocks / Mantel & Tabletop Clocks',
 'Home Improvement / Doors & Door Hardware / Interior Doors',
 'Storage & Organization / Wall Shelving & Organization / Wall & Display Shelves / Floating Wall & Display Shelves',
 'Outdoor / Outdoor Recreation / Backyard Play / Climbing Toys & Slides',
 'Home Improvement / Building Equipment / Dollies / Hand Truck Dollies',
 'Baby & Kids / Toddler & Kids Bedroom Furniture / Baby & Kids Dressers',
 'Décor & Pillows / Mirrors / All Mirrors / Leaning & Floor Mirrors',
 'Kitchen & Tabletop / Tableware & Drinkware / Drinkware / Mugs & Teacups',
 'Décor & Pillows / Flowers & Plants / Wreaths',
 'Outdoor / Outdoor Shades / Pergolas / Metal Pergolas',
 'Bed & Bath / Bedding / Sheets & Pillowcases / Twin Sheets & Pillowcases',
 'Outdoor / Outdoor Shades / Pergolas',
 'Reception Area / Reception Seating / Office Sofas & Loveseats',
 'Décor & Pillows / Home Accessories / Indoor Fountains',
 'Kitchen & Tabletop / Kitchen Organization / Food Storage & Canisters / Kitchen Canisters & Jars / Ceramic Kitchen Canisters & Jars',
 'Décor & Pillows / Window Treatments / Curtain Hardware & Accessories / Bracket Curtain Hardware & Accessories',
 'Home Improvement / Flooring, Walls & Ceiling / Walls & Ceilings / Accent Tiles / Ceramic Accent Tiles',
 'Home Improvement / Flooring, Walls & Ceiling / Walls & Ceilings / Accent Tiles',
 'Furniture / Living Room Furniture / Chairs & Seating / Accent Chairs / Arm Accent Chairs',
 'Furniture / Living Room Furniture / Coffee Tables & End Tables / Coffee Tables / Free Form Coffee Tables',
 'Décor & Pillows / Flowers & Plants / Faux Flowers / Rose Faux Flowers',
 'Bed & Bath / Mattresses & Foundations / Innerspring Mattresses / Twin Innerspring Mattresses',
 'Outdoor / Outdoor Décor / Outdoor Pillows & Cushions / Patio Furniture Cushions / Dining Chair Patio Furniture Cushions',
 'Furniture / Living Room Furniture / TV Stands & Media Storage Furniture / TV Stands & Entertainment Centers / Traditional TV Stands & Entertainment Centers',
 'Furniture / Living Room Furniture / Coffee Tables & End Tables / Plant Stands & Tables / Square Plant Stands & Tables',
 'Storage & Organization / Wall Shelving & Organization / Wall & Display Shelves / Corner Wall & Display Shelves',
 "Rugs / Area Rugs / 3' x 5' Area Rugs",
 'Kitchen & Tabletop / Tableware & Drinkware / Drinkware / Mugs & Teacups / Coffee Mugs & Teacups',
 'Contractor / Entry & Hallway / Coat Racks & Umbrella Stands / Wall Mounted Coat Racks & Umbrella Stands',
 "Baby & Kids / Toddler & Kids Playroom / Indoor Play / Kids' Playhouses",
 'Furniture / Living Room Furniture / Coffee Tables & End Tables / Coffee Tables / Square Coffee Tables',
 'Baby & Kids / Toddler & Kids Playroom / Indoor Play / Dollhouses & Accessories',
 'Bed & Bath / Bedding / All Bedding / Queen Bedding',
 'No Classification Fits'
]

classifications_list = sorted(get_args(FullyQualifiedClassifications))

known_categories = set([c.split(" / ")[0].strip() for c in classifications_list])
known_sub_categories = set([c.split(" / ")[1].strip() for c in classifications_list if len(c.split(" / ")) > 1])

known_sub_categories


class Query(BaseModel):
    """
    Base model for search queries, containing common query attributes.
    """
    keywords: str = Field(
        ...,
        description="The original search query keywords sent in as input"
    )


class QueryClassification(Query):
    """
    Structured representation of a search query for furniture e-commerce.
    Inherits keywords from the base Query model and adds the full classification,
    from which category and sub-category are derived.
    """
    classification: FullyQualifiedClassifications = Field(
        description="A classification for the product. Use 'No Classification Fits' if there is no clear best classification."
    )

    @property
    def category(self):
        if self.classification == "No Classification Fits":
            return "No Category Fits"
        return self.classification.split(" / ")[0]

    @property
    def sub_category(self):
        if self.classification == "No Classification Fits":
            return "No SubCategory Fits"
        if len(self.classification.split(" / ")) < 2:
            return "No SubCategory Fits"
        return self.classification.split(" / ")[1]



### Query classification code

This code is quite similar to previous iterations, with one **change**: the model now returns the full classification. We then parse the category (top level) and sub category (second level) out of it.
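
To make the parsing concrete, here is a minimal standalone sketch of how a fully qualified classification string breaks into its top two levels. `split_classification` is a hypothetical helper written only for illustration; it mirrors the `category` and `sub_category` properties on `QueryClassification` and is not part of `cheat_at_search`.

```python
NO_FIT = "No Classification Fits"


def split_classification(classification: str) -> tuple[str, str]:
    """Return (category, sub_category) for a ' / '-delimited classification."""
    if classification == NO_FIT:
        return "No Category Fits", "No SubCategory Fits"
    levels = [level.strip() for level in classification.split(" / ")]
    category = levels[0]
    # Some classifications have only one level, so the second may be missing
    sub_category = levels[1] if len(levels) > 1 else "No SubCategory Fits"
    return category, sub_category


print(split_classification("Furniture / Living Room Furniture / Sofas"))
print(split_classification("Rugs / Area Rugs"))
print(split_classification(NO_FIT))
```

Anything below the second level is ignored here, which matches how the rest of this notebook only evaluates and boosts on category and sub category.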

In [None]:
enricher = AutoEnricher(
     model="openai/gpt-4o",
     system_prompt="You are a helpful furniture shopping agent that helps users construct search queries.",
     response_model=QueryClassification
)

def get_prompt_fully_qualified(query):
    prompt = f"""
        As a helpful agent, you'll receive requests from users looking for furniture products.

        Your task is to search with a structured query against a furniture product catalog.

        Here is the user's request:

        {query}

        Return the 'Classification': the best classification in the schema for this user's query.
        Return 'No Classification Fits' if no classification fits or if it's ambiguous
        """

    return prompt

def fully_classified(query):
    prompt = get_prompt_fully_qualified(query)
    return enricher.enrich(prompt)

fully_classified("dinosaur"), fully_classified("sofa loveseat")

(QueryClassification(keywords='dinosaur', classification='No Classification Fits'),
 QueryClassification(keywords='sofa loveseat', classification='Furniture / Living Room Furniture / Sofas'))

### Redefine ground truth

This is the same ground truth from previous notebooks.

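We map each query to its relevant products, then check what the dominant category is for that query. A category only counts as dominant if it covers more than the cutoff share (0.8 by default) of the relevant products; otherwise the query gets the "no fit" label.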
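
As a rough illustration of the dominant-category rule, the sketch below applies the same cutoff to toy data in plain Python. The `dominant_label` helper and the category lists are hypothetical and made up for this example; the notebook itself computes this with pandas over `labeled_query_products`.

```python
from collections import Counter


def dominant_label(labels, cutoff=0.8, no_fit_label="No Category Fits"):
    """Return the label covering more than `cutoff` of `labels`, else `no_fit_label`."""
    if not labels:
        return no_fit_label
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) > cutoff else no_fit_label


# Hypothetical categories of the grade-2 (relevant) products for two queries
relevant_categories = {
    "sofa loveseat": ["Furniture"] * 9 + ["Outdoor"],
    "chair pillow cushion": ["Outdoor"] * 4
    + ["Kitchen & Tabletop"] * 3
    + ["Décor & Pillows"] * 3,
}

ground_truth = {q: dominant_label(cats) for q, cats in relevant_categories.items()}
print(ground_truth)
```

In this toy data the first query has a 90% Furniture share and keeps its label, while the second is spread across three categories and falls back to the no-fit label.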

In [None]:
from cheat_at_search.wands_data import labeled_query_products, queries

def get_top_category(column, no_fit_label, cutoff=0.8):
    # Get relevant products per query
    top_products = labeled_query_products[labeled_query_products['grade'] == 2]

    # Aggregate top categories
    categories_per_query_ideal = top_products.groupby('query')[column].value_counts().reset_index()

    # Get as percentage of all categories for this query
    top_cat_proportion = categories_per_query_ideal.groupby(['query', column]).sum() / categories_per_query_ideal.groupby('query').sum()
    top_cat_proportion = top_cat_proportion.drop(columns=column).reset_index()

    # Only keep cases where one category's share of relevant products exceeds the cutoff
    top_cat_proportion = top_cat_proportion[top_cat_proportion['count'] > cutoff].copy()
    # Assign the result back instead of calling fillna(inplace=True) on a column selection
    top_cat_proportion[column] = top_cat_proportion[column].fillna(no_fit_label)
    # Give the no-fit label to all other queries without a dominant category
    ground_truth_cat = top_cat_proportion.merge(queries, how='right', on='query')[['query', column, 'count']]
    ground_truth_cat[column] = ground_truth_cat[column].fillna(no_fit_label)
    return ground_truth_cat

def get_pred(cat, column):
    if column == 'category':
        return cat.category
    elif column == 'sub_category':
        return cat.sub_category
    else:
        raise ValueError(f"Unknown column {column}")


def prec_cat(ground_truth, column, no_fit_label, categorized):
    hits = []
    misses = []
    # Shuffle so the running precision printed below is not tied to query order
    for _, row in ground_truth.sample(frac=1).iterrows():
        query = row['query']
        expected_category = row[column].strip()

        cat = categorized(query)
        pred = get_pred(cat, column)
        if pred == no_fit_label:
            # The classifier abstained: this lowers coverage, not precision
            print(f"Skipping {query}")
            continue
        if pred == expected_category:
            hits.append((expected_category, cat))
        else:
            print("***")
            print(f"{query} -- predicted:{pred} != expected:{expected_category}")
            misses.append((expected_category, cat))
            num_so_far = len(hits) + len(misses)
            print(f"prec (N={num_so_far}) -- {len(hits) / num_so_far}")
            print(f"coverage {num_so_far / len(ground_truth)}")

    # Count every classified query, not only those seen up to the last miss
    num_classified = len(hits) + len(misses)
    return len(hits) / num_classified, num_classified / len(ground_truth)

ground_truth_cat = get_top_category('category', 'No Category Fits')
ground_truth_sub_cat = get_top_category('sub_category', 'No SubCategory Fits')

prec, coverage = prec_cat(ground_truth_cat,
                          'category', 'No Category Fits', fully_classified)

prec, coverage


***
foutains with brick look -- predicted:Outdoor != expected:
prec (N=2) -- 0.5
coverage 0.004166666666666667
Skipping kelly clarkson light fixtures
***
chair pillow cushion -- predicted:Décor & Pillows != expected:No Category Fits
prec (N=6) -- 0.6666666666666666
coverage 0.0125
***
ottoman bed queen -- predicted:Furniture != expected:No Category Fits
prec (N=7) -- 0.5714285714285714
coverage 0.014583333333333334
Skipping refrigerator with ice an water in door
***
pennfield playhouse -- predicted:Baby & Kids != expected:
prec (N=12) -- 0.6666666666666666
coverage 0.025
Skipping bedroom accessories
Skipping outdoor clock
***
urban outfitters duvet -- predicted:Bed & Bath != expected:No Category Fits
prec (N=15) -- 0.6666666666666666
coverage 0.03125
***
moen multi function dual shower head -- predicted:Home Improvement != expected:No Category Fits
prec (N=16) -- 0.625
coverage 0.03333333333333333
Skipping camper
***
self enclosed planters -- predicted:Outdoor != expected:No Category F

(0.6685878962536023, 0.7125)

### On impacted queries

We know from the "perfect classifier" these queries would most benefit from this change. Let's see how we perform on these queries.
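
When reading the numbers in this section, keep in mind that precision is computed only over queries the classifier actually labeled, while coverage is the share of queries it labeled at all. The sketch below restates those two definitions on toy data. `precision_and_coverage` is a hypothetical helper and the expected labels are illustrative, not taken from the real ground truth.

```python
def precision_and_coverage(expected, predicted, no_fit_label="No Category Fits"):
    """expected / predicted: dicts mapping query -> label."""
    hits = misses = 0
    for query, truth in expected.items():
        pred = predicted[query]
        if pred == no_fit_label:
            continue  # abstentions lower coverage but not precision
        if pred == truth:
            hits += 1
        else:
            misses += 1
    classified = hits + misses
    precision = hits / classified if classified else 0.0
    coverage = classified / len(expected) if expected else 0.0
    return precision, coverage


# Toy labels for illustration only
expected = {
    "turquoise chair": "Furniture",
    "outdoor lounge chair": "Outdoor",
    "emma headboard": "Baby & Kids",
    "odum velvet": "Furniture",
}
predicted = {
    "turquoise chair": "Furniture",
    "outdoor lounge chair": "Outdoor",
    "emma headboard": "Furniture",
    "odum velvet": "No Category Fits",
}

precision, coverage = precision_and_coverage(expected, predicted)
print(precision, coverage)
```

A classifier that abstains more often can trade coverage for precision, which is why the two numbers are reported together.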


In [None]:

impacted_queries = [
 'drum picture',
 'bathroom freestanding cabinet',
 'outdoor lounge chair',
 'wood rack wide',
 'outdoor light fixtures',
 'bathroom vanity knobs',
 'door jewelry organizer',
 'beds that have leds',
 'non slip shower floor tile',
 'turquoise chair',
 'modern outdoor furniture',
 'podium with locking cabinet',
 'closet storage with zipper',
 'barstool patio sets',
 'ayesha curry kitchen',
 'led 60',
 'wisdom stone river 3-3/4',
 'liberty hardware francisco',
 'french molding',
 'glass doors for bath',
 'accent leather chair',
 'dark gray dresser',
 'wainscoting ideas',
 'floating bed',
 'dining table vinyl cloth',
 'entrance table',
 'storage dresser',
 'almost heaven sauna',
 'toddler couch fold out',
 'outdoor welcome rug',
 'wooden chair outdoor',
 'emma headboard',
 'outdoor privacy wall',
 'driftwood mirror',
 'white abstract',
 'bedroom accessories',
 'bathroom lighting',
 'light and navy blue decorative pillow',
 'gnome fairy garden',
 'medium size chandelier',
 'above toilet cabinet',
 'odum velvet',
 'ruckus chair',
 'modern farmhouse lighting semi flush mount',
 'teal chair',
 'bedroom wall decor floral, multicolored with some teal (prints)',
 'big basket for dirty cloths',
 'milk cow chair',
 'small wardrobe grey',
 'glow in the dark silent wall clock',
 'medium clips',
 'desk for kids tjat ate 10 year old',
 'industrial pipe dining  table',
 'itchington butterfly',
 'midcentury tv unit',
 'gas detector',
 'fleur de lis living candle wall sconce bronze',
 'zodiac pillow',
 'papasan chair frame only',
 'bed side table']


prec, coverage = prec_cat(ground_truth_cat[ground_truth_cat['query'].isin(impacted_queries)],
                          'category', 'No Category Fits', fully_classified)
prec, coverage

Skipping odum velvet
Skipping outdoor privacy wall
Skipping toddler couch fold out
Skipping liberty hardware francisco
Skipping gas detector
Skipping bedroom accessories
Skipping drum picture
Skipping closet storage with zipper
***
medium clips -- predicted:Clips/Clamps != expected:Clips
prec (N=12) -- 0.9166666666666666
coverage 0.2
Skipping dining table vinyl cloth
Skipping milk cow chair
Skipping white abstract
Skipping floating bed
Skipping led 60
Skipping ayesha curry kitchen
Skipping wisdom stone river 3-3/4
Skipping door jewelry organizer
Skipping ruckus chair
***
emma headboard -- predicted:Furniture != expected:Baby & Kids
prec (N=34) -- 0.9411764705882353
coverage 0.5666666666666667
Skipping zodiac pillow
Skipping itchington butterfly
Skipping wainscoting ideas


(0.95, 0.5666666666666667)

In [None]:
prec, coverage = prec_cat(ground_truth_cat[~ground_truth_cat['query'].isin(impacted_queries)],
                          'category', 'No Category Fits', fully_classified)
prec, coverage

Skipping mid century modern
***
cake plates with tops -- predicted:Kitchen & Tabletop != expected:No Category Fits
prec (N=4) -- 0.75
coverage 0.009523809523809525
Skipping wire basket with dividers
***
coffee container -- predicted:Kitchen & Tabletop != expected:No Category Fits
prec (N=6) -- 0.6666666666666666
coverage 0.014285714285714285
***
bistro table and chairs -- predicted:Furniture != expected:Outdoor
prec (N=8) -- 0.625
coverage 0.01904761904761905
***
outdoor lounge cushions -- predicted:Outdoor != expected:
prec (N=9) -- 0.5555555555555556
coverage 0.02142857142857143
***
semi flush foyer light -- predicted:Lighting != expected:No Category Fits
prec (N=11) -- 0.5454545454545454
coverage 0.02619047619047619
***
porcelain loaf pan -- predicted:Kitchen & Tabletop != expected:No Category Fits
prec (N=13) -- 0.5384615384615384
coverage 0.030952380952380953
Skipping bee
***
meade mirror -- predicted:Décor & Pillows != expected:No Category Fits
prec (N=16) -- 0.5625
coverage 0.03

(0.6319218241042345, 0.7309523809523809)

## Run Category search strategy with classifier

Here we use the same strategy as before to boost category / sub category matches.

Notice in the `search` method we go through the following flow:

1. Classify query -> classification, using the passed function `query_to_cat`
2. Compute the baseline BM25 score over product name and description
3. Boost category matches by `category_boost`
4. Boost subcategory matches by `sub_category_boost`

**Idea to try**: REQUIRE a BM25 match before applying the category boost. As written, the boost is simply added to the BM25 score, so it also lifts products in the predicted category whose BM25 score is 0.
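
One way to picture the "idea to try" is the sketch below, which compares the purely additive boost with a gated boost that only applies to documents that already have a nonzero BM25 score. It uses plain Python lists in place of the NumPy arrays in `CategorySearch`; `boost_scores` and its `require_text_match` flag are hypothetical names for illustration, and the scores are made up.

```python
def boost_scores(bm25_scores, category_match, boost, require_text_match=False):
    """Add `boost` to documents in the predicted category.

    With require_text_match=True, only documents that already have a
    nonzero BM25 score are eligible for the boost.
    """
    boosted = []
    for score, in_category in zip(bm25_scores, category_match):
        eligible = in_category and (score > 0 or not require_text_match)
        boosted.append(score + boost if eligible else score)
    return boosted


bm25 = [12.5, 0.0, 3.2, 0.0]          # toy BM25 scores for four documents
in_cat = [True, True, False, False]   # which documents are in the predicted category

additive = boost_scores(bm25, in_cat, boost=10)
gated = boost_scores(bm25, in_cat, boost=10, require_text_match=True)
print(additive)
print(gated)
```

In the additive version the second document reaches the results on the category boost alone; in the gated version it stays at 0. With NumPy arrays, the same gate could probably be expressed by combining the category mask with `bm25_scores > 0` before adding the boost.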

In [None]:
from searcharray import SearchArray
from cheat_at_search.tokenizers import snowball_tokenizer
from cheat_at_search.strategy.strategy import SearchStrategy
import numpy as np


class CategorySearch(SearchStrategy):
    def __init__(self, products, query_to_cat,
                 name_boost=9.3,
                 description_boost=4.1,
                 category_boost=10,
                 sub_category_boost=5):
        super().__init__(products)
        self.index = products
        self.index['product_name_snowball'] = SearchArray.index(
            products['product_name'], snowball_tokenizer)
        self.index['product_description_snowball'] = SearchArray.index(
            products['product_description'], snowball_tokenizer)

        cat_split = products['category hierarchy'].fillna('').str.split("/")

        products['category'] = cat_split.apply(
            lambda x: x[0].strip() if len(x) > 0 else ""
        )
        products['subcategory'] = cat_split.apply(
            lambda x: x[1].strip() if len(x) > 1 else ""
        )
        self.index['category_snowball'] = SearchArray.index(
            products['category'], snowball_tokenizer
        )
        self.index['subcategory_snowball'] = SearchArray.index(
            products['subcategory'], snowball_tokenizer
        )

        self.query_to_cat = query_to_cat
        self.name_boost = name_boost
        self.description_boost = description_boost
        self.category_boost = category_boost
        self.sub_category_boost = sub_category_boost

    def search(self, query, k=10):
        """Dumb baseline lexical search, but add a constant boost when
           a product matches the predicted category or subcategory"""
        bm25_scores = np.zeros(len(self.index))
        structured = self.query_to_cat(query)
        tokenized = snowball_tokenizer(query)

        # ****
        # Baseline BM25 search from before
        for token in tokenized:
            bm25_scores += self.index['product_name_snowball'].array.score(token) * self.name_boost
            bm25_scores += self.index['product_description_snowball'].array.score(
                token) * self.description_boost


        # ****
        # If there's a subcategory, boost that by a constant amount
        if structured.sub_category and structured.sub_category != "No SubCategory Fits":
            tokenized_subcategory = snowball_tokenizer(structured.sub_category)
            # Boolean mask, so an empty tokenization boosts nothing
            subcategory_match = np.zeros(len(self.index), dtype=bool)
            if tokenized_subcategory:
                subcategory_match = self.index['subcategory_snowball'].array.score(tokenized_subcategory) > 0
            bm25_scores[subcategory_match] += self.sub_category_boost

        # ****
        # If there's a category, boost that by a constant amount
        if structured.category and structured.category != "No Category Fits":
            tokenized_category = snowball_tokenizer(structured.category)
            # Boolean mask, so an empty tokenization boosts nothing
            category_match = np.zeros(len(self.index), dtype=bool)
            if tokenized_category:
                category_match = self.index['category_snowball'].array.score(tokenized_category) > 0
            bm25_scores[category_match] += self.category_boost

        top_k = np.argsort(-bm25_scores)[:k]
        scores = bm25_scores[top_k]

        return top_k, scores


In [None]:
categorized_search = CategorySearch(products, fully_classified)
graded_fully_classified = run_strategy(categorized_search)
graded_fully_classified

Searching: 100%|██████████| 480/480 [00:12<00:00, 38.84it/s]


Unnamed: 0,product_id,product_name,product_class,category hierarchy,product_description,product_features,rating_count,average_rating,review_count,features,...,query_id,rank,query_class,id,label,grade,discounted_gain,idcg,dcg,ndcg
0,7465,hair salon chair,Massage Chairs|Recliners,Furniture / Living Room Furniture / Chairs & S...,offers a wide selection of professional salon ...,fauxleathertype : pu|legheight-toptobottom:18|...,69.0,4.5,53.0,"[fauxleathertype : pu, legheight-toptobottom:1...",...,0,1,Massage Chairs,80.0,Exact,2.0,3.00,8.786905,8.10119,0.921962
1,25431,barberpub salon massage chair,Massage Chairs,Furniture / Living Room Furniture / Chairs & S...,salon chairs are a wonderful avenue for hairst...,supplierintendedandapproveduse : non residenti...,4.0,5.0,4.0,[supplierintendedandapproveduse : non resident...,...,0,2,Massage Chairs,29.0,Exact,2.0,1.50,8.786905,8.10119,0.921962
2,7468,mercer41 hair salon chair hydraulic styling ch...,Massage Chairs,Furniture / Living Room Furniture / Chairs & S...,mercer41 beauty offers a wide selection profes...,seatfillmaterial : foam|waterrepellant : no re...,1.0,5.0,1.0,"[seatfillmaterial : foam, waterrepellant : no ...",...,0,3,Massage Chairs,104.0,Exact,2.0,1.00,8.786905,8.10119,0.921962
3,39461,professional salon reclining massage chair,Massage Chairs,Furniture / Living Room Furniture / Chairs & S...,new and in a good condition . first-rate metal...,overalldepth-fronttoback:39.4|warrantylength:1...,,,,"[overalldepth-fronttoback:39.4, warrantylength...",...,0,4,Massage Chairs,114.0,Exact,2.0,0.75,8.786905,8.10119,0.921962
4,9234,beauty salon task chair,,Furniture / Office Furniture / Office Chairs,"applicable scene : office , home life , beauty...",overallheight-toptobottom:37|backcolor : brown...,,,,"[overallheight-toptobottom:37, backcolor : bro...",...,0,5,Massage Chairs,32.0,Partial,1.0,0.20,8.786905,8.10119,0.921962
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4795,22194,wine glass rack,Kitchen Sink Storage,Kitchen & Tabletop / Kitchen Organization / Co...,drip-dry up to eight wineglasses with this cle...,glasscapacity:8|countryoforigin : united state...,5.0,4.5,3.0,"[glasscapacity:8, countryoforigin : united sta...",...,487,6,,,,0.0,0.00,8.786905,0.00000,0.000000
4796,40243,madisen hanging wine glass rack,Wine Racks,Kitchen & Tabletop / Tableware & Drinkware / B...,complement your farmhouse kitchen decor with t...,producttype : wine glass rack|overallwidth-sid...,29.0,5.0,20.0,"[producttype : wine glass rack, overallwidth-s...",...,487,7,,,,0.0,0.00,8.786905,0.00000,0.000000
4797,40244,kena hanging wine glass rack,Wine Racks,Kitchen & Tabletop / Tableware & Drinkware / B...,spruce up your farmhouse kitchen decor with th...,warrantylength:1 year|producttype : wine glass...,23.0,5.0,18.0,"[warrantylength:1 year, producttype : wine gla...",...,487,8,,,,0.0,0.00,8.786905,0.00000,0.000000
4798,39976,wall mounted wine glass rack,Wine Racks,Kitchen & Tabletop / Tableware & Drinkware / B...,"the latest addition to this collection , this ...",overallheight-toptobottom:4|design : wall moun...,34.0,4.5,18.0,"[overallheight-toptobottom:4, design : wall mo...",...,487,9,,,,0.0,0.00,8.786905,0.00000,0.000000


### Analyze the results

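We notice
* good NDCG change: mean NDCG rises from about 0.541 (BM25 baseline) to about 0.561
* limited downside impact to other queries: 36 queries improve by more than 0.1 NDCG, while only 4 drop by more than 0.1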
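
For readers who want to sanity-check the NDCG columns, the `discounted_gain`, `dcg` and `idcg` values in the graded results above appear consistent with a gain of `(2**grade - 1) / rank` and an ideal ranking of ten grade-2 results. The sketch below is written under that assumption, inferred from the printed table rather than from the `cheat_at_search` source, so treat it as an approximation of what `run_strategy` reports.

```python
def dcg(grades, k=10):
    """Sum of (2**grade - 1) / rank over the top-k results (ranks start at 1)."""
    return sum((2 ** g - 1) / rank for rank, g in enumerate(grades[:k], start=1))


def ndcg(grades, k=10, max_grade=2):
    # Assumption: the ideal ranking is k results at the maximum grade
    ideal = dcg([max_grade] * k, k)
    return dcg(grades, k) / ideal


# Grades of the first five rows shown for query_id 0 in the table above
top5 = [2, 2, 2, 2, 1]
print(round(dcg(top5), 2))       # 6.45 = 3.00 + 1.50 + 1.00 + 0.75 + 0.20
print(round(dcg([2] * 10), 6))   # 8.786905, matching the idcg column
```

Under this discount a relevant result at rank 1 is worth ten times the same result at rank 10, so category boosts that reorder the top few positions can move NDCG noticeably.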

In [None]:
ndcgs(graded_bm25).mean(), ndcgs(graded_fully_classified).mean()

(np.float64(0.5411098691836396), np.float64(0.5613850878381429))

In [None]:
deltas = ndcg_delta(graded_fully_classified, graded_bm25)

In [None]:
sig_improved = len(deltas[deltas > 0.1])
print(f"Num Significantly Improved: {sig_improved}")
deltas[deltas > 0.1]

Num Significantly Improved: 36


query,ndcg
bathroom freestanding cabinet,0.692589
bathroom vanity knobs,0.486519
non slip shower floor tile,0.477081
outdoor lounge chair,0.456442
modern outdoor furniture,0.365533
twin bed frame,0.359391
desk for kids,0.344262
wood rack wide,0.33532
outdoor light fixtures,0.320553
turquoise chair,0.310979


In [None]:
sig_harmed = len(deltas[deltas < -0.1])
print(f"Num Significantly Harmed: {sig_harmed}")
print(f"Prop improved/harmed: {sig_improved / (sig_harmed + sig_improved)} | {sig_harmed / (sig_harmed + sig_improved)}")
deltas[deltas < -0.1]

Num Significantly Harmed: 4
Prop improved/harmed: 0.9 | 0.1


query,ndcg
papasan chair frame only,-0.12329
tall storage cabinet,-0.149573
sugar canister,-0.224721
chair pillow cushion,-0.487874


### Look at a query

In [None]:
QUERY = "chair pillow cushion"
fully_classified(QUERY)

QueryClassification(keywords='chair pillow cushion', classification='Décor & Pillows / Decorative Pillows & Blankets / Throw Pillows')

In [None]:
ground_truth_cat[ground_truth_cat['query'] == QUERY]

Unnamed: 0,query,category,count
466,chair pillow cushion,No Category Fits,


In [None]:
ground_truth_sub_cat[ground_truth_sub_cat['query'] == QUERY]

Unnamed: 0,query,sub_category,count
466,chair pillow cushion,No SubCategory Fits,


In [None]:
graded_fully_classified[graded_fully_classified['query'] == QUERY][['product_name', 'category hierarchy', 'grade', 'score']]

Unnamed: 0,product_name,category hierarchy,grade,score
4660,replacement pillows outdoor lounge chair cushion,,2.0,67.630212
4661,indoor/outdoor dining chair cushion and pillow...,Outdoor / Outdoor Décor / Outdoor Pillows & Cu...,2.0,59.443655
4662,peacock throw pillow,Décor & Pillows / Decorative Pillows & Blanket...,0.0,53.270518
4663,two ocztopus throw pillow,Décor & Pillows / Decorative Pillows & Blanket...,0.0,51.742871
4664,cancer zodiac throw pillow,Décor & Pillows / Decorative Pillows & Blanket...,0.0,51.742871
4665,velvet ikat 3 '' lumbar pillow,Décor & Pillows / Decorative Pillows & Blanket...,0.0,51.559994
4666,finesse ii throw pillow,Décor & Pillows / Decorative Pillows & Blanket...,0.0,51.503314
4667,navy throw pillow,Décor & Pillows / Decorative Pillows & Blanket...,0.0,50.613693
4668,marble petroleum ii throw pillow,Décor & Pillows / Decorative Pillows & Blanket...,0.0,50.306183
4669,marble throw pillow,Décor & Pillows / Decorative Pillows & Blanket...,0.0,50.20072


In [None]:
graded_bm25[graded_bm25['query'] == QUERY][['product_name', 'category hierarchy', 'grade', 'score']]

Unnamed: 0,product_name,category hierarchy,grade,score
4660,replacement pillows outdoor lounge chair cushion,,2.0,67.630212
4661,indoor/outdoor dining chair cushion and pillow...,Outdoor / Outdoor Décor / Outdoor Pillows & Cu...,2.0,59.443655
4662,abbottsmoor dining chair cushion,,2.0,49.44753
4663,zipparoll indoor chair cushion,Kitchen & Tabletop / Tableware & Drinkware / T...,2.0,49.140517
4664,indoor chair cushion,,2.0,48.975367
4665,chair pad cushion,,2.0,47.23853
4666,chair indoor seat cushion,,2.0,45.639469
4667,chair outdoor seat cushion,,2.0,45.57626
4668,dining chair cushion,Kitchen & Tabletop / Tableware & Drinkware / T...,2.0,45.068099
4669,tropical outdoor lounge chair cushion,,2.0,45.01303


In [None]:
products[products['product_name'] == 'gem paper clips , plastic , medium size , 500/box']

Unnamed: 0,product_id,product_name,product_class,category hierarchy,product_description,product_features,rating_count,average_rating,review_count,features,category,sub_category,cat_subcat,product_name_snowball,product_description_snowball,subcategory,category_snowball,subcategory_snowball
36813,36813,"gem paper clips , plastic , medium size , 500/box",Clips/Clamps,Clips/Clamps,bright assorted colored plastic medium clips h...,size : medium|packsize:200 or more|producttype...,,,,"[size : medium, packsize:200 or more, productt...",Clips,Clamps,Clips / Clamps,"Terms({'box', 'medium', '500', 'paper', 'plast...","Terms({'medium', 'magnet', 'prevent', 'non', '...",Clamps,Terms({'clip'}),Terms({'clamp'})
