# Tabular semantic search on top of Amazon products using Superlinked

In this notebook, we explore how to build a tabular semantic search solution with Superlinked, driven by natural language queries.

## Imports

In [1]:
%load_ext autoreload
%autoreload 2

import pandas as pd
from superlinked import framework as sl

from superlinked_app import index, query
from superlinked_app.config import settings

settings.validate_processed_dataset_exists()

2025-01-09 17:39:07.183 | INFO     | superlinked_app.config:<module>:9 - Loading '.env' file from: /Users/pauliusztin/Documents/01_projects/hands-on-retrieval/.env


## Define the Superlinked app

To explore how Superlinked multi-attribute indexes and queries work, we will use an `InMemory` vector database and executor.

MongoDB will be used instead when shipping the Superlinked app as a RESTful API.

In [2]:
source: sl.InMemorySource = sl.InMemorySource(
    index.product,
    parser=sl.DataFrameParser(schema=index.product, mapping={index.product.id: "asin"}),
)
executor = sl.InMemoryExecutor(sources=[source], indices=[index.product_index])
app = executor.run()

## Load the processed dataset

In [3]:
df = pd.read_json(settings.PROCESSED_DATASET_PATH, lines=True)
df.head()

Unnamed: 0,asin,type,category,title,description,price,review_rating,review_count
0,B07WP4RXHY,product,[Tools & Home Improvement],YUEPIN U-Tube Clamp 304 Stainless Steel Hose P...,Product Description Specification: Material: 3...,9.99,4.7,54
1,B07VRZTK2N,product,[],"Apron for Women, Waterproof Adjustable Bib Coo...",,11.99,4.0,152
2,B07V2F5SN1,product,"[Arts, Crafts & Sewing]",DIY 5D Diamond Painting by Number Kit for Adul...,Product Description 5D DIY Diamond Painting is...,9.99,4.6,378
3,B00MNLQQ7K,product,"[Patio, Lawn & Garden]","Design Toscano QM2787100 Darby, the Forest Faw...",,40.72,4.7,274
4,B089YD2KK5,product,"[Clothing, Shoes & Jewelry]",Crocs Jibbitz 5-Pack Alien Shoe Charms | Jibbi...,From the brand Previous page Shop Crocs Collec...,9.99,4.7,0


In [4]:
len(df)

850

In [5]:
source.put([df])

pd.set_option("display.max_colwidth", 500)

## Query books using filters & natural queries

In [6]:
results = app.query(
    query.filter_query,
    natural_query="books with a price lower than 100",
    limit=3,
)
results.knn_params

2025-01-09 17:47:41.440 | INFO     | superlinked_app.config:<module>:9 - Loading '.env' file from: /Users/pauliusztin/Documents/01_projects/hands-on-retrieval/.env


{'title_weight': 0.0,
 'description_weight': 1.0,
 'review_rating_maximizer_weight': 0.0,
 'price_minimizer_weights': 0.0,
 'limit': 3,
 'natural_query': 'books with a price lower than 100',
 'filter_by_type': 'book',
 'query_description': 'books',
 'filter_by_cateogry': None,
 'review_rating_bigger_than': None,
 'price_smaller_than': 100.0,
 'radius_param': None,
 'description_similar_clause_weight': 1.0}

In [7]:
results.to_pandas()

Unnamed: 0,type,category,title,description,review_rating,review_count,price,id,similarity_score,rank
0,book,[Books],100 Days to Brave: Devotions for Unlocking Your Most Courageous Self,,4.7,0,9.01,031008962X,0.532175,0
1,book,[Books],"Stables: Beautiful Paddocks, Horse Barns, and Tack Rooms",,4.7,100,53.1,0847833143,0.532175,1
2,book,[Books],"Spectrum Algebra 1 Workbook, Grades 6-8 Math Covering Algebra Equations, Fractions, Inequalities, Graphing, Rational Numbers, Classroom or Homeschool Curriculum",,4.6,0,7.86,1483816648,0.532175,2


In [8]:
results = app.query(
    query.filter_query,
    natural_query="books with a price lower than 100 and a rating bigger than 4",
    limit=3,
)
results.knn_params

{'title_weight': 0.0,
 'description_weight': 1.0,
 'review_rating_maximizer_weight': 0.0,
 'price_minimizer_weights': 0.0,
 'limit': 3,
 'natural_query': 'books with a price lower than 100 and a rating bigger than 4',
 'filter_by_type': 'book',
 'query_description': 'books',
 'filter_by_cateogry': None,
 'review_rating_bigger_than': 4.0,
 'price_smaller_than': 100.0,
 'radius_param': None,
 'description_similar_clause_weight': 1.0}

In [9]:
results.to_pandas()

Unnamed: 0,type,category,title,description,review_rating,review_count,price,id,similarity_score,rank
0,book,[Books],100 Days to Brave: Devotions for Unlocking Your Most Courageous Self,,4.7,0,9.01,031008962X,0.532175,0
1,book,[Books],"Stables: Beautiful Paddocks, Horse Barns, and Tack Rooms",,4.7,100,53.1,0847833143,0.532175,1
2,book,[Books],"Spectrum Algebra 1 Workbook, Grades 6-8 Math Covering Algebra Equations, Fractions, Inequalities, Graphing, Rational Numbers, Classroom or Homeschool Curriculum",,4.6,0,7.86,1483816648,0.532175,2
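
The `knn_params` above suggest that `filter_query` turns the numeric conditions into hard filters (`filter_by_type`, `price_smaller_than`, `review_rating_bigger_than`), which would explain why both queries return the same three books. As a rough mental model only, the sketch below applies the same kind of conditions in plain Python to a few rows copied from the outputs above. The `hard_filter` helper is hypothetical and is not part of Superlinked or `superlinked_app`.

```python
# Rows copied from the notebook outputs above (id, type, price, review_rating).
rows = [
    {"id": "B07WP4RXHY", "type": "product", "price": 9.99, "review_rating": 4.7},
    {"id": "031008962X", "type": "book", "price": 9.01, "review_rating": 4.7},
    {"id": "0847833143", "type": "book", "price": 53.10, "review_rating": 4.7},
    {"id": "1483816648", "type": "book", "price": 7.86, "review_rating": 4.6},
]


def hard_filter(rows, filter_by_type=None, price_smaller_than=None,
                review_rating_bigger_than=None):
    """Hypothetical helper: keep rows that pass every condition that is not None."""
    kept = []
    for row in rows:
        if filter_by_type is not None and row["type"] != filter_by_type:
            continue
        if price_smaller_than is not None and not row["price"] < price_smaller_than:
            continue
        if (review_rating_bigger_than is not None
                and not row["review_rating"] > review_rating_bigger_than):
            continue
        kept.append(row)
    return kept


# Same parameter values as the second knn_params dictionary above.
matches = hard_filter(
    rows,
    filter_by_type="book",
    price_smaller_than=100.0,
    review_rating_bigger_than=4.0,
)
print([row["id"] for row in matches])
```

A hard filter only decides which rows are eligible. It does not influence their order, which is the main difference from the weighted semantic queries in the next section.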


📚 More on how [Superlinked natural queries (NLQ) work](https://rebrand.ly/superlinked-nlq-notebook).

## Query books using tabular semantic search & natural queries

In [10]:
results = app.query(
    query.semantic_query,
    natural_query="books with a price lower than 100",
    limit=3,
)
results.knn_params

{'title_weight': 0.0,
 'description_weight': 0.0,
 'review_rating_maximizer_weight': 0.0,
 'price_minimizer_weights': 1.0,
 'limit': 3,
 'natural_query': 'books with a price lower than 100',
 'filter_by_type': 'book',
 'query_description': 'books',
 'query_title': 'books',
 'filter_by_cateogry': None,
 'radius_param': None,
 'description_similar_clause_weight': 1.0,
 'title_similar_clause_weight': 1.0}

In [11]:
results.to_pandas()

Unnamed: 0,type,category,title,description,review_rating,review_count,price,id,similarity_score,rank
0,book,"[Books, Children's Books]",Journey to Star Wars: The Rise of Skywalker A Finn & Poe Adventure (A Choose Your Destiny Chapter Book),,4.7,174,5.99,1368043380,0.999956,0
1,book,[Books],"Spectrum Algebra 1 Workbook, Grades 6-8 Math Covering Algebra Equations, Fractions, Inequalities, Graphing, Rational Numbers, Classroom or Homeschool Curriculum",,4.6,0,7.86,1483816648,0.999924,1
2,book,[Books],100 Days to Brave: Devotions for Unlocking Your Most Courageous Self,,4.7,0,9.01,031008962X,0.9999,2


In [12]:
results = app.query(
    query.semantic_query,
    natural_query="books with a price lower than 100 and a rating bigger than 4",
    limit=3,
)
results.knn_params

{'title_weight': 0.5,
 'description_weight': 0.5,
 'review_rating_maximizer_weight': 1.0,
 'price_minimizer_weights': 1.0,
 'limit': 3,
 'natural_query': 'books with a price lower than 100 and a rating bigger than 4',
 'filter_by_type': 'book',
 'query_description': 'books with a price lower than 100 and a rating bigger than 4',
 'query_title': 'books with a price lower than 100 and a rating bigger than 4',
 'filter_by_cateogry': None,
 'radius_param': None,
 'description_similar_clause_weight': 0.5,
 'title_similar_clause_weight': 0.5}

In [13]:
results.to_pandas()

Unnamed: 0,type,category,title,description,review_rating,review_count,price,id,similarity_score,rank
0,book,[Books],The Norton Introduction to Literature,,4.5,662,76.5,039393893X,0.795153,0
1,book,[Books],The Serengeti Rules: The Quest to Discover How Life Works and Why It Matters - With a new Q&A with the author,,4.6,333,16.95,0691175683,0.791076,1
2,book,[Books],All Aboard! New York: A City Primer,,4.6,74,9.99,1423640748,0.790805,2
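
Unlike the filter query, `semantic_query` expresses "lower price" and "higher rating" as weights (`price_minimizer_weights`, `review_rating_maximizer_weight`) next to the text weights. One simplified way to picture this is a weighted sum of per-attribute scores. The sketch below is an illustration under stated assumptions, not Superlinked's actual scoring: the `combined_score` function, the normalization constants, and the 2-d text vectors are all made up, while prices and ratings are taken from the table above. The toy ranking is not expected to reproduce the notebook's ranking.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def combined_score(item, query_vec, weights, max_price=100.0, max_rating=5.0):
    """Hypothetical weighted sum of text, rating and price scores (each in [0, 1])."""
    text_part = cosine(item["text_vec"], query_vec)
    rating_part = item["review_rating"] / max_rating            # higher is better
    price_part = 1.0 - min(item["price"], max_price) / max_price  # lower is better
    return (weights["text"] * text_part
            + weights["rating"] * rating_part
            + weights["price"] * price_part)


# Prices and ratings come from the results above; text_vec values are made up.
items = [
    {"id": "039393893X", "price": 76.50, "review_rating": 4.5, "text_vec": [0.9, 0.1]},
    {"id": "0691175683", "price": 16.95, "review_rating": 4.6, "text_vec": [0.6, 0.8]},
    {"id": "1423640748", "price": 9.99, "review_rating": 4.6, "text_vec": [0.8, 0.6]},
]
query_vec = [1.0, 0.0]  # made-up embedding of the query text


def rank(weights):
    """Return item ids ordered from best to worst combined score."""
    ordered = sorted(items, key=lambda item: combined_score(item, query_vec, weights),
                     reverse=True)
    return [item["id"] for item in ordered]


print(rank({"text": 0.5, "rating": 1.0, "price": 1.0}))  # all three signals
print(rank({"text": 1.0, "rating": 0.0, "price": 0.0}))  # text similarity only
```

Changing the weights reorders the same candidates without excluding any of them, which is why a semantic query can return an item that a strict filter would have dropped.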


In [14]:
results = app.query(
    query.semantic_query,
    natural_query="Return the top 5 books (along with their review count and price) with the highest reviews rating.",
    limit=3,
)
results.knn_params

{'title_weight': 0.0,
 'description_weight': 0.0,
 'review_rating_maximizer_weight': 1.0,
 'price_minimizer_weights': 0.5,
 'limit': 3,
 'natural_query': 'Return the top 5 books (along with their review count and price) with the highest reviews rating.',
 'filter_by_type': 'book',
 'query_description': 'highest reviews rating',
 'query_title': 'books',
 'filter_by_cateogry': None,
 'radius_param': None,
 'description_similar_clause_weight': 1.0,
 'title_similar_clause_weight': 1.0}

In [15]:
results.to_pandas()

Unnamed: 0,type,category,title,description,review_rating,review_count,price,id,similarity_score,rank
0,book,[Books],Choose: An Invitation to the Best Day Ever Adventure,,5.0,62,15.0,1547110600,0.948596,0
1,book,[Books],33 Days to Morning Glory: A Do-It-Yourself Retreat In Preparation for Marian Consecration,,4.9,0,13.49,1596142448,0.948396,1
2,book,[Books],Mom Set Free - Bible Study Book: Good News for Moms Who are Tired of Trying to be Good Enough,,4.8,781,15.99,1430039612,0.947717,2


In [21]:
results = app.query(
    query.semantic_query,
    natural_query="psychology and mindfulness with a rating bigger than 4",
    limit=3,
)
results.knn_params

{'title_weight': 1.0,
 'description_weight': 1.0,
 'review_rating_maximizer_weight': 1.0,
 'price_minimizer_weights': 0.0,
 'limit': 3,
 'natural_query': 'psychology and mindfulness with a rating bigger than 4',
 'filter_by_type': None,
 'query_description': 'psychology and mindfulness',
 'query_title': 'psychology and mindfulness',
 'filter_by_cateogry': None,
 'radius_param': None,
 'description_similar_clause_weight': 1.0,
 'title_similar_clause_weight': 1.0}

In [22]:
results.to_pandas()

Unnamed: 0,type,category,title,description,review_rating,review_count,price,id,similarity_score,rank
0,book,[Books],"The Mindful Dragon: A Dragon Book about Mindfulness. Teach Your Dragon To Be Mindful. A Cute Children Story to Teach Kids about Mindfulness, Focus and Peace. (My Dragon Books)",,4.7,623,11.69,1948040107,0.693309,0
1,product,[],"KOOSHOO Headband - Doubles as a Dust Mask - Durable, Eco-Friendly Accessory - Versatile Design to Wear Your Way - Perfect for Yoga, Travel, Sports & Everyday",,4.5,323,25.0,B017R8L9RU,0.676599,1
2,product,"[Clothing, Shoes & Jewelry]","SUITEDNOMAD Compression Packing Cubes Set,Ultralight Travel Organizer Bags",,4.7,0,39.95,B07WN8L87S,0.670774,2


📚 More on how [Superlinked natural queries (NLQ) work](https://rebrand.ly/superlinked-nlq-notebook).

## Find similar items based on a given product

In [23]:
df[df["asin"] == "B07WP4RXHY"]

Unnamed: 0,asin,type,category,title,description,price,review_rating,review_count
0,B07WP4RXHY,product,[Tools & Home Improvement],"YUEPIN U-Tube Clamp 304 Stainless Steel Hose Pipe Cable Strap Clips With Rubber Cushioned (1-21/32""(42mm)-10pcs)","Product Description Specification: Material: 304 Stainless Steel,100% New Rubber Color: Silver Shape: U Shape Quantity: 10 Pieces Note: Note: Since the size above is measured by hand, the size of the actual item you received could be slightly different from the size above. Product Description Specification: Material: 304 Stainless Steel,100% New Rubber Color: Silver Shape: U Shape Quantity: 10 Pieces Note: Note: Since the size above is measured by hand, the size of the actual item you receiv...",9.99,4.7,54


In [24]:
results = app.query(
    query.similar_items_query,
    natural_query="similar books to B07WP4RXHY with a price lower than 100 and a rating bigger than 4",
    limit=3,
)
results.knn_params

{'title_weight': 1.0,
 'description_weight': 1.0,
 'review_rating_maximizer_weight': 1.0,
 'price_minimizer_weights': 1.0,
 'limit': 3,
 'natural_query': 'similar books to B07WP4RXHY with a price lower than 100 and a rating bigger than 4',
 'filter_by_type': None,
 'query_description': 'similar books',
 'query_title': 'similar books',
 'filter_by_cateogry': None,
 'product_id': 'B07WP4RXHY',
 'radius_param': None,
 'description_similar_clause_weight': 1.0,
 'title_similar_clause_weight': 1.0,
 'with_vector_id_weight_param': 1.0}

In [25]:
results.to_pandas()

Unnamed: 0,type,category,title,description,review_rating,review_count,price,id,similarity_score,rank
0,product,[Tools & Home Improvement],"YUEPIN U-Tube Clamp 304 Stainless Steel Hose Pipe Cable Strap Clips With Rubber Cushioned (1-21/32""(42mm)-10pcs)","Product Description Specification: Material: 304 Stainless Steel,100% New Rubber Color: Silver Shape: U Shape Quantity: 10 Pieces Note: Note: Since the size above is measured by hand, the size of the actual item you received could be slightly different from the size above. Product Description Specification: Material: 304 Stainless Steel,100% New Rubber Color: Silver Shape: U Shape Quantity: 10 Pieces Note: Note: Since the size above is measured by hand, the size of the actual item you receiv...",4.7,54,9.99,B07WP4RXHY,0.916242,0
1,product,[Electronics],"Rankie RJ45 Cat6 Snagless Ethernet Patch Cable, 5-Pack, 3 Feet, Black",,4.7,0,12.99,B01J8MHYO0,0.851214,1
2,product,"[Clothing, Shoes & Jewelry]","SUITEDNOMAD Compression Packing Cubes Set,Ultralight Travel Organizer Bags",,4.7,0,39.95,B07WN8L87S,0.837194,2
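
In the output above, `product_id` is set to the seed item, `filter_by_type` is `None` even though the query mentions books, and the seed item itself comes back at rank 0. A minimal way to picture a "similar to this item" lookup is a nearest-neighbor search around the seed item's stored vector. The sketch below assumes made-up 3-d vectors and a hypothetical `most_similar` helper; it is not how `similar_items_query` is implemented.

```python
import math

# Made-up vectors keyed by product id, for illustration only.
vectors = {
    "B07WP4RXHY": [0.9, 0.1, 0.2],
    "B01J8MHYO0": [0.8, 0.2, 0.3],
    "B07WN8L87S": [0.5, 0.5, 0.4],
    "1423640748": [0.1, 0.9, 0.1],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(product_id, vectors, limit=3):
    """Hypothetical helper: rank all items by similarity to the seed item's vector."""
    seed = vectors[product_id]
    scored = [(other_id, cosine(seed, vec)) for other_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


neighbors = most_similar("B07WP4RXHY", vectors)
print([other_id for other_id, _ in neighbors])
```

Because the seed vector is compared against every stored vector, including its own, the seed ranks first. Dropping it from the results would be a small post-processing step.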


## Queries back-to-back with text-to-SQL

This section runs a series of queries that we will later compare against text-to-SQL.

In the next notebook, we will build a text-to-SQL module with LlamaIndex and run the same queries to compare the two approaches back-to-back.
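
To make the comparison concrete, here is a rough sketch of the kind of SQL a text-to-SQL module might produce for "books with a price lower than 100 and a rating bigger than 4". It uses Python's built-in `sqlite3` with an in-memory database. The `products` table, its schema, and the `ORDER BY` clause are assumptions made for illustration; the rows are copied from outputs earlier in this notebook.

```python
import sqlite3

# In-memory database with an assumed schema mirroring the dataset columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products ("
    "asin TEXT PRIMARY KEY, type TEXT, price REAL, review_rating REAL)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?)",
    [
        ("039393893X", "book", 76.50, 4.5),
        ("0691175683", "book", 16.95, 4.6),
        ("1423640748", "book", 9.99, 4.6),
        ("B07WP4RXHY", "product", 9.99, 4.7),
    ],
)

# Hand-written stand-in for what an LLM could generate from the natural query.
# The ORDER BY is an assumption: the natural query does not ask for an ordering.
sql = """
    SELECT asin, price, review_rating
    FROM products
    WHERE type = 'book' AND price < 100 AND review_rating > 4
    ORDER BY price ASC
    LIMIT 3
"""
sql_results = conn.execute(sql).fetchall()
for row in sql_results:
    print(row)
conn.close()
```

SQL applies the conditions exactly, but it has no notion of "close to" a topic or "roughly cheap", which is where the weighted semantic queries differ.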

### Example 1: Simple

Let's start with a simple query:

In [26]:
results = app.query(
    query.semantic_query,
    natural_query="books with a price lower than 100 and a rating bigger than 4",
    limit=3,
)
results.knn_params

{'title_weight': 0.5,
 'description_weight': 0.5,
 'review_rating_maximizer_weight': 1.0,
 'price_minimizer_weights': 1.0,
 'limit': 3,
 'natural_query': 'books with a price lower than 100 and a rating bigger than 4',
 'filter_by_type': 'book',
 'query_description': 'books with a price lower than 100 and a rating bigger than 4',
 'query_title': 'books with a price lower than 100 and a rating bigger than 4',
 'filter_by_cateogry': None,
 'radius_param': None,
 'description_similar_clause_weight': 0.5,
 'title_similar_clause_weight': 0.5}

In [27]:
results.to_pandas()[["type", "title", "price", "review_rating"]]

Unnamed: 0,type,title,price,review_rating
0,book,The Norton Introduction to Literature,76.5,4.5
1,book,The Serengeti Rules: The Quest to Discover How Life Works and Why It Matters - With a new Q&A with the author,16.95,4.6
2,book,All Aboard! New York: A City Primer,9.99,4.6


### Example 2: Specific categories

Now, let's make the query more complex:

In [28]:
results = app.query(
    query.semantic_query,
    natural_query="psychology and mindfulness with a rating bigger than 4",
    limit=3,
)
results.knn_params

{'title_weight': 1.0,
 'description_weight': 1.0,
 'review_rating_maximizer_weight': 1.0,
 'price_minimizer_weights': 0.0,
 'limit': 3,
 'natural_query': 'psychology and mindfulness with a rating bigger than 4',
 'filter_by_type': None,
 'query_description': 'psychology and mindfulness',
 'query_title': 'psychology and mindfulness',
 'filter_by_cateogry': None,
 'radius_param': None,
 'description_similar_clause_weight': 1.0,
 'title_similar_clause_weight': 1.0}

In [29]:
results.to_pandas()[["type", "title", "price", "review_rating"]]

Unnamed: 0,type,title,price,review_rating
0,book,"The Mindful Dragon: A Dragon Book about Mindfulness. Teach Your Dragon To Be Mindful. A Cute Children Story to Teach Kids about Mindfulness, Focus and Peace. (My Dragon Books)",11.69,4.7
1,product,"KOOSHOO Headband - Doubles as a Dust Mask - Durable, Eco-Friendly Accessory - Versatile Design to Wear Your Way - Perfect for Yoga, Travel, Sports & Everyday",25.0,4.5
2,product,"SUITEDNOMAD Compression Packing Cubes Set,Ultralight Travel Organizer Bags",39.95,4.7


In [30]:
results = app.query(
    query.semantic_query,
    natural_query="Return the top items (along with their price) with the highest reviews rating on science",
    limit=3
)
results.knn_params

{'title_weight': 0.5,
 'description_weight': 0.5,
 'review_rating_maximizer_weight': 1.0,
 'price_minimizer_weights': 0.0,
 'limit': 3,
 'natural_query': 'Return the top items (along with their price) with the highest reviews rating on science',
 'filter_by_type': None,
 'query_description': 'science',
 'query_title': 'science',
 'filter_by_cateogry': None,
 'radius_param': None,
 'description_similar_clause_weight': 1.0,
 'title_similar_clause_weight': 1.0}

In [31]:
results.to_pandas()[["type", "title", "price", "review_rating"]]

Unnamed: 0,type,title,price,review_rating
0,product,"Rankie RJ45 Cat6 Snagless Ethernet Patch Cable, 5-Pack, 3 Feet, Black",12.99,4.7
1,product,110 Pcs Outer Space Party Supplies - Solar System Party - Galaxy Theme To the Moon Party Blast Off Party Astronaut Universe Theme Party Astronaut Back To The Moon Balloons,9.99,4.7
2,product,"SUITEDNOMAD Compression Packing Cubes Set,Ultralight Travel Organizer Bags",39.95,4.7


### Example 3: Titles or keywords

Let's make it even more complex:

In [32]:
results = app.query(
    query.semantic_query,
    natural_query="Lord of the Rings",
    limit=3,
)
results.knn_params

{'title_weight': 1.0,
 'description_weight': 1.0,
 'review_rating_maximizer_weight': 0.0,
 'price_minimizer_weights': 0.0,
 'limit': 3,
 'natural_query': 'Lord of the Rings',
 'filter_by_type': None,
 'query_description': 'Lord of the Rings',
 'query_title': 'Lord of the Rings',
 'filter_by_cateogry': None,
 'radius_param': None,
 'description_similar_clause_weight': 1.0,
 'title_similar_clause_weight': 1.0}

In [33]:
results.to_pandas()[["type", "title", "price", "review_rating"]]

Unnamed: 0,type,title,price,review_rating
0,product,Funko POP! Movies: Lord of The Rings - Lurtz Collectible Figure,42.99,4.8
1,product,LEGO The Hobbit Battle of the Five Armies Witch-king Battle 79015,114.0,4.5
2,product,"Lullabb Friendship Couples Gifts Lava Rock Stone Bracelets for Mens Womens, 8MM Natural Howlite Turquoise Essential Oil Diffuser Beads Bangles for Girls Boys",7.36,4.2


### More examples

In [34]:
results = app.query(
    query.semantic_query,
    natural_query="Return the top books (along with their price and rating) with the highest reviews rating and lowest price.",
    limit=3
)
results.knn_params

{'title_weight': 0.0,
 'description_weight': 0.0,
 'review_rating_maximizer_weight': 1.0,
 'price_minimizer_weights': 1.0,
 'limit': 3,
 'natural_query': 'Return the top books (along with their price and rating) with the highest reviews rating and lowest price.',
 'filter_by_type': 'book',
 'query_description': 'highest reviews rating and lowest price',
 'query_title': 'books',
 'filter_by_cateogry': None,
 'radius_param': None,
 'description_similar_clause_weight': 1.0,
 'title_similar_clause_weight': 1.0}

In [35]:
results.to_pandas()[["type", "title", "price", "review_rating"]]

Unnamed: 0,type,title,price,review_rating
0,book,Choose: An Invitation to the Best Day Ever Adventure,15.0,5.0
1,book,33 Days to Morning Glory: A Do-It-Yourself Retreat In Preparation for Marian Consecration,13.49,4.9
2,book,Mom Set Free - Bible Study Book: Good News for Moms Who are Tired of Trying to be Good Enough,15.99,4.8


In [36]:
results = app.query(
    query.semantic_query,
    natural_query="Return the top products (along with their price and review) about cats or dogs with a great price and review",
    limit=3
)
results.knn_params

{'title_weight': 1.0,
 'description_weight': 1.0,
 'review_rating_maximizer_weight': 1.0,
 'price_minimizer_weights': 1.0,
 'limit': 3,
 'natural_query': 'Return the top products (along with their price and review) about cats or dogs with a great price and review',
 'filter_by_type': None,
 'query_description': 'cats or dogs',
 'query_title': 'cats or dogs',
 'filter_by_cateogry': None,
 'radius_param': None,
 'description_similar_clause_weight': 1.0,
 'title_similar_clause_weight': 1.0}

In [37]:
results.to_pandas()[["type", "title", "price", "review_rating"]]

Unnamed: 0,type,title,price,review_rating
0,product,Yorkie Super Soft Slippers - E&S Pets - Yorkie Gifts - Cozy House Slippers - Non Skid Bottom - One Size Fits Most - Sherpa slipper - Pet Lover Gifts For Men And Women,13.98,4.7
1,product,"Pidoko Kids Skylar Dollhouse with 20 Pcs Furniture, 5 Dolls and a Pet Dog",99.97,4.7
2,product,FurryValley Fursuit Paws Furry Partial Cosplay Fluffy Claw Gloves Costume Lion Bear Props for Kids Adults (Gray),108.99,4.1


In [38]:
results = app.query(
    query.semantic_query,
    natural_query="I'm looking for a computer or laptop with a price bigger than 100 and a review bigger than 4",
    limit=3
)
results.knn_params

{'title_weight': 1.0,
 'description_weight': 1.0,
 'review_rating_maximizer_weight': 1.0,
 'price_minimizer_weights': -1.0,
 'limit': 3,
 'natural_query': "I'm looking for a computer or laptop with a price bigger than 100 and a review bigger than 4",
 'filter_by_type': None,
 'query_description': 'computer or laptop',
 'query_title': 'computer or laptop',
 'filter_by_cateogry': None,
 'radius_param': None,
 'description_similar_clause_weight': 1.0,
 'title_similar_clause_weight': 1.0}

In [39]:
results.to_pandas()[["type", "title", "price", "review_rating"]]

Unnamed: 0,type,title,price,review_rating
0,product,ZEBRA GX430t Thermal Transfer Desktop Printer Print Width of 4 in USB Serial Parallel and Ethernet Connectivity GX43-102410-000,598.41,4.1
1,product,MSI Gaming GeForce RTX 3060 Ti LHR 8GB GDRR6 256-Bit HDMI/DP Nvlink Torx Fan 4 RGB Ampere Architecture OC Graphics Card (Gaming X 8G LHR),524.99,4.7
2,product,"Acer Aspire 1 A115-31-C2Y3, 15.6"" Full HD Display, Intel Celeron N4020, 4GB DDR4, 64GB eMMC, 802.11ac Wi-Fi 5, Up to 10-Hours of Battery Life, Microsoft 365 Personal, Windows 10 in S mode, Black",282.99,4.4
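
Note the `price_minimizer_weights` value of `-1.0` in the parameters above: the query asks for a price *bigger* than 100, and the weight on the price minimizer came back negative. A plausible reading is that a negative weight inverts the preference so that expensive items score higher. The toy sketch below shows that effect on the three prices from the table above. The `price_term` function and the `max_price` constant are assumptions, not Superlinked internals, and the labels are shortened titles.

```python
def price_term(price, weight, max_price=600.0):
    """Hypothetical 'lower is better' price score, scaled by a signed weight."""
    minimizer_score = 1.0 - min(price, max_price) / max_price
    return weight * minimizer_score


# Prices from the results above, keyed by shortened titles.
prices = {
    "ZEBRA GX430t": 598.41,
    "MSI RTX 3060 Ti": 524.99,
    "Acer Aspire 1": 282.99,
}


def rank_by_price_term(weight):
    """Order items from highest to lowest weighted price score."""
    return sorted(prices, key=lambda name: price_term(prices[name], weight),
                  reverse=True)


print(rank_by_price_term(1.0))   # positive weight: cheapest item first
print(rank_by_price_term(-1.0))  # negative weight: most expensive item first
```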


# Next Steps

Continue with the **3_tabular_semantic_search_text_to_sql.ipynb** notebook to learn how to build a tabular semantic search module with text-to-SQL techniques.