# Why Embedding Search is not enough

While embedding-based search has been around for decades, its newfound popularity has led many businesses to rely solely on embeddings. This can lead to suboptimal results, primarily because most teams are not fine-tuning their own embedding models.

Additionally, many vector databases lack sophisticated filtering capabilities that are standard in SQL databases. While basic vector similarity search is useful, real-world applications often require complex filtering operations such as:

- Time-based filtering (e.g., products added in the last 30 days)
- Range-based filtering (e.g., price between $50-$100)
- Multi-condition filtering (e.g., in-stock items from specific brands)
- Exact match requirements alongside similarity search

Traditional SQL databases like PostgreSQL with the pgvector extension offer these advanced filtering capabilities alongside vector search, providing more precise and relevant results. For a detailed exploration of these capabilities, you can refer to Timescale's tutorial on [combining vector search with time-based filtering](https://www.timescale.com/blog/refining-vector-search-queries-with-time-filters-in-pgvector-a-tutorial/).

# Creating Our Database

> Before continuing with this section, make sure that you've configured a Timescale instance and set a `DB_URL` environment variable inside your shell. If you don’t have a Timescale instance configured, you can create one [in our Getting Started docs](https://docs.timescale.com/getting-started/latest/services/?ref=timescale.ghost.io#create-your-timescale-account).


In this section, we'll initialise a database, embed the description of each product, and ingest the dataset into the `products` table using `psycopg`.

Once we've done so, we'll explore some of the limitations of embedding search and see how we can improve our results by using query understanding.

In [4]:
import psycopg
from psycopg.rows import dict_row
from pgvector.psycopg import register_vector
import os
from dotenv import load_dotenv


load_dotenv()


def get_conn():
    conn = psycopg.connect(os.getenv("DB_URL"), row_factory=dict_row)
    register_vector(conn)
    return conn


SQL_INIT_SCRIPT = """
DROP TABLE IF EXISTS products;

-- Enable required extensions
CREATE EXTENSION IF NOT EXISTS vectorscale CASCADE;


-- Create the products table with all necessary fields
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    description TEXT,
    brand TEXT,
    category TEXT NOT NULL,
    subcategory TEXT,
    quantity INTEGER DEFAULT 0,
    price DECIMAL(10,2) NOT NULL,
    embedding VECTOR(1536)
);

-- Create indexes for common query patterns
CREATE INDEX idx_products_category ON products(category);
CREATE INDEX idx_products_price ON products(price);
"""

conn = get_conn()
conn.execute(SQL_INIT_SCRIPT)
conn.commit()
conn.close()

## Our Dataset

We've uploaded a dataset of around 191 products to Hugging Face. Each product is labelled with a category and a subcategory, and the dataset can be found at `ivanleomk/timescale-ecommerce`.

We'll use this dataset to populate our database.

In [2]:
from datasets import load_dataset

dataset = load_dataset("ivanleomk/timescale-ecommerce")

In [3]:
dataset["train"][0]

{'id': 1,
 'title': 'Lace Detail Sleeveless Top',
 'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=768x1024>,
 'description': "Elevate your casual wardrobe with this elegant sleeveless top featuring intricate lace detailing at the neckline. Perfect for both day and night, it's crafted from a soft, breathable fabric for all-day comfort.",
 'brand': 'H&M',
 'category': 'Tops',
 'subcategory': 'Tank Tops',
 'price': 181.04}

We can see that in our dataset itself, we have a variety of different fields for each product. These fields will come in useful down the line when we're trying to filter our results based on user queries. 

## Ingesting the Dataset

Let’s now move on to ingest our data into our database so that we can run embedding search on it. 

To do so, we’ll embed the description of each product and ingest it into our database. We’re using a semaphore here to cap the number of concurrent requests and stay below our rate limits.

In [7]:
import openai
from tqdm.asyncio import tqdm_asyncio as asyncio
from asyncio import Semaphore


async def embed_item(client, sem, item):
    async with sem:
        embedding = await client.embeddings.create(
            input=item["description"], model="text-embedding-3-small"
        )
        return {**item, "embedding": embedding.data[0].embedding}


client = openai.AsyncClient()
# Share a single semaphore across all tasks so that at most 10 requests run at once
sem = Semaphore(10)
coros = [embed_item(client, sem, item) for item in dataset["train"]]
items = await asyncio.gather(*coros)
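
A semaphore only limits concurrency when every task acquires the same instance. The sketch below checks this without calling any API: `fake_request` is a hypothetical stand-in for the embedding call, and the task count and limit are arbitrary values chosen for illustration.

```python
import asyncio


async def fake_request(sem: asyncio.Semaphore, state: dict) -> None:
    # Hypothetical stand-in for an embedding API call
    async with sem:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # simulate network latency
        state["active"] -= 1


async def peak_concurrency(n_tasks: int, limit: int) -> int:
    sem = asyncio.Semaphore(limit)  # created once and shared by every task
    state = {"active": 0, "peak": 0}
    await asyncio.gather(*(fake_request(sem, state) for _ in range(n_tasks)))
    return state["peak"]


# Highest number of simulated requests that were in flight at the same time
peak = asyncio.run(peak_concurrency(n_tasks=50, limit=10))
print(peak)
```

Inside a notebook, where an event loop is already running, `await peak_concurrency(n_tasks=50, limit=10)` would replace the `asyncio.run` call.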

We'll then insert these items and the embeddings into our database. We'll also simulate stock levels by randomly assigning a quantity to each product, with ~40% of the products set to be out of stock.

In [5]:
import random

# Insert all products with their embeddings and format values for bulk insert
insert_query = """
    INSERT INTO products (
        title,
        description,
        brand,
        category,
        subcategory,
        price,
        embedding,
        quantity
    ) VALUES (
        %(title)s, %(description)s, %(brand)s, %(category)s, %(subcategory)s, %(price)s, %(embedding)s, %(quantity)s
    )
"""

values = [
    {
        "id": item["id"],
        "title": item["title"],
        "description": item["description"],
        "brand": item["brand"],
        "category": item["category"],
        "subcategory": item["subcategory"],
        "price": item["price"],
        "embedding": item["embedding"],
        "quantity": 0
        if random.random() <= 0.4
        else random.randint(1, 100),  # 40% of the products will be out of stock
    }
    for item in items
]

with get_conn() as conn:
    cursor = conn.cursor()
    cursor.executemany(insert_query, values)
    conn.commit()
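
Because the quantities are random, the out-of-stock items will differ on every run, so the result tables later on may not match exactly. If reproducibility matters, one option is a seeded generator. This is a minimal sketch of the same rule; the helper name `simulate_quantity` and the seed value are assumptions made for illustration.

```python
import random


def simulate_quantity(rng: random.Random, out_of_stock_rate: float = 0.4) -> int:
    # Same rule as the ingestion cell: 0 with probability ~0.4, otherwise 1..100
    if rng.random() <= out_of_stock_rate:
        return 0
    return rng.randint(1, 100)


rng = random.Random(42)  # arbitrary seed; any fixed value gives repeatable output
quantities = [simulate_quantity(rng) for _ in range(10_000)]

# Share of simulated products that ended up out of stock
share_out_of_stock = sum(q == 0 for q in quantities) / len(quantities)
print(round(share_out_of_stock, 2))
```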


# Query Understanding

One of the most common ways to improve search results is to use query understanding. This involves taking the user's query and extracting the filters that are most relevant to it. In this section, we'll see how query understanding can help improve our search results and where embedding search falls short.

We'll do so in 3 steps:

1. We'll first show how our existing implementation of embedding search fails to retrieve the relevant results for 3 types of queries: Category-Specific Searches, Price-Specific Searches and Stock-Specific Searches.
2. We'll then see how we can use function calling to extract the filters from the user's query.
3. Lastly, we'll see how we can use these new filters to improve our search results.


## Limitations of Embedding Search

Let's start by implementing a helper function that takes a user query and then returns the top 5 items that match the user's query.


In [5]:
import numpy as np
import pandas as pd

def search_products(query: str, conn: psycopg.Connection):
    with conn.cursor() as cursor:
        client = openai.OpenAI()
        embedding = (
            client.embeddings.create(input=query, model="text-embedding-3-small")
            .data[0]
            .embedding
        )
        results = cursor.execute(
            "SELECT * FROM products ORDER BY embedding <=> %(embedding)s LIMIT 5",
            {"embedding": np.array(embedding)},
        ).fetchall()

        return pd.DataFrame(results)
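
The `<=>` operator used in the query is pgvector's cosine distance operator. As a rough, pure-Python sketch of what that ordering computes (the two-dimensional vectors and their labels are made up for illustration), note that the ranking depends only on the direction of the vectors and knows nothing about price, stock or category:

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    # 1 - cosine similarity: 0 for the same direction, 2 for opposite directions
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


query = [1.0, 0.0]
candidates = {
    "same direction": [2.0, 0.0],
    "orthogonal": [0.0, 3.0],
    "opposite": [-1.0, 0.0],
}

# Equivalent in spirit to ORDER BY embedding <=> query
ranked = sorted(candidates, key=lambda name: cosine_distance(query, candidates[name]))
print(ranked)  # ['same direction', 'orthogonal', 'opposite']
```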


### Category-Specific Searches
 
What happens when we search for items with a specific category?

In [8]:
conn = get_conn()
search_products("Cute tops", conn)

Unnamed: 0,id,title,description,brand,category,subcategory,quantity,price,embedding
0,73,Smocked Button Front Crop Top,This chic red crop top features a trendy smock...,Forever 21,Tops,Tank Tops,0,355.15,"[0.05767848, 0.032298118, -0.0013013736, 0.000..."
1,130,Women's Cutout Cropped Top,Elevate your style with this chic cutout cropp...,Forever 21,Tops,Blouses,0,222.68,"[0.06053915, -0.004936928, -0.028292606, -0.00..."
2,5,Plaid Crop Top,This chic plaid crop top features thin straps ...,Zara,Tops,Tank Tops,0,261.05,"[0.038300514, 0.027030528, -0.02846925, 0.0036..."
3,125,Buttoned Scoop Neck Tank Top,This chic tank top features a flattering scoop...,Zara,Tops,Tank Tops,5,156.82,"[0.063474484, 0.004105421, -0.049733024, -0.02..."
4,184,Off-Shoulder Black Top,"Elegant and chic, this off-shoulder black top ...",ZARA,Tops,Blouses,43,357.77,"[0.076061815, 0.0075597474, -0.0065461183, -0...."


In [90]:
conn = get_conn()
search_products("skirts that go well with cute tops", conn)

Unnamed: 0,id,title,description,brand,category,subcategory,quantity,price,embedding
0,159,Scallop Trim Short Sleeve Knit Top,This elegant knit top features a delicate scal...,Zara,Tops,Blouses,0,207.28,"[0.080112636, 0.028300002, -0.029006997, -0.04..."
1,130,Women's Cutout Cropped Top,Elevate your style with this chic cutout cropp...,Forever 21,Tops,Blouses,0,222.68,"[0.06053915, -0.004936928, -0.028292606, -0.00..."
2,78,Women's White Sleeveless Top with Skirt,This elegant white sleeveless top pairs perfec...,H&M,Tops,Tank Tops,84,262.91,"[0.08342941, 0.03202856, -0.04765495, -0.05129..."
3,184,Off-Shoulder Black Top,"Elegant and chic, this off-shoulder black top ...",ZARA,Tops,Blouses,43,357.77,"[0.076061815, 0.0075597474, -0.0065461183, -0...."
4,73,Smocked Button Front Crop Top,This chic red crop top features a trendy smock...,Forever 21,Tops,Tank Tops,0,355.15,"[0.05767848, 0.032298118, -0.0013013736, 0.000..."


We can see that in both cases the results are similar to the wording of the user's query, but they are not always what the user asked for.

In the first case, when we searched for cute tops, every result is a top, so plain embedding search does a reasonable job.

In the second case, when we searched for skirts that go well with cute tops, every result is in the `Tops` category and not a single skirt was returned. The query mentions tops, which is likely why the embedding pulls in tops even though the user is asking for skirts.

These are items that should not have been retrieved.


### Price-Specific Searches

Let's now see what happens when we search for items with specific price ranges

- I want a sweater under $200
- I’d like a slightly more expensive blouse which is a bit luxurious

When we search for sweaters under $200, two of the five results are above the $200 mark that we were looking for (and some, such as the cardigan and the sweatshirt, are not even sweaters).

In [9]:
conn = get_conn()
search_products("I want a sweater under $200", conn)

Unnamed: 0,id,title,description,brand,category,subcategory,quantity,price,embedding
0,191,Women's Ribbed Long Sleeve Sweater,"This elegant ribbed sweater offers a snug fit,...",H&M,Tops,Sweaters,4,237.07,"[0.029483186, 0.01499368, -0.079264835, -0.034..."
1,113,Women's Color Block Sweater,Stay cozy and stylish with this vibrant color ...,Madewell,Tops,Sweaters,0,148.72,"[0.0239264, 0.009523993, -0.06696731, 0.008265..."
2,121,Women's 3/4 Sleeve Knit Top,This 3/4 sleeve knit top offers a classic and ...,Lands' End,Tops,Sweaters,42,384.26,"[0.054612853, 0.03034311, -0.0572225, -0.03762..."
3,28,Purple Zip-Up Cardigan,This stylish purple zip-up cardigan features a...,Zara,Tops,Cardigans,29,147.16,"[0.02910682, -0.010962445, -0.05825576, -0.017..."
4,53,Nike Air Sweatshirt,Stay comfortable and stylish with the Nike Air...,Nike,Tops,Sweatshirts,66,96.37,"[0.038368914, 0.024989309, -0.109480985, -0.04..."


When we search for a more expensive, luxurious blouse, the results are slightly better since every item is a blouse, but the prices range from $49.87 to $396.63, so the request for a pricier item is not reflected in the ranking.

In [10]:
conn = get_conn()
search_products("I’d like a slightly more expensive blouse which is a bit luxurious", conn)

Unnamed: 0,id,title,description,brand,category,subcategory,quantity,price,embedding
0,157,Elegant Black Bow Blouse,Elevate your wardrobe with this chic black blo...,Zara,Tops,Blouses,65,49.87,"[0.06981218, 0.047272313, 0.012234672, 0.01342..."
1,141,Women's Puff Long Sleeve Blouse,"Elegant and versatile, this women's blouse fea...",H&M,Tops,Blouses,46,198.3,"[0.069838315, 0.0045731487, -0.010328281, -0.0..."
2,14,Women's Lace Blouse,Elevate your wardrobe with this elegant lace b...,Zara,Tops,Blouses,80,88.59,"[0.060136728, 0.01768537, 0.0061753304, -0.036..."
3,147,Women's Leopard Print Sleeveless Blouse,This elegant sleeveless blouse features a bold...,Zara,Tops,Blouses,36,396.63,"[0.05188841, 0.020000853, -0.03112883, -0.0121..."
4,64,Lace Short Sleeve Blouse,This elegant lace blouse from H&M features int...,H&M,Tops,Blouses,53,209.36,"[0.06832883, -0.0053561116, -0.012975514, -0.0..."


We can see that for these two queries, the retrieved items are broadly the right kind of product but the user's specific price requirements are not met.

In the first case, we're getting a knit top that's almost double the maximum price that the user specified. In the second, we could argue that a blouse priced at $49.87 might not be considered luxurious.

### Stock-Sensitive Searches

Let's now see what happens when we search for items with a specific stock status, for example: I want a black t-shirt in stock.

In [117]:
conn = get_conn()
search_products("Black T-Shirts in stock", conn)

Unnamed: 0,id,title,description,brand,category,subcategory,quantity,price,embedding
0,13,Simple Black T-Shirt,This versatile black t-shirt offers a relaxed ...,H&M,Tops,T-Shirts,0,385.62,"[0.061984405, 0.047937427, -0.059149627, -0.02..."
1,189,Color Block T-Shirt,This stylish color block t-shirt is the perfec...,Gap,Tops,T-Shirts,0,139.78,"[0.07176805, 0.008485796, -0.0411243, 0.004143..."
2,39,Graphic T-Shirt,This stylish Opening Ceremony Graphic T-Shirt ...,Opening Ceremony,Tops,T-Shirts,19,203.73,"[0.107823, 0.04437266, -0.06881595, -0.0184602..."
3,92,Women's Graphic Tiger T-Shirt,This stylish black t-shirt from Just Cavalli f...,Just Cavalli,Tops,T-Shirts,37,315.59,"[0.043493904, -0.003560018, -0.005251157, -0.0..."
4,168,Women's Short Sleeve Logo T-Shirt,This casual short sleeve t-shirt features the ...,Armani Exchange,Tops,T-Shirts,0,38.77,"[0.08079471, 0.03946092, -0.05492298, -0.02218..."


The first item in the results is a black t-shirt but is clearly out of stock (since its quantity is 0), and three of the five results are out of stock in total. This is a case where the retrieved results don't satisfy the user's query.
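
To make these failures concrete, the sketch below counts how many of the returned rows break the user's constraint. The rows are copied from the result tables above; the helper name `violations` is an assumption made for illustration.

```python
# (title, price, quantity) copied from the result tables above
sweater_results = [
    ("Women's Ribbed Long Sleeve Sweater", 237.07, 4),
    ("Women's Color Block Sweater", 148.72, 0),
    ("Women's 3/4 Sleeve Knit Top", 384.26, 42),
    ("Purple Zip-Up Cardigan", 147.16, 29),
    ("Nike Air Sweatshirt", 96.37, 66),
]
tshirt_results = [
    ("Simple Black T-Shirt", 385.62, 0),
    ("Color Block T-Shirt", 139.78, 0),
    ("Graphic T-Shirt", 203.73, 19),
    ("Women's Graphic Tiger T-Shirt", 315.59, 37),
    ("Women's Short Sleeve Logo T-Shirt", 38.77, 0),
]


def violations(rows, predicate):
    # Titles of the rows that do NOT satisfy the user's constraint
    return [title for title, price, quantity in rows if not predicate(price, quantity)]


over_budget = violations(sweater_results, lambda price, quantity: price <= 200)
out_of_stock = violations(tshirt_results, lambda price, quantity: quantity > 0)
print(len(over_budget), len(out_of_stock))  # 2 3
```

A check like this is a cheap way to notice that similarity alone is not enforcing hard constraints such as a price ceiling or stock status.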

## Query Understanding

Now that we've seen how our existing implementation of embedding search fails to retrieve the relevant results for these queries, let's see how we can use query understanding to improve our search results.

We'll start by defining a Pydantic model that describes the filters we want to extract from the user's query.

In [13]:
from pydantic import BaseModel, Field
from typing import Optional, Literal


class QueryFilters(BaseModel):
    avaliability: Literal["in-stock", "out-of-stock", "all"] = Field(
        description="Stock availability of item"
    )
    min_price: Optional[float] = None
    max_price: Optional[float] = None
    category: list[Literal["Outerwear", "Tops", "Activewear", "Dresses", "Bottoms"]]
    subcategory: list[
        Literal[
            "Jeans",
            "Athletic Shirts",
            "Vests",
            "Sweatshirts",
            "Pants",
            "Sweaters",
            "Shorts",
            "Tank Tops",
            "Skirts",
            "Leggings",
            "Cardigans",
            "Casual Dresses",
            "Blouses",
            "T-Shirts",
        ]
    ] = []
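
The values allowed by a `Literal` annotation can also be read back at runtime, which is handy when the same names are needed as a plain list (for example, to render into a prompt) and you'd rather not maintain two copies. This is a small sketch using only the standard library; the `Category` alias and the `is_valid_category` helper are hypothetical names introduced here.

```python
from typing import Literal, get_args

# Hypothetical alias holding the same category names as the model above
Category = Literal["Outerwear", "Tops", "Activewear", "Dresses", "Bottoms"]

# get_args returns the literal values in declaration order
categories = list(get_args(Category))
print(categories)


def is_valid_category(value: str) -> bool:
    # True only for names that appear in the Literal
    return value in get_args(Category)
```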

We can then pass in the categories and subcategories to the model as context along with the user's query to see how well the model performs.

In [14]:
import instructor

categories = ["Outerwear", "Tops", "Activewear", "Dresses", "Bottoms"]
subcategories = [
    "Jeans",
    "Athletic Shirts",
    "Vests",
    "Sweatshirts",
    "Pants",
    "Sweaters",
    "Shorts",
    "Tank Tops",
    "Skirts",
    "Leggings",
    "Cardigans",
    "Casual Dresses",
    "Blouses",
    "T-Shirts",
]


async def extract_query_filters(client: openai.AsyncOpenAI, query: str, sem: Semaphore):
    async with sem:
        return await client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {
                    "role": "system",
                    "content": """
                    You are a helpful assistant that extracts user requirements from a query. 

                    You have the following categories and subcategories that you can use to extract the user's requirements. Err on the side of including more filters rather than less.

                    <categories>
                    {{categories}}
                    </categories>

                    <subcategories>
                    {{subcategories}}
                    </subcategories>

                    <availabilities>
                    in-stock, out-of-stock, all
                    </availabilities>
                    
                    Also extract out min price or max price if the user is looking for a specific price range. Let's work with a range that has at least $40 difference between the min and max price if the user does not specify a specific price (Eg. around 100 is going to translate to $60-$140 but under 100 is going to translate to $0-$100).
                    """,
                },
                {
                    "role": "user",
                    "content": "The user's query is: {{ query }}",
                },
            ],
            context={
                "query": query,
                "categories": categories,
                "subcategories": subcategories,
            },
            response_model=QueryFilters,
        )


client = instructor.from_openai(openai.AsyncOpenAI())
sem = Semaphore(10)
query = "I want a top under $50 to wear tonight for an event"
filters = await extract_query_filters(
    client, query, sem
)
print(filters)

avaliability='in-stock' min_price=0.0 max_price=50.0 category=['Tops'] subcategory=[]


We can see that the function has correctly extracted the relevant filters that we should be applying to our search:

1. Firstly, the item should be in stock since the user wants something to wear tonight
2. Secondly, the item should be a top as per the user's request
3. Lastly, the item should have a max price of $50
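
The price rule in the system prompt is stated in natural language, so the model's output can vary. For comparison, here is a deterministic sketch of the two examples given in the prompt ("around 100" and "under 100"). The function name `price_range`, the `kind` labels and the `"over"` case are assumptions added for illustration and are not part of the prompt.

```python
from typing import Optional, Tuple


def price_range(
    kind: str, amount: float, margin: float = 40.0
) -> Tuple[Optional[float], Optional[float]]:
    # "around": widen by the margin on both sides, as in the prompt's $60-$140 example
    if kind == "around":
        return (max(0.0, amount - margin), amount + margin)
    # "under": everything from $0 up to the stated amount
    if kind == "under":
        return (0.0, amount)
    # "over": assumed counterpart, with no upper bound
    if kind == "over":
        return (amount, None)
    raise ValueError(f"unknown kind: {kind}")


print(price_range("around", 100))  # (60.0, 140.0)
print(price_range("under", 100))   # (0.0, 100)
```

A helper like this could serve as a reference when checking whether the extracted `min_price` and `max_price` look sensible.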

### SQL Filtering

We can now use these filters to narrow down our results. Timescale makes this easy since complex filters, conditional joins on fields from other tables, and more can all be expressed in plain SQL.

In our case, we'll use a Jinja2 template to generate our SQL query and then pass in the parameters to the query. 




In [16]:
import psycopg
import openai
from jinja2 import Template


async def search_with_filters(
    conn: psycopg.Connection,
    limit: int,
    query: str,
    sem: Semaphore,
    client: openai.AsyncOpenAI,
):
    filters = await extract_query_filters(client, query, sem)
    embedding = (
        (await client.embeddings.create(input=query, model="text-embedding-3-small"))
        .data[0]
        .embedding
    )

    conditions = []

    if filters.min_price:
        conditions.append("price >= %(min_price)s")

    if filters.max_price:
        conditions.append("price <= %(max_price)s")

    if filters.category:
        conditions.append("category = ANY(%(category)s)")

    if filters.subcategory:
        conditions.append("subcategory = ANY(%(subcategory)s)")

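    if filters.avaliability == "in-stock":
        conditions.append("quantity > 0")
    elif filters.avaliability == "out-of-stock":
        # Also honour explicit out-of-stock requests
        conditions.append("quantity = 0")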

    template = Template("""
SELECT * FROM products
{% if conditions %}
WHERE {{ conditions | join(' AND ') }}
{% endif %}
ORDER BY embedding <=> %(embedding)s
LIMIT %(limit)s
    """)

    query = template.render(filters=filters, conditions=conditions)

    # Flatten the parameters
    params = {
        "embedding": np.array(embedding),
        "limit": limit,
        "max_price": filters.max_price,
        "min_price": filters.min_price,
        "category": filters.category,
        "subcategory": filters.subcategory,
    }

    with conn.cursor() as cursor:
        results = cursor.execute(query, params).fetchall()

        return pd.DataFrame(results if results else [])


sem = Semaphore(10)
conn = get_conn()
query = "I want a top under $50 to wear tonight for an event"
results = await search_with_filters(conn, 5, query, sem, client)
results

Unnamed: 0,id,title,description,brand,category,subcategory,quantity,price,embedding
0,66,Black Lace Long Sleeve Top,Elevate your wardrobe with this chic black lac...,H&M,Tops,Blouses,37,49.44,"[0.053532373, 0.0062692645, -0.02314972, -0.05..."
1,152,Long Sleeve Scoop Neck Top,Upgrade your basics with this chic long sleeve...,American Eagle,Tops,T-Shirts,16,15.64,"[0.03440164, 0.014031969, -0.04404597, -0.0138..."
2,157,Elegant Black Bow Blouse,Elevate your wardrobe with this chic black blo...,Zara,Tops,Blouses,65,49.87,"[0.06981218, 0.047272313, 0.012234672, 0.01342..."
3,132,Classic Women's Crew Neck T-Shirt,This timeless crew neck t-shirt offers a relax...,GAP,Tops,T-Shirts,50,14.6,"[0.064989835, 0.04809502, -0.076238886, -0.015..."
4,18,Women's Printed Blouse,Elevate your everyday look with this elegant p...,H&M,Tops,Blouses,79,23.18,"[0.058057968, 0.01887594, -0.0018916512, -0.02..."


With our new `search_with_filters` method, we’re now able to generate a SQL string using `jinja` as seen below.

```sql
SELECT * FROM products
WHERE price <= %(max_price)s AND category = ANY(%(category)s) AND quantity > 0
ORDER BY embedding <=> %(embedding)s
LIMIT %(limit)s
```

This corresponds to the filters our language model extracted:

```python
QueryFilters(
    avaliability='in-stock',
    min_price=0.0,
    max_price=50.0,
    category=['Tops'],
    subcategory=[]
)
```

With this new `search_with_filters` method, we’re now able to apply the relevant filters, resulting in items that are:

- Categorised under the main category of Tops
- Having a quantity > 0
- Having a price that is under $50
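
The same query construction can be sketched without a template engine by collecting conditions and their parameters side by side. This is a simplified illustration, not the notebook's implementation: the function name `build_query` and its keyword arguments are assumptions, and the caller is expected to add the `embedding` and `limit` parameters before executing.

```python
from typing import Any, Dict, List, Optional, Tuple


def build_query(
    min_price: Optional[float] = None,
    max_price: Optional[float] = None,
    category: Optional[List[str]] = None,
    subcategory: Optional[List[str]] = None,
    in_stock_only: bool = False,
) -> Tuple[str, Dict[str, Any]]:
    conditions: List[str] = []
    params: Dict[str, Any] = {}

    # Truthiness checks mirror the cell above: 0.0, None and [] add no condition
    if min_price:
        conditions.append("price >= %(min_price)s")
        params["min_price"] = min_price
    if max_price:
        conditions.append("price <= %(max_price)s")
        params["max_price"] = max_price
    if category:
        conditions.append("category = ANY(%(category)s)")
        params["category"] = category
    if subcategory:
        conditions.append("subcategory = ANY(%(subcategory)s)")
        params["subcategory"] = subcategory
    if in_stock_only:
        conditions.append("quantity > 0")

    lines = ["SELECT * FROM products"]
    if conditions:
        lines.append("WHERE " + " AND ".join(conditions))
    lines.append("ORDER BY embedding <=> %(embedding)s")
    lines.append("LIMIT %(limit)s")
    return "\n".join(lines), params


sql, params = build_query(
    min_price=0.0, max_price=50.0, category=["Tops"], subcategory=[], in_stock_only=True
)
print(sql)
```

Only fixed condition strings are ever joined into the SQL text. The user-derived values travel separately as parameters, which keeps the query safe from injection however the filters were extracted.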

## Our Improved Approach

Let's see how our new approach performs by revisiting some of the previous queries that we identified as having issues.

### Category-Specific Searches

Let's revisit the two queries from before: "skirts that go well with cute tops" and "Cute tops". Both now return results that are much more relevant to the user's query.

As seen below, our query understanding step correctly recognised that the user is looking for skirts and not tops. As a result, every retrieved item is a skirt.


In [98]:
await search_with_filters(conn, 5, "skirts that go well with cute tops", sem, client)

Unnamed: 0,id,title,description,brand,category,subcategory,quantity,price,embedding
0,149,Rust Midi Skirt,This chic rust-colored midi skirt offers a sop...,H&M,Bottoms,Skirts,72,338.49,"[0.038690697, -0.018261096, -0.05469198, -0.01..."
1,165,Plaid Pencil Skirt,This plaid pencil skirt is a versatile additio...,Zara,Bottoms,Skirts,97,275.42,"[0.01957393, -0.016166368, -0.029405586, -0.02..."
2,60,Women's High-Waisted Midi Skirt,Make a statement with this chic high-waisted m...,Zara,Bottoms,Skirts,35,389.11,"[0.0426107, -0.0119919, -0.013514505, -0.00535..."
3,79,White Eyelet Mini Skirt,"Featuring a delicate eyelet design, this white...",H&M,Bottoms,Skirts,0,102.31,"[0.060203258, 0.022313096, -0.0173979, -0.0479..."
4,65,Floral Print Skirt,Flaunt your femininity with this charming flor...,H&M,Bottoms,Skirts,73,300.24,"[0.05405696, 0.027527, -0.031233393, 0.0025359..."


When we look at the results for cute tops, we can see that the results are now much more relevant to the user's query - the only items that are retrieved are tops.


In [99]:
await search_with_filters(conn, 5, "Cute tops", sem, client)

Unnamed: 0,id,title,description,brand,category,subcategory,quantity,price,embedding
0,73,Smocked Button Front Crop Top,This chic red crop top features a trendy smock...,Forever 21,Tops,Tank Tops,0,355.15,"[0.05767848, 0.032298118, -0.0013013736, 0.000..."
1,130,Women's Cutout Cropped Top,Elevate your style with this chic cutout cropp...,Forever 21,Tops,Blouses,0,222.68,"[0.06053915, -0.004936928, -0.028292606, -0.00..."
2,5,Plaid Crop Top,This chic plaid crop top features thin straps ...,Zara,Tops,Tank Tops,0,261.05,"[0.038300514, 0.027030528, -0.02846925, 0.0036..."
3,125,Buttoned Scoop Neck Tank Top,This chic tank top features a flattering scoop...,Zara,Tops,Tank Tops,5,156.82,"[0.063474484, 0.004105421, -0.049733024, -0.02..."
4,184,Off-Shoulder Black Top,"Elegant and chic, this off-shoulder black top ...",ZARA,Tops,Blouses,43,357.77,"[0.076061815, 0.0075597474, -0.0065461183, -0...."


### Price-Sensitive Searches

Previously when we searched for sweaters under $200, we obtained a list that included items above the $200 mark, some of which were not even sweaters.

With query understanding, we can now retrieve the relevant results that fit the price requirement that the user has specified.

In [115]:
await search_with_filters(conn, 5, "sweaters under $200", sem, client)

Unnamed: 0,id,title,description,brand,category,subcategory,quantity,price,embedding
0,67,Women's Long Sleeve Turtleneck Top,This chic and versatile turtleneck top is craf...,Zara,Tops,Sweaters,74,154.75,"[0.032573458, -0.006064759, -0.03929135, -0.03..."


### Stock-Sensitive Searches

Previously when we searched for "Black T-Shirts in stock", three of the five results were out of stock, with quantity marked as 0.

In this case, we can see that the retrieved results are now much more relevant to the user's query since all of them have a quantity greater than 0.

In [118]:
await search_with_filters(conn, 5, "Black T-Shirts in stock", sem, client)

Unnamed: 0,id,title,description,brand,category,subcategory,quantity,price,embedding
0,39,Graphic T-Shirt,This stylish Opening Ceremony Graphic T-Shirt ...,Opening Ceremony,Tops,T-Shirts,19,203.73,"[0.107823, 0.04437266, -0.06881595, -0.0184602..."
1,92,Women's Graphic Tiger T-Shirt,This stylish black t-shirt from Just Cavalli f...,Just Cavalli,Tops,T-Shirts,37,315.59,"[0.043493904, -0.003560018, -0.005251157, -0.0..."
2,15,Classic White T-Shirt,This classic white T-shirt features a simple c...,H&M,Tops,T-Shirts,83,318.67,"[0.06042049, 0.040267307, -0.053937104, -0.022..."
3,37,White Crew Neck T-Shirt,This classic white crew neck t-shirt offers a ...,GAP,Tops,T-Shirts,68,229.9,"[0.06538749, 0.018682139, -0.057946295, -0.000..."
4,88,Classic Crew Neck T-Shirt,This versatile crew neck t-shirt offers a rela...,H&M,Tops,T-Shirts,63,175.24,"[0.061851878, 0.03604318, -0.04478699, -0.0158..."


# Conclusion

In this short tutorial, we've seen how query understanding is essential for improving the relevance of our search results. By using language models, we can extract relevant filters from user queries and then use these filters to improve our search results.

By combining this with the capabilities of `pgvectorscale`, we can now retrieve products that are semantically relevant to the user's query and that also match the implicit filters the user has specified.

Timescale makes this easy by supporting complex filtering operations alongside vector search using simple SQL. In the next notebook, we'll see how we can leverage language models to generate synthetic queries to benchmark our query understanding before we even ship to production.