<img src="https://relevance.ai/wp-content/uploads/2021/11/logo.79f303e-1.svg" width="150" alt="Relevance AI" />
<h5> Developer-first vector platform for ML teams </h5>

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/RelevanceAI/RelevanceAI/blob/main/guides/advanced_search_guide.ipynb)

# 🔍 Advanced Search

Advanced Search is Relevance AI's most complex search endpoint.
It combines vector search and exact text search, with the ability to boost your search results depending on your needs. The following dummy examples show how to quickly add complexity to your search!

In [1]:
!pip install -q -U RelevanceAI-dev[notebook]



In [2]:
%%capture
# Let's use the popular CLIP model to encode text and images into the same vector space: https://github.com/openai/CLIP
# Note: the %%capture cell magic must be the first line of the cell
!conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=11.0
!pip install ftfy regex tqdm
!pip install git+https://github.com/openai/CLIP.git

You can sign up/login and find your credentials here: https://cloud.relevance.ai/sdk/api
Once you have signed up, click on the value under `Authorization token` and paste it here

In [3]:
import pandas as pd
from relevanceai import Client
client = Client()


Activation Token: ··········


## 🚣 Inserting data

We use a sample ecommerce dataset - with vectors `product_image_clip_vector_` and `product_title_clip_vector_` already encoded for us.

In [4]:
from relevanceai.utils.datasets import get_ecommerce_dataset_encoded
docs = get_ecommerce_dataset_encoded()

In [5]:
ds = client.Dataset("advanced_search_guide")
# ds.delete()
ds.upsert_documents(docs)

✅ All documents inserted/edited successfully.


In [6]:
ds.schema

{'insert_date_': 'date',
 'price': 'numeric',
 'product_image': 'text',
 'product_image_clip_vector_': {'vector': 512},
 'product_link': 'text',
 'product_price': 'text',
 'product_title': 'text',
 'product_title_clip_vector_': {'vector': 512},
 'query': 'text',
 'source': 'text'}

In [7]:
vector_fields = ds.list_vector_fields()
vector_fields

['product_image_clip_vector_', 'product_title_clip_vector_']
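The vector fields can also be read straight off the schema printed above: they are the entries whose type is a dictionary with a `vector` key giving the dimension. The following is a minimal sketch that works on a plain copy of that schema dictionary; it does not call the SDK, and the variable names are only illustrative.

```python
# Plain-Python copy of the schema shown above (no SDK call involved).
schema = {
    "insert_date_": "date",
    "price": "numeric",
    "product_image": "text",
    "product_image_clip_vector_": {"vector": 512},
    "product_link": "text",
    "product_price": "text",
    "product_title": "text",
    "product_title_clip_vector_": {"vector": 512},
    "query": "text",
    "source": "text",
}

# A field is treated as a vector field when its type is a dict with a "vector" key.
vector_dimensions = {
    name: dtype["vector"]
    for name, dtype in schema.items()
    if isinstance(dtype, dict) and "vector" in dtype
}
vector_field_names = sorted(vector_dimensions)
print(vector_field_names)
print(vector_dimensions)
```

Knowing the dimension (512 here) is handy later on, because any query vector sent against a field should have the same length as the vectors stored in it.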

## Simple Text Search

In [8]:
results = ds.advanced_search(
    query="nike", fields_to_search=["product_title"], select_fields=["product_title"]
)
pd.DataFrame(results["results"])

Unnamed: 0,product_title,_id,_relevance
0,Nike Women's Summerlite Golf Glove,b37b2aea-800e-4662-8977-198f744d52bb,7.59013
1,Nike Dura Feel Women's Golf Glove,e725c79c-c2d2-4c6d-b77a-ed029f33813b,7.148285
2,Nike Junior's Range Jr Golf Shoes,0e7a5a3d-5d17-42c4-b607-7bf9bb2625a4,7.148285
3,Nike Sport Lite Women's Golf Bag,3660e25b-8359-49b9-88c7-fca2dfd9053f,7.148285
4,Nike Women's Tech Xtreme Golf Glove,8b28e438-0726-4b58-98c7-7597a43d2433,7.148285
5,Nike Women's SQ Dymo Fairway Wood,adab23fd-ded8-4068-b6a2-999bfe20e5e7,7.148285
6,Nike Ladies Lunar Duet Sport Golf Shoes,b655198b-4356-4ba9-b88e-1e1d6608f43e,6.755055
7,Nike Junior's Range Red/ White Golf Shoes,d27e70f3-2884-4490-9742-133166795d0f,6.755055
8,Nike Women's Lunar Duet Classic Golf Shoes,e1f3faf0-72fa-4559-9604-694699426cc2,6.755055
9,Nike Air Men's Range WP Golf Shoes,e8d2552f-3ca5-4d15-9ca7-86855025b183,6.755055


## Simple Vector Search

Let's prepare some functions to help us encode our data!


In [9]:
import torch
import clip
import requests
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# First - let's encode the image based on CLIP
def encode_image(image):
    # Let us download the image and then preprocess it
    image = (
        preprocess(Image.open(requests.get(image, stream=True).raw))
        .unsqueeze(0)
        .to(device)
    )
    # We then feed our processed image through the neural net to get a vector
    with torch.no_grad():
        image_features = model.encode_image(image)
    # Lastly we convert it to a list so that we can send it through the SDK
    return image_features.tolist()[0]


# Next - let's encode text based on CLIP
def encode_text(text):
    # let us get text and then tokenize it
    text = clip.tokenize([text]).to(device)
    # We then feed our processed text through the neural net to get a vector
    with torch.no_grad():
        text_features = model.encode_text(text)
    return text_features.tolist()[0]



In [10]:
# Encoding the query
query_vector = encode_text("nike")

results = ds.advanced_search(
    vector_search_query=[
        {"vector": query_vector, "field": "product_title_clip_vector_"}
    ],
    select_fields=["product_title"],
)

pd.DataFrame(results["results"])

Unnamed: 0,product_title,_id,_relevance
0,PS4 - Playstation 4 Console,a24c46df-0a1b-49a5-80f4-5ad61bcc6370,0.748447
1,Nike Men's 'Air Visi Pro IV' Synthetic Athleti...,0435795a-899f-4cdf-89be-a0f3f189d69e,0.747137
2,Nike Men's 'Air Max Pillar' Synthetic Athletic...,57ca8324-3e8a-4926-9333-b10599edb17b,0.733907
3,Brica Drink Pod,bbb623f6-485b-44b3-8739-1998b15ae60d,0.725095
4,Gear Head Mouse,c945fe93-fff3-434b-a91f-18133ab28582,0.712708
5,Gear Head Mouse,0f1e86a8-867f-4437-8fb0-2b95a37f0c22,0.712708
6,PS4 - UFC,050a9f63-3549-4720-9be7-9daa07f868e8,0.702847
7,Nike Women's 'Zoom Hyperquickness' Synthetic A...,5536a97a-2183-4342-bc92-422aebbcbbc9,0.697779
8,Nike Women's 'Zoom Hyperquickness' Synthetic A...,00445000-a8ed-4523-b610-f70aa79d47f7,0.695003
9,Nike Men's 'Jordan SC-3' Leather Athletic Shoe,281d9edd-4be6-4c69-a846-502053f3d4e7,0.694744
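To build some intuition for how a vector search ranks documents, here is a small self-contained sketch. It assumes the ranking is based on cosine similarity between the query vector and each stored vector, which is a common choice but is not confirmed by this guide. The three-dimensional toy vectors and titles are made up for illustration; real CLIP vectors have 512 dimensions.

```python
import numpy as np


def cosine_similarity(query, matrix):
    """Cosine similarity between one query vector and each row of a matrix."""
    query = np.asarray(query, dtype=float)
    matrix = np.asarray(matrix, dtype=float)
    return (matrix @ query) / (np.linalg.norm(matrix, axis=1) * np.linalg.norm(query))


# Toy data (hypothetical): three "documents" with 3-dimensional vectors.
toy_titles = ["running shoe", "golf glove", "game console"]
toy_vectors = [
    [0.9, 0.1, 0.0],
    [0.5, 0.5, 0.1],
    [0.0, 0.2, 0.9],
]
toy_query = [1.0, 0.0, 0.0]

scores = cosine_similarity(toy_query, toy_vectors)
# Sort from most to least similar.
ranking = [toy_titles[i] for i in np.argsort(-scores)]
print(ranking)
```

Because the ranking depends only on closeness in the embedding space, a short query such as "nike" can surface items that never mention the word, as the PlayStation and mouse results above show.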


## Combining Text And Vector Search (Hybrid)

Combining text and vector search allows users to get the best of both exact text search and contextual vector search. This can be done as shown below.

In [11]:
results = ds.advanced_search(
    query="nike",
    fields_to_search=["product_title"],
    vector_search_query=[
        {"vector": query_vector, "field": "product_title_clip_vector_"}
    ],
    select_fields=["product_title"],  # results to return
)

pd.DataFrame(results["results"])

Unnamed: 0,product_title,_id,_relevance
0,Nike Women's Summerlite Golf Glove,b37b2aea-800e-4662-8977-198f744d52bb,8.14037
1,Nike Junior's Range Jr Golf Shoes,0e7a5a3d-5d17-42c4-b607-7bf9bb2625a4,7.816567
2,Nike Sport Lite Women's Golf Bag,3660e25b-8359-49b9-88c7-fca2dfd9053f,7.704053
3,Nike Women's SQ Dymo Fairway Wood,adab23fd-ded8-4068-b6a2-999bfe20e5e7,7.700504
4,Nike Dura Feel Women's Golf Glove,e725c79c-c2d2-4c6d-b77a-ed029f33813b,7.696908
5,Nike Women's Tech Xtreme Golf Glove,8b28e438-0726-4b58-98c7-7597a43d2433,7.643136
6,Nike Men's 'Lunarglide 6' Synthetic Athletic Shoe,8cb26a3e-7de4-4af3-ae40-272450fa9b4d,7.445704
7,Nike Men's 'Lunarglide 6' Synthetic Athletic Shoe,968a9319-fdd4-45ca-adc6-940cd83a204a,7.440268
8,Nike Women's SQ Dymo STR8-FIT Driver,ff52b64a-0567-4181-8753-763da7044f2f,7.410513
9,Nike Women's 'Lunaracer+ 3' Mesh Athletic Shoe,0614f0a9-adcb-4c6c-939c-e7869525549c,7.408814


## Adjust the weighting of your vector search results

You can control how much the vector search contributes to the final ranking.
Simply add a `weight` parameter to your dictionary inside `vector_search_query`.

In [12]:
results = ds.advanced_search(
    query="nike",
    fields_to_search=["product_title"],
    vector_search_query=[
        {"vector": query_vector, "field": "product_title_clip_vector_", "weight": 0.5}
    ],
    select_fields=["product_title"],  # results to return
)

pd.DataFrame(results["results"])

Unnamed: 0,product_title,_id,_relevance
0,Nike Women's Summerlite Golf Glove,b37b2aea-800e-4662-8977-198f744d52bb,7.86525
1,Nike Junior's Range Jr Golf Shoes,0e7a5a3d-5d17-42c4-b607-7bf9bb2625a4,7.482427
2,Nike Sport Lite Women's Golf Bag,3660e25b-8359-49b9-88c7-fca2dfd9053f,7.426169
3,Nike Women's SQ Dymo Fairway Wood,adab23fd-ded8-4068-b6a2-999bfe20e5e7,7.424395
4,Nike Dura Feel Women's Golf Glove,e725c79c-c2d2-4c6d-b77a-ed029f33813b,7.422597
5,Nike Women's Tech Xtreme Golf Glove,8b28e438-0726-4b58-98c7-7597a43d2433,7.395711
6,Nike Men's 'Lunarglide 6' Synthetic Athletic Shoe,8cb26a3e-7de4-4af3-ae40-272450fa9b4d,7.100379
7,Nike Men's 'Lunarglide 6' Synthetic Athletic Shoe,968a9319-fdd4-45ca-adc6-940cd83a204a,7.097662
8,Nike Women's SQ Dymo STR8-FIT Driver,ff52b64a-0567-4181-8753-763da7044f2f,7.082784
9,Nike Women's 'Lunaracer+ 3' Mesh Athletic Shoe,0614f0a9-adcb-4c6c-939c-e7869525549c,7.081935
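Comparing the text-only, hybrid, and weighted hybrid tables above suggests that, for this dataset, the hybrid `_relevance` behaves like the text score plus `weight` times a vector contribution. This is an observation drawn from the printed numbers, not documented behaviour of the endpoint. The sketch below checks that reading against three rows copied from the tables; the helper name is hypothetical.

```python
# (text-only score, hybrid score without a weight, hybrid score with weight=0.5),
# copied from the result tables above.
observed = {
    "Nike Women's Summerlite Golf Glove": (7.59013, 8.14037, 7.86525),
    "Nike Junior's Range Jr Golf Shoes": (7.148285, 7.816567, 7.482427),
    "Nike Sport Lite Women's Golf Bag": (7.148285, 7.704053, 7.426169),
}


def predicted_weighted_score(text_score, hybrid_score, weight):
    """Assume hybrid = text + vector contribution, then rescale that contribution."""
    vector_contribution = hybrid_score - text_score
    return text_score + weight * vector_contribution


max_error = max(
    abs(predicted_weighted_score(text, hybrid, 0.5) - weighted)
    for text, hybrid, weighted in observed.values()
)
print(f"largest gap between prediction and observed score: {max_error:.7f}")
```

If the gap stays near the rounding error of the printed scores, halving `weight` simply halves the vector part of the score, which is a useful mental model when tuning weights.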


## Multi-Vector Search Across Multiple Fields

You can easily search across several vector fields at once by adding more entries to your vector search query, as shown below.

In [13]:
from PIL import Image
import requests
import numpy as np

image_url = "https://static.nike.com/a/images/t_PDP_1280_v1/f_auto,q_auto:eco/e6ea66d1-fd36-4436-bcac-72ed14d8308d/wearallday-younger-shoes-5bnMmp.png"


<img src="https://static.nike.com/a/images/t_PDP_1280_v1/f_auto,q_auto:eco/e6ea66d1-fd36-4436-bcac-72ed14d8308d/wearallday-younger-shoes-5bnMmp.png" width="150" alt="Sample query image" />
<h5> Sample Query Image  </h5>



In [21]:
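# Encode the query image first so that image_vector is defined in this cell
image_vector = encode_image(image_url)
image_vector[0:5]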

[-0.1314697265625,
 -0.442626953125,
 0.0194549560546875,
 0.11602783203125,
 -0.405029296875]

In [26]:
from relevanceai import show_json

image_vector = encode_image(image_url)

results = ds.advanced_search(
    query="nike",
    fields_to_search=["product_title"],
    vector_search_query=[
        {"vector": query_vector, "field": "product_title_clip_vector_", "weight": 0.2},
        {
            "vector": image_vector,
            "field": "product_image_clip_vector_",
            "weight": 0.8,
        },  ## weight the query more on the image vector
    ],
    select_fields=[
        "product_title",
        "product_image",
        "query",
        "product_price",
    ],  # results to return
    queryConfig={"weight": 0.1}  # Adjust the weight of the traditional text search
)


display(
    show_json(
        results["results"],
        text_fields=["product_title", "query", "product_price"],
        image_fields=["product_image"],
    )
)

# pd.DataFrame(results['results'])

Unnamed: 0,product_image,product_title,query,product_price,_id
0,,Nike Men's 'Lunarglide 6' Synthetic Athletic Shoe,nike womens,$145.99,8cb26a3e-7de4-4af3-ae40-272450fa9b4d
1,,Nike Men's 'Lunarglide 6' Synthetic Athletic Shoe,nike shoes,$145.99,968a9319-fdd4-45ca-adc6-940cd83a204a
2,,Nike Junior's Range Jr Golf Shoes,nike shoes,$54.99,0e7a5a3d-5d17-42c4-b607-7bf9bb2625a4
3,,Nike Ladies Lunar Duet Sport Golf Shoes,nike womens,$81.99 - $88.07,80210247-6f40-45be-8279-8743b327f1dc
4,,Nike Mens Lunar Mont Royal Spikeless Golf Shoes,nike shoes,$100.99,e692a73b-a144-4e44-b4db-657be6db96e2
5,,Nike Mens Lunar Cypress Spikeless Golf Shoes,nike shoes,$100.99,fb323476-a16d-439c-9380-0bac1e10a06d
6,,Nike Ladies Lunar Duet Sport Golf Shoes,nike shoes,$81.99 - $88.07,b655198b-4356-4ba9-b88e-1e1d6608f43e
7,,Nike Women's 'Lunaracer+ 3' Mesh Athletic Shoe,nike shoes,$107.99,0614f0a9-adcb-4c6c-939c-e7869525549c
8,,Nike Women's 'Lunaracer+ 3' Mesh Athletic Shoe,nike womens,$107.99,7baea34f-fb0a-47da-9edd-d920abddccf5
9,,Nike Air Men's Range WP Golf Shoes,nike shoes,$90.99 - $91.04,e8d2552f-3ca5-4d15-9ca7-86855025b183
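When a query mixes several vectors, a length mismatch between a query vector and its target field is an easy mistake to make. Below is a minimal sketch of a helper that assembles the `vector_search_query` list in the format used above and checks each vector against the dimension recorded in the schema. The function `build_vector_query` is hypothetical and not part of the SDK, the default weight of 1.0 is an assumption of this sketch, and the zero vectors stand in for real CLIP encodings.

```python
def build_vector_query(schema, vectors_by_field, weights=None):
    """Return a list of {"vector", "field", "weight"} dicts after checking dimensions."""
    weights = weights or {}
    query = []
    for field, vector in vectors_by_field.items():
        field_type = schema.get(field)
        # Vector fields appear in the schema as {"vector": <dimension>}.
        if not (isinstance(field_type, dict) and "vector" in field_type):
            raise KeyError(f"{field!r} is not a vector field in the schema")
        expected = field_type["vector"]
        if len(vector) != expected:
            raise ValueError(
                f"{field!r} expects {expected} dimensions, got {len(vector)}"
            )
        query.append(
            {
                "vector": list(vector),
                "field": field,
                # Assumption: fall back to 1.0 when no weight is given.
                "weight": weights.get(field, 1.0),
            }
        )
    return query


# Only the vector entries of the schema shown earlier in this guide.
schema = {
    "product_image_clip_vector_": {"vector": 512},
    "product_title_clip_vector_": {"vector": 512},
}

# Placeholder vectors; in the guide these come from encode_text / encode_image.
placeholder_text_vector = [0.0] * 512
placeholder_image_vector = [0.0] * 512

vector_search_query = build_vector_query(
    schema,
    {
        "product_title_clip_vector_": placeholder_text_vector,
        "product_image_clip_vector_": placeholder_image_vector,
    },
    weights={"product_title_clip_vector_": 0.2, "product_image_clip_vector_": 0.8},
)
print([(q["field"], q["weight"]) for q in vector_search_query])
```

The resulting list has the same shape as the `vector_search_query` argument passed in the cell above, so a check like this can run before the request is sent.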
