# Getting Started with Vantage: More Like This Search

Welcome to the More Like This Search part of our [Getting Started with Vantage](https://github.com/VantageDiscovery/vantage-tutorials/tree/main/examples/sdk/python/notebooks/getting_started) series.

This notebook will demonstrate the "more like this" search capabilities provided by the Vantage SDK and guide you on how to use them effectively.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/VantageDiscovery/vantage-tutorials/blob/main/examples/sdk/python/notebooks/getting_started/search_api/more_like_this_search.ipynb)

### ✅ Installation

The first step involves installing the [Vantage](https://pypi.org/project/vantage-sdk/) package.

In [None]:
! pip install vantage-sdk -qU

As usual, let's import the necessary libraries.

In this example we only need the `os` module, which we use to read our environment variables:

In [1]:
import os

### ✅ Initialization

In this example, we will authenticate using a Vantage API key.
For more details on initializing the Vantage client, see the [notebook](../initializing_the_client.ipynb) that covers that topic.

Please update the following two cells with the appropriate values.

In [2]:
ACCOUNT_ID = "YOUR_ACCOUNT_ID"
API_HOST = "https://api.dev-a.dev.vantagediscovery.com"

In [None]:
%env VANTAGE_API_KEY=VANTAGE_API_KEY

In [4]:
from vantage_sdk import VantageClient

vantage_instance = VantageClient.using_vantage_api_key(
    vantage_api_key=os.environ["VANTAGE_API_KEY"],
    account_id=ACCOUNT_ID,
    api_host=API_HOST,
)

## ✅ More Like This Search

To perform a More Like This Search, we first create a sample collection and upload some sample data to it. We will search over this data in the steps that follow.

In this example, we are creating a User-provided embeddings collection because it's easier to manually evaluate search results and embedding vector similarities.

In [35]:
COLLECTION_ID = "mlthis-search-upe-collection"
EMBEDDINGS_DIMENSION = 6

collection = vantage_instance.create_collection(
    collection_id=COLLECTION_ID,
    embeddings_dimension=EMBEDDINGS_DIMENSION,
    user_provided_embeddings=True,
)

In [37]:
sample_documents = [
    {"id": "first_doc", "text": "First Document", "embeddings": [0.8324, 0.2123, 0.1818, 0.1834, 0.3042, 0.5248]},
    {"id": "second_doc", "text": "Second Document", "embeddings": [0.0581, 0.8662, 0.6011, 0.7081, 0.0206, 0.9699]},
    {"id": "third_doc", "text": "Third Document", "embeddings": [0.3745, 0.9507, 0.7320, 0.5987, 0.1560, 0.1560]},
    {"id": "fourth_doc", "text": "Fourth Document", "embeddings": [0.4319, 0.2912, 0.6119, 0.1395, 0.2921, 0.3664]},
    {"id": "query_doc", "text": "Query Document", "embeddings": [0.3892, 0.9485, 0.7327, 0.5844, 0.1506, 0.1571]},
]
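
Since the collection is created with a fixed `EMBEDDINGS_DIMENSION`, each uploaded vector is expected to have that many values. A minimal local sanity check could look like the sketch below. The helper `find_bad_embeddings` and the two sample records are hypothetical and are not part of `vantage-sdk`:

```python
def find_bad_embeddings(documents, expected_dimension):
    """Return the ids of documents whose embedding length is not the expected one."""
    return [
        doc["id"]
        for doc in documents
        if len(doc.get("embeddings", [])) != expected_dimension
    ]


# Illustrative records only: one valid vector and one that is too short.
docs = [
    {"id": "ok_doc", "text": "OK", "embeddings": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]},
    {"id": "short_doc", "text": "Too short", "embeddings": [0.1, 0.2, 0.3]},
]

bad_ids = find_bad_embeddings(docs, expected_dimension=6)
print(bad_ids)
```

Running the same check against `sample_documents` before uploading is a cheap way to catch a malformed vector early.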

In [38]:
import json

DOCUMENTS_JSONL = "\n".join(map(json.dumps, sample_documents))

In [39]:
vantage_instance.upload_documents_from_jsonl(
    collection_id=COLLECTION_ID,
    documents=DOCUMENTS_JSONL,
)
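
The `DOCUMENTS_JSONL` string passed above is in JSONL format, meaning one JSON object per line. The following sketch uses only the standard library and two made-up documents to show the shape of such a string and to confirm that it parses back into the original records:

```python
import json

# Illustrative records only; the real notebook uses six-dimensional vectors.
docs = [
    {"id": "a", "text": "A", "embeddings": [0.1, 0.2]},
    {"id": "b", "text": "B", "embeddings": [0.3, 0.4]},
]

# Serialize each document to JSON and join them with newlines.
jsonl = "\n".join(json.dumps(doc) for doc in docs)
print(jsonl)

# Parse each line back to verify nothing was lost in the conversion.
round_tripped = [json.loads(line) for line in jsonl.splitlines()]
```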

To perform a More Like This Search, we need to provide the ID of a query document. In our case, `QUERY_DOCUMENT_ID` refers to our Query Document, whose embedding vector is similar to that of our Third Document.

In [41]:
QUERY_DOCUMENT_ID = "query_doc"

response = vantage_instance.more_like_this_search(
    collection_id=COLLECTION_ID,
    document_id=QUERY_DOCUMENT_ID,
)

In [42]:
for res in response.results:
    print(res)

id='query_doc' score=1.0
id='third_doc' score=0.9999440908432007
id='second_doc' score=0.9122505187988281


The query document itself comes back first with a score of 1.0, since it is being compared with itself. The closest other match is the third document, with a score of almost 1. The second document is also somewhat similar, but its score is clearly lower than the third document's.
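
To check the ranking by hand, we can compute cosine similarity between the query vector and the other sample vectors locally. This is a sketch in plain Python, not the service's actual scoring code. The `cosine_similarity` helper is our own, and the choice of cosine as the metric is an assumption:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Same vectors as in sample_documents, minus the query document itself.
embeddings = {
    "first_doc": [0.8324, 0.2123, 0.1818, 0.1834, 0.3042, 0.5248],
    "second_doc": [0.0581, 0.8662, 0.6011, 0.7081, 0.0206, 0.9699],
    "third_doc": [0.3745, 0.9507, 0.7320, 0.5987, 0.1560, 0.1560],
    "fourth_doc": [0.4319, 0.2912, 0.6119, 0.1395, 0.2921, 0.3664],
}
query = [0.3892, 0.9485, 0.7327, 0.5844, 0.1506, 0.1571]

similarities = {
    doc_id: cosine_similarity(query, vector)
    for doc_id, vector in embeddings.items()
}

# Sort document ids from most to least similar to the query.
ranking = sorted(similarities, key=similarities.get, reverse=True)
for doc_id in ranking:
    print(doc_id, round(similarities[doc_id], 4))
```

The local ranking puts `third_doc` first and `second_doc` second, in line with the search results. For these vectors, `(1 + cosine) / 2` also reproduces the reported scores for both documents to about four decimal places. This suggests, but does not confirm, that the service rescales cosine similarity into the range 0 to 1.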

## 📌 Next Steps

You are now familiar with More Like This Search in Vantage!

You can take a look at other notebooks from our [Getting Started with Vantage](https://github.com/VantageDiscovery/vantage-tutorials/tree/main/examples/sdk/python/notebooks/getting_started) series or continue using Vantage on your own.

If you need some ideas, check our [Tutorials](https://docs.vantagediscovery.com/docs/tutorials), where you can find inspiration and best practices for using Vantage.

Happy discovering! 🔎