# Reinventing Multi-Modal Search with Anyscale and MongoDB

What we are learning about and building today: https://www.anyscale.com/blog/reinventing-multi-modal-search-with-anyscale-and-mongodb

The following instructions will help you set up your environment.

## Register for Anyscale if needed

If you're attending this class at Ray Summit 2024, then you already have an Anyscale account -- we'll use that one!

If you're trying out this application later or on your own,
* You can register for Anyscale [here](https://console.anyscale.com/register/ha?utm_source=github&utm_medium=github&utm_content=multi-modal-search-anyscale-mongodb).

## Log in to Anyscale

Once you have an account, [log in](https://console.anyscale.com/v2?utm_source=github&utm_medium=github&utm_content=multi-modal-search-anyscale-mongodb) here.

## Get set up with MongoDB

Check out the Mongo Developer Intro Lab at https://mongodb-developer.github.io/intro-lab/

That tutorial -- presented live at Ray Summit 2024 -- covers the following key steps:
* Set up a free MongoDB Atlas account
* Create a free MongoDB cluster
* Configure security to allow public access to your cluster (for demo/class purposes only)
* Create your database user and save the password
* Get the connection string for your MongoDB cluster

## Register or log in to Hugging Face

If you don't have a Hugging Face account, you can register [here](https://huggingface.co/join).

If you already have an account, [log in](https://huggingface.co/login) here.

Visit the [tokens](https://huggingface.co/settings/tokens) page to generate a new API token.

Visit the following model pages and request access to these models:
- [llava-hf/llava-v1.6-mistral-7b-hf](https://huggingface.co/llava-hf/llava-v1.6-mistral-7b-hf)
- [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1)

Once you have access to these models, you can proceed with the next steps.

## Launch a workspace in Anyscale for this project

At Ray Summit 2024, you're probably already running the right workspace. If you're doing this tutorial on your own, choose the Anyscale Ray Summit 2024 template.

## Configure environment variables in your Anyscale Workspace

Under the __Dependencies__ tab in the workspace view, set the MongoDB connection string `DB_CONNECTION_STRING` and the Hugging Face access token `HF_TOKEN` as environment variables.

<img src="https://anyscale-public-materials.s3.us-west-2.amazonaws.com/mongodb-demo/screenshots/workspace-dependencies.png" width="800px" alt="env-vars-setup-workspace">

---

## Test database connection

In [None]:
import pymongo
from pymongo import MongoClient, ASCENDING, DESCENDING
import os
from pymongo.operations import IndexModel, SearchIndexModel

In [None]:
db_name: str = "myntra"
collection_name: str = "myntra-items-offline"

In [None]:
client = MongoClient(os.environ["DB_CONNECTION_STRING"])
db = client[db_name]

*If the `DB_CONNECTION_STRING` env var is not found, you may need to terminate and then restart the workspace.*
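
Before restarting, it can help to confirm which variables are actually missing. The following is a minimal sketch, not part of the original notebook; the helper name `missing_env_vars` is hypothetical, and it only checks the two variables named in the setup steps above.

```python
import os

# The two variables this tutorial asks you to set in the workspace.
REQUIRED_VARS = ("DB_CONNECTION_STRING", "HF_TOKEN")


def missing_env_vars(names=REQUIRED_VARS, env=None):
    """Return the names that are unset or empty in the given mapping.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```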

### Setup collection

Run this code once, after you've created your database, to set up the collection and indexes. Note that it first drops any existing collection with the same name.

In [None]:
db.drop_collection(collection_name)

my_collection = db[collection_name]

my_collection.create_indexes(
    [
        IndexModel([("rating", DESCENDING)]),
        IndexModel([("category", ASCENDING)]),
        IndexModel([("season", ASCENDING)]),
        IndexModel([("color", ASCENDING)]),
    ]
)

In [None]:
fts_model = SearchIndexModel(
    definition={
        "mappings": {
            "dynamic": False,
            "fields": {
                "name": {"type": "string", "analyzer": "lucene.standard",}
            }
        }
    },
    name="lexical_text_search_index",
    type="search"
)

In [None]:
vs_model = SearchIndexModel(
    definition={
        "fields": [
            {
                "numDimensions": 1024,
                "similarity": "cosine",
                "type": "vector",
                "path": "description_embedding",
            },
            {
                "numDimensions": 1024,
                "similarity": "cosine",
                "type": "vector",
                "path": "name_embedding",
            },
            {
                "type": "filter",
                "path": "category",
            },
            {
                "type": "filter",
                "path": "season",
            },
            {
                "type": "filter",
                "path": "color",
            },
            {
                "type": "filter",
                "path": "rating",
            },
            {
                "type": "filter",
                "path": "price",
            },
        ],
    },
    name="vector_search_index",
    type="vectorSearch"
)

In [None]:
my_collection.create_search_indexes(models=[fts_model, vs_model])
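
Search indexes on Atlas are built asynchronously, so the call above can return before the indexes are usable. The sketch below polls until the named indexes report as queryable. It assumes a recent PyMongo version whose collections expose `list_search_indexes()` and that each returned index document carries a `queryable` flag; the helper name `wait_until_queryable` and its timeout defaults are hypothetical.

```python
import time


def wait_until_queryable(collection, index_names, timeout_s=300.0, poll_s=5.0):
    """Poll until every named search index is queryable.

    Returns True once all indexes are ready, or False if the timeout expires.
    `collection` only needs a list_search_indexes() method, so a stub works too.
    """
    deadline = time.monotonic() + timeout_s
    pending = set(index_names)
    while True:
        # Assumption: each index document has "name" and a boolean "queryable".
        ready = {
            ix["name"]
            for ix in collection.list_search_indexes()
            if ix.get("queryable")
        }
        pending -= ready
        if not pending:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


# Example usage in this notebook (requires a live Atlas cluster):
# wait_until_queryable(
#     my_collection, ["lexical_text_search_index", "vector_search_index"]
# )
```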

### Count docs

In [None]:
my_collection.count_documents({})

# Architecture

We split our system into an offline data indexing stage and an online search stage.

The offline data indexing stage processes, embeds, and upserts text and images into a MongoDB database that supports vector search across multiple fields and dimensions. This stage is built by running multi-modal data pipelines at scale on Anyscale's AI compute platform.
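
As a rough illustration of the upsert step, the sketch below checks that a document's embeddings match the 1024 dimensions declared in the vector index and then builds the filter and update for an upsert. This is a sketch under assumptions, not the pipeline's actual code: the helper names are hypothetical, and keying the upsert on `_id` is an assumption.

```python
EMBEDDING_DIM = 1024  # numDimensions in the vector search index definition
EMBEDDING_FIELDS = ("name_embedding", "description_embedding")


def embedding_problems(doc):
    """Return a list of human-readable problems with a document's embeddings."""
    problems = []
    for field in EMBEDDING_FIELDS:
        vec = doc.get(field)
        if vec is None:
            problems.append(f"{field}: missing")
        elif len(vec) != EMBEDDING_DIM:
            problems.append(
                f"{field}: expected {EMBEDDING_DIM} dims, got {len(vec)}"
            )
    return problems


def upsert_spec(doc, key="_id"):
    """Return (filter, update) for an upsert keyed on `key` (an assumption)."""
    body = {k: v for k, v in doc.items() if k != key}
    return {key: doc[key]}, {"$set": body}


# With PyMongo, the pair could be passed to
# pymongo.UpdateOne(filter, update, upsert=True) and sent via bulk_write.
```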

The online search stage performs the necessary search operations by combining legacy text matching with advanced semantic search capabilities offered by MongoDB. This stage is built by running a multi-modal search backend on Anyscale.
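
To make the two kinds of query concrete, here is a minimal sketch of aggregation pipelines that target the `lexical_text_search_index` and `vector_search_index` indexes defined earlier. The function names, default limits, and the example filter value are hypothetical; the actual search backend may build its queries differently.

```python
def lexical_search_pipeline(query_text, limit=10):
    """Build a $search pipeline against the lexical index on the `name` field."""
    return [
        {
            "$search": {
                "index": "lexical_text_search_index",
                "text": {"query": query_text, "path": "name"},
            }
        },
        {"$limit": limit},
        {"$project": {"name": 1, "score": {"$meta": "searchScore"}}},
    ]


def vector_search_pipeline(
    query_vector, path="description_embedding", filters=None,
    limit=10, num_candidates=100,
):
    """Build a $vectorSearch pipeline; `filters` may only use indexed filter fields."""
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least limit")
    stage = {
        "index": "vector_search_index",
        "path": path,
        "queryVector": list(query_vector),
        "numCandidates": num_candidates,
        "limit": limit,
    }
    if filters:
        stage["filter"] = filters
    return [
        {"$vectorSearch": stage},
        {"$project": {"name": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]


# Example (the filter value "Summer" is illustrative only):
# pipeline = vector_search_pipeline(embedding, filters={"season": {"$eq": "Summer"}})
# results = list(my_collection.aggregate(pipeline))
```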

## Multi-Modal Data Pipelines at Scale

### Overview
The data pipelines show how to perform offline batch inference and embedding generation at scale. The pipelines are designed to handle both text and image data by running multi-modal large language model instances.

### Technology Stack

- `ray[data]`
- `vLLM`
- `pymongo`
- `sentence-transformers`

## Multi-Modal Search at Scale

### Overview
The search backend combines legacy lexical text matching with advanced semantic search capabilities, offering a robust hybrid search solution. 
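
One common way to merge a lexical result list with a semantic one is reciprocal rank fusion (RRF). The sketch below is an illustration of that general technique only; this document does not state how the backend combines results, and the function name and the constant `k=60` are assumptions.

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Higher fused scores come first.
    """
    scores = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical ids returned by a lexical query and a vector query.
lexical_hits = ["a", "b", "c"]
vector_hits = ["c", "a", "d"]
fused = reciprocal_rank_fusion([lexical_hits, vector_hits])
print(fused)
```

Documents that appear near the top of both lists rise to the top of the fused ranking, while documents found by only one method are still kept.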

### Technology Stack
- `ray[serve]`
- `gradio`
- `motor`
- `sentence-transformers`

### Empty collection

As you work, experiments, errors, or changes may alter the MongoDB collection. To delete all documents while keeping the collection and its indexes, use the following line.

In [None]:
my_collection.delete_many({})