<!-- TABS -->
# Multimodal vector search

The first step in any SuperDuperDB application is to connect SuperDuperDB to your databackend:

<!-- TABS -->
## Connect to SuperDuperDB

In [None]:
# <tab: MongoDB>
from pinnacledb import pinnacle

db = pinnacle('mongodb://localhost:27017/documents')

In [None]:
# <tab: SQLite>
from pinnacledb import pinnacle

db = pinnacle('sqlite://my_db.db')
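
The two tabs above differ only in the connection URI. As a minimal sketch, the URI could be read from an environment variable so that the same notebook runs against either backend. The variable name `DATA_BACKEND_URI` and the helper `backend_scheme` are hypothetical and exist only for this illustration; they are not part of the library.

```python
import os
from urllib.parse import urlsplit


def backend_scheme(uri):
    """Return the scheme part of a connection URI, e.g. 'mongodb' or 'sqlite'."""
    return urlsplit(uri).scheme


# DATA_BACKEND_URI is a hypothetical variable name used only in this sketch.
# The fallback is the MongoDB URI from the cell above.
uri = os.environ.get("DATA_BACKEND_URI", "mongodb://localhost:27017/documents")
print(f"Connecting to a '{backend_scheme(uri)}' databackend")
```

The resulting `uri` can then be passed to `pinnacle(uri)` exactly as in the cells above.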

Once connected, you are ready to define the datatype(s) you want to search.

<!-- TABS -->
## Create datatype

In [None]:
# <tab: Audio>
...

In [None]:
# <tab: Video>
...

<!-- TABS -->
## Insert data

To insert data, we first need to create a `Schema` for encoding our special `DataType` column(s) in the databackend.

Here's some sample data to work with:

In [None]:
# <tab: Text>
!curl -O https://jupyter-sessions.s3.us-east-2.amazonaws.com/text.json

import json
with open('text.json') as f:
    data = json.load(f)

In [None]:
# <tab: Images>
!curl -O https://jupyter-sessions.s3.us-east-2.amazonaws.com/images.zip
!unzip images.zip

import os
# The URI prefix must match the unzipped directory name ('images')
data = [{'image': f'file://images/{file}'} for file in os.listdir('./images')]

In [None]:
# <tab: Audio>
!curl -O https://jupyter-sessions.s3.us-east-2.amazonaws.com/audio.zip
!unzip audio.zip

import os
data = [{'audio': f'file://audio/{file}'} for file in os.listdir('./audio')]
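
The cells above build relative `file://` URIs from a directory listing. If relative URIs do not resolve in your setup, one option is to build absolute URIs with `pathlib`. The sketch below is an illustration only: the helper name `list_file_uris` and the placeholder file names are made up here, and whether absolute URIs are needed depends on your environment.

```python
import tempfile
from pathlib import Path


def list_file_uris(directory, key):
    """Return one {key: absolute file URI} record per regular file in `directory`."""
    root = Path(directory).resolve()  # as_uri() requires an absolute path
    return [
        {key: path.as_uri()}
        for path in sorted(root.iterdir())  # sorted for a reproducible order
        if path.is_file()
    ]


# Demonstration on a throwaway directory with placeholder file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.wav", "a.wav"):
        (Path(tmp) / name).touch()
    records = list_file_uris(tmp, "audio")
    print(records)
```

With the downloaded archive, the equivalent call would be `list_file_uris('./audio', 'audio')`.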

The next code block is only needed if you're working with a custom `DataType`:

In [None]:
from pinnacledb import Schema, Document

# `dt` is the `DataType` instance defined in the "Create datatype" step
schema = Schema(
    'my_schema',
    fields={
        'my_key': dt
    }
)

data = [
    Document({'my_key': item}) for item in data
]

In [None]:
# <tab: MongoDB>
from pinnacledb.backends.mongodb import Collection

collection = Collection('documents')

db.execute(collection.insert_many(data))

In [None]:
# <tab: SQL>
from pinnacledb.backends.ibis import Table

table = Table(
    'my_table',
    schema=schema,
)

db.add(table)
db.execute(table.insert(data))

<!-- TABS -->
## Build multimodal embedding models

In [None]:
# <tab: Text>

...

In [None]:
# <tab: Image>

...

In [None]:
# <tab: Text-2-Image>

...

<!-- TABS -->
## Perform a vector search

To build the search target, the next cell uses two names:

- `item` is the item to encode as the search target
- `dt` is the `DataType` instance to apply to it

In [None]:
from pinnacledb import Document

item = Document({'my_key': dt(item)})

Once we have this search target, we can execute a search as follows:

In [None]:
# <tab: MongoDB>
from pinnacledb.backends.mongodb import Collection

collection = Collection('documents')

select = collection.find().like(item)

In [None]:
# <tab: SQL>

# Table was created earlier, before preparing vector-search
table = db.load('table', 'my_table')

select = table.like(item)

In [None]:
results = db.execute(select)
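
Conceptually, a vector search ranks stored embeddings by their similarity to the embedding of the search target. The sketch below illustrates that idea with cosine similarity over a few made-up two-dimensional vectors. It is not the library's implementation: the metric and index that `like` actually uses are not specified in this document, and the names `cosine_similarity`, `top_k`, and `toy_vectors` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k(query, vectors, k=2):
    """Return the k (key, score) pairs most similar to `query`, best first."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up embeddings, for illustration only.
toy_vectors = {"doc_a": [1.0, 0.0], "doc_b": [0.0, 1.0], "doc_c": [1.0, 1.0]}
ranked = top_k([1.0, 0.1], toy_vectors, k=2)
print(ranked)
```

Here the query points almost along the first axis, so `doc_a` ranks first and `doc_c` second, while `doc_b` is dropped.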