diff --git a/docs.json b/docs.json
index d30aa8ba2..092f59f5e 100644
--- a/docs.json
+++ b/docs.json
@@ -167,6 +167,7 @@
               "learn/ai_powered_search/getting_started_with_ai_search",
               "learn/ai_powered_search/configure_rest_embedder",
               "learn/ai_powered_search/document_template_best_practices",
+              "learn/ai_powered_search/image_search_with_multimodal_embeddings",
               "learn/ai_powered_search/image_search_with_user_provided_embeddings",
               "learn/ai_powered_search/search_with_user_provided_embeddings",
               "learn/ai_powered_search/retrieve_related_search_results",
diff --git a/learn/ai_powered_search/image_search_with_multimodal_embeddings.mdx b/learn/ai_powered_search/image_search_with_multimodal_embeddings.mdx
new file mode 100644
index 000000000..809fa2dd2
--- /dev/null
+++ b/learn/ai_powered_search/image_search_with_multimodal_embeddings.mdx
@@ -0,0 +1,229 @@
+---
+title: Image search with multimodal embeddings
+description: This article shows you the main steps for performing multimodal text-to-image searches
+---
+
+This guide shows the main steps to search through a database of images using Meilisearch's experimental multimodal embeddings.
+
+## Requirements
+
+- A database of images
+- A Meilisearch project
+- Access to a multimodal embedding provider (for example, [VoyageAI multimodal embeddings](https://docs.voyageai.com/reference/multimodal-embeddings-api))
+
+## Enable multimodal embeddings
+
+First, enable the `multimodal` experimental feature:
+
+```sh
+curl \
+  -X PATCH 'MEILISEARCH_URL/experimental-features/' \
+  -H 'Content-Type: application/json' \
+  --data-binary '{
+    "multimodal": true
+  }'
+```
+
+You may also enable multimodal in your Meilisearch Cloud project's general settings, under "Experimental features".
+
+## Configure a multimodal embedder
+
+Much like other embedders, multimodal embedders must set their `source` to `rest` and explicitly declare their `url`. Depending on your chosen provider, you may also have to specify `apiKey`.
+
+All multimodal embedders must contain an `indexingFragments` field and a `searchFragments` field. Fragments are sets of embeddings built out of specific parts of document data.
+
+Fragments must follow the structure defined by the REST API of your chosen provider.
+
+### `indexingFragments`
+
+Use `indexingFragments` to tell Meilisearch how to send document data to the provider's API when generating document embeddings.
+
+For example, when using VoyageAI's multimodal model, an indexing fragment might look like this:
+
+```json
+"indexingFragments": {
+  "TEXTUAL_FRAGMENT_NAME": {
+    "value": {
+      "content": [
+        {
+          "type": "text",
+          "text": "A document named {{doc.title}} described as {{doc.description}}"
+        }
+      ]
+    }
+  },
+  "IMAGE_FRAGMENT_NAME": {
+    "value": {
+      "content": [
+        {
+          "type": "image_url",
+          "image_url": "{{doc.poster_url}}"
+        }
+      ]
+    }
+  }
+}
+```
+
+The example above asks Meilisearch to create two sets of embeddings during indexing: one for the textual description of an image, and another for the actual image.
+
+Any JSON string value appearing in a fragment is handled as a Liquid template, where you interpolate document data present in `doc`. In `IMAGE_FRAGMENT_NAME`, `image_url` renders to the plain URL string stored in the document field `poster_url`. In `TEXTUAL_FRAGMENT_NAME`, `text` is a longer string that gives context to two document fields, `title` and `description`.
+
+### `searchFragments`
+
+Use `searchFragments` to tell Meilisearch how to send search query data to the chosen provider's REST API when converting it into embeddings:
+
+```json
+"searchFragments": {
+  "USER_TEXT_FRAGMENT": {
+    "value": {
+      "content": [
+        {
+          "type": "text",
+          "text": "{{q}}"
+        }
+      ]
+    }
+  },
+  "USER_SUBMITTED_IMAGE_FRAGMENT": {
+    "value": {
+      "content": [
+        {
+          "type": "image_base64",
+          "image_base64": "data:{{media.image.mime}};base64,{{media.image.data}}"
+        }
+      ]
+    }
+  }
+}
+```
+
+In this example, two modes of search are configured:
+
+1. A textual search based on the `q` parameter, which will be embedded as text
+2. An image search based on a [data URL](https://developer.mozilla.org/en-US/docs/Web/URI/Reference/Schemes/data) rebuilt from the `image.mime` and `image.data` fields of the query's `media` parameter
+
+Search fragments have access to data present in the query parameters `media` and `q`.
+
+Each semantic search query using this embedder should match exactly one of its search fragments, so each fragment should have at least one disambiguating field.
+
+### Complete embedder configuration
+
+Your embedder should look similar to this example with all fragments and embedding provider data:
+
+```sh
+curl \
+  -X PATCH 'MEILISEARCH_URL/indexes/INDEX_NAME/settings' \
+  -H 'Content-Type: application/json' \
+  --data-binary '{
+    "embedders": {
+      "MULTIMODAL_EMBEDDER_NAME": {
+        "source": "rest",
+        "url": "https://api.voyageai.com/v1/multimodal-embeddings",
+        "apiKey": "VOYAGE_API_KEY",
+        "indexingFragments": {
+          "TEXTUAL_FRAGMENT_NAME": {
+            "value": {
+              "content": [
+                {
+                  "type": "text",
+                  "text": "A document named {{doc.title}} described as {{doc.description}}"
+                }
+              ]
+            }
+          },
+          "IMAGE_FRAGMENT_NAME": {
+            "value": {
+              "content": [
+                {
+                  "type": "image_url",
+                  "image_url": "{{doc.poster_url}}"
+                }
+              ]
+            }
+          }
+        },
+        "searchFragments": {
+          "USER_TEXT_FRAGMENT": {
+            "value": {
+              "content": [
+                {
+                  "type": "text",
+                  "text": "{{q}}"
+                }
+              ]
+            }
+          },
+          "USER_SUBMITTED_IMAGE_FRAGMENT": {
+            "value": {
+              "content": [
+                {
+                  "type": "image_base64",
+                  "image_base64": "data:{{media.image.mime}};base64,{{media.image.data}}"
+                }
+              ]
+            }
+          }
+        }
+      }
+    }
+  }'
+```
+
+## Add documents
+
+Once your embedder is configured, you can [add documents to your index](/learn/getting_started/cloud_quick_start) with the [`/documents` endpoint](/reference/api/documents).
+
+During indexing, Meilisearch will automatically generate multimodal embeddings for each document using the configured `indexingFragments`.
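
As a minimal sketch of the step above, the commands below write one sample document to a file and define a helper that posts it to the `/documents` endpoint. The document contents, the `documents.json` file name, and the `add_documents` helper are hypothetical. The field names `title`, `description`, and `poster_url` are used only because the `indexingFragments` templates in this guide reference them.

```sh
# Hypothetical sample data: the field names match the Liquid templates
# used by the indexing fragments in this guide.
cat > documents.json <<'EOF'
[
  {
    "id": 1,
    "title": "Alpine dusk",
    "description": "A mountain sunset with snow",
    "poster_url": "https://example.com/images/alpine-dusk.jpg"
  }
]
EOF

# Hypothetical helper: posts documents.json to the given instance and index.
# Usage: add_documents MEILISEARCH_URL INDEX_NAME
add_documents() {
  curl \
    -X POST "$1/indexes/$2/documents" \
    -H 'Content-Type: application/json' \
    --data-binary @documents.json
}
```

Calling `add_documents 'MEILISEARCH_URL' 'INDEX_NAME'` against a running instance should enqueue an indexing task. A document missing one of the referenced fields would likely produce no embedding for the corresponding fragment.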
+
+## Perform searches
+
+The final step is to perform searches using different types of content.
+
+### Use text to search for images
+
+Use the following search query to retrieve a mix of documents with images matching the description and documents containing the specified keywords:
+
+```sh
+curl -X POST 'MEILISEARCH_URL/indexes/INDEX_NAME/search' \
+  -H 'Content-Type: application/json' \
+  --data-binary '{
+    "q": "a mountain sunset with snow",
+    "hybrid": {
+      "embedder": "MULTIMODAL_EMBEDDER_NAME"
+    }
+  }'
+```
+
+### Use an image to search for images
+
+You can also use an image to search for other, similar images:
+
+```sh
+curl -X POST 'MEILISEARCH_URL/indexes/INDEX_NAME/search' \
+  -H 'Content-Type: application/json' \
+  --data-binary '{
+    "media": {
+      "image": {
+        "mime": "image/jpeg",
+        "data": ""
+      }
+    },
+    "hybrid": {
+      "embedder": "MULTIMODAL_EMBEDDER_NAME"
+    }
+  }'
+```
+
+
+In most cases you will need a graphical interface that allows users to submit their images and converts these images to Base64 format. Creating this is outside the scope of this guide.
+
+
+## Conclusion
+
+With multimodal embedders you can:
+
+1. Configure Meilisearch to embed both images and queries
+2. Add image documents and let Meilisearch automatically generate embeddings
+3. Accept text or image input from users
+4. Run hybrid searches that mix textual input with input from other types of media, or run pure semantic searches using only non-textual input
diff --git a/reference/api/settings.mdx b/reference/api/settings.mdx
index 7b20e6600..78b255969 100644
--- a/reference/api/settings.mdx
+++ b/reference/api/settings.mdx
@@ -2939,6 +2939,14 @@ For example, for [VoyageAI's multimodal embedding route](https://docs.voyageai.c
 Use Liquid templates to interpolate document data into the fragment fields, where `doc` gives you access to all fields within a document.
+
+If a Liquid template appearing inside a fragment cannot be rendered, no embedding will be generated for that fragment and that document. If a document has no indexing fragments, it will not be returned in multimodal searches. In most cases, a fragment is not rendered because a field it references is missing from the document.
+
+This is different from embeddings based on `documentTemplate`, which abort the indexing task if the document template cannot be rendered for a document.
+
+You can check which documents have embeddings for a given fragment using [vector filters](/learn/filtering_and_sorting/filter_expression_reference#vector-filters).
+
+
 `indexingFragments` is optional when using the `rest` source.

 `indexingFragments` is incompatible with all other embedder sources.
@@ -2974,19 +2982,31 @@ curl \
 As with `indexingFragments`, the content of `value` should follow your model's specification.

-Use Liquid templates to interpolate search query data into the fragment fields, where `media` gives you access to all multimodal data received with a query:
+Use Liquid templates to interpolate search query data into the fragment fields, where `{{media.*}}` gives you access to all [multimodal data received with a query](/reference/api/search#media) and `{{q}}` gives you access to the regular textual query:

 ```json
-"SEARCH_FRAGMENT_A": {
-  "value": {
-    "content": [
-      {
-        "type": "image_base64",
-        "image_base64": "data:{{media.image.mime}};base64,{{media.image.data}}"
-      }
-    ]
+{
+  "SEARCH_FRAGMENT_A": {
+    "value": {
+      "content": [
+        {
+          "type": "image_base64",
+          "image_base64": "data:{{media.image.mime}};base64,{{media.image.data}}"
+        }
+      ]
+    }
+  },
+  "SEARCH_FRAGMENT_B": {
+    "value": {
+      "content": [
+        {
+          "type": "text",
+          "text": "{{q}}"
+        }
+      ]
+    }
   }
-},
+}
 ```

 `searchFragments` is optional when using the `rest` source.
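
To illustrate the query side, the sketch below assembles a search body whose `media.image.mime` and `media.image.data` fields are the values the `SEARCH_FRAGMENT_A` template interpolates. The `build_image_query` helper and its arguments are hypothetical, and the sketch assumes a `base64` command-line tool is available.

```sh
# Hypothetical helper: prints a JSON search body for an image query.
# Usage: build_image_query IMAGE_PATH MIME_TYPE EMBEDDER_NAME
build_image_query() {
  # Base64-encode the file and strip the line wrapping some implementations add.
  data=$(base64 < "$1" | tr -d '\n')
  printf '{"media":{"image":{"mime":"%s","data":"%s"}},"hybrid":{"embedder":"%s"}}\n' \
    "$2" "$data" "$3"
}
```

The output can be piped to a search request, for example `build_image_query photo.jpg image/jpeg EMBEDDER_NAME | curl -X POST 'MEILISEARCH_URL/indexes/INDEX_NAME/search' -H 'Content-Type: application/json' --data-binary @-`, where `photo.jpg` and `EMBEDDER_NAME` are placeholders.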
diff --git a/snippets/samples/code_samples_facet_search_3.mdx b/snippets/samples/code_samples_facet_search_3.mdx
index 70c2bcdc9..64fee76d9 100644
--- a/snippets/samples/code_samples_facet_search_3.mdx
+++ b/snippets/samples/code_samples_facet_search_3.mdx
@@ -49,7 +49,8 @@ client.Index("books").FacetSearch(&meilisearch.FacetSearchRequest{
 ```csharp C#
 var query = new SearchFacetsQuery()
 {
-  FacetQuery = "c"
+  FacetQuery = "c",
+  ExhaustiveFacetCount = true
 };
 await client.Index("books").FacetSearchAsync("genres", query);
 ```
diff --git a/snippets/samples/code_samples_geosearch_guide_filter_usage_4.mdx b/snippets/samples/code_samples_geosearch_guide_filter_usage_4.mdx
new file mode 100644
index 000000000..629a30155
--- /dev/null
+++ b/snippets/samples/code_samples_geosearch_guide_filter_usage_4.mdx
@@ -0,0 +1,9 @@
+<CodeGroup>
+
+```bash cURL
+curl \
+  -X POST 'MEILISEARCH_URL/indexes/restaurants/search' \
+  -H 'Content-type:application/json' \
+  --data-binary '{ "filter": "_geoPolygon([45.494181, 9.214024], [45.449484, 9.179175], [45.449486, 9.179177])" }'
+```
+</CodeGroup>
\ No newline at end of file
diff --git a/snippets/samples/code_samples_getting_started_add_documents.mdx b/snippets/samples/code_samples_getting_started_add_documents.mdx
index 9a2a53b20..e3b2e68ad 100644
--- a/snippets/samples/code_samples_getting_started_add_documents.mdx
+++ b/snippets/samples/code_samples_getting_started_add_documents.mdx
@@ -78,14 +78,14 @@ $client->index('movies')->addDocuments($movies);
 // <dependency>
 //   <groupId>com.meilisearch.sdk</groupId>
 //   <artifactId>meilisearch-java</artifactId>
-//   <version>0.15.0</version>
+//   <version>0.16.1</version>
 //   <type>pom</type>
 // </dependency>

 // For Gradle
 // Add the following line to the `dependencies` section of your `build.gradle`:
 //
-// implementation 'com.meilisearch.sdk:meilisearch-java:0.15.0'
+// implementation 'com.meilisearch.sdk:meilisearch-java:0.16.1'

 // In your .java file:
 import com.meilisearch.sdk;
@@ -192,7 +192,7 @@ namespace Meilisearch_demo
 ```text Rust
 // In your .toml file:
   [dependencies]
-  meilisearch-sdk = "0.29.1"
+  meilisearch-sdk = "0.30.0"
   # futures: because we want to block on futures
   futures = "0.3"
   # serde: required if you are going to use documents
diff --git a/snippets/samples/code_samples_ranking_score_threshold.mdx b/snippets/samples/code_samples_ranking_score_threshold.mdx
new file mode 100644
index 000000000..afa4dcc32
--- /dev/null
+++ b/snippets/samples/code_samples_ranking_score_threshold.mdx
@@ -0,0 +1,8 @@
+<CodeGroup>
+
+```dart Dart
+await client
+    .index('INDEX_NAME')
+    .search('badman', SearchQuery(rankingScoreThreshold: 0.2));
+```
+</CodeGroup>
\ No newline at end of file
diff --git a/snippets/samples/code_samples_rename_an_index_1.mdx b/snippets/samples/code_samples_rename_an_index_1.mdx
index c47acef51..cb2efca97 100644
--- a/snippets/samples/code_samples_rename_an_index_1.mdx
+++ b/snippets/samples/code_samples_rename_an_index_1.mdx
@@ -6,4 +6,11 @@ curl \
   -H 'Content-Type: application/json' \
   --data-binary '{ "uid": "INDEX_B" }'
 ```
+
+```rust Rust
+curl \
+  -X PATCH 'MEILISEARCH_URL/indexes/INDEX_A' \
+  -H 'Content-Type: application/json' \
+  --data-binary '{ "uid": "INDEX_B" }'
+```
 </CodeGroup>
\ No newline at end of file