2 changes: 2 additions & 0 deletions explore-analyze/elastic-inference/eis-supported-models.md
@@ -76,7 +76,9 @@
| Jina Embeddings v5 Small {applies_to}`stack: ga 9.3+` | 6,000 | 6,000,000 | 600,000 | Limits are applied to both requests per minute and tokens per minute, whichever limit is reached first. |
| Jina Embeddings v3 {applies_to}`stack: ga 9.3+` | 6,000 | 6,000,000 | 600,000 | Limits are applied to both requests per minute and tokens per minute, whichever limit is reached first. |
| Jina Embeddings v5 (Small) {applies_to}`stack: ga 9.3+` | 6,000 | 6,000,000 | 600,000 | Limits are applied to both requests per minute and tokens per minute, whichever limit is reached first. |
| Jina Embeddings v5 (Nano) {applies_to}`stack: ga 9.3+` | 6,000 | 6,000,000 | 600,000 | Limits are applied to both requests per minute and tokens per minute, whichever limit is reached first. |
| Jina Embeddings v5 Omni Nano {applies_to}`stack: ga 9.5+` | 6,000 | 6,000,000 | 600,000 | Limits are applied to both requests per minute and tokens per minute, whichever limit is reached first. Audio, video, and PDF inputs require {{stack}} 9.5+ (transport version `inference_api_audio_video_pdf_support`). |
| Jina Embeddings v5 Omni Small {applies_to}`stack: ga 9.5+` | 6,000 | 6,000,000 | 600,000 | Limits are applied to both requests per minute and tokens per minute, whichever limit is reached first. Audio, video, and PDF inputs require {{stack}} 9.5+ (transport version `inference_api_audio_video_pdf_support`). |
| Jina Reranker v2 {applies_to}`stack: ga 9.3+` | 600 | - | 6,000,000 | Limits are applied to both requests per minute and tokens per minute, whichever limit is reached first. |
| Jina Reranker v3 {applies_to}`stack: ga 9.3+` | 600 | - | 6,000,000 | Limits are applied to both requests per minute and tokens per minute, whichever limit is reached first. |

2 changes: 2 additions & 0 deletions explore-analyze/elastic-inference/embedding-models.csv
@@ -4,6 +4,8 @@ Elastic,ELSER v2,elser_model_2,[ELSER docs](https://www.elastic.co/docs/explore-
Jina,Embeddings v3,jina-embeddings-v3,[jina-embeddings-v3](https://jina.ai/models/jina-embeddings-v3/),[Elastic Terms](https://www.elastic.co/legal/terms-of-use),Text,Embedding,,0,No,"US, SG, EU",Generally Available,9.3
Jina,Embeddings v5 Text Nano,jina-embeddings-v5-text-nano,[jina-embeddings-v5-text-nano](https://huggingface.co/jinaai/jina-embeddings-v5-text-nano),[Elastic Terms](https://www.elastic.co/legal/terms-of-use),Text,Embedding,,0,No,"US, SG, EU",Generally Available,9.3
Jina,Embeddings v5 Text Small,jina-embeddings-v5-text-small,[jina-embeddings-v5-text-small](https://huggingface.co/jinaai/jina-embeddings-v5-text-small),[Elastic Terms](https://www.elastic.co/legal/terms-of-use),Text,Embedding,,0,No,"US, SG, EU",Generally Available,9.3
Jina,Embeddings v5 Omni Nano,jina-embeddings-v5-omni-nano,[jina-embeddings-v5-omni-nano](https://jina.ai/models/jina-embeddings-v5-omni-nano/),[Elastic Terms](https://www.elastic.co/legal/terms-of-use),"Text, Image, Audio, Video, PDF",Embedding,,0,No,"US, SG, EU",Generally Available,9.5
Jina,Embeddings v5 Omni Small,jina-embeddings-v5-omni-small,[jina-embeddings-v5-omni-small](https://jina.ai/models/jina-embeddings-v5-omni-small/),[Elastic Terms](https://www.elastic.co/legal/terms-of-use),"Text, Image, Audio, Video, PDF",Embedding,,0,No,"US, SG, EU",Generally Available,9.5
Google,Gemini Embedding v1,google-gemini-embedding-001,[Gemini Embedding 001](https://deepmind.google/research/publications/157741/),[Google terms](https://cloud.google.com/terms),Text,Text,,55 days,No,US,Generally Available,9.3
Microsoft,Multilingual E5 Large,microsoft-multilingual-e5-large,[Multilingual E5 Large System Card](https://huggingface.co/intfloat/e5-large-v2),DeepInfra terms,Text,Embedding,,0,No,US,Generally Available,9.3
OpenAI,Text Embedding 003 Large,openai-text-embedding-3-large,[Text Embedding 003 Large](https://platform.openai.com/docs/models/text-embedding-3-large),[OpenAI terms](https://openai.com/en-GB/policies/row-terms-of-use/),Text,Text,,Unknown,No,US,Generally Available,9.3
169 changes: 165 additions & 4 deletions explore-analyze/machine-learning/nlp/ml-nlp-jina.md
@@ -15,13 +15,19 @@
Jina models are currently available only through [Elastic {{infer-cap}} Service (EIS)](/explore-analyze/elastic-inference/eis.md) or [external {{infer}}](docs-content://explore-analyze/elastic-inference/external.md) providers. Since these models rely on external connectivity, they cannot currently be deployed on [{{ml}} nodes](/deploy-manage/distributed-architecture/clusters-nodes-shards/node-roles.md#ml-node-role) and are not compatible with fully air-gapped environments.
:::

:::{tip}
If you might need to search images, audio, video, or PDFs alongside text, start with a `jina-embeddings-v5-omni-*` model. The v5 omni models share the same text embedding space as their matching v5 text models, so existing `v5-text-*` vectors can be compared with text vectors from the corresponding omni model without reindexing.

:::

Currently, the following models are available as built-in models:

**Embedding models**

* [`jina-embeddings-v5-text-small`](#jina-embeddings-v5-text-small)
* [`jina-embeddings-v5-text-nano`](#jina-embeddings-v5-text-nano)
* [`jina-embeddings-v3`](#jina-embeddings-v3)
* [`jina-embeddings-v5-omni-small`](#jina-embeddings-v5-omni-small) — multimodal (text, image, audio, video, PDF)
* [`jina-embeddings-v5-omni-nano`](#jina-embeddings-v5-omni-nano) — multimodal (text, image, audio, video, PDF)
* [`jina-embeddings-v5-text-small`](#jina-embeddings-v5-text-small) — text-only
* [`jina-embeddings-v5-text-nano`](#jina-embeddings-v5-text-nano) — text-only
* [`jina-embeddings-v3`](#jina-embeddings-v3) — text-only

**Rerankers**

@@ -32,7 +38,161 @@

Embedding models convert text into vector embeddings, which are fixed-length numerical representations that capture semantic meaning.
Texts with similar meaning are mapped to nearby points in vector space, so you can retrieve relevant documents with vector similarity search.
When you send text to an EIS {{infer}} endpoint that uses an embedding model, the model returns a vector of floating-point numbers (for example, 1024 values). {{es}} stores these vectors in [`dense_vector`](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md) fields or through the [`semantic_text`](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md) filed and uses vector similarity search to retrieve the most relevant documents for a given query. Unlike [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md), which expands text into sparse token-weight vectors, these models produce compact dense vectors that are well suited for multilingual and cross-domain use cases.
When you send text to an EIS {{infer}} endpoint that uses an embedding model, the model returns a vector of floating-point numbers (for example, 1024 values). {{es}} stores these vectors in [`dense_vector`](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md) fields or through the [`semantic_text`](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md) field and uses vector similarity search to retrieve the most relevant documents for a given query. Unlike [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md), which expands text into sparse token-weight vectors, these models produce compact dense vectors that are well suited for multilingual and cross-domain use cases.

### Jina v5 omni embedding models [jina-embeddings-v5-omni]

The `jina-embeddings-v5-omni-*` models accept **text, image, audio, video, and PDF** inputs and place all supported input types in a shared vector space. Use them when you need cross-modal retrieval, such as querying a text index with an image or finding videos from a text query.

The v5 omni models are available through Elastic {{infer-cap}} Service (EIS), so no {{ml}} node scaling or model deployment is required.

#### `jina-embeddings-v5-omni-small` [jina-embeddings-v5-omni-small]

```{applies_to}
stack: ga 9.5
serverless: ga
```

[`jina-embeddings-v5-omni-small`](https://www.elastic.co/search-labs/blog/jina-embeddings-v5-omni-all-media-one-index) is the recommended Jina embedding model for deployments that need higher-quality mixed-media search. It produces 1024-dimension embeddings by default, supports a 32,768-token input context window, and uses the same text embedding space as [`jina-embeddings-v5-text-small`](#jina-embeddings-v5-text-small).

For more information about the model, refer to the [Elastic blog post](https://www.elastic.co/search-labs/blog/jina-embeddings-v5-omni-all-media-one-index) or the [model page](https://jina.ai/models/jina-embeddings-v5-omni-small/).

#### `jina-embeddings-v5-omni-nano` [jina-embeddings-v5-omni-nano]

```{applies_to}
stack: ga 9.5
serverless: ga
```

[`jina-embeddings-v5-omni-nano`](https://www.elastic.co/search-labs/blog/jina-embeddings-v5-omni-all-media-one-index) is the compact, lower-cost member of the Jina v5 omni family. It produces 768-dimension embeddings by default, supports a 32,768-token input context window, and uses the same text embedding space as [`jina-embeddings-v5-text-nano`](#jina-embeddings-v5-text-nano).

For more information about the model, refer to the [Elastic blog post](https://www.elastic.co/search-labs/blog/jina-embeddings-v5-omni-all-media-one-index) or the [model page](https://jina.ai/models/jina-embeddings-v5-omni-nano/).

#### Requirements [jina-embeddings-v5-omni-req]

To use a v5 omni model, you must have the [appropriate subscription](https://www.elastic.co/subscriptions) level. {{ecloud}} trial accounts cannot use the v5 omni models; start a paid {{ecloud}} deployment or {{serverless-short}} project to access them.

All input types require {{stack}} 9.5 or later.

#### Getting started with v5 omni models through Elastic {{infer-cap}} Service

For text input, the recommended entry point is a `semantic_text` field that references one of the preconfigured v5 omni {{infer}} endpoints. {{es}} provisions the endpoint on first reference.

Create an index with a `semantic_text` field:

```console
PUT multimodal-semantic-index
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text",
        "inference_id": ".jina-embeddings-v5-omni-small"
      }
    }
  }
}
```

Index documents normally. {{es}} generates embeddings through the {{infer}} endpoint:

```console
POST multimodal-semantic-index/_doc
{
  "content": "'Kraft Dinner' is what Canadians call macaroni and cheese when prepared from a kit."
}
```

Query the field with a `semantic` query:

```console
GET multimodal-semantic-index/_search
{
  "query": {
    "semantic": {
      "field": "content",
      "query": "Was bedeutet 'Kraft Dinner' für Kanadier?"
    }
  }
}
```

To use `jina-embeddings-v5-omni-nano`, set `inference_id` to `.jina-embeddings-v5-omni-nano` instead.
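
As a sketch, the equivalent mapping with the nano endpoint might look like the following (the index name is illustrative):

```console
PUT multimodal-semantic-index-nano
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text",
        "inference_id": ".jina-embeddings-v5-omni-nano"
      }
    }
  }
}
```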

To create an explicit {{infer}} endpoint instead of using the preconfigured endpoint, use the `embedding` task type:

```console
PUT _inference/embedding/eis-jina-embeddings-v5-omni-small
{
  "service": "elastic",
  "service_settings": {
    "model_id": "jina-embeddings-v5-omni-small"
  }
}
```
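
As a minimal usage sketch, you can then send text to the endpoint created above. The single-item text form shown here is an assumption based on the structured `input` format described in the next section, and the example text is illustrative:

```console
POST _inference/embedding/eis-jina-embeddings-v5-omni-small
{
  "input": [
    { "content": { "type": "text", "value": "Macaroni and cheese prepared from a boxed kit" } }
  ]
}
```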

#### Multimodal ingestion and querying [jina-embeddings-v5-omni-multimodal]

`semantic_text` ingests text content. To embed image, audio, video, or PDF input, or to issue a cross-modal query against a text index, call the {{infer}} endpoint directly and store or compare the resulting vector against a `dense_vector` field.

The request body is a structured `input` array. Each element holds a `content` object describing one piece of media, and a single request can include up to 16 input items. Media values are base64-encoded data URIs:

```console
POST _inference/embedding/.jina-embeddings-v5-omni-small
{
  "input": [
    { "content": { "type": "image", "format": "base64", "value": "data:image/png;base64,iVBORw0KGgo..." } },
    { "content": { "type": "audio", "format": "base64", "value": "data:audio/wav;base64,UklGRiQAAAB..." } },
    { "content": { "type": "video", "format": "base64", "value": "data:video/mp4;base64,AAAAIGZ0eXA..." } },
    { "content": { "type": "pdf", "format": "base64", "value": "data:application/pdf;base64,JVBE..." } }
  ]
}
```

To combine several media items into a single embedding, pass an array of content fields under one `input` element:

```console
POST _inference/embedding/.jina-embeddings-v5-omni-small
{
  "input": [
    {
      "content": [
        { "type": "text", "value": "A description of the scene" },
        { "type": "image", "format": "base64", "value": "data:image/png;base64,iVBORw0KGgo..." },
        { "type": "audio", "format": "base64", "value": "data:audio/wav;base64,UklGRiQAAAB..." }
      ]
    }
  ]
}
```

The response is shaped `{"embeddings": [{"embedding": [...]}, ...]}`. The array length matches the number of input items, except for PDF input, which produces one embedding per page.
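
To store a returned vector for later retrieval against a `dense_vector` field, one possible sketch is the following. It assumes the small model's default 1024 dimensions and cosine similarity; the index name, field name, and truncated query vector are illustrative:

```console
PUT multimodal-vector-index
{
  "mappings": {
    "properties": {
      "media_embedding": {
        "type": "dense_vector",
        "dims": 1024,
        "index": true,
        "similarity": "cosine"
      }
    }
  }
}
```

Index each embedding returned by the `_inference` call into `media_embedding`, then query it with a vector produced from any supported input type:

```console
GET multimodal-vector-index/_search
{
  "knn": {
    "field": "media_embedding",
    "query_vector": [0.013, -0.042, ...],
    "k": 10,
    "num_candidates": 100
  }
}
```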

#### Upgrading from `jina-embeddings-v5-text-*` [jina-embeddings-v5-omni-migrate]

The v5 omni models share their text embedding space with the matching v5 text models. Existing `dense_vector` data populated by `jina-embeddings-v5-text-*` remains directly comparable to vectors produced from text input by the corresponding omni model, so no reindex is required.

Use the matching pair:

* `jina-embeddings-v5-omni-small` with `jina-embeddings-v5-text-small` (1024 dimensions)
* `jina-embeddings-v5-omni-nano` with `jina-embeddings-v5-text-nano` (768 dimensions)

Do not mix across the small and nano families, because their vector spaces and dimensions differ.

For `semantic_text` mappings, set the `inference_id` to the corresponding `.jina-embeddings-v5-omni-*` endpoint on new indices. Existing indices continue to work unchanged.

For code that calls `_inference` directly, the task type changes from `text_embedding` to `embedding`, and the request body changes from a flat string `input` to an array of `content` objects. See [Multimodal ingestion and querying](#jina-embeddings-v5-omni-multimodal).
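
As a hedged before-and-after sketch (the endpoint IDs assume the preconfigured naming used elsewhere on this page), an existing call such as:

```console
POST _inference/text_embedding/.jina-embeddings-v5-text-small
{
  "input": "A description of the scene"
}
```

becomes:

```console
POST _inference/embedding/.jina-embeddings-v5-omni-small
{
  "input": [
    { "content": { "type": "text", "value": "A description of the scene" } }
  ]
}
```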

For pure text workloads, keep using `v5-text-*` endpoints. Use `v5-omni-*` when mixed media is in scope.

#### Performance considerations [jina-embeddings-v5-omni-performance]

* Use `jina-embeddings-v5-omni-small` when retrieval quality is the main priority. Use `jina-embeddings-v5-omni-nano` when ingestion volume, latency, or cost is the main constraint.
* Each {{infer}} request can contain up to 16 input items.
* Image inputs must be at least 28×28 pixels (784 pixels total).
* PDF inputs return one embedding per page.
* Video is sampled at 32 uniformly spaced frames regardless of clip length. For long videos, segment into shorter clips for finer temporal resolution.
* Although the models support a 32,768-token context window, consider chunking very large text fields to control latency and cost.

### `jina-embeddings-v5-text-small` [jina-embeddings-v5-text-small]

@@ -270,6 +430,7 @@

The following blog posts provide additional background and context:

* [jina-embeddings-v5-omni for text, images, video, and audio](https://www.elastic.co/search-labs/blog/jina-embeddings-v5-omni-all-media-one-index)
* [jina-embeddings-v5-text: Compact state-of-the-art text embeddings for search and intelligent applications](https://www.elastic.co/search-labs/blog/jina-embeddings-v5-text)
* [Jina rerankers bring fast, multilingual reranking to Elastic Inference Service (EIS)](https://www.elastic.co/search-labs/blog/jina-rerankers-elastic-inference-service)
* [jina-embeddings-v3 is now available on Elastic Inference Service](https://www.elastic.co/search-labs/blog/jina-embeddings-v3-elastic-inference-service)