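From 976e92fe1c7045d59ee5a372e8d7ea7ad4026d52 Mon Sep 17 00:00:00 2001 From: Liam Thompson <32779855+leemthompo@users.noreply.github.com> Date: Wed, 12 Nov 2025 13:03:30 +0100 Subject: [PATCH 1/7] [ON week] Add language client examples to get-started/quickstarts --- solutions/search/get-started/index-basics.md | 1013 ++++++++++++++++- .../search/get-started/semantic-search.md | 500 +++++++- 2 files changed, 1459 insertions(+), 54 deletions(-) diff --git a/solutions/search/get-started/index-basics.md b/solutions/search/get-started/index-basics.md index 1d271869d4..603186254f 100644 --- a/solutions/search/get-started/index-basics.md +++ b/solutions/search/get-started/index-basics.md @@ -7,10 +7,10 @@ applies_to: This quickstart provides a hands-on introduction to the fundamental concepts of {{es}}: [indices, documents, and field type mappings](../../../manage-data/data-store/index-basics.md). You'll learn how to create an index, add documents, work with dynamic and explicit mappings, and perform your first basic searches. -::::{tip} +:::::{tip} The code examples are in [Console](/explore-analyze/query-filter/tools/console.md) syntax by default. You can [convert into other programming languages](/explore-analyze/query-filter/tools/console.md#import-export-console-requests) in the Console UI. -:::: +::::: ## Requirements [getting-started-requirements] @@ -20,25 +20,119 @@ To get started quickly, spin up a cluster [locally in Docker](/deploy-manage/dep ## Add data to {{es}} [getting-started-index-creation] -::::{tip} +:::::{tip} This quickstart uses {{es}} APIs, but there are many other ways to [add data to {{es}}](/solutions/search/ingest-for-search.md). -:::: +::::: You add data to {{es}} as JSON objects called documents. {{es}} stores these documents in searchable indices. -:::::{stepper} -::::{step} Create an index +::::::{stepper} +:::::{step} Create an index Create a new index named `books`: +::::{tab-set} +:group: api-examples + +:::{tab-item} Console +:sync: console ```console PUT /books ``` +::: + +:::{tab-item} curl +:sync: curl +```bash +curl -X PUT "$ELASTICSEARCH_URL/books" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" +``` +::: + +:::{tab-item} Python +:sync: python +```python +import os +from elasticsearch import Elasticsearch + +client = Elasticsearch( + hosts=["$ELASTICSEARCH_URL"], + api_key=os.getenv("ELASTIC_API_KEY"), +) + +resp = client.indices.create( + index="books", +) + +``` +::: + +:::{tab-item} JavaScript +:sync: js +```js +const { Client } = require("@elastic/elasticsearch"); + +const client = new Client({ + nodes: ["$ELASTICSEARCH_URL"], + auth: { + apiKey: process.env["ELASTIC_API_KEY"], + }, +}); + +async function run() { + const response = await client.indices.create({ + index: "books", + }); +} + +run(); +``` +::: + +:::{tab-item} PHP +:sync: php +```php +<?php + +require(__DIR__ . "/vendor/autoload.php"); + +use Elastic\Elasticsearch\ClientBuilder; + +$client = ClientBuilder::create() + ->setHosts(["$ELASTICSEARCH_URL"]) + ->setApiKey(getenv("ELASTIC_API_KEY")) + ->build(); + +$resp = $client->indices()->create([ + "index" => "books", +]); + +``` +::: + +:::{tab-item} Ruby +:sync: ruby +```ruby +require "elasticsearch" + +client = Elasticsearch::Client.new( + host: "$ELASTICSEARCH_URL", + api_key: ENV["ELASTIC_API_KEY"] +) + +response = client.indices.create( + index: "books" +) + +``` +::: + +:::: The following response indicates the index was created successfully. -:::{dropdown} Example response +::::{dropdown} Example response ```console-result { @@ -48,13 +142,18 @@ The following response indicates the index was created successfully. } ``` -::: :::: -::::{step} Add a single document +::::: +:::::{step} Add a single document Use the following request to add a single document to the `books` index. If the index doesn't already exist, this request will automatically create it.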
-:::::{stepper} -::::{step} Create an index +::::::{stepper} +:::::{step} Create an index Create a new index named `books`: +::::{tab-set} +:group: api-examples + +:::{tab-item} Console +:sync: console ```console PUT /books ``` +::: + +:::{tab-item} curl +:sync: curl +```bash +curl -X PUT "$ELASTICSEARCH_URL/books" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" +``` +::: + +:::{tab-item} Python +:sync: python +```python +import os +from elasticsearch import Elasticsearch + +client = Elasticsearch( + hosts=["$ELASTICSEARCH_URL"], + api_key=os.getenv("ELASTIC_API_KEY"), +) + +resp = client.indices.create( + index="books", +) + +``` +::: + +:::{tab-item} JavaScript +:sync: js +```js +const { Client } = require("@elastic/elasticsearch"); + +const client = new Client({ + nodes: ["$ELASTICSEARCH_URL"], + auth: { + apiKey: process.env["ELASTIC_API_KEY"], + }, +}); + +async function run() { + const response = await client.indices.create({ + index: "books", + }); +} + +run(); +``` +::: + +:::{tab-item} PHP +:sync: php +```php +setHosts(["$ELASTICSEARCH_URL"]) + ->setApiKey(getenv("ELASTIC_API_KEY")) + ->build(); + +$resp = $client->indices()->create([ + "index" => "books", +]); + +``` +::: + +:::{tab-item} Ruby +:sync: ruby +```ruby +require "elasticsearch" + +client = Elasticsearch::Client.new( + host: "$ELASTICSEARCH_URL", + api_key: ENV["ELASTIC_API_KEY"] +) + +response = client.indices.create( + index: "books" +) + +``` +::: + +:::: The following response indicates the index was created successfully. -:::{dropdown} Example response +::::{dropdown} Example response ```console-result { @@ -48,13 +142,18 @@ The following response indicates the index was created successfully. } ``` -::: :::: -::::{step} Add a single document +::::: +:::::{step} Add a single document Use the following request to add a single document to the `books` index. If the index doesn't already exist, this request will automatically create it. 
+::::{tab-set} +:group: api-examples + +:::{tab-item} Console +:sync: console ```console POST books/_doc { @@ -64,10 +163,86 @@ POST books/_doc "page_count": 470 } ``` +::: + +:::{tab-item} curl +:sync: curl +```bash +curl -X POST "$ELASTICSEARCH_URL/books/_doc" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{"name":"Snow Crash","author":"Neal Stephenson","release_date":"1992-06-01","page_count":470}' +``` +::: + +:::{tab-item} Python +:sync: python +```python +resp = client.index( + index="books", + document={ + "name": "Snow Crash", + "author": "Neal Stephenson", + "release_date": "1992-06-01", + "page_count": 470 + }, +) + +``` +::: + +:::{tab-item} JavaScript +:sync: js +```js +const response = await client.index({ + index: "books", + document: { + name: "Snow Crash", + author: "Neal Stephenson", + release_date: "1992-06-01", + page_count: 470, + }, +}); +``` +::: + +:::{tab-item} PHP +:sync: php +```php +$resp = $client->index([ + "index" => "books", + "body" => [ + "name" => "Snow Crash", + "author" => "Neal Stephenson", + "release_date" => "1992-06-01", + "page_count" => 470, + ], +]); + +``` +::: + +:::{tab-item} Ruby +:sync: ruby +```ruby +response = client.index( + index: "books", + body: { + "name": "Snow Crash", + "author": "Neal Stephenson", + "release_date": "1992-06-01", + "page_count": 470 + } +) + +``` +::: + +:::: The response includes metadata that {{es}} generates for the document, including a unique `_id` for the document within the index. -:::{dropdown} Example response +::::{dropdown} Example response ```console-result { @@ -95,13 +270,18 @@ The response includes metadata that {{es}} generates for the document, including 8. `failed`: The number of shards that failed during the indexing operation. *0* indicates no failures. 9. `_seq_no`: A monotonically increasing number incremented for each indexing operation on a shard. 10. 
`_primary_term`: A monotonically increasing number incremented each time a primary shard is assigned to a different node. -::: :::: -::::{step} Add multiple documents +::::: +:::::{step} Add multiple documents Use the [`_bulk` endpoint]({{es-apis}}operation/operation-bulk) to add multiple documents in a single request. Bulk data must be formatted as newline-delimited JSON (NDJSON). +::::{tab-set} +:group: api-examples + +:::{tab-item} Console +:sync: console ```console POST /_bulk { "index" : { "_index" : "books" } } @@ -115,10 +295,295 @@ POST /_bulk { "index" : { "_index" : "books" } } {"name": "The Handmaids Tale", "author": "Margaret Atwood", "release_date": "1985-06-01", "page_count": 311} ``` +::: + +:::{tab-item} curl +:sync: curl +```bash +curl -X POST "$ELASTICSEARCH_URL/_bulk" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" \ + -H "Content-Type: application/json" \ + -d $'{"index":{"_index":"books"}} +{"name":"Revelation Space","author":"Alastair Reynolds","release_date":"2000-03-15","page_count":585} +{"index":{"_index":"books"}} +{"name":"1984","author":"George Orwell","release_date":"1985-06-01","page_count":328} +{"index":{"_index":"books"}} +{"name":"Fahrenheit 451","author":"Ray Bradbury","release_date":"1953-10-15","page_count":227} +{"index":{"_index":"books"}} +{"name":"Brave New World","author":"Aldous Huxley","release_date":"1932-06-01","page_count":268} +{"index":{"_index":"books"}} +{"name":"The Handmaids Tale","author":"Margaret Atwood","release_date":"1985-06-01","page_count":311}\n' +``` +::: + +:::{tab-item} Python +:sync: python +```python +resp = client.bulk( + operations=[ + { + "index": { + "_index": "books" + } + }, + { + "name": "Revelation Space", + "author": "Alastair Reynolds", + "release_date": "2000-03-15", + "page_count": 585 + }, + { + "index": { + "_index": "books" + } + }, + { + "name": "1984", + "author": "George Orwell", + "release_date": "1985-06-01", + "page_count": 328 + }, + { + "index": { + "_index": "books" + } + }, 
+ { + "name": "Fahrenheit 451", + "author": "Ray Bradbury", + "release_date": "1953-10-15", + "page_count": 227 + }, + { + "index": { + "_index": "books" + } + }, + { + "name": "Brave New World", + "author": "Aldous Huxley", + "release_date": "1932-06-01", + "page_count": 268 + }, + { + "index": { + "_index": "books" + } + }, + { + "name": "The Handmaids Tale", + "author": "Margaret Atwood", + "release_date": "1985-06-01", + "page_count": 311 + } + ], +) + +``` +::: + +:::{tab-item} JavaScript +:sync: js +```js +const response = await client.bulk({ + operations: [ + { + index: { + _index: "books", + }, + }, + { + name: "Revelation Space", + author: "Alastair Reynolds", + release_date: "2000-03-15", + page_count: 585, + }, + { + index: { + _index: "books", + }, + }, + { + name: "1984", + author: "George Orwell", + release_date: "1985-06-01", + page_count: 328, + }, + { + index: { + _index: "books", + }, + }, + { + name: "Fahrenheit 451", + author: "Ray Bradbury", + release_date: "1953-10-15", + page_count: 227, + }, + { + index: { + _index: "books", + }, + }, + { + name: "Brave New World", + author: "Aldous Huxley", + release_date: "1932-06-01", + page_count: 268, + }, + { + index: { + _index: "books", + }, + }, + { + name: "The Handmaids Tale", + author: "Margaret Atwood", + release_date: "1985-06-01", + page_count: 311, + }, + ], +}); +``` +::: + +:::{tab-item} PHP +:sync: php +```php +$resp = $client->bulk([ + "body" => array( + [ + "index" => [ + "_index" => "books", + ], + ], + [ + "name" => "Revelation Space", + "author" => "Alastair Reynolds", + "release_date" => "2000-03-15", + "page_count" => 585, + ], + [ + "index" => [ + "_index" => "books", + ], + ], + [ + "name" => "1984", + "author" => "George Orwell", + "release_date" => "1985-06-01", + "page_count" => 328, + ], + [ + "index" => [ + "_index" => "books", + ], + ], + [ + "name" => "Fahrenheit 451", + "author" => "Ray Bradbury", + "release_date" => "1953-10-15", + "page_count" => 227, + ], + [ + "index" 
=> [ + "_index" => "books", + ], + ], + [ + "name" => "Brave New World", + "author" => "Aldous Huxley", + "release_date" => "1932-06-01", + "page_count" => 268, + ], + [ + "index" => [ + "_index" => "books", + ], + ], + [ + "name" => "The Handmaids Tale", + "author" => "Margaret Atwood", + "release_date" => "1985-06-01", + "page_count" => 311, + ], + ), +]); + +``` +::: + +:::{tab-item} Ruby +:sync: ruby +```ruby +response = client.bulk( + body: [ + { + "index": { + "_index": "books" + } + }, + { + "name": "Revelation Space", + "author": "Alastair Reynolds", + "release_date": "2000-03-15", + "page_count": 585 + }, + { + "index": { + "_index": "books" + } + }, + { + "name": "1984", + "author": "George Orwell", + "release_date": "1985-06-01", + "page_count": 328 + }, + { + "index": { + "_index": "books" + } + }, + { + "name": "Fahrenheit 451", + "author": "Ray Bradbury", + "release_date": "1953-10-15", + "page_count": 227 + }, + { + "index": { + "_index": "books" + } + }, + { + "name": "Brave New World", + "author": "Aldous Huxley", + "release_date": "1932-06-01", + "page_count": 268 + }, + { + "index": { + "_index": "books" + } + }, + { + "name": "The Handmaids Tale", + "author": "Margaret Atwood", + "release_date": "1985-06-01", + "page_count": 311 + } + ] +) + +``` +::: + +:::: You should receive a response indicating there were no errors. -:::{dropdown} Example response +::::{dropdown} Example response ```console-result { @@ -209,9 +674,9 @@ You should receive a response indicating there were no errors. } ``` -::: :::: -::::{step} Use dynamic mapping +::::: +:::::{step} Use dynamic mapping [Mappings](/manage-data/data-store/index-basics.md#elasticsearch-intro-documents-fields-mappings) define how data is stored and indexed in {{es}}, like a schema in a relational database. 
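To make the schema analogy concrete, the following is a minimal sketch of the kind of type inference that dynamic mapping performs on the fields of a JSON document. It is not how {{es}} is implemented: the `infer_field_type` and `infer_mapping` helpers are hypothetical names, the date check only recognizes `yyyy-MM-dd` strings, and real dynamic mapping also adds a `keyword` sub-field to `text` fields, which this sketch omits.

```python
from datetime import datetime


def infer_field_type(value):
    """Guess a field type roughly the way dynamic mapping does (simplified)."""
    # bool must be checked before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        try:
            # Simplified stand-in for date detection: yyyy-MM-dd strings only.
            datetime.strptime(value, "%Y-%m-%d")
            return "date"
        except ValueError:
            return "text"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"unsupported value: {value!r}")


def infer_mapping(document):
    """Build a {field: type} dictionary for a flat JSON-like document."""
    return {field: infer_field_type(value) for field, value in document.items()}


# The document from the "Add a single document" step.
book = {
    "name": "Snow Crash",
    "author": "Neal Stephenson",
    "release_date": "1992-06-01",
    "page_count": 470,
}
mapping = infer_mapping(book)
print(mapping)
```

The inferred types are a reasonable default, but they are guesses. When you need a specific type or format, such as `integer` instead of `long`, define the mapping explicitly.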
@@ -220,6 +685,11 @@ The documents you've added so far have used dynamic mapping, because you didn't To see how dynamic mapping works, add a new document to the `books` index with a field that isn't available in the existing documents. +::::{tab-set} +:group: api-examples + +:::{tab-item} Console +:sync: console ```console POST /books/_doc { @@ -230,18 +700,152 @@ POST /books/_doc "language": "EN" <1> } ``` - 1. The new field. +::: + +:::{tab-item} curl +:sync: curl +```bash +curl -X POST "$ELASTICSEARCH_URL/books/_doc" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{"name":"The Great Gatsby","author":"F. Scott Fitzgerald","release_date":"1925-04-10","page_count":180,"language":"EN"}' +``` +::: + +:::{tab-item} Python +:sync: python +```python +resp = client.index( + index="books", + document={ + "name": "The Great Gatsby", + "author": "F. Scott Fitzgerald", + "release_date": "1925-04-10", + "page_count": 180, + "language": "EN" + }, +) + +``` +::: + +:::{tab-item} JavaScript +:sync: js +```js +const response = await client.index({ + index: "books", + document: { + name: "The Great Gatsby", + author: "F. Scott Fitzgerald", + release_date: "1925-04-10", + page_count: 180, + language: "EN", + }, +}); +``` +::: + +:::{tab-item} PHP +:sync: php +```php +$resp = $client->index([ + "index" => "books", + "body" => [ + "name" => "The Great Gatsby", + "author" => "F. Scott Fitzgerald", + "release_date" => "1925-04-10", + "page_count" => 180, + "language" => "EN", + ], +]); + +``` +::: + +:::{tab-item} Ruby +:sync: ruby +```ruby +response = client.index( + index: "books", + body: { + "name": "The Great Gatsby", + "author": "F. Scott Fitzgerald", + "release_date": "1925-04-10", + "page_count": 180, + "language": "EN" + } +) + +``` +::: + +:::: View the mapping for the `books` index with the [get mapping API]({{es-apis}}operation/operation-indices-get-mapping). 
The new field `language` has been added to the mapping with a `text` data type. +::::{tab-set} +:group: api-examples + +:::{tab-item} Console +:sync: console ```console GET /books/_mapping ``` +::: + +:::{tab-item} curl +:sync: curl +```bash +curl -X GET "$ELASTICSEARCH_URL/books/_mapping" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" +``` +::: + +:::{tab-item} Python +:sync: python +```python +resp = client.indices.get_mapping( + index="books", +) + +``` +::: + +:::{tab-item} JavaScript +:sync: js +```js +const response = await client.indices.getMapping({ + index: "books", +}); +``` +::: + +:::{tab-item} PHP +:sync: php +```php +$resp = $client->indices()->getMapping([ + "index" => "books", +]); + +``` +::: + +:::{tab-item} Ruby +:sync: ruby +```ruby +response = client.indices.get_mapping( + index: "books" +) + +``` +::: + +:::: The following response displays the mappings that were created by {{es}}. -:::{dropdown} Example response +::::{dropdown} Example response ```console-result { @@ -286,14 +890,19 @@ The following response displays the mappings that were created by {{es}}. } } ``` -::: :::: -::::{step} Define explicit mapping +::::: +:::::{step} Define explicit mapping Create an index named `my-explicit-mappings-books` and specify the mappings yourself. Pass each field's properties as a JSON object. This object should contain the [field data type](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md) and any additional [mapping parameters](elasticsearch://reference/elasticsearch/mapping-reference/mapping-parameters.md). +::::{tab-set} +:group: api-examples + +:::{tab-item} Console +:sync: console ```console PUT /my-explicit-mappings-books { @@ -308,12 +917,139 @@ PUT /my-explicit-mappings-books } } ``` - 1. `dynamic`: Turns off dynamic mapping for the index. If you don't define fields in the mapping, they'll still be stored in the document's `_source` field, but you can't index or search them. 2. 
`properties`: Defines the fields and their corresponding data types. +::: + +:::{tab-item} curl +:sync: curl +```bash +curl -X PUT "$ELASTICSEARCH_URL/my-explicit-mappings-books" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{"mappings":{"dynamic":false,"properties":{"name":{"type":"text"},"author":{"type":"text"},"release_date":{"type":"date","format":"yyyy-MM-dd"},"page_count":{"type":"integer"}}}}' +``` +::: + +:::{tab-item} Python +:sync: python +```python +resp = client.indices.create( + index="my-explicit-mappings-books", + mappings={ + "dynamic": False, + "properties": { + "name": { + "type": "text" + }, + "author": { + "type": "text" + }, + "release_date": { + "type": "date", + "format": "yyyy-MM-dd" + }, + "page_count": { + "type": "integer" + } + } + }, +) + +``` +::: + +:::{tab-item} JavaScript +:sync: js +```js +const response = await client.indices.create({ + index: "my-explicit-mappings-books", + mappings: { + dynamic: false, + properties: { + name: { + type: "text", + }, + author: { + type: "text", + }, + release_date: { + type: "date", + format: "yyyy-MM-dd", + }, + page_count: { + type: "integer", + }, + }, + }, +}); +``` +::: + +:::{tab-item} PHP +:sync: php +```php +$resp = $client->indices()->create([ + "index" => "my-explicit-mappings-books", + "body" => [ + "mappings" => [ + "dynamic" => false, + "properties" => [ + "name" => [ + "type" => "text", + ], + "author" => [ + "type" => "text", + ], + "release_date" => [ + "type" => "date", + "format" => "yyyy-MM-dd", + ], + "page_count" => [ + "type" => "integer", + ], + ], + ], + ], +]); + +``` +::: + +:::{tab-item} Ruby +:sync: ruby +```ruby +response = client.indices.create( + index: "my-explicit-mappings-books", + body: { + "mappings": { + "dynamic": false, + "properties": { + "name": { + "type": "text" + }, + "author": { + "type": "text" + }, + "release_date": { + "type": "date", + "format": "yyyy-MM-dd" + }, + "page_count": { + "type": 
"integer" + } + } + } + } +) + +``` +::: + +:::: The following response indicates a successful operation. -:::{dropdown} Example response +::::{dropdown} Example response ```console-result { "acknowledged": true, @@ -321,29 +1057,84 @@ The following response indicates a successful operation. "index": "my-explicit-mappings-books" } ``` -::: +:::: Explicit mappings are defined at index creation, and documents must conform to these mappings. You can also use the [update mapping API]({{es-apis}}operation/operation-indices-put-mapping). When an index has the `dynamic` flag set to `true`, you can add new fields to documents without updating the mapping, which allows you to combine explicit and dynamic mappings. Learn more about [managing and updating mappings](/manage-data/data-store/mapping.md#mapping-manage-update). -:::: ::::: +:::::: ## Search your data [getting-started-search-data] Indexed documents are available for search in near real-time, using the [`_search` API](/solutions/search/querying-for-search.md). 
-:::::{stepper} -::::{step} Search all documents +::::::{stepper} +:::::{step} Search all documents Use the following request to search all documents in the `books` index: +::::{tab-set} +:group: api-examples + +:::{tab-item} Console +:sync: console ```console GET books/_search ``` +::: + +:::{tab-item} curl +:sync: curl +```bash +curl -X GET "$ELASTICSEARCH_URL/books/_search" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" +``` +::: + +:::{tab-item} Python +:sync: python +```python +resp = client.search( + index="books", +) + +``` +::: + +:::{tab-item} JavaScript +:sync: js +```js +const response = await client.search({ + index: "books", +}); +``` +::: + +:::{tab-item} PHP +:sync: php +```php +$resp = $client->search([ + "index" => "books", +]); + +``` +::: + +:::{tab-item} Ruby +:sync: ruby +```ruby +response = client.search( + index: "books" +) + +``` +::: -:::{dropdown} Example response +:::: + +::::{dropdown} Example response ```console-result { "took": 2, <1> @@ -389,14 +1180,19 @@ GET books/_search 9. `_score`: The relevance score of the document 10. `_source`: The original JSON object submitted during indexing -::: :::: -::::{step} Search with a match query +::::: +:::::{step} Search with a match query Use the [`match` query](elasticsearch://reference/query-languages/query-dsl/query-dsl-match-query.md) to search for documents that contain a specific value in a specific field. This is the standard query for full-text searches. Use the following request to search the `books` index for documents containing `brave` in the `name` field: +::::{tab-set} +:group: api-examples + +:::{tab-item} Console +:sync: console ```console GET books/_search { @@ -407,12 +1203,88 @@ GET books/_search } } ``` +::: -:::{tip} -This example uses [Query DSL](/explore-analyze/query-filter/languages/querydsl.md), which is the primary query language for {{es}}. 
+:::{tab-item} curl +:sync: curl +```bash +curl -X GET "$ELASTICSEARCH_URL/books/_search" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{"query":{"match":{"name":"brave"}}}' +``` +::: + +:::{tab-item} Python +:sync: python +```python +resp = client.search( + index="books", + query={ + "match": { + "name": "brave" + } + }, +) + +``` ::: -:::{dropdown} Example response +:::{tab-item} JavaScript +:sync: js +```js +const response = await client.search({ + index: "books", + query: { + match: { + name: "brave", + }, + }, +}); +``` +::: + +:::{tab-item} PHP +:sync: php +```php +$resp = $client->search([ + "index" => "books", + "body" => [ + "query" => [ + "match" => [ + "name" => "brave", + ], + ], + ], +]); + +``` +::: + +:::{tab-item} Ruby +:sync: ruby +```ruby +response = client.search( + index: "books", + body: { + "query": { + "match": { + "name": "brave" + } + } + } +) + +``` +::: + +:::: + +::::{tip} +This example uses [Query DSL](/explore-analyze/query-filter/languages/querydsl.md), which is the primary query language for {{es}}. +:::: + +::::{dropdown} Example response ```console-result { "took": 9, @@ -447,9 +1319,9 @@ This example uses [Query DSL](/explore-analyze/query-filter/languages/querydsl.m ``` 1. `max_score`: Score of the highest-scoring document in the results. In this case, there is only one matching document, so the `max_score` is the score of that document. -::: :::: ::::: +:::::: ## Delete your indices [getting-started-delete-indices] @@ -457,16 +1329,89 @@ If you want to delete an index to start from scratch at any point, use the [dele For example, use the following request to delete the indices created in this quickstart: +::::{tab-set} +:group: api-examples + +:::{tab-item} Console +:sync: console ```console DELETE /books DELETE /my-explicit-mappings-books ``` +::: -::::{warning} -Deleting an index permanently deletes its documents, shards, and metadata. 
+:::{tab-item} curl +:sync: curl +```bash +curl -X DELETE "$ELASTICSEARCH_URL/books" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" +curl -X DELETE "$ELASTICSEARCH_URL/my-explicit-mappings-books" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" +``` +::: + +:::{tab-item} Python +:sync: python +```python +resp = client.indices.delete( + index="books", +) + +resp1 = client.indices.delete( + index="my-explicit-mappings-books", +) + +``` +::: + +:::{tab-item} JavaScript +:sync: js +```js +const response = await client.indices.delete({ + index: "books", +}); + +const response1 = await client.indices.delete({ + index: "my-explicit-mappings-books", +}); +``` +::: + +:::{tab-item} PHP +:sync: php +```php +$resp = $client->indices()->delete([ + "index" => "books", +]); + +$resp1 = $client->indices()->delete([ + "index" => "my-explicit-mappings-books", +]); + +``` +::: + +:::{tab-item} Ruby +:sync: ruby +```ruby +response = client.indices.delete( + index: "books" +) + +response1 = client.indices.delete( + index: "my-explicit-mappings-books" +) + +``` +::: :::: +:::::{warning} +Deleting an index permanently deletes its documents, shards, and metadata. + +::::: + ## Next steps This quickstart introduced the basics of creating indices, adding data, and performing basic searches with {{es}}. diff --git a/solutions/search/get-started/semantic-search.md b/solutions/search/get-started/semantic-search.md index babff9646b..1add22a701 100644 --- a/solutions/search/get-started/semantic-search.md +++ b/solutions/search/get-started/semantic-search.md @@ -37,8 +37,8 @@ The way that you store vectors has a significant impact on the performance and a They must be stored in specialized data structures designed to ensure efficient similarity search and speedy vector distance calculations. This guide uses the [semantic text field type](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md), which provides sensible defaults and automation. 
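As a rough illustration of what a vector similarity calculation involves, the sketch below scores documents against a query with a dot product over sparse vectors, each stored as a dictionary of token weights. The tokens, weights, and the `sparse_dot` helper are made up for illustration. This is not how {{es}} stores or scores `semantic_text` fields internally.

```python
def sparse_dot(query_vec, doc_vec):
    """Dot product of two sparse vectors stored as {token: weight} dicts."""
    # Iterate over the smaller dict so the cost scales with the shorter vector.
    if len(query_vec) > len(doc_vec):
        query_vec, doc_vec = doc_vec, query_vec
    return sum(weight * doc_vec.get(token, 0.0) for token, weight in query_vec.items())


# Hypothetical token weights; a real embedding model produces many more entries.
docs = {
    "yellowstone": {"geyser": 1.5, "bison": 1.0, "park": 0.5},
    "yosemite": {"granite": 1.5, "climbing": 1.0, "park": 0.5},
}
query = {"climbing": 2.0, "park": 1.0}

# Score every document and pick the highest-scoring one.
scores = {name: sparse_dot(query, vec) for name, vec in docs.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Only tokens present in both vectors contribute to the score, which is why sparse representations can be compared efficiently even when the vocabulary is large.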
-:::::{stepper} -::::{step} Create an index +::::::{stepper} +:::::{step} Create an index An index is a collection of documents uniquely identified by a name or an alias. You can follow the guided index workflow: @@ -49,18 +49,117 @@ You can follow the guided index workflow: When you complete the workflow, you will have sample data and can skip to the steps related to exploring and searching it. Alternatively, run the following API request in [Console](/explore-analyze/query-filter/tools/console.md): +::::{tab-set} +:group: api-examples + +:::{tab-item} Console +:sync: console ```console PUT /semantic-index ``` +::: -:::{tip} -For an introduction to the concept of indices, check out [](/manage-data/data-store/index-basics.md). +:::{tab-item} curl +:sync: curl +```bash +curl -X PUT "$ELASTICSEARCH_URL/semantic-index" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" +``` +::: + +:::{tab-item} Python +:sync: python +```python +import os +from elasticsearch import Elasticsearch + +client = Elasticsearch( + hosts=["$ELASTICSEARCH_URL"], + api_key=os.getenv("ELASTIC_API_KEY"), +) + +resp = client.indices.create( + index="semantic-index", +) + +``` ::: + +:::{tab-item} JavaScript +:sync: js +```js +const { Client } = require("@elastic/elasticsearch"); + +const client = new Client({ + nodes: ["$ELASTICSEARCH_URL"], + auth: { + apiKey: process.env["ELASTIC_API_KEY"], + }, +}); + +async function run() { + const response = await client.indices.create({ + index: "semantic-index", + }); +} + +run(); +``` +::: + +:::{tab-item} PHP +:sync: php +```php +<?php + +require(__DIR__ . "/vendor/autoload.php"); + +use Elastic\Elasticsearch\ClientBuilder; + +$client = ClientBuilder::create() + ->setHosts(["$ELASTICSEARCH_URL"]) + ->setApiKey(getenv("ELASTIC_API_KEY")) + ->build(); + +$resp = $client->indices()->create([ + "index" => "semantic-index", +]); + +``` +::: + +:::{tab-item} Ruby +:sync: ruby +```ruby +require "elasticsearch" + +client = Elasticsearch::Client.new( + host: "$ELASTICSEARCH_URL", + api_key: ENV["ELASTIC_API_KEY"] +) + +response = client.indices.create( + index: "semantic-index" +) + +``` +::: + +::::
+ +::::{tip} +For an introduction to the concept of indices, check out [](/manage-data/data-store/index-basics.md). :::: -::::{step} Create a semantic_text field mapping +::::: +:::::{step} Create a semantic_text field mapping Each index has mappings that define how data is stored and indexed, like a schema in a relational database. The following example creates a mapping for a single field ("content"): +::::{tab-set} +:group: api-examples + +:::{tab-item} Console +:sync: console ```console PUT /semantic-index/_mapping { @@ -71,17 +170,98 @@ PUT /semantic-index/_mapping } } ``` +::: + +:::{tab-item} curl +:sync: curl +```bash +curl -X PUT "$ELASTICSEARCH_URL/semantic-index/_mapping" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{"properties":{"content":{"type":"semantic_text"}}}' +``` +::: + +:::{tab-item} Python +:sync: python +```python +resp = client.indices.put_mapping( + index="semantic-index", + properties={ + "content": { + "type": "semantic_text" + } + }, +) + +``` +::: + +:::{tab-item} JavaScript +:sync: js +```js +const response = await client.indices.putMapping({ + index: "semantic-index", + properties: { + content: { + type: "semantic_text", + }, + }, +}); +``` +::: + +:::{tab-item} PHP +:sync: php +```php +$resp = $client->indices()->putMapping([ + "index" => "semantic-index", + "body" => [ + "properties" => [ + "content" => [ + "type" => "semantic_text", + ], + ], + ], +]); + +``` +::: + +:::{tab-item} Ruby +:sync: ruby +```ruby +response = client.indices.put_mapping( + index: "semantic-index", + body: { + "properties": { + "content": { + "type": "semantic_text" + } + } + } +) + +``` +::: + +:::: When you use `semantic_text` fields, the type of vector is determined by the vector embedding model. In this case, the default ELSER model will be used to create sparse vectors. 
For a deeper dive, check out [Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector](https://www.elastic.co/search-labs/blog/mapping-embeddings-to-elasticsearch-field-types). -:::: +::::: -::::{step} Add documents +:::::{step} Add documents You can use the Elasticsearch bulk API to ingest an array of documents: +::::{tab-set} +:group: api-examples + +:::{tab-item} Console +:sync: console ```console POST /_bulk?pretty { "index": { "_index": "semantic-index" } } @@ -91,6 +271,165 @@ POST /_bulk?pretty { "index": { "_index": "semantic-index" } } {"content":"Rocky Mountain National Park is one of the most popular national parks in the United States. It receives over 4.5 million visitors annually, and is known for its mountainous terrain, including Longs Peak, which is the highest peak in the park. The park is home to a variety of wildlife, including elk, mule deer, moose, and bighorn sheep. The park is also home to a variety of ecosystems, including montane, subalpine, and alpine tundra. The park is a popular destination for hiking, camping, and wildlife viewing, and is a UNESCO World Heritage Site."} ``` +::: + +:::{tab-item} curl +:sync: curl +```bash +curl -X POST "$ELASTICSEARCH_URL/_bulk?pretty" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" \ + -H "Content-Type: application/json" \ + -d $'{"index":{"_index":"semantic-index"}}\n{"content":"Yellowstone National Park is one of the largest national parks in the United States. It ranges from the Wyoming to Montana and Idaho, and contains an area of 2,219,791 acress across three different states. Its most famous for hosting the geyser Old Faithful and is centered on the Yellowstone Caldera, the largest super volcano on the American continent. Yellowstone is host to hundreds of species of animal, many of which are endangered or threatened. Most notably, it contains free-ranging herds of bison and elk, alongside bears, cougars and wolves. The national park receives over 4.5 million visitors annually and is a UNESCO World Heritage Site."}\n{"index":{"_index":"semantic-index"}}\n{"content":"Yosemite National Park is a United States National Park, covering over 750,000 acres of land in California. A UNESCO World Heritage Site, the park is best known for its granite cliffs, waterfalls and giant sequoia trees. Yosemite hosts over four million visitors in most years, with a peak of five million visitors in 2016. The park is home to a diverse range of wildlife, including mule deer, black bears, and the endangered Sierra Nevada bighorn sheep. The park has 1,200 square miles of wilderness, and is a popular destination for rock climbers, with over 3,000 feet of vertical granite to climb. Its most famous and cliff is the El Capitan, a 3,000 feet monolith along its tallest face."}\n{"index":{"_index":"semantic-index"}}\n{"content":"Rocky Mountain National Park is one of the most popular national parks in the United States. It receives over 4.5 million visitors annually, and is known for its mountainous terrain, including Longs Peak, which is the highest peak in the park. The park is home to a variety of wildlife, including elk, mule deer, moose, and bighorn sheep. The park is also home to a variety of ecosystems, including montane, subalpine, and alpine tundra. The park is a popular destination for hiking, camping, and wildlife viewing, and is a UNESCO World Heritage Site."}\n' +``` +::: + +:::{tab-item} Python +:sync: python +```python +resp = client.bulk( + pretty=True, + operations=[ + { + "index": { + "_index": "semantic-index" + } + }, + { + "content": "Yellowstone National Park is one of the largest national parks in the United States. It ranges from the Wyoming to Montana and Idaho, and contains an area of 2,219,791 acress across three different states. Its most famous for hosting the geyser Old Faithful and is centered on the Yellowstone Caldera, the largest super volcano on the American continent.
Yellowstone is host to hundreds of species of animal, many of which are endangered or threatened. Most notably, it contains free-ranging herds of bison and elk, alongside bears, cougars and wolves. The national park receives over 4.5 million visitors annually and is a UNESCO World Heritage Site." + }, + { + "index": { + "_index": "semantic-index" + } + }, + { + "content": "Yosemite National Park is a United States National Park, covering over 750,000 acres of land in California. A UNESCO World Heritage Site, the park is best known for its granite cliffs, waterfalls and giant sequoia trees. Yosemite hosts over four million visitors in most years, with a peak of five million visitors in 2016. The park is home to a diverse range of wildlife, including mule deer, black bears, and the endangered Sierra Nevada bighorn sheep. The park has 1,200 square miles of wilderness, and is a popular destination for rock climbers, with over 3,000 feet of vertical granite to climb. Its most famous and cliff is the El Capitan, a 3,000 feet monolith along its tallest face." + }, + { + "index": { + "_index": "semantic-index" + } + }, + { + "content": "Rocky Mountain National Park is one of the most popular national parks in the United States. It receives over 4.5 million visitors annually, and is known for its mountainous terrain, including Longs Peak, which is the highest peak in the park. The park is home to a variety of wildlife, including elk, mule deer, moose, and bighorn sheep. The park is also home to a variety of ecosystems, including montane, subalpine, and alpine tundra. The park is a popular destination for hiking, camping, and wildlife viewing, and is a UNESCO World Heritage Site." + } + ], +) + +``` +::: + +:::{tab-item} JavaScript +:sync: js +```js +const response = await client.bulk({ + pretty: "true", + operations: [ + { + index: { + _index: "semantic-index", + }, + }, + { + content: + "Yellowstone National Park is one of the largest national parks in the United States. 
It ranges from the Wyoming to Montana and Idaho, and contains an area of 2,219,791 acress across three different states. Its most famous for hosting the geyser Old Faithful and is centered on the Yellowstone Caldera, the largest super volcano on the American continent. Yellowstone is host to hundreds of species of animal, many of which are endangered or threatened. Most notably, it contains free-ranging herds of bison and elk, alongside bears, cougars and wolves. The national park receives over 4.5 million visitors annually and is a UNESCO World Heritage Site.", + }, + { + index: { + _index: "semantic-index", + }, + }, + { + content: + "Yosemite National Park is a United States National Park, covering over 750,000 acres of land in California. A UNESCO World Heritage Site, the park is best known for its granite cliffs, waterfalls and giant sequoia trees. Yosemite hosts over four million visitors in most years, with a peak of five million visitors in 2016. The park is home to a diverse range of wildlife, including mule deer, black bears, and the endangered Sierra Nevada bighorn sheep. The park has 1,200 square miles of wilderness, and is a popular destination for rock climbers, with over 3,000 feet of vertical granite to climb. Its most famous and cliff is the El Capitan, a 3,000 feet monolith along its tallest face.", + }, + { + index: { + _index: "semantic-index", + }, + }, + { + content: + "Rocky Mountain National Park is one of the most popular national parks in the United States. It receives over 4.5 million visitors annually, and is known for its mountainous terrain, including Longs Peak, which is the highest peak in the park. The park is home to a variety of wildlife, including elk, mule deer, moose, and bighorn sheep. The park is also home to a variety of ecosystems, including montane, subalpine, and alpine tundra. 
The park is a popular destination for hiking, camping, and wildlife viewing, and is a UNESCO World Heritage Site.", + }, + ], +}); +``` +::: + +:::{tab-item} PHP +:sync: php +```php +$resp = $client->bulk([ + "pretty" => "true", + "body" => array( + [ + "index" => [ + "_index" => "semantic-index", + ], + ], + [ + "content" => "Yellowstone National Park is one of the largest national parks in the United States. It ranges from the Wyoming to Montana and Idaho, and contains an area of 2,219,791 acress across three different states. Its most famous for hosting the geyser Old Faithful and is centered on the Yellowstone Caldera, the largest super volcano on the American continent. Yellowstone is host to hundreds of species of animal, many of which are endangered or threatened. Most notably, it contains free-ranging herds of bison and elk, alongside bears, cougars and wolves. The national park receives over 4.5 million visitors annually and is a UNESCO World Heritage Site.", + ], + [ + "index" => [ + "_index" => "semantic-index", + ], + ], + [ + "content" => "Yosemite National Park is a United States National Park, covering over 750,000 acres of land in California. A UNESCO World Heritage Site, the park is best known for its granite cliffs, waterfalls and giant sequoia trees. Yosemite hosts over four million visitors in most years, with a peak of five million visitors in 2016. The park is home to a diverse range of wildlife, including mule deer, black bears, and the endangered Sierra Nevada bighorn sheep. The park has 1,200 square miles of wilderness, and is a popular destination for rock climbers, with over 3,000 feet of vertical granite to climb. Its most famous and cliff is the El Capitan, a 3,000 feet monolith along its tallest face.", + ], + [ + "index" => [ + "_index" => "semantic-index", + ], + ], + [ + "content" => "Rocky Mountain National Park is one of the most popular national parks in the United States. 
It receives over 4.5 million visitors annually, and is known for its mountainous terrain, including Longs Peak, which is the highest peak in the park. The park is home to a variety of wildlife, including elk, mule deer, moose, and bighorn sheep. The park is also home to a variety of ecosystems, including montane, subalpine, and alpine tundra. The park is a popular destination for hiking, camping, and wildlife viewing, and is a UNESCO World Heritage Site.", + ], + ), +]); + +``` +::: + +:::{tab-item} Ruby +:sync: ruby +```ruby +response = client.bulk( + pretty: "true", + body: [ + { + "index": { + "_index": "semantic-index" + } + }, + { + "content": "Yellowstone National Park is one of the largest national parks in the United States. It ranges from the Wyoming to Montana and Idaho, and contains an area of 2,219,791 acress across three different states. Its most famous for hosting the geyser Old Faithful and is centered on the Yellowstone Caldera, the largest super volcano on the American continent. Yellowstone is host to hundreds of species of animal, many of which are endangered or threatened. Most notably, it contains free-ranging herds of bison and elk, alongside bears, cougars and wolves. The national park receives over 4.5 million visitors annually and is a UNESCO World Heritage Site." + }, + { + "index": { + "_index": "semantic-index" + } + }, + { + "content": "Yosemite National Park is a United States National Park, covering over 750,000 acres of land in California. A UNESCO World Heritage Site, the park is best known for its granite cliffs, waterfalls and giant sequoia trees. Yosemite hosts over four million visitors in most years, with a peak of five million visitors in 2016. The park is home to a diverse range of wildlife, including mule deer, black bears, and the endangered Sierra Nevada bighorn sheep. The park has 1,200 square miles of wilderness, and is a popular destination for rock climbers, with over 3,000 feet of vertical granite to climb. 
Its most famous and cliff is the El Capitan, a 3,000 feet monolith along its tallest face." + }, + { + "index": { + "_index": "semantic-index" + } + }, + { + "content": "Rocky Mountain National Park is one of the most popular national parks in the United States. It receives over 4.5 million visitors annually, and is known for its mountainous terrain, including Longs Peak, which is the highest peak in the park. The park is home to a variety of wildlife, including elk, mule deer, moose, and bighorn sheep. The park is also home to a variety of ecosystems, including montane, subalpine, and alpine tundra. The park is a popular destination for hiking, camping, and wildlife viewing, and is a UNESCO World Heritage Site." + } + ] +) + +``` +::: + +:::: The bulk ingestion might take longer than the default request timeout. If it times out, wait for the ELSER model to load (typically 1-5 minutes) then retry it. @@ -102,8 +441,8 @@ Each chunk of text is then transformed into a sparse vector by using the ELSER m ![Semantic search chunking](https://images.contentstack.io/v3/assets/bltefdd0b53724fa2ce/blt9bbe5e260012b15d/67ffffc8165067d96124b586/animated-gif-semantic-search-chunking.gif) The vectors are stored in {{es}} and are ready to be used for semantic search. -:::: -::::{step} Explore the data +::::: +:::::{step} Explore the data To familiarize yourself with this data set, open [Discover](/explore-analyze/discover.md) from the navigation menu or the global search field. @@ -112,11 +451,11 @@ In **Discover**, you can click the expand icon {icon}`expand` to show details ab :::{image} /solutions/images/serverless-discover-semantic.png :screenshot: :alt: Discover table view with document expanded -::: +:::: For more tips, check out [](/explore-analyze/discover/discover-get-started.md). 
-:::: ::::: +:::::: ## Test semantic search @@ -124,8 +463,8 @@ When you run a semantic search, the text in your query must be turned into vecto This step is performed automatically when you use `semantic_text` fields. You therefore only need to pick a query language and a method for comparing the vectors. -:::::{stepper} -::::{step} Choose a query language +::::::{stepper} +:::::{step} Choose a query language {{es}} provides a variety of query languages for interacting with your data. For an overview of their features and use cases, check out [](/explore-analyze/query-filter/languages.md). @@ -133,8 +472,8 @@ The [Elasticsearch Query Language](elasticsearch://reference/query-languages/esq It enables you to query your data directly in **Discover**, so it's a good one to start with. Go to **Discover** and select **Try ES|QL** from the application menu bar. -:::: -::::{step} Choose a vector comparison method +::::: +:::::{step} Choose a vector comparison method You can search data that is stored in `semantic_text` fields by using a specific subset of queries, including `knn`, `match`, `semantic`, and `sparse_vector`. For the definitive list of supported queries, refer to [Semantic text field type](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md). @@ -155,8 +494,8 @@ When you click **▶Run**, the results appear in a table. Each row in the table represents a document. To learn more about these commands, refer to [ES|QL syntax reference](elasticsearch://reference/query-languages/esql/esql-syntax-reference.md) and [](/solutions/search/esql-for-search.md). -:::: -::::{step} Analyze the results +::::: +:::::{step} Analyze the results To have a better understanding of how well each document matches your query, add commands to include the relevance score and sort the results based on that value. For example: @@ -173,19 +512,24 @@ FROM semantic-index METADATA _score <1> 2. 
The KEEP processing command affects the columns and their order in the results table. 3. The results are sorted in descending order based on the `_score`. -:::{tip} +::::{tip} Click the **ES|QL help** button to open the in-product reference documentation for all commands and functions or to get recommended queries. For more tips, check out [Using ES|QL in Discover](/explore-analyze/discover/try-esql.md). -::: +:::: In this example, the first row in the table is the document related to Rocky Mountain National Park, which had the highest relevance score for the query: :::{image} /solutions/images/serverless-discover-semantic-esql.png :screenshot: :alt: Run an ES|QL semantic query in Discover -::: +:::: Optionally, try out the same search as an API request in **Console**: +::::{tab-set} +:group: api-examples + +:::{tab-item} Console +:sync: console ```console POST /_query?format=txt { @@ -198,15 +542,131 @@ POST /_query?format=txt """ } ``` +::: + +:::{tab-item} curl +:sync: curl +```bash +curl -X POST "$ELASTICSEARCH_URL/_query?format=txt" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{"query":"\n FROM semantic-index METADATA _score\n | WHERE content: \"best spot for rappelling\"\n | KEEP content, _score\n | SORT _score DESC\n | LIMIT 10\n "}' +``` +::: + +:::{tab-item} Python +:sync: python +```python +resp = client.esql.query( + format="txt", + query="\n FROM semantic-index METADATA _score\n | WHERE content: \"best spot for rappelling\"\n | KEEP content, _score\n | SORT _score DESC\n | LIMIT 10\n ", +) + +``` +::: + +:::{tab-item} JavaScript +:sync: js +```js +const response = await client.esql.query({ + format: "txt", + query: + '\n FROM semantic-index METADATA _score\n | WHERE content: "best spot for rappelling"\n | KEEP content, _score\n | SORT _score DESC\n | LIMIT 10\n ', +}); +``` +::: + +:::{tab-item} PHP +:sync: php +```php +$resp = $client->esql()->query([ + "format" => "txt", + "body" => [ + "query" => "\n 
FROM semantic-index METADATA _score\n | WHERE content: \"best spot for rappelling\"\n | KEEP content, _score\n | SORT _score DESC\n | LIMIT 10\n ", + ], +]); + +``` +::: + +:::{tab-item} Ruby +:sync: ruby +```ruby +response = client.esql.query( + format: "txt", + body: { + "query": "\n FROM semantic-index METADATA _score\n | WHERE content: \"best spot for rappelling\"\n | KEEP content, _score\n | SORT _score DESC\n | LIMIT 10\n " + } +) + +``` +::: + +:::: When you finish your tests and no longer need the sample data set, delete the index: +::::{tab-set} +:group: api-examples + +:::{tab-item} Console +:sync: console ```console DELETE /semantic-index ``` +::: + +:::{tab-item} curl +:sync: curl +```bash +curl -X DELETE "$ELASTICSEARCH_URL/semantic-index" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" +``` +::: + +:::{tab-item} Python +:sync: python +```python +resp = client.indices.delete( + index="semantic-index", +) + +``` +::: + +:::{tab-item} JavaScript +:sync: js +```js +const response = await client.indices.delete({ + index: "semantic-index", +}); +``` +::: + +:::{tab-item} PHP +:sync: php +```php +$resp = $client->indices()->delete([ + "index" => "semantic-index", +]); + +``` +::: + +:::{tab-item} Ruby +:sync: ruby +```ruby +response = client.indices.delete( + index: "semantic-index" +) + +``` +::: :::: + ::::: +:::::: ## Next steps From 75ccb1856711fb5c418c8777d857484572d39f59 Mon Sep 17 00:00:00 2001 From: Liam Thompson <32779855+leemthompo@users.noreply.github.com> Date: Thu, 13 Nov 2025 14:23:23 +0100 Subject: [PATCH 2/7] cleanup setup code --- solutions/search/get-started/index-basics.md | 8 ++++---- solutions/search/get-started/semantic-search.md | 8 ++++---- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/solutions/search/get-started/index-basics.md b/solutions/search/get-started/index-basics.md index 603186254f..13889920a6 100644 --- a/solutions/search/get-started/index-basics.md +++ b/solutions/search/get-started/index-basics.md @@ 
-57,7 +57,7 @@ import os from elasticsearch import Elasticsearch client = Elasticsearch( - hosts=["$ELASTICSEARCH_URL"], + hosts=[os.getenv("ELASTICSEARCH_URL")], api_key=os.getenv("ELASTIC_API_KEY"), ) @@ -74,7 +74,7 @@ resp = client.indices.create( const { Client } = require("@elastic/elasticsearch"); const client = new Client({ - nodes: ["$ELASTICSEARCH_URL"], + nodes: [process.env["ELASTICSEARCH_URL"]], auth: { apiKey: process.env["ELASTIC_API_KEY"], }, @@ -100,7 +100,7 @@ require(__DIR__ . "/vendor/autoload.php"); use Elastic\Elasticsearch\ClientBuilder; $client = ClientBuilder::create() - ->setHosts(["$ELASTICSEARCH_URL"]) + ->setHosts([getenv("ELASTICSEARCH_URL")]) ->setApiKey(getenv("ELASTIC_API_KEY")) ->build(); @@ -117,7 +117,7 @@ $resp = $client->indices()->create([ require "elasticsearch" client = Elasticsearch::Client.new( - host: "$ELASTICSEARCH_URL", + host: ENV["ELASTICSEARCH_URL"], api_key: ENV["ELASTIC_API_KEY"] ) diff --git a/solutions/search/get-started/semantic-search.md b/solutions/search/get-started/semantic-search.md index 1add22a701..a775155292 100644 --- a/solutions/search/get-started/semantic-search.md +++ b/solutions/search/get-started/semantic-search.md @@ -74,7 +74,7 @@ import os from elasticsearch import Elasticsearch client = Elasticsearch( - hosts=["$ELASTICSEARCH_URL"], + hosts=[os.getenv("ELASTICSEARCH_URL")], api_key=os.getenv("ELASTIC_API_KEY"), ) @@ -91,7 +91,7 @@ resp = client.indices.create( const { Client } = require("@elastic/elasticsearch"); const client = new Client({ - nodes: ["$ELASTICSEARCH_URL"], + nodes: [process.env["ELASTICSEARCH_URL"]], auth: { apiKey: process.env["ELASTIC_API_KEY"], }, @@ -117,7 +117,7 @@ require(__DIR__ . 
"/vendor/autoload.php"); use Elastic\Elasticsearch\ClientBuilder; $client = ClientBuilder::create() - ->setHosts(["$ELASTICSEARCH_URL"]) + ->setHosts([getenv("ELASTICSEARCH_URL")]) ->setApiKey(getenv("ELASTIC_API_KEY")) ->build(); @@ -134,7 +134,7 @@ $resp = $client->indices()->create([ require "elasticsearch" client = Elasticsearch::Client.new( - host: "$ELASTICSEARCH_URL", + host: ENV["ELASTICSEARCH_URL"], api_key: ENV["ELASTIC_API_KEY"] ) From 4f25e0b7537fe5652b2c582998726ffca6e979d6 Mon Sep 17 00:00:00 2001 From: Liam Thompson <32779855+leemthompo@users.noreply.github.com> Date: Fri, 14 Nov 2025 12:41:24 +0100 Subject: [PATCH 3/7] Refactor index-basics.md to use snippet includes - Extract code examples into _snippets/index-basics/ directory - Replace inline code blocks with include directives - Generate snippet files --- .../index-basics/example1-console.md | 3 + .../_snippets/index-basics/example1-curl.md | 4 ++ .../_snippets/index-basics/example1-js.md | 19 ++++++ .../_snippets/index-basics/example1-php.md | 18 ++++++ .../_snippets/index-basics/example1-python.md | 15 +++++ .../_snippets/index-basics/example1-ruby.md | 14 +++++ .../index-basics/example2-console.md | 9 +++ .../_snippets/index-basics/example2-curl.md | 6 ++ .../_snippets/index-basics/example2-js.md | 12 ++++ .../_snippets/index-basics/example2-php.md | 13 ++++ .../_snippets/index-basics/example2-python.md | 13 ++++ .../_snippets/index-basics/example2-ruby.md | 13 ++++ .../index-basics/example3-console.md | 13 ++++ .../_snippets/index-basics/example3-curl.md | 6 ++ .../_snippets/index-basics/example3-js.md | 62 ++++++++++++++++++ .../_snippets/index-basics/example3-php.md | 63 +++++++++++++++++++ .../_snippets/index-basics/example3-python.md | 63 +++++++++++++++++++ .../_snippets/index-basics/example3-ruby.md | 63 +++++++++++++++++++ .../index-basics/example4-console.md | 12 ++++ .../_snippets/index-basics/example4-curl.md | 6 ++ .../_snippets/index-basics/example4-js.md | 13 ++++ 
.../_snippets/index-basics/example4-php.md | 14 +++++ .../_snippets/index-basics/example4-python.md | 14 +++++ .../_snippets/index-basics/example4-ruby.md | 14 +++++ .../index-basics/example5-console.md | 3 + .../_snippets/index-basics/example5-curl.md | 4 ++ .../_snippets/index-basics/example5-js.md | 6 ++ .../_snippets/index-basics/example5-php.md | 7 +++ .../_snippets/index-basics/example5-python.md | 7 +++ .../_snippets/index-basics/example5-ruby.md | 7 +++ .../index-basics/example6-console.md | 17 +++++ .../_snippets/index-basics/example6-curl.md | 6 ++ .../_snippets/index-basics/example6-js.md | 24 +++++++ .../_snippets/index-basics/example6-php.md | 27 ++++++++ .../_snippets/index-basics/example6-python.md | 25 ++++++++ .../_snippets/index-basics/example6-ruby.md | 27 ++++++++ .../index-basics/example7-console.md | 3 + .../_snippets/index-basics/example7-curl.md | 4 ++ .../_snippets/index-basics/example7-js.md | 6 ++ .../_snippets/index-basics/example7-php.md | 7 +++ .../_snippets/index-basics/example7-python.md | 7 +++ .../_snippets/index-basics/example7-ruby.md | 7 +++ .../index-basics/example8-console.md | 10 +++ .../_snippets/index-basics/example8-curl.md | 6 ++ .../_snippets/index-basics/example8-js.md | 11 ++++ .../_snippets/index-basics/example8-php.md | 14 +++++ .../_snippets/index-basics/example8-python.md | 12 ++++ .../_snippets/index-basics/example8-ruby.md | 14 +++++ .../index-basics/example9-console.md | 4 ++ .../_snippets/index-basics/example9-curl.md | 6 ++ .../_snippets/index-basics/example9-js.md | 11 ++++ .../_snippets/index-basics/example9-php.md | 12 ++++ .../_snippets/index-basics/example9-python.md | 12 ++++ .../_snippets/index-basics/example9-ruby.md | 12 ++++ 54 files changed, 800 insertions(+) create mode 100644 solutions/search/get-started/_snippets/index-basics/example1-console.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example1-curl.md create mode 100644 
solutions/search/get-started/_snippets/index-basics/example1-js.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example1-php.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example1-python.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example1-ruby.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example2-console.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example2-curl.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example2-js.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example2-php.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example2-python.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example2-ruby.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example3-console.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example3-curl.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example3-js.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example3-php.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example3-python.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example3-ruby.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example4-console.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example4-curl.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example4-js.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example4-php.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example4-python.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example4-ruby.md create mode 100644 
solutions/search/get-started/_snippets/index-basics/example5-console.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example5-curl.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example5-js.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example5-php.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example5-python.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example5-ruby.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example6-console.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example6-curl.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example6-js.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example6-php.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example6-python.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example6-ruby.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example7-console.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example7-curl.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example7-js.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example7-php.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example7-python.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example7-ruby.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example8-console.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example8-curl.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example8-js.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example8-php.md create mode 100644 
solutions/search/get-started/_snippets/index-basics/example8-python.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example8-ruby.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example9-console.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example9-curl.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example9-js.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example9-php.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example9-python.md create mode 100644 solutions/search/get-started/_snippets/index-basics/example9-ruby.md diff --git a/solutions/search/get-started/_snippets/index-basics/example1-console.md b/solutions/search/get-started/_snippets/index-basics/example1-console.md new file mode 100644 index 0000000000..2d87bd71bd --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example1-console.md @@ -0,0 +1,3 @@ +```console +PUT /books +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example1-curl.md b/solutions/search/get-started/_snippets/index-basics/example1-curl.md new file mode 100644 index 0000000000..374a189c3d --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example1-curl.md @@ -0,0 +1,4 @@ +```bash +curl -X PUT "$ELASTICSEARCH_URL/books" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example1-js.md b/solutions/search/get-started/_snippets/index-basics/example1-js.md new file mode 100644 index 0000000000..70b93e3f8c --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example1-js.md @@ -0,0 +1,19 @@ +```js +const { Client } = require("@elastic/elasticsearch"); + +const client = new Client({ + nodes: [process.env["ELASTICSEARCH_URL"]], + auth: { + apiKey: process.env["ELASTIC_API_KEY"], + }, +}); + +async function run() { + const response = await 
client.indices.create({ + index: "books", + }); + console.log(response); +} + +run(); +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example1-php.md b/solutions/search/get-started/_snippets/index-basics/example1-php.md new file mode 100644 index 0000000000..0ed4e5c862 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example1-php.md @@ -0,0 +1,18 @@ +```php +<?php + +require(__DIR__ . "/vendor/autoload.php"); + +use Elastic\Elasticsearch\ClientBuilder; + +$client = ClientBuilder::create() + ->setHosts([getenv("ELASTICSEARCH_URL")]) + ->setApiKey(getenv("ELASTIC_API_KEY")) + ->build(); + +$resp = $client->indices()->create([ + "index" => "books", +]); +echo $resp->asString(); + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example1-python.md b/solutions/search/get-started/_snippets/index-basics/example1-python.md new file mode 100644 index 0000000000..b200b3a942 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example1-python.md @@ -0,0 +1,15 @@ +```python +import os +from elasticsearch import Elasticsearch + +client = Elasticsearch( + hosts=[os.getenv("ELASTICSEARCH_URL")], + api_key=os.getenv("ELASTIC_API_KEY"), +) + +resp = client.indices.create( + index="books", +) +print(resp) + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example1-ruby.md b/solutions/search/get-started/_snippets/index-basics/example1-ruby.md new file mode 100644 index 0000000000..3238c841a3 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example1-ruby.md @@ -0,0 +1,14 @@ +```ruby +require "elasticsearch" + +client = Elasticsearch::Client.new( + host: ENV["ELASTICSEARCH_URL"], + api_key: ENV["ELASTIC_API_KEY"] +) + +response = client.indices.create( + index: "books" +) +print(response) + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example2-console.md b/solutions/search/get-started/_snippets/index-basics/example2-console.md new file mode 100644 index 0000000000..6cedc23194 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example2-console.md @@
-0,0 +1,9 @@ +```console +POST books/_doc +{ + "name": "Snow Crash", + "author": "Neal Stephenson", + "release_date": "1992-06-01", + "page_count": 470 +} +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example2-curl.md b/solutions/search/get-started/_snippets/index-basics/example2-curl.md new file mode 100644 index 0000000000..0bf3dcc8d8 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example2-curl.md @@ -0,0 +1,6 @@ +```bash +curl -X POST "$ELASTICSEARCH_URL/books/_doc" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{"name":"Snow Crash","author":"Neal Stephenson","release_date":"1992-06-01","page_count":470}' +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example2-js.md b/solutions/search/get-started/_snippets/index-basics/example2-js.md new file mode 100644 index 0000000000..5fb6dc8bca --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example2-js.md @@ -0,0 +1,12 @@ +```js +const response = await client.index({ + index: "books", + document: { + name: "Snow Crash", + author: "Neal Stephenson", + release_date: "1992-06-01", + page_count: 470, + }, +}); +console.log(response); +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example2-php.md b/solutions/search/get-started/_snippets/index-basics/example2-php.md new file mode 100644 index 0000000000..2ee1b6f42f --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example2-php.md @@ -0,0 +1,13 @@ +```php +$resp = $client->index([ + "index" => "books", + "body" => [ + "name" => "Snow Crash", + "author" => "Neal Stephenson", + "release_date" => "1992-06-01", + "page_count" => 470, + ], +]); +echo $resp->asString(); + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example2-python.md b/solutions/search/get-started/_snippets/index-basics/example2-python.md new file mode 100644 index 0000000000..a2c579f9c8 --- /dev/null +++ 
b/solutions/search/get-started/_snippets/index-basics/example2-python.md @@ -0,0 +1,13 @@ +```python +resp = client.index( + index="books", + document={ + "name": "Snow Crash", + "author": "Neal Stephenson", + "release_date": "1992-06-01", + "page_count": 470 + }, +) +print(resp) + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example2-ruby.md b/solutions/search/get-started/_snippets/index-basics/example2-ruby.md new file mode 100644 index 0000000000..f44dd39057 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example2-ruby.md @@ -0,0 +1,13 @@ +```ruby +response = client.index( + index: "books", + body: { + "name": "Snow Crash", + "author": "Neal Stephenson", + "release_date": "1992-06-01", + "page_count": 470 + } +) +print(response) + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example3-console.md b/solutions/search/get-started/_snippets/index-basics/example3-console.md new file mode 100644 index 0000000000..8b498e0e63 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example3-console.md @@ -0,0 +1,13 @@ +```console +POST /_bulk +{ "index" : { "_index" : "books" } } +{"name": "Revelation Space", "author": "Alastair Reynolds", "release_date": "2000-03-15", "page_count": 585} +{ "index" : { "_index" : "books" } } +{"name": "1984", "author": "George Orwell", "release_date": "1985-06-01", "page_count": 328} +{ "index" : { "_index" : "books" } } +{"name": "Fahrenheit 451", "author": "Ray Bradbury", "release_date": "1953-10-15", "page_count": 227} +{ "index" : { "_index" : "books" } } +{"name": "Brave New World", "author": "Aldous Huxley", "release_date": "1932-06-01", "page_count": 268} +{ "index" : { "_index" : "books" } } +{"name": "The Handmaids Tale", "author": "Margaret Atwood", "release_date": "1985-06-01", "page_count": 311} +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example3-curl.md
b/solutions/search/get-started/_snippets/index-basics/example3-curl.md new file mode 100644 index 0000000000..3e1681f18f --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example3-curl.md @@ -0,0 +1,6 @@ +```bash +curl -X POST "$ELASTICSEARCH_URL/_bulk" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" \ + -H "Content-Type: application/x-ndjson" \ + -d $'{"index":{"_index":"books"}}\n{"name":"Revelation Space","author":"Alastair Reynolds","release_date":"2000-03-15","page_count":585}\n{"index":{"_index":"books"}}\n{"name":"1984","author":"George Orwell","release_date":"1985-06-01","page_count":328}\n{"index":{"_index":"books"}}\n{"name":"Fahrenheit 451","author":"Ray Bradbury","release_date":"1953-10-15","page_count":227}\n{"index":{"_index":"books"}}\n{"name":"Brave New World","author":"Aldous Huxley","release_date":"1932-06-01","page_count":268}\n{"index":{"_index":"books"}}\n{"name":"The Handmaids Tale","author":"Margaret Atwood","release_date":"1985-06-01","page_count":311}\n' +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example3-js.md b/solutions/search/get-started/_snippets/index-basics/example3-js.md new file mode 100644 index 0000000000..9f2193d620 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example3-js.md @@ -0,0 +1,62 @@ +```js +const response = await client.bulk({ + operations: [ + { + index: { + _index: "books", + }, + }, + { + name: "Revelation Space", + author: "Alastair Reynolds", + release_date: "2000-03-15", + page_count: 585, + }, + { + index: { + _index: "books", + }, + }, + { + name: "1984", + author: "George Orwell", + release_date: "1985-06-01", + page_count: 328, + }, + { + index: { + _index: "books", + }, + }, + { + name: "Fahrenheit 451", + author: "Ray Bradbury", + release_date: "1953-10-15", + page_count: 227, + }, + { + index: { + _index: "books", + }, + }, + { + name: "Brave New World", + author: "Aldous Huxley", + release_date: "1932-06-01", + page_count: 268, + 
}, + { + index: { + _index: "books", + }, + }, + { + name: "The Handmaids Tale", + author: "Margaret Atwood", + release_date: "1985-06-01", + page_count: 311, + }, + ], +}); +console.log(response); +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example3-php.md b/solutions/search/get-started/_snippets/index-basics/example3-php.md new file mode 100644 index 0000000000..31c61af636 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example3-php.md @@ -0,0 +1,63 @@ +```php +$resp = $client->bulk([ + "body" => array( + [ + "index" => [ + "_index" => "books", + ], + ], + [ + "name" => "Revelation Space", + "author" => "Alastair Reynolds", + "release_date" => "2000-03-15", + "page_count" => 585, + ], + [ + "index" => [ + "_index" => "books", + ], + ], + [ + "name" => "1984", + "author" => "George Orwell", + "release_date" => "1985-06-01", + "page_count" => 328, + ], + [ + "index" => [ + "_index" => "books", + ], + ], + [ + "name" => "Fahrenheit 451", + "author" => "Ray Bradbury", + "release_date" => "1953-10-15", + "page_count" => 227, + ], + [ + "index" => [ + "_index" => "books", + ], + ], + [ + "name" => "Brave New World", + "author" => "Aldous Huxley", + "release_date" => "1932-06-01", + "page_count" => 268, + ], + [ + "index" => [ + "_index" => "books", + ], + ], + [ + "name" => "The Handmaids Tale", + "author" => "Margaret Atwood", + "release_date" => "1985-06-01", + "page_count" => 311, + ], + ), +]); +echo $resp->asString(); + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example3-python.md b/solutions/search/get-started/_snippets/index-basics/example3-python.md new file mode 100644 index 0000000000..985f5f349a --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example3-python.md @@ -0,0 +1,63 @@ +```python +resp = client.bulk( + operations=[ + { + "index": { + "_index": "books" + } + }, + { + "name": "Revelation Space", + "author": "Alastair Reynolds", + "release_date": 
"2000-03-15", + "page_count": 585 + }, + { + "index": { + "_index": "books" + } + }, + { + "name": "1984", + "author": "George Orwell", + "release_date": "1985-06-01", + "page_count": 328 + }, + { + "index": { + "_index": "books" + } + }, + { + "name": "Fahrenheit 451", + "author": "Ray Bradbury", + "release_date": "1953-10-15", + "page_count": 227 + }, + { + "index": { + "_index": "books" + } + }, + { + "name": "Brave New World", + "author": "Aldous Huxley", + "release_date": "1932-06-01", + "page_count": 268 + }, + { + "index": { + "_index": "books" + } + }, + { + "name": "The Handmaids Tale", + "author": "Margaret Atwood", + "release_date": "1985-06-01", + "page_count": 311 + } + ], +) +print(resp) + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example3-ruby.md b/solutions/search/get-started/_snippets/index-basics/example3-ruby.md new file mode 100644 index 0000000000..559bce5355 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example3-ruby.md @@ -0,0 +1,63 @@ +```ruby +response = client.bulk( + body: [ + { + "index": { + "_index": "books" + } + }, + { + "name": "Revelation Space", + "author": "Alastair Reynolds", + "release_date": "2000-03-15", + "page_count": 585 + }, + { + "index": { + "_index": "books" + } + }, + { + "name": "1984", + "author": "George Orwell", + "release_date": "1985-06-01", + "page_count": 328 + }, + { + "index": { + "_index": "books" + } + }, + { + "name": "Fahrenheit 451", + "author": "Ray Bradbury", + "release_date": "1953-10-15", + "page_count": 227 + }, + { + "index": { + "_index": "books" + } + }, + { + "name": "Brave New World", + "author": "Aldous Huxley", + "release_date": "1932-06-01", + "page_count": 268 + }, + { + "index": { + "_index": "books" + } + }, + { + "name": "The Handmaids Tale", + "author": "Margaret Atwood", + "release_date": "1985-06-01", + "page_count": 311 + } + ] +) +print(resp) + +``` diff --git 
a/solutions/search/get-started/_snippets/index-basics/example4-console.md b/solutions/search/get-started/_snippets/index-basics/example4-console.md new file mode 100644 index 0000000000..1d3684a64c --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example4-console.md @@ -0,0 +1,12 @@ +```console +POST /books/_doc +{ + "name": "The Great Gatsby", + "author": "F. Scott Fitzgerald", + "release_date": "1925-04-10", + "page_count": 180, + "language": "EN" <1> +} +``` + +1. The new field. diff --git a/solutions/search/get-started/_snippets/index-basics/example4-curl.md b/solutions/search/get-started/_snippets/index-basics/example4-curl.md new file mode 100644 index 0000000000..e6d8aef882 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example4-curl.md @@ -0,0 +1,6 @@ +```bash +curl -X POST "$ELASTICSEARCH_URL/books/_doc" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{"name":"The Great Gatsby","author":"F. Scott Fitzgerald","release_date":"1925-04-10","page_count":180,"language":"EN"}' +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example4-js.md b/solutions/search/get-started/_snippets/index-basics/example4-js.md new file mode 100644 index 0000000000..726fcbfed0 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example4-js.md @@ -0,0 +1,13 @@ +```js +const response = await client.index({ + index: "books", + document: { + name: "The Great Gatsby", + author: "F. 
Scott Fitzgerald", + release_date: "1925-04-10", + page_count: 180, + language: "EN", + }, +}); +console.log(response); +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example4-php.md b/solutions/search/get-started/_snippets/index-basics/example4-php.md new file mode 100644 index 0000000000..585eabcfbe --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example4-php.md @@ -0,0 +1,14 @@ +```php +$resp = $client->index([ + "index" => "books", + "body" => [ + "name" => "The Great Gatsby", + "author" => "F. Scott Fitzgerald", + "release_date" => "1925-04-10", + "page_count" => 180, + "language" => "EN", + ], +]); +echo $resp->asString(); + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example4-python.md b/solutions/search/get-started/_snippets/index-basics/example4-python.md new file mode 100644 index 0000000000..efda319d4b --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example4-python.md @@ -0,0 +1,14 @@ +```python +resp = client.index( + index="books", + document={ + "name": "The Great Gatsby", + "author": "F. Scott Fitzgerald", + "release_date": "1925-04-10", + "page_count": 180, + "language": "EN" + }, +) +print(resp) + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example4-ruby.md b/solutions/search/get-started/_snippets/index-basics/example4-ruby.md new file mode 100644 index 0000000000..92f4db46cd --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example4-ruby.md @@ -0,0 +1,14 @@ +```ruby +response = client.index( + index: "books", + body: { + "name": "The Great Gatsby", + "author": "F. 
Scott Fitzgerald", + "release_date": "1925-04-10", + "page_count": 180, + "language": "EN" + } +) +print(response) + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example5-console.md b/solutions/search/get-started/_snippets/index-basics/example5-console.md new file mode 100644 index 0000000000..5b277de59d --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example5-console.md @@ -0,0 +1,3 @@ +```console +GET /books/_mapping +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example5-curl.md b/solutions/search/get-started/_snippets/index-basics/example5-curl.md new file mode 100644 index 0000000000..3cf1906155 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example5-curl.md @@ -0,0 +1,4 @@ +```bash +curl -X GET "$ELASTICSEARCH_URL/books/_mapping" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example5-js.md b/solutions/search/get-started/_snippets/index-basics/example5-js.md new file mode 100644 index 0000000000..17241ee766 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example5-js.md @@ -0,0 +1,6 @@ +```js +const response = await client.indices.getMapping({ + index: "books", +}); +console.log(response); +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example5-php.md b/solutions/search/get-started/_snippets/index-basics/example5-php.md new file mode 100644 index 0000000000..b90dc7349d --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example5-php.md @@ -0,0 +1,7 @@ +```php +$resp = $client->indices()->getMapping([ + "index" => "books", +]); +echo $resp->asString(); + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example5-python.md b/solutions/search/get-started/_snippets/index-basics/example5-python.md new file mode 100644 index 0000000000..ce23c1383a --- /dev/null +++ 
b/solutions/search/get-started/_snippets/index-basics/example5-python.md @@ -0,0 +1,7 @@ +```python +resp = client.indices.get_mapping( + index="books", +) +print(resp) + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example5-ruby.md b/solutions/search/get-started/_snippets/index-basics/example5-ruby.md new file mode 100644 index 0000000000..7caba1f334 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example5-ruby.md @@ -0,0 +1,7 @@ +```ruby +response = client.indices.get_mapping( + index: "books" +) +print(response) + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example6-console.md b/solutions/search/get-started/_snippets/index-basics/example6-console.md new file mode 100644 index 0000000000..ec26471362 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example6-console.md @@ -0,0 +1,17 @@ +```console +PUT /my-explicit-mappings-books +{ + "mappings": { + "dynamic": false, <1> + "properties": { <2> + "name": { "type": "text" }, + "author": { "type": "text" }, + "release_date": { "type": "date", "format": "yyyy-MM-dd" }, + "page_count": { "type": "integer" } + } + } +} +``` + +1. `dynamic`: Turns off dynamic mapping for the index. If you don't define fields in the mapping, they'll still be stored in the document's `_source` field, but you can't index or search them. +2. `properties`: Defines the fields and their corresponding data types. 
diff --git a/solutions/search/get-started/_snippets/index-basics/example6-curl.md b/solutions/search/get-started/_snippets/index-basics/example6-curl.md new file mode 100644 index 0000000000..78dfc71151 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example6-curl.md @@ -0,0 +1,6 @@ +```bash +curl -X PUT "$ELASTICSEARCH_URL/my-explicit-mappings-books" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{"mappings":{"dynamic":false,"properties":{"name":{"type":"text"},"author":{"type":"text"},"release_date":{"type":"date","format":"yyyy-MM-dd"},"page_count":{"type":"integer"}}}}' +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example6-js.md b/solutions/search/get-started/_snippets/index-basics/example6-js.md new file mode 100644 index 0000000000..26195f6168 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example6-js.md @@ -0,0 +1,24 @@ +```js +const response = await client.indices.create({ + index: "my-explicit-mappings-books", + mappings: { + dynamic: false, + properties: { + name: { + type: "text", + }, + author: { + type: "text", + }, + release_date: { + type: "date", + format: "yyyy-MM-dd", + }, + page_count: { + type: "integer", + }, + }, + }, +}); +console.log(response); +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example6-php.md b/solutions/search/get-started/_snippets/index-basics/example6-php.md new file mode 100644 index 0000000000..3fcc181056 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example6-php.md @@ -0,0 +1,27 @@ +```php +$resp = $client->indices()->create([ + "index" => "my-explicit-mappings-books", + "body" => [ + "mappings" => [ + "dynamic" => false, + "properties" => [ + "name" => [ + "type" => "text", + ], + "author" => [ + "type" => "text", + ], + "release_date" => [ + "type" => "date", + "format" => "yyyy-MM-dd", + ], + "page_count" => [ + "type" => "integer", + ], + ], + ], + 
], +]); +echo $resp->asString(); + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example6-python.md b/solutions/search/get-started/_snippets/index-basics/example6-python.md new file mode 100644 index 0000000000..077ec46228 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example6-python.md @@ -0,0 +1,25 @@ +```python +resp = client.indices.create( + index="my-explicit-mappings-books", + mappings={ + "dynamic": False, + "properties": { + "name": { + "type": "text" + }, + "author": { + "type": "text" + }, + "release_date": { + "type": "date", + "format": "yyyy-MM-dd" + }, + "page_count": { + "type": "integer" + } + } + }, +) +print(resp) + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example6-ruby.md b/solutions/search/get-started/_snippets/index-basics/example6-ruby.md new file mode 100644 index 0000000000..8793f0ab6b --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example6-ruby.md @@ -0,0 +1,27 @@ +```ruby +response = client.indices.create( + index: "my-explicit-mappings-books", + body: { + "mappings": { + "dynamic": false, + "properties": { + "name": { + "type": "text" + }, + "author": { + "type": "text" + }, + "release_date": { + "type": "date", + "format": "yyyy-MM-dd" + }, + "page_count": { + "type": "integer" + } + } + } + } +) +print(response) + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example7-console.md b/solutions/search/get-started/_snippets/index-basics/example7-console.md new file mode 100644 index 0000000000..3f3337a619 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example7-console.md @@ -0,0 +1,3 @@ +```console +GET books/_search +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example7-curl.md b/solutions/search/get-started/_snippets/index-basics/example7-curl.md new file mode 100644 index 0000000000..c5f98747dd --- /dev/null +++ 
b/solutions/search/get-started/_snippets/index-basics/example7-curl.md @@ -0,0 +1,4 @@ +```bash +curl -X GET "$ELASTICSEARCH_URL/books/_search" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example7-js.md b/solutions/search/get-started/_snippets/index-basics/example7-js.md new file mode 100644 index 0000000000..aabfb858d4 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example7-js.md @@ -0,0 +1,6 @@ +```js +const response = await client.search({ + index: "books", +}); +console.log(response); +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example7-php.md b/solutions/search/get-started/_snippets/index-basics/example7-php.md new file mode 100644 index 0000000000..b54427705a --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example7-php.md @@ -0,0 +1,7 @@ +```php +$resp = $client->search([ + "index" => "books", +]); +echo $resp->asString(); + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example7-python.md b/solutions/search/get-started/_snippets/index-basics/example7-python.md new file mode 100644 index 0000000000..0518f95aa1 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example7-python.md @@ -0,0 +1,7 @@ +```python +resp = client.search( + index="books", +) +print(resp) + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example7-ruby.md b/solutions/search/get-started/_snippets/index-basics/example7-ruby.md new file mode 100644 index 0000000000..176fe908c5 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example7-ruby.md @@ -0,0 +1,7 @@ +```ruby +response = client.search( + index: "books" +) +print(response) + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example8-console.md b/solutions/search/get-started/_snippets/index-basics/example8-console.md new file mode 100644 index 0000000000..57cf0c4f71 --- /dev/null +++ 
b/solutions/search/get-started/_snippets/index-basics/example8-console.md @@ -0,0 +1,10 @@ +```console +GET books/_search +{ + "query": { + "match": { + "name": "brave" + } + } +} +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example8-curl.md b/solutions/search/get-started/_snippets/index-basics/example8-curl.md new file mode 100644 index 0000000000..baebefe060 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example8-curl.md @@ -0,0 +1,6 @@ +```bash +curl -X GET "$ELASTICSEARCH_URL/books/_search" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{"query":{"match":{"name":"brave"}}}' +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example8-js.md b/solutions/search/get-started/_snippets/index-basics/example8-js.md new file mode 100644 index 0000000000..a1cf90f147 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example8-js.md @@ -0,0 +1,11 @@ +```js +const response = await client.search({ + index: "books", + query: { + match: { + name: "brave", + }, + }, +}); +console.log(response); +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example8-php.md b/solutions/search/get-started/_snippets/index-basics/example8-php.md new file mode 100644 index 0000000000..64764ce976 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example8-php.md @@ -0,0 +1,14 @@ +```php +$resp = $client->search([ + "index" => "books", + "body" => [ + "query" => [ + "match" => [ + "name" => "brave", + ], + ], + ], +]); +echo $resp->asString(); + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example8-python.md b/solutions/search/get-started/_snippets/index-basics/example8-python.md new file mode 100644 index 0000000000..89719370a6 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example8-python.md @@ -0,0 +1,12 @@ +```python +resp = client.search( + index="books", + query={ + 
"match": { + "name": "brave" + } + }, +) +print(resp) + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example8-ruby.md b/solutions/search/get-started/_snippets/index-basics/example8-ruby.md new file mode 100644 index 0000000000..084d4032b4 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example8-ruby.md @@ -0,0 +1,14 @@ +```ruby +response = client.search( + index: "books", + body: { + "query": { + "match": { + "name": "brave" + } + } + } +) +print(resp) + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example9-console.md b/solutions/search/get-started/_snippets/index-basics/example9-console.md new file mode 100644 index 0000000000..43bca616f6 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example9-console.md @@ -0,0 +1,4 @@ +```console +DELETE /books +DELETE /my-explicit-mappings-books +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example9-curl.md b/solutions/search/get-started/_snippets/index-basics/example9-curl.md new file mode 100644 index 0000000000..586b630bbb --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example9-curl.md @@ -0,0 +1,6 @@ +```bash +curl -X DELETE "$ELASTICSEARCH_URL/books" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" +curl -X DELETE "$ELASTICSEARCH_URL/my-explicit-mappings-books" \ + -H "Authorization: ApiKey $ELASTIC_API_KEY" +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example9-js.md b/solutions/search/get-started/_snippets/index-basics/example9-js.md new file mode 100644 index 0000000000..f7a6e5e2e1 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example9-js.md @@ -0,0 +1,11 @@ +```js +const response = await client.indices.delete({ + index: "books", +}); +console.log(response); + +const response1 = await client.indices.delete({ + index: "my-explicit-mappings-books", +}); +console.log(response1); +``` diff --git 
a/solutions/search/get-started/_snippets/index-basics/example9-php.md b/solutions/search/get-started/_snippets/index-basics/example9-php.md new file mode 100644 index 0000000000..4c7ef097bb --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example9-php.md @@ -0,0 +1,12 @@ +```php +$resp = $client->indices()->delete([ + "index" => "books", +]); +echo $resp->asString(); + +$resp1 = $client->indices()->delete([ + "index" => "my-explicit-mappings-books", +]); +echo $resp1->asString(); + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example9-python.md b/solutions/search/get-started/_snippets/index-basics/example9-python.md new file mode 100644 index 0000000000..d1558b7698 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example9-python.md @@ -0,0 +1,12 @@ +```python +resp = client.indices.delete( + index="books", +) +print(resp) + +resp1 = client.indices.delete( + index="my-explicit-mappings-books", +) +print(resp1) + +``` diff --git a/solutions/search/get-started/_snippets/index-basics/example9-ruby.md b/solutions/search/get-started/_snippets/index-basics/example9-ruby.md new file mode 100644 index 0000000000..6a99573e41 --- /dev/null +++ b/solutions/search/get-started/_snippets/index-basics/example9-ruby.md @@ -0,0 +1,12 @@ +```ruby +response = client.indices.delete( + index: "books" +) +print(response) + +response1 = client.indices.delete( + index: "my-explicit-mappings-books" +) +print(response1) + +``` From f7065191f970943397542b3df67b7ca201dc5926 Mon Sep 17 00:00:00 2001 From: Liam Thompson <32779855+leemthompo@users.noreply.github.com> Date: Fri, 14 Nov 2025 12:42:07 +0100 Subject: [PATCH 4/7] actually update index-basics.md --- solutions/search/get-started/index-basics.md | 819 ++----------------- 1 file changed, 80 insertions(+), 739 deletions(-) diff --git a/solutions/search/get-started/index-basics.md b/solutions/search/get-started/index-basics.md index 13889920a6..9b2aaf6434 100644 --- 
a/solutions/search/get-started/index-basics.md +++ b/solutions/search/get-started/index-basics.md @@ -37,95 +37,38 @@ Create a new index named `books`: :::{tab-item} Console :sync: console -```console -PUT /books -``` + +:::{include} _snippets/index-basics/example1-console.md ::: :::{tab-item} curl :sync: curl -```bash -curl -X PUT "$ELASTICSEARCH_URL/books" \ - -H "Authorization: ApiKey $ELASTIC_API_KEY" -``` + +:::{include} _snippets/index-basics/example1-curl.md ::: :::{tab-item} Python :sync: python -```python -import os -from elasticsearch import Elasticsearch -client = Elasticsearch( - hosts=[os.getenv("ELASTICSEARCH_URL")], - api_key=os.getenv("ELASTIC_API_KEY"), -) - -resp = client.indices.create( - index="books", -) - -``` +:::{include} _snippets/index-basics/example1-python.md ::: :::{tab-item} JavaScript :sync: js -```js -const { Client } = require("@elastic/elasticsearch"); -const client = new Client({ - nodes: [process.env["ELASTICSEARCH_URL"]], - auth: { - apiKey: process.env["ELASTIC_API_KEY"], - }, -}); - -async function run() { - const response = await client.indices.create({ - index: "books", - }); -} - -run(); -``` +:::{include} _snippets/index-basics/example1-js.md ::: :::{tab-item} PHP :sync: php -```php -setHosts([getenv("ELASTICSEARCH_URL")]) - ->setApiKey(getenv("ELASTIC_API_KEY")) - ->build(); - -$resp = $client->indices()->create([ - "index" => "books", -]); - -``` +:::{include} _snippets/index-basics/example1-php.md ::: :::{tab-item} Ruby :sync: ruby -```ruby -require "elasticsearch" - -client = Elasticsearch::Client.new( - host: ENV["ELASTICSEARCH_URL"], - api_key: ENV["ELASTIC_API_KEY"] -) -response = client.indices.create( - index: "books" -) - -``` +:::{include} _snippets/index-basics/example1-ruby.md ::: :::: @@ -154,88 +97,38 @@ If the index doesn't already exist, this request will automatically create it. 
:::{tab-item} Console :sync: console -```console -POST books/_doc -{ - "name": "Snow Crash", - "author": "Neal Stephenson", - "release_date": "1992-06-01", - "page_count": 470 -} -``` + +:::{include} _snippets/index-basics/example2-console.md ::: :::{tab-item} curl :sync: curl -```bash -curl -X POST "$ELASTICSEARCH_URL/books/_doc" \ - -H "Authorization: ApiKey $ELASTIC_API_KEY" \ - -H "Content-Type: application/json" \ - -d '{"name":"Snow Crash","author":"Neal Stephenson","release_date":"1992-06-01","page_count":470}' -``` + +:::{include} _snippets/index-basics/example2-curl.md ::: :::{tab-item} Python :sync: python -```python -resp = client.index( - index="books", - document={ - "name": "Snow Crash", - "author": "Neal Stephenson", - "release_date": "1992-06-01", - "page_count": 470 - }, -) -``` +:::{include} _snippets/index-basics/example2-python.md ::: :::{tab-item} JavaScript :sync: js -```js -const response = await client.index({ - index: "books", - document: { - name: "Snow Crash", - author: "Neal Stephenson", - release_date: "1992-06-01", - page_count: 470, - }, -}); -``` + +:::{include} _snippets/index-basics/example2-js.md ::: :::{tab-item} PHP :sync: php -```php -$resp = $client->index([ - "index" => "books", - "body" => [ - "name" => "Snow Crash", - "author" => "Neal Stephenson", - "release_date" => "1992-06-01", - "page_count" => 470, - ], -]); -``` +:::{include} _snippets/index-basics/example2-php.md ::: :::{tab-item} Ruby :sync: ruby -```ruby -response = client.index( - index: "books", - body: { - "name": "Snow Crash", - "author": "Neal Stephenson", - "release_date": "1992-06-01", - "page_count": 470 - } -) -``` +:::{include} _snippets/index-basics/example2-ruby.md ::: :::: @@ -282,301 +175,38 @@ Bulk data must be formatted as newline-delimited JSON (NDJSON). 
:::{tab-item} Console :sync: console -```console -POST /_bulk -{ "index" : { "_index" : "books" } } -{"name": "Revelation Space", "author": "Alastair Reynolds", "release_date": "2000-03-15", "page_count": 585} -{ "index" : { "_index" : "books" } } -{"name": "1984", "author": "George Orwell", "release_date": "1985-06-01", "page_count": 328} -{ "index" : { "_index" : "books" } } -{"name": "Fahrenheit 451", "author": "Ray Bradbury", "release_date": "1953-10-15", "page_count": 227} -{ "index" : { "_index" : "books" } } -{"name": "Brave New World", "author": "Aldous Huxley", "release_date": "1932-06-01", "page_count": 268} -{ "index" : { "_index" : "books" } } -{"name": "The Handmaids Tale", "author": "Margaret Atwood", "release_date": "1985-06-01", "page_count": 311} -``` + +:::{include} _snippets/index-basics/example3-console.md ::: :::{tab-item} curl :sync: curl -```bash -curl -X POST "$ELASTICSEARCH_URL/_bulk" \ - -H "Authorization: ApiKey $ELASTIC_API_KEY" \ - -H "Content-Type: application/json" \ - -d $'{"index":{"_index":"books"}} -{"name":"Revelation Space","author":"Alastair Reynolds","release_date":"2000-03-15","page_count":585} -{"index":{"_index":"books"}} -{"name":"1984","author":"George Orwell","release_date":"1985-06-01","page_count":328} -{"index":{"_index":"books"}} -{"name":"Fahrenheit 451","author":"Ray Bradbury","release_date":"1953-10-15","page_count":227} -{"index":{"_index":"books"}} -{"name":"Brave New World","author":"Aldous Huxley","release_date":"1932-06-01","page_count":268} -{"index":{"_index":"books"}} -{"name":"The Handmaids Tale","author":"Margaret Atwood","release_date":"1985-06-01","page_count":311}\n' -``` + +:::{include} _snippets/index-basics/example3-curl.md ::: :::{tab-item} Python :sync: python -```python -resp = client.bulk( - operations=[ - { - "index": { - "_index": "books" - } - }, - { - "name": "Revelation Space", - "author": "Alastair Reynolds", - "release_date": "2000-03-15", - "page_count": 585 - }, - { - "index": { - 
"_index": "books" - } - }, - { - "name": "1984", - "author": "George Orwell", - "release_date": "1985-06-01", - "page_count": 328 - }, - { - "index": { - "_index": "books" - } - }, - { - "name": "Fahrenheit 451", - "author": "Ray Bradbury", - "release_date": "1953-10-15", - "page_count": 227 - }, - { - "index": { - "_index": "books" - } - }, - { - "name": "Brave New World", - "author": "Aldous Huxley", - "release_date": "1932-06-01", - "page_count": 268 - }, - { - "index": { - "_index": "books" - } - }, - { - "name": "The Handmaids Tale", - "author": "Margaret Atwood", - "release_date": "1985-06-01", - "page_count": 311 - } - ], -) -``` +:::{include} _snippets/index-basics/example3-python.md ::: :::{tab-item} JavaScript :sync: js -```js -const response = await client.bulk({ - operations: [ - { - index: { - _index: "books", - }, - }, - { - name: "Revelation Space", - author: "Alastair Reynolds", - release_date: "2000-03-15", - page_count: 585, - }, - { - index: { - _index: "books", - }, - }, - { - name: "1984", - author: "George Orwell", - release_date: "1985-06-01", - page_count: 328, - }, - { - index: { - _index: "books", - }, - }, - { - name: "Fahrenheit 451", - author: "Ray Bradbury", - release_date: "1953-10-15", - page_count: 227, - }, - { - index: { - _index: "books", - }, - }, - { - name: "Brave New World", - author: "Aldous Huxley", - release_date: "1932-06-01", - page_count: 268, - }, - { - index: { - _index: "books", - }, - }, - { - name: "The Handmaids Tale", - author: "Margaret Atwood", - release_date: "1985-06-01", - page_count: 311, - }, - ], -}); -``` + +:::{include} _snippets/index-basics/example3-js.md ::: :::{tab-item} PHP :sync: php -```php -$resp = $client->bulk([ - "body" => array( - [ - "index" => [ - "_index" => "books", - ], - ], - [ - "name" => "Revelation Space", - "author" => "Alastair Reynolds", - "release_date" => "2000-03-15", - "page_count" => 585, - ], - [ - "index" => [ - "_index" => "books", - ], - ], - [ - "name" => "1984", - 
"author" => "George Orwell", - "release_date" => "1985-06-01", - "page_count" => 328, - ], - [ - "index" => [ - "_index" => "books", - ], - ], - [ - "name" => "Fahrenheit 451", - "author" => "Ray Bradbury", - "release_date" => "1953-10-15", - "page_count" => 227, - ], - [ - "index" => [ - "_index" => "books", - ], - ], - [ - "name" => "Brave New World", - "author" => "Aldous Huxley", - "release_date" => "1932-06-01", - "page_count" => 268, - ], - [ - "index" => [ - "_index" => "books", - ], - ], - [ - "name" => "The Handmaids Tale", - "author" => "Margaret Atwood", - "release_date" => "1985-06-01", - "page_count" => 311, - ], - ), -]); -``` +:::{include} _snippets/index-basics/example3-php.md ::: :::{tab-item} Ruby :sync: ruby -```ruby -response = client.bulk( - body: [ - { - "index": { - "_index": "books" - } - }, - { - "name": "Revelation Space", - "author": "Alastair Reynolds", - "release_date": "2000-03-15", - "page_count": 585 - }, - { - "index": { - "_index": "books" - } - }, - { - "name": "1984", - "author": "George Orwell", - "release_date": "1985-06-01", - "page_count": 328 - }, - { - "index": { - "_index": "books" - } - }, - { - "name": "Fahrenheit 451", - "author": "Ray Bradbury", - "release_date": "1953-10-15", - "page_count": 227 - }, - { - "index": { - "_index": "books" - } - }, - { - "name": "Brave New World", - "author": "Aldous Huxley", - "release_date": "1932-06-01", - "page_count": 268 - }, - { - "index": { - "_index": "books" - } - }, - { - "name": "The Handmaids Tale", - "author": "Margaret Atwood", - "release_date": "1985-06-01", - "page_count": 311 - } - ] -) -``` +:::{include} _snippets/index-basics/example3-ruby.md ::: :::: @@ -690,94 +320,38 @@ To see how dynamic mapping works, add a new document to the `books` index with a :::{tab-item} Console :sync: console -```console -POST /books/_doc -{ - "name": "The Great Gatsby", - "author": "F. 
Scott Fitzgerald", - "release_date": "1925-04-10", - "page_count": 180, - "language": "EN" <1> -} -``` -1. The new field. + +:::{include} _snippets/index-basics/example4-console.md ::: :::{tab-item} curl :sync: curl -```bash -curl -X POST "$ELASTICSEARCH_URL/books/_doc" \ - -H "Authorization: ApiKey $ELASTIC_API_KEY" \ - -H "Content-Type: application/json" \ - -d '{"name":"The Great Gatsby","author":"F. Scott Fitzgerald","release_date":"1925-04-10","page_count":180,"language":"EN"}' -``` + +:::{include} _snippets/index-basics/example4-curl.md ::: :::{tab-item} Python :sync: python -```python -resp = client.index( - index="books", - document={ - "name": "The Great Gatsby", - "author": "F. Scott Fitzgerald", - "release_date": "1925-04-10", - "page_count": 180, - "language": "EN" - }, -) -``` +:::{include} _snippets/index-basics/example4-python.md ::: :::{tab-item} JavaScript :sync: js -```js -const response = await client.index({ - index: "books", - document: { - name: "The Great Gatsby", - author: "F. Scott Fitzgerald", - release_date: "1925-04-10", - page_count: 180, - language: "EN", - }, -}); -``` + +:::{include} _snippets/index-basics/example4-js.md ::: :::{tab-item} PHP :sync: php -```php -$resp = $client->index([ - "index" => "books", - "body" => [ - "name" => "The Great Gatsby", - "author" => "F. Scott Fitzgerald", - "release_date" => "1925-04-10", - "page_count" => 180, - "language" => "EN", - ], -]); -``` +:::{include} _snippets/index-basics/example4-php.md ::: :::{tab-item} Ruby :sync: ruby -```ruby -response = client.index( - index: "books", - body: { - "name": "The Great Gatsby", - "author": "F. Scott Fitzgerald", - "release_date": "1925-04-10", - "page_count": 180, - "language": "EN" - } -) -``` +:::{include} _snippets/index-basics/example4-ruby.md ::: :::: @@ -790,56 +364,38 @@ The new field `language` has been added to the mapping with a `text` data type. 
:::{tab-item} Console :sync: console -```console -GET /books/_mapping -``` + +:::{include} _snippets/index-basics/example5-console.md ::: :::{tab-item} curl :sync: curl -```bash -curl -X GET "$ELASTICSEARCH_URL/books/_mapping" \ - -H "Authorization: ApiKey $ELASTIC_API_KEY" -``` + +:::{include} _snippets/index-basics/example5-curl.md ::: :::{tab-item} Python :sync: python -```python -resp = client.indices.get_mapping( - index="books", -) -``` +:::{include} _snippets/index-basics/example5-python.md ::: :::{tab-item} JavaScript :sync: js -```js -const response = await client.indices.getMapping({ - index: "books", -}); -``` + +:::{include} _snippets/index-basics/example5-js.md ::: :::{tab-item} PHP :sync: php -```php -$resp = $client->indices()->getMapping([ - "index" => "books", -]); -``` +:::{include} _snippets/index-basics/example5-php.md ::: :::{tab-item} Ruby :sync: ruby -```ruby -response = client.indices.get_mapping( - index: "books" -) -``` +:::{include} _snippets/index-basics/example5-ruby.md ::: :::: @@ -903,147 +459,38 @@ This object should contain the [field data type](elasticsearch://reference/elast :::{tab-item} Console :sync: console -```console -PUT /my-explicit-mappings-books -{ - "mappings": { - "dynamic": false, <1> - "properties": { <2> - "name": { "type": "text" }, - "author": { "type": "text" }, - "release_date": { "type": "date", "format": "yyyy-MM-dd" }, - "page_count": { "type": "integer" } - } - } -} -``` -1. `dynamic`: Turns off dynamic mapping for the index. If you don't define fields in the mapping, they'll still be stored in the document's `_source` field, but you can't index or search them. -2. `properties`: Defines the fields and their corresponding data types. 
+ +:::{include} _snippets/index-basics/example6-console.md ::: :::{tab-item} curl :sync: curl -```bash -curl -X PUT "$ELASTICSEARCH_URL/my-explicit-mappings-books" \ - -H "Authorization: ApiKey $ELASTIC_API_KEY" \ - -H "Content-Type: application/json" \ - -d '{"mappings":{"dynamic":false,"properties":{"name":{"type":"text"},"author":{"type":"text"},"release_date":{"type":"date","format":"yyyy-MM-dd"},"page_count":{"type":"integer"}}}}' -``` + +:::{include} _snippets/index-basics/example6-curl.md ::: :::{tab-item} Python :sync: python -```python -resp = client.indices.create( - index="my-explicit-mappings-books", - mappings={ - "dynamic": False, - "properties": { - "name": { - "type": "text" - }, - "author": { - "type": "text" - }, - "release_date": { - "type": "date", - "format": "yyyy-MM-dd" - }, - "page_count": { - "type": "integer" - } - } - }, -) -``` +:::{include} _snippets/index-basics/example6-python.md ::: :::{tab-item} JavaScript :sync: js -```js -const response = await client.indices.create({ - index: "my-explicit-mappings-books", - mappings: { - dynamic: false, - properties: { - name: { - type: "text", - }, - author: { - type: "text", - }, - release_date: { - type: "date", - format: "yyyy-MM-dd", - }, - page_count: { - type: "integer", - }, - }, - }, -}); -``` + +:::{include} _snippets/index-basics/example6-js.md ::: :::{tab-item} PHP :sync: php -```php -$resp = $client->indices()->create([ - "index" => "my-explicit-mappings-books", - "body" => [ - "mappings" => [ - "dynamic" => false, - "properties" => [ - "name" => [ - "type" => "text", - ], - "author" => [ - "type" => "text", - ], - "release_date" => [ - "type" => "date", - "format" => "yyyy-MM-dd", - ], - "page_count" => [ - "type" => "integer", - ], - ], - ], - ], -]); -``` +:::{include} _snippets/index-basics/example6-php.md ::: :::{tab-item} Ruby :sync: ruby -```ruby -response = client.indices.create( - index: "my-explicit-mappings-books", - body: { - "mappings": { - "dynamic": false, - 
"properties": { - "name": { - "type": "text" - }, - "author": { - "type": "text" - }, - "release_date": { - "type": "date", - "format": "yyyy-MM-dd" - }, - "page_count": { - "type": "integer" - } - } - } - } -) -``` +:::{include} _snippets/index-basics/example6-ruby.md ::: :::: @@ -1080,56 +527,38 @@ Use the following request to search all documents in the `books` index: :::{tab-item} Console :sync: console -```console -GET books/_search -``` + +:::{include} _snippets/index-basics/example7-console.md ::: :::{tab-item} curl :sync: curl -```bash -curl -X GET "$ELASTICSEARCH_URL/books/_search" \ - -H "Authorization: ApiKey $ELASTIC_API_KEY" -``` + +:::{include} _snippets/index-basics/example7-curl.md ::: :::{tab-item} Python :sync: python -```python -resp = client.search( - index="books", -) -``` +:::{include} _snippets/index-basics/example7-python.md ::: :::{tab-item} JavaScript :sync: js -```js -const response = await client.search({ - index: "books", -}); -``` + +:::{include} _snippets/index-basics/example7-js.md ::: :::{tab-item} PHP :sync: php -```php -$resp = $client->search([ - "index" => "books", -]); -``` +:::{include} _snippets/index-basics/example7-php.md ::: :::{tab-item} Ruby :sync: ruby -```ruby -response = client.search( - index: "books" -) -``` +:::{include} _snippets/index-basics/example7-ruby.md ::: :::: @@ -1193,89 +622,38 @@ Use the following request to search the `books` index for documents containing ` :::{tab-item} Console :sync: console -```console -GET books/_search -{ - "query": { - "match": { - "name": "brave" - } - } -} -``` + +:::{include} _snippets/index-basics/example8-console.md ::: :::{tab-item} curl :sync: curl -```bash -curl -X GET "$ELASTICSEARCH_URL/books/_search" \ - -H "Authorization: ApiKey $ELASTIC_API_KEY" \ - -H "Content-Type: application/json" \ - -d '{"query":{"match":{"name":"brave"}}}' -``` + +:::{include} _snippets/index-basics/example8-curl.md ::: :::{tab-item} Python :sync: python -```python -resp = client.search( - 
index="books", - query={ - "match": { - "name": "brave" - } - }, -) -``` +:::{include} _snippets/index-basics/example8-python.md ::: :::{tab-item} JavaScript :sync: js -```js -const response = await client.search({ - index: "books", - query: { - match: { - name: "brave", - }, - }, -}); -``` + +:::{include} _snippets/index-basics/example8-js.md ::: :::{tab-item} PHP :sync: php -```php -$resp = $client->search([ - "index" => "books", - "body" => [ - "query" => [ - "match" => [ - "name" => "brave", - ], - ], - ], -]); -``` +:::{include} _snippets/index-basics/example8-php.md ::: :::{tab-item} Ruby :sync: ruby -```ruby -response = client.search( - index: "books", - body: { - "query": { - "match": { - "name": "brave" - } - } - } -) -``` +:::{include} _snippets/index-basics/example8-ruby.md ::: :::: @@ -1334,75 +712,38 @@ For example, use the following request to delete the indices created in this qui :::{tab-item} Console :sync: console -```console -DELETE /books -DELETE /my-explicit-mappings-books -``` + +:::{include} _snippets/index-basics/example9-console.md ::: :::{tab-item} curl :sync: curl -```bash -curl -X DELETE "$ELASTICSEARCH_URL/books" \ - -H "Authorization: ApiKey $ELASTIC_API_KEY" -curl -X DELETE "$ELASTICSEARCH_URL/my-explicit-mappings-books" \ - -H "Authorization: ApiKey $ELASTIC_API_KEY" -``` + +:::{include} _snippets/index-basics/example9-curl.md ::: :::{tab-item} Python :sync: python -```python -resp = client.indices.delete( - index="books", -) - -resp1 = client.indices.delete( - index="my-explicit-mappings-books", -) -``` +:::{include} _snippets/index-basics/example9-python.md ::: :::{tab-item} JavaScript :sync: js -```js -const response = await client.indices.delete({ - index: "books", -}); - -const response1 = await client.indices.delete({ - index: "my-explicit-mappings-books", -}); -``` + +:::{include} _snippets/index-basics/example9-js.md ::: :::{tab-item} PHP :sync: php -```php -$resp = $client->indices()->delete([ - "index" => "books", -]); - 
-$resp1 = $client->indices()->delete([ - "index" => "my-explicit-mappings-books", -]); -``` +:::{include} _snippets/index-basics/example9-php.md ::: :::{tab-item} Ruby :sync: ruby -```ruby -response = client.indices.delete( - index: "books" -) - -response1 = client.indices.delete( - index: "my-explicit-mappings-books" -) -``` +:::{include} _snippets/index-basics/example9-ruby.md ::: :::: From 97cea0867bc7625663bc4b578a4d1360e323e91f Mon Sep 17 00:00:00 2001 From: Liam Thompson <32779855+leemthompo@users.noreply.github.com> Date: Fri, 14 Nov 2025 13:00:51 +0100 Subject: [PATCH 5/7] restore semantic-search.md --- .../search/get-started/semantic-search.md | 692 ++---------------- 1 file changed, 46 insertions(+), 646 deletions(-) diff --git a/solutions/search/get-started/semantic-search.md b/solutions/search/get-started/semantic-search.md index a775155292..8bc5665a9c 100644 --- a/solutions/search/get-started/semantic-search.md +++ b/solutions/search/get-started/semantic-search.md @@ -1,681 +1,81 @@ --- -navigation_title: Semantic search -description: An introduction to semantic search in Elasticsearch. +mapped_pages: + - https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-search.html + - https://www.elastic.co/guide/en/serverless/current/elasticsearch-reference-semantic-search.html applies_to: - serverless: all - stack: all + stack: + serverless: products: - id: elasticsearch + - id: cloud-serverless --- -# Get started with semantic search -_Semantic search_ is a type of AI-powered search that enables you to use natural language in your queries. -It returns results that match the meaning of a query, as opposed to literal keyword matches. -For example, if you want to search for workplace guidelines on a second income, you could search for "side hustle", which is not a term you're likely to see in a formal HR document. 
+# Semantic search [semantic-search] -Semantic search uses {{es}} [vector database](https://www.elastic.co/what-is/vector-database) and [vector search](https://www.elastic.co/what-is/vector-search) technology. -Each _vector_ (or _vector embedding_) is an array of numbers that represent different characteristics of the text, such as sentiment, context, and syntactics. -These numeric representations make vector comparisons very efficient. - -In this quickstart guide, you'll create vectors for a small set of sample data, store them in {{es}}, then run a semantic query. -By playing with a simple use case, you'll take the first steps toward understanding whether it's applicable to your own data. - -## Prerequisites - -- If you're using {{es-serverless}}, you must have a `developer` or `admin` predefined role or an equivalent custom role to add the sample data. -- If you're using [{{ech}}](/deploy-manage/deploy/elastic-cloud/cloud-hosted.md) or [running {{es}} locally](/deploy-manage/deploy/self-managed/local-development-installation-quickstart.md), start {{es}} and {{kib}}. To add the sample data, log in with a user that has the `superuser` built-in role. - -To learn about role-based access control, check out [](/deploy-manage/users-roles/cluster-or-deployment-auth/user-roles.md). - -## Create a vector database - -When you create vectors (or _vectorize_ your data), you convert complex and nuanced documents into multidimensional numerical representations. -You can choose from many different vector embedding models. Some are extremely hardware efficient and can be run with less computational power. Others have a greater understanding of the context, can answer questions, and lead a threaded conversation. -The examples in this guide use the default Learned Sparse Encoder ([ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md)) model, which provides great relevance across domains without the need for additional fine tuning. 
- -The way that you store vectors has a significant impact on the performance and accuracy of search results. -They must be stored in specialized data structures designed to ensure efficient similarity search and speedy vector distance calculations. -This guide uses the [semantic text field type](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md), which provides sensible defaults and automation. - -::::::{stepper} -:::::{step} Create an index -An index is a collection of documents uniquely identified by a name or an alias. -You can follow the guided index workflow: - -- If you're using {{es-serverless}}, {{ech}}, or running {{es}} locally: - 1. Go to the **Index Management** page using the navigation menu or the [global search field](/explore-analyze/find-and-organize/find-apps-and-objects.md). - 2. Select **Create index**, select **Semantic Search**, and follow the guided workflow. - -When you complete the workflow, you will have sample data and can skip to the steps related to exploring and searching it. 
-Alternatively, run the following API request in [Console](/explore-analyze/query-filter/tools/console.md): - -::::{tab-set} -:group: api-examples - -:::{tab-item} Console -:sync: console -```console -PUT /semantic-index -``` -::: - -:::{tab-item} curl -:sync: curl -```bash -curl -X PUT "$ELASTICSEARCH_URL/semantic-index" \ - -H "Authorization: ApiKey $ELASTIC_API_KEY" -``` -::: - -:::{tab-item} Python -:sync: python -```python -import os -from elasticsearch import Elasticsearch - -client = Elasticsearch( - hosts=[os.getenv("ELASTICSEARCH_URL")], - api_key=os.getenv("ELASTIC_API_KEY"), -) - -resp = client.indices.create( - index="semantic-index", -) - -``` -::: - -:::{tab-item} JavaScript -:sync: js -```js -const { Client } = require("@elastic/elasticsearch"); - -const client = new Client({ - nodes: [process.env["ELASTICSEARCH_URL"]], - auth: { - apiKey: process.env["ELASTIC_API_KEY"], - }, -}); - -async function run() { - const response = await client.indices.create({ - index: "semantic-index", - }); -} - -run(); -``` -::: - -:::{tab-item} PHP -:sync: php -```php -setHosts([getenv("ELASTICSEARCH_URL")]) - ->setApiKey(getenv("ELASTIC_API_KEY")) - ->build(); - -$resp = $client->indices()->create([ - "index" => "semantic-index", -]); - -``` -::: - -:::{tab-item} Ruby -:sync: ruby -```ruby -require "elasticsearch" - -client = Elasticsearch::Client.new( - host: ENV["ELASTICSEARCH_URL"], - api_key: ENV["ELASTIC_API_KEY"] -) - -response = client.indices.create( - index: "semantic-index" -) - -``` -::: - -:::: - -::::{tip} -For an introduction to the concept of indices, check out [](/manage-data/data-store/index-basics.md). -:::: -::::: -:::::{step} Create a semantic_text field mapping -Each index has mappings that define how data is stored and indexed, like a schema in a relational database. 
-The following example creates a mapping for a single field ("content"): - -::::{tab-set} -:group: api-examples - -:::{tab-item} Console -:sync: console -```console -PUT /semantic-index/_mapping -{ - "properties": { - "content": { - "type": "semantic_text" - } - } -} -``` -::: - -:::{tab-item} curl -:sync: curl -```bash -curl -X PUT "$ELASTICSEARCH_URL/semantic-index/_mapping" \ - -H "Authorization: ApiKey $ELASTIC_API_KEY" \ - -H "Content-Type: application/json" \ - -d '{"properties":{"content":{"type":"semantic_text"}}}' -``` -::: - -:::{tab-item} Python -:sync: python -```python -resp = client.indices.put_mapping( - index="semantic-index", - properties={ - "content": { - "type": "semantic_text" - } - }, -) - -``` -::: - -:::{tab-item} JavaScript -:sync: js -```js -const response = await client.indices.putMapping({ - index: "semantic-index", - properties: { - content: { - type: "semantic_text", - }, - }, -}); -``` -::: - -:::{tab-item} PHP -:sync: php -```php -$resp = $client->indices()->putMapping([ - "index" => "semantic-index", - "body" => [ - "properties" => [ - "content" => [ - "type" => "semantic_text", - ], - ], - ], -]); - -``` -::: - -:::{tab-item} Ruby -:sync: ruby -```ruby -response = client.indices.put_mapping( - index: "semantic-index", - body: { - "properties": { - "content": { - "type": "semantic_text" - } - } - } -) - -``` -::: - -:::: - -When you use `semantic_text` fields, the type of vector is determined by the vector embedding model. -In this case, the default ELSER model will be used to create sparse vectors. - -For a deeper dive, check out [Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector](https://www.elastic.co/search-labs/blog/mapping-embeddings-to-elasticsearch-field-types). 
-::::: - -:::::{step} Add documents - -You can use the Elasticsearch bulk API to ingest an array of documents: - -::::{tab-set} -:group: api-examples - -:::{tab-item} Console -:sync: console -```console -POST /_bulk?pretty -{ "index": { "_index": "semantic-index" } } -{"content":"Yellowstone National Park is one of the largest national parks in the United States. It ranges from the Wyoming to Montana and Idaho, and contains an area of 2,219,791 acress across three different states. Its most famous for hosting the geyser Old Faithful and is centered on the Yellowstone Caldera, the largest super volcano on the American continent. Yellowstone is host to hundreds of species of animal, many of which are endangered or threatened. Most notably, it contains free-ranging herds of bison and elk, alongside bears, cougars and wolves. The national park receives over 4.5 million visitors annually and is a UNESCO World Heritage Site."} -{ "index": { "_index": "semantic-index" } } -{"content":"Yosemite National Park is a United States National Park, covering over 750,000 acres of land in California. A UNESCO World Heritage Site, the park is best known for its granite cliffs, waterfalls and giant sequoia trees. Yosemite hosts over four million visitors in most years, with a peak of five million visitors in 2016. The park is home to a diverse range of wildlife, including mule deer, black bears, and the endangered Sierra Nevada bighorn sheep. The park has 1,200 square miles of wilderness, and is a popular destination for rock climbers, with over 3,000 feet of vertical granite to climb. Its most famous and cliff is the El Capitan, a 3,000 feet monolith along its tallest face."} -{ "index": { "_index": "semantic-index" } } -{"content":"Rocky Mountain National Park is one of the most popular national parks in the United States. It receives over 4.5 million visitors annually, and is known for its mountainous terrain, including Longs Peak, which is the highest peak in the park. 
The park is home to a variety of wildlife, including elk, mule deer, moose, and bighorn sheep. The park is also home to a variety of ecosystems, including montane, subalpine, and alpine tundra. The park is a popular destination for hiking, camping, and wildlife viewing, and is a UNESCO World Heritage Site."} -``` +:::{note} +This page focuses on the semantic search workflows available in {{es}}. For detailed information about lower-level vector search implementations, refer to [vector search](vector.md). ::: -:::{tab-item} curl -:sync: curl -```bash -curl -X POST "$ELASTICSEARCH_URL/_bulk?pretty" \ - -H "Authorization: ApiKey $ELASTIC_API_KEY" \ - -H "Content-Type: application/json" \ - -d '[{"index":{"_index":"semantic-index"}},{"content":"Yellowstone National Park is one of the largest national parks in the United States. It ranges from the Wyoming to Montana and Idaho, and contains an area of 2,219,791 acress across three different states. Its most famous for hosting the geyser Old Faithful and is centered on the Yellowstone Caldera, the largest super volcano on the American continent. Yellowstone is host to hundreds of species of animal, many of which are endangered or threatened. Most notably, it contains free-ranging herds of bison and elk, alongside bears, cougars and wolves. The national park receives over 4.5 million visitors annually and is a UNESCO World Heritage Site."},{"index":{"_index":"semantic-index"}},{"content":"Yosemite National Park is a United States National Park, covering over 750,000 acres of land in California. A UNESCO World Heritage Site, the park is best known for its granite cliffs, waterfalls and giant sequoia trees. Yosemite hosts over four million visitors in most years, with a peak of five million visitors in 2016. The park is home to a diverse range of wildlife, including mule deer, black bears, and the endangered Sierra Nevada bighorn sheep. 
The park has 1,200 square miles of wilderness, and is a popular destination for rock climbers, with over 3,000 feet of vertical granite to climb. Its most famous and cliff is the El Capitan, a 3,000 feet monolith along its tallest face."},{"index":{"_index":"semantic-index"}},{"content":"Rocky Mountain National Park is one of the most popular national parks in the United States. It receives over 4.5 million visitors annually, and is known for its mountainous terrain, including Longs Peak, which is the highest peak in the park. The park is home to a variety of wildlife, including elk, mule deer, moose, and bighorn sheep. The park is also home to a variety of ecosystems, including montane, subalpine, and alpine tundra. The park is a popular destination for hiking, camping, and wildlife viewing, and is a UNESCO World Heritage Site."}]' -``` -::: +{{es}} provides various semantic search capabilities using [natural language processing (NLP)](/explore-analyze/machine-learning/nlp.md) and [vector search](vector.md). -:::{tab-item} Python -:sync: python -```python -resp = client.bulk( - pretty=True, - operations=[ - { - "index": { - "_index": "semantic-index" - } - }, - { - "content": "Yellowstone National Park is one of the largest national parks in the United States. It ranges from the Wyoming to Montana and Idaho, and contains an area of 2,219,791 acress across three different states. Its most famous for hosting the geyser Old Faithful and is centered on the Yellowstone Caldera, the largest super volcano on the American continent. Yellowstone is host to hundreds of species of animal, many of which are endangered or threatened. Most notably, it contains free-ranging herds of bison and elk, alongside bears, cougars and wolves. The national park receives over 4.5 million visitors annually and is a UNESCO World Heritage Site." 
- }, - { - "index": { - "_index": "semantic-index" - } - }, - { - "content": "Yosemite National Park is a United States National Park, covering over 750,000 acres of land in California. A UNESCO World Heritage Site, the park is best known for its granite cliffs, waterfalls and giant sequoia trees. Yosemite hosts over four million visitors in most years, with a peak of five million visitors in 2016. The park is home to a diverse range of wildlife, including mule deer, black bears, and the endangered Sierra Nevada bighorn sheep. The park has 1,200 square miles of wilderness, and is a popular destination for rock climbers, with over 3,000 feet of vertical granite to climb. Its most famous and cliff is the El Capitan, a 3,000 feet monolith along its tallest face." - }, - { - "index": { - "_index": "semantic-index" - } - }, - { - "content": "Rocky Mountain National Park is one of the most popular national parks in the United States. It receives over 4.5 million visitors annually, and is known for its mountainous terrain, including Longs Peak, which is the highest peak in the park. The park is home to a variety of wildlife, including elk, mule deer, moose, and bighorn sheep. The park is also home to a variety of ecosystems, including montane, subalpine, and alpine tundra. The park is a popular destination for hiking, camping, and wildlife viewing, and is a UNESCO World Heritage Site." - } - ], -) +To understand the infrastructure that powers semantic search and other NLP tasks, including managed services and inference endpoints, see the [Elastic Inference overview](../../explore-analyze/elastic-inference.md) page. -``` -::: +Learn more about use cases for AI-powered search in the [overview](ai-search/ai-search.md) page. 
-:::{tab-item} JavaScript -:sync: js -```js -const response = await client.bulk({ - pretty: "true", - operations: [ - { - index: { - _index: "semantic-index", - }, - }, - { - content: - "Yellowstone National Park is one of the largest national parks in the United States. It ranges from the Wyoming to Montana and Idaho, and contains an area of 2,219,791 acress across three different states. Its most famous for hosting the geyser Old Faithful and is centered on the Yellowstone Caldera, the largest super volcano on the American continent. Yellowstone is host to hundreds of species of animal, many of which are endangered or threatened. Most notably, it contains free-ranging herds of bison and elk, alongside bears, cougars and wolves. The national park receives over 4.5 million visitors annually and is a UNESCO World Heritage Site.", - }, - { - index: { - _index: "semantic-index", - }, - }, - { - content: - "Yosemite National Park is a United States National Park, covering over 750,000 acres of land in California. A UNESCO World Heritage Site, the park is best known for its granite cliffs, waterfalls and giant sequoia trees. Yosemite hosts over four million visitors in most years, with a peak of five million visitors in 2016. The park is home to a diverse range of wildlife, including mule deer, black bears, and the endangered Sierra Nevada bighorn sheep. The park has 1,200 square miles of wilderness, and is a popular destination for rock climbers, with over 3,000 feet of vertical granite to climb. Its most famous and cliff is the El Capitan, a 3,000 feet monolith along its tallest face.", - }, - { - index: { - _index: "semantic-index", - }, - }, - { - content: - "Rocky Mountain National Park is one of the most popular national parks in the United States. It receives over 4.5 million visitors annually, and is known for its mountainous terrain, including Longs Peak, which is the highest peak in the park. 
The park is home to a variety of wildlife, including elk, mule deer, moose, and bighorn sheep. The park is also home to a variety of ecosystems, including montane, subalpine, and alpine tundra. The park is a popular destination for hiking, camping, and wildlife viewing, and is a UNESCO World Heritage Site.", - }, - ], -}); -``` -::: +## Overview of semantic search workflows [semantic-search-workflows-overview] -:::{tab-item} PHP -:sync: php -```php -$resp = $client->bulk([ - "pretty" => "true", - "body" => array( - [ - "index" => [ - "_index" => "semantic-index", - ], - ], - [ - "content" => "Yellowstone National Park is one of the largest national parks in the United States. It ranges from the Wyoming to Montana and Idaho, and contains an area of 2,219,791 acress across three different states. Its most famous for hosting the geyser Old Faithful and is centered on the Yellowstone Caldera, the largest super volcano on the American continent. Yellowstone is host to hundreds of species of animal, many of which are endangered or threatened. Most notably, it contains free-ranging herds of bison and elk, alongside bears, cougars and wolves. The national park receives over 4.5 million visitors annually and is a UNESCO World Heritage Site.", - ], - [ - "index" => [ - "_index" => "semantic-index", - ], - ], - [ - "content" => "Yosemite National Park is a United States National Park, covering over 750,000 acres of land in California. A UNESCO World Heritage Site, the park is best known for its granite cliffs, waterfalls and giant sequoia trees. Yosemite hosts over four million visitors in most years, with a peak of five million visitors in 2016. The park is home to a diverse range of wildlife, including mule deer, black bears, and the endangered Sierra Nevada bighorn sheep. The park has 1,200 square miles of wilderness, and is a popular destination for rock climbers, with over 3,000 feet of vertical granite to climb. 
Its most famous and cliff is the El Capitan, a 3,000 feet monolith along its tallest face.", - ], - [ - "index" => [ - "_index" => "semantic-index", - ], - ], - [ - "content" => "Rocky Mountain National Park is one of the most popular national parks in the United States. It receives over 4.5 million visitors annually, and is known for its mountainous terrain, including Longs Peak, which is the highest peak in the park. The park is home to a variety of wildlife, including elk, mule deer, moose, and bighorn sheep. The park is also home to a variety of ecosystems, including montane, subalpine, and alpine tundra. The park is a popular destination for hiking, camping, and wildlife viewing, and is a UNESCO World Heritage Site.", - ], - ), -]); +You have several options for using NLP models for semantic search in the {{stack}}: -``` -::: +* [Option 1](#_semantic_text_workflow): Use the `semantic_text` workflow (recommended) +* [Option 2](#_infer_api_workflow): Use the {{infer}} API workflow +* [Option 3](#_model_deployment_workflow): Deploy models directly in {{es}} -:::{tab-item} Ruby -:sync: ruby -```ruby -response = client.bulk( - pretty: "true", - body: [ - { - "index": { - "_index": "semantic-index" - } - }, - { - "content": "Yellowstone National Park is one of the largest national parks in the United States. It ranges from the Wyoming to Montana and Idaho, and contains an area of 2,219,791 acress across three different states. Its most famous for hosting the geyser Old Faithful and is centered on the Yellowstone Caldera, the largest super volcano on the American continent. Yellowstone is host to hundreds of species of animal, many of which are endangered or threatened. Most notably, it contains free-ranging herds of bison and elk, alongside bears, cougars and wolves. The national park receives over 4.5 million visitors annually and is a UNESCO World Heritage Site." 
- }, - { - "index": { - "_index": "semantic-index" - } - }, - { - "content": "Yosemite National Park is a United States National Park, covering over 750,000 acres of land in California. A UNESCO World Heritage Site, the park is best known for its granite cliffs, waterfalls and giant sequoia trees. Yosemite hosts over four million visitors in most years, with a peak of five million visitors in 2016. The park is home to a diverse range of wildlife, including mule deer, black bears, and the endangered Sierra Nevada bighorn sheep. The park has 1,200 square miles of wilderness, and is a popular destination for rock climbers, with over 3,000 feet of vertical granite to climb. Its most famous and cliff is the El Capitan, a 3,000 feet monolith along its tallest face." - }, - { - "index": { - "_index": "semantic-index" - } - }, - { - "content": "Rocky Mountain National Park is one of the most popular national parks in the United States. It receives over 4.5 million visitors annually, and is known for its mountainous terrain, including Longs Peak, which is the highest peak in the park. The park is home to a variety of wildlife, including elk, mule deer, moose, and bighorn sheep. The park is also home to a variety of ecosystems, including montane, subalpine, and alpine tundra. The park is a popular destination for hiking, camping, and wildlife viewing, and is a UNESCO World Heritage Site." - } - ] -) +This diagram summarizes the relative complexity of each workflow: -``` +:::{image} /solutions/images/elasticsearch-reference-semantic-options.svg +:alt: Overview of semantic search workflows in {{es}} ::: -:::: - -The bulk ingestion might take longer than the default request timeout. -If it times out, wait for the ELSER model to load (typically 1-5 minutes) then retry it. -You can check the model state by going to the **{{models-app}}** page from the navigation menu or the [global search field](/explore-analyze/find-and-organize/find-apps-and-objects.md). 
- -First, the content is divided into smaller, manageable chunks to ensure that meaningful segments can be more effectively processed and searched. -Each chunk of text is then transformed into a sparse vector by using the ELSER model's text expansion techniques. - -![Semantic search chunking](https://images.contentstack.io/v3/assets/bltefdd0b53724fa2ce/blt9bbe5e260012b15d/67ffffc8165067d96124b586/animated-gif-semantic-search-chunking.gif) - -The vectors are stored in {{es}} and are ready to be used for semantic search. -::::: -:::::{step} Explore the data - -To familiarize yourself with this data set, open [Discover](/explore-analyze/discover.md) from the navigation menu or the global search field. - -In **Discover**, you can click the expand icon {icon}`expand` to show details about documents in the table: - -:::{image} /solutions/images/serverless-discover-semantic.png -:screenshot: -:alt: Discover table view with document expanded -:::: +## Choose a semantic search workflow [using-nlp-models] -For more tips, check out [](/explore-analyze/discover/discover-get-started.md). -::::: -:::::: +### Option 1: `semantic_text` [_semantic_text_workflow] -## Test semantic search +The simplest way to use NLP models in the {{stack}} is through the [`semantic_text` workflow](semantic-search/semantic-search-semantic-text.md). We recommend using this approach because it abstracts away a lot of manual work. All you need to do is create an index mapping to start ingesting, embedding, and querying data. There is no need to define model-related settings and parameters, or to create {{infer}} ingest pipelines. For guidance on the available query types for `semantic_text`, see [Querying `semantic_text` fields](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md#querying-semantic-text-fields). -When you run a semantic search, the text in your query must be turned into vectors that use the same embedding model as your vector database. 
-This step is performed automatically when you use `semantic_text` fields. -You therefore only need to pick a query language and a method for comparing the vectors. +To learn more about supported services, refer to [](/explore-analyze/elastic-inference/inference-api.md) and the [{{infer}} API](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-inference) documentation. For an end-to-end tutorial, refer to [Semantic search with `semantic_text`](semantic-search/semantic-search-semantic-text.md). -::::::{stepper} -:::::{step} Choose a query language +### Option 2: Inference API [_infer_api_workflow] -{{es}} provides a variety of query languages for interacting with your data. -For an overview of their features and use cases, check out [](/explore-analyze/query-filter/languages.md). -The [Elasticsearch Query Language](elasticsearch://reference/query-languages/esql.md) (ES|QL) is designed to be easy to read and write. -It enables you to query your data directly in **Discover**, so it's a good one to start with. +The {{infer}} API workflow is more complex but offers greater control over the {{infer}} endpoint configuration. You need to create an {{infer}} endpoint, provide various model-related settings and parameters, and define an index mapping. Optionally you can also set up an {{infer}} ingest pipeline for automatic embedding during data ingestion, or alternatively, you can manually call the {{infer}} API. -Go to **Discover** and select **Try ES|QL** from the application menu bar. -::::: -:::::{step} Choose a vector comparison method -You can search data that is stored in `semantic_text` fields by using a specific subset of queries, including `knn`, `match`, `semantic`, and `sparse_vector`. -For the definitive list of supported queries, refer to [Semantic text field type](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md). 
+For an end-to-end tutorial, refer to [Semantic search with the {{infer}} API](semantic-search/semantic-search-inference.md). -In ES|QL, you can perform semantic searches on `semantic_text` field types using the same match syntax as full-text search. -For example: +### Option 3: Manual model deployment [_model_deployment_workflow] -```esql -FROM semantic-index <1> -| WHERE content: "what's the biggest park?" <2> -| LIMIT 10 <3> -``` +You can also deploy NLP in {{es}} manually, without using an {{infer}} endpoint. This is the most complex and labor intensive workflow for performing semantic search in the {{stack}}. You need to select an NLP model from the [list of supported dense and sparse vector models](../../explore-analyze/machine-learning/nlp/ml-nlp-model-ref.md#ml-nlp-model-ref-text-embedding), deploy it using the Eland client, create an index mapping, and set up a suitable ingest pipeline to start ingesting and querying data. -1. The FROM source command returns a table of data from the specified index. -2. A simplified syntax for the MATCH search function, this command performs a semantic query on the specified field. Think of some queries that are relevant to the documents you explored, such as finding the biggest park or the best for rappelling. -3. The LIMIT processing command defines the maximum number of rows to return. - -When you click **▶Run**, the results appear in a table. -Each row in the table represents a document. - -To learn more about these commands, refer to [ES|QL syntax reference](elasticsearch://reference/query-languages/esql/esql-syntax-reference.md) and [](/solutions/search/esql-for-search.md). -::::: -:::::{step} Analyze the results - -To have a better understanding of how well each document matches your query, add commands to include the relevance score and sort the results based on that value. 
-For example: - -```esql -FROM semantic-index METADATA _score <1> - | WHERE content: "best spot for rappelling" - | KEEP content, _score <2> - | SORT _score DESC <3> - | LIMIT 10 -``` - -1. The `METADATA` clause provides access to the query relevance score, which is a [metadata field](elasticsearch://reference/query-languages/esql/esql-metadata-fields.md). -2. The KEEP processing command affects the columns and their order in the results table. -3. The results are sorted in descending order based on the `_score`. +For an end-to-end tutorial, refer to [Semantic search with a model deployed in {{es}}](vector/dense-versus-sparse-ingest-pipelines.md). ::::{tip} -Click the **ES|QL help** button to open the in-product reference documentation for all commands and functions or to get recommended queries. For more tips, check out [Using ES|QL in Discover](/explore-analyze/discover/try-esql.md). -:::: - -In this example, the first row in the table is the document related to Rocky Mountain National Park, which had the highest relevance score for the query: - -:::{image} /solutions/images/serverless-discover-semantic-esql.png -:screenshot: -:alt: Run an ES|QL semantic query in Discover -:::: - -Optionally, try out the same search as an API request in **Console**: - -::::{tab-set} -:group: api-examples - -:::{tab-item} Console -:sync: console -```console -POST /_query?format=txt -{ - "query": """ - FROM semantic-index METADATA _score - | WHERE content: "best spot for rappelling" - | KEEP content, _score - | SORT _score DESC - | LIMIT 10 - """ -} -``` -::: - -:::{tab-item} curl -:sync: curl -```bash -curl -X POST "$ELASTICSEARCH_URL/_query?format=txt" \ - -H "Authorization: ApiKey $ELASTIC_API_KEY" \ - -H "Content-Type: application/json" \ - -d '{"query":"\n FROM semantic-index METADATA _score\n | WHERE content: \"best spot for rappelling\"\n | KEEP content, _score\n | SORT _score DESC\n | LIMIT 10\n "}' -``` -::: - -:::{tab-item} Python -:sync: python -```python -resp = 
client.esql.query( - format="txt", - query="\n FROM semantic-index METADATA _score\n | WHERE content: \"best spot for rappelling\"\n | KEEP content, _score\n | SORT _score DESC\n | LIMIT 10\n ", -) - -``` -::: - -:::{tab-item} JavaScript -:sync: js -```js -const response = await client.esql.query({ - format: "txt", - query: - '\n FROM semantic-index METADATA _score\n | WHERE content: "best spot for rappelling"\n | KEEP content, _score\n | SORT _score DESC\n | LIMIT 10\n ', -}); -``` -::: - -:::{tab-item} PHP -:sync: php -```php -$resp = $client->esql()->query([ - "format" => "txt", - "body" => [ - "query" => "\n FROM semantic-index METADATA _score\n | WHERE content: \"best spot for rappelling\"\n | KEEP content, _score\n | SORT _score DESC\n | LIMIT 10\n ", - ], -]); - -``` -::: - -:::{tab-item} Ruby -:sync: ruby -```ruby -response = client.esql.query( - format: "txt", - body: { - "query": "\n FROM semantic-index METADATA _score\n | WHERE content: \"best spot for rappelling\"\n | KEEP content, _score\n | SORT _score DESC\n | LIMIT 10\n " - } -) - -``` -::: - -:::: - -When you finish your tests and no longer need the sample data set, delete the index: - -::::{tab-set} -:group: api-examples - -:::{tab-item} Console -:sync: console -```console -DELETE /semantic-index -``` -::: - -:::{tab-item} curl -:sync: curl -```bash -curl -X DELETE "$ELASTICSEARCH_URL/semantic-index" \ - -H "Authorization: ApiKey $ELASTIC_API_KEY" -``` -::: - -:::{tab-item} Python -:sync: python -```python -resp = client.indices.delete( - index="semantic-index", -) - -``` -::: - -:::{tab-item} JavaScript -:sync: js -```js -const response = await client.indices.delete({ - index: "semantic-index", -}); -``` -::: - -:::{tab-item} PHP -:sync: php -```php -$resp = $client->indices()->delete([ - "index" => "semantic-index", -]); - -``` -::: - -:::{tab-item} Ruby -:sync: ruby -```ruby -response = client.indices.delete( - index: "semantic-index" -) - -``` -::: - +Refer to [vector queries and field 
types](vector.md#vector-queries-and-field-types) for a quick reference overview. :::: -::::: -:::::: +## Learn more [semantic-search-read-more] -## Next steps +### Interactive examples -Thanks for taking the time to try out semantic search. -For a deeper dive, go to [](/solutions/search/semantic-search.md). +- The [`elasticsearch-labs`](https://github.com/elastic/elasticsearch-labs) repo contains a number of interactive semantic search examples in the form of executable Python notebooks, using the {{es}} Python client +- [Semantic search with ELSER using the model deployment workflow](https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/search/03-ELSER.ipynb) +- [Semantic search with `semantic_text`](https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb) -If you want to extend this example, try an index with more fields. -For example, if you have both a `text` field and a `semantic_text` field, you can combine the strengths of traditional keyword search and advanced semantic search. -A [hybrid search](/solutions/search/hybrid-semantic-text.md) provides comprehensive search capabilities to find relevant information based on both the raw text and its underlying meaning. +### Blogs -To learn about more options, such as vector and keyword search, go to [](/solutions/search/search-approaches.md). -For a summary of the AI-powered search use cases, go to [](/solutions/search/ai-search/ai-search.md). 
+- [{{es}} new semantic_text mapping: Simplifying semantic search](https://www.elastic.co/search-labs/blog/semantic-search-simplified-semantic-text) +- [GA information for `semantic_text`](https://www.elastic.co/search-labs/blog/semantic-text-ga) +- [Introducing ELSER: Elastic's AI model for semantic search](https://www.elastic.co/blog/may-2023-launch-sparse-encoder-ai-model) +- [How to get the best of lexical and AI-powered search with Elastic's vector database](https://www.elastic.co/blog/lexical-ai-powered-search-elastic-vector-database) +- Information retrieval blog series: + - [Part 1: Steps to improve search relevance](https://www.elastic.co/blog/improving-information-retrieval-elastic-stack-search-relevance) + - [Part 2: Benchmarking passage retrieval](https://www.elastic.co/blog/improving-information-retrieval-elastic-stack-benchmarking-passage-retrieval) + - [Part 3: Introducing Elastic Learned Sparse Encoder, our new retrieval model](https://www.elastic.co/blog/may-2023-launch-information-retrieval-elasticsearch-ai-model) + - [Part 4: Hybrid retrieval](https://www.elastic.co/blog/improving-information-retrieval-elastic-stack-hybrid) \ No newline at end of file From 231871711e49a70f107045f8f71bf49613598b27 Mon Sep 17 00:00:00 2001 From: Liam Thompson <32779855+leemthompo@users.noreply.github.com> Date: Fri, 14 Nov 2025 13:04:23 +0100 Subject: [PATCH 6/7] restore semantic-search.md really --- .../search/get-started/semantic-search.md | 234 ++++++++++++++---- 1 file changed, 187 insertions(+), 47 deletions(-) diff --git a/solutions/search/get-started/semantic-search.md b/solutions/search/get-started/semantic-search.md index 8bc5665a9c..babff9646b 100644 --- a/solutions/search/get-started/semantic-search.md +++ b/solutions/search/get-started/semantic-search.md @@ -1,81 +1,221 @@ --- -mapped_pages: - - https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-search.html - - 
https://www.elastic.co/guide/en/serverless/current/elasticsearch-reference-semantic-search.html +navigation_title: Semantic search +description: An introduction to semantic search in Elasticsearch. applies_to: - stack: - serverless: + serverless: all + stack: all products: - id: elasticsearch - - id: cloud-serverless --- +# Get started with semantic search -# Semantic search [semantic-search] +_Semantic search_ is a type of AI-powered search that enables you to use natural language in your queries. +It returns results that match the meaning of a query, as opposed to literal keyword matches. +For example, if you want to search for workplace guidelines on a second income, you could search for "side hustle", which is not a term you're likely to see in a formal HR document. -:::{note} -This page focuses on the semantic search workflows available in {{es}}. For detailed information about lower-level vector search implementations, refer to [vector search](vector.md). +Semantic search uses {{es}} [vector database](https://www.elastic.co/what-is/vector-database) and [vector search](https://www.elastic.co/what-is/vector-search) technology. +Each _vector_ (or _vector embedding_) is an array of numbers that represent different characteristics of the text, such as sentiment, context, and syntax. +These numeric representations make vector comparisons very efficient. + +In this quickstart guide, you'll create vectors for a small set of sample data, store them in {{es}}, then run a semantic query. +By playing with a simple use case, you'll take the first steps toward understanding whether semantic search is applicable to your own data. + +## Prerequisites + +- If you're using {{es-serverless}}, you must have a `developer` or `admin` predefined role or an equivalent custom role to add the sample data.
+- If you're using [{{ech}}](/deploy-manage/deploy/elastic-cloud/cloud-hosted.md) or [running {{es}} locally](/deploy-manage/deploy/self-managed/local-development-installation-quickstart.md), start {{es}} and {{kib}}. To add the sample data, log in with a user that has the `superuser` built-in role. + +To learn about role-based access control, check out [](/deploy-manage/users-roles/cluster-or-deployment-auth/user-roles.md). + +## Create a vector database + +When you create vectors (or _vectorize_ your data), you convert complex and nuanced documents into multidimensional numerical representations. +You can choose from many different vector embedding models. Some are extremely hardware efficient and can be run with less computational power. Others have a greater understanding of the context, can answer questions, and lead a threaded conversation. +The examples in this guide use the default Learned Sparse Encoder ([ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md)) model, which provides great relevance across domains without the need for additional fine tuning. + +The way that you store vectors has a significant impact on the performance and accuracy of search results. +They must be stored in specialized data structures designed to ensure efficient similarity search and speedy vector distance calculations. +This guide uses the [semantic text field type](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md), which provides sensible defaults and automation. + +:::::{stepper} +::::{step} Create an index +An index is a collection of documents uniquely identified by a name or an alias. +You can follow the guided index workflow: + +- If you're using {{es-serverless}}, {{ech}}, or running {{es}} locally: + 1. Go to the **Index Management** page using the navigation menu or the [global search field](/explore-analyze/find-and-organize/find-apps-and-objects.md). + 2. 
Select **Create index**, select **Semantic Search**, and follow the guided workflow. + +When you complete the workflow, you will have sample data and can skip to the steps related to exploring and searching it. +Alternatively, run the following API request in [Console](/explore-analyze/query-filter/tools/console.md): + +```console +PUT /semantic-index +``` + +:::{tip} +For an introduction to the concept of indices, check out [](/manage-data/data-store/index-basics.md). ::: +:::: +::::{step} Create a semantic_text field mapping +Each index has mappings that define how data is stored and indexed, like a schema in a relational database. +The following example creates a mapping for a single field ("content"): + +```console +PUT /semantic-index/_mapping +{ + "properties": { + "content": { + "type": "semantic_text" + } + } +} +``` + +When you use `semantic_text` fields, the type of vector is determined by the vector embedding model. +In this case, the default ELSER model will be used to create sparse vectors. + +For a deeper dive, check out [Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector](https://www.elastic.co/search-labs/blog/mapping-embeddings-to-elasticsearch-field-types). +:::: + +::::{step} Add documents -{{es}} provides various semantic search capabilities using [natural language processing (NLP)](/explore-analyze/machine-learning/nlp.md) and [vector search](vector.md). +You can use the Elasticsearch bulk API to ingest an array of documents: -To understand the infrastructure that powers semantic search and other NLP tasks, including managed services and inference endpoints, see the [Elastic Inference overview](../../explore-analyze/elastic-inference.md) page. +```console +POST /_bulk?pretty +{ "index": { "_index": "semantic-index" } } +{"content":"Yellowstone National Park is one of the largest national parks in the United States. 
It ranges from Wyoming to Montana and Idaho, and contains an area of 2,219,791 acres across three different states. It's most famous for hosting the geyser Old Faithful and is centered on the Yellowstone Caldera, the largest super volcano on the American continent. Yellowstone is host to hundreds of species of animal, many of which are endangered or threatened. Most notably, it contains free-ranging herds of bison and elk, alongside bears, cougars and wolves. The national park receives over 4.5 million visitors annually and is a UNESCO World Heritage Site."} +{ "index": { "_index": "semantic-index" } } +{"content":"Yosemite National Park is a United States National Park, covering over 750,000 acres of land in California. A UNESCO World Heritage Site, the park is best known for its granite cliffs, waterfalls and giant sequoia trees. Yosemite hosts over four million visitors in most years, with a peak of five million visitors in 2016. The park is home to a diverse range of wildlife, including mule deer, black bears, and the endangered Sierra Nevada bighorn sheep. The park has 1,200 square miles of wilderness, and is a popular destination for rock climbers, with over 3,000 feet of vertical granite to climb. Its most famous cliff is El Capitan, a 3,000-foot monolith along its tallest face."} +{ "index": { "_index": "semantic-index" } } +{"content":"Rocky Mountain National Park is one of the most popular national parks in the United States. It receives over 4.5 million visitors annually, and is known for its mountainous terrain, including Longs Peak, which is the highest peak in the park. The park is home to a variety of wildlife, including elk, mule deer, moose, and bighorn sheep. The park is also home to a variety of ecosystems, including montane, subalpine, and alpine tundra.
The park is a popular destination for hiking, camping, and wildlife viewing, and is a UNESCO World Heritage Site."} +``` -Learn more about use cases for AI-powered search in the [overview](ai-search/ai-search.md) page. +The bulk ingestion might take longer than the default request timeout. +If it times out, wait for the ELSER model to load (typically 1-5 minutes) then retry it. +You can check the model state by going to the **{{models-app}}** page from the navigation menu or the [global search field](/explore-analyze/find-and-organize/find-apps-and-objects.md). -## Overview of semantic search workflows [semantic-search-workflows-overview] +First, the content is divided into smaller, manageable chunks to ensure that meaningful segments can be more effectively processed and searched. +Each chunk of text is then transformed into a sparse vector by using the ELSER model's text expansion techniques. -You have several options for using NLP models for semantic search in the {{stack}}: +![Semantic search chunking](https://images.contentstack.io/v3/assets/bltefdd0b53724fa2ce/blt9bbe5e260012b15d/67ffffc8165067d96124b586/animated-gif-semantic-search-chunking.gif) + +The vectors are stored in {{es}} and are ready to be used for semantic search. +:::: +::::{step} Explore the data -* [Option 1](#_semantic_text_workflow): Use the `semantic_text` workflow (recommended) -* [Option 2](#_infer_api_workflow): Use the {{infer}} API workflow -* [Option 3](#_model_deployment_workflow): Deploy models directly in {{es}} +To familiarize yourself with this data set, open [Discover](/explore-analyze/discover.md) from the navigation menu or the global search field. 
-This diagram summarizes the relative complexity of each workflow: +In **Discover**, you can click the expand icon {icon}`expand` to show details about documents in the table: -:::{image} /solutions/images/elasticsearch-reference-semantic-options.svg -:alt: Overview of semantic search workflows in {{es}} +:::{image} /solutions/images/serverless-discover-semantic.png +:screenshot: +:alt: Discover table view with document expanded ::: -## Choose a semantic search workflow [using-nlp-models] +For more tips, check out [](/explore-analyze/discover/discover-get-started.md). +:::: +::::: -### Option 1: `semantic_text` [_semantic_text_workflow] +## Test semantic search -The simplest way to use NLP models in the {{stack}} is through the [`semantic_text` workflow](semantic-search/semantic-search-semantic-text.md). We recommend using this approach because it abstracts away a lot of manual work. All you need to do is create an index mapping to start ingesting, embedding, and querying data. There is no need to define model-related settings and parameters, or to create {{infer}} ingest pipelines. For guidance on the available query types for `semantic_text`, see [Querying `semantic_text` fields](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md#querying-semantic-text-fields). +When you run a semantic search, the text in your query must be turned into vectors that use the same embedding model as your vector database. +This step is performed automatically when you use `semantic_text` fields. +You therefore only need to pick a query language and a method for comparing the vectors. -To learn more about supported services, refer to [](/explore-analyze/elastic-inference/inference-api.md) and the [{{infer}} API](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-inference) documentation. For an end-to-end tutorial, refer to [Semantic search with `semantic_text`](semantic-search/semantic-search-semantic-text.md). 
+:::::{stepper} +::::{step} Choose a query language -### Option 2: Inference API [_infer_api_workflow] +{{es}} provides a variety of query languages for interacting with your data. +For an overview of their features and use cases, check out [](/explore-analyze/query-filter/languages.md). +The [Elasticsearch Query Language](elasticsearch://reference/query-languages/esql.md) (ES|QL) is designed to be easy to read and write. +It enables you to query your data directly in **Discover**, so it's a good one to start with. -The {{infer}} API workflow is more complex but offers greater control over the {{infer}} endpoint configuration. You need to create an {{infer}} endpoint, provide various model-related settings and parameters, and define an index mapping. Optionally you can also set up an {{infer}} ingest pipeline for automatic embedding during data ingestion, or alternatively, you can manually call the {{infer}} API. +Go to **Discover** and select **Try ES|QL** from the application menu bar. +:::: +::::{step} Choose a vector comparison method +You can search data that is stored in `semantic_text` fields by using a specific subset of queries, including `knn`, `match`, `semantic`, and `sparse_vector`. +For the definitive list of supported queries, refer to [Semantic text field type](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md). -For an end-to-end tutorial, refer to [Semantic search with the {{infer}} API](semantic-search/semantic-search-inference.md). +In ES|QL, you can perform semantic searches on `semantic_text` field types using the same match syntax as full-text search. +For example: -### Option 3: Manual model deployment [_model_deployment_workflow] +```esql +FROM semantic-index <1> +| WHERE content: "what's the biggest park?" <2> +| LIMIT 10 <3> +``` -You can also deploy NLP in {{es}} manually, without using an {{infer}} endpoint. This is the most complex and labor intensive workflow for performing semantic search in the {{stack}}. 
You need to select an NLP model from the [list of supported dense and sparse vector models](../../explore-analyze/machine-learning/nlp/ml-nlp-model-ref.md#ml-nlp-model-ref-text-embedding), deploy it using the Eland client, create an index mapping, and set up a suitable ingest pipeline to start ingesting and querying data. +1. The FROM source command returns a table of data from the specified index. +2. A simplified syntax for the MATCH search function, this command performs a semantic query on the specified field. Think of some queries that are relevant to the documents you explored, such as finding the biggest park or the best for rappelling. +3. The LIMIT processing command defines the maximum number of rows to return. -For an end-to-end tutorial, refer to [Semantic search with a model deployed in {{es}}](vector/dense-versus-sparse-ingest-pipelines.md). +When you click **▶Run**, the results appear in a table. +Each row in the table represents a document. -::::{tip} -Refer to [vector queries and field types](vector.md#vector-queries-and-field-types) for a quick reference overview. +To learn more about these commands, refer to [ES|QL syntax reference](elasticsearch://reference/query-languages/esql/esql-syntax-reference.md) and [](/solutions/search/esql-for-search.md). :::: +::::{step} Analyze the results + +To have a better understanding of how well each document matches your query, add commands to include the relevance score and sort the results based on that value. +For example: + +```esql +FROM semantic-index METADATA _score <1> + | WHERE content: "best spot for rappelling" + | KEEP content, _score <2> + | SORT _score DESC <3> + | LIMIT 10 +``` + +1. The `METADATA` clause provides access to the query relevance score, which is a [metadata field](elasticsearch://reference/query-languages/esql/esql-metadata-fields.md). +2. The KEEP processing command affects the columns and their order in the results table. +3. 
The results are sorted in descending order based on the `_score`. + +:::{tip} +Click the **ES|QL help** button to open the in-product reference documentation for all commands and functions or to get recommended queries. For more tips, check out [Using ES|QL in Discover](/explore-analyze/discover/try-esql.md). +::: + +In this example, the first row in the table is the document related to Rocky Mountain National Park, which had the highest relevance score for the query: -## Learn more [semantic-search-read-more] +:::{image} /solutions/images/serverless-discover-semantic-esql.png +:screenshot: +:alt: Run an ES|QL semantic query in Discover +::: + +Optionally, try out the same search as an API request in **Console**: + +```console +POST /_query?format=txt +{ + "query": """ + FROM semantic-index METADATA _score + | WHERE content: "best spot for rappelling" + | KEEP content, _score + | SORT _score DESC + | LIMIT 10 + """ +} +``` + +When you finish your tests and no longer need the sample data set, delete the index: + +```console +DELETE /semantic-index +``` + +:::: +::::: -### Interactive examples +## Next steps -- The [`elasticsearch-labs`](https://github.com/elastic/elasticsearch-labs) repo contains a number of interactive semantic search examples in the form of executable Python notebooks, using the {{es}} Python client -- [Semantic search with ELSER using the model deployment workflow](https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/search/03-ELSER.ipynb) -- [Semantic search with `semantic_text`](https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb) +Thanks for taking the time to try out semantic search. +For a deeper dive, go to [](/solutions/search/semantic-search.md). -### Blogs +If you want to extend this example, try an index with more fields. 
+For example, if you have both a `text` field and a `semantic_text` field, you can combine the strengths of traditional keyword search and advanced semantic search. +A [hybrid search](/solutions/search/hybrid-semantic-text.md) provides comprehensive search capabilities to find relevant information based on both the raw text and its underlying meaning. -- [{{es}} new semantic_text mapping: Simplifying semantic search](https://www.elastic.co/search-labs/blog/semantic-search-simplified-semantic-text) -- [GA information for `semantic_text`](https://www.elastic.co/search-labs/blog/semantic-text-ga) -- [Introducing ELSER: Elastic's AI model for semantic search](https://www.elastic.co/blog/may-2023-launch-sparse-encoder-ai-model) -- [How to get the best of lexical and AI-powered search with Elastic's vector database](https://www.elastic.co/blog/lexical-ai-powered-search-elastic-vector-database) -- Information retrieval blog series: - - [Part 1: Steps to improve search relevance](https://www.elastic.co/blog/improving-information-retrieval-elastic-stack-search-relevance) - - [Part 2: Benchmarking passage retrieval](https://www.elastic.co/blog/improving-information-retrieval-elastic-stack-benchmarking-passage-retrieval) - - [Part 3: Introducing Elastic Learned Sparse Encoder, our new retrieval model](https://www.elastic.co/blog/may-2023-launch-information-retrieval-elasticsearch-ai-model) - - [Part 4: Hybrid retrieval](https://www.elastic.co/blog/improving-information-retrieval-elastic-stack-hybrid) \ No newline at end of file +To learn about more options, such as vector and keyword search, go to [](/solutions/search/search-approaches.md). +For a summary of the AI-powered search use cases, go to [](/solutions/search/ai-search/ai-search.md). 
From 512c16a6d39956ff79ebcfab4f3ef4678c11cce5 Mon Sep 17 00:00:00 2001 From: Liam Thompson <32779855+leemthompo@users.noreply.github.com> Date: Mon, 17 Nov 2025 09:18:51 +0100 Subject: [PATCH 7/7] delete now moot tip --- solutions/search/get-started/index-basics.md | 5 ----- 1 file changed, 5 deletions(-) diff --git a/solutions/search/get-started/index-basics.md b/solutions/search/get-started/index-basics.md index 9b2aaf6434..0c05902b0c 100644 --- a/solutions/search/get-started/index-basics.md +++ b/solutions/search/get-started/index-basics.md @@ -7,11 +7,6 @@ applies_to: This quickstart provides a hands-on introduction to the fundamental concepts of {{es}}: [indices, documents, and field type mappings](../../../manage-data/data-store/index-basics.md). You'll learn how to create an index, add documents, work with dynamic and explicit mappings, and perform your first basic searches. -:::::{tip} -The code examples are in [Console](/explore-analyze/query-filter/tools/console.md) syntax by default. -You can [convert into other programming languages](/explore-analyze/query-filter/tools/console.md#import-export-console-requests) in the Console UI. -::::: - ## Requirements [getting-started-requirements] You can follow this guide using any {{es}} deployment.