diff --git a/deploy-manage/_snippets/field-doc-sec-limitations.md b/deploy-manage/_snippets/field-doc-sec-limitations.md index 100839898f..f4c95dd741 100644 --- a/deploy-manage/_snippets/field-doc-sec-limitations.md +++ b/deploy-manage/_snippets/field-doc-sec-limitations.md @@ -40,5 +40,5 @@ When a user’s role enables document or [field level security](/deploy-manage/u * The request cache is disabled for search requests if either of the following are true: - * The role query that defines document level security is [templated](/deploy-manage/users-roles/cluster-or-deployment-auth/controlling-access-at-document-field-level.md#templating-role-query) using a [stored script](/explore-analyze/scripting/modules-scripting-using.md#script-stored-scripts). + * The role query that defines document level security is [templated](/deploy-manage/users-roles/cluster-or-deployment-auth/controlling-access-at-document-field-level.md#templating-role-query) using a [stored script](/explore-analyze/scripting/modules-scripting-store-and-retrieve.md). * The target indices are a mix of local and remote indices. diff --git a/deploy-manage/tools/snapshot-and-restore.md b/deploy-manage/tools/snapshot-and-restore.md index c77ab1c5a8..6e1b8215d4 100644 --- a/deploy-manage/tools/snapshot-and-restore.md +++ b/deploy-manage/tools/snapshot-and-restore.md @@ -91,7 +91,7 @@ By default, a snapshot of a cluster contains the cluster state, all regular data - [Legacy index templates](https://www.elastic.co/guide/en/elasticsearch/reference/8.18/indices-templates-v1.html) - [Ingest pipelines](/manage-data/ingest/transform-enrich/ingest-pipelines.md) - [ILM policies](/manage-data/lifecycle/index-lifecycle-management.md) -- [Stored scripts](/explore-analyze/scripting/modules-scripting-using.md#script-stored-scripts) +- [Stored scripts](/explore-analyze/scripting/modules-scripting-store-and-retrieve.md) - For snapshots taken after 7.12.0, [feature states](#feature-state) You can also take snapshots of only specific data streams or indices in the cluster. A snapshot that includes a data stream or index automatically includes its aliases. When you restore a snapshot, you can choose whether to restore these aliases. diff --git a/explore-analyze/alerts-cases/watcher/how-watcher-works.md b/explore-analyze/alerts-cases/watcher/how-watcher-works.md index 9f99af402a..c26d23ede3 100644 --- a/explore-analyze/alerts-cases/watcher/how-watcher-works.md +++ b/explore-analyze/alerts-cases/watcher/how-watcher-works.md @@ -184,7 +184,7 @@ Deactivating a watch also enables you to keep it around for future use without d You can use scripts and templates when defining a watch. Scripts and templates can reference elements in the watch execution context, including the watch payload. The execution context defines variables you can use in a script and parameter placeholders in a template. -{{watcher}} uses the Elasticsearch script infrastructure, which supports [inline](#inline-templates-scripts) and [stored](#stored-templates-scripts). Scripts and templates are compiled and cached by Elasticsearch to optimize recurring execution. Autoloading is also supported. For more information, see [Scripting](../../scripting.md) and [*How to write scripts*](../../scripting/modules-scripting-using.md). +{{watcher}} uses the Elasticsearch script infrastructure, which supports [inline](#inline-templates-scripts) and [stored](#stored-templates-scripts). Scripts and templates are compiled and cached by Elasticsearch to optimize recurring execution. 
Autoloading is also supported. For more information, see [Scripting](../../scripting.md) and [*How to write Painless scripts*](../../scripting/modules-scripting-using.md). ### Watch execution context [watch-execution-context] diff --git a/explore-analyze/scripting/common-script-uses.md b/explore-analyze/scripting/common-script-uses.md index 63134a1913..c949c41b0a 100644 --- a/explore-analyze/scripting/common-script-uses.md +++ b/explore-analyze/scripting/common-script-uses.md @@ -8,10 +8,19 @@ products: - id: elasticsearch --- -# Common scripting use cases [common-script-uses] +# Painless script tutorials [common-script-uses] -You can write a script to do almost anything, and sometimes, that’s the trouble. It’s challenging to know what’s possible with scripts, so the following examples address common uses cases where scripts are really helpful. - -* [Field extraction](scripting-field-extraction.md) +You can write a script to do almost anything, and sometimes, that’s the challenge. It’s difficult to know what’s possible with scripts, so these tutorials address common use cases where scripts are particularly helpful. +Painless scripting becomes powerful when applied to real-world scenarios. These tutorials walk you through essential patterns and operations, providing working examples you can modify for your specific use cases. +* [Accessing document fields and special variables](/explore-analyze/scripting/modules-scripting-fields.md) +* [Accessing fields in a document](/explore-analyze/scripting/script-fields-api.md) +* [Converting data types](/explore-analyze/scripting/modules-scripting-type-casting-tutorial.md) +* [Dissecting data](/explore-analyze/scripting/dissect.md) +* [Extracting fields](/explore-analyze/scripting/scripting-field-extraction.md) +* [Grokking grok](/explore-analyze/scripting/grok.md) +* [Scripts, caching, and search speed](/explore-analyze/scripting/scripts-search-speed.md) +* [Updating documents](/explore-analyze/scripting/modules-scripting-document-update-tutorial.md) +* [Using Painless regular expressions](/explore-analyze/scripting/modules-scripting-regular-expressions-tutorial.md) +* [Working with dates](/explore-analyze/scripting/modules-scripting-datetime-tutorial.md) diff --git a/explore-analyze/scripting/modules-scripting-datetime-tutorial.md b/explore-analyze/scripting/modules-scripting-datetime-tutorial.md new file mode 100644 index 0000000000..5e2b3fad3a --- /dev/null +++ b/explore-analyze/scripting/modules-scripting-datetime-tutorial.md @@ -0,0 +1,423 @@ +--- +applies_to: + stack: ga + serverless: ga +products: + - id: elasticsearch +--- + +# Working with dates [datetime-tutorial] + +In this tutorial you’ll learn how to use Painless scripts to work with dates in three scenarios: + +* Add quarter and fiscal year fields during ingestion +* Calculate delivery dates with runtime fields +* Transform time data during reindex + +## Prerequisites + +This tutorial uses the `kibana_sample_data_ecommerce` dataset. Refer to [Context example data](elasticsearch://reference/scripting-languages/painless/painless-context-examples.md) to get started. + +The examples work with the dataset’s `order_date` field, which contains ISO 8601-formatted datetime strings such as `2025-08-29T16:49:26+00:00`. For more details refer to [Date field type](elasticsearch://reference/elasticsearch/mapping-reference/date.md). 
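+
+If you want to experiment before building the examples below, the Painless execute API (`POST /_scripts/painless/_execute`) is a convenient scratchpad. A minimal sketch, independent of any index, that parses one of these ISO 8601 strings and returns its year:
+
+```json
+POST /_scripts/painless/_execute
+{
+  "script": {
+    "source": """
+      ZonedDateTime orderDate = ZonedDateTime.parse('2025-08-29T16:49:26+00:00');
+      return orderDate.getYear();
+    """
+  }
+}
+```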
+
+## Add quarter and fiscal year fields during ingestion (ingest context)
+
+The [ingest pipeline](/manage-data/ingest/transform-enrich/ingest-pipelines.md) allows you to add standardized time period fields (like a fiscal quarter) during document ingestion. This is ideal when you need reporting fields such as quarters, fiscal years, and week classifications without calculating them repeatedly at query time.
+
+### Writing the Painless script
+
+To achieve this, we create an ingest pipeline with a script processor that adds the reporting fields we want to each document:
+
+```json
+PUT _ingest/pipeline/kibana_sample_data_ecommerce-add_reporting_fields
+{
+  "description": "Add reporting period fields from order_date",
+  "processors": [
+    {
+      "script": {
+        "lang": "painless",
+        "source": """
+          // Parse order_date string to Calendar object
+          def calendar = Calendar.getInstance();
+          calendar.setTime(Date.from(Instant.parse(ctx.order_date)));
+
+          // Extract date components
+          int year = calendar.get(Calendar.YEAR);
+          int month = calendar.get(Calendar.MONTH) + 1; // Calendar.MONTH is 0-based
+          int dayOfWeek = calendar.get(Calendar.DAY_OF_WEEK);
+
+          // Calculate derived periods
+          int quarter = (int)Math.ceil((double)month / 3.0);
+          int fiscalYear = (month >= 4) ? year : year - 1; // Fiscal year starts in April
+          boolean isWeekend = (dayOfWeek == 1 || dayOfWeek == 7); // Sunday=1, Saturday=7
+
+          // Add reporting fields
+          ctx.reporting_year = year;
+          ctx.reporting_month = month;
+          ctx.reporting_quarter = quarter;
+          ctx.reporting_fiscal_year = fiscalYear;
+          ctx.is_weekend_order = isWeekend;
+
+          ctx.updated_timestamp = new Date();
+        """
+      }
+    }
+  ]
+}
+```
+
+This script includes the following steps:
+
+* **Parse the datetime string:** Uses `Instant.parse()` to convert the ISO 8601 string into a `Date` object via `Calendar.getInstance()`
+* **Extract date components:** Retrieves the year, month, and day of the week, noting that `Calendar.MONTH` is zero-based (January = 0)
+* **Calculate business periods:** Computes the quarter using `Math.ceil()`, the fiscal year assuming an April start, and the weekend classification
+* **Add new fields:** Stores all calculated values in the document context (`ctx`)
+
+For more details about Painless scripting in the ingest context, refer to [Ingest processor context](elasticsearch://reference/scripting-languages/painless/painless-ingest-processor-context.md) and the [Painless syntax-context bridge](/explore-analyze/scripting/painless-syntax-context-bridge.md).
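+
+Once the pipeline exists, you can apply it at index time with the `pipeline` query parameter, or make it the default by setting `index.default_pipeline` on the index. A minimal sketch, assuming a scratch index named `orders-demo`:
+
+```json
+POST orders-demo/_doc?pipeline=kibana_sample_data_ecommerce-add_reporting_fields
+{
+  "order_date": "2025-08-29T16:49:26+00:00"
+}
+```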
+ +### Test the pipeline + +To confirm the pipeline works correctly, [simulate](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ingest-simulate) it with `kibana_sample_data_ecommerce` sample documents: + +```json +POST _ingest/pipeline/kibana_sample_data_ecommerce-add_reporting_fields/_simulate +{ + "docs": [ + { + "_source": { + "updated_timestamp": "2025-08-20T18:21:30.943Z", + "order_date": "2025-08-29T16:49:26+00:00" + } + }, + { + "_source": { + "updated_timestamp": "2025-08-20T18:21:30.943Z", + "order_date": "2025-08-14T10:14:53+00:00" + } + } + ] +} +``` + +The simulation confirms that your pipeline correctly adds the reporting fields to each document + +:::{dropdown} Response + +```json +{ + "docs": [ + { + "doc": { + "_index": "_index", + "_version": "-3", + "_id": "_id", + "_source": { + "order_date": "2025-08-29T16:49:26+00:00", + "reporting_quarter": 3, + "reporting_year": 2025, + "updated_timestamp": "2025-08-21T16:48:31.318Z", + "reporting_fiscal_year": 2025, + "is_weekend_order": false, + "reporting_month": 8 + }, + "_ingest": { + "timestamp": "2025-08-21T16:48:31.318150744Z" + } + } + }, + { + "doc": { + "_index": "_index", + "_version": "-3", + "_id": "_id", + "_source": { + "order_date": "2025-08-14T10:14:53+00:00", + "reporting_quarter": 3, + "reporting_year": 2025, + "updated_timestamp": "2025-08-21T16:48:31.318Z", + "reporting_fiscal_year": 2025, + "is_weekend_order": false, + "reporting_month": 8 + }, + "_ingest": { + "timestamp": "2025-08-21T16:48:31.318189178Z" + } + } + } + ] +} +``` + +::: + +## Calculate delivery dates with runtime fields (runtime fields context) + +In this script, we will calculate the delivery time of an order by projecting the order date a certain number of days into the future. If the resulting date falls on a weekend, it will automatically shift to the following Monday. + +### Understanding runtime fields with date calculations + +Runtime fields compute values at query time, which means you can embed scheduling rules directly in the calculation. In this case, we add a configurable number of delivery days to the order date, then check whether the resulting day falls on a weekend. If it does, the script automatically shifts the delivery date to the next Monday. + +Create the runtime field: + +```json +PUT kibana_sample_data_ecommerce/_mapping +{ + "runtime": { + "delivery_timestamp": { + "type": "date", + "script": { + "lang": "painless", + "source": """ + if (doc.containsKey('order_date')) { + long orderTime = doc['order_date'].value.millis; + long deliveryDays = (long) params.delivery_days; + + // Add delivery days to order date + long deliveryTime = orderTime + (deliveryDays * 24 * 60 * 60 * 1000L); + + // Check if delivery falls on weekend + ZonedDateTime deliveryDateTime = Instant.ofEpochMilli(deliveryTime).atZone(ZoneId.of('UTC')); + int dayOfWeek = deliveryDateTime.getDayOfWeek().getValue(); // 1=Monday, 7=Sunday + + // If weekend, move to next Monday + if (dayOfWeek == 6 || dayOfWeek == 7) { + int daysToAdd = (dayOfWeek == 6) ? 
2 : 1; + deliveryTime = deliveryTime + (daysToAdd * 24 * 60 * 60 * 1000L); + } + + emit(deliveryTime); + } + """, + "params": { + "delivery_days": 3 + } + } + } + } +} +``` + +This script includes the following steps: + +* **Access order date**: Uses `doc['order_date'].value.millis` to retrieve the timestamp in milliseconds +* **Add delivery days**: Converts days into milliseconds and adds them to the order date +* **Check day of week**: Uses `ZonedDateTime` to extract the weekday from the calculated delivery time +* **Skip weekends**: If Saturday or Sunday, moves the date to the following Monday +* **Emit results**: Returns the adjusted delivery timestamp as a date + +If everything works correctly, you should see: + +```json +{ + "acknowledged": true +} +``` + +Now you can sort and display orders by their adjusted delivery timestamp: + +```json +GET kibana_sample_data_ecommerce/_search +{ + "size": 5, + "fields": [ + "delivery_timestamp" + ], + "_source": { + "includes": [ + "order_id", + "order_date", + "customer_full_name" + ] + }, + "sort": [ + { + "delivery_timestamp": { + "order": "asc" + } + } + ] +} +``` + +The results show delivery dates that avoid weekends. For example, orders placed on Thursday or Friday with 3 delivery days are shifted to Monday instead of falling on Saturday or Sunday: + +:::{dropdown} Response + +```json +{ + "hits": { + "hits": [ + { + "_source": { + "customer_full_name": "Sultan Al Benson", + "order_date": "2025-08-07T00:04:19+00:00", + "order_id": 550375 + }, + "fields": { + "delivery_timestamp": [ + "2025-08-11T00:04:19.000Z" + ] + } + }, + { + "_source": { + "customer_full_name": "Pia Webb", + "order_date": "2025-08-07T00:08:38+00:00", + "order_id": 550385 + }, + "fields": { + "delivery_timestamp": [ + "2025-08-11T00:08:38.000Z" + ] + } + }, + { + "_source": { + "customer_full_name": "Jackson Bailey", + "order_date": "2025-08-08T00:12:58+00:00", + "order_id": 713287 + }, + "fields": { + "delivery_timestamp": [ + "2025-08-11T00:12:58.000Z" + ] + } + } + ] + } +} +``` + +::: + +## Transform time data during reindex (reindex context) + +In this example, we'll extract orders from a specific time window and add event-specific timing metadata for flash sale analysis. The script will calculate elapsed minutes from the event start, classify orders into time-based segments, and add fields for event analysis. + +### Understanding the reindex scenario + +Flash sale events generate concentrated purchasing activity within short time windows, making time-based analysis crucial for business intelligence. + +* **Rush periods:** Identifying peak demand moments for server capacity planning +* **Conversion timing:** Analyzing whether early shoppers differ from late-decision customers +* **Promotional effectiveness:** Measuring how purchasing behavior changes throughout the event duration + +**Our 12 AM flash sale example:** E-commerce flash sales commonly start at midnight to capture global audiences and create urgency through limited-time offers. A 12:00 AM start maximizes reach across time zones and leverages the psychological impact of "new day" promotions. 
By categorizing orders into timing segments (`rush_start`, `peak_hour`, `final_rush`), we can analyze: + +* **Rush\_start (0-30 min):** Early adopters who planned their purchase and stayed up for the launch +* **Peak\_hour (30-60 min):** Customers drawn in by social sharing and notifications +* **Final\_rush (60+ min):** Last-minute buyers motivated by scarcity + +Event metadata to be added: + +* `event_name`: Event identifier +* `event_segment`: Time-based classification +* `minutes_from_event_start`: Minutes elapsed since event start +* `event_hour`: Hour classification + +:::{important} +Depending on when you added the dataset to {{es}}, the dates may vary. Make sure to use a recent date range to ensure that the reindex call processes documents. +::: + +### Writing the reindex script + +The [reindex](elasticsearch://reference/elasticsearch/rest-apis/reindex-indices.md) operation will automatically generate the `flash_sale_event_analysis` index as it transfers and transforms documents from the source index. This destination index inherits the same field mappings as the source, with additional fields created by our script: + +```json +POST _reindex +{ + "source": { + "index": "kibana_sample_data_ecommerce", + "query": { + "range": { + "order_date": { + "gte": "2025-08-14T00:00:00", + "lte": "2025-08-14T02:00:00" + } + } + } + }, + "dest": { + "index": "flash_sale_event_analysis" + }, + "script": { + "lang": "painless", + "source": """ + // Parse the order_date string using ZonedDateTime + ZonedDateTime eventTime = ZonedDateTime.parse(ctx._source.order_date); + + int hour = eventTime.getHour(); + int minute = eventTime.getMinute(); + + // Calculate minutes from event start (12 AM = 00:00) + int minutesFromStart = hour * 60 + minute; + + // Classify by event timing + String eventSegment; + if (minutesFromStart <= 30) { + eventSegment = 'rush_start'; // First 30 minutes + } else if (minutesFromStart <= 60) { + eventSegment = 'peak_hour'; // 30-60 minutes + } else { + eventSegment = 'final_rush'; // Last hour + } + + // Add event analysis fields + ctx._source.event_name = 'flash_sale'; + ctx._source.event_segment = eventSegment; + ctx._source.minutes_from_event_start = minutesFromStart; + ctx._source.event_hour = 'hour_' + (hour + 1); // hour_1, hour_2 + """ + } +} +``` + +This script includes the following steps: + +* **Parse timestamps**: Converts ISO 8601 strings to ZonedDateTime objects +* **Calculate event timing**: Determines minutes elapsed since 12:00 AM start +* **Classify behavior periods**: Groups orders into `rush_start`/`peak_hour`/`final_rush` segments +* **Add metadata**: Adds event analysis fields for business intelligence + + +For more details about Painless scripting in the reindex context, refer to the [Reindex context documentation](elasticsearch://reference/scripting-languages/painless/painless-reindex-context.md) or [Reindex API documentation](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-reindex). + +With the following request, we can see the final result in the `flash_sale_event_analysis` index: + +```json +GET flash_sale_event_analysis/_search +``` + +:::{dropdown} Response + +```json +"hits": [ + { + "_index": "flash_sale_event_analysis", + "_id": "c3acyJgBTbKqUnB55U8I", + "_score": 1, + "_source": { + "minutes_from_event_start": 2, + ... + "products": [...], + "event_hour": "hour_1", + ... + "event_segment": "rush_start", + ... + "order_date": "2025-08-25T00:23:02+00:00", + "event_name": "flash_sale", + ... + "order_id": 712908, + ... 
+ } + } +] +``` + +::: + +## Learn more about datetime in Painless + +This tutorial showed you practical datetime scripting across ingest, runtime fields, and reindex contexts. For deeper datetime capabilities and advanced patterns, explore the [Using datetime in Painless](elasticsearch://reference/scripting-languages/painless/using-datetime-in-painless.md) reference documentation. diff --git a/explore-analyze/scripting/modules-scripting-document-update-tutorial.md b/explore-analyze/scripting/modules-scripting-document-update-tutorial.md new file mode 100644 index 0000000000..2bf3876f4f --- /dev/null +++ b/explore-analyze/scripting/modules-scripting-document-update-tutorial.md @@ -0,0 +1,561 @@ +--- +navigation_title: Updating documents +applies_to: + stack: ga + serverless: ga +products: + - id: elasticsearch +--- + +# Updating documents [updating-documents-tutorial] + +In this tutorial you’ll learn how to use Painless scripts to update documents in three scenarios: + +* Update a single document with `_update` +* Update multiple documents that match a query with `_update_by_query` +* Apply tax calculations across product categories with `_update_by_query` + +## Prerequisites + +This tutorial uses the kibana\_sample\_data\_ecommerce dataset. Refer to [Context example data](elasticsearch://reference/scripting-languages/painless/painless-context-examples.md) to get started. + +## Update a single document with `_update` + +The goal is to change the price of a specific product in an order and then update the total price. This tutorial shows how to find a product in an order by its ID, update all of its price fields, and recalculate the order total automatically. + +### Understanding the document structure + +First, you need to find a valid document ID: + +```json +GET kibana_sample_data_ecommerce/_search +{ + "size": 1, + "_source": false +} +``` + +Then you run the script, with your ID to check how the document is structured: + +```json +GET kibana_sample_data_ecommerce/_doc/YOUR_DOCUMENT_ID +``` + +The request returns the following document. Notice the product structure within the `products` array. We will target the price-related fields: + +* `products.price` +* `products.base_price` +* `products.taxful_price` +* `products.taxless_price` +* `products.base_unit_price` + + +This way, all price-related fields are updated together: + +:::{dropdown} Response + +```json +{ + ... + "_source": { + ... + "products": [ + { + "tax_amount": 0, + "taxful_price": 11.99, + "quantity": 1, + "taxless_price": 11.99, + "discount_amount": 0, + "base_unit_price": 11.99, + "discount_percentage": 0, + "product_name": "Basic T-shirt - dark blue/white", + "manufacturer": "Elitelligence", + "min_price": 6.35, + "created_on": "2016-12-26T09:28:48+00:00", + "unit_discount_amount": 0, + "price": 11.99, + "product_id": 6283, + "base_price": 11.99, + "_id": "sold_product_584677_6283", + "category": "Men's Clothing", + "sku": "ZO0549605496" + } + ], + ... + } +} +``` + +::: + +### Writing the update script + +Next, we use the `_update` API to change the price of a specific product inside an order. This ensures that only the document with a specific ID is updated. + +:::{important} +Before running this script, make sure to use a `product_id` that exists in your dataset. You can find valid product IDs by examining the document structure as shown in the previous step, or by running a search query to return a list of available products. 
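+
+For example, a small terms aggregation returns a few product IDs you can substitute into the request below (a minimal sketch):
+
+```json
+GET kibana_sample_data_ecommerce/_search
+{
+  "size": 0,
+  "aggs": {
+    "sample_product_ids": {
+      "terms": { "field": "products.product_id", "size": 5 }
+    }
+  }
+}
+```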
+::: + +```json +POST kibana_sample_data_ecommerce/_update/YOUR_DOCUMENT_ID +{ + "script": { + "lang": "painless", + "source": """ + for (product in ctx._source.products) { + if (product.product_id == params.product_id) { + double old_price = product.taxful_price; + double new_price = params.new_price; + double price_diff = (new_price - old_price) * product.quantity; + + // Update products prices + product.price = new_price; + product.taxful_price = new_price; + product.taxless_price = new_price; + product.base_price = new_price; + product.base_unit_price = new_price; + + // Total amount of the order + ctx._source.taxful_total_price += price_diff; + ctx._source.taxless_total_price += price_diff; + + break; + } + } + """, + "params": { + "product_id": 6283, + "new_price": 70 + } + } +} +``` + +This script includes the following steps: + +1. **Iterate through products:** The script loops through each product in the `ctx._source.products` array +2. **Find the target product:** It compares each `product.product_id` with the parameter value +3. **Calculate price difference:** It determines how much the total order amount should change +4. **Update all price fields:** Multiple price fields are updated to maintain data consistency +5. **Update order totals:** The script adjusts the total order amounts +6. **Exit the loop:** The `break` statement prevents unnecessary iterations after finding the product + + +For more details about Painless scripting in the update context, refer to the [Painless update context documentation](elasticsearch://reference/scripting-languages/painless/painless-update-context.md). + +:::{dropdown} Response + +```json +{ + "_index": "kibana_sample_data_ecommerce", + "_id": "MnacyJgBTbKqUnB54Eep", + "_version": 2, + "result": "updated", + "_shards": { + "total": 2, + "successful": 2, + "failed": 0 + }, + "_seq_no": 4675, + "_primary_term": 1 +} +``` + +::: + +### Verifying the update + +To confirm the update worked correctly, search for the document again: + +```json +GET kibana_sample_data_ecommerce/_doc/YOUR_DOCUMENT_ID +``` + +If everything works correctly, when we update the price, all fields that include it will also be updated. The final document will look like the following if the price is changed to 70.00: + +```json +{ + ... + "_source": { + ... + "products": [ + { + "tax_amount": 0, + "taxful_price": 70, + "quantity": 1, + "taxless_price": 70, + "discount_amount": 0, + "base_unit_price": 70, + "discount_percentage": 0, + "product_name": "Basic T-shirt - dark blue/white", + "manufacturer": "Elitelligence", + "min_price": 6.35, + "created_on": "2016-12-26T09:28:48+00:00", + "unit_discount_amount": 0, + "price": 70, + "product_id": 6283, + "base_price": 70, + "_id": "sold_product_584677_6283", + "category": "Men's Clothing", + "sku": "ZO0549605496" + } + ], + ... + "taxful_total_price": 94.99000000000001, + "taxless_total_price": 94.99000000000001, + ... + } +} +``` + +## Update multiple documents with `_update_by_query` + +The `_update_by_query` API allows you to update multiple documents that match a specific query. You can apply changes to different documents at the same time for tasks like data cleanup, standardization, or any other case where you need to update multiple documents at once. + +### Finding documents to update + +In the following example, we will update all orders where the customer phone number is empty by setting it to “N/A”. 
+ +Let’s find orders with empty customer phone numbers that need to be standardized: + +```json +GET kibana_sample_data_ecommerce/_search +{ + "size": 5, + "query": { + "bool": { + "filter": [ + { + "term": { + "customer_phone": "" + } + } + ] + } + } +} +``` + +This returns documents where the `customer_phone` field is empty: + +```json +{ + ... + "hits": { + ... + "hits": [ + { + ... + "_source": { + ... + "customer_phone": "", + ... + } + }, + ... + ] + } +} +``` + +### Writing the update script for multiple documents + +Now we’ll update all documents with empty phone numbers to have a standardized “N/A” value and add audit fields: + +```json +POST kibana_sample_data_ecommerce/_update_by_query +{ + "query": { + "bool": { + "filter": [ + {"term": {"customer_phone": ""}} + ] + } + }, + "script": { + "lang": "painless", + "source": """ + ctx._source.customer_phone = 'N/A'; + ctx._source.updated_at = new Date(); + """ + } +} +``` + +This script includes the following steps: + +1. **Set a standard value:** Changes empty phone numbers to “N/A” +2. **Record timestamps:** Captures when the update occurred + +For more details about the update by query API parameters and options, refer to the [Update by query API documentation](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-update-by-query). + +:::{dropdown} Response + +```json +{ + "took": 1997, + "timed_out": false, + "total": 4675, + "updated": 4675, + "deleted": 0, + "batches": 5, + "version_conflicts": 0, + "noops": 0, + "retries": { + "bulk": 0, + "search": 0 + }, + "throttled_millis": 0, + "requests_per_second": -1, + "throttled_until_millis": 0, + "failures": [] +} +``` + +::: + +### Verifying the update + +Confirm the updates were applied by searching for documents with the new phone value: + +```json +GET kibana_sample_data_ecommerce/_search +{ + "size": 5, + "query": { + "bool": { + "filter": [ + { + "term": { + "customer_phone": "N/A" + } + } + ] + } + } +} +``` + +:::{dropdown} Response + +```json +{ + ... + "hits": { + ... + "hits": [ + { + ... + "_source": { + ... + "customer_phone": "N/A", + ... + } + } + ] + } +} + + +``` + +::: + +## Apply tax calculations with `_update_by_query` + +In this example, we need to fix an incorrect tax assignment for a specific product category. Many e-commerce systems need to apply tax corrections after the initial data import or when tax regulations change. + +### Understanding the tax correction scenario + +Currently, some products in the “Men’s Clothing” category have incorrect tax information where taxes haven’t been applied: + +```json +{ + "tax_amount": 0, + "taxful_price": 24.99, + "taxless_price": 24.99, + "category": "Men's Clothing" +} +``` + +We need to apply a 21% VAT to all untaxed “Men’s Clothing” products and recalculate both individual product prices and order totals. + +### Writing the tax calculation script + +To update all affected documents, we use a query filtered by category and a script that recalculates taxes. 
+ +```json +POST kibana_sample_data_ecommerce/_update_by_query +{ + "query": { + "bool": { + "filter": [ + { + "term": { + "products.category.keyword": "Men's Clothing" + } + } + ] + } + }, + "script": { + "lang": "painless", + "source": """ + double tax_rate = params.tax_rate; // 21% VAT + double total_tax_adjustment = 0; + + for (product in ctx._source.products) { + if (product.category == "Men's Clothing" && product.tax_amount == 0) { + // Calculate tax based on the taxless price + double tax_amount = Math.round((product.taxless_price * tax_rate) * 100.0) / 100.0; + double new_taxful_price = product.taxless_price + tax_amount; + + // Update tax fields of the product + product.tax_amount = tax_amount; + product.taxful_price = new_taxful_price; + product.price = new_taxful_price; + + total_tax_adjustment += tax_amount * product.quantity; + } + } + + // Update order totals + if (total_tax_adjustment > 0) { + ctx._source.taxful_total_price += total_tax_adjustment; + ctx._source.updated_timestamp = new Date(); + } + """, + "params": { + "tax_rate": 0.21 + } + } +} +``` + +This script includes the following steps: + +1. **Filter by category:** Only processes orders containing “Men’s Clothing” products +2. **Identify untaxed items:** Checks for products where `tax_amount` equals 0 +3. **Calculate VAT:** Applies 21% tax rate to the `taxless_price` +4. **Update product fields:** Sets `tax_amount`, `taxful_price`, and `price` +5. **Accumulate adjustments:** Tracks total tax changes across all products +6. **Update order totals:** Adjusts the overall order `taxful_total_price` + +For more details and examples, refer to the [Update by query context documentation](elasticsearch://reference/scripting-languages/painless/painless-update-by-query-context.md). + +:::{dropdown} Response + +```json + +{ + "took": 789, + "timed_out": false, + "total": 2024, + "updated": 2024, + "deleted": 0, + "batches": 3, + "version_conflicts": 0, + "noops": 0, + "retries": { + "bulk": 0, + "search": 0 + }, + "throttled_millis": 0, + "requests_per_second": -1, + "throttled_until_millis": 0, + "failures": [] +} +``` + +::: + +### Verifying the tax calculation update + +Finally, running a search confirms that the tax update was applied: + +```json + +GET kibana_sample_data_ecommerce/_search +{ + "size": 1, + "query": { + "bool": { + "filter": [ + { + "term": { + "products.category.keyword": "Men's Clothing" + } + } + ] + } + } +} +``` + +:::{dropdown} Response + +```json + +{ + ... + "hits": { + ... + "hits": [ + { + ... + "_source": { + ... + "products": [ + { + "tax_amount": 226.8, + "taxful_price": 1306.78, + "quantity": 2, + "taxless_price": 1079.98, + "discount_amount": 0, + "base_unit_price": 539.99, + "discount_percentage": 0, + "product_name": "Leather jacket - black", + "manufacturer": "Oceanavigations", + "min_price": 259.2, + "created_on": "2016-12-05T06:16:12+00:00", + "unit_discount_amount": 0, + "price": 1306.78, + "product_id": 2669, + "base_price": 1079.98, + "_id": "sold_product_739290_2669", + "category": "Men's Clothing", + "sku": "ZO0288302883" + }, + ... + ], + ... + "taxless_total_price": 2249.92, + ... 
+ "taxful_total_price": 3194.92 + } + } + ] + } +} +``` + +::: + +## When to use each method + +Choose the right API based on your use case: + +### Use `_update` when: + +* Updating, deleting or skipping the modification of a document +* [Updating a part of a document](elasticsearch://reference/elasticsearch/rest-apis/update-document.md#update-part-document) +* [Inserting or updating documents with upsert](elasticsearch://reference/elasticsearch/rest-apis/update-document.md#upsert) + +### Use `_update_by_query` when: + +* [Running a basic update](elasticsearch://reference/elasticsearch/rest-apis/update-by-query-api.md#run-basic-updates) +* Updating multiple documents based on search criteria +* [Updating the document source](elasticsearch://reference/elasticsearch/rest-apis/update-by-query-api.md#update-the-document-source) +* [Updating documents using an ingest pipeline](elasticsearch://reference/elasticsearch/rest-apis/update-by-query-api.md#update-documents-using-an-ingest-pipeline) + +For detailed API specifications, refer to the [Update a document](elasticsearch://reference/elasticsearch/rest-apis/update-document.md) and [Update by query API documentation](elasticsearch://reference/elasticsearch/rest-apis/update-by-query-api.md). diff --git a/explore-analyze/scripting/modules-scripting-engine.md b/explore-analyze/scripting/modules-scripting-engine.md index f8552dc57f..880dc1fd0e 100644 --- a/explore-analyze/scripting/modules-scripting-engine.md +++ b/explore-analyze/scripting/modules-scripting-engine.md @@ -8,167 +8,81 @@ products: - id: elasticsearch --- -# Advanced scripts using script engines [modules-scripting-engine] +# Implementing custom scripting language in Elasticsearch [modules-scripting-engine] -A `ScriptEngine` is a backend for implementing a scripting language. It may also be used to write scripts that need to use advanced internals of scripting. For example, a script that wants to use term frequencies while scoring. +A `ScriptEngine` is a backend for implementing a scripting language in {{es}}. -The plugin [documentation](elasticsearch://extend/index.md) has more information on how to write a plugin so that Elasticsearch will properly load it. To register the `ScriptEngine`, your plugin should implement the `ScriptPlugin` interface and override the `getScriptEngine(Settings settings)` method. +## How it works -The following is an example of a custom `ScriptEngine` which uses the language name `expert_scripts`. It implements a single script called `pure_df` which may be used as a search script to override each document’s score as the document frequency of a provided term. +Custom script engines integrate with {{es}} scripting framework through the `ScriptEngine` interface. To register the `ScriptEngine`, your plugin should implement the `ScriptPlugin` interface and override the `getScriptEngine(Settings settings)` method during plugin initialization. + +## When to implement + +Consider implementing a custom script engine when you need to use advanced internals of scripting, such as scripts that require term frequencies while scoring, or when implementing specialized scripting languages with custom syntax beyond standard Painless capabilities. + +## Example implementation + +The plugin [documentation](elasticsearch://extend/index.md) has more information on how to write a plugin so {{es}} will properly load it. 
For the complete ScriptEngine interface reference, refer to the [official implementation](https://github.com/elastic/elasticsearch/blob/main/server/src/main/java/org/elasticsearch/script/ScriptEngine.java).
+
+### What this script does
+
+This code creates a custom script engine that allows you to use `expert_scripts` as the language name and `pure_df` as the script source in your {{es}} queries. The script calculates document scores using term frequency data instead of the standard {{es}} scoring algorithm.
+
+The following example shows the essential parts of implementing a custom `ScriptEngine`:

```java
private static class MyExpertScriptEngine implements ScriptEngine {
+
+    // 1. Define your custom language name
    @Override
    public String getType() {
-        return "expert_scripts";
+        return "expert_scripts"; // This becomes your "lang" value
    }

+    // 2. Define your script source and compilation
    @Override
-    public <T> T compile(
-        String scriptName,
-        String scriptSource,
-        ScriptContext<T> context,
-        Map<String, String> params
-    ) {
-        if (context.equals(ScoreScript.CONTEXT) == false) {
-            throw new IllegalArgumentException(getType()
-                    + " scripts cannot be used for context ["
-                    + context.name + "]");
-        }
-        // we use the script "source" as the script identifier
+    public <T> T compile(String scriptName, String scriptSource,
+                         ScriptContext<T> context, Map<String, String> params) {
+        // This recognizes "pure_df" as your script source
        if ("pure_df".equals(scriptSource)) {
            ScoreScript.Factory factory = new PureDfFactory();
            return context.factoryClazz.cast(factory);
        }
-        throw new IllegalArgumentException("Unknown script name "
-                + scriptSource);
-    }
-
-    @Override
-    public void close() {
-        // optionally close resources
+        throw new IllegalArgumentException("Unknown script: " + scriptSource);
    }
+
+    // ... (additional required methods)
+}

+// 3. Where the actual score calculation happens
+private static class ScoreScriptImpl extends ScoreScript {
    @Override
-    public Set<ScriptContext<?>> getSupportedContexts() {
-        return Set.of(ScoreScript.CONTEXT);
-    }
-
-    private static class PureDfFactory implements ScoreScript.Factory,
-                                                  ScriptFactory {
-        @Override
-        public boolean isResultDeterministic() {
-            // PureDfLeafFactory only uses deterministic APIs, this
-            // implies the results are cacheable.
- return true; - } - - @Override - public LeafFactory newFactory( - Map params, - SearchLookup lookup - ) { - return new PureDfLeafFactory(params, lookup); + public double execute(ExplanationHolder explanation) { + // This is where you define your custom scoring logic + // In this example: return term frequency as the score + try { + return postings.freq(); // Custom score calculation + } catch (IOException e) { + return 0.0d; } } +} - private static class PureDfLeafFactory implements LeafFactory { - private final Map params; - private final SearchLookup lookup; - private final String field; - private final String term; - - private PureDfLeafFactory( - Map params, SearchLookup lookup) { - if (params.containsKey("field") == false) { - throw new IllegalArgumentException( - "Missing parameter [field]"); - } - if (params.containsKey("term") == false) { - throw new IllegalArgumentException( - "Missing parameter [term]"); - } - this.params = params; - this.lookup = lookup; - field = params.get("field").toString(); - term = params.get("term").toString(); - } +``` - @Override - public boolean needs_score() { - return false; // Return true if the script needs the score - } +### Key points - @Override - public boolean needs_termStats() { - return false; // Return true if the script needs term statistics via get_termStats() - } +* **Language Definition**: The `getType()` method returns `expert_scripts`, which becomes the value you use for the `lang` parameter in your scripts. +* **Script Recognition:** The `compile()` method identifies `pure_df` as a valid script source, which becomes the value you use for the `source` parameter. +* **Custom Scoring:** The `execute()` method replaces {{es}} standard scoring with your custom logic. In this case, using term frequency as the document score. 
-        @Override
-        public ScoreScript newInstance(DocReader docReader)
-                throws IOException {
-            DocValuesDocReader dvReader = ((DocValuesDocReader) docReader);
-            PostingsEnum postings = dvReader.getLeafReaderContext()
-                    .reader().postings(new Term(field, term));
-            if (postings == null) {
-                /*
-                 * the field and/or term don't exist in this segment,
-                 * so always return 0
-                 */
-                return new ScoreScript(params, lookup, docReader) {
-                    @Override
-                    public double execute(
-                        ExplanationHolder explanation
-                    ) {
-                        if(explanation != null) {
-                            explanation.set("An example optional custom description to explain details for this script's execution; we'll provide a default one if you leave this out.");
-                        }
-                        return 0.0d;
-                    }
-                };
-            }
-            return new ScoreScript(params, lookup, docReader) {
-                int currentDocid = -1;
-                @Override
-                public void setDocument(int docid) {
-                    /*
-                     * advance has undefined behavior calling with
-                     * a docid <= its current docid
-                     */
-                    if (postings.docID() < docid) {
-                        try {
-                            postings.advance(docid);
-                        } catch (IOException e) {
-                            throw new UncheckedIOException(e);
-                        }
-                    }
-                    currentDocid = docid;
-                }
-                @Override
-                public double execute(ExplanationHolder explanation) {
-                    if(explanation != null) {
-                        explanation.set("An example optional custom description to explain details for this script's execution; we'll provide a default one if you leave this out.");
-                    }
-                    if (postings.docID() != currentDocid) {
-                        /*
-                         * advance moved past the current doc, so this
-                         * doc has no occurrences of the term
-                         */
-                        return 0.0d;
-                    }
-                    try {
-                        return postings.freq();
-                    } catch (IOException e) {
-                        throw new UncheckedIOException(e);
-                    }
-                }
-            };
-        }
-    }
-}
-```
+**For the complete implementation, refer to the [official script engine example](https://github.com/elastic/elasticsearch/blob/main/plugins/examples/script-expert-scoring/src/main/java/org/elasticsearch/example/expertscript/ExpertScriptPlugin.java).**
+
+### Usage example

-You can execute the script by specifying its `lang` as `expert_scripts`, and the name of the script as the script source:
+This example shows how to use your custom script engine in a search query:

-```console
+```json
POST /_search
{
  "query": {
@@ -195,5 +109,7 @@ POST /_search
    }
  }
}
+
```
+
diff --git a/explore-analyze/scripting/modules-scripting-painless.md b/explore-analyze/scripting/modules-scripting-painless.md
index b1adcfed6b..050bfd1b4f 100644
--- a/explore-analyze/scripting/modules-scripting-painless.md
+++ b/explore-analyze/scripting/modules-scripting-painless.md
@@ -1,4 +1,5 @@
 ---
+navigation_title: Painless
 mapped_pages:
   - https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-scripting-painless.html
 applies_to:
@@ -8,20 +9,80 @@ products:
   - id: elasticsearch
 ---

-# Painless scripting language [modules-scripting-painless]
+# Introduction to Painless [modules-scripting-painless]

-*Painless* is a performant, secure scripting language designed specifically for {{es}}. You can use Painless to safely write inline and stored scripts anywhere scripts are supported in {{es}}.
+:::{tip}
+This introduction is designed for users new to Painless scripting. If you're already familiar with Painless, refer to the [Painless language specification](elasticsearch://reference/scripting-languages/painless/painless-language-specification.md) for syntax details and advanced features.
+:::

-$$$painless-features$$$
-Painless provides numerous capabilities that center around the following core principles:
+Painless is a secure, performant, and flexible scripting language designed specifically for {{es}}. As the default scripting language for {{es}}, Painless lets you safely customize search behavior, data processing, and operations workflows across your {{stack}} deployments.

-* **Safety**: Ensuring the security of your cluster is of utmost importance. To that end, Painless uses a fine-grained allowlist with a granularity down to the members of a class. Anything that is not part of the allowlist results in a compilation error. See the [Painless API Reference](https://www.elastic.co/guide/en/elasticsearch/painless/current/painless-api-reference.html) for a complete list of available classes, methods, and fields per script context.
-* **Performance**: Painless compiles directly into JVM bytecode to take advantage of all possible optimizations that the JVM provides. Also, Painless typically avoids features that require additional slower checks at runtime.
-* **Simplicity**: Painless implements a syntax with a natural familiarity to anyone with some basic coding experience. Painless uses a subset of Java syntax with some additional improvements to enhance readability and remove boilerplate.
+## What is Painless?
+
+Painless was introduced in [{{es}} 5.0](https://www.elastic.co/blog/painless-a-new-scripting-language) as a replacement for Groovy, offering improved security and performance over earlier scripting solutions. Built on the [Java Virtual Machine (JVM)](https://docs.oracle.com/en/java/javase/24/vm/java-virtual-machine-technology-overview.html), Painless provides familiar Java-like syntax while enforcing security boundaries through guardrails and a sandboxed environment.

-## Start scripting [_start_scripting]
+Unlike general-purpose scripting languages, Painless is purpose-built for {{es}}, enabling native performance while preventing unauthorized access to system resources. This architecture makes Painless both powerful for data manipulation and safe for production environments.

-Ready to start scripting with Painless? Learn how to [write your first script](modules-scripting-using.md).
+Common use cases include creating new fields based on existing data, calculating time differences between dates, extracting structured data from log messages, and implementing custom business logic in search scoring. For more examples, refer to our step-by-step [tutorials](/explore-analyze/scripting/common-script-uses.md).

-If you’re already familiar with Painless, see the [Painless Language Specification](elasticsearch://reference/scripting-languages/painless/painless-language-specification.md) for a detailed description of the Painless syntax and language features.
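+
+For instance, deriving a new value from existing fields usually takes only a couple of lines. A minimal sketch of a runtime-field style calculation (the `ordered_at` and `delivered_at` field names are hypothetical):
+
+```painless
+// Hours between two hypothetical date fields
+long millis = doc['delivered_at'].value.millis - doc['ordered_at'].value.millis;
+emit(millis / (1000L * 60L * 60L));
+```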
+
+## Benefits
+
+Painless enables scripting in various contexts throughout {{es}}, such as:
+
+### Search enhancement
+
+* Custom search scoring based on business requirements
+* Runtime field creation that calculates values during query execution
+* Real-time filtering and transformation without reindexing data
+
+### Data processing
+
+* Transform documents during indexing
+* Parse and extract structured data from unstructured fields
+* Calculate metrics and summaries from your data
+
+### Operational automation
+
+* Monitor data patterns and trigger alerts with Watcher
+* Transform alert payloads for targeted notifications and actions
+
+## How it works
+
+You can write Painless scripts inline for quick operations or create reusable functions for your data operations. Here’s a sample Painless script applied to data transformation:
+
+```java
+String productTitle(String manufacturer, String productName) {
+    return manufacturer + " - " + productName;
+}
+
+return productTitle("Elitelligence", "Winter jacket");
+```
+
+This script demonstrates a few facets of Painless scripting:
+
+* **Function definition:** Custom `productTitle` function with typed parameters
+* **Data types:** Explicitly typed `String` parameters and a `String` return value
+* **Return values:** Function returns a formatted string output
+
+Painless provides three core benefits across all scripting contexts:
+
+* **Security:** Fine-grained allowlists that prevent access to restricted Java APIs.
+* **Performance:** Direct compilation to [bytecode](https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html) eliminates interpretation overhead and leverages JVM optimization.
+* **Flexibility:** A wide range of scripting syntax and contexts across {{es}}, from search scoring to data processing to operational automation.
+
+## Where to write in Painless
+
+You can use Painless in multiple contexts throughout {{es}}:
+
+* [**Dev Tools Console**](/explore-analyze/query-filter/tools/console.md)**:** for interactive script development and testing
+* [**Ingest pipelines**](/manage-data/ingest/transform-enrich/ingest-pipelines.md)**:** for data transformation during indexing
+* [**Search queries**](/solutions/search.md)**:** for custom scoring and script fields
+* [**Runtime fields**](/manage-data/data-store/mapping/runtime-fields.md)**:** for dynamic field creation
+* [**Update API**](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-update)**:** for document modification
+* [**Watcher**](/explore-analyze/alerts-cases/watcher.md)**:** for alert conditions and actions
+
+## Start scripting
+
+Write your first Painless script by trying out our [guide](/explore-analyze/scripting/modules-scripting-using.md) or jump into one of our [tutorials](/explore-analyze/scripting/common-script-uses.md) for real-world examples using sample data.
+
+For complete syntax and language features, refer to the [Painless language specification](elasticsearch://reference/scripting-languages/painless/painless-language-specification.md).
diff --git a/explore-analyze/scripting/modules-scripting-regular-expressions-tutorial.md b/explore-analyze/scripting/modules-scripting-regular-expressions-tutorial.md new file mode 100644 index 0000000000..37bf9b3c62 --- /dev/null +++ b/explore-analyze/scripting/modules-scripting-regular-expressions-tutorial.md @@ -0,0 +1,440 @@ +--- +applies_to: + stack: ga + serverless: ga +products: + - id: elasticsearch +--- + +# Using Painless regular expressions [regular-expressions-tutorial] + +## Prerequisites + +This tutorial uses the `kibana_sample_data_ecommerce` dataset. Refer to [Context example data](elasticsearch://reference/scripting-languages/painless/painless-context-examples.md) to get started. + +## Extract customer email domains for marketing segmentation + +The goal is to extract and categorize email domains during document ingestion for marketing segmentation. We will use regular expression pattern matching to parse a set of email addresses and break them into their component parts. The expression `/([^@]+)@(.+)/` splits emails at the @ symbol, while domain-specific patterns like `(gmail|yahoo|hotmail|outlook|icloud|aol)\.com/` identify personal email providers for automated categorization. + +### Understanding email domain extraction + +The ingest pipeline allows you to extract structured data from email addresses during indexing. Regular expressions parse email components and categorize domains, creating fields for targeted marketing campaigns and customer analysis. + +### Writing the Painless script + +Create an ingest pipeline that extracts email domains and categorizes them: + +```json +PUT _ingest/pipeline/kibana_sample_data_ecommerce-extract_email_domains +{ + "description": "Extract and categorize email domains from customer emails", + "processors": [ + { + "script": { + "lang": "painless", + "source": """ + // Extract email domain using regex + Pattern emailPattern = /([^@]+)@(.+)/; + Matcher emailMatcher = emailPattern.matcher(ctx.email); + + if (emailMatcher.matches()) { + String username = emailMatcher.group(1); + String domain = emailMatcher.group(2); + + // Store extracted components + ctx.email_username = username; + ctx.email_domain = domain; + + // Categorize domain type using regex patterns + Pattern personalDomains = /(gmail|yahoo|hotmail|outlook|icloud|aol)\.com/; + Pattern businessDomains = /\.(co|corp|inc|ltd|org|edu|gov)$/; + Pattern testDomains = /\.zzz$/; + + if (testDomains.matcher(domain).find()) { + ctx.email_category = "test"; + } else if (personalDomains.matcher(domain).find()) { + ctx.email_category = "personal"; + } else if (businessDomains.matcher(domain).find()) { + ctx.email_category = "business"; + } else { + ctx.email_category = "other"; + } + + // Extract top-level domain + Pattern tldPattern = /\.([a-zA-Z]{2,})$/; + Matcher tldMatcher = tldPattern.matcher(domain); + if (tldMatcher.find()) { + ctx.email_tld = tldMatcher.group(1); + } + } + """ + } + } + ] +} +``` + +This script includes the following steps: + +* **Pattern matching:** Uses regex `/([^@]+)@(.+)/` to split email into username and domain parts +* **Domain categorization:** Classifies domains using specific patterns: + * Personal: [gmail.com](http://gmail.com), [yahoo.com](http://yahoo.com), [hotmail.com](http://hotmail.com) + * Business: domains ending in .co, .corp, .inc, .org, .edu, .gov + * Test: domains ending in .zzz + * Other: Any domain not matching the above patterns +* **TLD extraction:** Extracts the top-level domain (com, org, edu, etc.) 
using `/\.([a-zA-Z]{2,})$/` regular expression +* **Test domain detection:** Identifies sample data domains ending in `.zzz` (used in the Kibana ecommerce sample dataset) +* **Fields addition:** Adds new fields (`email_username`, `email_domain`, `email_category`, `email_tld`) to the document for analytics and segmentation + +For more details about Painless regular expressions, refer to [Painless regex documentation](elasticsearch://reference/scripting-languages/painless/painless-regexes.md). + +### Test the pipeline + +Simulate the pipeline with sample e-commerce customer data: + +```json +POST _ingest/pipeline/kibana_sample_data_ecommerce-extract_email_domains/_simulate +{ + "docs": [ + { + "_source": { + "customer_full_name": "Eddie Underwood", + "email": "eddie@underwood-family.zzz" + } + }, + { + "_source": { + "customer_full_name": "John Smith", + "email": "john.smith@gmail.com" + } + }, + { + "_source": { + "customer_full_name": "Sarah Wilson", + "email": "s.wilson@acme-corp.com" + } + } + ] +} +``` + +:::{dropdown} Response + +```json +{ + "docs": [ + { + "doc": { + ..., + "_source": { + "customer_full_name": "Eddie Underwood", + "email_tld": "zzz", + "email_username": "eddie", + "email_domain": "underwood-family.zzz", + "email_category": "test", + "email": "eddie@underwood-family.zzz" + }, + "_ingest": { + "timestamp": "2025-08-27T16:58:19.746710068Z" + } + } + }, + { + "doc": { + ..., + "_source": { + "customer_full_name": "John Smith", + "email_tld": "com", + "email_username": "john.smith", + "email_domain": "gmail.com", + "email_category": "personal", + "email": "john.smith@gmail.com" + }, + "_ingest": { + "timestamp": "2025-08-27T16:58:19.746774486Z" + } + } + }, + { + "doc": { + ..., + "_source": { + "customer_full_name": "Sarah Wilson", + "email_tld": "com", + "email_username": "s.wilson", + "email_domain": "acme-corp.com", + "email_category": "business", + "email": "s.wilson@acme-corp.com" + }, + "_ingest": { + "timestamp": "2025-08-27T16:58:19.746786425Z" + } + } + } + ] +} +``` + +::: + +## Analyze product SKU patterns with aggregations + +The goal is to parse SKU codes using regular expressions to extract manufacturer identifiers, then group products by these patterns to reveal inventory distribution across clothing, accessories, and footwear categories without creating additional index fields. + +### Understanding SKU pattern analysis with aggregations + +Aggregations with runtime fields allow you to analyze SKU patterns across thousands of products. The regex patterns validate SKU formats and extract meaningful segments from codes like “ZO0549605496" to categorize products by manufacturer patterns. 
+ +### Writing the Painless script + +Create aggregations that analyze SKU patterns: + +```json +GET kibana_sample_data_ecommerce/_search +{ + "size": 0, + "runtime_mappings": { + "sku_category": { + "type": "keyword", + "script": { + "lang": "painless", + "source": """ + if (doc.containsKey('sku')) { + String sku = doc['sku'].value; + + // Validate SKU format using regex + Pattern skuPattern = /^ZO(\d{4})(\d{6})$/; + Matcher skuMatcher = skuPattern.matcher(sku); + + if (skuMatcher.matches()) { + String manufacturerCode = skuMatcher.group(1); // First 4 digits after ZO + + // Determine product category based on manufacturer code patterns + if (manufacturerCode =~ /^0[1-3].*/) { + emit("clothing"); + } else if (manufacturerCode =~ /^0[4-6].*/) { + emit("accessories"); + } else if (manufacturerCode =~ /^0[0-9].*/) { + emit("footwear"); + } else { + emit("other"); + } + } else { + emit("invalid_or_unknown"); + } + } else { + emit("invalid_or_unknown"); + } + """ + } + } + }, + "aggs": { + "sku_category_breakdown": { + "terms": { + "field": "sku_category" + } + } + } +} +``` + +This script includes the following steps: + +* **Format validation:** Uses regex `/^ZO(\d{4})(\d{6})$/` to ensure the SKU formats follow the expected pattern (ZO \+ 4 digits \+ 6 digits) +* **Component extraction:** Separates the manufacturer code product ID to enable category analysis +* **Category classification:** Maps manufacturer patterns to product types for inventory reporting +* **Aggregated analysis:** Counts products by category to show inventory distribution without storing new fields + +For more details about Painless regular expressions, refer to [Painless regex documentation](elasticsearch://reference/scripting-languages/painless/painless-regexes.md). + +The results provide comprehensive SKU format analysis across your product catalog: + +:::{dropdown} Response + +```json +{ + ..., + "hits": { + ... + }, + "aggregations": { + "sku_category_breakdown": { + "doc_count_error_upper_bound": 0, + "sum_other_doc_count": 0, + "buckets": [ + { + "key": "clothing", + "doc_count": 2337 + }, + { + "key": "accessories", + "doc_count": 1295 + }, + { + "key": "footwear", + "doc_count": 1043 + } + ] + } + } +} +``` + +::: + +## Convert custom date formats during data ingestion + +The goal is to convert custom timestamp formats like “25-AUG-2025@14h30m” into ISO 8601 standard “2025-08-25T14:30:00Z” during data ingestion. This ensures consistent date fields across datasets from different source systems. + +### Understanding custom date format conversion + +Legacy manufacturing and ERP systems often use non-standard date formats that {{es}} cannot directly parse. Converting these during ingestion eliminates the need for repeated parsing at query time and ensures proper date range filtering and sorting. 
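+
+As with the SKU example, you can verify the extraction pattern in isolation before building the pipeline. A minimal sketch using the Painless execute API and the sample value from above:
+
+```json
+POST /_scripts/painless/_execute
+{
+  "script": {
+    "source": """
+      Matcher m = /(\d{1,2})-([A-Z]{3})-(\d{4})@(\d{2})h(\d{2})m/.matcher('25-AUG-2025@14h30m');
+      return m.matches() ? [m.group(1), m.group(2), m.group(3), m.group(4), m.group(5)] : null;
+    """
+  }
+}
+```
+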
### Writing the Painless script

Create an ingest pipeline that converts custom date formats:

```json
PUT _ingest/pipeline/kibana_sample_data_ecommerce-convert_custom_dates
{
  "description": "Convert custom date formats to ISO 8601 standard",
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source": """
          // Format: "25-AUG-2025@14h30m" (DD-MMM-YYYY@HHhMMm)
          if (ctx.containsKey('manufacturing_date_custom')) {
            String customDate = ctx.manufacturing_date_custom;

            // Extract date components using regex
            Pattern customDatePattern = /(\d{1,2})-([A-Z]{3})-(\d{4})@(\d{2})h(\d{2})m/;
            Matcher customDateMatcher = customDatePattern.matcher(customDate);

            if (customDateMatcher.matches()) {
              String day = customDateMatcher.group(1);
              String monthAbbr = customDateMatcher.group(2);
              String year = customDateMatcher.group(3);
              String hour = customDateMatcher.group(4);
              String minute = customDateMatcher.group(5);

              Map monthMap = [
                'JAN': '01', 'FEB': '02', 'MAR': '03', 'APR': '04',
                'MAY': '05', 'JUN': '06', 'JUL': '07', 'AUG': '08',
                'SEP': '09', 'OCT': '10', 'NOV': '11', 'DEC': '12'
              ];

              // Look up the month number from the month abbreviation
              String monthNum = monthMap.getOrDefault(monthAbbr, '01');

              // Format day with leading zero using regex pattern validation
              Pattern singleDigitPattern = /^\d$/;
              String dayFormatted = singleDigitPattern.matcher(day).matches() ? "0" + day : day;

              // Create ISO 8601 formatted date
              ctx.manufacturing_date = year + "-" + monthNum + "-" + dayFormatted + "T" + hour + ":" + minute + ":00Z";
              ctx.manufacturing_date_parsed = true;

              // Validate final ISO format using regex
              Pattern validDatePattern = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$/;
              ctx.iso_format_valid = validDatePattern.matcher(ctx.manufacturing_date).matches();

            } else {
              ctx.manufacturing_date_parsed = false;
              ctx.parse_error = "Invalid custom date format";
            }
          }
        """
      }
    }
  ]
}
```

This script includes the following steps:

* **Pattern extraction:** Uses regex `/(\d{1,2})-([A-Z]{3})-(\d{4})@(\d{2})h(\d{2})m/` to parse the custom format "25-AUG-2025@14h30m"
* **Month conversion:** Uses a Map lookup instead of multiple regex conditions for cleaner code
* **Regex validation:** Uses the `/^\d$/` pattern to detect single-digit days that need zero-padding
* **Format standardization:** Reconstructs date components into ISO 8601 format (YYYY-MM-DDTHH:mm:ssZ)
* **ISO validation:** Additional regex `/^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$/` confirms the final format is valid

For more details about Painless scripting in the ingest context, refer to [Ingest processor context](elasticsearch://reference/scripting-languages/painless/painless-ingest-processor-context.md).
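
The pipeline only runs when a request references it. After validating it with the simulation in the next section, you can apply it to incoming documents by naming it in an index request; the index name here is hypothetical:

```json
POST my-manufacturing-data/_doc?pipeline=kibana_sample_data_ecommerce-convert_custom_dates
{
  "product_name": "Winter Jacket",
  "manufacturing_date_custom": "25-AUG-2025@14h30m"
}
```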
### Test the pipeline

Simulate the pipeline with sample custom date formats:

```json
POST _ingest/pipeline/kibana_sample_data_ecommerce-convert_custom_dates/_simulate
{
  "docs": [
    {
      "_source": {
        "product_name": "Winter Jacket",
        "manufacturing_date_custom": "25-AUG-2025@14h30m"
      }
    },
    {
      "_source": {
        "product_name": "Summer Shoes",
        "manufacturing_date_custom": "03-JUN-2025@09h15m"
      }
    }
  ]
}
```

The simulation shows successful conversion to ISO 8601 format:

:::{dropdown} Response

```json
{
  "docs": [
    {
      "doc": {
        ...,
        "_source": {
          "manufacturing_date_custom": "25-AUG-2025@14h30m",
          "manufacturing_date": "2025-08-25T14:30:00Z",
          "iso_format_valid": true,
          "product_name": "Winter Jacket",
          "manufacturing_date_parsed": true
        },
        "_ingest": {
          "timestamp": "2025-08-27T18:57:00.481410405Z"
        }
      }
    },
    {
      "doc": {
        ...,
        "_source": {
          "manufacturing_date_custom": "03-JUN-2025@09h15m",
          "manufacturing_date": "2025-06-03T09:15:00Z",
          "iso_format_valid": true,
          "product_name": "Summer Shoes",
          "manufacturing_date_parsed": true
        },
        "_ingest": {
          "timestamp": "2025-08-27T18:57:00.481449186Z"
        }
      }
    }
  ]
}
```

:::

## Learn more about regular expressions in Painless

This tutorial demonstrates practical regex applications across ingest, aggregation, and runtime field contexts. For comprehensive regex capabilities and advanced pattern techniques, explore these resources:

* [Painless regex documentation](elasticsearch://reference/scripting-languages/painless/painless-regexes.md): Complete syntax, operators, and pattern flags
* [Painless syntax-context bridge](/explore-analyze/scripting/painless-syntax-context-bridge.md): How data access methods apply across different Painless contexts
* [Painless contexts](elasticsearch://reference/scripting-languages/painless/painless-contexts.md): An overview of each context in Painless

diff --git a/explore-analyze/scripting/modules-scripting-security.md b/explore-analyze/scripting/modules-scripting-security.md
index 65acd64490..c3a3333da4 100644
--- a/explore-analyze/scripting/modules-scripting-security.md
+++ b/explore-analyze/scripting/modules-scripting-security.md
@@ -8,56 +8,48 @@ products:
 - id: elasticsearch
---

-# Scripting and security [modules-scripting-security]
+# Scripting and security in Painless [modules-scripting-security]

-Painless and {{es}} implement layers of security to build a defense in depth strategy for running scripts safely.
+As part of its core design, Painless provides secure scripting capabilities across {{es}}.

-Painless uses a fine-grained allowlist. Anything that is not part of the allowlist results in a compilation error. This capability is the first layer of security in a defense in depth strategy for scripting.
+Introduced in [{{es}} 5.0](https://www.elastic.co/blog/painless-a-new-scripting-language) as a replacement for Groovy, Painless is purpose-built for {{es}}, enabling native performance while preventing unauthorized access to system resources.
+
+By operating in a controlled sandbox environment, Painless prevents scripts from compromising your cluster: it only allows pre-approved operations through fine-grained allowlists. Scripts cannot access file systems, networks, or other system resources, while the language still provides the flexibility you need for search scoring, data processing, and operational automation.
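+For example, a script that tries to call a JVM method outside the allowlist (such as `System.exit`) fails at compile time instead of running. This minimal sketch uses the [Painless execute API](elasticsearch://reference/scripting-languages/painless/painless-api-examples.md) and assumes only a default cluster:

```console
POST /_scripts/painless/_execute
{
  "script": {
    "source": "System.exit(0)"
  }
}
```

+Because `System.exit` is not on the allowlist, {{es}} rejects the script with a compile error before any code runs.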
-The second layer of security is the [Java Security Manager](https://www.oracle.com/java/technologies/javase/seccodeguide.html). As part of its startup sequence, {{es}} enables the Java Security Manager to limit the actions that portions of the code can take. [Painless](modules-scripting-painless.md) uses the Java Security Manager as an additional layer of defense to prevent scripts from doing things like writing files and listening to sockets.
+## Security architecture overview

-{{es}} uses [seccomp](https://en.wikipedia.org/wiki/Seccomp) in Linux, [Seatbelt](https://www.chromium.org/developers/design-documents/sandbox/osx-sandboxing-design) in macOS, and [ActiveProcessLimit](https://msdn.microsoft.com/en-us/library/windows/desktop/ms684147) on Windows as additional security layers to prevent {{es}} from forking or running other processes.
+The fine-grained allowlist operates as the first security layer. Anything that is not part of the allowlist results in a compilation error.

-Finally, scripts used in [scripted metrics aggregations](elasticsearch://reference/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md) can be restricted to a defined list of scripts, or forbidden altogether. This can prevent users from running particularly slow or resource intensive aggregation queries.
+As another layer of security, {{es}} uses [Seccomp](https://en.wikipedia.org/wiki/Seccomp) in Linux, [Seatbelt](https://www.chromium.org/developers/design-documents/sandbox/osx-sandboxing-design) in macOS, and [ActiveProcessLimit](https://msdn.microsoft.com/en-us/library/windows/desktop/ms684147) on Windows to prevent {{es}} from forking or running other processes.

-You can modify the following script settings to restrict the type of scripts that are allowed to run, and control the available [contexts](elasticsearch://reference/scripting-languages/painless/painless-contexts.md) that scripts can run in. To implement additional layers in your defense in depth strategy, follow the [{{es}} security principles](../../deploy-manage/security.md).
+Finally, scripts used in [scripted metrics aggregations](elasticsearch://reference/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md) can be restricted to a defined list of scripts or forbidden altogether. This can prevent users from running particularly slow or resource-intensive aggregation queries.
+
+You can modify the allowed script types setting to restrict the type of scripts that are allowed to run and control the available [contexts](elasticsearch://reference/scripting-languages/painless/painless-contexts.md) that scripts can run in. You can also use the {{es}} [security features](/deploy-manage/security.md) to add further layers to your defense in depth strategy.

## Allowed script types setting [allowed-script-types-setting]

-{{es}} supports two script types: `inline` and `stored`. By default, {{es}} is configured to run both types of scripts. To limit what type of scripts are run, set `script.allowed_types` to `inline` or `stored`. To prevent any scripts from running, set `script.allowed_types` to `none`.
-
-::::{important}
-If you use {{kib}}, set `script.allowed_types` to both or just `inline`. Some {{kib}} features rely on inline scripts and do not function as expected if {{es}} does not allow inline scripts.
-::::
-
+{{es}} supports two script types: `inline` and `stored`. By default, {{es}} is configured to run both types of scripts. To limit what type of scripts are run, set `script.allowed_types` to `inline` or `stored`.
To prevent any scripts from running, set `script.allowed_types` to `none`. If you use {{kib}}, set `script.allowed_types` to both or just `inline`. Some {{kib}} features rely on inline scripts and do not function as expected if {{es}} does not allow inline scripts.

For example, to run `inline` scripts but not `stored` scripts:

```yaml
script.allowed_types: inline
```

## Allowed script contexts setting [allowed-script-contexts-setting]

-By default, all script contexts are permitted. Use the `script.allowed_contexts` setting to specify the contexts that are allowed. To specify that no contexts are allowed, set `script.allowed_contexts` to `none`.
-
-For example, to allow scripts to run only in `scoring` and `update` contexts:
+By default, all script contexts are permitted. Use the `script.allowed_contexts` setting to specify the contexts that are allowed. To specify that no contexts are allowed, set `script.allowed_contexts` to `none`. For example, to allow scripts to run only in `scoring` and `update` contexts:

```yaml
script.allowed_contexts: score, update
```

## Allowed scripts in scripted metrics aggregations [allowed-script-in-aggs-settings]

By default, all scripts are permitted in [scripted metrics aggregations](elasticsearch://reference/aggregations/search-aggregations-metrics-scripted-metric-aggregation.md). To restrict the set of allowed scripts, set [`search.aggs.only_allowed_metric_scripts`](elasticsearch://reference/elasticsearch/configuration-reference/search-settings.md#search-settings-only-allowed-scripts) to `true` and provide the allowed scripts using [`search.aggs.allowed_inline_metric_scripts`](elasticsearch://reference/elasticsearch/configuration-reference/search-settings.md#search-settings-allowed-inline-scripts) and/or [`search.aggs.allowed_stored_metric_scripts`](elasticsearch://reference/elasticsearch/configuration-reference/search-settings.md#search-settings-allowed-stored-scripts).

To disallow certain script types, omit the corresponding script list (`search.aggs.allowed_inline_metric_scripts` or `search.aggs.allowed_stored_metric_scripts`) or set it to an empty array. When both script lists are not empty, the given stored scripts and the given inline scripts will be allowed.

The following example permits only four specific stored scripts to be used, and no inline scripts:

```yaml
search.aggs.only_allowed_metric_scripts: true
search.aggs.allowed_inline_metric_scripts: []
@@ -79,4 +71,3 @@
search.aggs.allowed_inline_metric_scripts:
  - 'long sum = 0; for (a in states) { sum += a } return sum'
search.aggs.allowed_stored_metric_scripts: []
```
diff --git a/explore-analyze/scripting/modules-scripting-shorten-script.md b/explore-analyze/scripting/modules-scripting-shorten-script.md
new file mode 100644
index 0000000000..f2fce3a2b1
--- /dev/null
+++ b/explore-analyze/scripting/modules-scripting-shorten-script.md
@@ -0,0 +1,56 @@
+---
+navigation_title: Shorten scripts
+applies_to:
+  stack: ga
+  serverless: ga
+products:
+  - id: elasticsearch
+---
+
+# Shorten your script [script-shorten-syntax]
+
+Using syntactic features that are native to Painless, you can reduce verbosity in your scripts and make them shorter.
Here’s a simple script that we can make shorter:

```console
GET my-index-000001/_search
{
  "script_fields": {
    "my_doubled_field": {
      "script": {
        "lang": "painless",
        "source": "doc['my_field'].value * params.get('multiplier');",
        "params": {
          "multiplier": 2
        }
      }
    }
  }
}
```

Let’s look at a shortened version of the script to see what improvements it includes over the previous iteration:

```console
GET my-index-000001/_search
{
  "script_fields": {
    "my_doubled_field": {
      "script": {
        "source": "field('my_field').get(null) * params['multiplier']",
        "params": {
          "multiplier": 2
        }
      }
    }
  }
}
```

This version of the script removes several components and simplifies the syntax significantly:

* The `lang` declaration. Because Painless is the default language, you don’t need to specify the language if you’re writing a Painless script.
* The `return` keyword. Painless automatically uses the final statement in a script (when possible) to produce a return value in a script context that requires one.
* The `get` method, which is replaced with brackets `[]`. Painless uses a shortcut specifically for the `Map` type that allows us to use brackets instead of the lengthier `get` method.
* The semicolon at the end of the `source` statement. Painless does not require semicolons for the final statement of a block. However, it does require them in other cases to remove ambiguity.

You can use this abbreviated syntax anywhere that {{es}} supports scripts, such as when you’re creating [runtime fields](../../manage-data/data-store/mapping/map-runtime-field.md). Be mindful, however, that the `field` access API is not a direct replacement for `doc`. This shortened version of the original script passes a default value (the `null` argument) to `get`, so depending on the field type the script may read either `doc` values or `_source`: some fields fall back to `_source` when `doc` values aren't available.
\ No newline at end of file
diff --git a/explore-analyze/scripting/modules-scripting-store-and-retrieve.md b/explore-analyze/scripting/modules-scripting-store-and-retrieve.md
new file mode 100644
index 0000000000..db7912b59e
--- /dev/null
+++ b/explore-analyze/scripting/modules-scripting-store-and-retrieve.md
@@ -0,0 +1,68 @@
+---
+applies_to:
+  stack: ga
+  serverless: ga
+products:
+  - id: elasticsearch
+---
+
+# Store and retrieve scripts [script-stored-scripts]
+
+You can store and retrieve scripts from the cluster state using the [stored script APIs](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-script). Stored scripts allow you to reference shared scripts for operations like scoring, aggregating, filtering, and reindexing. Instead of embedding scripts inline in each query, you can reference these shared operations.
+
+Stored scripts can also reduce request payload size. Depending on script size and request frequency, this can help lower latency and data transfer costs.
+
+::::{note}
+Unlike regular scripts, stored scripts require that you specify a script language using the `lang` parameter.
+::::
+
+To create a script, use the [create stored script API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-put-script). For example, the following request creates a stored script named `calculate-score`.
```console
POST _scripts/calculate-score
{
  "script": {
    "lang": "painless",
    "source": "Math.log(_score * 2) + params['my_modifier']"
  }
}
```

You can retrieve that script by using the [get stored script API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-get-script).

```console
GET _scripts/calculate-score
```

To use the stored script in a query, include the script `id` in the `script` declaration:

```console
GET my-index-000001/_search
{
  "query": {
    "script_score": {
      "query": {
        "match": {
          "message": "some message"
        }
      },
      "script": {
        "id": "calculate-score", <1>
        "params": {
          "my_modifier": 2
        }
      }
    }
  }
}
```

1. `id` of the stored script


To delete a stored script, submit a [delete stored script API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-delete-script) request.

```console
DELETE _scripts/calculate-score
```
diff --git a/explore-analyze/scripting/modules-scripting-type-casting-tutorial.md b/explore-analyze/scripting/modules-scripting-type-casting-tutorial.md
new file mode 100644
index 0000000000..cccb4a0e25
--- /dev/null
+++ b/explore-analyze/scripting/modules-scripting-type-casting-tutorial.md
@@ -0,0 +1,286 @@
+---
+navigation_title: Converting data types
+applies_to:
+  stack: ga
+  serverless: ga
+products:
+  - id: elasticsearch
+---
+
+# Converting data types [type-casting-tutorial]
+
In this tutorial you’ll learn how to use Painless scripts to convert data types in two scenarios:

* Correcting taxes using type casting for precise calculations
* Converting calculated scores to boolean flags

## Prerequisites

This tutorial uses the `kibana_sample_data_ecommerce` dataset. Refer to [Context example data](elasticsearch://reference/scripting-languages/painless/painless-context-examples.md) to get started.

## Type casting used in this tutorial

Painless supports multiple [casting](elasticsearch://reference/scripting-languages/painless/painless-casting.md) approaches. In this tutorial we use:

* **Explicit casting** uses the `(type)` operator to force a conversion, for example `(long)value`.
* **Implicit casting** happens automatically when combining types in expressions, such as dividing a `long` by a `double`, which promotes the result to `double`.

## Tax corrections using type casting for precise calculations (ingest context)

In this example, we will recalculate taxes by region using explicit and implicit type casting in Painless. This ensures precise totals across orders and products in line with regional tax policies.

### Understanding the precision in financial calculations

Binary floating-point arithmetic introduces errors in monetary calculations (0.1 + 0.2 = 0.30000000000000004). To avoid this, {{es}} recommends using integers for the smallest currency unit. The [`scaled_float`](elasticsearch://reference/elasticsearch/mapping-reference/number.md) field type uses this pattern, storing prices as integer cents, while the API treats them as doubles. This ensures exact arithmetic, such as 10 + 20 = 30 cents.

### Writing the Painless script

The script recalculates regional taxes using explicit casting to convert prices to cents (long), applies country-specific tax rates (5% for AE, 20% for GB), and uses implicit casting when dividing the result back to decimal format. This avoids floating-point errors during tax computations.
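
To see the problem the script works around, you can compare plain double arithmetic with integer-cent arithmetic. This quick sketch uses the [Painless execute API](elasticsearch://reference/scripting-languages/painless/painless-api-examples.md); the values are illustrative:

```json
POST /_scripts/painless/_execute
{
  "script": {
    "source": """
      // Double arithmetic accumulates binary floating-point error,
      // while integer cents stay exact
      double asDoubles = 0.1 + 0.2;
      long asCents = 10 + 20;
      return asDoubles + ' vs ' + asCents + ' cents';
    """
  }
}
```

The result is `0.30000000000000004 vs 30 cents`, which is why the pipeline below casts prices to whole cents before applying tax rates.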
Create an ingest pipeline that recalculates regional taxes using type casting for precision:

```json
PUT _ingest/pipeline/kibana_sample_data_ecommerce-tax_correction
{
  "description": "Recalculate taxes by region using type casting",
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source": """
          // Explicit casting: convert prices to long for high-precision calculations
          if (ctx.taxless_total_price != null) {
            long taxlessAmountCents = (long) (ctx.taxless_total_price * 100);

            // Regional tax rates by country
            Map taxRates = [
              'US': 0.08, 'GB': 0.20, 'DE': 0.19, 'FR': 0.20,
              'AE': 0.05, 'EG': 0.14, 'default': 0.10
            ];

            String countryCode = ctx.geoip?.country_iso_code ?: 'default';
            double taxRate = taxRates.getOrDefault(countryCode, 0.10);

            // Explicit casting: double to long
            long taxAmountCents = (long) (taxlessAmountCents * taxRate);
            // Implicit casting: long to double
            double correctedTaxAmount = taxAmountCents / 100.0;

            ctx.corrected_tax_amount = correctedTaxAmount;
            ctx.corrected_total_price = ctx.taxless_total_price + correctedTaxAmount;
            ctx.tax_country_code = countryCode;
          }
        """
      }
    }
  ]
}
```

This script includes the following steps:

* **Explicit casting:** `(long) (taxlessAmountCents * taxRate)` converts the double result to long for precise calculations
* **Implicit casting:** `taxAmountCents / 100.0` automatically promotes long to double during division

For more details about Painless scripting in the ingest context, refer to [Ingest processor context](elasticsearch://reference/scripting-languages/painless/painless-ingest-processor-context.md).

### Test the pipeline

Simulate the pipeline with sample documents to confirm tax calculations work correctly:

```json
POST _ingest/pipeline/kibana_sample_data_ecommerce-tax_correction/_simulate
{
  "docs": [
    {
      "_source": {
        "taxless_total_price": 44.98,
        "geoip": {
          "country_iso_code": "AE"
        }
      }
    },
    {
      "_source": {
        "taxless_total_price": 189.50,
        "geoip": {
          "country_iso_code": "GB"
        }
      }
    }
  ]
}
```

The simulation confirms precise tax calculations by region:

:::{dropdown} Response

```json
{
  "docs": [
    {
      "doc": {
        ...,
        "_source": {
          "corrected_total_price": 47.22,
          "taxless_total_price": 44.98,
          "corrected_tax_amount": 2.24,
          "geoip": {
            "country_iso_code": "AE"
          },
          "tax_country_code": "AE"
        },
        "_ingest": {
          "timestamp": "2025-08-28T21:01:03.54255615Z"
        }
      }
    },
    {
      "doc": {
        ...,
        "_source": {
          "corrected_total_price": 227.4,
          "taxless_total_price": 189.5,
          "corrected_tax_amount": 37.9,
          "geoip": {
            "country_iso_code": "GB"
          },
          "tax_country_code": "GB"
        },
        "_ingest": {
          "timestamp": "2025-08-28T21:01:03.542605776Z"
        }
      }
    }
  ]
}
```

:::

## Convert calculated scores to boolean flags (runtime field context)

The goal is to create a dynamic “high value” flag by combining order price and product diversity into a weighted score. This makes it easy to identify valuable orders. In this example, type casting ensures accurate calculations when combining integer and double values. The score is rounded to an integer, then compared to create a boolean classification.

### Understanding weighted scoring for boolean classification

Business analytics often requires combining multiple factors into a simple true/false classification. This weighted scoring model assigns different importance levels to various metrics. Order price carries 70% weight while product diversity carries 30%.
The combined score is then rounded to create clear boolean categories.

### Writing the runtime field script

This script calculates a weighted score for each order by combining the total price and the number of unique products, applying explicit and implicit type casting where needed. The score is then rounded to an integer and converted into a boolean flag, indicating whether the order qualifies as “high value.”

Create a runtime field that calculates weighted scores and classifies them as boolean values:

```json
PUT kibana_sample_data_ecommerce/_mapping
{
  "runtime": {
    "is_high_value_order": {
      "type": "boolean",
      "script": {
        "lang": "painless",
        "source": """
          double price = doc['taxful_total_price'].value;
          long productsLong = doc['total_unique_products'].value;

          // Explicit casting: convert long to double for calculations
          double products = (double) productsLong;

          // Weighted score: 70% price weight + 30% product diversity weight
          // Implicit casting: integer literals (200 and 5) promote to double in division
          double priceWeight = (price / 200) * 0.7;
          double productWeight = (products / 5) * 0.3;
          double totalScore = priceWeight + productWeight;

          // Round to the nearest integer: scores ≥0.5 round to 1 or more (high value), <0.5 round to 0 (regular)
          long rounded = Math.round(totalScore);
          boolean result = (rounded >= 1);

          emit(result);
        """
      }
    }
  }
}
```

This script includes the following steps:

* **Explicit casting:** `(double) productsLong` converts long integer to double for mathematical operations
* **Implicit casting:** Division operations automatically handle type promotion during arithmetic
* **Weighted scoring:** Combines order price (70%) and product diversity (30%) to calculate order value
* **Normalization:** Scales values using realistic maximums (price/200, products/5) for consistent scoring
* **Boolean classification:** Uses `Math.round()` and comparison logic to create true/false flags for business classification

For more details about Painless scripting in runtime fields, refer to [Runtime field context](elasticsearch://reference/scripting-languages/painless/painless-runtime-fields-context.md).

### Test the runtime field

Query the runtime field to see boolean conversion results:

```json
GET kibana_sample_data_ecommerce/_search
{
  "fields": [
    "is_high_value_order"
  ],
  "_source": {
    "includes": [
      "customer_id",
      "taxful_total_price",
      "total_unique_products"
    ]
  }
}
```

The results show how the calculated scores are classified into boolean values:

:::{dropdown} Response

```json
{
  "hits": {
    "hits": [
      {
        "_source": {
          "customer_id": 38,
          "taxful_total_price": 36.98,
          "total_unique_products": 2
        },
        "fields": {
          "is_high_value_order": [
            false
          ]
        }
      },
      {
        "_source": {
          "customer_id": 24,
          "taxful_total_price": 103.94,
          "total_unique_products": 4
        },
        "fields": {
          "is_high_value_order": [
            true
          ]
        }
      }
    ]
  }
}
```

:::

## Learn more about type casting in Painless

This tutorial showed you practical type conversion scenarios across ingest pipelines and runtime fields. To expand your Painless scripting skills:

* **Painless casting table:** See the full list of allowed type conversions, explicit and implicit [casting](elasticsearch://reference/scripting-languages/painless/painless-casting.md) rules, and examples for supported types in Painless.
* **Apply to other contexts:** Use these casting techniques in update scripts, aggregations, and search queries.
For context-specific data access patterns (`doc`, `ctx`, `_source`), refer to the [Painless syntax-context bridge](/explore-analyze/scripting/painless-syntax-context-bridge.md).
* **Explore advanced casting:** The [Painless language specification](elasticsearch://reference/scripting-languages/painless/painless-language-specification.md) covers explicit versus implicit casting rules, numeric precision handling, and reference type conversions.
diff --git a/explore-analyze/scripting/modules-scripting-update-documents.md b/explore-analyze/scripting/modules-scripting-update-documents.md
new file mode 100644
index 0000000000..3bbd36b626
--- /dev/null
+++ b/explore-analyze/scripting/modules-scripting-update-documents.md
@@ -0,0 +1,100 @@
+---
+navigation_title: Update documents using scripts
+applies_to:
+  stack: ga
+  serverless: ga
+products:
+  - id: elasticsearch
+---
+
+# Update documents using scripts [scripts-update-scripts]
+
You can use the [update API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-update) to update documents with a specified script. The script can update, delete, or skip modifying the document. The update API also supports passing a partial document, which is merged into the existing document.

First, let’s index a simple document:

```console
PUT my-index-000001/_doc/1
{
  "counter" : 1,
  "tags" : ["red"]
}
```

To increment the counter, you can submit an update request with the following script:

```console
POST my-index-000001/_update/1
{
  "script" : {
    "source": "ctx._source.counter += params.count",
    "lang": "painless",
    "params" : {
      "count" : 4
    }
  }
}
```

Similarly, you can use an update script to add a tag to the list of tags. Because this is just a list, the tag is added even if it already exists:

```console
POST my-index-000001/_update/1
{
  "script": {
    "source": "ctx._source.tags.add(params['tag'])",
    "lang": "painless",
    "params": {
      "tag": "blue"
    }
  }
}
```

You can also remove a tag from the list of tags. The `remove` method of a Java `List` is available in Painless. It takes the index of the element you want to remove. To avoid a possible runtime error, you first need to make sure the tag exists. If the list contains duplicates of the tag, this script just removes one occurrence.

```console
POST my-index-000001/_update/1
{
  "script": {
    "source": "if (ctx._source.tags.contains(params['tag'])) { ctx._source.tags.remove(ctx._source.tags.indexOf(params['tag'])) }",
    "lang": "painless",
    "params": {
      "tag": "blue"
    }
  }
}
```

You can also add and remove fields from a document. For example, this script adds the field `new_field`:

```console
POST my-index-000001/_update/1
{
  "script" : "ctx._source.new_field = 'value_of_new_field'"
}
```

Conversely, this script removes the field `new_field`:

```console
POST my-index-000001/_update/1
{
  "script" : "ctx._source.remove('new_field')"
}
```

Instead of updating the document, you can also change the operation that is executed from within the script. For example, this request deletes the document if the `tags` field contains `green`.
Otherwise it does nothing (`noop`):

```console
POST my-index-000001/_update/1
{
  "script": {
    "source": "if (ctx._source.tags.contains(params['tag'])) { ctx.op = 'delete' } else { ctx.op = 'none' }",
    "lang": "painless",
    "params": {
      "tag": "green"
    }
  }
}
```
diff --git a/explore-analyze/scripting/modules-scripting-use-parameters.md b/explore-analyze/scripting/modules-scripting-use-parameters.md
new file mode 100644
index 0000000000..e0b7a30f0d
--- /dev/null
+++ b/explore-analyze/scripting/modules-scripting-use-parameters.md
@@ -0,0 +1,39 @@
+---
+navigation_title: Use parameters
+applies_to:
+  stack: ga
+  serverless: ga
+products:
+  - id: elasticsearch
+---
+
+# Use parameters in your script [prefer-params]
+
The first time {{es}} sees a new script, it compiles the script and stores the compiled version in a cache. Compilation can be a heavy process. Rather than hard-coding values in your script, pass them as named `params` instead.

For example, in the script from [Write your first script](/explore-analyze/scripting/modules-scripting-write-first-script.md), we could have hard-coded values and written a script that is seemingly less complex. We could just retrieve the first value for `my_field` and then multiply it by `2`:

```painless
"source": "return doc['my_field'].value * 2"
```

Though it works, this solution is pretty inflexible. We have to modify the script source to change the multiplier, and {{es}} has to recompile the script every time that the multiplier changes.

Instead of hard-coding values, use named `params` to make scripts flexible, and also reduce compilation time when the script runs. You can now make changes to the `multiplier` parameter without {{es}} recompiling the script.

```painless
"source": "doc['my_field'].value * params['multiplier']",
"params": {
  "multiplier": 2
}
```

You can compile up to 150 scripts per 5 minutes by default. For ingest contexts, the default script compilation rate is unlimited. To change the limit for a specific context, configure the `script.context.$CONTEXT.max_compilations_rate` setting. For example, to allow 100 compilations per 10 minutes in the `field` context:

```js
script.context.field.max_compilations_rate=100/10m
```

::::{important}
If you compile too many unique scripts within a short time, {{es}} rejects the new dynamic scripts with a `circuit_breaking_exception` error.
::::
diff --git a/explore-analyze/scripting/modules-scripting-using.md b/explore-analyze/scripting/modules-scripting-using.md
index d62f249b91..bea16e7477 100644
--- a/explore-analyze/scripting/modules-scripting-using.md
+++ b/explore-analyze/scripting/modules-scripting-using.md
@@ -8,7 +8,13 @@ products:
 - id: elasticsearch
---

-# How to write scripts [modules-scripting-using]
+# How to write Painless scripts [modules-scripting-using]
+
+:::{tip}
+This guide provides a beginner-friendly introduction to Painless scripting with step-by-step tutorials and practical examples. If you're new to scripting or Painless, this is the recommended starting point.
+
+For users with Java or Painless experience looking for technical specifications and advanced features, refer to [A Brief Painless walkthrough](elasticsearch://reference/scripting-languages/painless/brief-painless-walkthrough.md) in the Reference section.
+:::

Wherever scripting is supported in the {{es}} APIs, the syntax follows the same pattern; you specify the language of your script, provide the script logic (or source), and add parameters that are passed into the script:

@@ -27,287 +33,12 @@ Wherever scripting is supported in the {{es}} APIs, the syntax follows the same
: The script itself, which you specify as `source` for an inline script or `id` for a stored script.
Use the [stored script APIs](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-script) to create and manage stored scripts. `params` -: Specifies any named parameters that are passed into the script as variables. [Use parameters](#prefer-params) instead of hard-coded values to decrease compile time. - - -## Write your first script [hello-world-script] - -[Painless](modules-scripting-painless.md) is the default scripting language for {{es}}. It is secure, performant, and provides a natural syntax for anyone with a little coding experience. - -A Painless script is structured as one or more statements and optionally has one or more user-defined functions at the beginning. A script must always have at least one statement. - -The [Painless execute API](elasticsearch://reference/scripting-languages/painless/painless-api-examples.md) provides the ability to test a script with simple user-defined parameters and receive a result. Let’s start with a complete script and review its constituent parts. - -First, index a document with a single field so that we have some data to work with: - -```console -PUT my-index-000001/_doc/1 -{ - "my_field": 5 -} -``` - -We can then construct a script that operates on that field and run evaluate the script as part of a query. The following query uses the [`script_fields`](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md#script-fields) parameter of the search API to retrieve a script valuation. There’s a lot happening here, but we’ll break it down the components to understand them individually. For now, you only need to understand that this script takes `my_field` and operates on it. - -```console -GET my-index-000001/_search -{ - "script_fields": { - "my_doubled_field": { - "script": { <1> - "source": "doc['my_field'].value * params['multiplier']", <2> - "params": { - "multiplier": 2 - } - } - } - } -} -``` - -1. `script` object -2. `script` source - - -The `script` is a standard JSON object that defines scripts under most APIs in {{es}}. This object requires `source` to define the script itself. The script doesn’t specify a language, so it defaults to Painless. - - -## Use parameters in your script [prefer-params] - -The first time {{es}} sees a new script, it compiles the script and stores the compiled version in a cache. Compilation can be a heavy process. Rather than hard-coding values in your script, pass them as named `params` instead. - -For example, in the previous script, we could have just hard coded values and written a script that is seemingly less complex. We could just retrieve the first value for `my_field` and then multiply it by `2`: - -```painless -"source": "return doc['my_field'].value * 2" -``` - -Though it works, this solution is pretty inflexible. We have to modify the script source to change the multiplier, and {{es}} has to recompile the script every time that the multiplier changes. - -Instead of hard-coding values, use named `params` to make scripts flexible, and also reduce compilation time when the script runs. You can now make changes to the `multiplier` parameter without {{es}} recompiling the script. - -```painless -"source": "doc['my_field'].value * params['multiplier']", -"params": { - "multiplier": 2 -} -``` - -You can compile up to 150 scripts per 5 minutes by default. For ingest contexts, the default script compilation rate is unlimited. 
- -```js -script.context.field.max_compilations_rate=100/10m -``` - -::::{important} -If you compile too many unique scripts within a short time, {{es}} rejects the new dynamic scripts with a `circuit_breaking_exception` error. -:::: - - - -## Shorten your script [script-shorten-syntax] - -Using syntactic abilities that are native to Painless, you can reduce verbosity in your scripts and make them shorter. Here’s a simple script that we can make shorter: - -```console -GET my-index-000001/_search -{ - "script_fields": { - "my_doubled_field": { - "script": { - "lang": "painless", - "source": "doc['my_field'].value * params.get('multiplier');", - "params": { - "multiplier": 2 - } - } - } - } -} -``` - -Let’s look at a shortened version of the script to see what improvements it includes over the previous iteration: - -```console -GET my-index-000001/_search -{ - "script_fields": { - "my_doubled_field": { - "script": { - "source": "field('my_field').get(null) * params['multiplier']", - "params": { - "multiplier": 2 - } - } - } - } -} -``` - -This version of the script removes several components and simplifies the syntax significantly: - -* The `lang` declaration. Because Painless is the default language, you don’t need to specify the language if you’re writing a Painless script. -* The `return` keyword. Painless automatically uses the final statement in a script (when possible) to produce a return value in a script context that requires one. -* The `get` method, which is replaced with brackets `[]`. Painless uses a shortcut specifically for the `Map` type that allows us to use brackets instead of the lengthier `get` method. -* The semicolon at the end of the `source` statement. Painless does not require semicolons for the final statement of a block. However, it does require them in other cases to remove ambiguity. - -Use this abbreviated syntax anywhere that {{es}} supports scripts, such as when you’re creating [runtime fields](../../manage-data/data-store/mapping/map-runtime-field.md). - - -## Store and retrieve scripts [script-stored-scripts] - -You can store and retrieve scripts from the cluster state using the [stored script APIs](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-script). Stored scripts allow you to reference shared scripts for operations like scoring, aggregating, filtering, and reindexing. Instead of embedding scripts inline in each query, you can reference these shared operations. - -Stored scripts can also reduce request payload size. Depending on script size and request frequency, this can help lower latency and data transfer costs. - -::::{note} -Unlike regular scripts, stored scripts require that you specify a script language using the `lang` parameter. -:::: - - -To create a script, use the [create stored script API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-put-script). For example, the following request creates a stored script named `calculate-score`. - -```console -POST _scripts/calculate-score -{ - "script": { - "lang": "painless", - "source": "Math.log(_score * 2) + params['my_modifier']" - } -} -``` - -You can retrieve that script by using the [get stored script API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-get-script). 
- -```console -GET _scripts/calculate-score -``` - -To use the stored script in a query, include the script `id` in the `script` declaration: - -```console -GET my-index-000001/_search -{ - "query": { - "script_score": { - "query": { - "match": { - "message": "some message" - } - }, - "script": { - "id": "calculate-score", <1> - "params": { - "my_modifier": 2 - } - } - } - } -} -``` - -1. `id` of the stored script - - -To delete a stored script, submit a [delete stored script API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-delete-script) request. - -```console -DELETE _scripts/calculate-score -``` - - -## Update documents with scripts [scripts-update-scripts] - -You can use the [update API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-update) to update documents with a specified script. The script can update, delete, or skip modifying the document. The update API also supports passing a partial document, which is merged into the existing document. - -First, let’s index a simple document: - -```console -PUT my-index-000001/_doc/1 -{ - "counter" : 1, - "tags" : ["red"] -} -``` - -To increment the counter, you can submit an update request with the following script: - -```console -POST my-index-000001/_update/1 -{ - "script" : { - "source": "ctx._source.counter += params.count", - "lang": "painless", - "params" : { - "count" : 4 - } - } -} -``` - -Similarly, you can use an update script to add a tag to the list of tags. Because this is just a list, the tag is added even it exists: - -```console -POST my-index-000001/_update/1 -{ - "script": { - "source": "ctx._source.tags.add(params['tag'])", - "lang": "painless", - "params": { - "tag": "blue" - } - } -} -``` - -You can also remove a tag from the list of tags. The `remove` method of a Java `List` is available in Painless. It takes the index of the element you want to remove. To avoid a possible runtime error, you first need to make sure the tag exists. If the list contains duplicates of the tag, this script just removes one occurrence. - -```console -POST my-index-000001/_update/1 -{ - "script": { - "source": "if (ctx._source.tags.contains(params['tag'])) { ctx._source.tags.remove(ctx._source.tags.indexOf(params['tag'])) }", - "lang": "painless", - "params": { - "tag": "blue" - } - } -} -``` - -You can also add and remove fields from a document. For example, this script adds the field `new_field`: - -```console -POST my-index-000001/_update/1 -{ - "script" : "ctx._source.new_field = 'value_of_new_field'" -} -``` - -Conversely, this script removes the field `new_field`: - -```console -POST my-index-000001/_update/1 -{ - "script" : "ctx._source.remove('new_field')" -} -``` - -Instead of updating the document, you can also change the operation that is executed from within the script. For example, this request deletes the document if the `tags` field contains `green`. Otherwise it does nothing (`noop`): - -```console -POST my-index-000001/_update/1 -{ - "script": { - "source": "if (ctx._source.tags.contains(params['tag'])) { ctx.op = 'delete' } else { ctx.op = 'none' }", - "lang": "painless", - "params": { - "tag": "green" - } - } -} -``` - - +: Specifies any named parameters that are passed into the script as variables. [Use parameters](/explore-analyze/scripting/modules-scripting-use-parameters.md) instead of hard-coded values to decrease compile time. 
+
+Get started with Painless scripting:
+
+* [](/explore-analyze/scripting/modules-scripting-write-first-script.md)
+* [](/explore-analyze/scripting/modules-scripting-use-parameters.md)
+* [](/explore-analyze/scripting/modules-scripting-shorten-script.md)
+* [](/explore-analyze/scripting/modules-scripting-store-and-retrieve.md)
+* [](/explore-analyze/scripting/modules-scripting-update-documents.md)
diff --git a/explore-analyze/scripting/modules-scripting-write-first-script.md b/explore-analyze/scripting/modules-scripting-write-first-script.md
new file mode 100644
index 0000000000..1349efbf51
--- /dev/null
+++ b/explore-analyze/scripting/modules-scripting-write-first-script.md
@@ -0,0 +1,54 @@
+---
+applies_to:
+  stack: ga
+  serverless: ga
+products:
+  - id: elasticsearch
+---
+
+# Write your first script [hello-world-script]
+
[Painless](modules-scripting-painless.md) is the default scripting language for {{es}}. It is secure, performant, and provides a natural syntax for anyone with a little coding experience.

A Painless script is structured as one or more statements and optionally has one or more user-defined functions at the beginning. A script must always have at least one statement.

The [Painless execute API](elasticsearch://reference/scripting-languages/painless/painless-api-examples.md) provides the ability to test a script with simple user-defined parameters and receive a result. Let’s start with a complete script and review its constituent parts.

1. Index a document

   Index a document with a single field so that we have some data to work with:

   ```console
   PUT my-index-000001/_doc/1
   {
     "my_field": 5
   }
   ```

1. Operate on a field

   You can now construct a script that operates on that field and then evaluate the script as part of a query. The following query uses the [`script_fields`](elasticsearch://reference/elasticsearch/rest-apis/retrieve-selected-fields.md#script-fields) parameter of the search API to retrieve a script evaluation.

   The components of this script are detailed in later pages. For now, note that the script takes `my_field` as input and operates on it.

   ```console
   GET my-index-000001/_search
   {
     "script_fields": {
       "my_doubled_field": {
         "script": { <1>
           "source": "doc['my_field'].value * params['multiplier']", <2>
           "params": {
             "multiplier": 2
           }
         }
       }
     }
   }
   ```

   1. `script` object
   2. `script` source


   The `script` is a standard JSON object that defines scripts under most APIs in {{es}}. This object requires `source` to define the script itself. Because `lang` isn't set, the script defaults to the [Painless](/explore-analyze/scripting.md) scripting language.
diff --git a/explore-analyze/scripting/painless-syntax-context-bridge.md b/explore-analyze/scripting/painless-syntax-context-bridge.md
new file mode 100644
index 0000000000..44b1d6b88e
--- /dev/null
+++ b/explore-analyze/scripting/painless-syntax-context-bridge.md
@@ -0,0 +1,138 @@
+---
+applies_to:
+  stack: ga
+  serverless: ga
+products:
+  - id: elasticsearch
+---
+
+# Painless syntax-context bridge [painless-syntax-context-bridge]
+
One of the most distinctive aspects of Painless scripting is how data access methods (`doc`, `ctx`, and `_source`) are directly tied to the context of use. Unlike other scripting languages where data access patterns remain consistent, Painless provides different access mechanisms that are optimized for specific use cases and contexts within {{es}}.
+Understanding when and why to use each access method is crucial for writing efficient Painless scripts.

:::{tip}
If you're new to Painless contexts, refer to [Painless contexts](elasticsearch://reference/scripting-languages/painless/painless-contexts.md) in the Reference section for comprehensive context documentation. For hands-on examples of field access, refer to our set of [Painless script tutorials](/explore-analyze/scripting/common-script-uses.md).
:::

## Technical differences

* [**`doc` values**](#when-to-use-doc-values) are a columnar field value store, enabled by default on all fields except analyzed `text` fields. They can only return simple field values such as numbers, dates, geo-points, and terms.
* [**`ctx` access**](#when-to-use-ctx) provides structured access to document content during modification contexts, with fields accessible as map and list structures for existing document fields.
* [**`_source` access**](#when-to-use-source) loads the complete document as a map-of-maps, optimized for returning several fields per result but slower than doc values for single field access.

Check the [decision matrix](#decision-matrix) to decide between them.

## When to use `doc` values (recommended) [when-to-use-doc-values]

* **You should always start with `doc` values** as your first option for field access. This is the fastest and most efficient way to access field values in Painless scripts. Refer to [Doc values](/explore-analyze/scripting/modules-scripting-fields.md#modules-scripting-doc-vals) to learn more.
* **Painless context examples:**
  * [Sort context](elasticsearch://reference/scripting-languages/painless/painless-sort-context.md)
  * [Aggregation scripts](elasticsearch://reference/scripting-languages/painless/painless-metric-agg-init-context.md)
  * [Score scripts](elasticsearch://reference/scripting-languages/painless/painless-score-context.md)
* **Syntax pattern:** `doc['field_name'].value`

### Example: Aggregation calculation

The following example calculates the average price per item across all orders by dividing `taxful_total_price` by `total_quantity` for each document. The `avg` [aggregation](/explore-analyze/query-filter/aggregations.md) then computes the average of these calculated values.

```console
GET kibana_sample_data_ecommerce/_search
{
  "size": 0,
  "aggs": {
    "avg_price_per_item": {
      "avg": {
        "script": {
          "source": "doc['taxful_total_price'].value / doc['total_quantity'].value"
        }
      }
    }
  }
}
```

## When to use `ctx` [when-to-use-ctx]

* Use `ctx` for document modification and pipeline processing where you need access to document metadata, content, and operational control.
* **Painless context examples:**
  * [Update context](elasticsearch://reference/scripting-languages/painless/painless-update-context.md)
  * [Ingest processor context](elasticsearch://reference/scripting-languages/painless/painless-ingest-processor-context.md)
  * [Reindex context](elasticsearch://reference/scripting-languages/painless/painless-reindex-context.md)
* **Syntax patterns:** `ctx.field_name`, `ctx._source.field_name`, and `ctx['field_name']`

### Example: Ingest pipeline processing

The following example creates an [ingest pipeline](/manage-data/ingest/transform-enrich/ingest-pipelines.md) named `create_summary` with a [script processor](elasticsearch://reference/enrich-processor/script-processor.md). This script assigns a text value to the field `order_summary` by combining the customer name and the price.
```console
PUT _ingest/pipeline/create_summary
{
  "processors": [
    {
      "script": {
        "source": """
          ctx.order_summary = ctx.customer_full_name + ' - $' + ctx.taxful_total_price;
        """
      }
    }
  ]
}
```

## When to use `_source` [when-to-use-source]

* **Use `_source`** for document updates and transformations where you need full JSON document access.
* **Painless context examples:**
  * [Update by query](elasticsearch://reference/scripting-languages/painless/painless-update-by-query-context.md)
  * [Runtime fields contexts](elasticsearch://reference/scripting-languages/painless/painless-runtime-fields-context.md)
* **Syntax pattern:** `ctx._source.field_name`

### Example: Document transformation with calculations

Let’s use `_update_by_query` to calculate loyalty points from the order’s total price multiplied by a parameter rate for high-value orders.

```console
POST /kibana_sample_data_ecommerce/_update_by_query
{
  "query": {
    "range": {
      "taxful_total_price": {"gte": 1000}
    }
  },
  "script": {
    "source": """
      ctx._source.loyalty_points = Math.round(ctx._source.taxful_total_price * params.points_rate);
    """,
    "params": {
      "points_rate": 2.0
    }
  }
}
```

## Decision matrix [decision-matrix]

| Scenario | Required access method | Reason |
| :---- | :---- | :---- |
| [Aggregation calculations](/explore-analyze/scripting/modules-scripting-fields.md#modules-scripting-source) | `doc` | Columnar storage provides fastest performance |
| [Document scoring](/explore-analyze/scripting/modules-scripting-fields.md#_search_and_aggregation_scripts) | `doc` | Optimized for search-time calculations |
| [Script fields (top results)](/explore-analyze/scripting/modules-scripting-fields.md#modules-scripting-source) | `_source` | Optimized for returning several fields per result |
| [Adding fields during ingest](elasticsearch://reference/scripting-languages/painless/painless-ingest-processor-context.md) | `ctx` | Direct field access during pipeline processing |
| [Updating existing documents](/explore-analyze/scripting/modules-scripting-fields.md#_update_scripts) | `ctx._source` | Full document modification capabilities |
| [Document transformation during reindex](elasticsearch://reference/scripting-languages/painless/painless-reindex-context.md) | `ctx._source` | Complete document restructuring with metadata access |
| [Sort operations](elasticsearch://reference/scripting-languages/painless/painless-sort-context.md) | `doc` | Single-field performance optimization for sorting |
| [Runtime field with simple values](elasticsearch://reference/scripting-languages/painless/painless-runtime-fields-context.md) | `doc` | Performance advantage for repeated calculations |
| [Runtime field with complex logic](elasticsearch://reference/scripting-languages/painless/painless-runtime-fields-context.md) | `params['_source']` | Access to complete document structure with `emit` |

## Next steps

* **New users:** Explore [Accessing document fields and special variables](/explore-analyze/scripting/modules-scripting-fields.md)
* **Advanced users:** Review [Painless contexts](elasticsearch://reference/scripting-languages/painless/painless-contexts.md) for context-specific implementation details

diff --git a/explore-analyze/scripting/script-fields-api.md b/explore-analyze/scripting/script-fields-api.md
index 33a692099c..ea4377b896 100644
--- a/explore-analyze/scripting/script-fields-api.md
+++ b/explore-analyze/scripting/script-fields-api.md
@@ -1,5 +1,5 @@
---
-navigation_title: Access fields in a document
+navigation_title: Accessing fields in a document
 mapped_pages:
   - https://www.elastic.co/guide/en/elasticsearch/reference/current/script-fields-api.html
 applies_to:
@@ -11,7 +11,7 @@ products:


-# Access fields in a document [script-fields-api]
+# Accessing fields in a document [script-fields-api]


::::{warning}
diff --git a/explore-analyze/scripting/scripting-field-extraction.md b/explore-analyze/scripting/scripting-field-extraction.md
index ee6abc6d73..7b56f5bada 100644
--- a/explore-analyze/scripting/scripting-field-extraction.md
+++ b/explore-analyze/scripting/scripting-field-extraction.md
@@ -8,7 +8,7 @@ products:
 - id: elasticsearch
---

-# Field extraction [scripting-field-extraction]
+# Extracting fields [scripting-field-extraction]

The goal of field extraction is simple; you have fields in your data with a bunch of information, but you only want to extract pieces and parts.

diff --git a/explore-analyze/toc.yml b/explore-analyze/toc.yml
index eb67af306d..4bba857ea2 100644
--- a/explore-analyze/toc.yml
+++ b/explore-analyze/toc.yml
@@ -138,20 +138,32 @@ toc:
 - file: scripting.md
   children:
     - file: scripting/modules-scripting-painless.md
-      - file: scripting/modules-scripting-using.md
       children:
-        - file: scripting/scripts-search-speed.md
-        - file: scripting/dissect.md
-        - file: scripting/grok.md
-        - file: scripting/script-fields-api.md
-        - file: scripting/common-script-uses.md
-          children:
-            - file: scripting/scripting-field-extraction.md
-        - file: scripting/modules-scripting-fields.md
-        - file: scripting/modules-scripting-security.md
+        - file: scripting/modules-scripting-using.md
+          children:
+            - file: scripting/modules-scripting-write-first-script.md
+            - file: scripting/modules-scripting-use-parameters.md
+            - file: scripting/modules-scripting-shorten-script.md
+            - file: scripting/modules-scripting-store-and-retrieve.md
+            - file: scripting/modules-scripting-update-documents.md
+        - file: scripting/common-script-uses.md
+          children:
+            - file: scripting/modules-scripting-fields.md
+            - file: scripting/script-fields-api.md
+            - file: scripting/modules-scripting-type-casting-tutorial.md
+            - file: scripting/dissect.md
+            - file: scripting/scripting-field-extraction.md
+            - file: scripting/grok.md
+            - file: scripting/scripts-search-speed.md
+            - file: scripting/modules-scripting-document-update-tutorial.md
+            - file: scripting/modules-scripting-regular-expressions-tutorial.md
+            - file: scripting/modules-scripting-datetime-tutorial.md
+        - file: scripting/painless-syntax-context-bridge.md
+        - file: scripting/modules-scripting-security.md
+        - file: scripting/painless-lab.md
     - file: scripting/modules-scripting-expression.md
     - file: scripting/modules-scripting-engine.md
-    - file: scripting/painless-lab.md
+
 - file: ai-assistant.md
 - file: manage-access-to-ai-assistant.md
 - file: discover.md
diff --git a/manage-data/ingest/transform-enrich/ingest-pipelines.md b/manage-data/ingest/transform-enrich/ingest-pipelines.md
index e5883c7413..c64fc7b0b5 100644
--- a/manage-data/ingest/transform-enrich/ingest-pipelines.md
+++ b/manage-data/ingest/transform-enrich/ingest-pipelines.md
@@ -937,7 +937,7 @@ PUT _ingest/pipeline/my-pipeline
 }
 ```

-You can also specify a [stored script](../../../explore-analyze/scripting/modules-scripting-using.md#script-stored-scripts) as the `if` condition.
+You can also specify a [stored script](../../../explore-analyze/scripting/modules-scripting-store-and-retrieve.md) as the `if` condition.

```console
PUT _scripts/my-prod-tag-script