2 changes: 1 addition & 1 deletion manage-data/data-store/data-streams.md
@@ -106,7 +106,7 @@ When a backing index is created, the index is named using the following conventi

Some operations, such as a [shrink](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-shrink) or [restore](../../deploy-manage/tools/snapshot-and-restore/restore-snapshot.md), can change a backing index’s name. These name changes do not remove a backing index from its data stream.

The generation of the data stream can change without a new index being added to the data stream (e.g. when an existing backing index is shrunk). This means the backing indices for some generations will never exist. You should not derive any intelligence from the backing indices names.
The generation of the data stream can change without a new index being added to the data stream (for example, when an existing backing index is shrunk). This means the backing indices for some generations will never exist. You should not derive any intelligence from the backing indices names.
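
To check a data stream's current generation and its actual backing indices, query the data stream API instead of parsing index names. A minimal sketch (the data stream name is illustrative):

```console
GET _data_stream/my-datastream?filter_path=data_streams.generation,data_streams.indices.index_name
```

The response lists the current generation alongside the real backing index names, regardless of any renames that have occurred.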


## Append-only (mostly) [data-streams-append-only]
12 changes: 6 additions & 6 deletions manage-data/data-store/data-streams/failure-store-recipes.md
@@ -307,7 +307,7 @@ Without tags in place it would not be as clear where in the pipeline the indexin

## Alerting on failed ingestion [failure-store-examples-alerting]

Since failure stores can be searched just like a normal data stream, we can use them as inputs to [alerting rules](../../../explore-analyze/alerts-cases/alerts.md) in
Since failure stores can be searched like a normal data stream, we can use them as inputs to [alerting rules](../../../explore-analyze/alerts-cases/alerts.md) in
{{kib}}. Here is a simple alerting example that is triggered when more than ten indexing failures have occurred in the last five minutes for a data stream:

:::::{stepper}
@@ -382,7 +382,7 @@ We recommend a few best practices for remediating failure data.

**Use an ingest pipeline to convert failure documents back into their original document.** Failure documents store failure information along with the document that failed ingestion. The first step for remediating documents should be to use an ingest pipeline to extract the original source from the failure document and then discard any other information about the failure.

**Simulate first to avoid repeat failures.** If you must run a pipeline as part of your remediation process, it is best to simulate the pipeline against the failure first. This will catch any unforeseen issues that may fail the document a second time. Remember, ingest pipeline failures will capture the document before an ingest pipeline is applied to it, which can further complicate remediation when a failure document becomes nested inside a new failure. The easiest way to simulate these changes is via the [pipeline simulate API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ingest-simulate) or the [simulate ingest API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-simulate-ingest).
**Simulate first to avoid repeat failures.** If you must run a pipeline as part of your remediation process, it is best to simulate the pipeline against the failure first. This will catch any unforeseen issues that may fail the document a second time. Remember, ingest pipeline failures will capture the document before an ingest pipeline is applied to it, which can further complicate remediation when a failure document becomes nested inside a new failure. The easiest way to simulate these changes is using the [pipeline simulate API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ingest-simulate) or the [simulate ingest API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-simulate-ingest).
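
As a sketch of that check (the pipeline name and document fields below are placeholders), you can pass a copy of the document you intend to reprocess to the pipeline simulate API before running it for real:

```console
POST _ingest/pipeline/my-remediation-pipeline/_simulate
{
  "docs": [
    {
      "_index": "my-datastream",
      "_source": {
        "@timestamp": "2025-05-07T09:31:12.000Z",
        "message": "copy of the failed document's original source"
      }
    }
  ]
}
```

If the simulated run produces the document shape you expect, you can apply the same pipeline to the remaining failures with more confidence.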

### Remediating ingest node failures [failure-store-examples-remediation-ingest]

Expand Down Expand Up @@ -511,7 +511,7 @@ Because ingest pipeline failures need to be reprocessed by their original pipeli
```
1. The `data.id` field is expected to be present. If it isn't present this pipeline will fail.

Fixing a failure's root cause is a often a bespoke process. In this example, instead of discarding the data, we will make this identifier field optional.
Fixing a failure's root cause is often a bespoke process. In this example, instead of discarding the data, we will make this identifier field optional.

```console
PUT _ingest/pipeline/my-datastream-default-pipeline
Expand Down Expand Up @@ -658,7 +658,7 @@ POST _ingest/pipeline/_simulate
]
}
```
1. The index has been updated via the reroute processor.
1. The index has been updated through the reroute processor.
2. The document ID has stayed the same.
3. The source should cleanly match the contents of the original document.

Expand Down Expand Up @@ -995,7 +995,7 @@ PUT _ingest/pipeline/my-datastream-remediation-pipeline
2. Capture the source of the original document.
3. Discard the `error` field since it won’t be needed for the remediation.
4. Also discard the `document` field.
5. We extract all the fields from the original document's source back to the root of the document. The `@timestamp` field is not overwritten and thus will be present in the final document.
5. We extract all the fields from the original document's source back to the root of the document. The `@timestamp` field is not overwritten and will be present in the final document.

:::{important}
Remember that a document that has failed during indexing has already been processed by the ingest processor! It shouldn't need to be processed again unless you made changes to your pipeline to fix the original problem. Make sure that any fixes applied to the ingest pipeline are reflected in the pipeline logic here.
Expand Down Expand Up @@ -1088,7 +1088,7 @@ Caused by: j.l.IllegalArgumentException: data stream timestamp field [@timestamp
]
}
```
1. The index has been updated via the script processor.
1. The index has been updated through the script processor.
2. The source should reflect any fixes and match the expected document shape for the final index.
3. In this example case, we find that the failure timestamp has stayed in the source.

12 changes: 6 additions & 6 deletions manage-data/data-store/data-streams/failure-store.md
@@ -62,7 +62,7 @@ After a matching data stream is created, its failure store will be enabled.

### Set up for existing data streams [set-up-failure-store-existing]

Enabling the failure store via [index templates](../templates.md) can only affect data streams that are newly created. Existing data streams that use a template are not affected by changes to the template's `data_stream_options` field.
Enabling the failure store using [index templates](../templates.md) can only affect data streams that are newly created. Existing data streams that use a template are not affected by changes to the template's `data_stream_options` field.
To modify an existing data stream's options, use the [put data stream options](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-data-stream-options) API:

```console
@@ -96,7 +96,7 @@ PUT _data_stream/my-datastream-existing/_options
You can also enable the data stream failure store in {{kib}}. Locate the data stream on the **Streams** page, where a stream maps directly to a data stream. Select a stream to view its details and go to the **Retention** tab where you can find the **Enable failure store** option.
:::

### Enable failure store via cluster setting [set-up-failure-store-cluster-setting]
### Enable failure store using cluster setting [set-up-failure-store-cluster-setting]

If you have a large number of existing data streams you may want to enable their failure stores in one place. Instead of updating each of their options individually, set `data_streams.failure_store.enabled` to a list of index patterns in the [cluster settings](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings). Any data streams that match one of these patterns will operate with their failure store enabled.
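
For example, a sketch of that cluster setting (the index patterns are placeholders):

```console
PUT _cluster/settings
{
  "persistent": {
    "data_streams.failure_store.enabled": [ "logs-*", "metrics-*" ]
  }
}
```

Any data stream whose name matches one of the listed patterns then operates with its failure store enabled, without touching its individual data stream options.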

@@ -257,7 +257,7 @@ If the document could have been redirected to a data stream's failure store but
3. The response status is `400 Bad Request` due to the mapping problem.


If the document was redirected to a data stream's failure store but that failed document could not be stored (e.g. due to shard unavailability or a similar problem), then the `failure_store` field on the response will be `failed`, and the response will display the error for the original failure, as well as a suppressed error detailing why the failure could not be stored:
If the document was redirected to a data stream's failure store but that failed document could not be stored (for example, due to shard unavailability or a similar problem), then the `failure_store` field on the response will be `failed`, and the response will display the error for the original failure, as well as a suppressed error detailing why the failure could not be stored:

```console-result
{
@@ -306,7 +306,7 @@ Once you have accumulated some failures, the failure store can be searched much
:::{warning}
Documents redirected to the failure store in the event of a failed ingest pipeline will be stored in their original, unprocessed form. If an ingest pipeline normally redacts sensitive information from a document, then failed documents in their original, unprocessed form may contain sensitive information.

Furthermore, failed documents are likely to be structured differently than normal data in a data stream, and thus special care should be taken when making use of [document level security](../../../deploy-manage/users-roles/cluster-or-deployment-auth/controlling-access-at-document-field-level.md#document-level-security) or [field level security](../../../deploy-manage/users-roles/cluster-or-deployment-auth/controlling-access-at-document-field-level.md#field-level-security). Any security policies that expect to utilize these features for both regular documents and failure documents should account for any differences in document structure between the two document types.
Furthermore, failed documents are likely to be structured differently than normal data in a data stream, and special care should be taken when making use of [document level security](../../../deploy-manage/users-roles/cluster-or-deployment-auth/controlling-access-at-document-field-level.md#document-level-security) or [field level security](../../../deploy-manage/users-roles/cluster-or-deployment-auth/controlling-access-at-document-field-level.md#field-level-security). Any security policies that expect to utilize these features for both regular documents and failure documents should account for any differences in document structure between the two document types.

To limit visibility on potentially sensitive data, users require the [`read_failure_store`](elasticsearch://reference/elasticsearch/security-privileges.md#privileges-list-indices) index privilege for a data stream in order to search that data stream's failure store data.
:::
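
As a sketch of granting that privilege (the role and data stream names are illustrative):

```console
PUT _security/role/failure-store-reader
{
  "indices": [
    {
      "names": [ "my-datastream" ],
      "privileges": [ "read_failure_store" ]
    }
  ]
}
```
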
@@ -324,7 +324,7 @@ POST _query?format=txt
"query": """FROM my-datastream::failures | DROP error.stack_trace | LIMIT 1""" <1>
}
```
1. We drop the `error.stack_trace` field here just to keep the example free of newlines.
1. We drop the `error.stack_trace` field here to keep the example free of newlines.

An example of a search result with the failed document present:

@@ -820,7 +820,7 @@ PUT _cluster/settings
}
```

You can also specify the failure store retention period for a data stream on its data stream options. These can be specified via the index template for new data streams, or via the [put data stream options](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-data-stream-options) API for existing data streams.
You can also specify the failure store retention period for a data stream on its data stream options. These can be specified using the index template for new data streams, or using the [put data stream options](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-data-stream-options) API for existing data streams.

```console
PUT _data_stream/my-datastream/_options
2 changes: 1 addition & 1 deletion manage-data/data-store/data-streams/run-downsampling.md
@@ -33,7 +33,7 @@ stack: ga
serverless: ga
```

To downsample a time series via a [data stream lifecycle](/manage-data/lifecycle/data-stream.md), add a [downsampling](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-data-lifecycle) section to the data stream lifecycle (for existing data streams) or the index template (for new data streams).
To downsample a time series using a [data stream lifecycle](/manage-data/lifecycle/data-stream.md), add a [downsampling](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-data-lifecycle) section to the data stream lifecycle (for existing data streams) or the index template (for new data streams).

* Set `fixed_interval` to your preferred level of granularity. The original time series data will be aggregated at this interval.
* Set `after` to the minimum time to wait after an index rollover, before running downsampling.
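
Putting those two settings together, here is a sketch of a downsampling configuration applied through the data stream lifecycle API (the data stream name, retention, and intervals are illustrative, and the exact field layout may vary by version):

```console
PUT _data_stream/my-metrics-datastream/_lifecycle
{
  "data_retention": "30d",
  "downsampling": [
    {
      "after": "1d",
      "fixed_interval": "1h"
    }
  ]
}
```
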
2 changes: 1 addition & 1 deletion manage-data/data-store/mapping.md
@@ -23,7 +23,7 @@ products:
% - [x] ./raw-migrated-files/elasticsearch/elasticsearch-reference/index-modules-mapper.md
% Notes: redirect only

% Internal links rely on the following IDs being on this page (e.g. as a heading ID, paragraph ID, etc):
% Internal links rely on the following IDs being on this page (for example, as a heading ID, paragraph ID, and so on):

$$$mapping-limit-settings$$$

@@ -12,7 +12,7 @@ products:

You can specify a `runtime_mappings` section in a search request to create runtime fields that exist only as part of the query. You specify a script as part of the `runtime_mappings` section, just as you would if [adding a runtime field to the mappings](map-runtime-field.md).

Defining a runtime field in a search request uses the same format as defining a runtime field in the index mapping. Just copy the field definition from the `runtime` in the index mapping to the `runtime_mappings` section of the search request.
Defining a runtime field in a search request uses the same format as defining a runtime field in the index mapping. Copy the field definition from the `runtime` in the index mapping to the `runtime_mappings` section of the search request.

The following search request adds a `day_of_week` field to the `runtime_mappings` section. The field values will be calculated dynamically, and only within the context of this search request:

6 changes: 3 additions & 3 deletions manage-data/data-store/mapping/dynamic-templates.md
@@ -193,7 +193,7 @@ The `match_pattern` parameter adjusts the behavior of the `match` parameter to s
"match": "^profit_\d+$"
```

The following example matches all `string` fields whose name starts with `long_` (except for those which end with `_text`) and maps them as `long` fields:
The following example matches all `string` fields whose name starts with `long_` (except for those that end with `_text`) and maps them as `long` fields:

```console
PUT my-index-000001
@@ -265,7 +265,7 @@ PUT my-index-000001/_doc/1

## `path_match` and `path_unmatch` [path-match-unmatch]

The `path_match` and `path_unmatch` parameters work in the same way as `match` and `unmatch`, but operate on the full dotted path to the field, not just the final name, e.g. `some_object.*.some_field`.
The `path_match` and `path_unmatch` parameters work in the same way as `match` and `unmatch`, but operate on the full dotted path to the field, not just the final name, for example, `some_object.*.some_field`.

This example copies the values of any fields in the `name` object to the top-level `full_name` field, except for the `middle` field:

@@ -342,7 +342,7 @@ PUT my-index-000001/_doc/2
}
```

Note that the `path_match` and `path_unmatch` parameters match on object paths in addition to leaf fields. As an example, indexing the following document will result in an error because the `path_match` setting also matches the object field `name.title`, which can’t be mapped as text:
The `path_match` and `path_unmatch` parameters match on object paths in addition to leaf fields. As an example, indexing the following document will result in an error because the `path_match` setting also matches the object field `name.title`, which can't be mapped as text:

```console
PUT my-index-000001/_doc/2
@@ -96,7 +96,7 @@ The mapping contains two fields: `@timestamp` and `message`.

If you want to retrieve results that include `clientip`, you can add that field as a runtime field in the mapping. The following runtime script defines a [grok pattern](../../../explore-analyze/scripting/grok.md) that extracts structured fields out of a single text field within a document. A grok pattern is like a regular expression that supports aliased expressions that you can reuse.

The script matches on the `%{{COMMONAPACHELOG}}` log pattern, which understands the structure of Apache logs. If the pattern matches (`clientip != null`), the script emits the value of the matching IP address. If the pattern doesn’t match, the script just returns the field value without crashing.
The script matches on the `%{{COMMONAPACHELOG}}` log pattern, which understands the structure of Apache logs. If the pattern matches (`clientip != null`), the script emits the value of the matching IP address. If the pattern doesn't match, the script returns the field value without crashing.

```console
PUT my-index-000001/_mappings
@@ -116,7 +116,7 @@ PUT my-index-000001/_mappings
1. This condition ensures that the script doesn’t crash even if the pattern of the message doesn’t match.


Alternatively, you can define the same runtime field but in the context of a search request. The runtime definition and the script are exactly the same as the one defined previously in the index mapping. Just copy that definition into the search request under the `runtime_mappings` section and include a query that matches on the runtime field. This query returns the same results as if you defined a search query for the `http.clientip` runtime field in your index mappings, but only in the context of this specific search:
Alternatively, you can define the same runtime field but in the context of a search request. The runtime definition and the script are exactly the same as the one defined previously in the index mapping. Copy that definition into the search request under the `runtime_mappings` section and include a query that matches on the runtime field. This query returns the same results as if you defined a search query for the `http.clientip` runtime field in your index mappings, but only in the context of this specific search:

```console
GET my-index-000001/_search
2 changes: 1 addition & 1 deletion manage-data/data-store/mapping/index-runtime-field.md
@@ -10,7 +10,7 @@ products:

# Index a runtime field [runtime-indexed]

Runtime fields are defined by the context where they run. For example, you can define runtime fields in the [context of a search query](define-runtime-fields-in-search-request.md) or within the [`runtime` section](map-runtime-field.md) of an index mapping. If you decide to index a runtime field for greater performance, just move the full runtime field definition (including the script) to the context of an index mapping. {{es}} automatically uses these indexed fields to drive queries, resulting in a fast response time. This capability means you can write a script only once, and apply it to any context that supports runtime fields.
Runtime fields are defined by the context where they run. For example, you can define runtime fields in the [context of a search query](define-runtime-fields-in-search-request.md) or within the [`runtime` section](map-runtime-field.md) of an index mapping. If you decide to index a runtime field for greater performance, move the full runtime field definition (including the script) to the context of an index mapping. {{es}} automatically uses these indexed fields to drive queries, resulting in a fast response time. This capability means you can write a script only once, and apply it to any context that supports runtime fields.
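
As a minimal sketch of that move (the index, field names, and script are illustrative), the runtime field's script is attached to a regular mapped field so its values are computed and indexed at ingest time:

```console
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "voltage": { "type": "double" },
      "voltage_corrected": {
        "type": "double",
        "script": {
          "source": "emit(doc['voltage'].value * params['multiplier'])",
          "params": { "multiplier": 2 }
        }
      }
    }
  }
}
```

Queries and aggregations on `voltage_corrected` then read the indexed values directly instead of evaluating the script at search time.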

::::{note}
Indexing a `composite` runtime field is currently not supported.