[Security Solutions][Detection Engine] Enhances alert documents to have the fields of constant_keyword, runtime fields, aliases, and copy_to #102280
…oving those types
…strategy for merging
…r bug with fields API
…and fixed failures
...on/server/lib/detection_engine/signals/source_fields_merging/utils/is_array_of_primitives.ts
Pinging @elastic/security-detections-response (Team:Detections and Resp)
 *
 * @param fieldsValue The fields value that contains the nested field or not.
 * @param valueInMergedDocument The document to compare against fields value to see if it is also an array or not
 * @returns
Missing return type here
(valueInMergedDocument === undefined && arrayInPathExists(fieldsKey, merged)) ||
(isObjectLikeOrArrayOfObjectLikes(valueInMergedDocument) &&
  !isNestedObject(fieldsValue) &&
  !isTypeObject(fieldsValue))
Maybe a more descriptive name for this function -- `typeObject` being an object returned within `fields` that has a `type` key (i.e. geopoint).
As discussed this is hard to rename and we are going to leave it as is.
...etection_engine/signals/source_fields_merging/strategies/merge_missing_fields_with_source.ts
x-pack/test/detection_engine_api_integration/security_and_spaces/tests/runtime.ts
Checked out, tested locally, and pair code-reviewed. Thank you for taking the better part of your afternoon to review all the details with me and discuss all the caveats around the different algorithms that you have introduced here. Appreciate the thoroughness and confidence not only in tests, but also the README and nomenclature you've used for outlining all the different permutations and how they're supported. Glad to have a first pass at combining `_source` and `fields` that everyone can leverage. LGTM! 🎉
Note: In testing @FrankHassanabad and I opened this issue (#103581) around support for runtime fields configured on Kibana Index Patterns.
Merging this now and will do a smaller follow up to address points above.
… of constant_keyword, runtime fields, aliases, and copy_to fields (elastic#102280)

## Summary

This adds utilities and two strategies for merging the [fields API](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-fields.html) response with the `_source` document during signal generation. This gives us the ability to support `constant_keyword`, field alias values, some runtime fields, and `copy_to`. Previously we did not copy any of these values and only generated signals based on the `_source` record values. This changes the behavior so that some of the values mentioned above are copied.

The `source_fields_merging` folder contains a `strategies` folder and a `utils` folder, which hold the strategies and the utilities for this implementation.

The two strategies are `merge_all_fields_with_source` and `merge_missing_fields_with_source`. This PR defaults to `merge_missing_fields_with_source` rather than `merge_all_fields_with_source` because it carries much lower risk and changes the behavior of the signals detection engine less.

The main driving force behind this PR is that ECS has introduced `constant_keyword`, and that field can show up only in the `fields` section of a document and not in `_source` when index authors do not push the `constant_keyword` into the `_source` section. The secondary driving forces behind this behavioral change are that some users have been expecting the runtime fields, `copy_to` fields, and field alias values of their indexes to be copied into the signals index.

Both strategies, `merge_missing_fields_with_source` and `merge_all_fields_with_source`, are considered Best Effort, meaning that neither will always merge as expected when it encounters the ambiguous use cases outlined in detail in the `README.md` at the top of `source_fields_merging`.

The default strategy, `merge_missing_fields_with_source`, has the simplest behavior and will work in most common use cases. If the `_source` document is missing a value that is present in `fields`, the `fields` value is a primitive concrete value such as a `string`, `number`, or `boolean`, and the `_source` document does not contain an existing object or ambiguous array at that path, then the value is merged into `_source` and a new reference is returned. The strategy should be idempotent: if you call it twice, the second call detects that the value is now present in `_source` and does not merge it again.

* 301 unit tests were added
* Extensive README.md docs are added
* Existing e2e tests are updated to cover scenarios, ambiguities, and conflicts in support of this effort.
* Other e2e tests were updated
* One bug with EQL and fields was found with a workaround implemented. See elastic/elasticsearch#74582
* SearchTypes adjusted to use recursive TypeScript types
* Changed `deprecated` to `@deprecated` in a few spots
* Removed some `ts-expect-error` in favor of `??` in a few areas
* Added handling of epoch strings, with tests, to `detection_engine/signals/utils.ts` since fields returns `epoch_millis` as a string instead of as a number.
* Uses lodash safer set to reduce chances of prototype pollution

### Checklist

Delete any items that are not applicable to this PR.

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

### Risk Matrix

| Risk | Probability | Severity | Mitigation/Notes |
|---------------------------|-------------|----------|-------------------------|
| Prototype pollution | Low | High | Used lodash safer set |
| Users who have existing rules that work upgrade, and we no longer generate signals due to bad merging of fields and _source | Mid | High | We use the safer, lighter-weight strategy, `merge_missing_fields_with_source`, to start with. We might add a follow-up PR which enables a key in Kibana to turn off merging of fields with source. We added extensive unit tests and e2e tests. However, unexpected behaviors from runtime fields and the fields API were uncovered, such as geo-points looking like nested fields, `epoch_millis` being a string value, and runtime fields allowing invalid values. Tests and utilities around those have been added, but they make this PR risky. |
| Found a bug with using fields and EQL which caused EQL rules to not run. | Low | High | Implemented workaround for tests to pass and created an Elastic ticket and communicated the bug to EQL developers. |
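The missing-fields behavior described in the summary can be illustrated with a small TypeScript sketch. This is not the code from this PR: `mergeMissingFieldsSketch`, `SourceDoc`, `FieldsDoc`, and the hand-rolled forbidden-key check are hypothetical names made up for illustration (the PR itself uses lodash's safer set), and the sketch assumes `fields` is a map of dotted keys to arrays and ignores many of the ambiguous cases the README covers.

```typescript
// Hypothetical sketch only; names and types are assumptions, not the PR's code.
type Primitive = string | number | boolean;
type SourceDoc = { [key: string]: unknown };
type FieldsDoc = { [key: string]: unknown[] };

// Path segments that must never be written, to avoid prototype pollution.
const FORBIDDEN_KEYS = new Set(['__proto__', 'constructor', 'prototype']);

const isPrimitive = (value: unknown): value is Primitive =>
  typeof value === 'string' || typeof value === 'number' || typeof value === 'boolean';

const isPlainObject = (value: unknown): value is SourceDoc =>
  typeof value === 'object' && value !== null && !Array.isArray(value);

function mergeMissingFieldsSketch(source: SourceDoc, fields: FieldsDoc): SourceDoc {
  // Work on a deep copy so a new reference is returned and the input is untouched.
  const merged = JSON.parse(JSON.stringify(source)) as SourceDoc;

  for (const [dottedKey, values] of Object.entries(fields)) {
    const path = dottedKey.split('.');
    if (path.some((part) => FORBIDDEN_KEYS.has(part))) continue;
    // Only primitive values are merged; objects such as geo points are skipped.
    if (values.length === 0 || !values.every(isPrimitive)) continue;

    // Walk to the parent of the leaf, creating objects only where nothing exists.
    let parent: SourceDoc = merged;
    let ambiguous = false;
    for (const part of path.slice(0, -1)) {
      const next = parent[part];
      if (next === undefined) {
        const created: SourceDoc = {};
        parent[part] = created;
        parent = created;
      } else if (isPlainObject(next)) {
        parent = next;
      } else {
        // An array or primitive sits in the path, so the merge target is ambiguous.
        ambiguous = true;
        break;
      }
    }
    if (ambiguous) continue;

    const leaf = path[path.length - 1];
    // A value already in _source wins; this also makes a second call a no-op.
    if (parent[leaf] !== undefined) continue;
    // Unbox single-element arrays; keep multiple values as an array.
    parent[leaf] = values.length === 1 ? values[0] : values;
  }
  return merged;
}

const exampleSource = { '@timestamp': '2021-06-30T15:46:31.800Z', host: { name: 'host_name' } };
const exampleFields = {
  'host.name': ['changed_hostname'],
  'event.dataset': ['event_dataset_name_1'],
};
const mergedOnce = mergeMissingFieldsSketch(exampleSource, exampleFields);
const mergedTwice = mergeMissingFieldsSketch(mergedOnce, exampleFields);
console.log(JSON.stringify(mergedOnce));
console.log(JSON.stringify(mergedOnce) === JSON.stringify(mergedTwice));
```

In this example `event.dataset` is added because `_source` lacks it, while `host.name` keeps its `_source` value, and running the merge a second time leaves the document unchanged.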
💚 Backport successful
This backport PR will be merged automatically after passing CI.
… of constant_keyword, runtime fields, aliases, and copy_to fields (#102280) (#103590)

Co-authored-by: Frank Hassanabad <frank.hassanabad@elastic.co>
## Summary

Small follow-up to #102280 that addresses PR concerns around docs.
…ibana.yml and updates docker to have missing keys from security solutions (#103800)

## Summary

This is a follow-up, considered a critical addition, to #102280.

This adds a key of `xpack.securitySolution.alertMergeStrategy` to `kibana.yml` which allows users to change the merge strategy between their raw events and the signals/alerts that are generated. This also adds security keys to the docker container that were overlooked in the past from security solutions.

The values you can set for `xpack.securitySolution.alertMergeStrategy` are:

* missingFields (The default)
* allFields
* noFields

## missingFields

The default merge strategy starting with 7.14. It merges any primitive data types from the [fields API](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-fields.html#search-fields-param) into the resulting signal/alert. This copies over fields such as `constant_keyword`, `copy_to`, `runtime fields`, and `field aliases`, which previously were not copied over, as long as they are primitive data types such as `keyword`, `text`, or `numeric` and are not found in your original `_source` document. This will not copy `geo points` or `nested objects`, and in some cases, if your `_source` contains arrays, top level objects, or conflicts/ambiguities, it will not merge them. It also will _not_ merge existing values between `_source` and `fields` for `runtime fields`. It only merges missing primitive data types.

## allFields

A very aggressive merge strategy which should be considered experimental. It does everything `missingFields` does, and in addition it merges existing values between `_source` and `fields`, which means that if you change or override values with `runtime fields`, this strategy will attempt to merge those values. In most instances it will also merge your nested fields, but it will not merge `geo` data types due to ambiguities. If you have multi-fields, it chooses your default field and merges that into `_source`. This can change your data a lot between your original `_source` and `fields` when the data is copied into an alert/signal, which is why it is considered an aggressive merge strategy.

Both of these strategies attempt to unbox single array elements when it makes sense, and assume you only want values in an array when they appear as an array in `_source` or when the array contains multiple elements.

## noFields

The behavior before #102280 was introduced; it is a do-nothing strategy. This should only be used if you are seeing problems with alerts/signals being inserted due to conflicts and/or bugs with `missingFields`. We are not anticipating this, but if you are setting `noFields` please reach out to our [forums](https://discuss.elastic.co/c/security/83) and let us know we have a bug so we can fix it. If you are encountering undesired merge behaviors or have other strategies you want us to implement, let us know on the forums as well.

The missing keys added for docker are:

* xpack.securitySolution.alertMergeStrategy
* xpack.securitySolution.alertResultListDefaultDateRange
* xpack.securitySolution.endpointResultListDefaultFirstPageIndex
* xpack.securitySolution.endpointResultListDefaultPageSize
* xpack.securitySolution.maxRuleImportExportSize
* xpack.securitySolution.maxRuleImportPayloadBytes
* xpack.securitySolution.maxTimelineImportExportSize
* xpack.securitySolution.maxTimelineImportPayloadBytes
* xpack.securitySolution.packagerTaskInterval
* xpack.securitySolution.validateArtifactDownloads

I intentionally skipped adding the other `kibana.yml` keys which are considered either experimental flags or are for internal developers and are not documented and not supported in production by us.

## Manual testing of the different strategies

First add this mapping and these documents in the dev tools for basic tests:

```json
# Mapping with constant_keyword fields and a runtime field
DELETE frank-test-delme-17
PUT frank-test-delme-17
{
  "mappings": {
    "dynamic": "strict",
    "runtime": {
      "host.name": {
        "type": "keyword",
        "script": {
          "source": "emit('changed_hostname')"
        }
      }
    },
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "host": {
        "properties": {
          "name": {
            "type": "keyword"
          }
        }
      },
      "data_stream": {
        "properties": {
          "dataset": {
            "type": "constant_keyword",
            "value": "datastream_dataset_name_1"
          },
          "module": {
            "type": "constant_keyword",
            "value": "datastream_module_name_1"
          }
        }
      },
      "event": {
        "properties": {
          "dataset": {
            "type": "constant_keyword",
            "value": "event_dataset_name_1"
          },
          "module": {
            "type": "constant_keyword",
            "value": "event_module_name_1"
          }
        }
      }
    }
  }
}

# Document without an existing host.name
PUT frank-test-delme-17/_doc/1
{
  "@timestamp": "2021-06-30T15:46:31.800Z"
}

# Document with an existing host.name
PUT frank-test-delme-17/_doc/2
{
  "@timestamp": "2021-06-30T15:46:31.800Z",
  "host": {
    "name": "host_name"
  }
}

# Query it to ensure that fields is returned with data that does not exist in _source
GET frank-test-delme-17/_search
{
  "fields": [
    {
      "field": "*"
    }
  ]
}
```

For all the different key combinations do the following:

Run a single detection rule against the index:
<img width="1139" alt="Screen Shot 2021-06-30 at 9 49 12 AM" src="https://user-images.githubusercontent.com/1151048/123997522-b8dc6600-d98d-11eb-9407-5480d5b2cc8a.png">

Ensure two signals are created:
<img width="1376" alt="Screen Shot 2021-06-30 at 10 26 03 AM" src="https://user-images.githubusercontent.com/1151048/123997739-f17c3f80-d98d-11eb-9eb9-90e9410f0cde.png">

If you set this key in your `kibana.yml` or `kibana.dev.yml` (or omit it, as it is the default):

```yml
xpack.securitySolution.alertMergeStrategy: 'missingFields'
```

When you click on each signal you should see that `event.module` and `event.dataset` were copied over as well as `data_stream.dataset` and `data_stream.module` since they're `constant_keyword`:
<img width="877" alt="Screen Shot 2021-06-30 at 10 20 44 AM" src="https://user-images.githubusercontent.com/1151048/123997961-31432700-d98e-11eb-96ee-06524f21e2d6.png">

However, since this only merges missing fields, you should see that in the first record `host.name` has the runtime field's value, since `host.name` does not exist in `_source`, and that in the second record it still shows up as `host_name`, since we do not override existing values right now:

First:
<img width="887" alt="Screen Shot 2021-06-30 at 10 03 31 AM" src="https://user-images.githubusercontent.com/1151048/123998398-b2022300-d98e-11eb-87be-aa5a153a91bc.png">

Second:
<img width="838" alt="Screen Shot 2021-06-30 at 10 03 44 AM" src="https://user-images.githubusercontent.com/1151048/123998413-b4fd1380-d98e-11eb-9821-d6189190918f.png">

When you set this key in your `kibana.yml` or `kibana.dev.yml`:

```yml
xpack.securitySolution.alertMergeStrategy: 'noFields'
```

Expect that your `event.module`, `event.dataset`, `data_stream.module`, `data_stream.dataset` are all non-existent since we do not copy anything over from `fields` at all and only use things within `_source`:
<img width="804" alt="Screen Shot 2021-06-30 at 9 58 25 AM" src="https://user-images.githubusercontent.com/1151048/123998694-f8578200-d98e-11eb-8d71-a0858d3ed3e7.png">

Expect that `host.name` is missing in the first record and keeps its `_source` value of `host_name` in the second:

First:
<img width="797" alt="Screen Shot 2021-06-30 at 9 58 37 AM" src="https://user-images.githubusercontent.com/1151048/123998797-10c79c80-d98f-11eb-81b6-5174d8ef14f2.png">

Second:
<img width="806" alt="Screen Shot 2021-06-30 at 9 58 52 AM" src="https://user-images.githubusercontent.com/1151048/123998816-158c5080-d98f-11eb-87a0-0ac2f58793b3.png">

When you set this key in your `kibana.yml` or `kibana.dev.yml`:

```yml
xpack.securitySolution.alertMergeStrategy: 'allFields'
```

Expect that `event.module` and `event.dataset` were copied over as well as `data_stream.dataset` and `data_stream.module` since they're `constant_keyword`:
<img width="864" alt="Screen Shot 2021-06-30 at 10 03 15 AM" src="https://user-images.githubusercontent.com/1151048/123999000-48364900-d98f-11eb-9803-05349744ac10.png">

Expect that both the first and second records contain the runtime field since we merge both of them:
<img width="887" alt="Screen Shot 2021-06-30 at 10 03 31 AM" src="https://user-images.githubusercontent.com/1151048/123999078-58e6bf00-d98f-11eb-83bd-dda6b50fabcd.png">

### Checklist

Delete any items that are not applicable to this PR.

- [x] If a plugin configuration key changed, check if it needs to be allowlisted in the [cloud](https://github.com/elastic/cloud) and added to the [docker list](https://github.com/elastic/kibana/blob/c29adfef29e921cc447d2a5ed06ac2047ceab552/src/dev/build/tasks/os_packages/docker_generator/resources/bin/kibana-docker)
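The expected outcomes of the manual test can be summarized with a small TypeScript sketch that compares the three strategies on the two test documents. It is a simplification under stated assumptions, not Kibana's implementation: `applyMergeStrategySketch`, `FlatSource`, and `FlatFields` are hypothetical names, documents are modeled as flat maps of dotted keys to primitives, and the `fields` response is abbreviated to two keys.

```typescript
// Hypothetical sketch; only the three strategy names come from the PR description.
type AlertMergeStrategy = 'missingFields' | 'allFields' | 'noFields';
type Primitive = string | number | boolean;
type FlatSource = Record<string, Primitive>;
type FlatFields = Record<string, Primitive[]>;

function applyMergeStrategySketch(
  strategy: AlertMergeStrategy,
  source: FlatSource,
  fields: FlatFields
): FlatSource {
  const merged: FlatSource = { ...source };
  // noFields: use _source only and copy nothing from fields.
  if (strategy === 'noFields') {
    return merged;
  }
  for (const [key, values] of Object.entries(fields)) {
    // This sketch only handles single values, which it unboxes from the array.
    if (values.length !== 1) continue;
    const existsInSource = Object.prototype.hasOwnProperty.call(source, key);
    // missingFields: never override a value that _source already has.
    if (strategy === 'missingFields' && existsInSource) continue;
    // allFields (or a missing key): take the value from fields.
    merged[key] = values[0];
  }
  return merged;
}

// Abbreviated stand-in for the fields response of both test documents.
const fieldsResponse: FlatFields = {
  'host.name': ['changed_hostname'],
  'event.dataset': ['event_dataset_name_1'],
};
const docWithoutHost: FlatSource = { '@timestamp': '2021-06-30T15:46:31.800Z' };
const docWithHost: FlatSource = {
  '@timestamp': '2021-06-30T15:46:31.800Z',
  'host.name': 'host_name',
};

for (const strategy of ['missingFields', 'allFields', 'noFields'] as const) {
  const first = applyMergeStrategySketch(strategy, docWithoutHost, fieldsResponse);
  const second = applyMergeStrategySketch(strategy, docWithHost, fieldsResponse);
  console.log(strategy, first['host.name'], second['host.name'], first['event.dataset']);
}
```

With `missingFields` only the first document picks up `changed_hostname`, with `allFields` both do, and with `noFields` neither document gains `host.name` or `event.dataset` from `fields`, which matches the screenshots described above.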
…ibana.yml and updates docker to have missing keys from security solutions (elastic#103800)

## Summary

This is a follow-up, considered a critical addition, to: elastic#102280

This adds a key of `xpack.securitySolution.alertMergeStrategy` to `kibana.yml` which allows users to change their merge strategy between their raw events and the signals/alerts that are generated. This also adds additional security keys to the docker container that were overlooked in the past from security solutions.

The values you can use for `xpack.securitySolution.alertMergeStrategy` are:
* missingFields (The default)
* allFields
* noFields

## missingFields

The default merge strategy starting with 7.14. It merges any primitive data types from the [fields API](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-fields.html#search-fields-param) into the resulting signal/alert. This copies over fields such as `constant_keyword`, `copy_to`, `runtime fields`, and `field aliases`, which previously were not copied over, as long as they are primitive data types such as `keyword`, `text`, `numeric` and are not found in your original `_source` document. This will not copy `geo points` or `nested objects`, and in some cases, if your `_source` contains arrays, top level objects, or conflicts/ambiguities, it will not merge them. This will also _not_ merge existing values between `_source` and `fields` for `runtime fields`. It only merges missing primitive data types.

## allFields

A very aggressive merge strategy which should be considered experimental. It will do everything `missingFields` does, but in addition it will merge existing values between `_source` and `fields`, which means that if you change or override values with `runtime fields` this strategy will attempt to merge those values. This will also merge your nested fields in most instances, but it will not merge `geo` data types due to ambiguities. If you have multi-fields this will choose your default field and merge that into `_source`. This can change your data a lot between your original `_source` and `fields` when the data is copied into an alert/signal, which is why it is considered an aggressive merge strategy.

Both of these strategies attempt to unbox single array elements when it makes sense, and assume you only want values in an array when they appear as an array in `_source` or when there are multiple elements within the array.

## noFields

The behavior before elastic#102280 was introduced; it is a do-nothing strategy. This should only be used if you are seeing problems with alerts/signals being inserted due to conflicts and/or bugs for some reason with `missingFields`. We are not anticipating this, but if you are setting `noFields` please reach out to our [forums](https://discuss.elastic.co/c/security/83) and let us know we have a bug so we can fix it. If you are encountering undesired merge behaviors or have other strategies you want us to implement, let us know on the forums as well.

The missing keys added for docker are:
* xpack.securitySolution.alertMergeStrategy
* xpack.securitySolution.alertResultListDefaultDateRange
* xpack.securitySolution.endpointResultListDefaultFirstPageIndex
* xpack.securitySolution.endpointResultListDefaultPageSize
* xpack.securitySolution.maxRuleImportExportSize
* xpack.securitySolution.maxRuleImportPayloadBytes
* xpack.securitySolution.maxTimelineImportExportSize
* xpack.securitySolution.maxTimelineImportPayloadBytes
* xpack.securitySolution.packagerTaskInterval
* xpack.securitySolution.validateArtifactDownloads

I intentionally skipped adding the other `kibana.yml` keys which are considered either experimental flags or are for internal developers and are not documented and not supported in production by us.

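For illustration, the three accepted values of `xpack.securitySolution.alertMergeStrategy` and the `missingFields` default described above could be modelled as follows. This is a minimal sketch only: `ALERT_MERGE_STRATEGIES`, `AlertMergeStrategy`, and `parseAlertMergeStrategy` are hypothetical names, and this is not the actual Kibana config schema.

```typescript
// Hypothetical names throughout; this is not the Kibana config schema.
const ALERT_MERGE_STRATEGIES = ['missingFields', 'allFields', 'noFields'] as const;
type AlertMergeStrategy = typeof ALERT_MERGE_STRATEGIES[number];

// The source states that missingFields is the default starting with 7.14.
const DEFAULT_ALERT_MERGE_STRATEGY: AlertMergeStrategy = 'missingFields';

// Returns the configured strategy, falls back to the default when the key
// is omitted, and rejects any value outside the three documented ones.
function parseAlertMergeStrategy(raw: unknown): AlertMergeStrategy {
  if (raw === undefined) {
    return DEFAULT_ALERT_MERGE_STRATEGY;
  }
  const match = ALERT_MERGE_STRATEGIES.find((strategy) => strategy === raw);
  if (match === undefined) {
    throw new Error(
      `Invalid alertMergeStrategy "${String(raw)}", expected one of: ` +
        ALERT_MERGE_STRATEGIES.join(', ')
    );
  }
  return match;
}
```

Modelling the values as a closed union means a misspelled value fails loudly at startup instead of silently selecting a different merge behavior.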
## Manual testing of the different strategies

First add this mapping and these documents in the dev tools for basic tests:

```json
# Mapping with two constant_keywords and a runtime field
DELETE frank-test-delme-17
PUT frank-test-delme-17
{
  "mappings": {
    "dynamic": "strict",
    "runtime": {
      "host.name": {
        "type": "keyword",
        "script": {
          "source": "emit('changed_hostname')"
        }
      }
    },
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "host": {
        "properties": {
          "name": {
            "type": "keyword"
          }
        }
      },
      "data_stream": {
        "properties": {
          "dataset": {
            "type": "constant_keyword",
            "value": "datastream_dataset_name_1"
          },
          "module": {
            "type": "constant_keyword",
            "value": "datastream_module_name_1"
          }
        }
      },
      "event": {
        "properties": {
          "dataset": {
            "type": "constant_keyword",
            "value": "event_dataset_name_1"
          },
          "module": {
            "type": "constant_keyword",
            "value": "event_module_name_1"
          }
        }
      }
    }
  }
}

# Document without an existing host.name
PUT frank-test-delme-17/_doc/1
{
  "@timestamp": "2021-06-30T15:46:31.800Z"
}

# Document with an existing host.name
PUT frank-test-delme-17/_doc/2
{
  "@timestamp": "2021-06-30T15:46:31.800Z",
  "host": {
    "name": "host_name"
  }
}

# Query it to ensure fields are returned with data that does not exist in _source
GET frank-test-delme-17/_search
{
  "fields": [
    {
      "field": "*"
    }
  ]
}
```

For all the different key combinations do the following:

Run a single detection rule against the index:

<img width="1139" alt="Screen Shot 2021-06-30 at 9 49 12 AM" src="https://user-images.githubusercontent.com/1151048/123997522-b8dc6600-d98d-11eb-9407-5480d5b2cc8a.png">

Ensure two signals are created:

<img width="1376" alt="Screen Shot 2021-06-30 at 10 26 03 AM" src="https://user-images.githubusercontent.com/1151048/123997739-f17c3f80-d98d-11eb-9eb9-90e9410f0cde.png">

If you set this key in your `kibana.yml` or `kibana.dev.yml` (or omit it, as it is the default):

```yml
xpack.securitySolution.alertMergeStrategy: 'missingFields'
```

When you click on each signal you should see that `event.module` and `event.dataset` were copied over as well as `data_stream.dataset` and `data_stream.module` since they're `constant_keyword`:

<img width="877" alt="Screen Shot 2021-06-30 at 10 20 44 AM" src="https://user-images.githubusercontent.com/1151048/123997961-31432700-d98e-11eb-96ee-06524f21e2d6.png">

However, since this only merges missing fields, you should see that in the first record `host.name` has the value defined by the runtime field, since `host.name` does not exist in `_source`, and that in the second record it still shows up as `host_name`, since we do not override merges right now:

First:

<img width="887" alt="Screen Shot 2021-06-30 at 10 03 31 AM" src="https://user-images.githubusercontent.com/1151048/123998398-b2022300-d98e-11eb-87be-aa5a153a91bc.png">

Second:

<img width="838" alt="Screen Shot 2021-06-30 at 10 03 44 AM" src="https://user-images.githubusercontent.com/1151048/123998413-b4fd1380-d98e-11eb-9821-d6189190918f.png">

When you set this key in your `kibana.yml` or `kibana.dev.yml`:

```yml
xpack.securitySolution.alertMergeStrategy: 'noFields'
```

Expect that your `event.module`, `event.dataset`, `data_stream.module`, `data_stream.dataset` are all non-existent since we do not copy anything over from `fields` at all and only use things within `_source`:

<img width="804" alt="Screen Shot 2021-06-30 at 9 58 25 AM" src="https://user-images.githubusercontent.com/1151048/123998694-f8578200-d98e-11eb-8d71-a0858d3ed3e7.png">

Expect that `host.name` is missing in the first record and has the default value in the second:

First:

<img width="797" alt="Screen Shot 2021-06-30 at 9 58 37 AM" src="https://user-images.githubusercontent.com/1151048/123998797-10c79c80-d98f-11eb-81b6-5174d8ef14f2.png">

Second:

<img width="806" alt="Screen Shot 2021-06-30 at 9 58 52 AM" src="https://user-images.githubusercontent.com/1151048/123998816-158c5080-d98f-11eb-87a0-0ac2f58793b3.png">

When you set this key in your `kibana.yml` or `kibana.dev.yml`:

```yml
xpack.securitySolution.alertMergeStrategy: 'allFields'
```

Expect that `event.module` and `event.dataset` were copied over as well as `data_stream.dataset` and `data_stream.module` since they're `constant_keyword`:

<img width="864" alt="Screen Shot 2021-06-30 at 10 03 15 AM" src="https://user-images.githubusercontent.com/1151048/123999000-48364900-d98f-11eb-9803-05349744ac10.png">

Expect that both the first and second records contain the runtime field since we merge both of them:

<img width="887" alt="Screen Shot 2021-06-30 at 10 03 31 AM" src="https://user-images.githubusercontent.com/1151048/123999078-58e6bf00-d98f-11eb-83bd-dda6b50fabcd.png">

### Checklist

Delete any items that are not applicable to this PR.

- [x] If a plugin configuration key changed, check if it needs to be allowlisted in the [cloud](https://github.com/elastic/cloud) and added to the [docker list](https://github.com/elastic/kibana/blob/c29adfef29e921cc447d2a5ed06ac2047ceab552/src/dev/build/tasks/os_packages/docker_generator/resources/bin/kibana-docker)
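The outcomes expected from the manual testing above can be reproduced with a heavily simplified sketch of the three strategies. It assumes `fields` is a flat map of dotted keys to arrays of primitives, and it deliberately ignores the harder cases the description mentions (geo points, nested objects, arrays already present in `_source`, multi-fields, ambiguities). The names `mergeFields`, `getPath`, and `setPath` are hypothetical and are not the functions used in the Kibana implementation.

```typescript
type Strategy = 'missingFields' | 'allFields' | 'noFields';
type Primitive = string | number | boolean;
type Source = { [key: string]: unknown };
type Fields = { [dottedKey: string]: Primitive[] };

// Read a dotted path such as "host.name" from a nested object.
function getPath(source: Source, path: string): unknown {
  let current: unknown = source;
  for (const key of path.split('.')) {
    if (typeof current !== 'object' || current === null) return undefined;
    current = (current as Source)[key];
  }
  return current;
}

// Write a value at a dotted path, creating intermediate objects as needed.
function setPath(source: Source, path: string, value: unknown): void {
  const keys = path.split('.');
  let current = source;
  keys.forEach((key, index) => {
    if (index === keys.length - 1) {
      current[key] = value;
      return;
    }
    const next = current[key];
    if (typeof next !== 'object' || next === null) {
      current[key] = {};
    }
    current = current[key] as Source;
  });
}

// Simplified merge: noFields copies nothing, missingFields copies only keys
// absent from _source, allFields also overwrites keys that already exist.
function mergeFields(strategy: Strategy, source: Source, fields: Fields): Source {
  const merged: Source = JSON.parse(JSON.stringify(source)); // deep copy
  if (strategy === 'noFields') return merged;
  for (const [path, values] of Object.entries(fields)) {
    const existsInSource = getPath(merged, path) !== undefined;
    if (strategy === 'missingFields' && existsInSource) continue;
    // Unbox single-element arrays, as both merging strategies try to do.
    setPath(merged, path, values.length === 1 ? values[0] : values);
  }
  return merged;
}

// The two test documents and the fields the mapping above would produce.
const doc1: Source = { '@timestamp': '2021-06-30T15:46:31.800Z' };
const doc2: Source = { '@timestamp': '2021-06-30T15:46:31.800Z', host: { name: 'host_name' } };
const fields: Fields = {
  'host.name': ['changed_hostname'],
  'event.dataset': ['event_dataset_name_1'],
  'event.module': ['event_module_name_1'],
  'data_stream.dataset': ['datastream_dataset_name_1'],
  'data_stream.module': ['datastream_module_name_1'],
};
```

With this sample data, `mergeFields('missingFields', doc2, fields)` keeps `host.name` as `host_name`, while `mergeFields('allFields', doc2, fields)` replaces it with `changed_hostname`, matching the screenshots described above.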
…ibana.yml and updates docker to have missing keys from security solutions (#103800) (#104020)

Co-authored-by: Frank Hassanabad <frank.hassanabad@elastic.co>
…ibana.yml and updates docker to have missing keys from security solutions (#103800) (#104019)

Co-authored-by: Frank Hassanabad <frank.hassanabad@elastic.co>
…ibana.yml and updates docker to have missing keys from security solutions (elastic#103800) ## Summary This is a follow up considered critical addition to: elastic#102280 This adds a key of `xpack.securitySolution.alertMergeStrategy` to `kibana.yml` which allows users to change their merge strategy between their raw events and the signals/alerts that are generated. This also adds additional security keys to the docker container that were overlooked in the past from security solutions. The values you can use and add to to `xpack.securitySolution.alertMergeStrategy` are: * missingFields (The default) * allFields * noFields ## missingFields The default merge strategy we are using starting with 7.14 which will merge any primitive data types from the [fields API](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-fields.html#search-fields-param) into the resulting signal/alert. This will copy over fields such as `constant_keyword`, `copy_to`, `runtime fields`, `field aliases` which previously were not copied over as long as they are primitive data types such as `keyword`, `text`, `numeric` and are not found in your original `_source` document. This will not copy copy `geo points`, `nested objects`, and in some cases if your `_source` contains arrays or top level objects or conflicts/ambiguities it will not merge them. This will _not_ merge existing values between `_source` and `fields` for `runtime fields` as well. It only merges missing primitive data types. ## allFields A very aggressive merge strategy which should be considered experimental. It will do everything `missingFields` does but in addition to that it will merge existing values between `_source` and `fields` which means if you change values or override values with `runtime fields` this strategy will attempt to merge those values. This will also merge in most instances your nested fields but it will not merge `geo` data types due to ambiguities. 
If you have multi-fields this will choose your default field and merge that into `_source`. This can change a lot your data between your original `_source` and `fields` when the data is copied into an alert/signal which is why it is considered an aggressive merge strategy. Both these strategies attempts to unbox single array elements when it makes sense and assumes you only want values in an array when it sees them in `_source` or if it sees multiple elements within an array. ## noFields The behavior before elastic#102280 was introduced and is a do nothing strategy. This should only be used if you are seeing problems with alerts/signals being inserted due to conflicts and/or bugs for some reason with `missingFields`. We are not anticipating this, but if you are setting `noFields` please reach out to our [forums](https://discuss.elastic.co/c/security/83) and let us know we have a bug so we can fix it. If you are encountering undesired merge behaviors or have other strategies you want us to implement let us know on the forums as well. The missing keys added for docker are: * xpack.securitySolution.alertMergeStrategy * xpack.securitySolution.alertResultListDefaultDateRange * xpack.securitySolution.endpointResultListDefaultFirstPageIndex * xpack.securitySolution.endpointResultListDefaultPageSize * xpack.securitySolution.maxRuleImportExportSize * xpack.securitySolution.maxRuleImportPayloadBytes * xpack.securitySolution.maxTimelineImportExportSize * xpack.securitySolution.maxTimelineImportPayloadBytes * xpack.securitySolution.packagerTaskInterval * xpack.securitySolution.validateArtifactDownloads I intentionally skipped adding the other `kibana.yml` keys which are considered either experimental flags or are for internal developers and are not documented and not supported in production by us. 
## Manual testing of the different strategies

First add this mapping and these documents in the dev tools for basic tests:

```json
# Mapping with two constant_keywords and a runtime field
DELETE frank-test-delme-17

PUT frank-test-delme-17
{
  "mappings": {
    "dynamic": "strict",
    "runtime": {
      "host.name": {
        "type": "keyword",
        "script": {
          "source": "emit('changed_hostname')"
        }
      }
    },
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "host": {
        "properties": {
          "name": {
            "type": "keyword"
          }
        }
      },
      "data_stream": {
        "properties": {
          "dataset": {
            "type": "constant_keyword",
            "value": "datastream_dataset_name_1"
          },
          "module": {
            "type": "constant_keyword",
            "value": "datastream_module_name_1"
          }
        }
      },
      "event": {
        "properties": {
          "dataset": {
            "type": "constant_keyword",
            "value": "event_dataset_name_1"
          },
          "module": {
            "type": "constant_keyword",
            "value": "event_module_name_1"
          }
        }
      }
    }
  }
}

# Document without an existing host.name
PUT frank-test-delme-17/_doc/1
{
  "@timestamp": "2021-06-30T15:46:31.800Z"
}

# Document with an existing host.name
PUT frank-test-delme-17/_doc/2
{
  "@timestamp": "2021-06-30T15:46:31.800Z",
  "host": {
    "name": "host_name"
  }
}

# Query it to ensure that fields is returned with data that does not exist in _source
GET frank-test-delme-17/_search
{
  "fields": [
    {
      "field": "*"
    }
  ]
}
```

For all the different key combinations do the following:

Run a single detection rule against the index:
<img width="1139" alt="Screen Shot 2021-06-30 at 9 49 12 AM" src="https://user-images.githubusercontent.com/1151048/123997522-b8dc6600-d98d-11eb-9407-5480d5b2cc8a.png">

Ensure two signals are created:
<img width="1376" alt="Screen Shot 2021-06-30 at 10 26 03 AM" src="https://user-images.githubusercontent.com/1151048/123997739-f17c3f80-d98d-11eb-9eb9-90e9410f0cde.png">

In your `kibana.yml` or `kibana.dev.yml`, set this key (or omit it, as it is the default):

```yml
xpack.securitySolution.alertMergeStrategy: 'missingFields'
```

When you click on each signal you should see that `event.module` and `event.dataset` were copied over as well as `data_stream.dataset` and `data_stream.module` since they're `constant_keyword`:
<img width="877" alt="Screen Shot 2021-06-30 at 10 20 44 AM" src="https://user-images.githubusercontent.com/1151048/123997961-31432700-d98e-11eb-96ee-06524f21e2d6.png">

However, since this only merges missing fields, you should see that in the first record `host.name` is the runtime field value, since `host.name` does not exist in `_source`, and that in the second record it still shows up as `host_name`, since we do not override existing `_source` values with this strategy:

First:
<img width="887" alt="Screen Shot 2021-06-30 at 10 03 31 AM" src="https://user-images.githubusercontent.com/1151048/123998398-b2022300-d98e-11eb-87be-aa5a153a91bc.png">

Second:
<img width="838" alt="Screen Shot 2021-06-30 at 10 03 44 AM" src="https://user-images.githubusercontent.com/1151048/123998413-b4fd1380-d98e-11eb-9821-d6189190918f.png">

When you set this key in your `kibana.yml` or `kibana.dev.yml`:

```yml
xpack.securitySolution.alertMergeStrategy: 'noFields'
```

Expect that your `event.module`, `event.dataset`, `data_stream.module`, and `data_stream.dataset` are all non-existent, since we do not copy anything over from `fields` at all and only use things within `_source`:
<img width="804" alt="Screen Shot 2021-06-30 at 9 58 25 AM" src="https://user-images.githubusercontent.com/1151048/123998694-f8578200-d98e-11eb-8d71-a0858d3ed3e7.png">

Expect that `host.name` is missing in the first record and has the `_source` value (`host_name`) in the second:

First:
<img width="797" alt="Screen Shot 2021-06-30 at 9 58 37 AM" src="https://user-images.githubusercontent.com/1151048/123998797-10c79c80-d98f-11eb-81b6-5174d8ef14f2.png">

Second:
<img width="806" alt="Screen Shot 2021-06-30 at 9 58 52 AM" src="https://user-images.githubusercontent.com/1151048/123998816-158c5080-d98f-11eb-87a0-0ac2f58793b3.png">

When you set this key in your `kibana.yml` or `kibana.dev.yml`:

```yml
xpack.securitySolution.alertMergeStrategy: 'allFields'
```

Expect that `event.module` and `event.dataset` were copied over as well as `data_stream.dataset` and `data_stream.module` since they're `constant_keyword`:
<img width="864" alt="Screen Shot 2021-06-30 at 10 03 15 AM" src="https://user-images.githubusercontent.com/1151048/123999000-48364900-d98f-11eb-9803-05349744ac10.png">

Expect that both the first and second records contain the runtime field value, since we merge both of them:
<img width="887" alt="Screen Shot 2021-06-30 at 10 03 31 AM" src="https://user-images.githubusercontent.com/1151048/123999078-58e6bf00-d98f-11eb-83bd-dda6b50fabcd.png">

### Checklist

Delete any items that are not applicable to this PR.

- [x] If a plugin configuration key changed, check if it needs to be allowlisted in the [cloud](https://github.com/elastic/cloud) and added to the [docker list](https://github.com/elastic/kibana/blob/c29adfef29e921cc447d2a5ed06ac2047ceab552/src/dev/build/tasks/os_packages/docker_generator/resources/bin/kibana-docker)
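
The `host.name` expectations from the manual test above can be condensed into a small model. This is only a sketch for a single flat value: the `AlertMergeStrategy` type and the `resolveValue` helper are hypothetical names made up for illustration, not code from this PR, and the real strategies handle nested objects, arrays, and other ambiguous cases that this ignores.

```typescript
// Hypothetical model of the three `alertMergeStrategy` settings for ONE value.
// The type and function names are illustrative assumptions, not the PR's code.
type AlertMergeStrategy = 'missingFields' | 'noFields' | 'allFields';

const resolveValue = (
  strategy: AlertMergeStrategy,
  sourceValue: string | undefined, // value found in `_source`, if any
  fieldsValue: string | undefined // value returned by the `fields` API, if any
): string | undefined => {
  switch (strategy) {
    case 'noFields':
      // Only `_source` is used; nothing is copied from `fields`.
      return sourceValue;
    case 'missingFields':
      // `fields` fills in only what `_source` does not already have.
      return sourceValue ?? fieldsValue;
    case 'allFields':
      // `fields` wins whenever it has a value.
      return fieldsValue ?? sourceValue;
  }
};

// Document 1 has no host.name in `_source`; document 2 has "host_name".
// The runtime field always emits "changed_hostname".
const strategies: AlertMergeStrategy[] = ['missingFields', 'noFields', 'allFields'];
const expectations = strategies.map((strategy) => ({
  strategy,
  first: resolveValue(strategy, undefined, 'changed_hostname'),
  second: resolveValue(strategy, 'host_name', 'changed_hostname'),
}));
console.log(expectations);
```

With `missingFields` the first record picks up the runtime value and the second keeps `host_name`; with `noFields` the first record has no `host.name`; with `allFields` both records carry the runtime value.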
💔 Build Failed

Failed CI Steps

Test Failures

- Kibana Pipeline / general / task-queue-process-13 / X-Pack Endpoint API Integration Tests.x-pack/test/security_solution_endpoint_api_int/apis/metadata·ts.Endpoint plugin test metadata api POST /api/endpoint/metadata when index is not empty metadata api should return one entry for each host with default paging
- Kibana Pipeline / general / task-queue-process-13 / X-Pack Endpoint API Integration Tests.x-pack/test/security_solution_endpoint_api_int/apis/metadata·ts.Endpoint plugin test metadata api POST /api/endpoint/metadata when index is not empty metadata api should return one entry for each host with default paging
- Kibana Pipeline / general / Chrome UI Functional Tests.test/functional/apps/visualize/_vega_chart·ts.visualize app visualize ciGroup12 vega chart in visualize app vega chart initial render should have view and control containers

and 2 more failures, only showing the first 3.
## Summary

This adds utilities and two strategies for merging using the `fields` API and the `_source` document during signal generation. This gives us the ability to support `constant_keyword`, field alias values, some runtime fields, and `copy_to`. Previously we did not copy any of these values and only generated signals based on the `_source` record values. This changes the behavior to allow us to copy some of the values mentioned above.

The `source_fields_merging` folder contains a `strategy` folder and a `utils` folder, which hold the strategies and the utilities for this implementation. The two strategies are `merge_all_fields_with_source` and `merge_missing_fields_with_source`. The default for this PR is `merge_missing_fields_with_source` rather than `merge_all_fields_with_source`, because it carries much lower risk and fewer behavior changes to the signals detection engine.

The main driving force behind this PR is that ECS has introduced `constant_keyword`, and that field may show up only in the `fields` section of a document and not in `_source` when index authors do not push the `constant_keyword` into the `_source` section. The secondary driving force behind this behavioral change is that some users have been expecting the runtime fields, `copy_to` fields, and field alias values of their indexes to be copied into the signals index.

Both strategies, `merge_missing_fields_with_source` and `merge_all_fields_with_source`, are considered Best Effort, meaning that neither will always merge as expected when it encounters ambiguous use cases, as outlined in detail in the `README.md` text at the top of `source_fields_merging`.

The default strategy, `merge_missing_fields_with_source`, has the simplest behavior and will work in most common use cases. If the `_source` document is missing a value that is present in `fields`, the `fields` value is a primitive concrete value such as a `string`, `number`, or `boolean`, and the `_source` document does not contain an existing object or ambiguous array at that location, then the value is merged into `_source` and a new reference is returned. Calling the strategy twice should be idempotent: the second call detects that the value is now present in `_source` and does not merge it again.

- `@deprecated` in a few spots
- `ts-expect-error` in favor of `??` in a few areas
- `detection_engine/signals/utils.ts` since fields returns `epoch_millis` as a string instead of as a number.

### Checklist

Delete any items that are not applicable to this PR.
### Risk Matrix

`merge_missing_fields_with_source`, that is lighter weight to start with. We might add a follow-up PR which enables a key in Kibana to turn off merging of fields with source. We added extensive unit tests and e2e tests. However, unexpected unknowns and behaviors from runtime fields and the fields API, such as geo-points looking like nested fields, `epoch_milliseconds` being a string value, or runtime fields allowing invalid values, were uncovered, and tests and utilities around those have been added, which makes this PR risky.
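
The merge rule that the Summary gives for `merge_missing_fields_with_source` can be sketched roughly as follows. This is a simplified illustration, not the PR's implementation: every name here (`mergeMissingFields`, `canWritePath`, `setPath`, the `SourceDoc` and `FieldsDoc` types) is hypothetical, and it assumes `_source` uses nested objects rather than literal dotted keys. The real utilities cover more cases, such as geo-points and nested fields.

```typescript
// Hypothetical sketch of a "merge missing fields" strategy. All names are
// illustrative assumptions; `_source` is assumed to use nested objects only.
type SourceDoc = { [key: string]: unknown };
type FieldsDoc = { [key: string]: unknown[] };

const isPrimitive = (value: unknown): boolean =>
  typeof value === 'string' || typeof value === 'number' || typeof value === 'boolean';

const isPlainObject = (value: unknown): value is SourceDoc =>
  typeof value === 'object' && value !== null && !Array.isArray(value);

// True when the dotted path is absent from `source` and nothing ambiguous
// (an array or a primitive) sits on the way to it.
const canWritePath = (source: SourceDoc, path: string[]): boolean => {
  let current: unknown = source;
  for (const segment of path) {
    if (current === undefined) return true; // nothing here yet, free to create
    if (!isPlainObject(current)) return false; // array or primitive in the way
    current = current[segment];
  }
  return current === undefined; // the final value itself must be missing
};

// Returns a new object with `value` written at `path`; never mutates `source`.
const setPath = (source: SourceDoc, path: string[], value: unknown): SourceDoc => {
  const [head, ...rest] = path;
  if (rest.length === 0) return { ...source, [head]: value };
  const existing = source[head];
  const child: SourceDoc = isPlainObject(existing) ? existing : {};
  return { ...source, [head]: setPath(child, rest, value) };
};

const mergeMissingFields = (source: SourceDoc, fields: FieldsDoc): SourceDoc =>
  Object.entries(fields).reduce<SourceDoc>((merged, [key, values]) => {
    const path = key.split('.');
    const mergeable =
      values.length > 0 && values.every(isPrimitive) && canWritePath(merged, path);
    if (!mergeable) return merged; // keep whatever `_source` already has
    return setPath(merged, path, values.length === 1 ? values[0] : values);
  }, source);

// Usage: `host.name` already exists in `_source`, `event.module` does not.
const source: SourceDoc = { host: { name: 'host_name' } };
const fields: FieldsDoc = {
  'host.name': ['changed_hostname'],
  'event.module': ['event_module_name_1'],
};
const once = mergeMissingFields(source, fields);
const twice = mergeMissingFields(once, fields); // second call finds nothing to merge
console.log(JSON.stringify(once), twice === once);
```

In this sketch the second call returns the same reference because every `fields` value is already present, which is one way to read the idempotence described in the Summary.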