New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CTI][PIT] adds PIT to indicator search #128433
Conversation
Pinging @elastic/security-detections-response (Team:Detections and Resp) |
@elasticmachine merge upstream |
@@ -63,3 +63,5 @@ export const DEFAULT_EVENT_ENRICHMENT_TO = 'now'; | |||
|
|||
export const TI_INTEGRATION_PREFIX = 'ti'; | |||
export const OTHER_TI_DATASET_KEY = '_others_ti_'; | |||
|
|||
export const THREAT_PIT_KEEP_ALIVE = '1h'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The value (e.g. 1m, see [Time units](https://www.elastic.co/guide/en/elasticsearch/reference/current/api-conventions.html#time-units)) does not need to be long enough to process all data — it just needs to be long enough for the next request.
From docs here: https://www.elastic.co/guide/en/elasticsearch/reference/current/point-in-time-api.html
I don't think you want this to be 1 hour but rather a lower time. Just enough that you can re-call and process things again. So for example if the elastic search will be-recalled for sure within 5 minutes, 5 minutes is a better default.
// there are no concurrent getThreatList requests, so this should be OK | ||
// eslint-disable-next-line require-atomic-updates | ||
pitId = response.pit_id; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to remove the issue with atomic operation with a function closure like so:
// This avoids an atomic race condition by passing in the pit into the function
// and then atomically using it and writing it back into the closure'ed pitId.
const searchFunction = async (pitIdForSearch: string) => {
const response = await esClient.search<
ThreatListDoc,
Record<string, estypes.AggregationsAggregate>
>({
body: {
...threatListConfig,
query: queryFilter,
search_after: searchAfter,
sort: ['_shard_doc', { '@timestamp': 'asc' }],
},
track_total_hits: false,
size: INDICATOR_PER_PAGE,
pit: { id: pitIdForSearch },
});
if (response.pit_id) {
pitId = response.pit_id;
}
return response;
};
I do think the linter is bringing up a valid point. Please test what I'm posting though as I did not test it myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left two comments, I think both should be fixed before merging but an LGTM is much deserved here.
@elasticmachine merge upstream |
merge conflict between base and head |
ignore_unavailable: true, | ||
index, | ||
size: calculatedPerPage, | ||
size: INDICATOR_PER_PAGE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a static value here will break the concurrentSearches
and itemsPerSearch
functionality provided through rule params. Are we planning to remove those params?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a great question. I believe that concurrent search logic has been built into indicator match rules a while back with the intention of speeding up the rule execution. That being said, there seems to be no test coverage around the feature, and as far as I am aware the parameters itemsPerSearch
and concurrentSearches
are undocumented and can only be set through the API - which makes me think that anyone creating indicator match rules via the UI have never been able to use anything but the default values: 1 for concurrentSearches
and 9000 for itemsPerSearch
. I am under the impression that the feature of executing concurrent searches is not something we have ever intentionally released - some telemetry data could confirm this. Is it possible to see how many rule SOs actually have "itemsPerSearch" and "concurrentSearches" values set out there in the world?
In my opinion it makes sense to be intentional about the concurrent searching capability of Indicator Match rules by bringing it into 8.3 planning as a part of the performance epic and reevaluate the performance with our improved benchmarking tooling. Until then, we should consider the feature unsupported - I'll get some product input on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the latest commit I brought back concurrentSearches
, itemsPerSearch
, and perPage
variables to ensure that the built-in concurrent searching logic remains intact. Created https://github.com/elastic/security-team/issues/3465 to benchmark concurrent threat searches for 8.3
@elasticmachine merge upstream |
@elasticmachine merge upstream |
@elasticmachine merge upstream |
@elasticmachine merge upstream |
💛 Build succeeded, but was flakyTest Failures
Metrics [docs]
History
To update your PR or re-run it, just comment with: cc @ecezalp |
Summary
Adds PIT to indicator search.
Number of requested indicator documents per batch is updated to 1000 from 9000.