[Security Solutions] Updates usage collector telemetry to use PIT (Point in Time) and restructuring of folders (#124912)

## Summary

Changes the usage collector telemetry within security solutions to use PIT (Point in Time), along with a few bug fixes and some restructuring.

* The main goal is to replace the full queries for up to 10k items with batches of 1k items at a time using PIT (Point in Time). See [this ticket](#93770) for more information and [here](#99031) for an example where they changed their code to use 1k batched items. I use PIT with the saved objects (SO) API, searches, and composite aggregations, all of which support PIT. The PIT timeouts are all set to 5 minutes, and all the 10k maximums are still in place so that memory use does not increase. However, we should now be able to raise the 10k limit if we want the usage collector to count beyond 10k. The initial 10k was an Elasticsearch limitation that PIT now avoids.
* This also fixes a bug where the aggregations were only returning the top 10 items instead of the full 10k. They now use [composite aggregations](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-composite-aggregation.html).
* This restructures the folders to apply [reductionism](https://en.wikipedia.org/wiki/Reductionism#In_computer_science) as best we can. I could not do reductionism with the schema as the tooling does not allow it. But the rest is self-repeating in the way developers hopefully expect it to be, which should also make it easier for developers to add new telemetry usage collector counters in the same fashion.
* This replaces the hand-spun TypeScript types with types derived from `caseComments`, `Sanitized Alerts`, and the `ML job types`, using `Partial` and other TypeScript tricks.
* This removes the [Cyclomatic Complexity](https://en.wikipedia.org/wiki/Cyclomatic_complexity) warnings coming from the linters by breaking down the functions into smaller units.
* This removes the `as` casts, which can lead to subtle TypeScript problems, in all but one area.
* This pushes down the logger and uses it to report errors and some debug information.
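
The batching described in the first bullet can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this commit: `PitClient`, `InMemoryPitClient`, and `fetchAllWithPit` are hypothetical names, and the in-memory client only stands in for Elasticsearch so the loop can run on its own. The pattern is the usual one of opening a PIT, paging with `search_after` in pages of 1k, stopping at the 10k cap, and always closing the PIT.

```typescript
// Hypothetical minimal shapes; the real Elasticsearch client types are richer.
interface Hit {
  id: string;
  sort: number[];
}

interface SearchParams {
  pitId: string;
  size: number;
  searchAfter: number[] | undefined;
}

interface SearchPage {
  pitId: string;
  hits: Hit[];
}

interface PitClient {
  openPointInTime(keepAlive: string): Promise<string>;
  search(params: SearchParams): Promise<SearchPage>;
  closePointInTime(pitId: string): Promise<void>;
}

const MAX_PER_PAGE = 1_000;
const MAX_RESULTS_WINDOW = 10_000;

// Pages through results in batches of `maxPerPage`, never returning more than `maxSize` hits.
async function fetchAllWithPit(
  client: PitClient,
  maxPerPage: number = MAX_PER_PAGE,
  maxSize: number = MAX_RESULTS_WINDOW
): Promise<Hit[]> {
  let pitId = await client.openPointInTime('5m');
  const results: Hit[] = [];
  let searchAfter: number[] | undefined;
  try {
    while (results.length < maxSize) {
      const size = Math.min(maxPerPage, maxSize - results.length);
      const page = await client.search({ pitId, size, searchAfter });
      pitId = page.pitId; // a search may hand back an updated PIT id
      const last = page.hits[page.hits.length - 1];
      if (last === undefined) {
        break; // empty page: nothing left to read
      }
      results.push(...page.hits);
      searchAfter = last.sort; // resume after the last hit of this page
    }
    return results;
  } finally {
    // Close the PIT even if a search throws.
    await client.closePointInTime(pitId);
  }
}

// In-memory stand-in (assumption for illustration only) so the sketch runs without a cluster.
class InMemoryPitClient implements PitClient {
  searches = 0;
  closed = false;
  private readonly docs: Hit[];

  constructor(total: number) {
    this.docs = Array.from({ length: total }, (_, i) => ({ id: `doc-${i}`, sort: [i] }));
  }

  async openPointInTime(): Promise<string> {
    return 'pit-1';
  }

  async search(params: SearchParams): Promise<SearchPage> {
    this.searches += 1;
    const start = params.searchAfter === undefined ? 0 : (params.searchAfter[0] ?? -1) + 1;
    return { pitId: params.pitId, hits: this.docs.slice(start, start + params.size) };
  }

  async closePointInTime(): Promise<void> {
    this.closed = true;
  }
}
```

With this shape, raising the cap later is a matter of passing a larger `maxSize`; the page size and the PIT handling stay the same.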

### Checklist

Delete any items that are not applicable to this PR.

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
FrankHassanabad committed Feb 15, 2022
1 parent 16f3eb3 commit 7c850dd
Showing 40 changed files with 1,896 additions and 1,221 deletions.
2 changes: 1 addition & 1 deletion x-pack/plugins/security_solution/server/plugin.ts
@@ -168,10 +168,10 @@ export class Plugin implements ISecuritySolutionPlugin {

     initUsageCollectors({
       core,
-      kibanaIndex: core.savedObjects.getKibanaIndex(),
       signalsIndex: DEFAULT_ALERTS_INDEX,
       ml: plugins.ml,
       usageCollection: plugins.usageCollection,
+      logger,
     });

     this.telemetryUsageCounter = plugins.usageCollection?.createUsageCounter(APP_ID);
40 changes: 16 additions & 24 deletions x-pack/plugins/security_solution/server/usage/collector.ts
@@ -5,38 +5,26 @@
  * 2.0.
  */

-import { CoreSetup, SavedObjectsClientContract } from '../../../../../src/core/server';
-import { CollectorFetchContext } from '../../../../../src/plugins/usage_collection/server';
-import { CollectorDependencies } from './types';
-import { fetchDetectionsMetrics } from './detections';
-import { SAVED_OBJECT_TYPES } from '../../../cases/common/constants';
-// eslint-disable-next-line no-restricted-imports
-import { legacyRuleActionsSavedObjectType } from '../lib/detection_engine/rule_actions/legacy_saved_object_mappings';
+import type { CollectorFetchContext } from '../../../../../src/plugins/usage_collection/server';
+import type { CollectorDependencies } from './types';
+import { getDetectionsMetrics } from './detections/get_metrics';
+import { getInternalSavedObjectsClient } from './get_internal_saved_objects_client';

 export type RegisterCollector = (deps: CollectorDependencies) => void;

 export interface UsageData {
   detectionMetrics: {};
 }

-export async function getInternalSavedObjectsClient(core: CoreSetup) {
-  return core.getStartServices().then(async ([coreStart]) => {
-    // note: we include the "cases" and "alert" hidden types here otherwise we would not be able to query them. If at some point cases and alert is not considered a hidden type this can be removed
-    return coreStart.savedObjects.createInternalRepository([
-      'alert',
-      legacyRuleActionsSavedObjectType,
-      ...SAVED_OBJECT_TYPES,
-    ]);
-  });
-}
-
 export const registerCollector: RegisterCollector = ({
   core,
-  kibanaIndex,
   signalsIndex,
   ml,
   usageCollection,
+  logger,
 }) => {
   if (!usageCollection) {
+    logger.debug('Usage collection is undefined, therefore returning early without registering it');
     return;
   }

@@ -525,12 +513,16 @@ export const registerCollector: RegisterCollector = ({
     },
     isReady: () => true,
     fetch: async ({ esClient }: CollectorFetchContext): Promise<UsageData> => {
-      const internalSavedObjectsClient = await getInternalSavedObjectsClient(core);
-      const soClient = internalSavedObjectsClient as unknown as SavedObjectsClientContract;
-
+      const savedObjectsClient = await getInternalSavedObjectsClient(core);
+      const detectionMetrics = await getDetectionsMetrics({
+        signalsIndex,
+        esClient,
+        savedObjectsClient,
+        logger,
+        mlClient: ml,
+      });
       return {
-        detectionMetrics:
-          (await fetchDetectionsMetrics(kibanaIndex, signalsIndex, esClient, soClient, ml)) || {},
+        detectionMetrics: detectionMetrics || {},
       };
     },
   });
26 changes: 26 additions & 0 deletions x-pack/plugins/security_solution/server/usage/constants.ts
@@ -0,0 +1,26 @@
+/*
+ * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
+ * or more contributor license agreements. Licensed under the Elastic License
+ * 2.0; you may not use this file except in compliance with the Elastic License
+ * 2.0.
+ */
+
+/**
+ * We limit the max results window to prevent in-memory from blowing up when we do correlation.
+ * This is limiting us to 10,000 cases and 10,000 elastic detection rules to do telemetry and correlation
+ * and the choice was based on the initial "index.max_result_window" before this turned into a PIT (Point In Time)
+ * implementation.
+ *
+ * This number could be changed, and the implementation details of how we correlate could change as well (maybe)
+ * to avoid pulling 10,000 worth of cases and elastic rules into memory.
+ *
+ * However, for now, we are keeping this maximum as the original and the in-memory implementation
+ */
+export const MAX_RESULTS_WINDOW = 10_000;
+
+/**
+ * We choose our max per page based on 1k as that
+ * appears to be what others are choosing here in the other sections of telemetry:
+ * https://github.com/elastic/kibana/pull/99031
+ */
+export const MAX_PER_PAGE = 1_000;
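
The two constants above bound any paged read: `MAX_PER_PAGE` sets the batch size and `MAX_RESULTS_WINDOW` sets the overall cap. A rough sketch of how they could drive the composite aggregation paging mentioned in the summary is shown below. The names `CompositeSearch`, `collectBuckets`, and `createInMemoryCompositeSearch` are hypothetical and are not taken from this commit; the in-memory search only imitates the "after key" behaviour of a composite aggregation so the loop is runnable.

```typescript
// Hypothetical shapes for one page of composite aggregation buckets.
interface CompositeKey {
  ruleId: string;
}

interface CompositeBucket {
  key: CompositeKey;
  docCount: number;
}

interface CompositePage {
  buckets: CompositeBucket[];
  afterKey: CompositeKey | undefined;
}

type CompositeSearch = (size: number, after: CompositeKey | undefined) => Promise<CompositePage>;

const MAX_RESULTS_WINDOW = 10_000;
const MAX_PER_PAGE = 1_000;

// Collects buckets page by page, resuming from the previous page's after key,
// and stops at `maxSize` buckets or when no after key comes back.
async function collectBuckets(
  search: CompositeSearch,
  maxPerPage: number = MAX_PER_PAGE,
  maxSize: number = MAX_RESULTS_WINDOW
): Promise<CompositeBucket[]> {
  const buckets: CompositeBucket[] = [];
  let after: CompositeKey | undefined;
  while (buckets.length < maxSize) {
    const size = Math.min(maxPerPage, maxSize - buckets.length);
    const page = await search(size, after);
    buckets.push(...page.buckets);
    after = page.afterKey;
    if (after === undefined || page.buckets.length === 0) {
      break; // no more buckets to read
    }
  }
  return buckets;
}

// In-memory stand-in (assumption for illustration only): buckets are ordered by key
// and each page reports the key of its last bucket as the after key.
function createInMemoryCompositeSearch(rows: CompositeBucket[]): CompositeSearch {
  const sorted = [...rows].sort((a, b) =>
    a.key.ruleId < b.key.ruleId ? -1 : a.key.ruleId > b.key.ruleId ? 1 : 0
  );
  return async (size, after) => {
    const afterId = after === undefined ? undefined : after.ruleId;
    const remaining =
      afterId === undefined ? sorted : sorted.filter((row) => row.key.ruleId > afterId);
    const buckets = remaining.slice(0, size);
    const last = buckets[buckets.length - 1];
    return { buckets, afterKey: last === undefined ? undefined : last.key };
  };
}
```

Unlike a single aggregation request that returns only its default number of top buckets, a loop of this kind keeps requesting pages until the buckets run out or the cap is reached.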

This file was deleted.
