[ML] Adds validation of field selected for log pattern analysis #162319

Merged
Changes from 27 commits
41 commits
9e290aa
[ML] Log pattern analysis field validation
jgowdyelastic Jul 20, 2023
bacf7d8
Merge branch 'main' into pattern-analysis-field-validation
jgowdyelastic Jul 20, 2023
5963371
conflicts after merge
jgowdyelastic Jul 20, 2023
db972cf
removing commented code
jgowdyelastic Jul 20, 2023
f174123
[CI] Auto-commit changed files from 'node scripts/lint_ts_projects --…
kibanamachine Jul 20, 2023
6478436
conflicts
jgowdyelastic Jul 20, 2023
93fecdc
fixing more conflicts
jgowdyelastic Jul 20, 2023
f401d4c
translation id
jgowdyelastic Jul 20, 2023
b1ea5a0
fixing query and cancel
jgowdyelastic Jul 20, 2023
c9e025a
making examples optional in the endpoint response
jgowdyelastic Jul 20, 2023
3868966
fixing optional examples
jgowdyelastic Jul 20, 2023
2a8e2d2
Merge branch 'main' into pattern-analysis-field-validation
jgowdyelastic Jul 20, 2023
f5801ae
text change
jgowdyelastic Jul 20, 2023
a42b7bb
Merge branch 'main' into pattern-analysis-field-validation
jgowdyelastic Jul 24, 2023
d801056
adding url state for field selection
jgowdyelastic Jul 24, 2023
34066a5
commented code
jgowdyelastic Jul 24, 2023
efa072c
auto select message field
jgowdyelastic Jul 24, 2023
9a970ff
Merge branch 'main' into pattern-analysis-field-validation
jgowdyelastic Jul 25, 2023
8000d37
variable rename
jgowdyelastic Jul 25, 2023
ed00ad1
missing commit
jgowdyelastic Jul 25, 2023
cc020ba
reseting validation results after form change
jgowdyelastic Jul 27, 2023
82bc81c
lots of patten typos
jgowdyelastic Jul 27, 2023
7944f53
api change
jgowdyelastic Jul 27, 2023
431238b
info text changed
jgowdyelastic Jul 27, 2023
9389219
plural patterns
jgowdyelastic Jul 27, 2023
383d92a
Merge remote-tracking branch 'origin/main' into pattern-analysis-fiel…
jgowdyelastic Jul 27, 2023
402d62a
removing accidentally touched file
jgowdyelastic Jul 27, 2023
ed77ab5
Merge remote-tracking branch 'origin/main' into pattern-analysis-fiel…
jgowdyelastic Jul 27, 2023
5864f72
Merge branch 'main' into pattern-analysis-field-validation
jgowdyelastic Jul 28, 2023
d02a6cc
Merge branch 'pattern-analysis-field-validation' of github.com:jgowdy…
jgowdyelastic Jul 28, 2023
40fb370
moving runtime mapping schema to package
jgowdyelastic Jul 28, 2023
914c43d
small changes based on review
jgowdyelastic Jul 28, 2023
141b182
[CI] Auto-commit changed files from 'node scripts/lint_ts_projects --…
kibanamachine Jul 28, 2023
3f10c3f
Merge branch 'main' into pattern-analysis-field-validation
jgowdyelastic Jul 28, 2023
04f6564
adding i18n tracking for runtime_field_utils
jgowdyelastic Jul 28, 2023
8903063
reverting runtimeMappingsSchema schema move
jgowdyelastic Jul 28, 2023
1de953b
jsdoc comments
jgowdyelastic Jul 28, 2023
092a206
[CI] Auto-commit changed files from 'node scripts/lint_ts_projects --…
kibanamachine Jul 28, 2023
8704072
translation id
jgowdyelastic Jul 28, 2023
922b8da
removing runtime_field_utils from translations
jgowdyelastic Jul 28, 2023
96a6c88
further clean up after code revert
jgowdyelastic Jul 28, 2023
Expand Up @@ -79,3 +79,10 @@ export interface FieldExampleCheck {
*/
message: string;
}

export interface FieldValidationResults {
examples?: CategoryFieldExample[];
sampleSize: number;
overallValidStatus: CATEGORY_EXAMPLES_VALIDATION_STATUS;
validationChecks: FieldExampleCheck[];
}
1 change: 1 addition & 0 deletions x-pack/packages/ml/category_validator/index.ts
Expand Up @@ -11,6 +11,7 @@ export type {
CategoryFieldExample,
FieldExampleCheck,
Token,
FieldValidationResults,
} from './common/types/categories';
export {
CATEGORY_EXAMPLES_ERROR_LIMIT,
Expand Down
11 changes: 10 additions & 1 deletion x-pack/packages/ml/category_validator/src/examples.ts
Expand Up @@ -210,7 +210,8 @@ export function categorizationExamplesProvider(client: IScopedClusterClient) {
end: number,
analyzer: CategorizationAnalyzer,
runtimeMappings: RuntimeMappings | undefined,
indicesOptions: estypes.IndicesOptions | undefined
indicesOptions: estypes.IndicesOptions | undefined,
includeExamples = true
) {
const resp = await categorizationExamples(
indexPatternTitle,
Expand All @@ -229,6 +230,14 @@ export function categorizationExamplesProvider(client: IScopedClusterClient) {
const sampleSize = examples.length;
validationResults.createTokenCountResult(examples, sampleSize);

if (includeExamples === false) {
return {
overallValidStatus: validationResults.overallResult,
validationChecks: validationResults.results,
sampleSize,
};
}

// sort examples by number of tokens, keeping track of their original order
// with an origIndex property
const sortedExamples = examples
Expand Down
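The early return in this hunk lets callers that only need the verdict skip the example payload. A minimal sketch of that response shaping follows. The types are simplified stand-ins and `buildResponse` is a hypothetical name, not the package's API.

```typescript
// Simplified stand-ins for the package types; names are illustrative only.
type Status = 'valid' | 'partially_valid' | 'invalid';

interface Check {
  id: number;
  valid: Status;
  message: string;
}

interface ValidationResponse {
  examples?: string[];
  sampleSize: number;
  overallValidStatus: Status;
  validationChecks: Check[];
}

// Mirrors the shape of the early return: when includeExamples is false,
// the `examples` key is left out entirely rather than set to an empty array.
function buildResponse(
  examples: string[],
  overallValidStatus: Status,
  validationChecks: Check[],
  includeExamples = true
): ValidationResponse {
  const base: ValidationResponse = {
    sampleSize: examples.length,
    overallValidStatus,
    validationChecks,
  };
  return includeExamples ? { ...base, examples } : base;
}
```

This matches the `examples?` optional property added to `FieldValidationResults`: consumers have to handle the key being absent.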
1 change: 1 addition & 0 deletions x-pack/plugins/aiops/common/api/index.ts
Expand Up @@ -15,6 +15,7 @@ import { streamReducer } from './stream_reducer';

export const AIOPS_API_ENDPOINT = {
LOG_RATE_ANALYSIS: '/internal/aiops/log_rate_analysis',
CATEGORIZATION_FIELD_VALIDATION: '/internal/aiops/categorization_field_validation',
} as const;

type AiopsApiEndpointKeys = keyof typeof AIOPS_API_ENDPOINT;
Expand Down
47 changes: 47 additions & 0 deletions x-pack/plugins/aiops/common/api/log_categorization/schema.ts
Expand Up @@ -6,6 +6,39 @@
*/

import { schema, TypeOf } from '@kbn/config-schema';
import { i18n } from '@kbn/i18n';
import { isRuntimeField } from '@kbn/ml-runtime-field-utils';

export const runtimeMappingsSchema = schema.object(
Member

Wonder if we should add runtimeMappingsSchema to ml-runtime-field-utils as we do use it in several plugins 🤔

Member Author

Moved in 40fb370

Member Author

I've reverted this move. It caused a 200KB bundle increase in ML and Transforms.

{},
{
unknowns: 'allow',
validate: (v: object) => {
if (Object.values(v).some((o) => !isRuntimeField(o))) {
return i18n.translate('xpack.aiops.invalidRuntimeFieldMessage', {
defaultMessage: 'Invalid runtime field',
});
}
},
}
);
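The `validate` callback above rejects the whole object if any value fails the `isRuntimeField` type guard. A minimal sketch of that pattern, using a simplified and hypothetical stand-in for the guard (the real `isRuntimeField` from `@kbn/ml-runtime-field-utils` may check more than this, and the list of allowed types below is an assumption):

```typescript
// Hypothetical, simplified stand-in for the real type guard.
interface RuntimeFieldLike {
  type: string;
  script?: unknown;
}

// Assumed list of accepted runtime field types, for illustration only.
const ALLOWED_TYPES = ['boolean', 'date', 'double', 'geo_point', 'ip', 'keyword', 'long'];

function isRuntimeFieldLike(arg: unknown): arg is RuntimeFieldLike {
  if (typeof arg !== 'object' || arg === null) return false;
  const type = (arg as { type?: unknown }).type;
  return typeof type === 'string' && ALLOWED_TYPES.includes(type);
}

// Same contract as the schema's validate option: return an error string
// when invalid, or undefined when every value passes.
function validateRuntimeMappings(v: object): string | undefined {
  if (Object.values(v).some((o) => !isRuntimeFieldLike(o))) {
    return 'Invalid runtime field';
  }
  return undefined;
}
```

An empty object passes, since `some` over an empty array is `false`.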

export const indicesOptionsSchema = schema.object({
expand_wildcards: schema.maybe(
schema.arrayOf(
schema.oneOf([
schema.literal('all'),
schema.literal('open'),
schema.literal('closed'),
schema.literal('hidden'),
schema.literal('none'),
])
)
),
ignore_unavailable: schema.maybe(schema.boolean()),
allow_no_indices: schema.maybe(schema.boolean()),
ignore_throttled: schema.maybe(schema.boolean()),
});

export const categorizeSchema = schema.object({
index: schema.string(),
Expand All @@ -18,3 +51,17 @@ export const categorizeSchema = schema.object({
});

export type CategorizeSchema = TypeOf<typeof categorizeSchema>;

export const categorizationFieldValidationSchema = schema.object({
indexPatternTitle: schema.string(),
query: schema.any(),
size: schema.number(),
field: schema.string(),
timeField: schema.maybe(schema.string()),
start: schema.number(),
end: schema.number(),
analyzer: schema.maybe(schema.any()),
runtimeMappings: runtimeMappingsSchema,
indicesOptions: indicesOptionsSchema,
includeExamples: schema.boolean(),
});
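For orientation, a request body shaped like `categorizationFieldValidationSchema` might look like the sketch below. The interface is a hand-written mirror of the schema for illustration only, and the index pattern, field names, and timestamps are made-up values. The sketch passes `runtimeMappings` and `indicesOptions` explicitly as empty objects; whether they may be omitted is not shown in this diff.

```typescript
// Hand-written mirror of the schema above, for illustration only.
interface CategorizationFieldValidationBody {
  indexPatternTitle: string;
  query: unknown;
  size: number;
  field: string;
  timeField?: string;
  start: number;
  end: number;
  analyzer?: unknown;
  runtimeMappings: Record<string, unknown>;
  indicesOptions: {
    expand_wildcards?: Array<'all' | 'open' | 'closed' | 'hidden' | 'none'>;
    ignore_unavailable?: boolean;
    allow_no_indices?: boolean;
    ignore_throttled?: boolean;
  };
  includeExamples: boolean;
}

// All values below are hypothetical.
const body: CategorizationFieldValidationBody = {
  indexPatternTitle: 'my-logs-*',
  query: { match_all: {} },
  size: 5,
  field: 'message',
  timeField: '@timestamp',
  start: 1689811200000,
  end: 1689897600000,
  runtimeMappings: {},
  indicesOptions: {},
  includeExamples: false,
};

// The body would be sent as JSON to the validation endpoint.
const serialized = JSON.stringify(body);
```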
19 changes: 16 additions & 3 deletions x-pack/plugins/aiops/public/application/utils/url_state.ts
Expand Up @@ -8,7 +8,6 @@
import type * as estypes from '@elastic/elasticsearch/lib/api/typesWithBodyKey';

import type { Filter, Query } from '@kbn/es-query';
import { isPopulatedObject } from '@kbn/ml-is-populated-object';

import { SEARCH_QUERY_LANGUAGE, SearchQueryLanguage } from './search_utils';

Expand Down Expand Up @@ -40,6 +39,20 @@ export const getDefaultAiOpsListState = (
...overrides,
});

export const isFullAiOpsListState = (arg: unknown): arg is AiOpsFullIndexBasedAppState => {
return isPopulatedObject(arg, Object.keys(getDefaultAiOpsListState()));
export interface LogCategorizationPageUrlState {
Contributor

Did you manage to find out if it's possible to disable auto refresh on the page? It's good that we persist the field selection, but we lose the results when the page refreshes (inside ML).

Member Author

I could not find a way to do it. You can disable some parts of the time picker, but not the refresh checkbox.

pageKey: 'logCategorization';
pageUrlState: LogCategorizationAppState;
}

export interface LogCategorizationAppState extends AiOpsFullIndexBasedAppState {
field: string | undefined;
}

export const getDefaultLogCategorizationAppState = (
overrides?: Partial<LogCategorizationAppState>
): LogCategorizationAppState => {
return {
field: undefined,
...getDefaultAiOpsListState(overrides),
};
};
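Because the spread of `getDefaultAiOpsListState(overrides)` comes after `field: undefined`, a `field` passed in `overrides` wins over the default. A minimal sketch of that ordering, assuming (as the `...overrides` line in the earlier hunk suggests) that the base helper spreads its overrides last. The state shape here is trimmed down and the names are hypothetical.

```typescript
// Trimmed, hypothetical versions of the list state and the page state.
interface ListState {
  searchString: string;
}

interface LogState extends ListState {
  field: string | undefined;
}

// Assumed behaviour of the base helper: defaults first, overrides last.
function getDefaultListState(overrides?: Partial<ListState>): ListState {
  return { searchString: '', ...overrides };
}

// Same ordering as getDefaultLogCategorizationAppState: the default for
// `field` is written first, so any `field` carried through by the
// overrides replaces it at runtime.
function getDefaultLogState(overrides?: Partial<LogState>): LogState {
  return {
    field: undefined,
    ...getDefaultListState(overrides),
  };
}
```

If the two lines in the returned object were swapped, `field: undefined` would always clobber a field restored from the URL.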
Expand Up @@ -24,7 +24,7 @@ import { Filter } from '@kbn/es-query';
import { useDiscoverLinks, createFilter, QueryMode, QUERY_MODE } from '../use_discover_links';
import { MiniHistogram } from '../../mini_histogram';
import { useEuiTheme } from '../../../hooks/use_eui_theme';
import type { AiOpsFullIndexBasedAppState } from '../../../application/utils/url_state';
import type { LogCategorizationAppState } from '../../../application/utils/url_state';
import type { EventRate, Category, SparkLinesPerCategory } from '../use_categorize_request';
import { useTableState } from './use_table_state';
import { getLabels } from './labels';
Expand All @@ -37,7 +37,7 @@ interface Props {
dataViewId: string;
selectedField: DataViewField | string | undefined;
timefilter: TimefilterContract;
aiopsListState: AiOpsFullIndexBasedAppState;
aiopsListState: LogCategorizationAppState;
pinnedCategory: Category | null;
setPinnedCategory: (category: Category | null) => void;
selectedCategory: Category | null;
Expand Down
Expand Up @@ -33,7 +33,7 @@ export const TableHeader: FC<Props> = ({
<EuiText size="s" data-test-subj="aiopsLogPatternsFoundCount">
<FormattedMessage
id="xpack.aiops.logCategorization.counts"
defaultMessage="{count} patterns found"
defaultMessage="{count} {count, plural, one {pattern} other {patterns}} found"
values={{ count: categoriesCount }}
/>
{selectedCategoriesCount > 0 ? (
Expand Down
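The new `defaultMessage` uses ICU plural syntax so that a single result reads "1 pattern found". As a rough illustration of the category selection behind `{count, plural, one {...} other {...}}`, here is a sketch using the standard `Intl.PluralRules` API rather than Kibana's i18n tooling. `patternsFoundLabel` is a hypothetical helper.

```typescript
// Picks the English plural category for the count and builds the label.
function patternsFoundLabel(count: number): string {
  const category = new Intl.PluralRules('en').select(count);
  const noun = category === 'one' ? 'pattern' : 'patterns';
  return `${count} ${noun} found`;
}
```

In English, zero falls into the `other` category, so an empty result still reads "0 patterns found".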
@@ -0,0 +1,48 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/

import React, { FC } from 'react';
import {
FieldValidationResults,
CATEGORY_EXAMPLES_VALIDATION_STATUS,
} from '@kbn/ml-category-validator';

import { EuiCallOut } from '@elastic/eui';
import { i18n } from '@kbn/i18n';

interface Props {
validationResults: FieldValidationResults | null;
}

export const FieldValidationCallout: FC<Props> = ({ validationResults }) => {
if (validationResults === null) {
return null;
}

if (validationResults.overallValidStatus === CATEGORY_EXAMPLES_VALIDATION_STATUS.VALID) {
return null;
}

return (
<>
{validationResults !== null ? (
Contributor

Guess this ternary can be removed since the same check is done above and returns early.

Member Author

Updated in 914c43d

<EuiCallOut
color="warning"
title={i18n.translate('xpack.aiops.logCategorization.fieldValidationTitle', {
defaultMessage: 'The selected field is possibly not suitable for pattern analysis',
})}
>
{validationResults.validationChecks
.filter((check) => check.valid !== CATEGORY_EXAMPLES_VALIDATION_STATUS.VALID)
.map((check) => (
<div key={check.id}>{check.message}</div>
))}
</EuiCallOut>
) : null}
</>
);
};
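On the review note about the redundant ternary: once the two early returns have run, `validationResults` is known to be non-null and not valid, so the callout can be rendered directly. The same control flow, reduced to a plain function so it can be read without React (types and names here are simplified stand-ins, not the component's actual code):

```typescript
type Status = 'valid' | 'partially_valid' | 'invalid';

interface Check {
  id: number;
  valid: Status;
  message: string;
}

interface ValidationResults {
  overallValidStatus: Status;
  validationChecks: Check[];
}

// Returns null when nothing should be rendered, otherwise the messages
// of every check that did not pass. No second null check is needed after
// the early returns.
function getCalloutMessages(results: ValidationResults | null): string[] | null {
  if (results === null) {
    return null;
  }
  if (results.overallValidStatus === 'valid') {
    return null;
  }
  return results.validationChecks
    .filter((check) => check.valid !== 'valid')
    .map((check) => check.message);
}
```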
Expand Up @@ -57,7 +57,7 @@ export const InformationText: FC<Props> = ({
<h2>
<FormattedMessage
id="xpack.aiops.logCategorization.emptyPromptTitle"
defaultMessage="Select a text field and click run categorization to start analysis"
defaultMessage="Select a text field and click run pattern analysis to start analysis"
/>
</h2>
}
Expand Down
Expand Up @@ -21,6 +21,7 @@ import {
import { buildEmptyFilter, Filter } from '@kbn/es-query';

import { usePageUrlState } from '@kbn/ml-url-state';
import type { FieldValidationResults } from '@kbn/ml-category-validator';
import { useData } from '../../hooks/use_data';
import { useSearch } from '../../hooks/use_search';
import { useCategorizeRequest } from './use_categorize_request';
Expand All @@ -32,11 +33,12 @@ import { createMergedEsQuery } from '../../application/utils/search_utils';
import { SamplingMenu } from './sampling_menu';
import { TechnicalPreviewBadge } from './technical_preview_badge';
import { LoadingCategorization } from './loading_categorization';
import { useValidateFieldRequest } from './use_validate_category_field';
import {
type AiOpsPageUrlState,
getDefaultAiOpsListState,
isFullAiOpsListState,
type LogCategorizationPageUrlState,
getDefaultLogCategorizationAppState,
} from '../../application/utils/url_state';
import { FieldValidationCallout } from './category_validation_callout';

export interface LogCategorizationPageProps {
dataView: DataView;
Expand All @@ -60,14 +62,21 @@ export const LogCategorizationFlyout: FC<LogCategorizationPageProps> = ({
},
uiSettings,
} = useAiopsAppContext();

const { runValidateFieldRequest, cancelRequest: cancelValidationRequest } =
useValidateFieldRequest();
const { euiTheme } = useEuiTheme();
const { filters, query } = useMemo(() => getState(), [getState]);

const mounted = useRef(false);
const { runCategorizeRequest, cancelRequest, randomSampler } = useCategorizeRequest();
const [aiopsListState] = usePageUrlState<AiOpsPageUrlState>(
'AIOPS_INDEX_VIEWER',
getDefaultAiOpsListState({
const {
runCategorizeRequest,
cancelRequest: cancelCategorizationRequest,
randomSampler,
} = useCategorizeRequest();
const [stateFromUrl] = usePageUrlState<LogCategorizationPageUrlState>(
'logCategorization',
getDefaultLogCategorizationAppState({
searchQuery: createMergedEsQuery(query, filters, dataView, uiSettings),
})
);
Expand All @@ -80,6 +89,14 @@ export const LogCategorizationFlyout: FC<LogCategorizationPageProps> = ({
categories: Category[];
sparkLines: SparkLinesPerCategory;
} | null>(null);
const [fieldValidationResult, setFieldValidationResult] = useState<FieldValidationResults | null>(
null
);

const cancelRequest = useCallback(() => {
cancelValidationRequest();
cancelCategorizationRequest();
}, [cancelCategorizationRequest, cancelValidationRequest]);

useEffect(
function cancelRequestOnLeave() {
Expand All @@ -94,7 +111,7 @@ export const LogCategorizationFlyout: FC<LogCategorizationPageProps> = ({

const { searchQueryLanguage, searchString, searchQuery } = useSearch(
{ dataView, savedSearch: selectedSavedSearch },
aiopsListState,
stateFromUrl,
true
);

Expand All @@ -109,7 +126,8 @@ export const LogCategorizationFlyout: FC<LogCategorizationPageProps> = ({
);

const loadCategories = useCallback(async () => {
const { title: index, timeFieldName: timeField } = dataView;
const { getIndexPattern, timeFieldName: timeField } = dataView;
const index = getIndexPattern();

if (selectedField === undefined || timeField === undefined) {
return;
Expand All @@ -119,20 +137,35 @@ export const LogCategorizationFlyout: FC<LogCategorizationPageProps> = ({

setLoading(true);
setData(null);
setFieldValidationResult(null);

try {
const { categories, sparkLinesPerCategory: sparkLines } = await runCategorizeRequest(
index,
selectedField.name,
timeField,
earliest,
latest,
searchQuery,
intervalMs
);
const [validationResult, categorizationResult] = await Promise.all([
runValidateFieldRequest(
index,
selectedField.name,
timeField,
earliest,
latest,
searchQuery
),
runCategorizeRequest(
index,
selectedField.name,
timeField,
earliest,
latest,
searchQuery,
intervalMs
),
]);

if (mounted.current === true) {
setData({ categories, sparkLines });
setFieldValidationResult(validationResult);
setData({
categories: categorizationResult.categories,
sparkLines: categorizationResult.sparkLinesPerCategory,
});
}
} catch (error) {
toasts.addError(error, {
Expand All @@ -149,10 +182,11 @@ export const LogCategorizationFlyout: FC<LogCategorizationPageProps> = ({
dataView,
selectedField,
cancelRequest,
runCategorizeRequest,
runValidateFieldRequest,
earliest,
latest,
searchQuery,
runCategorizeRequest,
intervalMs,
toasts,
]);
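`loadCategories` now awaits both requests with `Promise.all`, so they run concurrently and state is only updated once both have resolved. If either rejects, execution jumps to the `catch` block and neither result is applied. A minimal sketch of that behaviour with hypothetical stand-in request functions:

```typescript
// Hypothetical stand-ins for runValidateFieldRequest and runCategorizeRequest.
async function fakeValidate(field: string): Promise<{ overallValidStatus: string }> {
  return { overallValidStatus: field === 'message' ? 'valid' : 'invalid' };
}

async function fakeCategorize(field: string): Promise<{ categories: string[] }> {
  if (field === '') {
    throw new Error('no field selected');
  }
  return { categories: [`pattern for ${field}`] };
}

// Both requests start immediately; the destructured results are only
// available if both promises resolve.
async function loadBoth(field: string) {
  try {
    const [validationResult, categorizationResult] = await Promise.all([
      fakeValidate(field),
      fakeCategorize(field),
    ]);
    return { validationResult, categories: categorizationResult.categories };
  } catch {
    // In the real component this is where the error toast is raised.
    return null;
  }
}
```

One consequence of this design is that a failure in the validation request also prevents the categories from being shown, even if categorization itself succeeded.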
Expand Down Expand Up @@ -217,6 +251,8 @@ export const LogCategorizationFlyout: FC<LogCategorizationPageProps> = ({
</EuiFlexGroup>
</EuiFlyoutHeader>
<EuiFlyoutBody data-test-subj="mlJobSelectorFlyoutBody">
<FieldValidationCallout validationResults={fieldValidationResult} />
peteharverson marked this conversation as resolved.
Contributor

As discussed, looks like the callout isn't getting displayed if no categories are found - for example with field6 from the categorization_functional_test data set. In this case the field is populated so the message here is misleading:

image

Member Author

With this example, the field does not produce any warnings because the data is tokenized correctly but it also does not produce any categories. I suspect this is because every doc contains the same data.


{loading === true ? <LoadingCategorization onClose={onClose} /> : null}

<InformationText
Expand All @@ -226,13 +262,10 @@ export const LogCategorizationFlyout: FC<LogCategorizationPageProps> = ({
fieldSelected={selectedField !== null}
/>

{loading === false &&
data !== null &&
data.categories.length > 0 &&
isFullAiOpsListState(aiopsListState) ? (
{loading === false && data !== null && data.categories.length > 0 ? (
<CategoryTable
categories={data.categories}
aiopsListState={aiopsListState}
aiopsListState={stateFromUrl}
dataViewId={dataView.id!}
eventRate={eventRate}
sparkLines={data.sparkLines}
Expand Down
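The review thread on `FieldValidationCallout` describes a field that tokenizes cleanly yet yields no patterns. With the rendering conditions in this diff, that combination shows neither the warning callout nor the category table. A sketch of those two conditions as a plain function (`visiblePanels` and its parameters are hypothetical names used only for illustration):

```typescript
type Status = 'valid' | 'partially_valid' | 'invalid';

// Mirrors the two conditions in the diff: the callout renders only for a
// non-null, non-valid validation result, and the table renders only when
// loading has finished and at least one category was found.
function visiblePanels(
  loading: boolean,
  validationStatus: Status | null,
  categoryCount: number | null
): { callout: boolean; table: boolean } {
  return {
    callout: validationStatus !== null && validationStatus !== 'valid',
    table: loading === false && categoryCount !== null && categoryCount > 0,
  };
}
```

For the case from the thread, `visiblePanels(false, 'valid', 0)` leaves both panels hidden, which is why only the information text remains on screen.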