Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] DFA: ensure at least one field is included in analysis before job can be created #65320

Merged
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ describe('Data Frame Analytics: <CreateAnalyticsForm />', () => {
);

const euiFormRows = wrapper.find('EuiFormRow');
expect(euiFormRows.length).toBe(9);
expect(euiFormRows.length).toBe(10);

const row1 = euiFormRows.at(0);
expect(row1.find('label').text()).toBe('Job type');
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,13 @@ import {
} from '../../../../common/analytics';
import { shouldAddAsDepVarOption, OMIT_FIELDS } from './form_options_validation';

const requiredFieldsErrorText = i18n.translate(
'xpack.ml.dataframe.analytics.create.requiredFieldsErrorMessage',
{
defaultMessage: 'At least one field must be included in the analysis.',
}
);

export const CreateAnalyticsForm: FC<CreateAnalyticsFormProps> = ({ actions, state }) => {
const {
services: { docLinks },
Expand Down Expand Up @@ -96,6 +103,7 @@ export const CreateAnalyticsForm: FC<CreateAnalyticsFormProps> = ({ actions, sta
numTopFeatureImportanceValuesValid,
previousJobType,
previousSourceIndex,
requiredFieldsError,
sourceIndex,
sourceIndexNameEmpty,
sourceIndexNameValid,
Expand Down Expand Up @@ -158,6 +166,8 @@ export const CreateAnalyticsForm: FC<CreateAnalyticsFormProps> = ({ actions, sta
};

const debouncedGetExplainData = debounce(async () => {
const jobTypeOrIndexChanged =
previousSourceIndex !== sourceIndex || previousJobType !== jobType;
const shouldUpdateModelMemoryLimit = !firstUpdate.current || !modelMemoryLimit;
const shouldUpdateEstimatedMml =
!firstUpdate.current || !modelMemoryLimit || estimatedModelMemoryLimit === '';
Expand All @@ -167,7 +177,7 @@ export const CreateAnalyticsForm: FC<CreateAnalyticsFormProps> = ({ actions, sta
}
// Reset if sourceIndex or jobType changes (jobType requires dependent_variable to be set -
// which won't be the case if switching from outlier detection)
if (previousSourceIndex !== sourceIndex || previousJobType !== jobType) {
if (jobTypeOrIndexChanged) {
setFormState({
loadingFieldOptions: true,
});
Expand All @@ -186,8 +196,21 @@ export const CreateAnalyticsForm: FC<CreateAnalyticsFormProps> = ({ actions, sta
setEstimatedModelMemoryLimit(expectedMemoryWithoutDisk);
}

const fieldSelection: FieldSelectionItem[] = resp.field_selection;

let hasRequiredFields = false;
if (fieldSelection) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this if part necessary? Asking because the assignment looks like it will always be an array. Maybe this whole part can even become something like:

const hasRequiredFields = resp.field_selection.some(field => field.is_included === true && field.is_required === false);

Copy link
Contributor Author

@alvarezmelissa87 alvarezmelissa87 May 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrapping in an if statement is mostly an extra safety precaution for something going wrong silently in the backend or changes to it that would make field_selection undefined. Looking at the docs, there doesn't seem to be a guarantee that that field will always be populated. The for loop prevents us from having to create a function on every go through. Though in this case the optimization is probably not significant. 🤔

I think for now I'd lean toward keeping it though I'm happy to update if you feel strongly (the readability is a bit better with your suggestion)

Copy link
Contributor

@walterra walterra May 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying! Maybe we can then update the response type DfAnalyticsExplainResponse to something like field_selection?: FieldSelectionItem[]; so TypeScript will force us to consider that the field could be not returned?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type updated in 4477264

for (let i = 0; i < fieldSelection.length; i++) {
const field = fieldSelection[i];
if (field.is_included === true && field.is_required === false) {
hasRequiredFields = true;
break;
}
}
}

// If sourceIndex has changed load analysis field options again
if (previousSourceIndex !== sourceIndex || previousJobType !== jobType) {
if (jobTypeOrIndexChanged) {
const analyzedFieldsOptions: EuiComboBoxOptionOption[] = [];

if (resp.field_selection) {
Expand All @@ -204,21 +227,24 @@ export const CreateAnalyticsForm: FC<CreateAnalyticsFormProps> = ({ actions, sta
loadingFieldOptions: false,
fieldOptionsFetchFail: false,
maxDistinctValuesError: undefined,
requiredFieldsError: !hasRequiredFields ? requiredFieldsErrorText : undefined,
});
} else {
setFormState({
...(shouldUpdateModelMemoryLimit ? { modelMemoryLimit: expectedMemoryWithoutDisk } : {}),
requiredFieldsError: !hasRequiredFields ? requiredFieldsErrorText : undefined,
});
}
} catch (e) {
let errorMessage;
if (
jobType === ANALYSIS_CONFIG_TYPE.CLASSIFICATION &&
e.message !== undefined &&
e.message.includes('status_exception') &&
e.message.includes('must have at most')
e.body &&
e.body.message !== undefined &&
e.body.message.includes('status_exception') &&
e.body.message.includes('must have at most')
) {
errorMessage = e.message;
errorMessage = e.body.message;
}
const fallbackModelMemoryLimit =
jobType !== undefined
Expand Down Expand Up @@ -321,6 +347,7 @@ export const CreateAnalyticsForm: FC<CreateAnalyticsFormProps> = ({ actions, sta
excludesOptions: [],
previousSourceIndex: sourceIndex,
sourceIndex: selectedOptions[0].label || '',
requiredFieldsError: undefined,
});
};

Expand Down Expand Up @@ -368,6 +395,9 @@ export const CreateAnalyticsForm: FC<CreateAnalyticsFormProps> = ({ actions, sta
forceInput.current.dispatchEvent(evt);
}, []);

const noSupportetdAnalysisFields =
excludesOptions.length === 0 && fieldOptionsFetchFail === false && !sourceIndexNameEmpty;

return (
<EuiForm className="mlDataFrameAnalyticsCreateForm">
<Messages messages={requestMessages} />
Expand Down Expand Up @@ -715,18 +745,31 @@ export const CreateAnalyticsForm: FC<CreateAnalyticsFormProps> = ({ actions, sta
</EuiFormRow>
</Fragment>
)}
<EuiFormRow
fullWidth
isInvalid={requiredFieldsError !== undefined}
error={
requiredFieldsError !== undefined && [
i18n.translate('xpack.ml.dataframe.analytics.create.requiredFieldsError', {
defaultMessage: 'Invalid. {message}',
values: { message: requiredFieldsError },
}),
]
}
>
<Fragment />
</EuiFormRow>
<EuiFormRow
label={i18n.translate('xpack.ml.dataframe.analytics.create.excludedFieldsLabel', {
defaultMessage: 'Excluded fields',
})}
isInvalid={noSupportetdAnalysisFields}
helpText={i18n.translate('xpack.ml.dataframe.analytics.create.excludedFieldsHelpText', {
defaultMessage:
'Select fields to exclude from analysis. All other supported fields are included.',
})}
error={
excludesOptions.length === 0 &&
fieldOptionsFetchFail === false &&
!sourceIndexNameEmpty && [
noSupportetdAnalysisFields && [
i18n.translate(
'xpack.ml.dataframe.analytics.create.excludesOptionsNoSupportedFields',
{
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -69,6 +69,7 @@ export const JobType: FC<Props> = ({ type, setFormState }) => {
previousJobType: type,
jobType: value,
excludes: [],
requiredFieldsError: undefined,
});
}}
data-test-subj="mlAnalyticsCreateJobFlyoutJobTypeSelect"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -124,6 +124,7 @@ export const validateAdvancedEditor = (state: State): State => {
createIndexPattern,
excludes,
maxDistinctValuesError,
requiredFieldsError,
} = state.form;
const { jobConfig } = state;

Expand Down Expand Up @@ -330,6 +331,7 @@ export const validateAdvancedEditor = (state: State): State => {

state.isValid =
maxDistinctValuesError === undefined &&
requiredFieldsError === undefined &&
excludesValid &&
trainingPercentValid &&
state.form.modelMemoryLimitUnitValid &&
Expand Down Expand Up @@ -397,6 +399,7 @@ const validateForm = (state: State): State => {
maxDistinctValuesError,
modelMemoryLimit,
numTopFeatureImportanceValuesValid,
requiredFieldsError,
} = state.form;
const { estimatedModelMemoryLimit } = state;

Expand All @@ -412,6 +415,7 @@ const validateForm = (state: State): State => {

state.isValid =
maxDistinctValuesError === undefined &&
requiredFieldsError === undefined &&
!jobTypeEmpty &&
!mmlValidationResult &&
!jobIdEmpty &&
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,7 @@ export interface State {
numTopFeatureImportanceValuesValid: boolean;
previousJobType: null | AnalyticsJobType;
previousSourceIndex: EsIndexName | undefined;
requiredFieldsError: string | undefined;
sourceIndex: EsIndexName;
sourceIndexNameEmpty: boolean;
sourceIndexNameValid: boolean;
Expand Down Expand Up @@ -133,6 +134,7 @@ export const getInitialState = (): State => ({
numTopFeatureImportanceValuesValid: true,
previousJobType: null,
previousSourceIndex: undefined,
requiredFieldsError: undefined,
sourceIndex: '',
sourceIndexNameEmpty: true,
sourceIndexNameValid: false,
Expand Down