[ML] Trained models list: disables 'View training data' action if data frame analytics job no longer exists #171061

Merged
alvarezmelissa87 merged 8 commits into elastic:main from ml-dfa-view-training on Nov 21, 2023

Conversation

Contributor

@alvarezmelissa87 alvarezmelissa87 commented Nov 10, 2023

Summary

Fixes #167667 by disabling the 'View training data' action for models in the Trained Models list if the data frame analytics job that created the model no longer exists.

Adds origin_job_exists property to trained models list model items.
This is set during the models fetch for models with associated data frame analytics jobs.

Checklist

Delete any items that are not applicable to this PR.

@elasticmachine
Contributor

Pinging @elastic/ml-ui (:ml)

@peteharverson peteharverson requested review from jgowdyelastic and walterra and removed request for darnautov November 13, 2023 09:53
@@ -98,6 +98,7 @@ export type TrainedModelConfigResponse = estypes.MlTrainedModelConfig & {
* Associated pipelines. Extends response from the ES endpoint.
*/
pipelines?: Record<string, PipelineDefinition> | null;
origin_job_exists?: boolean;
Member

I don't think this needs to be an optional route param. We should always perform this check when retrieving the models, as it's useful information.

Contributor Author

Updated in 7c06abd

@@ -142,6 +142,7 @@ export function useModelActions({
icon: 'visTable',
type: 'icon',
available: (item) => !!item.metadata?.analytics_config?.id,
enabled: (item) => item.origin_job_exists === true,
Contributor

Can you add a tooltip to show why the action is disabled?

Contributor Author

Added in 6d8a4e5
Happy to get suggestions on the copy to use.

@alvarezmelissa87
Contributor Author

This is ready for another look when you get a chance 🙏 cc @peteharverson, @jgowdyelastic

@alvarezmelissa87
Contributor Author

@elasticmachine merge upstream

@peteharverson peteharverson changed the title [ML] Trained models list: disable 'View training data' action if DFA job no longer exists [ML] Trained models list: disable 'View training data' action if data frame analytics job no longer exists Nov 14, 2023
const jobIds = result.map((model) => {
let id = model.metadata?.analytics_config?.id;
if (id) {
id = `${id}*`;
Member

I don't think we should be modifying the IDs. This could lead to unexpected behaviour, e.g. if we have two jobs, foo and foo2, wildcarding foo* will match both.

In my previous suggestion I overlooked the fact that comma separating the IDs to pass to getDataFrameAnalytics will throw a 404 if any of them are missing.

I still don't think we should call getDataFrameAnalytics in a loop and so I think the simplest solution would be to fetch all DFA jobs and just loop over them looking for the IDs we want.
That call isn't expensive. We could also be clever and only use the job ID if there is only one model in the results, as it's likely this endpoint will be called with either one model ID or no model IDs.

I had a go at writing up what I'm thinking. I got carried away trying to make it performant.

const filteredModels = filterForEnabledFeatureModels<TrainedModelConfigResponse>(
  result,
  getEnabledFeatures()
);
const dfaJobIdMap = filteredModels.reduce<Record<string, string>>((c, m) => {
  const id = m.metadata?.analytics_config?.id;
  if (id !== undefined) {
    c[m.model_id] = id;
  }
  return c;
}, {});

const jobIds = Object.values(dfaJobIdMap);

if (jobIds.length === 0) {
  // return early, there are no dfa jobs
  return response.ok({
    body: filteredModels,
  });
}

let dfaJobs: estypes.MlDataframeAnalyticsSummary[] = [];
try {
  const jobs =
    jobIds.length === 1
      ? await mlClient.getDataFrameAnalytics({
          id: jobIds[0],
        })
      : await mlClient.getDataFrameAnalytics();

  dfaJobs = jobs.data_frame_analytics;
} catch (e) {
  // Swallow the error so a failed DFA job lookup does not block the trained models response
}

for (const model of filteredModels) {
  const dfaJob = dfaJobs.find((j) => j.id === dfaJobIdMap[model.model_id]);
  model.origin_job_exists = dfaJob !== undefined;
}

return response.ok({
  body: filteredModels,
});

Also updating filterForEnabledFeatureModels to add a generic type to avoid type issues

export function filterForEnabledFeatureModels<
  T extends TrainedModelConfigResponse | estypes.MlTrainedModelConfig
>(models: T[], enabledFeatures: MlFeatures) {
  ...

Member

We've discussed this offline and can't think of a situation where adding * to each job might cause the wrong job to be matched.
I'm still not a fan of adding these * characters to work around the fact the es endpoint will throw a 404 if one job can't be found. But there's not a compelling reason to demand this change.
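
To make the foo / foo2 concern concrete, here is a minimal sketch of how a trailing wildcard can return more jobs than intended. `matchesWildcard` is a hypothetical stand-in for the ID expansion, not the Elasticsearch implementation, and the job IDs are made up. It assumes the lookup afterwards compares exact IDs, which is what keeps the extra match from marking the wrong model.

```typescript
// Hypothetical stand-in for trailing-wildcard ID expansion; not the real Elasticsearch matcher.
function matchesWildcard(pattern: string, id: string): boolean {
  return pattern.endsWith('*') ? id.startsWith(pattern.slice(0, -1)) : id === pattern;
}

// Made-up job IDs: `foo` has been deleted, `foo2` still exists.
const existingJobIds = ['foo2', 'bar'];

// A model created by `foo` asks for `foo*`, which also returns `foo2`.
const returnedJobIds = existingJobIds.filter((id) => matchesWildcard('foo*', id));

// Comparing exact IDs afterwards stops `foo2` from standing in for `foo`.
const originJobExists = returnedJobIds.some((id) => id === 'foo');
```

With an exact comparison the wildcard only costs a slightly larger response; without it, the model created by `foo` would wrongly be reported as having an existing job.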

} = useMlKibana();

const handleClick = async () => {
if (item.metadata?.analytics_config === undefined) return;
Contributor

Nit: Could this check use a type guard so we can avoid the `as` cast later on? (I saw these lines were copied as is, but it may be an easy fix.)
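
For illustration, a type guard along these lines would let the compiler narrow the item instead of relying on a cast. The `ModelItem` shape and the names `isDfaModelItem` and `getOriginJobId` are hypothetical simplifications, not the types or helpers used in the plugin.

```typescript
// Simplified, hypothetical model shape; the real Kibana type has many more fields.
interface ModelItem {
  model_id: string;
  metadata?: { analytics_config?: { id: string } };
}

type DfaModelItem = ModelItem & { metadata: { analytics_config: { id: string } } };

// Type guard: narrows the item so `analytics_config` is known to be defined.
function isDfaModelItem(item: ModelItem): item is DfaModelItem {
  return item.metadata?.analytics_config !== undefined;
}

function getOriginJobId(item: ModelItem): string | undefined {
  if (!isDfaModelItem(item)) return undefined;
  // No `as` cast needed here: the guard has already narrowed the type.
  return item.metadata.analytics_config.id;
}
```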

});

jobs.forEach(({ id }) => {
const model = result.find(
Member

We need to explicitly set to true or false, not just true for all dfa models.

Suggested change
- const model = result.find(
+ if (m?.analytics_config?.id !== undefined) {
+   // if this is a dfa model, set origin_job_exists
+   model.origin_job_exists = result.find((m) => id === m.analytics_config.id) !== undefined;
+ }

Contributor Author

This would only set it to true if a job id returned from the check matched the job id on the model. But agree that we should be setting it explicitly to false instead of just not adding the property at all and relying on falsey-ness.

Contributor Author

Updated to set explicitly in 50ffcfd

Member

Exactly, yeah:
origin_job_exists === undefined -> this is not a DFA model
origin_job_exists === true -> this is a DFA model and the job exists
origin_job_exists === false -> this is a DFA model and the job no longer exists.
Falsey-ness is evil :D
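
The three states above can be sketched as a small helper. The `ModelItem` shape, the `OriginJobState` labels, and both function names are hypothetical and only illustrate why the explicit `=== true` check in the `enabled` callback is safer than relying on falsiness.

```typescript
// Hypothetical, simplified item shape for illustration.
interface ModelItem {
  model_id: string;
  origin_job_exists?: boolean;
}

type OriginJobState = 'not-dfa' | 'job-exists' | 'job-deleted';

// Maps the tri-state property onto explicit labels.
function getOriginJobState(item: ModelItem): OriginJobState {
  if (item.origin_job_exists === undefined) return 'not-dfa';
  return item.origin_job_exists ? 'job-exists' : 'job-deleted';
}

// Mirrors the `enabled` check in the diff: only an explicit `true` enables the action.
function isViewTrainingDataEnabled(item: ModelItem): boolean {
  return item.origin_job_exists === true;
}
```

Keeping `undefined` distinct from `false` means a disabled-action tooltip can be reserved for the 'job-deleted' case.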

// Swallow error to prevent blocking trained models result
}

const filteredModels = filterForEnabledFeatureModels(result, enabledFeatures);
Member

We can loop over the filteredModels rather than result to remove the need to explicitly check for enabledFeatures.dfa as all dfa jobs will have been removed if dfa is not enabled.

Contributor Author

Yep - that makes sense - updated in 50ffcfd

allow_no_match: true,
});

jobs.forEach(({ id }) => {
Member

We should be looping through models, not jobs. We need to set false for all job IDs that aren't found, and by looping through jobs only we won't know if a job hasn't been found.

Contributor Author

If the job hasn't been returned then the origin_job_exists property wouldn't be added to the model - though if we want to explicitly set it to 'false' when it's not found then I agree that looping through the models makes more sense.

I was originally thinking that it would be more efficient to just loop through the returned jobs since that would mean maybe we wouldn't need to go through all the models but happy to change.

Contributor Author

Updated to loop through the filtered models to ensure all dfa models get the origin_job_exists property set explicitly in 50ffcfd

Member

> I was originally thinking that it would be more efficient to just loop through the returned jobs

I was actually thinking that to make this more efficient we could keep a list of the dfa models when identifying them in the reduce and only loop through those later on. e.g.

const dfaModels = [];
const jobIdsString = filteredModels.reduce(
  (jobIdsStr: string, currentModel: TrainedModelConfigResponse, idx: number) => {
    if (isTrainedModelConfigResponse(currentModel)) {
      dfaModels.push(currentModel);
      ....

But it's not needed, the time difference will be milliseconds
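
As a rough sketch of the approach settled on in this thread (loop over the models, not the jobs), the following assumes the existing job IDs have already been fetched. The `Model` shape and `setOriginJobExists` are hypothetical names, and the `Set` lookup is one possible way to avoid a nested search rather than what the PR does.

```typescript
// Simplified, hypothetical model shape.
interface Model {
  model_id: string;
  metadata?: { analytics_config?: { id: string } };
  origin_job_exists?: boolean;
}

// Sets origin_job_exists explicitly to true or false on every DFA model.
// Non-DFA models are left untouched, so the property stays undefined for them.
function setOriginJobExists(models: Model[], existingJobIds: string[]): Model[] {
  const existing = new Set(existingJobIds);
  for (const model of models) {
    const jobId = model.metadata?.analytics_config?.id;
    if (jobId !== undefined) {
      model.origin_job_exists = existing.has(jobId);
    }
  }
  return models;
}
```

Looping over jobs instead could only ever set `true`, because a deleted job never appears in the response.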

const filteredModels = filterForEnabledFeatureModels(result, getEnabledFeatures());

try {
// @ts-ignore
Member

The types can be fixed by updating filterForEnabledFeatureModels to use a generic type (I suggested this in a previous comment when we were thinking of the other implementation), as result is TrainedModelConfigResponse and not estypes.MlTrainedModelConfig.

export function filterForEnabledFeatureModels<
  T extends TrainedModelConfigResponse | estypes.MlTrainedModelConfig
>(models: T[], enabledFeatures: MlFeatures) {
  ...

This also means you don't need the isTrainedModelConfigResponse type guard later on.
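
A self-contained sketch of why the generic signature removes the need for `@ts-ignore`: the filter returns the same element type it was given, so the extended property remains visible. All types and the filtering rule here are hypothetical, trimmed-down stand-ins, since the body of the real function is elided in this thread.

```typescript
// Hypothetical, trimmed-down stand-ins for the real Kibana/Elasticsearch types.
interface BaseModelConfig {
  model_id: string;
  metadata?: { analytics_config?: { id: string } };
}
interface ModelConfigResponse extends BaseModelConfig {
  origin_job_exists?: boolean;
}
interface Features {
  dfa: boolean;
}

// Generic signature: callers get back the same element type they passed in.
function filterModelsByFeature<T extends BaseModelConfig>(models: T[], features: Features): T[] {
  // Simplified rule for illustration: drop DFA models when the DFA feature is off.
  return models.filter((m) => features.dfa || m.metadata?.analytics_config === undefined);
}

const responses: ModelConfigResponse[] = [
  { model_id: 'm1', metadata: { analytics_config: { id: 'job-1' } } },
  { model_id: 'm2' },
];

const kept = filterModelsByFeature(responses, { dfa: true });
for (const model of kept) {
  if (model.metadata?.analytics_config !== undefined) {
    // Compiles without @ts-ignore because `kept` is still ModelConfigResponse[].
    // The value is a placeholder; a real lookup would decide true or false.
    model.origin_job_exists = true;
  }
}
```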

Contributor Author

yep - good catch - updated in 8c7094d

@alvarezmelissa87
Contributor Author

@elasticmachine merge upstream

const jobIdsString = filteredModels.reduce((jobIdsStr, currentModel, idx) => {
let id = currentModel.metadata?.analytics_config?.id ?? '';
if (id !== '') {
id = `${idx > 0 ? ',' : ''}${id}*`;
Member

This has a bug: if idx is > 0 but no DFA jobs have been found yet, it adds a comma to the front of the string, which then causes the loading of DFA jobs to fail.
Rather than checking for idx > 0, it should check whether jobIdsStr is empty.
I know it was my suggestion to use a reduce, but I think it was bad advice, as the hard-to-read code has led to this bug. For the sake of readability, maybe we should revert to a map, filter, join.

Something like

const jobIds = filteredModels
  .map((m) => m.metadata?.analytics_config?.id)
  .filter(isDefined)
  .map((id) => `${id}*`);

if (jobIds.length) {
  const { data_frame_analytics: jobs } = await mlClient.getDataFrameAnalytics({
    id: jobIds.join(','),
    allow_no_match: true,
  });
}
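
The leading-comma bug can be reproduced in isolation. This sketch uses made-up job IDs, and the reduce is an approximation of the reviewed code rather than a copy of it. It compares the index-based separator with the map / filter / join approach suggested above.

```typescript
// Made-up data: the first model has no DFA job, the next two do.
const jobIdsPerModel: Array<string | undefined> = [undefined, 'job-a', 'job-b'];

// Approximation of the reduce under review: the separator depends on the model index.
const buggy = jobIdsPerModel.reduce<string>((acc, id, idx) => {
  return id ? `${acc}${idx > 0 ? ',' : ''}${id}*` : acc;
}, '');

// filter / map / join: separators only ever appear between IDs that exist.
const fixed = jobIdsPerModel
  .filter((id): id is string => id !== undefined)
  .map((id) => `${id}*`)
  .join(',');

console.log(buggy); // ",job-a*,job-b*" (leading comma)
console.log(fixed); // "job-a*,job-b*"
```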

Contributor Author

Agree - updated in 73216d0

@kibana-ci
Collaborator

💛 Build succeeded, but was flaky

Failed CI Steps

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| ml | 3.6MB | 3.6MB | +74.0B |

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @alvarezmelissa87

Contributor

@peteharverson peteharverson left a comment

Tested latest changes and LGTM

Member

@jgowdyelastic jgowdyelastic left a comment

LGTM

@alvarezmelissa87 alvarezmelissa87 merged commit 39af788 into elastic:main Nov 21, 2023
27 checks passed
@kibanamachine kibanamachine added the backport:skip This commit does not require backporting label Nov 21, 2023
@alvarezmelissa87 alvarezmelissa87 deleted the ml-dfa-view-training branch November 21, 2023 15:50
@szabosteve szabosteve changed the title [ML] Trained models list: disable 'View training data' action if data frame analytics job no longer exists [ML] Trained models list: disables 'View training data' action if data frame analytics job no longer exists Dec 8, 2023
Labels
backport:skip This commit does not require backporting Feature:3rd Party Models ML 3rd party models :ml release_note:enhancement v8.12.0
Development

Successfully merging this pull request may close these issues.

[ML] View training data should be disabled if DFA job no longer exists
7 participants