
[ML] Allow users to annotate ML anomaly results #33376

Closed
droberts195 opened this issue Sep 4, 2018 · 16 comments

@droberts195
Contributor

droberts195 commented Sep 4, 2018

It would be nice if a user who knows the reason for an interesting feature in the ML results were able to annotate it.

To enable this we could add a new annotation result type, similar to this:

    {
      "job_id": "it-ops-metrics",
      "result_type": "annotation",
      "timestamp": 1454944200000,
      "end_timestamp": 1454946000000,
      "annotation": "Datacenter was isolated for failover testing",
      // The following is optional
      "detector_index": 1
    }

By making the detector index optional, the annotation can apply to either the whole job or just a specific detector.

Unlike other ML results, which have timestamp and bucket_span, annotations have timestamp and end_timestamp so that an annotation can span an arbitrary time period.

Originally it was thought that the same functionality could be used by the ML C++ code to add reasons why it created results, but the current thinking is that it is better to have separate functionality for the two use cases, hence elastic/ml-cpp#197 has been raised to discuss labelling by the C++ code.

@elasticmachine
Collaborator

Pinging @elastic/ml-core

@tveasey
Contributor

tveasey commented Sep 4, 2018

We should think about structured input the user can provide here as well. The simplest thing would be something to the effect of "this is important" or "this is not important". It is much harder to infer things from free-text input if we start using this to provide supervision to our models (although having both is no bad thing).
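As a sketch of what structured input could look like, a machine-readable field could sit alongside the free text; the `importance` field name and its values here are hypothetical, not a decided format:

```json
{
  "job_id": "it-ops-metrics",
  "result_type": "annotation",
  "timestamp": 1454944200000,
  "end_timestamp": 1454946000000,
  "annotation": "Datacenter was isolated for failover testing",
  "importance": "not_important"
}
```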

@richcollier
Contributor

Maybe this is also a solution for this ER? https://github.com/elastic/enhancements/issues/3322

@droberts195
Contributor Author

We should think about structured input the user can provide here as well

We talked about this feature in the roadmap discussion in Dublin and decided that initially this will just be annotations by humans for human consumption.

If we ever add user feedback to provide supervision to our models then we can either extend the format or add a new result type to store this feedback separately.

@droberts195
Contributor Author

droberts195 commented Nov 6, 2018

Some possibilities based on a discussion with @peteharverson are:

  1. Allow global annotations, i.e. no job_id field
  2. Allow annotations that apply to multiple jobs, i.e. job_id is an array with multiple values
  3. Allow annotations that apply to a job group, i.e. allow job_group to be specified instead of job_id

If we allow groups we need to decide how and when they'll be expanded to specific job IDs.

This makes the deletion logic quite complicated: annotations should be deleted once every job they relate to has been deleted, but the wide variety of ways of specifying which jobs an annotation relates to makes this determination complex.
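A minimal sketch of the deletion-eligibility check, assuming hypothetical `job_ids` and `job_group` fields for the multi-job and group options above, and honouring the `is_persistent` flag discussed further down the thread:

```python
def annotation_is_deletable(annotation, existing_job_ids, group_members):
    """Return True if every job the annotation refers to has been deleted.

    `annotation` is a dict using the hypothetical fields discussed above:
    - no job reference at all  -> global annotation, never auto-deleted
    - "job_ids": [...]         -> deletable once all listed jobs are gone
    - "job_group": "..."       -> deletable once the group has no members left
    `group_members` maps a group name to the job IDs currently in it.
    """
    if annotation.get("is_persistent"):
        return False  # persistent annotations survive job deletion
    if "job_ids" in annotation:
        return not any(j in existing_job_ids for j in annotation["job_ids"])
    if "job_group" in annotation:
        return not group_members.get(annotation["job_group"])
    return False  # global annotation
```

The group case illustrates the timing question above: checking group membership at deletion time is only possible if groups are expanded lazily rather than at annotation-creation time.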

@sophiec20
Contributor

I wonder if detector_index might also be an optional one?

@droberts195
Contributor Author

I wonder if detector_index might also be an optional one?

We discussed this and decided that detector_index will be the only way to scope an annotation below the job level, rather than by/partition field values.

@droberts195
Contributor Author

We also decided that it would be useful to have persistent annotations that don't get deleted even if all the jobs they relate to are deleted. This would be useful when repeatedly deleting and recreating a job to try out slightly different configurations.

So there should be an is_persistent boolean field in the annotations too.
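Extending the example from the issue description, a persistent annotation might then look like this (the `is_persistent` field name is as proposed above, the rest is the earlier illustrative format):

```json
{
  "job_id": "it-ops-metrics",
  "result_type": "annotation",
  "timestamp": 1454944200000,
  "end_timestamp": 1454946000000,
  "annotation": "Datacenter was isolated for failover testing",
  "is_persistent": true
}
```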

@droberts195
Contributor Author

The other thing we discussed is editing annotations. It will be incredibly frustrating for users if they are not allowed to edit annotations, for example, after making a typo.

This leads on to the question of who can edit which annotations. Ideally there would be object level security such that each user could only edit annotations they added, unless the owning user opened up the permissions to more users. However, we do not currently have object level security. (The same problem applies to job configs for example.)

We should not introduce a form of object level security purely for annotations. Therefore we decided that if you can edit annotations you can edit any annotations.

There is also the question of whether creating and editing annotations should be an admin or user level privilege. Most write actions require admin privileges. However, that would cripple annotations as it's likely they'll be most useful to people looking at the results, and these people do not require admin privileges. Therefore we decided that the machine_learning_user role should allow adding and editing of annotations. (The actions could still be admin actions, but the machine_learning_user role definition should incorporate them. Alternatively the actions could be "monitor" actions, in which case no changes would be required to the definition of machine_learning_user. This is a low level implementation decision that can be deferred.)
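For illustration only, the grant could be expressed as an index-privilege fragment in the Elasticsearch security role format; the index name is a sketch that assumes a dedicated annotations index:

```json
{
  "indices": [
    {
      "names": [ ".ml-annotations*" ],
      "privileges": [ "read", "write" ]
    }
  ]
}
```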

@walterra

walterra commented Nov 16, 2018

Do we want to support additional metadata similar to timestamp/end_timestamp, for example a value range? This would allow annotations based on time and value like in this demo:

[screenshot "annotate-01": demo of an annotation spanning both a time range and a value range]

@droberts195
Contributor Author

for example a value range

I think this only makes sense for an annotation tied to a single job and single detector_index. Therefore the value range would have to be optional, and the UI would have to be capable of rendering annotations that didn't have a value range.

I'm not sure the extra complexity is worth it, but happy to go with other opinions on this if others think it is worthwhile.
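If we did support it, the range could be two optional fields on the document; `value_min`/`value_max` are hypothetical names for illustration:

```json
{
  "job_id": "it-ops-metrics",
  "result_type": "annotation",
  "timestamp": 1454944200000,
  "end_timestamp": 1454946000000,
  "annotation": "Spike during failover testing",
  "detector_index": 1,
  "value_min": 120.0,
  "value_max": 340.5
}
```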

@droberts195
Contributor Author

droberts195 commented Nov 16, 2018

One more problem that emerged is which index to store annotations in when they apply to multiple jobs. The original description of this issue suggested simply a "new result type", but of course different jobs could be using different results indices, so which index should store an annotation that applies to multiple jobs?

Some options are:

  1. Annotations always go in .ml-anomalies-shared no matter which job(s) they relate to
  2. Annotations are considered metadata rather than results, so go in .ml-meta
  3. We have a completely new index, .ml-annotations for annotations

The third option is nice because it makes it easier to grant write access to annotations to the machine_learning_user role without giving it write access to any other type of ML internal data.

@walterra

  • The current state of discussion is that APIs will be exposed via Kibana only; everything necessary to view/add/edit annotations from within the UI should be doable via Kibana without requiring any custom endpoints on the ML Java plugin side.
  • However, automatically generated annotations might be done via the Java plugin; it's just that this functionality is internal only and doesn't need any API endpoints.
  • One thing to consider though: if we store annotations in a custom index like .ml-annotations with special requirements, then at some point an index template needs to be set up, probably via the Elasticsearch plugin.
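A rough sketch of what such an index template could contain; the field names follow the annotation format discussed above, while the mapping types are illustrative and the exact template syntax depends on the ES version:

```json
{
  "index_patterns": [ ".ml-annotations*" ],
  "mappings": {
    "properties": {
      "timestamp": { "type": "date" },
      "end_timestamp": { "type": "date" },
      "annotation": { "type": "text" },
      "job_id": { "type": "keyword" },
      "detector_index": { "type": "integer" },
      "is_persistent": { "type": "boolean" }
    }
  }
}
```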

@droberts195
Contributor Author

  • One thing to consider though: if we store annotations in a custom index like .ml-annotations with special requirements, then at some point an index template needs to be set up, probably via the Elasticsearch plugin.

Yes, we'll definitely add this index in the ES code. We'll also need to add some sort of cleanup code on the ES side.
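A sketch of such cleanup as a delete-by-query request body, built in Python purely for illustration; the field names match the hypothetical annotation format above, and persistent annotations are spared:

```python
def annotation_cleanup_query(deleted_job_id):
    """Build a delete-by-query body that removes non-persistent annotations
    referring to `deleted_job_id`. Field names are illustrative, and a real
    implementation would also handle multi-job and group annotations."""
    return {
        "query": {
            "bool": {
                "filter": [{"term": {"job_id": deleted_job_id}}],
                "must_not": [{"term": {"is_persistent": True}}],
            }
        }
    }
```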

@walterra

FYI, I wrote up some notes on how TypeScript can help us with these kinds of format specifications: https://gist.github.com/walterra/1dd998cf8d14e19f7dd90a98e4d4eb71#file-typescript_interface-md

droberts195 added a commit to droberts195/elasticsearch that referenced this issue Dec 17, 2018
The ML UI now provides the ability for users to annotate
time periods with arbitrary text to add insight to what
happened.

This change makes the backend create the index for these
annotations, together with read and write aliases to
make future upgrades possible without adding complexity
to the UI.

It also adds read and write permission to the index for
all ML users (not just admins).

The spec for the index is in
https://github.com/elastic/kibana/pull/26034/files#diff-c5c6ac3dbb0e7c91b6d127aa06121b2cR7

Relates elastic#33376
Relates elastic/kibana#26034
droberts195 added a commit that referenced this issue Dec 18, 2018
droberts195 added a commit that referenced this issue Dec 18, 2018
@droberts195
Contributor Author

The first cut of annotations is in 6.6. Separate issues can be raised for future enhancements.
