Integrate user feedback into Orion #72

Open
AlexanderGeiger opened this issue Dec 17, 2019 · 1 comment
@AlexanderGeiger (Contributor)

One significant part of Orion is user interaction: users can annotate signals through MTV.
We can use these annotations to improve future anomaly detection.

A simple proposal for how this workflow could look:

  • For each signal in the signalset of the datarun:
    • run pipeline on signal and find anomalies
    • get all known events from the database that are related to the signal and have an annotation tag
    • For each known event:
      • get the aggregated signal (in intermediate outputs) from the datarun where the known event was found
      • get the shape of the sequence that was marked as anomalous in the known event
      • compare this shape to aggregated signal of current datarun using a specified method (e.g. DTW) and check if some subsequence is significantly closer than others
      • if there is a similar sequence, add an event with source 'shape matching' and a corresponding annotation tag that is similar to the tag of the original event
      • if there is any anomaly that was found in the current datarun, which overlaps with the known event, remove it from the list of found anomalies
    • add all remaining found anomalies as an event with source 'orion'
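The shape-comparison step above could be sketched with a plain-Python DTW distance and a sliding window. This is only an illustration of the idea, not Orion code; the function names (`dtw_distance`, `best_matches`) and the fixed threshold are assumptions.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic-time-warping distance
    between two 1-D sequences."""
    inf = float("inf")
    dp = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    dp[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three neighbouring alignments
            dp[i][j] = cost + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
    return dp[len(a)][len(b)]


def best_matches(signal, shape, threshold):
    """Slide the known anomalous shape over the signal and keep the
    windows whose DTW distance falls below the threshold."""
    window = len(shape)
    matches = []
    for start in range(len(signal) - window + 1):
        cost = dtw_distance(signal[start:start + window], shape)
        if cost < threshold:
            matches.append({"id": start, "cost": cost})
    return matches
```

A real implementation would likely use an optimized DTW library and derive the threshold from the distribution of window costs ("significantly closer than others") rather than a fixed constant.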

@sarahmish came up with a first skeleton of how we could implement that in Orion:

```python
from orion.explorer import OrionExplorer


class OrionFeedback:
    """Manage the annotated events for a specified signal from MTV
    and incorporate them back into Orion.

    Open question: should this inherit from OrionExplorer?
    """

    def execute_feedback(self, datarun, signal_id):
        """Main entry point for feedback.

        Loads the specified signal, fetches its known events, applies
        shape matching to each known event, then returns a resolved
        list of labels for the signal.

        Args:
            datarun: the datarun with the annotations
            signal_id: the signal in the datarun for which we execute feedback
        """
        signal = self.get_signal(signal_id)
        known_events = self.get_known_events(datarun, signal_id)

        matched_events = []
        for event in known_events:
            matched_events.extend(self.shape_matching(signal, event))

        # priority: user > orion
        #           user > shape_matching
        #           shape_matching > orion
        #
        # if same priority:
        #           user ? user
        #           shape_matching.match_score ? shape_matching.match_score
        #           keep the higher score
        #
        # or have overlap favour anomalies over normal labels generally.
        return self.overlap(matched_events + known_events)

    def shape_matching(self, signal, segment, method="dtw"):
        """Return the matches between signal and segment together with
        their similarity scores.

        Args:
            signal: the signal data
            segment: the segment we want to match
            method: the algorithm used (e.g. "dtw")
        """
        pass

    def overlap(self, shapes, window=1):
        """Return shapes after removing overlapping components, keeping
        the lower-cost (better-matching) ones.

        Args:
            shapes: a list of dicts with "id" and "cost" keys
            window: the length of each matched shape
        """
        # best matches first, so later overlapping shapes are dropped
        shapes = sorted(shapes, key=lambda x: x["cost"])
        no_overlap = []

        for first_shape in shapes:
            first_range = set(range(first_shape["id"], first_shape["id"] + window))

            overlapped = any(
                first_range.intersection(
                    range(second_shape["id"], second_shape["id"] + window))
                for second_shape in no_overlap
            )
            if not overlapped:
                no_overlap.append(first_shape)

        return no_overlap

    # helper functions
    def get_signal(self, signal):
        """Get signal from mongodb."""
        # exists in OrionExplorer
        pass

    def get_pipeline(self, pipeline):
        """Get pipeline from mongodb."""
        # exists in OrionExplorer
        pass

    def get_known_events(self, datarun, signal_id):
        """Get registered events for a particular signal in a given
        datarun from mongodb.
        """
        pass
```
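The source-priority rule noted in the comments of the skeleton (user > shape_matching > orion, ties broken by score) could be resolved like this. The event fields (`start`, `end`, `source`, `score`) are illustrative assumptions, not the actual Orion schema.

```python
# Higher number wins when two events overlap; assumed source names.
PRIORITY = {"user": 2, "shape matching": 1, "orion": 0}


def overlaps(a, b):
    """Two events overlap when their [start, end] intervals intersect."""
    return a["start"] <= b["end"] and b["start"] <= a["end"]


def resolve(events):
    """Keep, for each group of overlapping events, the one with the
    highest source priority, breaking ties by match score."""
    ordered = sorted(
        events,
        key=lambda e: (PRIORITY[e["source"]], e.get("score", 0.0)),
        reverse=True,
    )
    kept = []
    for event in ordered:
        if not any(overlaps(event, other) for other in kept):
            kept.append(event)
    return kept
```

For example, a user annotation overlapping an orion-detected anomaly would suppress the orion event, while a non-overlapping shape-matching event survives.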

Some points that should be discussed:

  • Should this class be a subclass of the OrionExplorer, since it requires much of the same functionality?
  • How do we handle cases where multiple users annotated a sequence with different labels?
  • Should we use raw or aggregated signals for the shape matching?
  • What methods can be used for the shape matching besides DTW? Based on user annotations, can we use supervised (and maybe online) Machine Learning methods for subsequence classification?
@sarahmish (Collaborator)

By sequential order:

  • the feedback class benefits a lot from the functionality of OrionExplorer (almost all the helper functions). However, in the proposed form of the class, we treat feedback merely as a general function that binds together existing methods (except shape matching). Expanding on the same point, since shape matching is also required by MTV, maybe we should make feedback part of the OrionExplorer.

  • if such a conflict occurs, we can handle the case in three ways:

    • handle each user separately without merging; this returns little information for Orion.
    • use all annotations by users and take the most severe case to be the "true" case.
    • use all annotations by users whilst having a ranking for users; e.g. if user A has a higher rank than user B and their annotations contradict each other, the feedback will use the annotation made by user A.
  • this is a valid concern in general: if we implement a shape matching algorithm on the raw data whilst displaying the aggregated version, the user could be confused. That said, applying shape matching on raw data could provide interesting results.

  • I think a basic shape matching algorithm will suffice in this phase of the feedback; eventually, when the model is executed again, it should take the pattern of the newly labelled data into consideration and mark other similar shapes with the same tag. I believe adding another model in this part would increase the error rate of the consequent model.
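The user-ranking option for contradictory annotations could be sketched as follows. The rank table, user names, and annotation fields are illustrative assumptions.

```python
# Assumed per-user ranks; higher rank wins a conflict.
USER_RANK = {"user_a": 2, "user_b": 1}


def resolve_conflicts(annotations):
    """Pick, per annotated sequence, the tag from the highest-ranked
    user, so contradictory labels collapse to a single 'true' case."""
    resolved = {}
    for ann in annotations:
        seq = ann["sequence_id"]
        best = resolved.get(seq)
        if best is None or USER_RANK[ann["user"]] > USER_RANK[best["user"]]:
            resolved[seq] = ann
    return {seq: ann["tag"] for seq, ann in resolved.items()}
```

The "most severe case" option would instead compare tags on a severity scale rather than users on a rank scale, but the merging loop would look much the same.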
