Annotations and Selections

Jean-Luc Stevens edited this page Jun 10, 2021 · 2 revisions

This page contains some notes that point towards a unified architecture for linked selections and annotators. There is no concrete plan yet, though there are some interesting ideas:

Core insight

When using linked selections, you have to define a data selection which is then linked across plots. Data annotators also involve defining regions-of-interest (ROIs) in a multi-dimensional space, but instead of executing a selection, the goal is to associate some metadata with one or more user-defined ROIs.

Both linked selections and generalized data annotators require an interactive way of defining an ROI. Both systems also need to display these ROIs to the user (these are called 'indicators' in the case of linked selections). In the case of linked selections, the defined ROI turns into a selection (dim) expression which defines the data contained within the ROI.

This means that if selection expressions can both define the selected data and map to an appropriate visual indicator, annotations could be represented as selection expressions paired with metadata (e.g. tags, textual descriptions etc.). Note that these are called generalized data annotators to distinguish them from the existing polydraw/polyedit annotators (streams), which are interactive tools for defining specialized ROIs (with some metadata) rather than the more general approach used by linked selections. The freehand draw stream is also a way to define an annotation that is not an ROI, which is in turn different from a generalized freehand annotator that can work across multiple plots in screen space.
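As a minimal sketch of this idea, assume an ROI is just a closed range per dimension (a simplification of what a dim expression can express). An annotation then pairs that ROI with metadata, and the same ROI object serves both as a selection and as the source of a visual indicator. The names `RangeROI` and `Annotation` are hypothetical and are not part of HoloViews.

```python
from dataclasses import dataclass, field


@dataclass
class RangeROI:
    """Hypothetical ROI: a closed interval per named dimension."""
    ranges: dict  # e.g. {"x": (0.0, 1.0), "y": (2.0, 3.0)}

    def contains(self, record):
        # A record is inside the ROI if every constrained dimension is in range.
        return all(lo <= record[d] <= hi for d, (lo, hi) in self.ranges.items())

    def select(self, records):
        # The "selection" role: return the data contained within the ROI.
        return [r for r in records if self.contains(r)]

    def indicator(self, kdims):
        # The "indicator" role: bounds to draw on a plot with these dimensions.
        # Dimensions the ROI does not constrain are left unbounded (None).
        return {d: self.ranges.get(d) for d in kdims}


@dataclass
class Annotation:
    """Hypothetical annotation: an ROI paired with user-supplied metadata."""
    roi: RangeROI
    tags: list = field(default_factory=list)
    description: str = ""


records = [{"x": 0.5, "y": 2.5}, {"x": 2.0, "y": 2.5}]
note = Annotation(RangeROI({"x": (0.0, 1.0)}), tags=["outlier"], description="low x")

selected = note.roi.select(records)      # records falling inside the ROI
bounds = note.roi.indicator(["x", "y"])  # what a plot over x and y would draw
```

Because the ROI only constrains `x`, a plot over `x` and `y` would show a vertical band, while a plot that does not include `x` would have nothing to draw.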

New insight: For freehand, the 'ROI' is the origin/anchor point and the drawn data is part of the annotation itself (i.e. stored in the JSON blob).
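One way to picture this, as a sketch only: the anchor is stored in data space, and the drawn path travels with the annotation inside a JSON blob. Storing the path as offsets relative to the anchor is an assumption made here for illustration, and `FreehandAnnotation`, `make_freehand` and `restore_path` are hypothetical names.

```python
import json
from dataclasses import dataclass


@dataclass
class FreehandAnnotation:
    anchor: dict  # the 'ROI': a single anchor point, e.g. {"x": 3.0, "y": 1.0}
    blob: str     # JSON string holding the drawn path and any metadata


def make_freehand(path, tags):
    """Build an annotation from a list of (x, y) points.

    Assumption: the first point of the path is used as the anchor.
    """
    ax, ay = path[0]
    offsets = [[x - ax, y - ay] for x, y in path]
    blob = json.dumps({"offsets": offsets, "tags": tags})
    return FreehandAnnotation(anchor={"x": ax, "y": ay}, blob=blob)


def restore_path(annotation):
    """Rebuild the absolute path from the anchor and the stored offsets."""
    data = json.loads(annotation.blob)
    ax, ay = annotation.anchor["x"], annotation.anchor["y"]
    return [(ax + dx, ay + dy) for dx, dy in data["offsets"]]


path = [(3.0, 1.0), (4.0, 2.0), (5.0, 1.5)]
scribble = make_freehand(path, tags=["note"])
restored = restore_path(scribble)
```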

Generalized process of making annotations

  1. The user denotes one or more ROIs in multi-dimensional space (like linked selections/dim expressions).
  2. Widgets are then used to collect semantic information from the user (tags, description etc.) to associate with these ROIs.
  3. The ROIs and their metadata need to be persisted/serialized.
  4. The persisted data can then be used for new visualizations, analysis dashboards, ML tasks etc.
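The four steps above can be sketched end to end in plain Python. This is an illustration under simplifying assumptions: the ROI is a range per dimension, a plain dict stands in for the values collected from widgets, and JSON stands in for whatever storage is eventually chosen. The helper names `in_roi` and `labels_for` are hypothetical.

```python
import json


def in_roi(record, ranges):
    # True if the record lies within every range of the ROI.
    return all(lo <= record[d] <= hi for d, (lo, hi) in ranges.items())


def labels_for(record, annotations):
    # Collect the tags of every annotation whose ROI contains the record.
    return sorted({t for a in annotations if in_roi(record, a["roi"]) for t in a["tags"]})


# 1. The user denotes an ROI (here: a box in x and y).
roi = {"x": [0.0, 1.0], "y": [0.0, 1.0]}

# 2. Metadata is collected (a dict standing in for widget values).
metadata = {"tags": ["cluster-a"], "description": "dense region near origin"}

# 3. The ROI and its metadata are serialized.
serialized = json.dumps([{"roi": roi, **metadata}])

# 4. The persisted annotations are reloaded and reused, e.g. to label data
#    for a downstream ML task.
annotations = json.loads(serialized)
records = [{"x": 0.2, "y": 0.9}, {"x": 3.0, "y": 0.5}]
labelled = [(r, labels_for(r, annotations)) for r in records]
```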

Possible components of a generalized system that may be reusable

  1. A representation of multiple ROIs/selection expressions that works across data types, HoloViews elements and dimensionalities.
  2. A way to represent/serialize/store these ROIs in a persistent format (ideally a flat format).
  3. The interactive tools used to define ROIs (including the annotator streams mentioned above, other Bokeh tools etc.).
  4. The code used to give ROIs a visual representation (i.e. mapping them to visual indicators).
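To make the "flat format" idea in item 2 concrete, here is one possible layout, offered as a sketch rather than a proposal: one row per (annotation, dimension) pair, with the metadata repeated on each row so the whole thing fits in a single CSV table. The column names and the `;` separator for tags are assumptions, and tags containing `;` would not survive the round trip.

```python
import csv
import io

FIELDS = ["annotation_id", "dimension", "low", "high", "tags", "description"]


def to_rows(annotations):
    # Flatten: one row per constrained dimension of each annotation's ROI.
    rows = []
    for i, ann in enumerate(annotations):
        for dimension, (low, high) in ann["roi"].items():
            rows.append({
                "annotation_id": i,
                "dimension": dimension,
                "low": low,
                "high": high,
                "tags": ";".join(ann["tags"]),
                "description": ann["description"],
            })
    return rows


def from_rows(rows):
    # Regroup rows by annotation_id to rebuild the nested structure.
    grouped = {}
    for row in rows:
        ann = grouped.setdefault(int(row["annotation_id"]), {
            "roi": {},
            "tags": [t for t in row["tags"].split(";") if t],
            "description": row["description"],
        })
        ann["roi"][row["dimension"]] = (float(row["low"]), float(row["high"]))
    return [grouped[k] for k in sorted(grouped)]


def dumps_csv(annotations):
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(to_rows(annotations))
    return buf.getvalue()


def loads_csv(text):
    return from_rows(csv.DictReader(io.StringIO(text)))


annotations = [
    {"roi": {"x": (0.0, 1.0), "y": (2.0, 3.0)}, "tags": ["a", "b"], "description": "box"},
    {"roi": {"t": (10.0, 20.0)}, "tags": [], "description": "time window"},
]
round_tripped = loads_csv(dumps_csv(annotations))
```

A flat table like this is easy to store in a database or dataframe, at the cost of only handling range-shaped ROIs; more general selection expressions would need a different encoding.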