# Overview of progress with adding multicam labeling to the DeepLabCut Napari Plugin

__Author:__ Jonas Bengt Carina Håkansson
__Date:__ 2022-09-01

## Introduction
This document covers the development done on the multicam branch of the DeepLabCut Napari Plugin during the 2022 DeepLabCut AI Residency.

## Goal
The idea behind this feature is to allow multicam labeling for reconstructing 3D trajectories of labeled bodyparts. This goal is heavily inspired by [DLTdv by Ty Hedrick](linkt.to.dltdv.on.git).

## Several viewers
Napari does not currently support multicam viewing. Grid view is possible but it treats all views as one, meaning if you zoom in on camera 1, you can no longer see camera 2.

A workaround is to use several viewers (see those orthoview plugins) and to synchronize the content between the viewers.

## ```MultiViewControls``` (in _widgets.py)
Jessy Lauer implemented this class. In essence, it is a class that allows the viewers to reference each other for syncing frame number and for drawing epipolar lines.

This class also houses the function for launching more viewers, ```_open_viewers```.

### Note on dirty hack
The first version of this class populated the ```MultiViewControls._viewers``` of the first viewer with all the viewers when the "open viewers" (name?) button was pushed to trigger the ```_open_viewers``` function. The ```MultiViewControls._viewers``` of the newly opened viewers, however, held no reference to the first viewer. This caused difficulties when, for instance, drawing epipolar lines on the new viewers: they could not find the points on the other viewers because they could not find the viewers themselves.

To solve this I used the garbage collector (should find link to Stack Overflow where I saw that) to copy the ```MultiViewControls._viewers``` of the parent viewer into the same property of the newly opened viewers.

```
        # dirty hack to get the MultiViewControls of the newly launched viewers
        # (requires `import gc`; `obj` avoids shadowing the builtin `object`)
        for obj in gc.get_objects():
            if isinstance(obj, MultiViewControls):
                obj._viewers = self._viewers
```

This works but is not pretty and another solution should be found.
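
One possible alternative, sketched here with hypothetical stand-in classes (`FakeViewer`, `Controls` and `open_viewers` are illustrations, not part of the plugin): if every controls object holds a reference to the same list, a newly opened viewer only has to be appended to that list once to become visible to all of them, without involving the garbage collector.

```python
class FakeViewer:
    """Hypothetical stand-in for a napari viewer (it only carries a title)."""

    def __init__(self, title):
        self.title = title


class Controls:
    """Hypothetical stand-in for MultiViewControls."""

    def __init__(self, viewer, shared_viewers):
        self.viewer = viewer
        # Keep a reference to the shared list, not a copy of it.
        self._viewers = shared_viewers
        self._viewers.append(viewer)


def open_viewers(n_cameras):
    """Create one controls object per camera, all sharing one viewer list."""
    shared_viewers = []
    return [
        Controls(FakeViewer(f"Camera {i + 1}"), shared_viewers)
        for i in range(n_cameras)
    ]


controls = open_viewers(3)
# Every controls object, including the first, now sees all three viewers.
titles_seen_by_last = [viewer.title for viewer in controls[-1]._viewers]
```

Whether this fits the way the plugin instantiates its widgets is an open question; the sketch only shows the sharing pattern.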

### Syncing frame number (```_update_viewers```)
(Written by Jessy Lauer)
This function updates the frame number of all viewers based on the viewer on which the frame number changed first.

It is connected to the frame-change event by
```viewer.dims.events.current_step.connect(self._update_viewers)``` in ```_open_viewers``` (both are methods of ```MultiViewControls```)
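
A minimal sketch of this kind of synchronization, including a guard that stops an update from echoing back to the viewer it came from. `FakeViewer` and `FrameSync` are hypothetical stand-ins so that the example runs without napari; this is not the code of `_update_viewers`.

```python
class FakeViewer:
    """Hypothetical stand-in: stores a frame number and notifies listeners."""

    def __init__(self, title):
        self.title = title
        self.frame = 0
        self._callbacks = []

    def connect(self, callback):
        self._callbacks.append(callback)

    def set_frame(self, frame):
        if frame == self.frame:
            return  # nothing changed, so no event is emitted
        self.frame = frame
        for callback in self._callbacks:
            callback(self)


class FrameSync:
    """Copy a frame change from the viewer that changed first to the others."""

    def __init__(self, viewers):
        self.viewers = viewers
        self._syncing = False
        for viewer in viewers:
            viewer.connect(self.on_frame_changed)

    def on_frame_changed(self, source):
        if self._syncing:
            return  # ignore events triggered by our own updates
        self._syncing = True
        try:
            for viewer in self.viewers:
                if viewer is not source:
                    viewer.set_frame(source.frame)
        finally:
            self._syncing = False


viewers = [FakeViewer(f"Camera {i + 1}") for i in range(3)]
sync = FrameSync(viewers)
viewers[1].set_frame(7)  # the user scrolls in the second viewer
```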

### Help layer (for epipolar lines)
In order to digitize multi-camera videos faster, we want epipolar lines. An epipolar line is calculated from the extrinsic camera calibrations and the image coordinates of a point in one camera. It is the line in another camera's image on which the corresponding point must lie.

![Epipolar lines](img/epilines.png)
*Illustration of epipolar lines, image from https://en.wikipedia.org/wiki/Epipolar_geometry*

To draw these in Napari, we use a shapes layer. This layer is created upon insertion of labeled points (```KeypointControls.on_insert```). Upon creation, a few things are added to its metadata, most importantly the camera calibration. Currently, the calibration (from EasyWand) has to be stored in the folder of the labeled data. It is found by scanning that folder for a file whose name contains "dltCoefs.csv".

Code snippet creating the help_layer in ```KeypointControls.on_insert```:
```
            # An empty array of shape (0, 0, 3) seems to make the layer 3-dimensional
            # (x, y, frame number); (0, 0, 2) makes a 2-dimensional layer.
            self.help_layer = self.viewer.add_shapes(np.empty((0, 0, 3)), name="Help lines")

            # Attach the extrinsic calibration to the help layer's metadata.
            if "Camera " in self.viewer.title:
                self.help_layer.metadata["viewers"] = self._multiview_controls._viewers
                # The camera number is the last integer in the viewer title.
                self.help_layer.metadata["camera_number"] = int(re.findall(r"\d+", self.viewer.title)[-1])
                self.help_layer.metadata["calibration_type"] = "unknown"
                self.help_layer.metadata["extrinsic_calibration_coefficients"] = []
                self.help_layer.metadata["point_layer"] = layer

                # Look for a DLT calibration file in the folder of the labeled data.
                for file in os.listdir(layer.metadata["root"]):
                    if "dltCoefs.csv" in file:
                        self.help_layer.metadata["calibration_type"] = "DLTdv"
                        # One column of coefficients per camera (camera numbers are 1-based).
                        self.help_layer.metadata["extrinsic_calibration_coefficients"] = pd.read_csv(
                            os.path.join(layer.metadata["root"], file), header=None
                        ).values[:, self.help_layer.metadata["camera_number"] - 1]
```

### Epipolar lines (```_initiate_ep_lines```)
At this stage, the epipolar lines are initiated by clicking the "Draw EP lines" button. This should be handled during insertion of points in the future.

Let's call the viewer on which epipolar lines are being added the "parent viewer". The viewers from which these lines are calculated we'll call "kid viewers".

Upon initiation of the epipolar lines, one line is added per body part, per frame, and per kid viewer.

So for three viewers in total (one parent and two kids), 10 frames, and 5 bodyparts, the parent would have 100 lines. On each frame, it would have two lines per body part, one from each kid.

When a bodypart is not labeled on a kid viewer, the line is still initiated, but with a length of zero ((0,0) to (0,0)). The reason for this is that we keep track of which line goes with which bodypart based on the order of the lines. In our example with five bodyparts and two kid viewers, the first five lines would correspond to bodypart 1 to 5 for kid 1. Lines 6-10 would correspond to bodypart 1 to 5 for kid 2.

When a bodypart is labeled on a kid viewer, the line is calculated as detailed in the following section.
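
The bookkeeping by line order can be written as a small index calculation. This is a sketch, not the plugin's code: `line_index` and `total_lines` are hypothetical helpers, and the assumption that frames form the outermost level of the ordering (followed by kid viewers, then bodyparts) is mine.

```python
def total_lines(n_frames, n_kids, n_bodyparts):
    """Number of lines on the parent viewer: one per frame, kid and bodypart."""
    return n_frames * n_kids * n_bodyparts


def line_index(frame, kid, bodypart, n_kids, n_bodyparts):
    """Zero-based position of a line in the flat list of lines.

    Assumed order: frames outermost, then kid viewers, then bodyparts.
    All arguments are zero-based.
    """
    return (frame * n_kids + kid) * n_bodyparts + bodypart


# The example from the text: two kids, 10 frames, 5 bodyparts.
n_lines = total_lines(n_frames=10, n_kids=2, n_bodyparts=5)
first_line_of_kid_2 = line_index(frame=0, kid=1, bodypart=0, n_kids=2, n_bodyparts=5)
```

With this layout, an unlabeled bodypart has to keep its zero-length placeholder line, since removing it would shift every later index.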

### Calculate epipolar lines (```get_epipolar_line_shape``` and ```get_epipolar_line```)
These should probably be renamed. ```get_epipolar_line_shape``` takes the image coordinates of a point on a kid viewer and passes them, together with the camera calibrations of both parent and kid (C2 for the parent and C1 for the kid, a naming convention based on DLTdv), to ```get_epipolar_line```. The latter is ported completely from the corresponding function in DLTdv. It calculates the intercept and slope of the resulting epipolar line and returns them to ```get_epipolar_line_shape```, which uses them to construct the line as two points. Depending on the slope, the line starts at the top, bottom, or left edge of the canvas in the parent viewer and ends at the top, bottom, or right edge.
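
To illustrate the geometry, here is a self-contained sketch that is not the DLTdv port used in the plugin. It assumes the standard 11-coefficient DLT model without lens distortion, picks two 3D points on the ray through the kid's image point, projects both into the parent camera, and takes the line through the two projections. All function names and the example coefficients are hypothetical.

```python
def project(coefs, point):
    """Project a 3D point to image coordinates with 11 DLT coefficients."""
    l1, l2, l3, l4, l5, l6, l7, l8, l9, l10, l11 = coefs
    x, y, z = point
    w = l9 * x + l10 * y + l11 * z + 1
    return (l1 * x + l2 * y + l3 * z + l4) / w, (l5 * x + l6 * y + l7 * z + l8) / w


def ray_point(coefs, u, v, z):
    """3D point with the given Z on the ray through image point (u, v)."""
    l1, l2, l3, l4, l5, l6, l7, l8, l9, l10, l11 = coefs
    # The DLT equations rearranged into a 2x2 linear system in X and Y.
    a11, a12 = l1 - u * l9, l2 - u * l10
    a21, a22 = l5 - v * l9, l6 - v * l10
    b1 = u * (l11 * z + 1) - l3 * z - l4
    b2 = v * (l11 * z + 1) - l7 * z - l8
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("degenerate case: cannot solve for X and Y at fixed Z")
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det, z


def epipolar_line(coefs_kid, coefs_parent, u, v):
    """Slope and intercept, in the parent image, for point (u, v) in the kid."""
    ua, va = project(coefs_parent, ray_point(coefs_kid, u, v, 0.0))
    ub, vb = project(coefs_parent, ray_point(coefs_kid, u, v, 1.0))
    if ub == ua:
        raise ValueError("vertical line: not representable as slope/intercept")
    slope = (vb - va) / (ub - ua)
    return slope, va - slope * ua


def clip_line_to_canvas(slope, intercept, width, height):
    """Two points where the line crosses the canvas border, or None."""
    hits = []
    for x in (0.0, float(width)):  # left and right edges
        y = slope * x + intercept
        if 0 <= y <= height:
            hits.append((x, y))
    if slope != 0:
        for y in (0.0, float(height)):  # top and bottom edges
            x = (y - intercept) / slope
            if 0 < x < width:  # strict, so corners are not counted twice
                hits.append((x, y))
    hits.sort()
    return (hits[0], hits[-1]) if len(hits) >= 2 else None


# Made-up coefficients: the kid looks along Z, the parent along X.
kid = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0.1]
parent = [0, 0, 1, 0, 0, 1, 0, 0, 0.1, 0, 0]
point_3d = (2.0, 3.0, 4.0)

u_kid, v_kid = project(kid, point_3d)
slope, intercept = epipolar_line(kid, parent, u_kid, v_kid)
u_parent, v_parent = project(parent, point_3d)  # should lie on the line
segment = clip_line_to_canvas(slope, intercept, width=10, height=10)
```

The slope/intercept form cannot describe a vertical line, which is why the sketch raises an error in that case; a two-point representation would avoid the special case.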

### Update epipolar lines when point changes (```_update_viewers_data```)
If we change a point in one viewer, we want the corresponding epipolar lines to change in the other viewers.
This is handled by ```_update_viewers_data```. Most of the code in this function is concerned with deducing which point was actually changed, i.e. which frame number and which bodypart (bodypart number). Once that is figured out (based on which point is selected), the corresponding epipolar lines are easily found thanks to the logical order in which they are stored, and the relevant lines are recalculated as described in the section above.

Note that in order for Napari to detect the change to the shapes layer (i.e. that a point has changed), all the data of the layer is first copied to a temporary variable, the relevant line is changed, and the temporary variable is then passed back to the data of the shapes layer housing the lines.
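
The copy, modify, reassign pattern can be shown with a stand-in class. `FakeShapesLayer` is hypothetical and only mimics a layer that reacts when its `data` attribute is assigned; it is not napari's implementation.

```python
class FakeShapesLayer:
    """Hypothetical stand-in: refreshes only when `data` is assigned."""

    def __init__(self, data):
        self._data = data
        self.refresh_count = 0

    @property
    def data(self):
        return self._data

    @data.setter
    def data(self, value):
        self._data = value
        self.refresh_count += 1  # stands in for the redraw / change event


def replace_line(layer, index, new_line):
    """Change one line so that the layer notices the change."""
    lines = list(layer.data)  # copy all lines to a temporary variable
    lines[index] = new_line   # change only the relevant line
    layer.data = lines        # assigning back goes through the setter


layer = FakeShapesLayer([[(0, 0), (0, 0)], [(0, 0), (0, 0)]])

layer.data[0] = [(0, 5), (10, 8)]  # in-place edit: the setter is bypassed
refreshes_after_in_place_edit = layer.refresh_count

replace_line(layer, 1, [(0, 2), (10, 4)])
refreshes_after_reassignment = layer.refresh_count
```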

## Conclusion
These two main features, the ability to show several camera views at once and to calculate epipolar lines based on point image coordinates, are a solid first step to creating a multicam DeepLabCut compatible version of the DeepLabCut Napari Plugin. There is, however, much work to be done moving forward.

A few things that come to mind:
- Where should the calibration be housed? We cannot assume that every labeled group of videos (i.e. videos depicting the same event) shares the same extrinsic camera calibration. One DLC project might span videos from several sites. We need a system for this. Housing a copy of the relevant calibration in each labeled data folder is not terrible, since the files are rather small. It just feels like a bit of a brutal solution, although, to be fair, brutalism has its elegance too.
- Handle videos with different numbers of frames. It is conceivable that users will want to analyze recordings in which the animal is out of view for portions of a subset of the videos. This can lead to a mismatch between the frame numbers present in the different videos. It can be solved by matching frames across videos based on file names, as the filenames of extracted frames typically include the frame number.
- Cropped videos. For videos where the animal only takes up a small portion of the total image, users might want to crop the video before labeling. In my work, this is often done to save space and to improve performance when, for instance, labeling frames via a remote connection. The cropping information, i.e. the position of the relevant corner of the cropping rectangle, is needed for the calibration of the camera to make sense. This should be an easy fix: I already use something similar in my work and would only need to port some straightforward MATLAB code for the epipolar lines to work on cropped frames.
- Camera distortion. The methods I've implemented for the epipolar lines do not consider distortion of the images. This will become relevant for wider lenses. There are methods for this in Ty Hedrick's DLTdv and porting them should not be a big task.
- Controls. Currently the controls for selecting frame, current bodypart, and so on, are all housed separately on each viewer. This should be changed so that only one viewer has the controls, or so that they are housed in a freely floating window. Look at DLTdv for inspiration.
- Launch viewers and populate epipolar layer. Ideally, there should be a way to launch the right number of viewers and populate the epipolar line layers immediately. This highlights a need for grouping videos into bouts. More on this in the next point.
- Video grouping. From personal experience, we unfortunately cannot assume that the videos that go together share the trunk of the filename (like "event12_camera1, event12_camera2"). Sometimes videos depicting the same event will have different video numbers due to mishaps during recording. This could be handled by some type of metadata file in each video folder in labeled-data (a good place to put calibrations and cropping info too).
- Launch viewers and populate epipolar layer, continued. If we indicate which videos go together via metadata, we could modify the reader so that, when this metadata is encountered and points to related videos, Napari asks the user whether they want to open all related videos. Upon clicking yes, the appropriate number of viewers could be launched and supplied with the right data, after which populating the epipolar lines would be easy.
- With more than two cameras, epipolar lines should be replaced with the predicted position on the image based on reprojection. At least this should be an option. Again, see how DLTdv handles this.
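
The frame-matching idea from the list above could look roughly like the following sketch. The helper names, the folder keys and the file names are made up for illustration, and keeping only the frames that exist for every camera is just one possible policy.

```python
import re
from pathlib import PurePath


def frame_number(filename):
    """Last run of digits in the file name (extension excluded), or None."""
    numbers = re.findall(r"\d+", PurePath(filename).stem)
    return int(numbers[-1]) if numbers else None


def match_frames(files_per_camera):
    """Map frame number -> {camera: filename} for frames present in all cameras."""
    by_camera = {
        camera: {frame_number(name): name for name in files}
        for camera, files in files_per_camera.items()
    }
    common = set.intersection(*(set(frames) for frames in by_camera.values()))
    common.discard(None)  # files without any number cannot be matched
    return {
        number: {camera: frames[number] for camera, frames in by_camera.items()}
        for number in sorted(common)
    }


# Hypothetical extracted frames; the animal is out of view in some of them.
files = {
    "camera1": ["img0001.png", "img0002.png", "img0005.png"],
    "camera2": ["img0002.png", "img0005.png", "img0009.png"],
}
matched = match_frames(files)
```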
