<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Label-Correctly-Oriented-Frame-Ranges" data-toc-modified-id="Label-Correctly-Oriented-Frame-Ranges-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Label Correctly Oriented Frame Ranges</a></span><ul class="toc-item"><li><span><a href="#Usage" data-toc-modified-id="Usage-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Usage</a></span></li></ul></li><li><span><a href="#Prepare-Train/Validation-Datasets" data-toc-modified-id="Prepare-Train/Validation-Datasets-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Prepare Train/Validation Datasets</a></span><ul class="toc-item"><li><span><a href="#Usage:" data-toc-modified-id="Usage:-2.1"><span class="toc-item-num">2.1&nbsp;&nbsp;</span>Usage:</a></span></li></ul></li><li><span><a href="#Fit-and-Evaluate-the-Flip-Classifier-Model" data-toc-modified-id="Fit-and-Evaluate-the-Flip-Classifier-Model-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Fit and Evaluate the Flip Classifier Model</a></span><ul class="toc-item"><li><span><a href="#Instructions" data-toc-modified-id="Instructions-3.1"><span class="toc-item-num">3.1&nbsp;&nbsp;</span>Instructions</a></span></li></ul></li><li><span><a href="#Correct-Extracted-Dataset-Using-Train-Flip-Classifer-Model" data-toc-modified-id="Correct-Extracted-Dataset-Using-Train-Flip-Classifer-Model-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Correct Extracted Dataset Using Train Flip Classifer Model</a></span><ul class="toc-item"><li><span><a href="#Instructions" data-toc-modified-id="Instructions-4.1"><span class="toc-item-num">4.1&nbsp;&nbsp;</span>Instructions</a></span></li></ul></li><li><span><a href="#Preview-Corrected-Sessions" data-toc-modified-id="Preview-Corrected-Sessions-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Preview Corrected Sessions</a></span></li></ul></div>

<center><h1>Flip Classifier Training Notebook</h1></center>

Flip classifiers are RandomForestClassifier models that MoSeq2-Extract uses to ensure that the mouse is always extracted with its nose pointing to the right and its tail to the left. This notebook is a streamlined utility and guide for preparing data and training a model that handles your specific data acquisition use case.

To use this notebook, you must first extract some data with MoSeq2-Extract to serve as training data for the flip classifier model. About 100,000 frames (1e5) is a good target: performance anecdotally saturates around that number.

This can be an iterative process if your extractions contain many flips. On the first iteration, it is acceptable to extract the data without a flip classifier. After training a new flip classifier, you can apply it to your dataset to correct the flips without re-extracting the data, and then continue to the PCA step.

<center><img src="https://drive.google.com/uc?export=view&id=1cOwyen2Siy-_wJ1HcE0PmMUi3Lcgcwwa"></center>

## Label Correctly Oriented Frame Ranges

Use this interactive tool to build the training dataset for the flip classifier model. Select a range of frames and identify whether the rodent is facing left or facing right. The selected frame ranges are used to build your training set.

### Usage
1. Specify the path to the input data folder in `input_dir`, and the path where the resulting flip classifier model will be saved in `model_path`.
2. Specify the maximum number of frames to use in the `max_frames` field. The default value is 1e5.
3. Specify the number of tail filter iterations in the `tail_filter_iters` field. The default value is 1.
4. Specify the kernel size of the spatial median blur filter in the `space_filter_size` field. The default value is 3.
5. Select a session from the dropdown menu.
6. Drag the slider to select a frame index to preview, or enter the frame number in the indicator to the right of the frame slider.
7. Click `Start Range` to start selecting a range, then drag the slider to the end of the range. Click `Facing Left` (when the rodent's head is facing left) or `Facing Right` (when the rodent's head is facing right) to specify the correct orientation for the range of frames. After you specify the orientation, the selected frames are added to the dataset used to train the model. **The list on the right side indicates the specified direction, session name, and selected frame range that are added to the training set.**
8. Click `Cancel Select` to cancel the selection if you are not satisfied with it.
9. Select an unwanted selection and click `Delete Selection` to remove it from the training set.

The `Current Total Selected` section turns green when there are enough labeled frames to train the model.

__Note__: If two frame ranges are selected with overlapping frames, the training set will only include the unique selected indices, removing duplicates. 
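
The deduplication described in the note can be illustrated with a minimal sketch in plain Python. The function name `unique_frame_indices`, the `(session, start, end)` tuple layout, and the choice of an exclusive `end` are assumptions made for illustration; they are not taken from the tool's code.

```python
from collections import defaultdict

def unique_frame_indices(selections):
    """Collapse (session, start, end) ranges into sorted unique indices per session.

    `end` is treated as exclusive here, which is an assumption of this sketch.
    """
    per_session = defaultdict(set)
    for session, start, end in selections:
        per_session[session].update(range(start, end))
    return {session: sorted(idx) for session, idx in per_session.items()}

# Two overlapping ranges in one session, plus a range in a second session.
selections = [
    ("session_a", 0, 100),
    ("session_a", 50, 150),
    ("session_b", 10, 20),
]
indices = unique_frame_indices(selections)
total_selected = sum(len(idx) for idx in indices.values())
print({session: len(idx) for session, idx in indices.items()}, total_selected)
```

Frames 50 to 99 of `session_a` appear in both ranges but are counted once, so overlapping selections do not inflate the number of training frames.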

In [None]:
from moseq2_app.main import flip_classifier_tool

input_dir = './' # Specify the data folder
model_path = './flip-classifier-xx-1.pkl' ## e.g. ./flip-classifier-azure-ephys.pkl

max_frames = 1e5 # max number of frames to use (performance anecdotally saturates around 1e5)
tail_filter_iters = 1 # number of tail filter iterations
space_filter_size = 3 # size of the spatial median blur filter kernel size

continuous_slider_update = True # update the view as the slider values are updated
launch_gui = True # launches the frame selector gui

FF = flip_classifier_tool(input_dir=input_dir,
                          output_file=model_path,
                          max_frames=max_frames,
                          tail_filter_iters=tail_filter_iters,
                          space_filter_size=space_filter_size,
                          continuous_slider_update=continuous_slider_update,
                          launch_gui=launch_gui)
FF.interactive_launch_frame_selector()
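
The two filter parameters in the cell above are easier to reason about with a concrete picture. The following is a minimal sketch, assuming that `space_filter_size` is the kernel size of a median filter and that `tail_filter_iters` is the number of iterations of a morphological opening used to suppress the thin tail. It uses `scipy.ndimage` on a synthetic frame; the function name `clean_frame` and the 3x3 footprint are hypothetical, and MoSeq2's own implementation may differ.

```python
import numpy as np
from scipy import ndimage

def clean_frame(frame, space_filter_size=3, tail_filter_iters=1):
    """Median-blur a depth frame, then apply a morphological opening (sketch only)."""
    out = ndimage.median_filter(frame, size=space_filter_size)
    footprint = np.ones((3, 3), dtype=bool)  # hypothetical structuring element
    # Opening = N erosions followed by N dilations; thin structures do not survive.
    for _ in range(tail_filter_iters):
        out = ndimage.grey_erosion(out, footprint=footprint)
    for _ in range(tail_filter_iters):
        out = ndimage.grey_dilation(out, footprint=footprint)
    return out

# Synthetic 20x20 frame: an 8x8 "body" with a 2-pixel-thick "tail" on its left.
frame = np.zeros((20, 20), dtype=np.uint8)
frame[6:14, 8:16] = 50
frame[10:12, 1:8] = 50

cleaned = clean_frame(frame)
print("tail pixels before:", int((frame[:, :6] > 0).sum()))
print("tail pixels after: ", int((cleaned[:, :6] > 0).sum()))
```

Under these assumptions, more opening iterations remove thicker tails but also erode more of the body outline, so small values are a reasonable starting point.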

__Note__: If your frame selection was interrupted for any reason and you would like to relaunch the tool with all of your previously selected frame ranges, uncomment the code in the following cell and run it.

In [None]:
# FF.interactive_launch_frame_selector()

## Prepare Train/Validation Datasets

This cell splits your dataset into train/validation sets. 

### Usage:
1. Run the following cell to split your dataset into train/validation sets to train the flip classifier.
2. Specify the percentage of data held out for validation in `test_size`. The default value is 20, meaning 20\% of the data is used as the validation dataset.
3. To preview the training dataset, set `plot_examples` to `True`. The left column contains correctly oriented rodent examples (rodent's nose pointing to the right), and the right column contains incorrectly oriented rodent examples (rodent's nose pointing to the left).

In [None]:
test_size = 20 # percent train/validation split
plot_examples = False # Set plot_examples to True to display the training data.

FF.prepare_datasets(test_size, plot_examples=plot_examples)
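
As a rough illustration of what a percent-based split looks like, here is a minimal NumPy sketch. It assumes that incorrectly oriented examples can be produced by rotating correctly oriented frames by 180 degrees, and the helper names `make_flip_dataset` and `split_by_percent` are hypothetical. It is not the implementation behind `prepare_datasets`.

```python
import numpy as np

def make_flip_dataset(frames):
    """Label the given frames 0 (correct) and their 180-degree rotations 1 (flipped)."""
    flipped = np.rot90(frames, k=2, axes=(1, 2))
    X = np.concatenate([frames, flipped])
    y = np.concatenate([np.zeros(len(frames), dtype=int),
                        np.ones(len(frames), dtype=int)])
    return X, y

def split_by_percent(X, y, test_size=20, seed=0):
    """Shuffle, then hold out `test_size` percent of the samples for validation."""
    order = np.random.default_rng(seed).permutation(len(X))
    n_val = int(round(len(X) * test_size / 100))
    val_idx, train_idx = order[:n_val], order[n_val:]
    return X[train_idx], X[val_idx], y[train_idx], y[val_idx]

# Random arrays stand in for the labeled, correctly oriented frames.
frames = np.random.default_rng(1).random((100, 8, 8))
X, y = make_flip_dataset(frames)
X_train, X_val, y_train, y_val = split_by_percent(X, y, test_size=20)
print(X_train.shape, X_val.shape)
```

Note that `test_size` is a percentage here, matching the cell above, whereas many libraries expect a fraction such as 0.2.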

## Fit and Evaluate the Flip Classifier Model
The following cell trains a random forest classifier model on the data, determines the flip classifier's accuracy, and then saves the model to your desired output path.

### Instructions

1. Specify the maximum depth of the trees in `max_depth`. The default value is 6. Increase this value if your data includes larger amounts of variability and you want to increase model complexity. Variability can arise from obstructions, different rodent sizes, larger crop sizes, etc. **Please be mindful of over-fitting**.
2. Specify the number of parallel jobs used to run `fit()` and `predict()` in `n_jobs`. The default value is 4.
3. Set the `train` variable to `True` to train a new model with the selected data. Otherwise, the cell only evaluates the model on the selected data.

In [None]:
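max_depth = 6 # maximum depth of each tree in the random forest
n_jobs = 4 # number of parallel jobs used by fit() and predict()
verbose = 0 # levels of verbosity: [0, 1, 2]
train = True # True trains a new model; False only evaluates the model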

FF.train_and_evaluate_model(n_jobs=n_jobs,
                            max_depth=max_depth,
                            verbose=verbose,
                            train=train)
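
For readers who want to see how `max_depth` and `n_jobs` map onto a random forest, the following is a minimal scikit-learn sketch on synthetic data. The frame shapes, the bright "head" blob, the 180-degree rotation used to create flipped examples, and `n_estimators=50` are all assumptions for illustration; this is not the code that `train_and_evaluate_model` runs.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for cropped depth frames: noise plus a bright "head" blob on the right.
correct = rng.normal(0.0, 1.0, size=(200, 16, 16))
correct[:, 6:10, 10:14] += 3.0
flipped = np.rot90(correct, k=2, axes=(1, 2))  # assumed flip: 180-degree rotation

X = np.concatenate([correct, flipped]).reshape(400, -1)  # one flattened frame per row
y = np.concatenate([np.zeros(200, dtype=int), np.ones(200, dtype=int)])  # 1 = flipped

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=50, max_depth=6, n_jobs=4, random_state=0)
clf.fit(X_train, y_train)

train_acc = accuracy_score(y_train, clf.predict(X_train))
val_acc = accuracy_score(y_val, clf.predict(X_val))
print(f"train accuracy: {train_acc:.3f}, validation accuracy: {val_acc:.3f}")
```

Comparing the two numbers is a simple over-fitting check: if train accuracy is much higher than validation accuracy after raising `max_depth`, the extra depth is probably memorizing the training frames rather than helping.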

## Correct Extracted Dataset Using Trained Flip Classifier Model

Use a pre-trained flip classifier model to correct extractions in your dataset that may have frames where the rodent is incorrectly flipped.

### Instructions
1. Specify in `frame_path` the path where frames are stored within the h5 files. The default value is `'frames'`.
2. Set `write_movie` to `True` if you want to write new movies with the corrected frames.
3. Set `verbose` to `True` if you want to display progress bars for each session.

In [None]:
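chunk_size = 4000 # number of frames to process at a time
frame_path = 'frames' # path to the frames within each h5 file
write_movie = True # write new movies with the corrected frames
verbose = False # display progress bars for each session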

FF.apply_flip_classifier(chunk_size=chunk_size,
                         frame_path=frame_path,
                         write_movie=write_movie,
                         verbose=verbose)
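
The role of `chunk_size` can be sketched as follows: frames are processed a chunk at a time, a predictor flags the flipped ones, and only those are rotated back. This is a minimal NumPy sketch under stated assumptions. `correct_flips` and the stand-in predictor `heavier_on_left` are hypothetical, and a flip is assumed to be a 180-degree rotation. With a trained model, the predictor would be something like `clf.predict` applied to the flattened chunk.

```python
import numpy as np

def correct_flips(frames, predict_flipped, chunk_size=4000):
    """Rotate frames flagged by `predict_flipped` by 180 degrees, one chunk at a time."""
    corrected = frames.copy()
    n_flipped = 0
    for start in range(0, len(corrected), chunk_size):
        chunk = corrected[start:start + chunk_size]  # view into `corrected`
        mask = predict_flipped(chunk)                # boolean array, one entry per frame
        chunk[mask] = np.rot90(chunk[mask], k=2, axes=(1, 2))
        n_flipped += int(mask.sum())
    return corrected, n_flipped

def heavier_on_left(chunk):
    """Stand-in predictor: a frame counts as flipped if its left half holds more mass."""
    half = chunk.shape[2] // 2
    return chunk[:, :, :half].sum(axis=(1, 2)) > chunk[:, :, half:].sum(axis=(1, 2))

# Ten synthetic frames with the "head" on the right; three are deliberately flipped.
frames = np.zeros((10, 8, 8))
frames[:, 3:5, 5:7] = 1.0
frames[[2, 5, 7]] = np.rot90(frames[[2, 5, 7]], k=2, axes=(1, 2))

corrected, n_flipped = correct_flips(frames, heavier_on_left, chunk_size=4)
print("frames rotated back:", n_flipped)
```

Processing in chunks keeps memory use bounded for long sessions, because only `chunk_size` frames need to pass through the predictor at once.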

## Preview Corrected Sessions

In [None]:
from moseq2_app.main import preview_extractions

preview_extractions(input_dir, flipped=True)

***