<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Label-Correctly-Oriented-Frame-Ranges" data-toc-modified-id="Label-Correctly-Oriented-Frame-Ranges-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Label Correctly Oriented Frame Ranges</a></span><ul class="toc-item"><li><span><a href="#Widget-Guide" data-toc-modified-id="Widget-Guide-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Widget Guide</a></span></li><li><span><a href="#Instructions" data-toc-modified-id="Instructions-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Instructions</a></span></li></ul></li><li><span><a href="#Prepare-Train-Test-Datasets" data-toc-modified-id="Prepare-Train-Test-Datasets-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Prepare Train-Test Datasets</a></span></li><li><span><a href="#Fit-and-Evaluate-the-Flip-Classifier-Model" data-toc-modified-id="Fit-and-Evaluate-the-Flip-Classifier-Model-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Fit and Evaluate the Flip Classifier Model</a></span></li></ul></div>

<center><h1>Flip Classifier Training Notebook</h1></center>

Flip classifiers are RandomForestClassifier models that MoSeq2-Extract uses to ensure that the mouse is always extracted facing east. This notebook is a streamlined utility and guide for preparing data and training a model that handles your specific data acquisition use case.

To use this notebook, you must first extract some data with MoSeq2-Extract to serve as training data for the flip classifier model. Roughly 100K frames is a good target; performance anecdotally saturates around that size.

This can be an iterative process if your data contains many flips throughout the extractions. On your first iteration, it is acceptable to extract the data without a flip classifier. Each round of re-extracting the data (with your latest model) and training a new model should yield higher model accuracy.

<center><img src="https://drive.google.com/uc?export=view&id=11Hsw5A3mjz5hQPBhr_dxhzE4pNzTsY4T"></center>

## Label Correctly Oriented Frame Ranges

Use this interactive tool to build your training dataset for the flip classifier model. Select the frame ranges where the rodent is facing east; these ranges will be used to build your training set.

### Widget Guide
<center><img src="https://drive.google.com/uc?export=view&id=1U5wIeqWW6BOts8SiiCC7psdokaED1l5M"></center>

### Instructions
- First, use the Session Selector (1) to choose a session to label frames from.
- Use the Slider (3) to select a frame index to preview.
- To include a frame range in your training set:
   1. On the starting index, click the "Start Range" Button (4). This will start the frame range inclusion, displaying the direction selection buttons (5).
   2. Increase the slider to the desired end index, then indicate the direction the mouse is facing in the selection using either of the displayed direction selection buttons (5) to add it to the training list.
      - To cancel the selection, click the "Cancel Select" Button (4), and the range will not be added.
      - The selected frame range will appear in the box (7) next to the image preview (2) with an L/R prefix depending on which direction button was clicked.
      - When the indicator (6) turns green, you are ready to continue on to the next cells.
   3. Once all of your correct frame ranges are selected, click the "Clear Output" (8) button and continue to the next cell.
   
__Note__: If two frame ranges are selected with overlapping frames, the training set will only include the unique selected indices, removing duplicates. 
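
The de-duplication described in the note can be pictured with a small sketch. This is not the tool's internal code: the helper name `ranges_to_unique_indices` is hypothetical, and treating each range as `(start, stop)` with an exclusive `stop` is an assumption made for illustration.

```python
def ranges_to_unique_indices(ranges):
    """Expand (start, stop) frame ranges into sorted, unique frame indices.

    Hypothetical helper; `stop` is assumed to be exclusive.
    """
    indices = set()
    for start, stop in ranges:
        # A set keeps each frame index once, even when ranges overlap.
        indices.update(range(start, stop))
    return sorted(indices)


# Two overlapping selections: frames 3 and 4 appear in both.
selected_ranges = [(0, 5), (3, 8)]
unique_indices = ranges_to_unique_indices(selected_ranges)
print(unique_indices)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

Overlapping selections therefore do not give any frame extra weight in the training set.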

In [None]:
from moseq2_app.main import flip_classifier_tool

input_dir = './'
model_path = './flip-classifier-xx-1.pkl' ## e.g. ./flip-classifier-azure-ephys.pkl

max_frames = int(1e5) # max number of frames to use (performance anecdotally saturates around 1e5)
tail_filter_iters = 1 # number of tail filter iterations
space_filter_size = 3 # kernel size of the spatial median blur filter

FF = flip_classifier_tool(input_dir=input_dir,
                          output_file=model_path,
                          max_frames=max_frames,
                          tail_filter_iters=tail_filter_iters,
                          space_filter_size=space_filter_size)

## Prepare Train-Test Datasets

Split your dataset into train and test sets (X and y).

Select a percent split for the test set that is large enough to accurately evaluate the model's accuracy in the next step.

Upon completion, the cell will plot a 2x2 grid. 
 - The left column contains the correctly flipped examples of the data.
 - The right column contains the incorrect examples.
 - The bottom row contains versions of the top row flipped along the y-axis (upside down), which leaves the rodent's heading unchanged.

Ensure that only the plotted frames in the __left__ column show the rodent facing east.

In [None]:
test_size = 20 # percent split

FF.prepare_datasets(test_size)
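
For intuition, the sketch below shows one way such a dataset could be assembled from correctly oriented frames. It is an illustration under stated assumptions, not the body of `prepare_datasets`: the function `build_flip_dataset`, the label encoding (0 = facing east, 1 = flipped), and the exact flips used are all hypothetical.

```python
import numpy as np


def build_flip_dataset(frames, test_percent=20, seed=0):
    """Build train/test X and y arrays from correctly oriented frames.

    Hypothetical sketch. `frames` has shape (n_frames, height, width) and
    every frame is assumed to show the rodent facing east.
    """
    n = len(frames)
    # Mirroring left-right reverses the heading: these are the "incorrect" examples.
    mirrored = frames[:, :, ::-1]
    # Upside-down copies keep the heading and add variety to both classes.
    x = np.concatenate([frames, frames[:, ::-1, :], mirrored, mirrored[:, ::-1, :]])
    # Assumed label encoding: 0 = facing east, 1 = flipped.
    y = np.concatenate([np.zeros(2 * n, dtype=int), np.ones(2 * n, dtype=int)])
    # Flatten each frame into a feature vector.
    x = x.reshape(len(x), -1)

    # Shuffle, then hold out `test_percent` percent of the samples for testing.
    order = np.random.default_rng(seed).permutation(len(x))
    n_test = int(round(len(x) * test_percent / 100))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return x[train_idx], x[test_idx], y[train_idx], y[test_idx]


# Random arrays stand in for real depth frames.
demo_frames = np.random.default_rng(1).random((50, 8, 8))
x_train, x_test, y_train, y_test = build_flip_dataset(demo_frames, test_percent=20)
print(x_train.shape, x_test.shape)  # (160, 64) (40, 64)
```

Holding out a shuffled fraction of the samples means the accuracy reported later is measured on frames the model did not see during training.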

## Fit and Evaluate the Flip Classifier Model

The following cell will train the model with the split data, determine the flip classifier's accuracy, then save the model to your desired output path.

In [None]:
# The maximum depth of each tree in the forest. Increase this value if your data includes more variability.
# Variability can arise from obstructions, different rodent sizes, larger crop sizes, etc.
# Note: if you increase max_depth, also increase your dataset size so that the model does not overfit.
max_depth = 6 
         
n_jobs = 4 # Number of parallel jobs to run `fit()` and `predict()`
verbose = 0 # levels of verbosity: [0, 1, 2]

FF.train_and_evaluate_model(n_jobs=n_jobs,
                            max_depth=max_depth,
                            verbose=verbose)
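
To show how `max_depth`, `n_jobs`, and `verbose` map onto a scikit-learn model, here is a minimal, self-contained sketch of a fit, evaluate, and save cycle. It is not the implementation of `train_and_evaluate_model`: the synthetic frames, the label encoding (0 = facing east, 1 = flipped), the use of `joblib` for saving, and the file name `flip-classifier-demo.pkl` are assumptions for illustration only.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for cropped frames: a bright patch on the right (east) edge.
n_frames, size = 200, 16
east_facing = rng.normal(0.0, 0.1, size=(n_frames, size, size))
east_facing[:, size // 2 - 2 : size // 2 + 2, -4:] += 1.0
west_facing = east_facing[:, :, ::-1]  # left-right mirror reverses the heading

X = np.concatenate([east_facing, west_facing]).reshape(2 * n_frames, -1)
y = np.concatenate([np.zeros(n_frames, dtype=int), np.ones(n_frames, dtype=int)])

# Hold out 20 percent of the samples for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# max_depth limits each tree; n_jobs parallelizes fit() and predict().
clf = RandomForestClassifier(
    n_estimators=100, max_depth=6, n_jobs=4, verbose=0, random_state=0
)
clf.fit(X_train, y_train)

accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"Held-out accuracy: {accuracy:.3f}")

# Save the fitted model, then reload it to confirm the round trip works.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_model_path = os.path.join(tmp_dir, "flip-classifier-demo.pkl")
    joblib.dump(clf, demo_model_path)
    reloaded = joblib.load(demo_model_path)
    same_predictions = bool(np.array_equal(reloaded.predict(X_test), clf.predict(X_test)))
```

On real data, a large gap between training and held-out accuracy is a sign that `max_depth` is too high for the amount of labeled data available.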

***