<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Label-the-Rodent's-Orientations-Within-Frame-Ranges" data-toc-modified-id="Label-the-Rodent's-Orientations-Within-Frame-Ranges-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Label the Rodent's Orientations Within Frame Ranges</a></span></li><li><span><a href="#Prepare-Train-Validation-Datasets" data-toc-modified-id="Prepare-Train-Validation-Datasets-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Prepare Train-Validation Datasets</a></span></li><li><span><a href="#Fit-or-Evaluate-the-Flip-Classifier-Model" data-toc-modified-id="Fit-or-Evaluate-the-Flip-Classifier-Model-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Fit or Evaluate the Flip Classifier Model</a></span></li><li><span><a href="#Correct-Extracted-Dataset-Using-Train-Flip-Classifier-Model" data-toc-modified-id="Correct-Extracted-Dataset-Using-Train-Flip-Classifier-Model-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Correct Extracted Dataset Using Train Flip Classifier Model</a></span><ul class="toc-item"><li><span><a href="#Apply-a-flip-classifier-to-correct-the-extracted-dataset" data-toc-modified-id="Apply-a-flip-classifier-to-correct-the-extracted-dataset-4.1"><span class="toc-item-num">4.1&nbsp;&nbsp;</span>Apply a flip classifier to correct the extracted dataset</a></span></li><li><span><a href="#Preview-Corrected-Sessions" data-toc-modified-id="Preview-Corrected-Sessions-4.2"><span class="toc-item-num">4.2&nbsp;&nbsp;</span>Preview Corrected Sessions</a></span></li></ul></li></ul></div>

Flip classifiers are `RandomForestClassifier` models that MoSeq2-Extract uses to ensure that the mouse is always extracted with its nose pointing to the right and its tail to the left. This notebook is a streamlined utility and guide for preparing data and training a model tailored to your specific data acquisition setup.

To use this notebook, first extract some data with MoSeq2-Extract to serve as training data for the flip classifier. About 100,000 labeled frames (`1e5`) is a good target; classifier performance anecdotally saturates around that number.

This can be an iterative process if your extractions contain many flips. On the first iteration, it is acceptable to extract the data without a flip classifier. After training a new flip classifier, you can apply it to your dataset to correct the flips before the PCA step, without re-extracting the data.

<center><img src="https://drive.google.com/uc?export=view&id=1cOwyen2Siy-_wJ1HcE0PmMUi3Lcgcwwa"></center>

## Label the Rodent's Orientations Within Frame Ranges
Use this interactive tool to build the training dataset for the flip classifier model. Select a range of frames and indicate whether the rodent is facing left or facing right. The labeled frame ranges make up your training set.

**Instructions**
- **Specify the data folder** in the `input_dir` field.
- **Specify the config file** in the `config_path` field.
- **Specify the path for the resulting model** in the `model_path` field. For example, `./flip-classifier-azure-ephys.pkl`.
- **Specify the maximum number of frames to use** in the `max_frames` field. The default value is 1e5.
- **Specify the number of tail filter iterations** in the `tail_filter_iters` field of the config file. The default value is 1.
- **Specify the kernel size of the spatial median blur filter** in the `spatial_filter_size` field of the config file. The default value is 3.
- **Run the following cell** to set the parameters and initialize the Data Labeller.
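
Before running the cell, it can help to confirm that the config file contains the keys the cell reads. The sketch below uses a hypothetical helper, `missing_config_keys`, which is not part of MoSeq2. The key list is taken from the `clean_parameters` dictionary in the setup cell; `get_strels` may need additional keys that are not checked here.

```python
# Keys that the setup cell in this notebook reads directly from config_data.
REQUIRED_KEYS = (
    'spatial_filter_size',
    'temporal_filter_size',
    'tail_filter_iters',
    'frame_dtype',
    'cable_filter_iters',
)


def missing_config_keys(config_data, required=REQUIRED_KEYS):
    """Return the required keys that are absent from a loaded config dict.

    Hypothetical helper for illustration; not part of the MoSeq2 API.
    """
    return [key for key in required if key not in config_data]


# Example with an intentionally incomplete, made-up config dictionary.
example_config = {'spatial_filter_size': [3], 'tail_filter_iters': 1}
print(missing_config_keys(example_config))
```

In practice you would pass the dictionary returned by `read_yaml(config_path)` and fix the config file if the returned list is not empty.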

In [None]:
from moseq2_app.main import flip_classifier_tool
from moseq2_extract.util import read_yaml, get_strels

input_dir = './' # Specify the data folder
config_path = './config.yaml' # Specify the config file
model_path = './flip-classifier-xx-1.pkl' # e.g. ./flip-classifier-azure-ephys.pkl

max_frames = 1e5 # max number of frames to use (performance anecdotally saturates around 1e5)

config_data = read_yaml(config_path) # load config data

strels = get_strels(config_data) # get structuring elements

clean_parameters = {'prefilter_space': config_data['spatial_filter_size'], # spatial median filter kernel sizes
                    'prefilter_time': config_data['temporal_filter_size'], # temporal filter kernel sizes
                    'strel_tail': strels['strel_tail'], # structuring element for filtering the tail
                    'iters_tail': config_data['tail_filter_iters'], # number of morphological opening iterations to filter the tail
                    'frame_dtype': config_data['frame_dtype'], # frame dtype
                    'strel_min': strels['strel_min'], # structuring element for erosion
                    'iters_min': config_data['cable_filter_iters']} # number of erosion iterations

continuous_slider_update = True # update the view as the slider values are updated
launch_gui = True # launches the frame selector gui

FF = flip_classifier_tool(input_dir=input_dir,
                          output_file=model_path,
                          max_frames=max_frames,
                          clean_parameters=clean_parameters,
                          continuous_slider_update=continuous_slider_update,
                          launch_gui=launch_gui)

**Instructions:**
- **Run the following cell** to launch the Data Labeller GUI.
- **Select the target session from the dropdown menu** and start labeling.
- **Drag the slider** to select a frame index to preview.
- **Click `Start Range`** to start selecting a range. **Drag the slider** to the end of the range. **Click `Facing Left` or `Facing Right`** to specify the correct orientation for the range of frames. After you specify the orientation, the selected frames are added to the dataset used to train the model.
- **Click `Cancel Select`** to cancel the selection.

**Note**: The `Current Total Selected` section turns green when there are enough labeled frames to train the model. If your frame selection was interrupted for any reason, and you would like to relaunch the tool with all of your previously selected frame ranges, uncomment the code in the following cell and run the cell.

If two selected frame ranges overlap, the training set includes each selected frame index only once; duplicates are removed.
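
The de-duplication of overlapping ranges can be sketched with NumPy as follows. This is only an illustration, not the tool's actual implementation: `unique_frame_indices` is a hypothetical name, and treating each range as an inclusive `(start, stop)` pair is an assumption.

```python
import numpy as np


def unique_frame_indices(ranges):
    """Merge (start, stop) frame ranges into sorted, unique frame indices.

    Assumes both endpoints are inclusive; hypothetical helper for illustration.
    """
    if not ranges:
        return np.array([], dtype=int)
    indices = np.concatenate([np.arange(start, stop + 1) for start, stop in ranges])
    return np.unique(indices)  # np.unique sorts and drops repeated indices


# Two overlapping selections: frames 3 and 4 appear in both ranges.
selected = [(0, 4), (3, 7)]
frames = unique_frame_indices(selected)
print(frames)
```

Frames 3 and 4 are counted once, so the two selections contribute eight frames in total rather than ten.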


In [None]:
FF.interactive_launch_frame_selector()

## Prepare Train-Validation Datasets
This cell splits your dataset into train/validation sets and displays images in the datasets.

Upon completion, the cell plots a 2x2 grid. The left column contains correctly oriented examples of the data. The right column contains incorrectly oriented examples. The bottom row contains the y-axis flipped versions of the top row.

Ensure that only the plotted frames in the __left__ column show the rodent pointing to the right.
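
To make the grid layout concrete, here is a minimal NumPy sketch of the four panels. It assumes that the y-axis flip is an up-down mirror (`np.flipud`), which is the only reading under which the bottom-left panel still faces right, and that incorrect examples are left-right mirrors of correct ones. The function name `orientation_examples` and the toy frame are illustrative only.

```python
import numpy as np


def orientation_examples(frame):
    """Return a 2x2 nested list of panels: left column correct, right column incorrect.

    The bottom row holds the up-down mirrored versions of the top row.
    """
    correct = frame
    incorrect = np.fliplr(frame)  # left-right mirror: the rodent now faces left
    return [[correct, incorrect],
            [np.flipud(correct), np.flipud(incorrect)]]


# Toy 3x4 "frame": the single bright pixel stands in for the nose, on the right side.
toy = np.zeros((3, 4), dtype=np.uint8)
toy[0, 3] = 255

grid = orientation_examples(toy)
for row in grid:
    # Column index of the bright pixel in each panel (3 = right edge, 0 = left edge).
    print([int(np.argmax(panel.max(axis=0))) for panel in row])
```

Both rows report the bright pixel on the right edge in the left column and on the left edge in the right column, which matches the check described above.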

**Instructions:**
- **Specify the percentage for the train/validation split** in `test_size`. The default value is 20, meaning 20% of the data is used as the validation dataset.
- If you want to preview the training dataset, **set `plot_examples` to `True`**.
- **Run the following cell** to split your dataset into train/validation sets to train the flip classifier.
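
As a rough illustration of what a percentage-based split does, the sketch below shuffles the sample indices and holds out `test_size` percent for validation. It is not the code that `prepare_datasets` runs; `split_indices` and its `seed` argument are hypothetical.

```python
import numpy as np


def split_indices(n_samples, test_size_percent, seed=0):
    """Shuffle sample indices and split them into train and validation index arrays."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    n_val = int(round(n_samples * test_size_percent / 100))
    return shuffled[n_val:], shuffled[:n_val]


# With 1000 labeled frames and test_size = 20, 200 frames are held out for validation.
train_idx, val_idx = split_indices(n_samples=1000, test_size_percent=20)
print(len(train_idx), len(val_idx))
```

Shuffling before splitting keeps frames from any single labeled range from landing entirely in one of the two sets.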




In [None]:
test_size = 20 # percent train/validation split
plot_examples = False # Set plot_examples to True to display the training data.

FF.prepare_datasets(test_size, plot_examples=plot_examples)

## Fit or Evaluate the Flip Classifier Model

The following cell trains a random forest classifier model on the data, determines the flip classifier's accuracy, and then saves the model to your desired output path.

**Instructions:**
- **Specify the maximum depth of the tree** in `max_depth`. Increase this value if your data includes more variability and you want to increase model complexity. Variability can arise from obstructions, different rodent sizes, larger crop sizes, etc. **Please be mindful of over-fitting**.
- **Specify the number of parallel jobs** used to run `fit()` and `predict()` in `n_jobs`.
- **Set the `train` variable to `True`** if you want to train a new model with the selected data. Otherwise, the cell only evaluates the model on the selected data.
- **Run the following cell** to fit or evaluate the flip classifier model.
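
For intuition about what `max_depth` and `n_jobs` control, here is a self-contained scikit-learn sketch on synthetic data. It is not the code inside `train_and_evaluate_model`; the toy 8x8 frames, the labels, and the flattening of each frame into a feature vector are assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic 8x8 "frames": label 1 has a bright right half, label 0 a bright left half.
n_samples = 400
labels = rng.integers(0, 2, size=n_samples)
frames = rng.normal(0.0, 0.1, size=(n_samples, 8, 8))
frames[labels == 1, :, 4:] += 1.0
frames[labels == 0, :, :4] += 1.0

X = frames.reshape(n_samples, -1)  # flatten each frame into a feature vector
X_train, X_val, y_train, y_val = train_test_split(
    X, labels, test_size=0.2, random_state=0)

# max_depth caps the depth of every tree; n_jobs sets the number of parallel workers.
model = RandomForestClassifier(max_depth=6, n_jobs=4, random_state=0)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_val, model.predict(X_val))
print(f'validation accuracy: {accuracy:.2f}')
```

Comparing training and validation accuracy as you raise `max_depth` is one simple way to watch for over-fitting: a widening gap suggests the trees are memorizing the training frames.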

In [None]:
max_depth = 6 # maximum depth of each tree
n_jobs = 4 # number of parallel jobs for fit() and predict()
verbose = 0 # levels of verbosity: [0, 1, 2]
train = True

FF.train_and_evaluate_model(n_jobs=n_jobs,
                            max_depth=max_depth,
                            verbose=verbose,
                            train=train)

## Correct Extracted Dataset Using Train Flip Classifier Model

Use a pre-trained flip classifier model to correct frames in your extracted dataset where the rodent is incorrectly flipped.
### Apply a flip classifier to correct the extracted dataset
**Instructions:**
- **Set the `write_movie` variable to `True`** if you want to write a new video with the corrected frames.
- **Set the `verbose` variable to `True`** if you want to display progress bars for each session.
- **Run this cell** to apply the trained model to correct the extracted dataset.
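
Conceptually, applying the classifier means predicting which frames are flipped and re-orienting only those frames, a chunk at a time so that a whole session does not need to be processed at once. The sketch below illustrates that idea under stated assumptions: `correct_flips_in_chunks` and `toy_predict_flipped` are hypothetical names, correcting a flip by rotating the frame 180 degrees is an assumption, and `apply_flip_classifier` may work differently internally.

```python
import numpy as np


def correct_flips_in_chunks(frames, predict_flipped, chunk_size=4000):
    """Rotate frames predicted as flipped by 180 degrees, one chunk at a time.

    `predict_flipped` takes an array of frames and returns one boolean per frame.
    """
    corrected = frames.copy()
    for start in range(0, len(frames), chunk_size):
        chunk = corrected[start:start + chunk_size]  # a view into `corrected`
        flipped = np.asarray(predict_flipped(chunk), dtype=bool)
        chunk[flipped] = np.rot90(chunk[flipped], k=2, axes=(1, 2))
    return corrected


def toy_predict_flipped(chunk):
    # Hypothetical stand-in for a trained model: a frame counts as flipped
    # when its left half is brighter than its right half.
    half = chunk.shape[2] // 2
    return chunk[:, :, :half].sum(axis=(1, 2)) > chunk[:, :, half:].sum(axis=(1, 2))


# Ten toy 4x4 frames with the "nose" (bright pixel) on the right side.
frames = np.zeros((10, 4, 4), dtype=np.uint8)
frames[:, 1, 3] = 255
frames[[2, 7]] = np.rot90(frames[[2, 7]], k=2, axes=(1, 2))  # inject two flips

fixed = correct_flips_in_chunks(frames, toy_predict_flipped, chunk_size=4)
print(toy_predict_flipped(fixed).any())
```

The input array is left untouched and a corrected copy is returned, which mirrors the notebook's goal of fixing flips without re-extracting the data.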

In [None]:
chunk_size = 4000
frame_path = 'frames'
write_movie = True
verbose = False

FF.apply_flip_classifier(chunk_size=chunk_size,
                         frame_path=frame_path,
                         write_movie=write_movie,
                         verbose=verbose)

### Preview Corrected Sessions
**Instructions:**
- **Run the following cell** to preview corrected sessions.

In [None]:
from moseq2_app.main import preview_extractions

preview_extractions(input_dir, flipped=True)

***