# Welcome to MoSeq2-Notebook

### Run all of the MoSeq2 tools in a self-contained notebook.

***
<center><h1>MoSeq2 Introduction</h1></center>

***

<img src="https://github.com/dattalab/moseq2-app/blob/master/media/Data_Pipeline.png?raw=true">


MoSeq2 is a software toolkit for unsupervised characterization of animal behavior. MoSeq takes depth recordings of single behaving animals as input and outputs a rich labeling of postural dynamics in terms of reused motifs, or 'syllables'. This notebook begins with compressed depth recordings (see 'Data Acquisition Overview' below) and transforms the data through the following steps:

- **Extraction**: The animal is segmented from the background and its position and heading direction are aligned across frames.
- **Dimensionality reduction**: Raw video is de-noised and transformed to low-dimensional pose trajectories using principal component analysis (PCA).
- **Model training**: Pose trajectories are modeled using an autoregressive hidden Markov model (AR-HMM), producing a sequence of syllable labels.
- **Analysis**: Model output is reported through visualization and statistical analysis.

### Resources
Below is a list of publications and links to the GitHub wikis of the individual tools for your convenience.
- Publications
    - [Mapping Sub-Second Structure in Mouse Behavior](http://datta.hms.harvard.edu/wp-content/uploads/2018/01/pub_23.pdf)
    - [The Striatum Organizes 3D Behavior via Moment-to-Moment Action Selection](http://datta.hms.harvard.edu/wp-content/uploads/2019/06/Markowitz.final_.pdf)
    - [Q&A: Understanding the composition of behavior](http://datta.hms.harvard.edu/wp-content/uploads/2019/06/Datta-QA.pdf)
- Wikis
    - [Extract](https://github.com/dattalab/moseq2-extract/wiki)
    - [PCA](https://github.com/dattalab/moseq2-pca/wiki)
    - [Model](https://github.com/dattalab/moseq2-model/wiki)
    - [Viz](https://github.com/dattalab/moseq2-viz/wiki)

## Data Acquisition Overview

MoSeq2 takes animal depth recordings as input. We have developed a [data acquisition pipeline](https://github.com/dattalab/moseq2-docs/wiki/Setup:-acquisition-software) for the Xbox Kinect depth camera. We suggest following our [data acquisition tutorial](https://github.com/dattalab/moseq2-docs/wiki/Acquisition) when making recordings.

***

### Ensure you are running the python version located in your corresponding conda environment.

Remember: The anaconda environment must be activated prior to launching this jupyter notebook in order to use the specified python version.

For example, if your anaconda environment is called moseq2, then your output would look like: ```/Users/username/anaconda3/envs/moseq2/bin/python```

In [None]:
%%bash
which python
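
The same check can also be made from inside the kernel, which is the interpreter the notebook actually uses. This is a minimal sketch: the helper `running_in_env` is hypothetical, and `moseq2` is only the example environment name used above.

```python
import os
import sys

def running_in_env(env_name, prefix=None):
    """Return True if the interpreter prefix ends with the given environment name."""
    prefix = sys.prefix if prefix is None else prefix
    return os.path.basename(os.path.normpath(prefix)) == env_name

print('Kernel interpreter:', sys.executable)
print('Inside "moseq2" environment:', running_in_env('moseq2'))
```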

***
<center><h1>Notebook Setup</h1></center>

***

Install the requirements for MoSeq2 by following the [README file](http://localhost:8888/notebooks/MoSeq2_Step_0.ipynb) (if you have not done so already). Then copy the present notebook into the same directory as your depth recordings before proceeding with the following setup steps.
<img src="https://github.com/dattalab/moseq2-app/blob/master/media/Setup_Pipeline.png?raw=true">

### Session Folder Contents

After acquiring some data, an individual session folder should contain the following files:

```
├── session_1/ **
├   ├── depth.dat **        # depth data
├   ├── depth_ts.txt **     # timestamps
└─  └── metadata.json **    # metadata
```
***

- `depth.dat`: compressed file version of depth video (to be extracted)
- `depth_ts.txt`: timestamp file used for metadata analysis (it can be used to check the dropped frame rate). Check this file to be sure that the dropped frame rate is less than 1%. The difference between neighboring timestamps should be ~30 (as in 30 milliseconds).
- `metadata.json`: JSON file containing recording session information.
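
The dropped-frame check described above can be sketched as follows. This is an illustration only: the column layout of `depth_ts.txt` is not described here, so the hypothetical helper `dropped_frame_fraction` simply takes a list of timestamps that you have already loaded, and it estimates the frame period from the median gap rather than assuming a fixed value.

```python
from statistics import median

def dropped_frame_fraction(timestamps):
    """Estimate the fraction of frames missing from a list of timestamps."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    period = median(gaps)  # typical spacing between neighboring frames
    # A gap of about two periods means one missing frame, three periods two, and so on.
    missing = sum(max(round(gap / period) - 1, 0) for gap in gaps)
    return missing / (len(timestamps) + missing)

# Synthetic timestamps (ms) with one frame missing between 66 and 132.
example = [0, 33, 66, 132, 165]
print(f'Estimated dropped frames: {dropped_frame_fraction(example):.1%}')
```

A value above 0.01 would correspond to the 1% limit mentioned above being exceeded.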

### Data file organization

To ensure that your MoSeq2-Notebook runs smoothly, we recommend the following directory structure: a single master directory contains __this notebook__ and one sub-directory for each recording session, and each sub-directory holds depth data, metadata, and optional timestamp data:

```
.
├── MoSeq2-Notebook.ipynb **
├── session_1/ **
├   ├── depth.dat        # depth data
├   ├── depth_ts.txt     # timestamps
├   └── metadata.json    # metadata
├── session_2/ **
├   ├── depth.dat
├   ├── depth_ts.txt
└── └── metadata.json
```

You can also access and analyze individual (and external) sessions using this notebook by specifying the path when prompted below.

### Ensure your session folders are found:

The following cell will prompt you to enter the path where your recorded sessions can be found. 

If you have the above recommended directory structure, then just press `ENTER` to search the default current working directory. 

It will then recursively search the directory you entered and return the number of `depth.dat` files found, along with your base working directory.
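
For reference, a recursive search of this kind can be sketched with the standard library alone. The helper `find_depth_files` is hypothetical and is not how `get_found_sessions` is necessarily implemented; it only illustrates what "recursively search" means for the directory layout above.

```python
from pathlib import Path

def find_depth_files(base_dir='.', filename='depth.dat'):
    """Recursively collect every file named `filename` under `base_dir`."""
    return sorted(Path(base_dir).rglob(filename))

depth_files = find_depth_files('.')
print('Number of depth files found:', len(depth_files))
for path in depth_files:
    print(' ', path.parent)  # the session folder that holds this recording
```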

In [None]:
from moseq2_extract.gui import get_found_sessions
from glob import glob
import os

base_dir, found_sessions = get_found_sessions()

print('Number of found sessions to analyze:', found_sessions)
print('Your base directory is:', base_dir)

In [None]:
%%bash
pwd

### Generate Configuration Files

In [None]:
import os
from moseq2_extract.gui import generate_config_command

config_filepath = os.path.join(base_dir, 'config.yaml')

print(f'generating file in path: {config_filepath}')
generate_config_command(config_filepath)

A configuration file has been created in the same directory as your notebook and session directories. The directory should now have the following contents:

```
.
├── MoSeq2-Notebook.ipynb
├── config.yaml **
├── session_1/
└── session_2/   
```

### Download a Flip File

To keep your extraction smooth and invariant to the mouse's orientation, we recommend using a flip classifier, which helps keep the mouse oriented consistently throughout the extraction.
Note: MoSeq2-Notebook currently only supports flip correction for adult male C57 mice. (You may skip this step if your mice are very different.)

In [None]:
from moseq2_extract.gui import download_flip_command

download_flip_command(base_dir, config_filepath)

***
<center><h1>Raw Data Extraction</h1></center>

***

You will use the MoSeq2-Extract module to convert your raw data files to human-readable/parseable formats such as mp4 videos, and YAML/HDF5 metadata files. These metadata files are used to then train your PCA model, while the mp4 file is primarily used to ensure that the session was extracted correctly with no defects or unwanted artifacts.

In the extraction step, begin by testing your detected ROIs with the default parameters. If all goes well, continue to the test extraction step.

The first two steps are meant to debug possible extraction errors you may encounter before performing an extraction on your full dataset.

<img src="https://github.com/dattalab/moseq2-app/blob/master/media/Extraction_Pipeline.png?raw=true">

Once testing is done, you can then proceed to extract all the session files found by your notebook.

## Pre-Extraction Data Quality Testing

Before performing a full extraction on your recordings, use the following steps to ensure your Regions of Interest (ROIs) are properly found. This will clarify what to expect after a complete extraction of your data.

### ROI Test

This test ensures that your whole background area is properly captured without any artifacts that may interfere with the mouse video extraction.

#### Configurable Parameter Descriptions
- `bg_roi_dilate`: the detected floor mask is dilated with a kernel whose (width, height) is specified by this parameter
- `bg_roi_depth_range`: min/max depth values in which the ROI detection algorithm will search for a flat surface.
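
To make the two parameters concrete, here is a small pure-Python sketch of selecting pixels within a depth range and then dilating the resulting mask. It is only an illustration under simplifying assumptions: the helpers `depth_range_mask` and `dilate` are hypothetical, the kernel is treated as a rectangle centered on each pixel, and the actual ROI detection in moseq2-extract may work differently.

```python
def depth_range_mask(depth, depth_range):
    """Mark pixels whose depth (mm) falls inside the inclusive (min, max) range."""
    lo, hi = depth_range
    return [[lo <= value <= hi for value in row] for row in depth]

def dilate(mask, kernel):
    """Dilate a boolean mask with a rectangular kernel of (width, height) pixels."""
    width, height = kernel
    rows, cols = len(mask), len(mask[0])
    half_w, half_h = width // 2, height // 2
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                # Switch on every pixel covered by the kernel centered at (r, c).
                for rr in range(max(r - half_h, 0), min(r + half_h + 1, rows)):
                    for cc in range(max(c - half_w, 0), min(c + half_w + 1, cols)):
                        out[rr][cc] = True
    return out

# Toy 5x5 depth image (mm): one pixel lies in the floor range, the rest do not.
depth = [[900] * 5 for _ in range(5)]
depth[2][2] = 700
floor = depth_range_mask(depth, (650, 750))
grown = dilate(floor, (3, 3))
print(sum(map(sum, floor)), 'pixel(s) before dilation,', sum(map(sum, grown)), 'after')
```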

#### Possible ROI Pathologies
_To be completed_

The following cell will extract the first frame, ROI, and background ROI for your reference before continuing into the extraction process.

In [None]:
import ruamel.yaml as yaml
from moseq2_extract.gui import find_roi_command

# Choose the session directory to test. With the recommended layout this would
# typically be os.path.join(base_dir, 'session_1/').
sample_testdir_in = base_dir # session directory to perform ROI testing
sample_roi_testfile = os.path.join(sample_testdir_in, 'depth.dat') # depth file to perform ROI testing on
sample_testdir_out = os.path.join(sample_testdir_in, 'sample_proc/') # directory to save roi extraction results

with open(config_filepath, 'r') as f:
    config_data = yaml.safe_load(f)

# Relevant ROI parameters you may need to configure
config_data['bg_roi_dilate'] = (10, 10) # Size of the mask dilation (to include environment walls)
config_data['bg_roi_depth_range'] = (650, 750) # Range to search for floor of arena (in mm)

with open(config_filepath, 'w') as f:
    yaml.dump(config_data, f, Dumper=yaml.RoundTripDumper)

find_roi_command(sample_roi_testfile, sample_testdir_out, config_filepath)

Once complete, you can expect the following directory structure:

```
.
├── config.yaml
├── MoSeq2-Notebook.ipynb
├── session_1/
├   ├── sample_proc/ **
├   ├   ├── bground.png & bground.tiff **
├   ├   ├── first_frame.png & first_frame.tiff **
├   ├   └── roi.png & roi.tiff ** 
├   ├── depth.dat
├   ├── depth_ts.txt
├   └── metadata.json
└── session_2/
```

Display your calculated ROI images below:

In [None]:
from IPython.display import display, Image
from glob import glob

images = glob(os.path.join(sample_testdir_out, '*.png'))
# display each image in turn (a plain loop avoids printing a list of None values)
for im in images:
    display(Image(im))

### Sample Test Extraction 

#### Configurable Parameter Descriptions
- `min_height`: The minimum height from the floor (in mm) at which the mouse can appear in your recordings. It is needed to properly estimate the minimum depth value during video construction.
- `max_height`: The maximum height from the floor (in mm) at which the mouse can appear in your recordings. It is needed to properly estimate the maximum depth value during video construction.
- `spatial_filter_size`: Kernel size of the median spatial filter applied to the raw video before extraction in order to get crisp mp4 files. Larger kernels smooth the video more strongly, so avoid setting it so high that the video loses clarity. (Must be odd.)
- `temporal_filter_size`: Kernel size of the median temporal filter applied to the raw video before extraction in order to handle frame drops or time irregularities in the compressed data. Only use it if your videos appear laggy or have a noticeable number of dropped frames. (Must be odd.)
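
The odd-size requirement follows from how a median filter works: each output sample is the median of a window centered on the corresponding input sample, which needs an equal number of neighbors on each side. The sketch below shows this for a single pixel over time. The helper `temporal_median` is hypothetical and uses only the standard library; it is not the filter moseq2-extract applies.

```python
from statistics import median

def temporal_median(values, size):
    """Median-filter a 1D sequence with an odd window, repeating the edge samples."""
    if size % 2 == 0:
        raise ValueError('filter size must be odd so the window has a center sample')
    half = size // 2
    padded = [values[0]] * half + list(values) + [values[-1]] * half
    return [median(padded[i:i + size]) for i in range(len(values))]

# One pixel's depth over time (mm), with a single-frame glitch at index 3.
trace = [20, 21, 20, 95, 21, 20, 22]
print(temporal_median(trace, 3))
```

The glitch value of 95 does not survive filtering, while the surrounding values change very little.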

#### Possible Extraction Pathologies
_To be completed_

In [None]:
from moseq2_extract.gui import sample_extract_command

sample_testfile = os.path.join(sample_testdir_in, 'depth.dat')
extract_testdir_out = 'test_proc/' # directory to save sample extraction
nframes = 200 # number of frames to extract from raw to preview

with open(config_filepath, 'r') as f:
    config_data = yaml.safe_load(f)

# Extraction parameters you may need to configure
config_data['min_height'] = 10 # Min mouse height from floor (mm)
config_data['max_height'] = 100 # Max mouse height from floor (mm)
config_data['spatial_filter_size'] = [3] # Space prefilter kernel (median filter, must be odd)
config_data['temporal_filter_size'] = [0] # Time prefilter kernel (median filter, must be odd)

with open(config_filepath, 'w') as f:
    yaml.dump(config_data, f, Dumper=yaml.RoundTripDumper)

sample_extract_command(sample_testfile, extract_testdir_out, config_filepath, nframes)

After an extraction, you can expect the following directory structure:

```
.
├── config.yaml
├── MoSeq2-Notebook.ipynb
├── session_1/
├   ├── test_proc/ **
├   ├   ├── bground.png & bground.tiff **
├   ├   ├── first_frame.png & first_frame.tiff **
├   ├   ├── results_00.mp4 **
├   ├   ├── results_00.h5 **
├   ├   ├── results_00.yaml **
├   ├   └── roi.png & roi.tiff ** 
├   ├── depth.dat
├   ├── depth_ts.txt
├   └── metadata.json
└── session_2/
```

You can view your sample extraction below:

In [None]:
from IPython.display import display, Video

vid = Video(os.path.join(sample_testdir_in, extract_testdir_out, 'results_00.mp4'))

display(vid)

If you are happy with your sample extraction, continue to extracting your full dataset. Otherwise, consider adjusting some of your ROI or extraction parameters.

## Extract Session(s)

Run the following cells to extract all of your previously found `depth.dat` files.

In [None]:
from moseq2_extract.gui import extract_found_sessions

# depth files to recursively search for that have been partially extracted or not yet extracted 
filename = 'depth.dat'

commands = extract_found_sessions(base_dir, config_filepath, filename)

This is what your directory structure should look like once the process is complete:

```
.
├── MoSeq2-Notebook.ipynb
├── config.yaml
├── session_1/
├   ...
├   └── proc/ **
├   ├   ├── roi.tiff
├   ├   ...
├   └   └── results.h5 **
└── session_2/
├   ...
├   └── proc/ **
├   ├   ├── roi.tiff
├   ├   ...
└   └   └── results.h5 **
        
```

### Aggregate your results into one folder and generate an index file.

#### Configurable Parameter Descriptions
- `recording_format`: `start_time`, `session_name`, and `subject_name` are variable names that are read from each `metadata.json` file. The resulting files are named according to this format.
- `aggregate_results_dir`: directory name that contains all of your extracted data.
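
The format string is filled in the same way as any Python format string with named fields. The sketch below uses made-up metadata values and a hypothetical helper, `aggregated_name`, purely to show how the placeholders turn into a file name.

```python
recording_format = '{start_time}_{session_name}_{subject_name}'

# Hypothetical values standing in for the fields read from one metadata.json file.
metadata = {
    'start_time': '2020-01-01T12-00-00',
    'session_name': 'session_1',
    'subject_name': 'mouse_01',
}

def aggregated_name(fmt, fields, extension):
    """Fill the format string with metadata fields and append a file extension."""
    return fmt.format(**fields) + extension

for ext in ('.h5', '.yaml', '.mp4'):
    print(aggregated_name(recording_format, metadata, ext))
```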

In [None]:
from moseq2_extract.gui import aggregate_extract_results_command

recording_format = '{start_time}_{session_name}_{subject_name}' # filename formats for the extracted data
aggregate_results_dir = 'aggregate_results/' # directory to save all metadata+extracted videos to with above respective name format
 
aggregate_extract_results_command(base_dir, recording_format, aggregate_results_dir)

This results in the following (sample) directory structure:

```
.
├── aggregate_results/ **
├   ├── session_1_results.h5 ** # session 1 metadata
├   ├── session_1_results.yaml **
├   ├── session_1_results.mp4 ** # session 1 extracted video
├   ├── session_2_results.h5 ** # session 2 metadata
├   ├── session_2_results.yaml **
├   └── session_2_results.mp4 ** # session 2 extracted video
├── config.yaml
├── moseq2-index.yaml ** # index file
├── MoSeq2-Notebook.ipynb
├── session_1/
└── session_2/
```

__Notice your index file has been generated in your base directory.__

View all of your extracted videos below:

In [None]:
from IPython.display import display, Video
from glob import glob
import os

extractions = glob(os.path.join(aggregate_results_dir, '*.mp4'))
# display each extracted video in turn
for path in extractions:
    display(Video(path))

***
<center><h1>Principal Component Analysis (PCA)</h1></center>

***

Once the data has been extracted, run a Principal Component Analysis on your metadata files (specifically the h5 files) to compute the principal components of your mouse's body. These components are subsequently used to classify its behavior with the ARHMM.

<img src="https://github.com/dattalab/moseq2-app/blob/master/media/PCA_Pipeline.png?raw=true">

## Training PCA

__Good examples of what you should expect from your PCA components and scree plot are shown below:__

<center>Components</center> | <center>Scree Plot</center>
- | - 
<img src="https://github.com/dattalab/moseq2-app/blob/master/media/Components_Ex.png?raw=true" width=400 height=400> | <img src="https://github.com/dattalab/moseq2-app/blob/master/media/Scree_Ex.png?raw=true" width=400 height=400>

#### Configurable Parameter Descriptions
- `gaussfilter_space`: Kernel size for performing a gaussian filter on your processed mouse video before performing PCA. This helps identify crisper, more informative principal components.
- `medfilter_space`: Same as gauss filter kernel but uses Median Filtering instead. (Typically use one or the other)
    - Both filters are useful when the principal components do not appear to have crisp boundaries, or are all too similar to each other to be considered reliable components.
- `missing_data`: If you have missing/dropped frames in your videos, set this to true.
- `missing_data_iters`: Number of times to iterate over missing data during PCA to fill in missing gaps appropriately.
- `recon_pcs`: Number of principal components to reconstruct from missing data.

#### Possible PCA Pathologies
_To be completed._

In [None]:
from moseq2_pca.gui import train_pca_command
import ruamel.yaml as yaml

session_dir = aggregate_results_dir # Directory to search for your extracted sessions
pca_filename = 'pca' # Name of your PCA model h5 file to be saved
pca_dirname = '_pca/' # Directory to save your computed PCA results

with open(config_filepath, 'r') as f:
    config_data = yaml.safe_load(f)

# PCA parameters you may need to configure
config_data['gaussfilter_space'] = (1.5, 1) # Spatial filter for data (Gaussian)
config_data['medfilter_space'] = [0] # Median spatial filter
config_data['recon_pcs'] = 10 # Number of PCs to use for missing data reconstruction
config_data['missing_data'] = False # Use missing data PCA
config_data['missing_data_iters'] = 10 # Number of times to iterate over missing data during PCA

with open(config_filepath, 'w') as f:
    yaml.dump(config_data, f, Dumper=yaml.RoundTripDumper)

train_pca_command(aggregate_results_dir, config_filepath, pca_dirname, pca_filename)

Once complete, you can expect your relative directory structure to look something like this:
```
.
├── _pca/ **
├   ├── pca.h5 ** # pca model compressed file
├   ├── pca.yaml  ** # pca model YAML metadata file
├   ├── pca_components.png **
├   └── pca_scree.png **
├── aggregate_results/
├── config.yaml
├── moseq2-index.yaml
├── MoSeq2-Notebook.ipynb
├── session_1/
└── session_2/

```

View your `computed PCs` and `scree plot` in the next cell.

In [None]:
from IPython.display import display, Image
images = [os.path.join(pca_dirname, 'pca_components.png'), os.path.join(pca_dirname, 'pca_scree.png')]
for im in images:
    display(Image(im))
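
A scree plot shows how much variance each successive component explains. One common way to read it is to ask how many leading components are needed to reach a chosen fraction of the total variance. The sketch below does this for made-up numbers; the helper `pcs_for_variance` and the 90% target are illustrative assumptions, not values taken from moseq2-pca.

```python
from itertools import accumulate

def pcs_for_variance(explained_variance, target=0.9):
    """Return how many leading PCs are needed to reach the target variance fraction."""
    total = sum(explained_variance)
    for n_pcs, running in enumerate(accumulate(explained_variance), start=1):
        if running / total >= target:
            return n_pcs
    return len(explained_variance)

# Made-up variances per component, sorted from largest to smallest.
variances = [50.0, 20.0, 10.0, 8.0, 6.0, 3.0, 2.0, 1.0]
print('PCs needed for 90% of the variance:', pcs_for_variance(variances))
```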

## Computing Principal Component Scores

Apply your trained PCA model to compute your PC scores, using the principal components computed in the previous step.

In [None]:
from moseq2_pca.gui import apply_pca_command

scores_filename = 'pca_scores' # name of the scores file to compute and save

apply_pca_command(base_dir, config_filepath, pca_dirname, scores_filename)

Once complete, you will have a pca_scores file saved in your pca directory. (Example shown below)
```
.
├── _pca/
├   ├── pca.h5
├   ├── pca.yaml
├   ├── pca_scores.h5  ** # scores file
├   ├── pca_components.png
├   └── pca_scree.png
├── aggregate_results/
├── config.yaml
├── moseq2-index.yaml
├── MoSeq2-Notebook.ipynb
├── session_1/
└── session_2/

```
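
Conceptually, a PC score is the projection of a mean-centered frame onto a principal component. The following is a minimal sketch with toy numbers; `pc_scores` is a hypothetical helper and not part of moseq2-pca, which operates on full extracted frames.

```python
def pc_scores(frame, mean, components):
    """Project one flattened, mean-centered frame onto each principal component."""
    centered = [value - mu for value, mu in zip(frame, mean)]
    return [sum(c * x for c, x in zip(component, centered)) for component in components]

# Toy example: a 4-pixel "frame" and two orthonormal components.
mean = [1.0, 1.0, 1.0, 1.0]
components = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
]
frame = [3.0, 1.0, 3.0, 1.0]
print(pc_scores(frame, mean, components))
```

Each frame of video is thereby reduced to one number per component, which is the low-dimensional trajectory the model is later trained on.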

## (Optional) Computing Model-Free Syllable Changepoints

This optional step helps determine model-free syllable lengths, which are general approximations of the duration of the respective body-language syllables. Computing model-free changepoints can be useful for choosing the prior on syllable duration, denoted `kappa`, in the ARHMM modeling step.

__A good example of a Changepoints Distance plot is shown below__
<img src="https://github.com/dattalab/moseq2-app/blob/master/media/CP_Ex.png?raw=true" width=400 height=400>


Measure syllable block duration distances between detected syllables using your PCA model or computed PC scores below.

__Warning: These parameters have been pre-tuned to accommodate C57 mice and similar animals. Therefore, we do not recommend changing the changepoint calculation parameters. If you decide to do so, it is at your own risk.__

#### Configurable Parameter Descriptions
- `threshold`: Value used to determine the "peak", or transition point, from one syllable to the next.
- `dims`: Number of random projections against which the computed principal components are compared in order to determine a distribution for the block durations.
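
To illustrate the role of `threshold`, the sketch below picks local peaks above a threshold in a one-dimensional change score and converts the gaps between them into block durations. Everything here is a simplification: the score is synthetic, the helpers `find_changepoints` and `block_durations` are hypothetical, the 30 frames-per-second rate is an assumption, and the random-projection step controlled by `dims` is not shown.

```python
def find_changepoints(score, threshold):
    """Return indices of local maxima in `score` that exceed `threshold`."""
    return [
        i for i in range(1, len(score) - 1)
        if score[i] > threshold and score[i - 1] < score[i] >= score[i + 1]
    ]

def block_durations(changepoints, fps=30):
    """Convert gaps between successive changepoints from frames to seconds."""
    return [(b - a) / fps for a, b in zip(changepoints, changepoints[1:])]

# Synthetic change score for 16 frames.
score = [0.1, 0.2, 0.9, 0.3, 0.1, 0.1, 0.4, 0.2, 0.1, 0.8, 0.2, 0.1, 0.1, 0.1, 0.7, 0.2]
cps = find_changepoints(score, threshold=0.5)
print('Changepoint frames:', cps)
print('Block durations (s):', block_durations(cps))
```

Raising the threshold yields fewer changepoints and therefore longer blocks; lowering it has the opposite effect.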

In [None]:
from moseq2_pca.gui import compute_changepoints_command
import ruamel.yaml as yaml
changepoints_filename = 'changepoints' # name of the changepoints images to generate
pca_dirname = '_pca/'
with open(config_filepath, 'r') as f:
    config_data = yaml.safe_load(f)

# Changepoint computation parameters you may want to configure
config_data['threshold'] = 0.5 # Peak threshold to use for changepoints
config_data['dims'] = 300 # Number of random projections to use

with open(config_filepath, 'w') as f:
    yaml.dump(config_data, f, Dumper=yaml.RoundTripDumper)

compute_changepoints_command(base_dir, config_filepath, pca_dirname, changepoints_filename)

The changepoints plot will be generated and saved in the pca directory (example below).

```
.
├── _pca/ 
├   ├── pca.h5
├   ├── pca_scores.h5
├   ...
├   └── changepoints_dist.png **
├── aggregate_results/ 
├── config.yaml
├── moseq2-index.yaml
├── MoSeq2-Notebook.ipynb
├── session_1/
└── session_2/
```

View your changepoints distance plot:

In [None]:
from IPython.display import display, Image

display(Image(os.path.join(pca_dirname, changepoints_filename+'_dist.png')))

***
<center><h1>ARHMM Modeling</h1></center>

***

In order to train your ARHMM (Auto-Regressive Hidden Markov Model), you will use your computed PC scores as your input data, and specify whether you are modeling a single experimental group for observational research, or modeling multiple different groups (e.g. control vs. experimental groups) for comparative analysis.

<img src="https://github.com/dattalab/moseq2-app/blob/master/media/Model_Pipeline.png?raw=true">

## (Optional) Specify Groups

### What are groups?

MoSeq uses groups in the `moseq2-index.yaml` file to indicate whether your collected sessions represent a single experimental group, or many different groups that you would like to compare while modeling and visualizing.

By default, all the session recordings have the same group title: `'default'`. If you do not have 2 sessions that are different enough to separate into different groups for later comparison, you can skip this step.

Otherwise, there are 3 ways you are able to specify your groups:
1. Specify group by SessionName
2. Specify group by SubjectName
3. Manually edit index file

Once a cell is run, it will display your current indexing structure.

#### View Indexed Sessions
Use this cell to view your sessions' information regarding their SessionNames, SubjectNames, and Groups.

In [None]:
from moseq2_viz.gui import get_groups_command

index_filepath = os.path.join(base_dir,'moseq2-index.yaml')

get_groups_command(index_filepath)

#### 1 - Specify Group by Session Name

In [None]:
from moseq2_viz.gui import add_group_by_session

value = 'ayman_first_tethered_recording' # value of the corresponding key
group = 'group1' # designated group name
exact = False # Must be exact key-value match
lowercase = False # change to lowercase
negative = False # select opposite selection than key-value pair given

add_group_by_session(index_filepath, value, group, exact, lowercase, negative)

#### 2 - Specify Group by Subject Name

In [None]:
from moseq2_viz.gui import add_group_by_subject

value = '1776' # value of the corresponding key
group = 'group2' # designated group name
exact = False # Must be exact key-value match
lowercase = False # change to lowercase
negative = False # select opposite selection than key-value pair given

add_group_by_subject(index_filepath, value, group, exact, lowercase, negative)
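
The `exact`, `lowercase`, and `negative` flags can be read as modifiers on a simple string match. The sketch below shows one plausible interpretation based only on the comments in the cells above; it is an assumption about the semantics, not the moseq2-viz implementation, and both the helper `matches` and the index entries are hypothetical.

```python
def matches(field, value, exact=False, lowercase=False, negative=False):
    """Decide whether one metadata field selects a session for a group."""
    if lowercase:
        field, value = field.lower(), value.lower()
    hit = (field == value) if exact else (value in field)
    return not hit if negative else hit

# Hypothetical index entries; the real index file has more fields.
sessions = [
    {'SubjectName': '1776', 'group': 'default'},
    {'SubjectName': '1777', 'group': 'default'},
]
for session in sessions:
    if matches(session['SubjectName'], '1776'):
        session['group'] = 'group2'
print([s['group'] for s in sessions])
```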

#### 3 - Manually Edit Index File

Simply navigate to your `moseq2-index.yaml` file in your jupyter notebook homepage and edit the group names to your specified values.

## Train ARHMM

#### Configurable Parameter Descriptions
- `hold_out`: Boolean for whether to hold out data during the training process.
- `hold_out_seed`: Integer used to reproduce the same hold out set for repeated testing.
- `nfolds`: Number of data folds to hold out during training. (If used, nfolds <= nsessions)
- `npcs`: Number of selected principal components, chosen in order as shown in the PC Components plot. If too few or too many PCs are selected, the ARHMM predictions will become unreliable.
- `num_iter`: Number of times the model will iterate over your dataset; we recommend at least 100 when starting out. This parameter helps ensure that your model fits its dataset appropriately.
- `max_states`: Maximum number of states the ARHMM can end up with at the end of training. This parameter limits the complexity of the transitions that the model can capture in your dataset. If there are too few states, the model may not learn the actual behavior; if there are too many, the model may overfit the dataset.
- `separate-trans`: Boolean for whether to separate the modeling process for different groups. (Must set to true if number of unique groups > 1)
- `kappa`: Prior variable used to indicate average syllable length. Setting kappa to the number of frames is a good starting point for determining the proper expressed syllable durations. If kappa is too low, syllables will appear too short, and vice versa.
- `checkpoint_freq`: Value indicating when to save model checkpoints per number of iterations passed. (If -1, do not checkpoint)
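
The relationship between `hold_out`, `hold_out_seed`, and `nfolds` can be pictured as a seeded split of the sessions into folds, each of which is held out in turn. The sketch below is a generic illustration using the standard library; `make_folds` is a hypothetical helper and the way moseq2-model actually builds its folds may differ.

```python
import random

def make_folds(session_ids, nfolds, seed=None):
    """Shuffle sessions reproducibly and deal them into `nfolds` hold-out folds."""
    if nfolds > len(session_ids):
        raise ValueError('nfolds must not exceed the number of sessions')
    shuffled = list(session_ids)
    random.Random(seed).shuffle(shuffled)  # the same seed gives the same split
    return [shuffled[i::nfolds] for i in range(nfolds)]

sessions = ['session_1', 'session_2', 'session_3', 'session_4']
for i, held_out in enumerate(make_folds(sessions, nfolds=2, seed=0)):
    train = [s for s in sessions if s not in held_out]
    print(f'fold {i}: train on {train}, hold out {held_out}')
```

This also shows why `nfolds` cannot exceed the number of sessions: every fold needs at least one session to hold out.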

#### Possible ARHMM Pathologies
_To be completed._

In [None]:
from moseq2_model.gui import learn_model_command
import os
pca_dirname = '_pca/'
scores_filename = 'pca_scores'
base_dir = './'
config_filepath = 'config.yaml'
scores_file = os.path.join(pca_dirname, scores_filename+'.h5') # path to input PC scores file to model
model_path = os.path.join(base_dir, 'model.p') # path to save trained model
index_filepath = os.path.join(base_dir, 'moseq2-index.yaml') # path to your auto-generated (possibly modified) index file

# Advanced modeling parameters
hold_out = False # boolean to hold out data during the training process
hold_out_seed = -1 # integer to standardize the held out folds during training
nfolds = 2 # number of folds to hold out during training (if hold_out==True)
npcs = 10  # number of PCs being used

num_iter = 50 # number of iterations to train model
max_states = 100 # number of maximum states the ARHMM can end up with
kappa = None # syllable length prior None for default
robust = False # use robust-ARHMM with t-distribution

separate_trans = True # separate group transition graphs; set to True if ngroups > 1

checkpoint_freq = -1 # model saving frequency (in iterations)

learn_model_command(scores_file, model_path, config_filepath, index_filepath, hold_out, nfolds,
                    num_iter, max_states, npcs, kappa,
                    separate_trans, robust, checkpoint_freq)

Once training is complete, your model will be saved in your base directory (shown below). 
```
.
├── _pca/ 
├── aggregate_results/ 
├── config.yaml
├── model.p **
├── moseq2-index.yaml
├── MoSeq2-Notebook.ipynb
├── session_1/
└── session_2/
```

Now use the moseq2-viz module to produce crowd videos and a number of statistical analysis plots.

***
<center><h1>Visualize Analysis Results</h1></center>

***

Now that you have a trained ARHMM, you can use it to generate informative graphs and videos regarding the behavior syllables found, their usage frequency, and transition probabilities.

The graph below shows the 4 operations that the MoSeq2-Viz module currently affords. They can also be computed in any order at this point in the notebook.

<img src="https://github.com/dattalab/moseq2-app/blob/master/media/Viz_Pipeline.png?raw=true">

## Make Crowd Videos

This tool allows you to create videos containing many overlaid clips of mice performing the same specified syllable at the moment a red dot appears on their bodies. The videos are sorted from the most to the least frequently expressed syllable.
To create the crowd videos, run the following command:

In [None]:
from moseq2_viz.gui import make_crowd_movies_command
import os
base_dir = './'
index_filepath = 'moseq2-index.yaml'
model_path = 'model.p'
config_filepath = 'config.yaml'
crowd_dir = os.path.join(base_dir, 'crowd_movies/') # output directory to save all movies in

max_syllables, max_examples = 20, 20 # maximum number of syllables, and examples of each syllable in a video respectively

make_crowd_movies_command(index_filepath, model_path, config_filepath, crowd_dir, max_syllables, max_examples)

Once completed, you can find your crowd movies along with a metadata YAML file in your corresponding crowd directory. The metadata `info.yaml` file will contain model information pertaining to how these crowd videos were produced.
```
.
├── _pca/ 
├── aggregate_results/ 
├── config.yaml
├── crowd_movies/ **
├   ├── info.yaml **
├   ├── syllable_sorted_44 (usage).mp4 **
├   ...
├   └── syllable_sorted_12 (usage).mp4 **
├── model.p 
├── moseq2-index.yaml
├── MoSeq2-Notebook.ipynb
├── session_1/
└── session_2/
```

View your generated crowd movies below:

In [None]:
from IPython.display import display, Video
from glob import glob

videos = glob(os.path.join(crowd_dir, '*.mp4'))
# display each crowd movie in turn
for path in videos:
    display(Video(path))

## Compute Usage Plots

Use this command to compute the usages of the model-detected syllables, sorted in descending order of usage.

__Note for plotting multiple groups: remember to include all group names in the tuple to graph them.__

In [None]:
from moseq2_viz.gui import plot_usages_command

sort = True
count = 'usage'
max_used_syllable = max_syllables - 1 
groups = ('group1', 'group2')
output_file = 'usages'

plot_usages_command(index_filepath, model_path, sort, count, max_syllables, groups, output_file)
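
As a rough picture of what a usage plot summarizes, the sketch below counts how often each syllable label occurs in a made-up label sequence and sorts the result in descending order. The helper `syllable_usages` and its `collapse_repeats` option are hypothetical: whether repeated frames of the same syllable count once or once per frame is an assumption made here for illustration, not a statement about moseq2-viz.

```python
from collections import Counter
from itertools import groupby

def syllable_usages(labels, collapse_repeats=True):
    """Return (syllable, fraction) pairs sorted from most to least used."""
    if collapse_repeats:
        # Count each uninterrupted run of a syllable once.
        labels = [label for label, _ in groupby(labels)]
    counts = Counter(labels)
    total = sum(counts.values())
    return [(syllable, n / total) for syllable, n in counts.most_common()]

# Made-up frame-by-frame syllable labels.
labels = [3, 3, 3, 1, 1, 3, 7, 7, 1, 3]
for syllable, fraction in syllable_usages(labels):
    print(f'syllable {syllable}: {fraction:.0%}')
```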

## Compute Scalar Summary and Tracking Plots

Use the following command to compute scalar summary information about your modeled groups, such as average velocity and height.
This command will also generate a tracking summary plot depicting the path traveled by the mouse in your recordings.

In [None]:
from moseq2_viz.gui import plot_scalar_summary_command

output_file = 'scalars' # prefix name of the saved scalar position and summary graphs

plot_scalar_summary_command(index_filepath, output_file)
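
As an example of one such scalar, average speed can be derived from per-frame centroid positions. The sketch below uses toy coordinates; the helper `mean_speed` is hypothetical and the 30 frames-per-second rate is an assumption, so treat it as an illustration of the quantity rather than of how moseq2-viz computes it.

```python
from math import hypot

def mean_speed(centroids, fps=30):
    """Average 2D speed (distance units per second) from per-frame centroid positions."""
    steps = [
        hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(centroids, centroids[1:])
    ]
    return sum(steps) * fps / len(steps)

# Toy trajectory: two 5-unit steps followed by one frame without movement.
centroids = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (6.0, 8.0)]
print(mean_speed(centroids, fps=30))
```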

Graph your output:

In [None]:
from IPython.display import display, Image
from glob import glob

images = glob('scalars_*.png')
# display each scalar summary image in turn
for im in images:
    display(Image(im))

## Compute Syllable Transition Graph

Use the following command to generate a syllable transition graph. The graph will be comprised of nodes labelled by syllable, and edges depicting a probable transition, with edge thickness depicting the weight of the transition edge.

For multiple groups, there will be a transition graph for each group, as well as a unified graph with different colors to identify the groups. __Note: remember to include all group names in the tuple to graph them.__

In [None]:
from moseq2_viz.gui import plot_transition_graph_command

max_syllable = 20 # Maximum number of nodes in the transition graph
groups = ('group1', 'group2') # Group to graph, default if empty str
output_filename = 'transition' # name of the png file to be saved

plot_transition_graph_command(index_filepath, model_path, config_filepath, max_syllable, groups, output_filename)
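
The edge weights in a transition graph come from counting how often one syllable is followed by another. The sketch below estimates these probabilities from a made-up label sequence using only the standard library; `transition_probabilities` is a hypothetical helper, and dropping self-transitions is an assumption made for this illustration.

```python
from collections import Counter, defaultdict
from itertools import groupby

def transition_probabilities(labels):
    """Estimate P(next syllable | current syllable) from a label sequence."""
    sequence = [label for label, _ in groupby(labels)]  # drop self-transitions
    counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    return {
        current: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for current, row in counts.items()
    }

# Made-up frame-by-frame syllable labels.
labels = [3, 3, 1, 1, 3, 7, 7, 1, 3]
for current, row in transition_probabilities(labels).items():
    print(current, '->', row)
```

Each outer key corresponds to a node, and each inner entry to an outgoing edge whose thickness would scale with the probability.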

***
<center><h1>Notebook End</h1></center>

***