To run this notebook, please make sure you are in the environment that has the `keypoint_moseq` package. If you use the conda environment named `keypoint_moseq`, please run `conda activate keypoint_moseq` to activate the environment.

This notebook contains a collection of analyses we typically perform on behavioral recordings to better understand mouse behavior. To keep the files and Jupyter notebooks organized, we recommend using one folder per project.

Your base directory should look like this at the beginning of the analysis notebook:
```
.
└── <project_dir>/               ** current working directory
    ├── <model_dir>/             ** model directory
    │   ├── crowd_movies/        ** [Optional] crowd movies folder
    │   ├── grid_movies/         ** [Optional] grid movies folder
    │   ├── trajectory_plots/    ** [Optional] trajectory plots folder
    │   ├── checkpoint.p         ** checkpoint file
    │   └── results.h5           ** model results
    ├── <keypointdata_dir>/      ** [Optional] videos of keypoint input data (needed to generate crowd movies, grid movies, and trajectory plots)
    ├── tutorial.ipynb           ** the notebook for modeling the data
    └── analysis.ipynb           ** this notebook
```

# Project setup

Set the path to the project directory and the name of the model directory.

In [None]:
import keypoint_moseq as kpms

model_dirname='model_dir' # model directory name for the model to analyze
project_dir='./demo_project' # the full path to the project directory
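
Before running the rest of the notebook, it can help to confirm that the model directory contains the files shown in the layout above. The helper below is a minimal sketch and is not part of `keypoint_moseq`; the name `find_missing_model_files` is hypothetical, and the required file names are taken from the directory tree at the top of this notebook.

```python
from pathlib import Path

# File names taken from the directory layout at the top of this notebook.
REQUIRED_MODEL_FILES = ("checkpoint.p", "results.h5")


def find_missing_model_files(project_dir, model_dirname):
    """Return the required model files that are absent (hypothetical helper)."""
    model_dir = Path(project_dir) / model_dirname
    return [name for name in REQUIRED_MODEL_FILES
            if not (model_dir / name).is_file()]
```

An empty list means both files were found. For example, `find_missing_model_files(project_dir, model_dirname)` could be called right after the setup cell.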

# Assign Groups

The following cell creates the `index.yaml` file.

**Instructions:**
- **Click the column header** to sort the column and use the filter icon to filter if needed.
- **Click on the session** to select the session. **To select multiple sessions, click the sessions while holding down the [Ctrl]/[Command] key, or click the first and last entry while holding down the [Shift] key.**
- **Enter the group name in the `Desired Group Name` field** and click `Set Group` to update the `group` column for the selected sessions.
- Click the `Update Index File` button to save current group assignments.

Note: If only the text input field is displayed and no table appears, please fully shut down Jupyter and restart the server.

In [None]:
index_file=kpms.interactive_group_setting(project_dir, model_dirname)

# Compute Syllable Statistics

## Compute `moseq_df`
The following cell generates a `DataFrame` of kinematic values for each frame, with each frame aligned to its model label.

In [None]:
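# If `index_file` is not defined (e.g., the group assignment cell was skipped), uncomment the lines below
# import os
# index_file=os.path.join(project_dir, model_dirname, 'index.yaml')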

moseq_df = kpms.compute_moseq_df(project_dir, model_dirname, index_file, 
                                 smooth_heading=True) # whether the output heading is smoothed or not
print('moseq_df size: ', moseq_df.shape[0], 'rows;', moseq_df.shape[1], 'columns')
moseq_df.head()

## Export `moseq_df`

You can export `moseq_df` as a CSV file to `{project_dir}/{model_dirname}/moseq_df.csv`.

In [None]:
# Save `moseq_df` as a csv file
import os

# Specify the directory to save the dataframe in as `save_path`
save_path=os.path.join(project_dir, model_dirname) # by default, the dataframe is saved in the model directory

# exports the dataframe
filename='moseq_df.csv'
moseq_df.to_csv(os.path.join(save_path, filename), index=False)

print('DataFrame is saved:', os.path.join(save_path, filename))

## Compute `stats_df`
`stats_df` is a `DataFrame` that contains statistical summaries (i.e., min, max, mean, std) of the scalar values (kinematic values such as heading and velocity) associated with each syllable, as well as the frequency with which each syllable is expressed.

In [None]:
fps=30 # frame rate of the videos; 30 is an assumed placeholder, set it to the frame rate of your recordings

stats_df=kpms.compute_stats_df(
                               moseq_df, threshold=0.005, # threshold for syllable usage to include in the dataframe
                               groupby=['group', 'file_name'], # the column(s) to group the dataframe by
                               fps=fps, # frame rate of the video
                               syll_key='syllables_reindexed', # syllable key ('syllables' or 'syllables_reindexed')
                               normalize=True) # syllable usages within one session add up to 1 when True
print('The shape of stats_df', stats_df.shape)
stats_df.head()
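
To make the `threshold` and `normalize` arguments more concrete, the sketch below computes syllable frequencies for a single session in plain Python. It illustrates one plausible reading (frequencies are counted from syllable onsets, and syllables whose share falls below the threshold are dropped) and is not the `keypoint_moseq` implementation; `syllable_frequencies` is a hypothetical name.

```python
from collections import Counter
from itertools import groupby


def syllable_frequencies(frame_labels, threshold=0.005, normalize=True):
    """Count syllable onsets in one session (illustrative, hypothetical helper)."""
    # Collapse runs of identical labels so each syllable instance counts once.
    onsets = [label for label, _ in groupby(frame_labels)]
    counts = Counter(onsets)
    total = sum(counts.values())
    # Assumed semantics: drop syllables whose share of onsets is below `threshold`.
    kept = {s: c for s, c in counts.items() if c / total >= threshold}
    if normalize:
        # Shares of all onsets; they sum to 1 only when no syllable is dropped.
        return {s: c / total for s, c in kept.items()}
    return kept


labels = [0, 0, 0, 1, 1, 0, 0, 2, 2, 2, 1]  # per-frame syllable labels
print(syllable_frequencies(labels))
```

Here the frame labels collapse to the onset sequence `0, 1, 0, 2, 1`, so syllables 0 and 1 each account for two of the five onsets and syllable 2 for one.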

## Export `stats_df`

You can export `stats_df` as a CSV file to `{project_dir}/{model_dirname}/stats_df.csv`.

In [None]:
# exports the dataframe
filename='stats_df.csv'
stats_df.to_csv(os.path.join(save_path, filename), index=False)

print('DataFrame is saved:', os.path.join(save_path, filename))

## Generate Behavioral Summary (Fingerprints)
Fingerprints summarize behavior by showing the distributions of scalars (e.g., position, velocity, height, and length) and syllables.

In [None]:
import os

save_dir=os.path.join(project_dir, model_dirname) # directory to save figures (assumed here to be the model directory; change as needed)

summary, range_dict = kpms.create_fingerprint_dataframe(moseq_df, stats_df, 
                                                        stat_type='mean', # the type of statistics to plot ("mean", "min", or "max")
                                                        n_bins=100, # the number of bins, which sets the resolution of the distribution
                                                        range_type='robust') # range type for stats, robust filters out top and bottom 1% ("robust" or "full")
kpms.plotting_fingerprint(summary, range_dict, 
                          save_dir=save_dir, # path to the directory to save the figure
                          preprocessor_type='minmax') # data preprocessor for the fingerprint ("minmax", "standard", or "none")
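
As a rough illustration of the `range_type='robust'` and `preprocessor_type='minmax'` options, the sketch below trims the top and bottom 1% of a list of values and rescales a list to the interval [0, 1]. Both function names are hypothetical, the nearest-rank percentile rule is a simplification chosen for this sketch, and `keypoint_moseq` may compute these quantities differently.

```python
def robust_range(values, lower_pct=1.0, upper_pct=99.0):
    """Return (low, high) after discarding the extreme tails (nearest-rank percentiles)."""
    ordered = sorted(values)
    n = len(ordered)
    lo_idx = int(round((lower_pct / 100) * (n - 1)))
    hi_idx = int(round((upper_pct / 100) * (n - 1)))
    return ordered[lo_idx], ordered[hi_idx]


def minmax_scale(values):
    """Rescale values linearly to [0, 1]; constant input maps to all zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]
```

With `values = list(range(101))`, `robust_range(values)` drops the single smallest and largest entries, and `minmax_scale([2, 4, 6])` maps the middle value to 0.5.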

# Syllable Visualization and Labeling Tool
Syllables can be visualized in the form of trajectory plots, crowd movies, and grid movies. 

## Display trajectory plots

In [None]:
kpms.show_trajectory_gif(project_dir, model_dirname, 
                         video_dir='dlc_project/videos', # the path to the input keypoint videos
                         keypoint_data_type='deeplabcut') # the type of keypoint data ("deeplabcut" or "sleap")

## Syllable labeling tool

**Instructions:**
- **Run the following cell.** Specify the movie type to show if necessary. The default is displaying grid movies.
- **Input the syllable behavioral label and short description** in the text fields.
- Click `Save Setting` to save the syllable label and description for later analysis.
- Use `Next` and `Previous` to navigate between syllables; the syllable label and description are saved automatically when you use these buttons.

Note: If the widget crashes, please restart the notebook kernel and run the cell again.

In [None]:
kpms.label_syllables(project_dir, model_dirname, 
                     video_dir='dlc_project/videos', # the path to the input keypoint videos
                     keypoint_data_type='deeplabcut', # the type of keypoint data ("deeplabcut" or "sleap")
                     movie_type='grid') # type of movie to show ("grid" or "crowd")

# Syllable Statistics Graphing
Plotting per-syllable statistics (e.g., frequency, duration, or velocity) for each group shows how behavioral patterns differ between groups.

In [None]:
# ordering of syllables, could be 'stat' or 'diff'
order='stat'
# statistic to be plotted, e.g. 'frequency', 'duration', or 'velocity_px_s_mean'
stat='frequency'
# groups to be plotted
groups=stats_df['group'].unique()
# name of the control group
ctrl_group='a'
# name of the experimental group
exp_group='b'
# boolean for whether the dots are connected
join=False
# boolean for whether to plot the significance stars
plot_sig=True
# significance threshold
thresh=0.05

# pass the settings defined above so that changing a variable changes the plot
kpms.plot_syll_stats_with_sem(stats_df, project_dir, model_dirname, 
                              save_dir=save_dir, # path to the directory to save the figure
                              plot_sig=plot_sig, # whether to plot the significance stars, plot the stars when True
                              thresh=thresh, # significance threshold
                              stat=stat, # statistic to be plotted
                              order=order, # ordering of syllables
                              groups=groups, # groups to be plotted
                              ctrl_group=ctrl_group, # name of the control group
                              exp_group=exp_group, # name of the experimental group
                              join=join, # whether the dots are connected, connected when True
                              figsize=(10, 5)) # figure size
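
The function name suggests that the error bars show the standard error of the mean (SEM) of the chosen statistic across the sessions in each group. As a point of reference, the sketch below computes a mean and SEM with the standard library. It assumes the sample standard deviation (n - 1 in the denominator) and is an illustration rather than the `keypoint_moseq` implementation; `mean_and_sem` is a hypothetical name.

```python
import math
import statistics


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for one group's per-session values."""
    mean = statistics.mean(values)
    if len(values) < 2:
        return mean, float("nan")  # SEM is undefined for a single session
    # SEM = sample standard deviation divided by the square root of the sample size.
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem
```

For instance, `mean_and_sem([2, 4, 6])` gives a mean of 4 and an SEM of 2 divided by the square root of 3, since the sample standard deviation of those values is 2.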

## Transition matrices
Transition matrices compactly represent how frequently each syllable transitions into every other syllable, and they are one way to describe structure in behavior.

In [None]:
trans_mats, usages, groups=kpms.generate_transition_matrices(project_dir, model_dirname, 
                                                             normalize='bigram', # matrix normalization methods ("bigram", "rows" or "columns")
                                                             max_syllable=20) # maximum number of syllables to include
kpms.visualize_transition_bigram(groups, trans_mats, 
                                 save_dir=save_dir, # path to the directory to save the figure
                                 normalize='bigram') # matrix normalization methods ("bigram", "rows" or "columns")
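
The three `normalize` options can be illustrated with a small pure-Python sketch that counts syllable-to-syllable transitions in one label sequence. It assumes that "bigram" divides by the total number of transitions, "rows" makes each row sum to 1, and "columns" makes each column sum to 1; this reading is an assumption, and `transition_matrix` is a hypothetical helper rather than the `keypoint_moseq` implementation.

```python
from itertools import groupby


def transition_matrix(frame_labels, n_syllables, normalize="bigram"):
    """Count transitions between syllables and normalize them (illustrative sketch)."""
    # Collapse runs of identical labels so only changes of syllable count.
    seq = [label for label, _ in groupby(frame_labels)]
    counts = [[0.0] * n_syllables for _ in range(n_syllables)]
    for src, dst in zip(seq, seq[1:]):
        if src < n_syllables and dst < n_syllables:
            counts[src][dst] += 1

    if normalize == "bigram":  # divide by the total number of transitions
        total = sum(map(sum, counts))
        return [[c / total if total else 0.0 for c in row] for row in counts]
    if normalize == "rows":  # outgoing probabilities: each row sums to 1
        return [[c / sum(row) if sum(row) else 0.0 for c in row] for row in counts]
    if normalize == "columns":  # incoming probabilities: each column sums to 1
        col_sums = [sum(col) for col in zip(*counts)]
        return [[c / s if s else 0.0 for c, s in zip(row, col_sums)] for row in counts]
    raise ValueError(f"unknown normalize option: {normalize!r}")


labels = [0, 0, 1, 1, 0, 2, 2, 1]  # per-frame syllable labels
for row in transition_matrix(labels, n_syllables=3, normalize="rows"):
    print(row)
```

With "rows" normalization, entry `[i][j]` reads as the probability of moving to syllable `j` given that the animal is leaving syllable `i`, whereas "bigram" keeps information about how common each transition is overall.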

## Syllable Transition Graph
Transition matrices can also be visualized as directed graphs, where each node of the graph represents one syllable, and the directional edges represent transitions between syllables.

In [None]:
kpms.plot_transition_graph_group(groups, trans_mats, usages, 
                                 save_dir=save_dir, # path to the directory to save the figure
                                 layout='circular') # transition graph layout ("circular" or "spring")

In [None]:
layout='circular' # transition graph layout ("circular" or "spring")
kpms.plot_transition_graph_difference(groups, trans_mats, usages, 
                                      save_dir=save_dir, # path to the directory to save the figure
                                      layout=layout) # use the layout chosen above