# Analysis: Data Extraction

Once depth data has been acquired, it must be extracted from its raw format before it can be processed further.

To follow along with this notebook, download the example of a good recording provided [here](https://storage.googleapis.com/moseq2-examples/good-session.tar.gz).

After unpacking the archive, you should see the following directory structure:

<img src="media/extract-dirstruct.png" alt="Starting directory structure" title="Starting Directory Structure" />
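
The download and unpacking steps can also be scripted. The following is a minimal sketch, assuming `curl` and `tar` are installed; `fetch_session` is a hypothetical helper name, not part of moseq2-extract, and the URL is the one linked above.

```bash
# Hypothetical helper (not part of moseq2-extract): download an archive
# and unpack it into a destination directory.
fetch_session() {
    local url="$1"
    local dest="${2:-.}"
    local archive
    archive="$(basename "$url")"

    mkdir -p "$dest"
    # Reuse a previously downloaded archive instead of fetching it again.
    if [ ! -f "$archive" ]; then
        curl -fL -o "$archive" "$url" || return 1
    fi
    # Unpack the gzip-compressed tarball into the destination directory.
    tar -xzf "$archive" -C "$dest"
}

# Example usage (requires network access):
# fetch_session https://storage.googleapis.com/moseq2-examples/good-session.tar.gz .
```

Because the archive is skipped when it is already present in the working directory, the function can be re-run without downloading the data again.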

You can now use moseq2-extract to perform extraction and alignment of the mouse from the raw depth videos.

## MoSeq2-Extract
For a full list of available options, run the following bash command:

In [None]:
%%bash
moseq2-extract --help

## Generating a Configuration File
A configuration file lets you view, edit, and reuse all of your extraction command-line parameters in one place.
To generate one, run the following command:

In [None]:
%%bash
moseq2-extract generate-config

This produces the following file, named config.yaml by default:

```
crop_size:
- 80
- 80
bg_roi_dilate:
- 10
- 10
bg_roi_shape: ellipse
bg_roi_index: 0
bg_roi_weights:
- 1
- 0.1
- 1
bg_roi_depth_range:
- 650
- 750
bg_roi_gradient_filter: false
bg_roi_gradient_threshold: 3000
bg_roi_gradient_kernel: 7
bg_roi_fill_holes: true
min_height: 10
max_height: 100
fps: 30
flip_classifier: 
flip_classifier_smoothing: 51
use_tracking_model: false
tracking_model_ll_threshold: -100
tracking_model_mask_threshold: -16
tracking_model_ll_clip: -100
tracking_model_segment: true
tracking_model_init: raw
cable_filter_iters: 0
cable_filter_shape: rectangle
cable_filter_size:
- 5
- 5
tail_filter_iters: 1
tail_filter_size:
- 9
- 9
tail_filter_shape: ellipse
spatial_filter_size:
- 3
temporal_filter_size:
- 0
chunk_size: 1000
chunk_overlap: 0
output_dir:
write_movie: true
use_plane_bground: false
frame_dtype: uint8
centroid_hampel_span: 0
centroid_hampel_sig: 3
angle_hampel_span: 0
angle_hampel_sig: 3
model_smoothing_clips:
- 0
- 0
frame_trim:
- 0
- 0
compress: false
compress_chunk_size: 3000
compress_threads: 3
config_file:
```

Note: you may remove some of the parameters above from the configuration file to fit your specific experiment. __Do this only if your data or experimental setup differs from the standard setup that MoSeq supports.__
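
Individual values in config.yaml can be edited by hand or from the command line. The following is a minimal sketch using `grep` and `sed`; `set_config_value` is a hypothetical helper, not a moseq2-extract command, and it only handles top-level scalar entries written as `key: value` on one line. The value used in the usage comment is illustrative only.

```bash
# Hypothetical helper (not part of moseq2-extract): overwrite one
# top-level scalar entry ("key: value" on a single line) in a YAML file.
# List-valued keys such as crop_size are NOT handled by this sketch.
set_config_value() {
    local file="$1" key="$2" value="$3"

    # Refuse to edit if the key is absent, rather than silently doing nothing.
    if ! grep -q "^${key}:" "$file"; then
        echo "key not found: ${key}" >&2
        return 1
    fi
    # -i.bak edits in place and keeps the original as <file>.bak
    # (this form works with both GNU and BSD sed).
    sed -i.bak "s|^${key}:.*|${key}: ${value}|" "$file"
}

# Example usage (the value 15 is illustrative, not a recommendation):
# set_config_value config.yaml min_height 15
```

Values containing `|`, `&`, or backslashes would need escaping before being passed to `sed`; for anything beyond simple scalars, a YAML-aware tool is the safer choice.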

## Extracting Data

To extract data, simply run the following command on any __depth.dat__ file:

In [None]:
%%bash
moseq2-extract extract sample_session/depth.dat

To run an extraction with your generated configuration file, run the extract command with the --config-file flag:

In [None]:
%%bash
moseq2-extract extract sample_session/depth.dat --config-file config.yaml

## Extraction Result

The extract command automatically selects an ROI and writes the extracted data to a proc folder in the same directory as depth.dat. When the extraction is complete, the results are stored in proc/results.h5, and a movie of the extraction is stored in proc/results.mp4.

Below is what the results.mp4 video should look like:

<img src="media/extracted_video.gif" alt="Generated video from raw data" title="Extracted Video" />
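
Before moving on, it can be useful to confirm that both output files described above were written. The following is a minimal sketch; `check_extraction` is a hypothetical helper, not part of moseq2-extract, and it only checks that the files exist and are non-empty, not that their contents are valid.

```bash
# Hypothetical helper (not part of moseq2-extract): check that the
# expected extraction outputs exist and are non-empty.
check_extraction() {
    local session_dir="$1"
    local missing=0
    local name

    for name in results.h5 results.mp4; do
        if [ -s "${session_dir}/proc/${name}" ]; then
            echo "found:   ${session_dir}/proc/${name}"
        else
            echo "missing: ${session_dir}/proc/${name}" >&2
            missing=1
        fi
    done
    # Return non-zero if any expected file is absent or empty.
    return "$missing"
}

# Example usage:
# check_extraction sample_session
```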

## flip-classifier
You will likely want to use a flip classifier, which corrects for 180-degree ambiguities in the angle detection. To download one of the pre-trained classifiers, use this command:

In [None]:
%%bash
moseq2-extract download-flip-file

<img src="media/extract-flipout.png" title="Download Flip-File Command Output" />

After downloading, pass the classifier to moseq2-extract with the --flip-classifier option:

__(Note: if an extraction has already been run in the same directory, a second extraction may fail because it does not overwrite previously extracted data.)__

In [None]:
%%bash
moseq2-extract extract sample_session/depth.dat --flip-classifier ~/moseq2/new_flip_classifier.pkl

<img src="media/extract-endstruct.png" alt="Ending directory structure" title="Ending Directory Structure" />
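
As noted above, a second extraction in the same directory may fail when earlier results are still present. One way to avoid this is to move the existing proc folder aside before re-running. The following is a minimal sketch; `archive_previous_extraction` is a hypothetical helper, not part of moseq2-extract, and the timestamped backup name is an assumption of this example.

```bash
# Hypothetical helper (not part of moseq2-extract): move an existing
# proc folder aside so that a new extraction starts from a clean state.
archive_previous_extraction() {
    local session_dir="$1"
    local proc_dir="${session_dir}/proc"
    local backup

    # Nothing to do if no earlier extraction exists.
    [ -d "$proc_dir" ] || return 0

    # Timestamped name; assumes at most one backup is made per second.
    backup="${proc_dir}_$(date +%Y%m%d_%H%M%S)"
    # Print the backup location so the caller can find the old results.
    mv "$proc_dir" "$backup" && echo "$backup"
}

# Example usage, followed by the extraction command shown above:
# archive_previous_extraction sample_session
# moseq2-extract extract sample_session/depth.dat --flip-classifier ~/moseq2/new_flip_classifier.pkl
```

Moving the folder rather than deleting it keeps the earlier results available for comparison.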

A Jupyter notebook tutorial on [training a flip classifier can be found here](https://github.com/dattalab/moseq2-docs/blob/master/notebooks/training_a_flip_classifier.ipynb).

## Next Step: PCA Computation

This concludes the extraction step. [Click here](http://localhost:8888/notebooks/MoSeq2_Step_2.ipynb) to view the MoSeq2-PCA walkthrough, or open it on [GitHub](https://github.com/dattalab/moseq2-docs/blob/master/usage-docs/MoSeq2_Step_2.ipynb).