# Basic analysis

NOTE: This requires a Python environment with the appropriate dependencies already installed, e.g. a conda environment created with:

`conda create -n acanaba python=3.10 numpy pandas ipykernel jupyter holoviews bokeh datashader seaborn pytables`

Let's load our tracked data and calculate some basic metrics for the pre-reversal and first reversal sessions.

First we need to import the `pandas` module, which allows us to manipulate tabular data. The module provides two basic data structures: DataFrames (two-dimensional, table-like) and Series (one-dimensional, like a single column or row of a table).

In [None]:
import pandas as pd
idx = pd.IndexSlice

The `idx` object (`pd.IndexSlice`) allows us to do complex indexing on our DataFrame and Series objects, in particular slicing across the levels of a multi-level index.
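As a minimal sketch of what `idx` makes possible, the cell below builds a small made-up table with two column levels and selects one coordinate across all body parts. The `toy` table, its values, and the `nose`/`tail` labels are assumptions for illustration only, not the real tracking data.

```python
import numpy as np
import pandas as pd

idx = pd.IndexSlice

# Hypothetical toy table with two column levels (not the real data)
columns = pd.MultiIndex.from_product(
    [['nose', 'tail'], ['x', 'y', 'likelihood']],
    names=['bodyparts', 'coords'])
toy = pd.DataFrame(np.arange(12.0).reshape(2, 6), columns=columns)

# Select the 'x' coordinate of every body part:
# ':' keeps all entries of the first level, 'x' picks one entry of the second
toy_x = toy.loc[:, idx[:, 'x']]
print(toy_x)
```

Without `idx`, the same selection would need explicit `slice(None)` objects inside a tuple, which is harder to read.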

Now let's load our analysed position data. Initially let's just load up the first data file and see what it looks like.

In [None]:
track_fns = {'prerev': '../rawdata/sub-lizzy/sub-lizzy_ses-prerev_task-RR10_vidDLC_resnet50_dlc_acan_masterOct23shuffle1_800000.h5',
             'rev01': '../rawdata/sub-lizzy/sub-lizzy_ses-rev01_task-RR10_vidDLC_resnet50_dlc_acan_masterOct23shuffle1_800000.h5'}
_df = pd.read_hdf(track_fns['prerev'])
_df

This is reasonable, but there seems to be an uninformative (and incorrect) 'scorer' level at the top of the columns, and the row index (the frame number) has no name. Let's fix this.

In [None]:
_df.columns = _df.columns.droplevel(0)
_df.index.name = 'frame_id'
_df

Great, this looks better. Now let's make a function for this, so we can load all our data together.

In [None]:
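def load_track_session(filename: str) -> pd.DataFrame:
    '''Load one session analysed by DeepLabCut into a DataFrame'''
    df = pd.read_hdf(filename)
    # Drop the outermost column level ('scorer'), which carries no useful information
    df.columns = df.columns.droplevel(0)
    # Name the row index so the frame number is identifiable after concatenation
    df.index.name = 'frame_id'
    return df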
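The column clean-up inside `load_track_session` can be tried out without the data files. The sketch below applies the same two steps to a made-up table with three column levels; the `toy_scorer` label, the body parts, and the zero-filled values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a tracked session: three column levels
columns = pd.MultiIndex.from_product(
    [['toy_scorer'], ['nose', 'tail'], ['x', 'y', 'likelihood']],
    names=['scorer', 'bodyparts', 'coords'])
raw = pd.DataFrame(np.zeros((3, 6)), columns=columns)

# Same clean-up as in the function: drop the outer level, name the row index
cleaned = raw.copy()
cleaned.columns = cleaned.columns.droplevel(0)
cleaned.index.name = 'frame_id'

print(list(cleaned.columns.names))
print(cleaned.index.name)
```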

Now load all the sessions into a single DataFrame.

In [None]:
track_dfs = {key: load_track_session(fn) for key, fn in track_fns.items()}
track_df = pd.concat(track_dfs, names=['session_id'])
track_df
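To see what `pd.concat` does with a dictionary, here is a small sketch on made-up tables: the dictionary keys become a new outer index level, named by `names`, stacked on top of the existing `frame_id` level. The `toy_sessions` tables and their values are hypothetical.

```python
import pandas as pd

# Hypothetical per-session tables, each indexed by frame number
toy_sessions = {
    'prerev': pd.DataFrame({'x': [1.0, 2.0]},
                           index=pd.Index([0, 1], name='frame_id')),
    'rev01': pd.DataFrame({'x': [3.0]},
                          index=pd.Index([0], name='frame_id')),
}

# Dictionary keys become the outer 'session_id' level of the row index
toy_all = pd.concat(toy_sessions, names=['session_id'])
print(list(toy_all.index.names))

# A single session can be recovered by indexing on the outer level
print(toy_all.loc['rev01'])
```

Keeping all sessions in one table means later steps can be written once and grouped by `session_id`, rather than repeated per file.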

If we want to extract one point, let's say the animal's nose, we can index just those columns.

In [None]:
track_df.loc[:, idx['nose']]

Similarly, if we want to see just the frames where DeepLabCut located a point with high confidence, we can filter the results on the `likelihood` column.

In [None]:
track_df.loc[:, idx['nose']].query('likelihood > 0.95')
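One simple metric that follows from this filter is the fraction of frames per session that pass the threshold. The sketch below uses a made-up table in place of the real nose data; the values are hypothetical, and the 0.95 threshold is simply the one used above.

```python
import pandas as pd

# Hypothetical nose table indexed by session and frame (not the real data)
index = pd.MultiIndex.from_tuples(
    [('prerev', 0), ('prerev', 1), ('rev01', 0), ('rev01', 1)],
    names=['session_id', 'frame_id'])
toy_nose = pd.DataFrame(
    {'x': [10.0, 11.0, 20.0, 21.0],
     'y': [5.0, 6.0, 7.0, 8.0],
     'likelihood': [0.99, 0.50, 0.97, 0.96]},
    index=index)

# Keep only the high-confidence frames
confident = toy_nose.query('likelihood > 0.95')

# Fraction of frames per session that pass the threshold
frac = (toy_nose['likelihood'] > 0.95).groupby(level='session_id').mean()
print(frac)
```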