BASS

Stand-alone package for the "Behavioral Action Sequence Segmentation" (BASS) algorithm.

This repository contains all the code for the implementation of the motif discovery algorithm BASS from the paper "A lexical approach for identifying behavioral action sequences".

This repository allows you to run BASS on your own dataset. To reproduce the results from the paper, please refer to the original repository. For a sister repository on running BASS, please check here

For questions, email gautam_nallamala(AT)fas.harvard.edu and gautam.sridhar(AT)icm-institute.org

Requirements:

Python 3.8 or below, Cython, NumPy, pandas, Matplotlib, SciPy, scikit-learn, editdistance

To install editdistance:

pip3 install editdistance

Setup BASS:

Before running any of the code, bass.pyx has to be compiled. Navigate to the folder BASS/ and run:

python3 setup_bass.py build_ext --inplace

bass.pyx contains the implementation of the motif discovery algorithm, the specification of the mixture model and miscellaneous functions used for the analysis in the paper.
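For reference only, here is a minimal sketch of what a Cython build script for this compilation step typically looks like; the actual setup_bass.py shipped with the repository may differ (for example in extension names or compiler options):

from setuptools import setup, Extension
from Cython.Build import cythonize
import numpy

# Build bass.pyx as a C extension; NumPy headers are included in case the
# Cython code uses the NumPy C API.
extensions = [
    Extension("bass", ["bass.pyx"], include_dirs=[numpy.get_include()]),
]

setup(
    name="bass",
    ext_modules=cythonize(extensions),
)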

A detailed document on how to use the code can be found here.

What follows below is a brief description to get started.

Dataset organization

All datasets are placed in the Data/ folder. The files have to be in .npy format, and the filenames have to be of the form datasetname_dataset_condition(x).npy. Here x stands for a number indicating one of several conditions of the same experimental setup that you would like to try. For example, we use condition 0 as the control and condition 1 as the experiment; you can add further conditions in the same way. Each condition dataset should be a matrix of shape nbouts x nfeatures. In addition, you will need a lengths file named datasetname_lengths_condition(x).npy. These are used if your dataset has subsets, such as multiple recordings, different trajectories, etc.

For example, if you had a dataset for the control setting with 10 recordings, datasetname_lengths_condition0.npy would be an array with 10 elements, where each element is the number of bouts in that recording, and datasetname_dataset_condition0.npy would contain all the bouts of all the recordings as a matrix of the form described above.
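As an illustration, here is a minimal NumPy sketch of how such a pair of files could be written for condition 0; the dataset name, feature count, and recording sizes below are placeholders, not values used in the paper:

import numpy as np

# Hypothetical example: 10 recordings, 15 features per bout (placeholder numbers).
n_recordings = 10
n_features = 15

# Number of bouts in each recording; drawn at random here as a stand-in.
lengths = np.random.randint(50, 200, size=n_recordings)

# All bouts from all recordings stacked into one (nbouts x nfeatures) matrix,
# so that lengths.sum() equals the number of rows of the dataset matrix.
dataset = np.random.randn(lengths.sum(), n_features)

np.save("Data/datasetname_dataset_condition0.npy", dataset)
np.save("Data/datasetname_lengths_condition0.npy", lengths)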

Code organization

Four scripts are present in the main folder, along with two Jupyter notebooks. The ideal workflow is as follows:

  1. python learn_gmm.py -c 0 1 2 (and so on) - The GMM can be learnt on different conditions at the same time.

Use the Analyze_GMM notebook at this point to analyze your GMM and the bout types.

  2. python run_bass.py -c 0 - BASS must be run separately on different conditions.
  3. python compare_datasets.py -cn 0 -ch 1 - Find the motifs enriched in one condition over another.
  4. python decode.py -c 0 - Label the motifs in your dataset after the dictionary is found.

Use the Analyze_decoded notebook to analyze your recordings and validate the sequences found using compare_datasets.py.

For all the scripts, there are multiple options other than -c that can be set. To check the available options, use python script_name.py --help

Important:

For each new application, a 'soft' clustering model has to be specified using a GMM.

If you instead have the data as a sequence of cluster labels, i.e., 'hard' clustered data, then convert it into a sequence of probability vectors and define a GMM model with means at the centers of the clusters and a circular standard deviation of 1.0.
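As a rough, hypothetical sketch of this conversion (using ordinary spherical Gaussians from scikit-learn rather than the circular distributions referred to above, and placeholder labels and centers), one might do:

import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical hard-clustered data: a label sequence and the cluster centers.
labels = np.array([0, 2, 1, 1, 0, 2])                      # placeholder labels
centers = np.array([[0.0, 0.0], [1.0, 2.0], [3.0, 1.0]])   # placeholder centers
n_clusters = centers.shape[0]

# Convert the label sequence into a sequence of probability vectors (one-hot).
prob_vectors = np.eye(n_clusters)[labels]

# Define a GMM with means at the cluster centers and a standard deviation of 1.0.
gmm = GaussianMixture(n_components=n_clusters, covariance_type="spherical")
gmm.weights_ = np.full(n_clusters, 1.0 / n_clusters)
gmm.means_ = centers
gmm.covariances_ = np.ones(n_clusters)           # variance 1.0, i.e. std 1.0
gmm.precisions_cholesky_ = np.ones(n_clusters)   # 1 / sqrt(variance)

# gmm.predict_proba(new_bouts) would now return soft assignments for new data.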
