some data analysis tests


This repository contains the code and visualizations needed to reproduce the results presented in 'Predicting natural behavior from whole-brain neural dynamics'. The associated data can be found in the OSF repository.


  • python 2.7
  • scikit_image==0.13.0

Getting started

To reproduce the figures and panels from 'Predicting natural behavior from whole-brain neural dynamics', download the data from the OSF repository. Each dataset is identified by a strain name and a condition, e.g. AML32_moving, and contains two hdf5 files:

  1. data: [strain]_[condition].hdf5 contains the raw calcium data, behavior data and other metadata.
  2. results: [strain]_[condition]_results.hdf5 contains models, model predictions, PCA, ...
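As a small illustration of the naming scheme, the two file paths for a dataset can be built from its strain and condition. This is only a sketch: the helper `dataset_files` and the folder `/path/to/data` are hypothetical, and it assumes an underscore between strain and condition, as in AML32_moving.

```python
import os

def dataset_files(data_dir, strain, condition):
    """Return (data_path, results_path) for one dataset.

    Hypothetical helper, not part of the repository. Assumes file names of
    the form [strain]_[condition].hdf5 and [strain]_[condition]_results.hdf5.
    """
    name = "{}_{}".format(strain, condition)
    data_path = os.path.join(data_dir, name + ".hdf5")
    results_path = os.path.join(data_dir, name + "_results.hdf5")
    return data_path, results_path

# "/path/to/data" is a placeholder for wherever the OSF data was downloaded.
data_path, results_path = dataset_files("/path/to/data", "AML32", "moving")
```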

hdf5 is a container format that stores data of various types in an organized, hierarchical structure together with the associated metadata. A good way to inspect the nested data structure is to use an hdf5 viewer. Internally, our code converts the data to a nested dictionary. The basic structure of the input data is this:

Input/raw data

  • [strain]_[condition].hdf5
    • BrainScanner_Date_Time1
      • Neurons
        • Activity
        • Time
        • RawFluorescence
        • ...
      • Behavior
        • AngleVelocity
        • Eigenworms
        • ...
      • Centerlines
      • goodVolumes
    • BrainScanner_Date_Time2
    • BrainScanner_...
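The conversion of such a file into a nested dictionary could look roughly like the sketch below. It is not the repository's own loader: `group_to_dict` and `load_hdf5` are hypothetical names, and the code assumes the file is read with the h5py package, where groups expose `.items()` and datasets can be read in full with `[()]`.

```python
def group_to_dict(group):
    """Recursively convert an h5py.Group-like mapping into a nested dict."""
    out = {}
    for key, item in group.items():
        if hasattr(item, "items"):
            # Sub-group (e.g. Neurons, Behavior): recurse into it.
            out[key] = group_to_dict(item)
        else:
            # Dataset (e.g. Activity, Time): read the whole array into memory.
            out[key] = item[()]
    return out

def load_hdf5(path):
    """Open an hdf5 file read-only and return its content as a nested dict."""
    import h5py  # imported here so group_to_dict can be used without h5py
    with h5py.File(path, "r") as f:
        return group_to_dict(f)
```

With a file named as above, `load_hdf5(path)["BrainScanner_Date_Time1"]["Neurons"]["Activity"]` would then give the activity array of the first recording, assuming those keys exist in the file.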

Output data/results

The results data is similarly structured by dataset. Each dataset contains the output of the analyses run on it, for example PCA and LASSO:

  • [strain]_[condition]_results.hdf5
    • BrainScanner_Date_Time1
      • PCA
      • LASSO
      • ...
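Once a results file has been loaded into a nested dictionary, a quick overview of which analyses are stored for each recording can be obtained as sketched here. The function name `list_analyses` is hypothetical, and the example dictionary only mimics the structure shown above with empty placeholders.

```python
def list_analyses(results):
    """Map each recording name to the sorted names of its analysis groups."""
    return {
        recording: sorted(content.keys())
        for recording, content in results.items()
    }

# Placeholder structure standing in for a loaded results file.
example_results = {"BrainScanner_Date_Time1": {"PCA": {}, "LASSO": {}}}
overview = list_analyses(example_results)
```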

figure code

The code to make the figures is located in the subdirectory 'figures'. Modify the path that points to the data so that it matches the location you downloaded the data to. After that, running the figure scripts should produce the figures.

Analyzing new data or running models from scratch

key pieces

  • dataHandler: Reads in wholebrain data.
  • dimReduction: Contains the actual analysis functions.
  • dataQualityCheck: Runs a diverse set of analysis methods, such as PCA, linear regression or SVM. Here, output is plotted and the program is more verbose. This is mostly for checking basic features of the data before running the full analysis.
  • runAnalysis: Essentially the same as dataQualityCheck, but runs quietly and saves the input data and the output data in hdf5 files as described above.

prepare your data

You need a successfully analyzed BrainScanner dataset from the wholebrain analysis pipeline. dataFolder is the BrainScanner folder containing your dataset.

  1. In MATLAB, on tigress, run rerunDataCollectionMS(dataFolder). This creates a new version of the heatmap with some extra information, including a regularized derivative and the time synchronization between the behavior cameras and the calcium data.
  2. (Optional) If you are running the code in a different location and want to copy the datasets, copy heatdataMS.mat, centerline.mat and pointStatsNew.mat into a folder.
  3. Create a metadata file. It should be called strain_condition.dat, for example AML32_moving.dat, and contain the name of your dataFolder, with one dataset per line.
  4. Change the paths in the code (dataQualityCheck) to point to your data and to your desired output locations.
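The metadata file from step 3 is plain text, so it can be written and checked with a few lines of Python. The sketch below is an illustration only: `write_metadata` and `read_metadata` are hypothetical helpers, not functions from this repository, and the folder names are placeholders.

```python
import os
import tempfile

def write_metadata(path, data_folders):
    """Write one dataFolder name per line to a strain_condition.dat file."""
    with open(path, "w") as f:
        for folder in data_folders:
            f.write(folder + "\n")

def read_metadata(path):
    """Return (strain, condition, folders) for a strain_condition.dat file."""
    base = os.path.splitext(os.path.basename(path))[0]
    # Split at the first underscore only: "AML32_moving" -> "AML32", "moving".
    strain, condition = base.split("_", 1)
    with open(path) as f:
        folders = [line.strip() for line in f if line.strip()]
    return strain, condition, folders

# Example with placeholder folder names, written to a temporary directory.
tmp_dir = tempfile.mkdtemp()
meta_path = os.path.join(tmp_dir, "AML32_moving.dat")
write_metadata(meta_path, ["BrainScanner_Date_Time1", "BrainScanner_Date_Time2"])
strain, condition, folders = read_metadata(meta_path)
```

Reading the file back this way is a cheap sanity check that every line names a dataset folder before pointing dataQualityCheck at it.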