# Introduction to Figure 5

This document describes how to generate figure 5 of the ensemble neural coding paper. It goes into a lot of detail because it is the first notebook to document zeebeez3; eventually the information here will be split across multiple notebooks. The figure currently looks like this:

![Figure 5](images/figure5.svg)


# The preprocessing of acoustic stimuli and neural data

The tuning curves in figure 5a are built from encoders that predict either the trial-averaged spike rate or the trial-averaged overall power in an LFP band in response to a single acoustic syllable. The syllable has been quantified using the soundsig.sound.BioSound class. The actual processing sequence for the acoustic data is as follows:

1. The [zeebeez3.transforms.Biosound](https://github.com/theunissenlab/zeebeez3/blob/master/zeebeez3/transforms/biosound.py) class reads a [zeebeez3.core.Experiment](https://github.com/theunissenlab/zeebeez3/blob/master/zeebeez3/core/experiment.py) object and segments the syllables of the acoustic stimuli for a recording session.

2. The [zeebeez3.transforms.Biosound](https://github.com/theunissenlab/zeebeez3/blob/master/zeebeez3/transforms/biosound.py) class uses [soundsig.sound.BioSound](https://github.com/theunissenlab/soundsig/blob/master/soundsig/sound.py) to quantify the acoustic features of each unique syllable.

3. The [zeebeez3.transforms.Biosound](https://github.com/theunissenlab/zeebeez3/blob/master/zeebeez3/transforms/biosound.py) class writes the acoustic feature vector for each syllable to a file located at:
> /auto/tdrive/mschachter/data/*bird*/transforms/BiosoundTransform_*bird*.h5

4. After the h5 files are generated for all recording sessions across all birds, the [zeebeez3.aggregators.biosound.AggregateBiosounds](https://github.com/theunissenlab/zeebeez3/blob/master/zeebeez3/aggregators/biosound.py) object is used to aggregate the biosounds across birds. The final aggregate biosounds file is written to:

> /auto/tdrive/mschachter/data/aggregate/biosound.h5

It is this aggregate biosound.h5 file that is used by the encoder.
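
The file locations in steps 3 and 4 can be written down as a small path helper. This is only a sketch: the function names `biosound_transform_path` and `aggregate_biosound_path` are hypothetical and are not part of zeebeez3, and the root directory is simply copied from the paths quoted above.

```python
from pathlib import PurePosixPath

# Root directory copied from the paths quoted in this document.
DATA_ROOT = PurePosixPath("/auto/tdrive/mschachter/data")


def biosound_transform_path(bird_name, root=DATA_ROOT):
    """Per-bird acoustic feature file from step 3 (hypothetical helper)."""
    return root / bird_name / "transforms" / f"BiosoundTransform_{bird_name}.h5"


def aggregate_biosound_path(root=DATA_ROOT):
    """Cross-bird aggregate file from step 4 (hypothetical helper)."""
    return root / "aggregate" / "biosound.h5"


print(biosound_transform_path("GreBlu9508M"))
print(aggregate_biosound_path())
```

`PurePosixPath` is used so that the paths are formatted the same way regardless of the machine the notebook runs on; it does not touch the filesystem.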

The stimulus-conditioned spike and LFP data follow a separate processing path:

1. The raw data (with low-passed LFP and spike-sorted + multiunit spikes) is written to .nwb files. There is one .nwb file per session. *NOTE: this is a work-in-progress, and the notebook currently runs off of files that were generated from a different hdf5-based source.*

2. The raw data is read from .nwb files into a [zeebeez3.core.Experiment](https://github.com/theunissenlab/zeebeez3/blob/master/zeebeez3/core/experiment.py) object, which provides useful functions for accessing spikes, LFP, and stimulus information. *NOTE: this is also a work-in-progress*

3. The raw data is preprocessed to make stimulus-conditioned data easier to access, using [zeebeez3.transforms.stim_event.StimEventTransform](https://github.com/theunissenlab/zeebeez3/blob/master/zeebeez3/transforms/stim_event.py). The stimulus-conditioned data is located in hdf files at:
> /auto/tdrive/mschachter/data/*bird*/transforms/StimEventTransform_*bird*\_*block*\_*site*\_*hemisphere*.h5
    
   For example, this is a StimEventTransform file from a recording site that is frequently used to generate plots (*note that this file was generated on April 7, 2016*):

> /auto/tdrive/mschachter/data/GreBlu9508M/transforms/StimEvent_GreBlu9508M_Site4_Call1_L.h5

4. The spike+LFP data in the StimEventTransform file and the acoustic feature vectors in the BiosoundTransform file for a bird are then processed and joined together in the [zeebeez3.transforms.pairwise_cf.PairwiseCFTransform](https://github.com/theunissenlab/zeebeez3/blob/master/zeebeez3/transforms/pairwise_cf.py) class. This class does many things:

    * It z-scores the LFP on each electrode using the mean and standard deviation computed over time across all stimuli and trials.
    * It filters out syllables that are less than 50ms in duration, where syllable duration is defined as the duration of nonzero amplitude from onset to offset, plus 30ms of time following the offset.
    * For each stimulus syllable, the multi-trial, multi-electrode LFPs are passed to the function *compute_lfp_spectra_and_cfs*. This is a super important function! This is where the trial-averaged power spectra, trial-mean-subtracted power spectra, and cross coherencies are computed.
    * Spike rate and spike synchrony are also computed across trials.
    * All the processed LFP and spike data is saved to an hdf5 file of the form:
    
> /auto/tdrive/mschachter/data/*bird*/transforms/PairwiseCF_*bird*\_*block*\_*site*\_*hemisphere*\_raw.h5

   For example, this is the PairwiseCF file that corresponds to the previously mentioned StimEventTransform file:
   
> /auto/tdrive/mschachter/data/GreBlu9508M/transforms/PairwiseCF_GreBlu9508M_Site4_Call1_L_raw.h5
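
The z-scoring and duration-filtering steps described in step 4 can be illustrated with a minimal NumPy sketch. It is not the PairwiseCFTransform implementation: the function names are hypothetical, and the array layout `(n_segments, n_electrodes, n_samples)`, with all stimuli and trials pooled along the first axis, is an assumption made for the example. Only the 50ms threshold and the 30ms post-offset padding come from the description above.

```python
import numpy as np


def zscore_per_electrode(lfp):
    """Z-score each electrode over time, pooling all stimuli and trials.

    lfp is assumed to have shape (n_segments, n_electrodes, n_samples),
    where each segment is one trial of one stimulus (assumed layout).
    """
    mean = lfp.mean(axis=(0, 2), keepdims=True)  # one mean per electrode
    std = lfp.std(axis=(0, 2), keepdims=True)    # one std per electrode
    return (lfp - mean) / std


def keep_syllable(onset_ms, offset_ms, pad_ms=30.0, min_duration_ms=50.0):
    """Return True if the padded syllable duration is at least 50ms.

    Duration is (offset - onset) plus 30ms of time following the offset.
    """
    return (offset_ms - onset_ms) + pad_ms >= min_duration_ms


# Synthetic data, for illustration only: 6 segments, 4 electrodes, 200 samples.
rng = np.random.default_rng(0)
lfp = rng.normal(loc=5.0, scale=3.0, size=(6, 4, 200))
z = zscore_per_electrode(lfp)
```

After z-scoring, each electrode has zero mean and unit standard deviation when pooled over segments and time, which puts electrodes with different raw amplitudes on a common scale.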


    
    



# How the encoder fits the data

The class that reads the acoustic feature vectors and computes the encoders is the [zeebeez3.models.acoustic_encoder_decoder.AcousticEncoderDecoder](https://github.com/theunissenlab/zeebeez3/blob/master/zeebeez3/models/acoustic_encoder_decoder.py) class. As you may have noticed from the name, this class actually fits both the encoder and the decoders for the paper. We'll just talk about the encoder for now.

The AcousticEncoderDecoder class takes two of the files described in the previous section as input:
1. The PairwiseCF transform file for a given block.
2. The aggregate biosounds file.
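
To make the two inputs concrete, here is a sketch that bundles them for one recording. `EncoderInputs` and `encoder_inputs` are hypothetical names, not the AcousticEncoderDecoder API. The file name follows the PairwiseCF template from the previous section, filled in positionally; treating `Site4` as the block and `Call1` as the site is an assumption read off the example file name.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Root directory copied from the paths quoted in this document.
DATA_ROOT = PurePosixPath("/auto/tdrive/mschachter/data")


@dataclass(frozen=True)
class EncoderInputs:
    """The two files the encoder reads (hypothetical container)."""
    pairwise_cf_file: PurePosixPath
    aggregate_biosound_file: PurePosixPath


def encoder_inputs(bird, block, site, hemisphere, root=DATA_ROOT):
    """Build both input paths from the templates given in the text."""
    name = f"PairwiseCF_{bird}_{block}_{site}_{hemisphere}_raw.h5"
    return EncoderInputs(
        pairwise_cf_file=root / bird / "transforms" / name,
        aggregate_biosound_file=root / "aggregate" / "biosound.h5",
    )


# Assumed mapping: block="Site4", site="Call1", taken from the example file name.
inputs = encoder_inputs("GreBlu9508M", "Site4", "Call1", "L")
print(inputs.pairwise_cf_file)
print(inputs.aggregate_biosound_file)
```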


