# BIPN 162 - Fear Decoding


# Names

- Jack Celaya
- Sophia Lanaspa
- Ehsun Yazdani

# Abstract

We will be doing option 4: re-analysis of existing data using methods discussed in class to ask a question not addressed in the original paper. In the original paper, the authors recorded single neurons in the ventral pallidum (VP) while rats received different tastants and were trained to associate saccharin (normally palatable) with a negative outcome. Over time, saccharin became aversive. By aligning neural activity to licking and examining how firing patterns changed, the authors showed how VP neurons tracked palatability and learning. They used PCA to reduce the dimensionality of the data and visualize how responses shifted with learning.


# Research Question

[insert here]

## Background and Prior Work

[insert here]

# Hypothesis


[insert here]


# Data

## Data overview


- Dataset #1 
  - Dataset Name: bf-3 Data
  - Link to the dataset: https://doi.org/10.6080/K0HH6H8V 
  - Number of variables: Firing matrix (50, 16, 435) numpy array
  - Description: Recordings from ventral pallidum neurons in male rats undergoing fear discrimination

## Neural Dataset


To begin working with the data, we import some basic libraries for later use. The data is originally stored as a MATLAB (.mat) file, so our first goal is to convert the multi-dimensional data into a .npz file. This lets us do the analysis in Python, extract the firing rates of the recorded neurons, and work with them as a 3D NumPy array.


In [1]:
# Common libraries that will be used
import numpy as np
import pandas as pd 
import os
import scipy.io
import matplotlib.pyplot as plt

In [None]:
# First, extract everything and save it as a single .npz file.
# cube.mat contains the researchers' compiled data in .mat format.

def load_mat_file(filepath):  # use scipy to open the .mat file and extract the relevant data
    """Load a .mat file and return the VPcube structure."""
    mat_contents = scipy.io.loadmat(filepath, simplify_cells=True)
    return mat_contents['cube']

def extract_all_fields(cube):
    """Extract all relevant fields from the VPcube MATLAB structure."""
    return {
        'fire': cube.get('fire'),  # firing data (z-scored, diff, raw, etc.)
        'cer': cube.get('cer'),    # suppression ratios
        'poke': cube.get('poke'),  # nose poke behavior
        'name': cube.get('name'),  # neuron/session identifiers
        'wave': cube.get('wave'),  # waveform shape
        'half_duration': cube.get('halfduration'),
        'amplitude_ratio': cube.get('amplituderatio'),
        'tag': cube.get('tag')     # tag metadata
    }

def save_to_npz(output_path, data_dict):
    """Save all extracted fields to a compressed .npz file."""
    np.savez_compressed(output_path, **data_dict)
    return output_path

# Run the extraction and save process on the downloaded file.
# If this fails, adjust the file paths below to match where the files live on your machine.
file_path = '/cube.mat'
cube = load_mat_file(file_path)
full_data = extract_all_fields(cube)
output_file = '/FearDecoding/full_dataset.npz'
saved_path = save_to_npz(output_file, full_data)

saved_path

These three functions load the MATLAB file, extract the variables we are interested in, and save them to a .npz file.
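A small round trip on toy data illustrates what this kind of save produces on disk. This is a minimal sketch, not the project's pipeline: the `toy` dictionary and the file name `toy_dataset.npz` are hypothetical stand-ins for the fields extracted from `cube.mat`. It shows that dict-valued fields are stored as 0-d object arrays, which is why loading them needs `allow_pickle=True` followed by `.item()`.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for the nested structure extracted from the .mat file
toy = {
    'fire': {'z': {'ms250': {'pellet': np.zeros((2, 3, 4))}}},
    'name': np.array(['unit_a', 'unit_b']),
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'toy_dataset.npz')
    np.savez_compressed(path, **toy)

    # Dict-valued fields are pickled into 0-d object arrays,
    # so reading them back requires allow_pickle=True.
    with np.load(path, allow_pickle=True) as loaded:
        fire_wrapped = loaded['fire']
        names = loaded['name']

# .item() pulls the original dict back out of the 0-d object array
fire_unwrapped = fire_wrapped.item()
pellet_shape = fire_unwrapped['z']['ms250']['pellet'].shape

print(fire_wrapped.shape, fire_wrapped.dtype)
print(pellet_shape)
```

Plain arrays such as `names` come back directly as arrays, while nested dictionaries need the extra unwrapping step.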

It is important to first check that the correct data was transferred. Reaching the spike train data requires indexing through several nested levels of the loaded structure.

In [13]:
# load in the dataset
data = np.load("full_dataset.npz", allow_pickle=True)

# This loads in our neuron spike tensor
fire = data['fire']

# Unwrap the dict stored inside the 0-d object array
fire_raw = fire.item()
print(fire_raw.keys())

# Unwrap the z-scored data from 'z'
zscore_data = fire_raw['z']  # still a dict rather than an array, so there are more keys below
print(zscore_data.keys())  # next level of keys


dict_keys(['CV', 'CS', 'raw', 'diff', 'z'])
dict_keys(['s1', 'ms500', 'ms250', 'ms100'])


While searching through the different layers, printing the shape of an object did not return array dimensions. This indicated that the object was still a dictionary with more keys to search through. From the data description, we know that the firing rates are stored under the z-score (`'z'`) branch.

The second dictionary of keys (`s1`, `ms500`, `ms250`, `ms100`) contains the time resolutions, in seconds or milliseconds, used in the experiment; these appear to be bin widths.
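Rather than unwrapping one level at a time, the whole hierarchy can be listed in one pass. The helper below is a sketch under stated assumptions: `describe_nested` is a hypothetical function written for illustration, and `toy_fire` is a made-up dictionary that only imitates the key layout printed above.

```python
import numpy as np

def describe_nested(obj, prefix=''):
    """Return (path, summary) pairs for every leaf of a nested dict."""
    rows = []
    if isinstance(obj, dict):
        # Recurse into each key, building a slash-separated path
        for key, value in obj.items():
            child = f'{prefix}/{key}' if prefix else str(key)
            rows.extend(describe_nested(value, child))
    elif isinstance(obj, np.ndarray):
        rows.append((prefix, f'ndarray shape={obj.shape} dtype={obj.dtype}'))
    else:
        rows.append((prefix, type(obj).__name__))
    return rows

# Made-up structure that mimics the key layout, not the real data
toy_fire = {
    'raw': {'ms250': {'pellet': np.zeros((2, 3, 4))}},
    'z': {'ms250': {'pellet': np.zeros((2, 3, 4)), 'interval': np.zeros((2, 3, 4))}},
}

rows = describe_nested(toy_fire)
for path, summary in rows:
    print(path, '->', summary)
```

Applied to the real data, the call would be `describe_nested(fire_raw)`, which lists every array and its shape without guessing key names.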

In [14]:
firing_matrix = zscore_data['ms250']
print(type(firing_matrix))
print(firing_matrix.keys())

firing_matrix = zscore_data['ms250']['pellet']
print(type(firing_matrix))
print(np.shape(firing_matrix))  # should be (trials, time_bins, neurons)


<class 'dict'>
dict_keys(['interval', 'pellet', 'itiPokeCess'])
<class 'numpy.ndarray'>
(50, 16, 435)


### Check for Null/Missing and Duplicates

[explain here]

In [5]:
# code here
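One way the check named in this section's heading could look is sketched below on synthetic data. It assumes the `(trials, time_bins, neurons)` axis order noted earlier, and the array `toy` with its injected NaN and duplicated neuron is fabricated purely for illustration; it says nothing about what the real firing matrix contains.

```python
import numpy as np

rng = np.random.default_rng(0)
toy = rng.normal(size=(5, 4, 3))  # hypothetical (trials, time_bins, neurons)
toy[0, 1, 2] = np.nan             # inject one missing value into neuron 2
toy[:, :, 1] = toy[:, :, 0]       # make neuron 1 an exact copy of neuron 0

# Missing values: count NaNs for each neuron across trials and time bins
nan_per_neuron = np.isnan(toy).sum(axis=(0, 1))

# Duplicates: compare every pair of neurons across all trials and bins
n_neurons = toy.shape[-1]
duplicate_pairs = [
    (i, j)
    for i in range(n_neurons)
    for j in range(i + 1, n_neurons)
    if np.array_equal(toy[:, :, i], toy[:, :, j], equal_nan=True)
]

print('NaNs per neuron:', nan_per_neuron.tolist())
print('Duplicate neuron pairs:', duplicate_pairs)
```

The pairwise loop is quadratic in the number of neurons, which should stay manageable for a few hundred units.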

### Create a new dataset with our variables of interest

In [6]:
# code here

[explain here]


## Exploratory Data Analysis 

[explain here]

In [7]:
# code here

## Support Vector Machine

[explain here]

In [8]:
# code here

## Random Forest

[explain here]

In [9]:
#code here

## Logistic Regression

[explain here]

In [10]:
# code here

## Results

[explain here]

# Discussion and Conclusion

[insert here]


# Team Contributions

Jack:


Sophia:


Ehsun: 

