# Fiber Photometry Demo

This notebook contains exercises, split across multiple lessons, that span the following primary topics:
* strings
* conditionals
* lists
* pandas
* dictionaries

The exercise modules are centered around analysis of neuroscience fiber photometry data, which involves recording changes in brightness of activity-dependent fluorescent proteins that are expressed in cells of interest. Typically the activity data stream consists of a single vector time-series (i.e. fluorescence intensity measurements acquired across time). For this particular dataset (acquired by Dr. Adam Gordon-Fennell), animals were periodically allowed to consume liquid rewards in "access periods." Both the onsets of the access periods and animal licks were recorded. The data consist of the following files:

* *_events.csv : table where rows contain the event type (access period or lick) and time of occurrence (in seconds)
* *_streams_session.csv : table where rows are samples acquired across time for the photometry recording. For each sample, the table details the channel, time, and fluorescence value
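
To make the two table layouts concrete, here is a minimal sketch that builds miniature stand-ins for them. The column names (`event_id_char`, `event_ts`, `channel`, `time`, `signal`) are the ones used later in this notebook; every value is invented for illustration and is not taken from the real recording.

```python
import pandas as pd

# Hypothetical miniature events table: one row per behavioral event
events = pd.DataFrame({
    "event_id_char": ["access_period", "lick", "lick"],
    "event_ts": [10.0, 10.4, 10.9],  # seconds
})

# Hypothetical miniature streams table: one row per sample per channel
streams = pd.DataFrame({
    "channel": ["405", "465", "405", "465"],
    "time": [0.00, 0.00, 0.05, 0.05],  # seconds
    "signal": [101.2, 250.7, 101.3, 251.9],  # fluorescence (arbitrary units)
})

print(events)
print(streams)
```

Note that the streams table is in "long" format: the samples from both channels are stacked in one `signal` column and are told apart by the `channel` column.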

The exercises follow a sequence: loading the behavioral and neural stream data from tabular files (feather or CSV), examining the data and extracting the relevant data streams, preprocessing the data, and visualizing the activity. In the second half of the exercises, we incorporate the behavioral data and generate activity plots centered around the behavioral events.


In [None]:
import os
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

In [None]:
fdir = r'C:\Users\stuberadmin\Documents\GitHub\NAPE_python_tutorials\sample_data'
fname = '2022_06_10_abb12'

## Conditionals exercise 1: Detecting and loading data in different file formats.

The cell below defines the absolute file path (`epoc_data_path`) for the behavioral data. Currently we are loading in the .feather file, which is just another file format that can store tabular data, as opposed to csv or xlsx formats.

In [None]:
# Note: os.path.join joins path components with the separator appropriate for the operating system, producing a path that file-loading functions can use
epoc_data_path = os.path.join(fdir, f'{fname}_events.feather') 
epoc_data_path

In the sample_data folder, we also have a csv version. Try changing the behavioral data path above to the csv version, then edit the conditional in the cell below to load the csv.

Notice that we are using the pandas package (imported at the top of this notebook under the abbreviation `pd`) to load the tabular data. We will learn more about it later, but pandas is a package that adds tabular data processing capabilities to Python.

In [None]:
if 'feather' in epoc_data_path:
    print('loading feather file')
    epoc_data = pd.read_feather(epoc_data_path)
elif 'csv' in epoc_data_path:
    print('loading csv file')
    epoc_data = pd.read_csv(epoc_data_path)
else:
    print('Not a valid file type')
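
The substring checks above (`'csv' in epoc_data_path`) work for this dataset, but they would also match a folder name that happens to contain "csv". As a hedged alternative, the sketch below checks only the file extension using `os.path.splitext`. The helper name `detect_table_format` and the path `csv_backups/notes.txt` are hypothetical and exist only for illustration.

```python
import os

def detect_table_format(path):
    """Return 'feather' or 'csv' based on the file extension, otherwise None."""
    ext = os.path.splitext(path)[1].lower()  # e.g. '.csv'
    if ext == ".feather":
        return "feather"
    elif ext == ".csv":
        return "csv"
    return None

print(detect_table_format("2022_06_10_abb12_events.feather"))
print(detect_table_format("2022_06_10_abb12_events.csv"))
# 'csv' appears in the folder name, but the file itself is not a csv
print(detect_table_format("csv_backups/notes.txt"))
```

The same `if`/`elif`/`else` structure from the exercise applies; only the condition being tested changes.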
    

In [None]:
# Calling the .head() method on our pandas dataframe will show the top 5 entries.
epoc_data.head()

## Conditionals exercise 2: Checking contents of lists using conditionals and boolean operators

The preview of the behavioral data table in the exercise above doesn't tell us how many behavioral conditions there are (event_id_char column), since it only shows the first few of many rows. We can have Python look through all entries in the event_id_char column and tell us which unique entries it contains by using the .unique() method. We will cover what a method is in later lessons; for now, just know that .unique() returns all unique entries in a pandas series (a single dataframe column).

In [None]:
event_conditions = list(epoc_data['event_id_char'].unique()) # grab unique condition names
event_conditions

In [None]:
if 'access_period' in event_conditions and 'lick' in event_conditions:
    print('Both conditions present')
elif 'access_period' in event_conditions or 'lick' in event_conditions:
    print('One out of the two conditions present')

## Pandas exercise 1: Exploring contents of a dataframe

In [None]:
epoc_data.nunique() # provides count of unique entries in each column
epoc_data.info() # can identify if there is missing data

In [None]:
epoc_data.head()

In [None]:
epoc_data.tail()

In [None]:
epoc_data.iloc[95:105]

In [None]:
# We can also get a sense of how many events there are. The `shape` attribute returns (number of rows, number of columns); each row is one event.
epoc_data.shape

## Pandas exercise 2: Obtaining data from a dataframe column based on information in another column.

We are working with neural activity data that was synchronized to events that happened during the recording. Our behavioral dataframe contains event times (event_ts) for two different types of behavioral events (access periods and licks). Here, let's figure out how to extract the event times for a given condition.

In [None]:
# grab event times for condition 1
event_one = epoc_data['event_ts'][epoc_data['event_id_char'] == 'access_period'].values
event_one
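
The selection above works by building a boolean mask (one True/False per row) and using it to keep only the matching rows. Here is a minimal sketch of the same idea on an invented four-row table; it uses `.loc[mask, column]`, which is an equivalent way of writing the selection, and `.to_numpy()` in place of `.values`.

```python
import pandas as pd

# Invented toy table with the same column names as epoc_data
toy = pd.DataFrame({
    "event_id_char": ["access_period", "lick", "lick", "access_period"],
    "event_ts": [5.0, 5.3, 5.8, 35.0],
})

# True wherever the row belongs to the condition of interest
mask = toy["event_id_char"] == "access_period"

# Keep the event_ts values only for rows where the mask is True
access_times = toy.loc[mask, "event_ts"].to_numpy()

print(mask.tolist())
print(access_times)
```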

## Working with time-series data

## Pandas, math, and functions combined exercise: Correcting photometry traces with isosbestic channel

In [None]:
data_path = os.path.join(fdir, f'{fname}_streams_session.feather')

data = pd.read_feather(data_path)
data

In [None]:
fiber_id = '1' # 1 is NAcShell_medial; 2 is NAcShell_lateral

data_405 = data['signal'][data['channel'] == '405'].values
data_465 = data['signal'][data['channel'] == '465'].values



In [None]:
tvec = data['time'][data['channel'] == '405'].values 
tvec_zeroed = tvec - data['time'][0] # for plotting trace starting from time 0

fs = 1.0/np.mean(np.diff(tvec))

print(tvec_zeroed)

In [None]:
def controlFit(control, signal):
    
    # GuPPY version
    # function to fit control channel to signal channel

    p = np.polyfit(control, signal, 1)
    arr = (p[0]*control)+p[1]
    return arr


def deltaFF(signal, control):
    
    # function to compute deltaF/F using fitted control channel and filtered signal channel

    res = np.subtract(signal, control) # numerator: F(t) - F0, where F0 is the fitted control
    normData = np.divide(res, control) # (F(t) - F0) / F0
    normData = normData*100 # express as a percentage

    return normData

fitted_control = controlFit(data_405, data_465) # this function would be helpful if we wanted to correct signals from multiple fluorophore channels
corrected_data = deltaFF(data_465, fitted_control) # this function would be helpful every time we needed to compute dF/F
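
To see why fitting the control channel to the signal channel helps, here is a minimal sketch on synthetic data. All signals are invented: both channels share a slow decay (standing in for an artifact such as bleaching), and only the 465 channel carries an activity transient near t = 30 s. The steps mirror `controlFit` and `deltaFF` above, written inline so the example is self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 60, 1200)  # 60 s of hypothetical recording

# Slow decay shared by both channels (invented artifact)
shared = np.exp(-t / 120)
control_405 = 100 * shared + rng.normal(0, 0.1, t.size)

# Activity transient present only in the 465 channel
transient = 5 * np.exp(-((t - 30) ** 2) / 2)
signal_465 = 200 * shared + 20 + transient + rng.normal(0, 0.1, t.size)

# Linear fit: signal is approximated by slope * control + intercept
slope, intercept = np.polyfit(control_405, signal_465, 1)
fitted = slope * control_405 + intercept

# dF/F in percent: (F - F0) / F0 * 100, with F0 = fitted control
dff = (signal_465 - fitted) / fitted * 100

peak_time = t[np.argmax(dff)]
print(f"dF/F peaks near t = {peak_time:.1f} s")
```

Because the shared decay is captured by the fitted control, it largely cancels in the subtraction, and the remaining dF/F trace is dominated by the transient.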

In [None]:
plt.figure(figsize=(10,7))
plt.plot(tvec_zeroed, data_405)
plt.plot(tvec_zeroed, data_465)
plt.plot(tvec_zeroed, fitted_control)
plt.plot(tvec_zeroed, corrected_data)
plt.xlabel('Time (s)')
plt.ylabel('Fluorescence')
#plt.xlim([170, 400]) # we can zoom in to see how motion artifacts can be corrected for
plt.legend(['405', '465', 'fit curve', 'corrected']);

### Dictionary Exercise: Dividing event times up into their respective conditions. These event times can then be organized into a dictionary for storage purposes.

In [None]:
# grab event times for all conditions via for loop
event_times_dict = {}
for event_name in epoc_data['event_id_char'].unique():
    event_times_dict[event_name] = epoc_data['event_ts'][epoc_data['event_id_char'] == event_name].values

event_times_dict
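
For comparison, the same condition-to-times dictionary can be built with pandas `groupby`, which splits the table by condition in one pass. This is a sketch of an alternative to the loop above, shown on an invented toy table rather than the real data.

```python
import pandas as pd

# Invented toy table with the same column names as epoc_data
toy = pd.DataFrame({
    "event_id_char": ["access_period", "lick", "lick", "access_period"],
    "event_ts": [5.0, 5.3, 5.8, 35.0],
})

# Iterating over a groupby yields (group name, sub-dataframe) pairs
times_by_condition = {
    name: group["event_ts"].to_numpy()
    for name, group in toy.groupby("event_id_char")
}

print(times_by_condition.keys())
print(times_by_condition["lick"])
```

Either approach produces a dictionary whose keys are condition names and whose values are arrays of event times.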

### Convert behavioral event times to samples: to prepare for indexing and snipping out trial activity windows.

#### Requires knowledge of: dictionaries and functions

The photometry data vector itself carries no timing information: each item in the vector is simply a "sample" in the recording, identified only by its position.

Our behavioral events, on the other hand, are in units of time. Accordingly, to extract trial snippets from the photometry data, we need to figure out which sample corresponds to a given event time.

We can use this line of code, which finds the index of the time-vector entry closest to the event time: `np.argmin(abs(tvec - time))`
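
Here is a minimal sketch of the nearest-sample lookup on an invented five-sample time vector. The second half shows a vectorized variant that converts several event times at once using NumPy broadcasting; that variant is an illustration and is not used elsewhere in this notebook.

```python
import numpy as np

toy_tvec = np.array([0.0, 0.1, 0.2, 0.3, 0.4])  # invented sample times (s)
event_time = 0.26

# Absolute distance from the event to every sample; the smallest one wins
nearest = np.argmin(np.abs(toy_tvec - event_time))
print(nearest, toy_tvec[nearest])

# Vectorized variant: rows are events, columns are samples
event_times = np.array([0.26, 0.22])
distances = np.abs(toy_tvec[None, :] - event_times[:, None])
nearest_all = np.argmin(distances, axis=1)
print(nearest_all)
```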

In [None]:
print(f'First event from condition one: {event_one[0]}')

print(f"The sample that corresponds to {event_one[0]} is {np.argmin(abs(tvec - event_one[0]))}")

sample_idx = np.argmin(abs(tvec - event_one[0])) # computed, so this still works if the dataset changes
print(f"Sample {sample_idx}'s time: {tvec[sample_idx]}")

### Let's extract the activity trace from one trial for demonstration purposes

In [None]:
# initialize start/end times for trial window
event_window_time = np.array([-1, 10])
event_window_samples = (event_window_time*fs).astype(int) # samples are integers

# generate vector that maps time onto samples in the event trial
trial_window_num_samples = event_window_samples[1] - event_window_samples[0]
tvec_trial = np.linspace(event_window_time[0], event_window_time[1], trial_window_num_samples)

# generate vector of every sample between start and end samples
# then offset to specific event's window by adding the event sample
trial_template_indices = np.arange(event_window_samples[0], event_window_samples[1])
trial_samples = trial_template_indices + np.argmin(abs(tvec - event_one[0]))

plt.plot(tvec_trial, corrected_data[trial_samples])
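
The window arithmetic above is easier to follow with small round numbers. The sketch below assumes a hypothetical sampling rate of 10 Hz, a window from -1 s to 2 s, and an event at sample 50; none of these values come from the real dataset.

```python
import numpy as np

toy_fs = 10.0  # hypothetical sampling rate (Hz)
window_time = np.array([-1, 2])  # seconds relative to the event

# Convert the window edges from seconds to whole samples
window_samples = (window_time * toy_fs).astype(int)

# Template of sample offsets relative to the event (end is exclusive)
template = np.arange(window_samples[0], window_samples[1])

# Shift the template so it is centered on a specific event's sample
event_sample = 50
toy_trial_samples = template + event_sample

print(window_samples)
print(toy_trial_samples[0], toy_trial_samples[-1], toy_trial_samples.size)
```

The template is computed once and reused for every event; only the offset (`event_sample`) changes from trial to trial.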

### Dictionary lesson: To analyze all trials, let's do some organization of the event conditions and trial times

In [None]:
# turn the above code into a function
def get_tvec_sample(tvec, time):
    return np.argmin(abs(tvec - time))

event_samples_dict = {}
for condition_name in epoc_data['event_id_char'].unique(): # loop through conditions
    
    print(condition_name)
    tmp_list = []
    
    # go through each event and compute the sample/frame that it occurred on (b/c the list is currently in seconds)
    for event in event_times_dict[condition_name]: # loop through events
        tmp_list.append(get_tvec_sample(tvec, event))
        
    # once we go through converting each time to sample, and add to a list iteratively,
    # store that list to its corresponding condition in the dictionary
    event_samples_dict[condition_name] = tmp_list

In [None]:
event_samples_dict.keys()

In [None]:
event_samples_dict['access_period']

### Numpy and dictionary lesson: Let's analyze data from all trials

In [None]:
# exclude trials whose windows extend outside of the recording, then extract data from each trial

data_trial_dict = {}
for condition_name in epoc_data['event_id_char'].unique(): # loop through conditions
    
    # keep only events whose full trial window lies inside the recording
    valid_samples = [s for s in event_samples_dict[condition_name]
                     if s + trial_template_indices[0] >= 0
                     and s + trial_template_indices[-1] < len(corrected_data)]
    
    # initialize numpy array to populate with trial-extracted data
    tmp_trial_array = np.empty([len(valid_samples), trial_window_num_samples])
    
    # use trial index template, offset with event sample, and extract trial data
    for idx, event_sample in enumerate(valid_samples): # loop through events
        tmp_trial_array[idx,:] = corrected_data[trial_template_indices + event_sample]
    
    # once data are fully populated, add to dictionary
    data_trial_dict[condition_name] = tmp_trial_array

In [None]:
cond_names = list(data_trial_dict.keys())

plt.imshow(data_trial_dict[cond_names[0]], aspect='auto')
plt.ylabel('Trial', fontsize=14)
plt.xlabel('Sample', fontsize=14)
cbar = plt.colorbar()
cbar.set_label('Fluorescence', fontsize=14)

In [None]:
plt.plot(tvec_trial, np.mean(data_trial_dict[cond_names[0]], axis=0))
plt.plot(tvec_trial, np.mean(data_trial_dict[cond_names[1]], axis=0))
plt.ylabel('Fluorescence', fontsize=14)
plt.xlabel('Time', fontsize=14)
plt.legend(cond_names, fontsize=14)