## Calcium Analysis project

*Tutor: Elena Nicollin*

In this project, you will **analyze calcium imaging data recorded in a freely roaming mouse**.

The goal of this project is to obtain a visual representation of the **activities of cells** recorded in the *CA1 layer of the hippocampus*, as a function of the **position of the mouse** in an open field.



In [None]:
# Import the Python libraries that provide the functions we need
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd     # pandas handles dataframes; here it is only used to load the data
import scipy.signal as signal

# Load the data
CaData = np.array(pd.read_csv("OF1_2025_traces.csv", header = None))
SpatialData = np.array(pd.read_csv("OF1_2025_positions.csv", header = None))
ScopeFrames = np.array(pd.read_csv("OF1_2025_MiniscopeFrames.csv", header = None))[:,0]
CamFrames = np.array(pd.read_csv("OF1_2025_CameraFrames.csv", header = None))[:,0]

The four datasets loaded are:
- **CaData**: The preprocessed data from the calcium imaging. Values are the average calcium fluorescence intensity, for each neuron, at each time frame recorded with the calcium miniscope.
- **SpatialData**: The x and y coordinates of the mouse in the open field, at each time frame recorded with the camera. Unit is millimeters.
- **ScopeFrames**: Time points of the frames recorded with the calcium miniscope. Unit is seconds.
- **CamFrames**: Time points of the frames recorded with the camera. Unit is seconds.

### Overview of the datasets

In [44]:
# Show the first 5 lines of the calcium imaging data

# Show the first 5 lines of the position data

# Show the first 5 lines of the miniscope frames data 

# Show the first 5 lines of the camera frames data 


In [45]:
# Get the number of rows and columns of each dataset.



### Spatial data

In [46]:
# From the SpatialData dataset, figure out the dimensions of the open field



In [47]:
# Visualize the trajectory of the mouse in the open field over the entire recording



### Calcium data

In [None]:
# How many neurons are there?
# Visualize the calcium recording of one neuron of your choice



In [None]:
# Zoom in on the signal to clearly display a single calcium event



The calcium fluorescence signal is an indirect readout of cellular activity. We therefore need to process it to obtain a signal closer to the real neuronal activity.

As you can see in the isolated event above, a single calcium event has a fast rise and a slow decay. We will ignore the fast rise for simplicity and attempt to correct the slow decay in order to obtain a narrower event, similar to a single neuronal spike. To do this, we will perform a **deconvolution**.


<img style="float: center;" width="720" src="https://i.imgur.com/uKWFXDU.png">
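The idea can be illustrated on synthetic data before working on the real traces. The sketch below is not the solution for the recorded data: the spike train, the kernel length, and the time constant `tau` are made-up values chosen for illustration. It smears three narrow spikes with a normalized decaying exponential, then recovers them with `scipy.signal.deconvolve`.

```python
import numpy as np
import scipy.signal as signal

# Synthetic spike train (hypothetical, not the recorded data): three spikes of different sizes
spikes = np.zeros(200)
spikes[[20, 90, 150]] = [1.0, 0.5, 2.0]

# Assumed decay kernel: a decreasing exponential starting at its peak,
# with a hypothetical time constant of 10 frames
tau = 10.0
kernel = np.exp(-np.arange(50) / tau)
kernel = kernel / kernel.sum()   # normalize so that the kernel sums to 1

# Simulated fluorescence: every spike is followed by a slow decay
fluorescence = np.convolve(spikes, kernel)

# Deconvolution returns the quotient (the narrow events) and a remainder
recovered, remainder = signal.deconvolve(fluorescence, kernel)

print(len(fluorescence), len(kernel), len(recovered))
```

Note that `signal.deconvolve` returns a quotient of length `len(signal) - len(divisor) + 1`, so the output is always shorter than the input whenever the kernel has more than one sample.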

In [None]:
# Select a calcium event that you think represents one single spike from the neuron.
# Isolate this event so that it begins exactly at the peak.
# Save this event as its own short array, and visualize it. It should look like a decreasing exponential



In [50]:
# Fit an exponential function to your event signal.




# To verify, you can visualize your event and the fitted curve on one plot.


In [None]:
# Normalize the fitted curve such that its total sum is equal to 1



In [None]:
# Use this normalized curve to deconvolve the calcium signal of each neuron, using the deconvolve function from scipy.



In [None]:
# Deconvolution reduces the length of a signal. How much shorter are your new calcium signals? Crop the end of the ScopeFrames data accordingly.



In [None]:
# Visualize the original signal and the deconvolved signal of one neuron. Make sure you can still clearly see the activity peaks.



In [54]:
# This signal still has some noisy artifacts. Process the deconvolved data such that all data points below the threshold of your choice are set to zero.




In [55]:
# Visualize the thresholded (denoised) data of one neuron




### Aligning the spatial data and the calcium data

The spatial data was recorded with a camera, while the calcium data was recorded with a miniscope. Because the two devices record independently, their frames do not necessarily line up in time, so the two datasets must be realigned.

As a reminder, *ScopeFrames* contains the times of the frames recorded with the miniscope, while *CamFrames* contains the times of the camera frames.
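Nearest-frame matching can be sketched on a toy example. The timestamps and data below are hypothetical, and the names `ref_times`, `other_times`, and `other_data` are placeholders. Which real recording plays which role is left to the exercise.

```python
import numpy as np

# Hypothetical timestamps in seconds (not the recorded data)
ref_times = np.array([0.10, 0.20, 0.30])
other_times = np.array([0.00, 0.03, 0.06, 0.09, 0.12, 0.15,
                        0.18, 0.21, 0.24, 0.27, 0.30, 0.33])

# Hypothetical data sampled at other_times: one row per frame, two columns
other_data = np.arange(24).reshape(12, 2)

# For each reference frame, store the index of the closest non-reference frame
closest = np.zeros(len(ref_times), dtype=int)
for i, t in enumerate(ref_times):
    closest[i] = np.argmin(np.abs(other_times - t))

# Keep only the matched rows: one row per reference frame
aligned = other_data[closest]

print(closest)        # [ 3  7 10]
print(aligned.shape)  # (3, 2)
```

After this step, `aligned` has exactly as many rows as there are reference frames, which is the property checked at the end of this section.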

In [56]:
# Get the time of the first frames of each dataset. Which recording was started first, miniscope or camera? 



# Do the same with the last frames recorded. Which recording was stopped first?



In [57]:
# Calculate the frame interval of the miniscope, and the frame interval of the camera.



In [None]:
# Based on the results from the two above cells, which recording should you use as the reference to align the frames?
# Create a numpy array filled with zeros, of a size equal to the number of frames from your reference recording.




# For each of your reference frames, find the index of the non-reference frame that is closest in time, and store it in your new array.




In [None]:
# You now have an array referencing which data points to keep from your data.
# You will store these data points in a new 2D array.
# Initialize this 2D array filled with zeros. How many rows and columns should it have?



# At each row in your new 2D array, save the corresponding row of your original dataset, using the array of indexes you obtained in the previous cell.



In [None]:
# You can now visualize your new data to ensure you didn't lose any major information




In [61]:
# If this alignment was successful, you should now have the same number of data points in your spatial data and in your calcium data.
# Get the number of rows and columns of each to make sure



### Spatial representation of the neuronal activities

Lastly, we will use the matplotlib.pyplot function [`imshow`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.imshow.html) to visualize the average neuronal activity of a neuron per area.

To do so, we must first construct a 2D matrix that will contain these values of average neuronal activities. Let's do this first for just one neuron.
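As a rough illustration of the target result, here is a minimal sketch on made-up data. The positions, the activity values, the 70 mm field size, and the names `positions`, `activity`, and `mean_map` are all assumptions, not properties of the recording. Leaving unvisited squares as `NaN` is one possible design choice among others.

```python
import numpy as np

# Hypothetical aligned data (not the recording): x, y in mm and one activity value per frame
positions = np.array([[10, 10], [20, 30], [50, 10], [60, 20], [10, 50], [50, 60]])
activity = np.array([1.0, 3.0, 2.0, 4.0, 5.0, 0.0])

bin_size = 35
field_size = 70                                  # assumed field size for this toy example
x_edges = np.arange(0, field_size, bin_size)     # lower edges of the squares along x
y_edges = np.arange(0, field_size, bin_size)     # lower edges of the squares along y

# One row per y square, one column per x square; NaN marks squares never visited
mean_map = np.full((len(y_edges), len(x_edges)), np.nan)

for ix, x_min in enumerate(x_edges):
    for iy, y_min in enumerate(y_edges):
        in_square = ((positions[:, 0] >= x_min) & (positions[:, 0] < x_min + bin_size) &
                     (positions[:, 1] >= y_min) & (positions[:, 1] < y_min + bin_size))
        if in_square.any():
            mean_map[iy, ix] = activity[in_square].mean()

print(mean_map)
```

The loop counters `ix` and `iy` from `enumerate` keep track of where each value belongs in the matrix. Rows are indexed by y and columns by x so that the matrix can be passed directly to `imshow`; note that `imshow` draws row 0 at the top unless `origin='lower'` is given.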

In [None]:
# Let's start by extracting the mean activity of neuron 0 when the mouse is situated in the bottom left corner of the open field (x<35 and y<35)
neuron_index = 0
x_max = 35
y_max = 35

# Get the indices of the rows from your spatial data where the mouse x coordinate is < x_max and the y coordinate is < y_max


# Use these index values to extract the activity of your neuron at these same time points, and compute the average activity.


In [None]:
# We need to repeat the process for the rest of the open field, which must be sliced into a grid. Keep squares of size 35 for now.

# Knowing the size of the field, create an array of the x grid values, and an array of the y grid values.



# Create an empty 2D matrix with the same dimension as your grid: this 2D-array will store the average calcium activities.



# 1. To go over each square in the grid, we will need two loops: one loop for the x values; and another one for the y values, inside the first loop
# 2. For each square in the grid, the mouse must not only be <xmax and <ymax, but also >xmin and >ymin. In your inner loop, get the four values bordering your square (xmax, ymax, xmin, ymin).
# 3. Lastly, for each square in the grid, extract the index values of the spatial data when the mouse is in the correct space, and use these to compute the average activity of the neuron
# 4. Save that value in your 2D matrix. How can you keep track of where to save each value in this 2D array?




In [None]:
# Plot the result using the plt.imshow() function: pass your 2D matrix of average activities as its argument

plt.imshow()
plt.title('neuron index ' + str(neuron_index) )
plt.colorbar()
plt.axis('off')
plt.show()

In [None]:
# You can now add another outer loop to repeat the process for each neuron.



In [None]:
# Try playing with the size of the grid squares. What happens if they're too small? Too large?
# You can also check that your representation is correct by giving your grid squares different sizes along x and y (this helps to tell the two axes apart visually).