# Problem 2: Summary images.
Load the calcium imaging video in the file TEST_MOVIE_00001-small.tif. This file contains the raw fluorescence video in TIF format, but without the motion. One of the most common ways that people analyze functional data is to identify individual cells in a “summary image”. A summary image condenses an entire video sequence into a single image, in which each pixel summarizes the entire time trace at that pixel's location in the video.

## Part B
What would you expect a good statistic for a summary image to capture? What other statistics would you think could work? Try out 1-2 others: were you right?

A good summary statistic should capture the essence of each cell's activity in the calcium imaging video. Beyond locating the cells in the data, it should give a visual indication of how active each cell was during the recording. The mean and median give a good view of all the cells, but watching the video and comparing against the variance plot makes it apparent that they do not distinguish the more active cells from the less active ones. The variance summary, in contrast, emphasizes the cells that turned on and off the most but captures the remaining cells poorly. Another statistic one could use is the range: the difference between the maximum and minimum value of each pixel across all frames. One could also look at the maximum or the minimum of each pixel alone, to see the full activation potential of a cell.
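
The statistics discussed above can all be computed per pixel by reducing along the time axis. The following is a minimal sketch, not the code used for this assignment: the helper name `summary_images` and the synthetic `toy` movie are hypothetical stand-ins, and the movie is assumed to be laid out as (frames, height, width).

```python
import numpy as np

def summary_images(movie):
    """Collapse a (frames, height, width) movie into per-pixel summary images.

    Hypothetical helper for illustration only.
    """
    # Work in floating point so integer pixel types cannot overflow or wrap
    movie = np.asarray(movie, dtype=np.float64)
    img_max = movie.max(axis=0)
    img_min = movie.min(axis=0)
    return {
        "mean": movie.mean(axis=0),
        "median": np.median(movie, axis=0),
        "variance": movie.var(axis=0),
        "max": img_max,
        "min": img_min,
        "range": img_max - img_min,
    }

# Synthetic stand-in for the real video: 50 frames of 4x4 pixels of noise
rng = np.random.default_rng(0)
toy = rng.normal(loc=100.0, scale=1.0, size=(50, 4, 4))
toy[20:25, 1, 2] += 30.0  # one "cell" pixel with a brief transient

images = summary_images(toy)
# Location of the brightest pixel in the range and variance images
peak_range = np.unravel_index(images["range"].argmax(), images["range"].shape)
peak_var = np.unravel_index(images["variance"].argmax(), images["variance"].shape)
```

In this toy movie the pixel with the transient stands out in both the range and the variance image, which is the behavior one would hope for from an activity-sensitive statistic.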

In [10]:
import plotly
import plotly.graph_objects as go
import tifffile
from google.colab import drive
from PIL import Image
import numpy as np

# Find the tif file in google drive
drive.mount('/content/drive')
file = "/content/drive/MyDrive/Neural_Signals_and_Computation_HW1/TEST_MOVIE_00001-small.tif"

# Load tif file into numpy array and image
data = tifffile.imread(file)

# Compute the max, min, and range summary images over the time axis (axis 0)
data_max = data.max(axis=0)
data_min = data.min(axis=0)

# max >= min at every pixel, so the subtraction cannot wrap around for unsigned data
data_range = data_max - data_min

# Plot the summary images
# Size the plot to match the image; data has shape (frames, rows, columns)
layout = go.Layout(
    width=data.shape[2],  # Plot width = number of image columns
    height=data.shape[1],  # Plot height = number of image rows
    xaxis=dict(range=[0, data.shape[2]]),  # x-axis spans the image columns
    yaxis=dict(range=[0, data.shape[1]]),  # y-axis spans the image rows
    margin=dict(l=0, r=0, t=0, b=0),  # Set the margins to 0 to remove unnecessary spacing
)

# Plot Images
fig_max = go.Figure(data=go.Heatmap(z=data_max), layout=layout)
fig_min = go.Figure(data=go.Heatmap(z=data_min), layout=layout)
fig_range = go.Figure(data=go.Heatmap(z=data_range), layout=layout)
print("Max Summary")
fig_max.show()
print("Min Summary")
fig_min.show()
print("Range Summary")
fig_range.show()


The range summary captures most of the information found in the variance summary and also adds some information on the location of cells that were less active (turning on and off less) during the time frame we study. In some cases this is an advantage, but in others it may muddle the information one wishes to focus on. The same can be said of the max summary. The min summary, on the other hand, looks closer to the images we created for the centrality summaries: it shows where the cells are in a general sense but provides little information about their activity, since it records only the minimum fluorescence at each pixel. Overall, one could argue that the range and max summaries provide useful information for analysis.
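
The visual comparison above could be made quantitative by correlating the summary images pixel by pixel. The sketch below is one possible way to do that, shown on a synthetic movie rather than the real recording: `image_correlation` and the `toy` array are hypothetical, and the toy data has no baseline fluorescence differences between pixels, so it says nothing about how the min summary behaves on real data.

```python
import numpy as np

def image_correlation(a, b):
    """Pearson correlation between two images, treated as flat pixel vectors.

    Hypothetical helper for illustration only.
    """
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    return float(np.corrcoef(a, b)[0, 1])

# Synthetic movie: 200 frames of 8x8 noise with two transiently active pixels
rng = np.random.default_rng(1)
toy = rng.normal(100.0, 1.0, size=(200, 8, 8))
toy[50:60, 2, 3] += 40.0    # short, strong transient
toy[120:140, 5, 5] += 25.0  # longer, weaker transient

var_img = toy.var(axis=0)
max_img = toy.max(axis=0)
range_img = max_img - toy.min(axis=0)

# How closely do the range and max images track the variance image?
r_range = image_correlation(var_img, range_img)
r_max = image_correlation(var_img, max_img)
print(f"variance vs range: {r_range:.3f}")
print(f"variance vs max:   {r_max:.3f}")
```

A correlation near 1 would support the impression that the range and max summaries carry much the same information as the variance summary. Applying the same comparison to `data_range` and `data_max` from the cell above, together with a variance image of the real video, would be needed to check this on the actual recording.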