#Problem 5: Matrix factorization
The above procedure is common in many labs, but it can introduce errors because it depends on the summary image and on multiple stages of processing. Matrix factorization has emerged as an alternative approach that identifies ROIs directly from the full spatio-temporal video. Here we will explore three different types of factorization and compare the results using a table.

##Part C:
Now try ICA for the same pixels-by-time matrix as in part A (for a specific number of components you find reasonable). What are the differences that you note?

In [2]:
import plotly
import plotly.graph_objects as go
import matplotlib.pyplot as plt
import tifffile
from google.colab import drive
from PIL import Image
import numpy as np
import sklearn
from sklearn import cluster, decomposition

# Find the tif file in google drive
drive.mount('/content/drive')
file = "/content/drive/MyDrive/Neural_Signals_and_Computation_HW1/TEST_MOVIE_00001-small.tif"

# Load the tif file into a numpy array of shape (T, M, N)
data = tifffile.imread(file)

# Vectorize each frame into a column vector, then stack the columns side by side
# data: (T, M, N) -> full_data: (MN, T), here (500*500, 500)
full_data = data.reshape(data.shape[0], -1).transpose()

# Create a square layout for the plot
layout = go.Layout(
    width=data.shape[2],  # Plot width = number of image columns
    height=data.shape[1],  # Plot height = number of image rows
    xaxis=dict(range=[0, data.shape[2]]),  # Set the x-axis range to match the width of the image
    yaxis=dict(range=[0, data.shape[1]]),  # Set the y-axis range to match the height of the image
    margin=dict(l=0, r=0, t=0, b=0),  # Set the margins to 0 to remove unnecessary spacing
)

# Check when we choose number of components k = 5
for i in range(1):
  n_components = i + 5

  ica_estimator = decomposition.FastICA(
    n_components=n_components, max_iter=400, whiten="arbitrary-variance", tol=15e-5
  )
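  # fit_transform returns the estimated sources, shape (MN, n_components)
  ica_result_mn = ica_estimator.fit_transform(full_data)
  # components_ is the unmixing operator, shape (n_components, T), not the mixing matrix
  ica_result_f = ica_estimator.components_

  # Note: this product is not the ICA reconstruction of full_data;
  # ica_estimator.inverse_transform(ica_result_mn) would return that instead
  result = np.matmul(ica_result_mn, ica_result_f)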

  # Reshape each column of the (MN, T) result back into an image frame: (T, M, N)
  ica_data = result.T.reshape(data.shape)

  # Plot the summary images
  print(f"Using {n_components} components: ")
  fig_mean = go.Figure(data=go.Heatmap(z=ica_data[0]), layout=layout)
  fig_mean.show()

# Original Data Image Frame 0
print("Original Data")
fig_mean = go.Figure(data=go.Heatmap(z=data[0]), layout=layout)
fig_mean.show()

Using 5 components: 


Original Data


Compared to PCA and NMF, ICA appears to have factorized the data with inverted contrast: most of the background in the image takes higher values than the regions of interest (cells). Some similarities are present, but the contrast between the darker and brighter areas of fluorescence in the heat map separates certain regions more clearly than PCA and NMF do. Like PCA, ICA also produces negative values, unlike the original image and NMF, which contain no negatives.
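
The frame plotted above is the product of the ICA sources with `components_`. In scikit-learn, `components_` is the unmixing operator, while `mixing_` (its pseudo-inverse) is what `inverse_transform` uses to map sources back to data space. The following is a minimal sketch, on small synthetic data rather than the movie, of how the two products can differ. The sizes, the random sources, and all variable names here are assumptions made for illustration only; the `whiten="arbitrary-variance"` setting is copied from the cell above and needs a scikit-learn version that accepts it.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Hypothetical stand-in for the pixels-by-time matrix:
# 400 "pixels" by 60 "frames", built from 3 non-negative spatial sources.
n_pixels, n_frames, k = 400, 60, 3
true_sources = rng.exponential(scale=1.0, size=(n_pixels, k))
true_mixing = rng.uniform(0.5, 2.0, size=(k, n_frames))
X = true_sources @ true_mixing + 0.01 * rng.standard_normal((n_pixels, n_frames))

ica = FastICA(
    n_components=k, max_iter=1000, whiten="arbitrary-variance",
    tol=1e-4, random_state=0,
)
S = ica.fit_transform(X)  # estimated sources, shape (n_pixels, k)

# Product used in the cell above: sources times the unmixing operator.
naive = S @ ica.components_

# Reconstruction in data space: applies mixing_ and adds the mean back.
recon = ica.inverse_transform(S)

# Relative errors with respect to the synthetic data.
err_naive = np.linalg.norm(X - naive) / np.linalg.norm(X)
err_recon = np.linalg.norm(X - recon) / np.linalg.norm(X)

print("relative error, S @ components_   :", err_naive)
print("relative error, inverse_transform :", err_recon)
print("value range of S @ components_    :", naive.min(), naive.max())
```

If the same check is run on the movie, comparing frame 0 of `ica_estimator.inverse_transform(ica_result_mn)` with frame 0 of the plotted product may help show how much of the inverted contrast and the negative values comes from ICA itself and how much from the choice of product.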