# Naive Observer Analysis

This notebook gives an example analysis of how naive human observers categorized morphological images of spheroids under different treatments. Observers were asked to sort the images into four categories, without being told what the categories corresponded to or which features to use.

In [None]:
# Import block

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

import naive_observer_functions as nof

In [None]:
# Example data included in repo
datafile = 'datasets/raw/RawDatasetZ.csv'

# Load the data file
df_raw = nof.load_labels(datafile)

In [None]:
df_raw.info()

In [None]:
df_raw.head()

Rows are unique images and columns are observers. The entry at $(m,n)$ gives the label (category) assigned to the $m$th image by the $n$th observer. Note that labels are not consistent across observers: observers were given no meaning for the labels, only that there were four categories overall.
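To illustrate why raw labels cannot be compared across observers, here is a minimal sketch on made-up data. The observer names, image IDs, and the `canonical` helper are hypothetical and are not part of `naive_observer_functions`. Two observers who group the images identically but use different label names look different until the labels are renumbered by order of first appearance.

```python
import pandas as pd

# Hypothetical toy data: two observers, four images.
# Both observers pair img0 with img1 and img2 with img3,
# but they use different label names for the two groups.
toy = pd.DataFrame(
    {"obs_A": [1, 1, 2, 2], "obs_B": [3, 3, 4, 4]},
    index=["img0", "img1", "img2", "img3"],
)

def canonical(labels: pd.Series) -> pd.Series:
    """Renumber labels by order of first appearance (0, 1, 2, ...)."""
    codes, _ = pd.factorize(labels)
    return pd.Series(codes, index=labels.index)

raw_equal = toy["obs_A"].equals(toy["obs_B"])
grouping_equal = canonical(toy["obs_A"]).equals(canonical(toy["obs_B"]))
print(f"Raw labels identical: {raw_equal}")
print(f"Groupings identical:  {grouping_equal}")
```

Renumbering only works for this simple comparison; the pairwise approach used in the rest of the notebook avoids the need for it.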

In [None]:
# Make similarity matrix
df_similarity = nof.make_similarity_matrix(df_raw)

# Sort via the corrgram method
df_sorted = nof.corrgram_sort(df_similarity)

In [None]:
# Column names are the image IDs. The rows follow the same IDs in the same order,
# although the IDs are not stored in the index.

print(f'The similarity matrix has shape {df_similarity.shape}')
df_similarity.iloc[:10,:10]

The similarity matrix is symmetric, and the entry at $(m,n)$ gives the number of observers who placed the $m$th and $n$th images into the same category.
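The definition above can be sketched in a few lines of NumPy. This is an illustration on made-up data, not the implementation of `nof.make_similarity_matrix`; the function name `pairwise_agreement` and the toy labels are assumptions.

```python
import numpy as np
import pandas as pd

def pairwise_agreement(labels: pd.DataFrame) -> pd.DataFrame:
    """Count, for each pair of images, the observers who gave both the same label."""
    values = labels.to_numpy()  # shape (n_images, n_observers)
    # Compare every image with every other image, observer by observer
    same = values[:, None, :] == values[None, :, :]
    counts = same.sum(axis=2)  # shape (n_images, n_images)
    return pd.DataFrame(counts, index=labels.index, columns=labels.index)

# Hypothetical toy data: four images labelled by three observers
toy = pd.DataFrame(
    {"obs_A": [1, 1, 2, 3], "obs_B": [4, 4, 4, 2], "obs_C": [2, 3, 3, 1]},
    index=["img0", "img1", "img2", "img3"],
)
sim = pairwise_agreement(toy)
print(sim)
```

Each diagonal entry equals the number of observers, since every image shares a category with itself. Because only equality of labels within an observer is tested, renaming any observer's categories leaves the matrix unchanged.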

In [None]:
print(f'The sorted similarity matrix has shape {df_sorted.shape}')
df_sorted.iloc[:10,:10]

Sorting the matrix with the corrgram method lets us look past the arbitrary observer-assigned category labels and search for blocks of images that observers consistently paired together.
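One common way to order a matrix for a corrgram-style display is to sort items by the angle formed by their loadings on the two leading eigenvectors. The sketch below applies that idea to a small made-up similarity matrix. Whether `nof.corrgram_sort` works this way is an assumption; the function name `angular_order_sort` and the toy data are hypothetical, and the sketch omits refinements such as splitting the ordering at the largest angular gap.

```python
import numpy as np
import pandas as pd

def angular_order_sort(sim: pd.DataFrame) -> pd.DataFrame:
    """Reorder a symmetric matrix by the angle of each item in the
    plane of the two leading eigenvectors (assumed sorting scheme)."""
    values = sim.to_numpy(dtype=float)
    # eigh returns eigenvalues in ascending order for symmetric input
    _, eigvecs = np.linalg.eigh(values)
    e1 = eigvecs[:, -1]  # eigenvector of the largest eigenvalue
    e2 = eigvecs[:, -2]  # eigenvector of the second largest eigenvalue
    # Eigenvector signs are arbitrary; make the leading one mostly positive
    if e1.sum() < 0:
        e1 = -e1
    angles = np.arctan2(e2, e1)
    order = np.argsort(angles, kind="stable")
    ids = sim.columns[order]
    # Apply the same permutation to rows and columns
    return pd.DataFrame(values[np.ix_(order, order)], index=ids, columns=ids)

# Hypothetical toy matrix: two groups of images ("a" and "b"), interleaved.
# Images in the same group were paired by 5 observers, otherwise by 1.
ids = ["a0", "b0", "a1", "b1", "a2", "b2"]
block = np.array([0, 1, 0, 1, 0, 1])
toy_sim = pd.DataFrame(np.where(block[:, None] == block[None, :], 5, 1), columns=ids)

toy_sorted = angular_order_sort(toy_sim)
print(list(toy_sorted.columns))
```

After sorting, images from the same group sit next to each other, so the high-similarity entries form square blocks along the diagonal.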

In [None]:
# Plot heatmaps
nof.plot_similarity_matrix(df_similarity, title="Similarity of " + datafile)
nof.plot_similarity_matrix(df_sorted, title="Sorted Similarity of " + datafile)

The two heatmaps contain the same information, arranged in a different order. When sorted, the matrix shows two major blocks; these correspond to conditions with and without oxygen deprivation, which observers clearly sorted differently. It is less clear whether observers strongly discriminated glucose deprivation.