# Fire analysis script
The goal of this script is to fit regressions in which region-of-interest (ROI) type, saliency, and other image features are used to predict fixation behavior.

## 1. Load eye movement data

This cell loads the animals' raw eye movement data from a CSV file, removes fixations whose ROI is "OffScreen", and keeps only rows with ImageType "F".

In [4]:
import pandas as pd

eye_df = pd.read_csv("RawData_AllAnimals.csv")
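
# Remove off-screen fixations and keep only ImageType "F" stimuli
eye_df = eye_df[eye_df['ROI'] != "OffScreen"]
eye_df = eye_df[eye_df['ImageType'] == "F"]

# Preview the filtered data (only the last expression in a cell is displayed)
eye_df.head()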
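
The two filters above can also be combined into a single boolean mask, which makes it easy to report how many rows survive. The following is a minimal sketch on a made-up four-row table. The column names `ROI` and `ImageType` come from the cell above; the sample rows, the `"X"` image type, and the helper name `filter_fixations` are assumptions for illustration only.

```python
import io
import pandas as pd

# Tiny made-up stand-in for RawData_AllAnimals.csv (values are illustrative only)
sample_csv = io.StringIO(
    "Subject,ROI,ImageType\n"
    "A,Fire,F\n"
    "A,OffScreen,F\n"
    "B,NonFire,X\n"
    "B,NonFire,F\n"
)
sample_df = pd.read_csv(sample_csv)

def filter_fixations(df):
    """Keep on-screen fixations recorded on ImageType 'F' stimuli."""
    on_screen = df["ROI"] != "OffScreen"
    is_f = df["ImageType"] == "F"
    # .copy() returns an independent frame, so adding columns later is safe
    return df[on_screen & is_f].copy()

filtered = filter_fixations(sample_df)
print(f"Kept {len(filtered)} of {len(sample_df)} rows")
```

Applied to the real data, the same helper would be called as `filter_fixations(eye_df)`.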

## 2. Load ROI masks and saliency maps

This cell loads all ROI type/index masks and saliency maps into dictionaries for fast lookup by image name, and computes the size in pixels of each ROI.

In [5]:
import numpy as np
import os

# Helper to get base image name from Stimuli column
def get_base_name(stimuli):
    return os.path.splitext(os.path.basename(stimuli))[0]

# Load ROI masks
roi_type_dict = {}
roi_index_dict = {}

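# List the saliency CSV files, excluding mean_saliency.csv if present
# (list.remove would raise ValueError when the file is absent)
csv_names = [f for f in os.listdir("csv_output") if f != "mean_saliency.csv"]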


for fname in os.listdir("contour_masks"):
    if "roi_type" in fname:
        key = fname.split("_roi_type")[0]
        roi_type_dict[key] = np.load(os.path.join("contour_masks", fname))
    elif "roi_index" in fname:
        key = fname.split("_roi_index")[0]
        roi_index_dict[key] = np.load(os.path.join("contour_masks", fname))

# Calculate the size (number of pixels) of each ROI in each image

roi_sizes = {}

for img_name, roi_index_mask in roi_index_dict.items():
    unique, counts = np.unique(roi_index_mask, return_counts=True)
    roi_sizes[img_name] = dict(zip(unique, counts))

# Example: print ROI sizes for the first image
first_img = list(roi_sizes.keys())[0]
print(f"ROI sizes for {first_img}: {roi_sizes[first_img]}")

# Load saliency maps robustly
saliency_dict = {}
for fname in csv_names:
    if fname.endswith("_saliency.csv"):
        key = fname.split("_saliency")[0]
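        # Try a plain numeric read first; fall back to pandas if np.loadtxt fails.
        # Note: header=None keeps any header row as data rather than skipping it.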
        try:
            arr = np.loadtxt(os.path.join("csv_output", fname), delimiter=',')
        except Exception:
            arr = pd.read_csv(os.path.join("csv_output", fname), header=None).values
        saliency_dict[key] = arr



ROI sizes for FStil13.jpg: {0: 1104669, 1: 547130, 2: 96792, 3: 74166, 4: 33920, 5: 20874, 6: 63486, 7: 43979, 8: 32607, 9: 28316, 10: 27661}
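
The ROI sizes above come from counting how often each label occurs in the index mask. The following is a minimal sketch of that counting step on a toy 3×4 mask; the mask values and the names `toy_mask` and `toy_sizes` are made up for illustration.

```python
import numpy as np

# Toy 3x4 ROI index mask: 0 = background, 1 and 2 = two ROIs (made-up values)
toy_mask = np.array([
    [0, 0, 1, 1],
    [0, 2, 1, 1],
    [0, 2, 2, 0],
])

labels, counts = np.unique(toy_mask, return_counts=True)
# Convert NumPy scalars to plain ints so the dict prints and compares cleanly
toy_sizes = {int(label): int(count) for label, count in zip(labels, counts)}
print(toy_sizes)  # {0: 5, 1: 4, 2: 3}
```

Because every pixel carries exactly one label, the counts always sum to the number of pixels in the mask. In the printed output above, the eleven sizes sum to 2,073,600, which equals 1920 × 1080 and is therefore consistent with a 1920 × 1080 image.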


## 3. Join eye movement data with image features

This cell creates a new DataFrame in which each fixation is enriched with the ROI type, ROI index, and saliency at the fixation location, plus the pixel area of the ROI it falls in.

In [6]:
def get_base_name_images(stimuli):
    # Always use .jpg for lookup, regardless of original extension
    base = os.path.splitext(os.path.basename(stimuli))[0]
    return base + ".jpg"

features = []

for idx, row in eye_df.iterrows():
    img_key = get_base_name_images(row['Stimuli'])
    x, y = int(row['XPos']), int(row['YPos'])
    
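    # Skip if the image has no ROI masks or saliency map
    if (img_key not in roi_type_dict
            or img_key not in roi_index_dict
            or img_key not in saliency_dict):
        continue
    roi_type = roi_type_dict[img_key]
    roi_index = roi_index_dict[img_key]
    saliency = saliency_dict[img_key]
    # Skip fixations outside the image; negative coordinates would
    # otherwise silently wrap around to the opposite edge of the array
    if not (0 <= y < roi_type.shape[0] and 0 <= x < roi_type.shape[1]):
        continue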

    features.append({
        "Subject": row["Subject"],
        "Stimuli": row["Stimuli"],
        "FixStart": row["FixStart"],
        "FixEnd": row["FixEnd"],
        "FixDur": row["FixDur"],
        "XPos": x,
        "YPos": y,
        "ROI": row["ROI"],
        "Block": row["Block"],
        "Trial": row["Trial"],
        "ImageType": row["ImageType"],
        "Species": row["Species"],
        "SubjectName": row["SubjectName"],
        "roi_type": roi_type[y, x],
        "roi_index": roi_index[y, x],
        "saliency": saliency[y, x],
        "area": roi_sizes[img_key].get(roi_index[y, x], 0)  # Get area size or 0 if not found
    })

fixation_features_df = pd.DataFrame(features)
fixation_features_df.head()

Unnamed: 0,Subject,Stimuli,FixStart,FixEnd,FixDur,XPos,YPos,ROI,Block,Trial,ImageType,Species,SubjectName,roi_type,roi_index,saliency,area
0,Cheyenne_20240702_1420,FStil09.png,232.713,392.695,163.256,1527,561,NonFire,1,Cheyenne_Block1_20240702_1420.xlsx,F,Baboon,Cheyenne,1,1,52.848877,771609
1,Cheyenne_20240702_1420,FStil09.png,409.259,645.983,239.965,1093,378,Fire,1,Cheyenne_Block1_20240702_1420.xlsx,F,Baboon,Cheyenne,1,1,171.167404,771609
2,Cheyenne_20240702_1420,FStil09.png,655.942,869.192,216.689,1193,525,NonFire,1,Cheyenne_Block1_20240702_1420.xlsx,F,Baboon,Cheyenne,1,1,148.384018,771609
3,Cheyenne_20240702_1420,FStil09.png,899.287,1055.843,159.995,1664,123,NonFire,1,Cheyenne_Block1_20240702_1420.xlsx,F,Baboon,Cheyenne,0,0,36.488598,993632
4,Cheyenne_20240702_1420,FStil04.png,83.609,213.495,133.334,929,579,NonFire,1,Cheyenne_Block1_20240702_1420.xlsx,F,Baboon,Cheyenne,0,0,40.138653,1370896
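
The per-row loop above looks up one pixel at a time. As a possible alternative, NumPy integer-array indexing can look up all fixations on one image in a single step. The following is a minimal sketch on made-up 4×5 maps; all array names and values here are assumptions for illustration, not part of the original pipeline.

```python
import numpy as np

# Made-up 4x5 maps standing in for one image's ROI mask and saliency map
toy_roi_type = np.zeros((4, 5), dtype=int)
toy_roi_type[1:3, 2:4] = 1          # a small "fire" patch
toy_saliency = np.arange(20, dtype=float).reshape(4, 5)

# Fixation coordinates (x = column, y = row); the last two are out of bounds
xs = np.array([2, 3, 0, 7, -1])
ys = np.array([1, 2, 3, 1, 2])

height, width = toy_roi_type.shape
in_bounds = (xs >= 0) & (xs < width) & (ys >= 0) & (ys < height)

# Index with [row, column], i.e. [y, x], for every valid fixation at once
roi_at_fix = toy_roi_type[ys[in_bounds], xs[in_bounds]]
sal_at_fix = toy_saliency[ys[in_bounds], xs[in_bounds]]
print(roi_at_fix, sal_at_fix)
```

In the real notebook this would be applied per image, for example inside a `groupby("Stimuli")` loop, since each image has its own mask and saliency map.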


## 4. Logistic Regression: Predicting fire region fixations

This cell uses the joined DataFrame to predict whether a fixation falls on a fire region (roi_type == 1), with saliency, fixation duration, and ROI area as predictors.

In [7]:
from sklearn.linear_model import LogisticRegression

# Create binary target: 1 if fire region, 0 otherwise
fixation_features_df["is_fire"] = (fixation_features_df["roi_type"] == 1).astype(int)
X = fixation_features_df[["saliency", "FixDur", "area"]]
y = fixation_features_df["is_fire"]

model = LogisticRegression()
model.fit(X, y)

print("Regression coefficients:", model.coef_)
print("Odds ratios:", np.exp(model.coef_))

Regression coefficients: [[ 8.47190471e-03  6.00536606e-04 -1.41989907e-06]]
Odds ratios: [[1.00850789 1.00060072 0.99999858]]
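
The odds ratios above are per one-unit change in each predictor, which is why they all sit very close to 1. To see the effect of a larger change, the coefficient is multiplied by the step size before exponentiating. The following is a small worked sketch using the coefficients printed above; the helper name `odds_ratio` and the 50-unit saliency step are arbitrary choices for illustration.

```python
import math

# Coefficients copied from the printed output above
coef = {"saliency": 8.47190471e-03, "FixDur": 6.00536606e-04, "area": -1.41989907e-06}

def odds_ratio(beta, delta=1.0):
    """Multiplicative change in the odds for a `delta`-unit increase in a predictor."""
    return math.exp(beta * delta)

or_per_unit = odds_ratio(coef["saliency"])
or_per_50 = odds_ratio(coef["saliency"], delta=50)  # 50 is an arbitrary illustrative step
print(f"per unit: {or_per_unit:.5f}, per 50 units: {or_per_50:.3f}")
```

Under this model, a 50-unit increase in saliency multiplies the odds of a fire-region fixation by roughly 1.5, holding the other predictors fixed.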


## 5. Machine Learning Logistic Regression

In this cell, we fit the same logistic regression as a predictive model: it predicts whether a fixation falls on a fire region (roi_type == 1) from saliency, fixation duration, and ROI area. The data is split into training (80%) and test (20%) sets so that accuracy is evaluated on unseen data.

In [8]:
# Apply logistic regression to the fire fixation data with a held-out test set

from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Prepare features and target
X = fixation_features_df[["saliency", "FixDur", "area"]]
y = fixation_features_df["is_fire"]

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=23)

# Fit logistic regression model
clf = LogisticRegression(max_iter=10000, random_state=0)
clf.fit(X_train, y_train)

# Evaluate accuracy
acc = accuracy_score(y_test, clf.predict(X_test)) * 100
print(f"Logistic Regression model accuracy: {acc:.2f}%")

Logistic Regression model accuracy: 62.45%
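
An accuracy figure is easier to interpret next to the accuracy of always predicting the most common class. The following is a minimal sketch of that majority-class baseline; the helper name `majority_baseline_accuracy` and the toy labels are assumptions for illustration, and the class balance of the real test set is not shown in this notebook.

```python
from collections import Counter

def majority_baseline_accuracy(labels):
    """Accuracy (in %) of always predicting the most common label."""
    counts = Counter(labels)
    most_common_count = counts.most_common(1)[0][1]
    return 100.0 * most_common_count / len(labels)

# Made-up test labels: 6 non-fire (0) and 4 fire (1) fixations
toy_y_test = [0, 0, 1, 0, 1, 0, 1, 0, 0, 1]
baseline = majority_baseline_accuracy(toy_y_test)
print(f"Majority-class baseline: {baseline:.2f}%")
```

With the notebook's variables, the same helper could be called as `majority_baseline_accuracy(y_test.tolist())` and compared against the model accuracy above.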


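## 6. OLS regression: predicting fixation duration

This cell fits an ordinary least squares model that predicts fixation duration (FixDur) from saliency, ROI area, and whether the fixation is on a fire region.

In [9]: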
import statsmodels.api as sm

# Predict FixDur using saliency, area, and whether the fixation is on fire
X = fixation_features_df[["saliency", "area", "is_fire"]]
X = sm.add_constant(X)  # Adds intercept
y = fixation_features_df["FixDur"]

model = sm.OLS(y, X)
result = model.fit()
print(result.summary())

                            OLS Regression Results                            
Dep. Variable:                 FixDur   R-squared:                       0.002
Model:                            OLS   Adj. R-squared:                  0.001
Method:                 Least Squares   F-statistic:                     6.849
Date:                Mon, 02 Jun 2025   Prob (F-statistic):           0.000132
Time:                        16:35:43   Log-Likelihood:                -83321.
No. Observations:               12487   AIC:                         1.667e+05
Df Residuals:                   12483   BIC:                         1.667e+05
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        255.8513      5.305     48.224      0.0