# Joint Cross-Attention Fusion Net (JCAF-Net)

### 1. Inputs (gaze and mouse):
- X/Y coordinates
- Speed and direction (velocity)
- Joint features: Euclidean distance and directional angle between gaze and mouse
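
The speed and direction inputs can be derived from the X/Y traces by finite differences. The sketch below is one minimal way to do that, assuming the data has been resampled to a constant sampling interval; the function name `speed_and_direction` and the `dt` parameter are illustrative and not part of the original pipeline.

```python
import numpy as np

def speed_and_direction(x, y, dt=1.0):
    """Per-sample speed and heading from a coordinate trace.

    dt is the sampling interval, assumed constant (hypothetical parameter).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Finite differences; prepending the first sample keeps the output
    # the same length as the input (the first displacement is zero).
    dx = np.diff(x, prepend=x[0])
    dy = np.diff(y, prepend=y[0])
    speed = np.hypot(dx, dy) / dt      # distance units per time unit
    direction = np.arctan2(dy, dx)     # heading in radians
    return speed, direction

# Toy trace: one move of (3, 4) followed by a stationary sample.
speed, direction = speed_and_direction([0.0, 3.0, 3.0], [0.0, 4.0, 4.0], dt=0.5)
```

The same function could be applied to the gaze columns and the mouse columns separately to obtain one speed and one direction channel per modality.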

### 2. Dual-Pathway design:
- Cross-Attention Path: Learns interactions between gaze and mouse using attention scores
- Joint Feature Path: Processes handcrafted joint features (e.g. distance, angle) with CNNs

### 3. Feature Extractor:
- ResNet-34 (2D CNNs) as backbone for unimodal (gaze and mouse) and joint inputs

### 4. Cross-Attention Module:
- Applied midway through ResNet (between residual blocks 2 and 3)
- Gaze features (as queries) attend to mouse features (as keys/values), and vice versa
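
To make the bidirectional attention concrete, here is a minimal NumPy sketch of scaled dot-product cross-attention. It assumes the intermediate feature maps have already been flattened into token sequences of shape `(tokens, d)`; the projection matrices `w_q`, `w_k`, `w_v` are random stand-ins for learned weights, and they are shared across both directions only for brevity. None of these names come from the original model.

```python
import numpy as np

def softmax(z, axis=-1):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, context_feats, w_q, w_k, w_v):
    """Queries come from one modality; keys and values from the other.

    query_feats: (Tq, d), context_feats: (Tc, d).
    Returns attended features (Tq, d_v) and attention weights (Tq, Tc).
    """
    q = query_feats @ w_q
    k = context_feats @ w_k
    v = context_feats @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])   # scaled dot-product scores
    weights = softmax(scores, axis=-1)        # each query row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
d = 8
gaze = rng.normal(size=(5, d))    # 5 gaze tokens (toy data)
mouse = rng.normal(size=(7, d))   # 7 mouse tokens (toy data)
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))

# Gaze queries attend to mouse keys/values, and vice versa.
gaze_attended, gaze_weights = cross_attention(gaze, mouse, w_q, w_k, w_v)
mouse_attended, mouse_weights = cross_attention(mouse, gaze, w_q, w_k, w_v)
```

Each output keeps the token count of its query modality, so the attended features can plausibly be reshaped and passed on to the remaining residual blocks.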

### 5. Fusion and Classification:
- Attended features + joint CNN features are concatenated
- FCN (with re-weighting) produces final class logits.
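
The fusion step can be sketched as a concatenation followed by a linear layer. This is only a shape-level illustration: all dimensions and weights below are made up, the features are assumed to be pooled to one vector per sample, and the re-weighting mentioned above is omitted because its exact form is not specified here.

```python
import numpy as np

def fuse_and_classify(gaze_att, mouse_att, joint_feat, w, b):
    """Concatenate the three feature vectors and apply one linear layer."""
    fused = np.concatenate([gaze_att, mouse_att, joint_feat], axis=-1)
    return fused @ w + b   # class logits, shape (batch, n_classes)

rng = np.random.default_rng(0)
batch, d_att, d_joint, n_classes = 4, 16, 8, 3   # hypothetical sizes

gaze_att = rng.normal(size=(batch, d_att))       # attended gaze features
mouse_att = rng.normal(size=(batch, d_att))      # attended mouse features
joint_feat = rng.normal(size=(batch, d_joint))   # joint CNN features

w = rng.normal(size=(2 * d_att + d_joint, n_classes))
b = np.zeros(n_classes)

logits = fuse_and_classify(gaze_att, mouse_att, joint_feat, w, b)
```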


In [None]:
import numpy as np

gaze_cols = ["Gaze point X", "Gaze point Y"]
mouse_cols = ["Mouse position X", "Mouse position Y"]
joint_cols = ["Gaze-Mouse Distance", "Angle Between Gaze and Mouse"]

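# Compute the joint features if they are not already present in the DataFrame
def compute_joint_features(df):
    """Add gaze-mouse distance and angle columns to df (modified in place)."""
    gx, gy = df["Gaze point X"], df["Gaze point Y"]
    mx, my = df["Mouse position X"], df["Mouse position Y"]

    # Euclidean distance between the gaze point and the mouse position
    df["Gaze-Mouse Distance"] = np.sqrt((gx - mx)**2 + (gy - my)**2)
    # Angle of the vector pointing from the mouse to the gaze point, in radians
    df["Angle Between Gaze and Mouse"] = np.arctan2(gy - my, gx - mx)
    return df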

dataset_time_resampled = compute_joint_features(dataset_time_resampled)

features = gaze_cols + mouse_cols + joint_cols
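
As a quick sanity check of the distance and angle formulas, the following sketch evaluates them on three hand-picked gaze/mouse pairs using plain NumPy arrays instead of the DataFrame. The sample coordinates are made up for illustration.

```python
import numpy as np

# Toy coordinates: each row is one (x, y) sample.
gaze = np.array([[3.0, 4.0], [0.0, 0.0], [-1.0, 0.0]])
mouse = np.array([[0.0, 0.0], [0.0, 2.0], [0.0, 0.0]])

delta = gaze - mouse                              # vector from mouse to gaze
distance = np.hypot(delta[:, 0], delta[:, 1])     # Euclidean distance
angle = np.arctan2(delta[:, 1], delta[:, 0])      # radians, between -pi and pi
```

The angle follows the `arctan2` convention and describes the vector pointing from the mouse to the gaze point. If Y grows downward, as is common for screen coordinates, a positive angle means the gaze lies below the mouse.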