## NEURAL ACTIVATIONS EXPLORATION

RQ1: How does the model process semantically plausible vs. implausible sentences?
Method: check whether the most influential neurons overlap between the two conditions.

* H1: there is a substantial overlap --> set of SYNTAX NEURONS
* H2: there is no substantial overlap --> SYNTAX + SEMANTICS network
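
To decide between H1 and H2, it helps to know how much overlap two top-k sets would share by chance alone. The sketch below is one possible baseline, assuming both sets are drawn independently and uniformly from the same layer (a hypergeometric null). The function names and the `n_neurons` parameter are hypothetical and not part of the notebook; `n_neurons` would be the length of the effect arrays.

```python
from math import comb


def expected_chance_overlap(n_neurons: int, k: int) -> float:
    """Expected intersection size of two independent random k-subsets of n_neurons."""
    return k * k / n_neurons


def chance_overlap_pvalue(n_neurons: int, k: int, observed: int) -> float:
    """P(overlap >= observed) under the hypergeometric null (assumed baseline)."""
    total = comb(n_neurons, k)
    # Sum the probability mass for every overlap size from `observed` up to k.
    tail = sum(
        comb(k, m) * comb(n_neurons - k, k - m)
        for m in range(observed, k + 1)
    )
    return tail / total
```

An observed overlap far above `expected_chance_overlap(n_neurons, k)` with a small tail probability would favour H1; an overlap near the chance level would favour H2.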

In [1]:
import numpy as np

# --- Configuration ---
NUM_TOP_NEURONS = 50  # Let's look at the overlap in the top 50

# --- Load the Results ---
try:
    effects_anomalous = np.load('results/analysis/anomalous_neuron_effects.npy')
    effects_core = np.load('results/analysis/core_neuron_effects.npy')
except FileNotFoundError as e:
    # Raise instead of calling exit(), which in a notebook also stops the kernel
    raise FileNotFoundError(
        f"Make sure both neuron effect files exist. Missing: {e.filename}"
    ) from e

# --- Find the Top Neurons for Each Condition ---

# Get the indices of the neurons, sorted from least to most influential
sorted_indices_anomalous = np.argsort(effects_anomalous)
sorted_indices_core = np.argsort(effects_core)

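# Take the last N indices to get the top N most influential
# (.tolist() yields plain Python ints, so the sets print cleanly on any NumPy version)
top_neurons_anomalous = set(sorted_indices_anomalous[-NUM_TOP_NEURONS:].tolist())
top_neurons_core = set(sorted_indices_core[-NUM_TOP_NEURONS:].tolist())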

print(f"--- Overlap Analysis (Top {NUM_TOP_NEURONS} Neurons) ---")
print(f"\nTop influential neurons for ANOMALOUS:\n{sorted(top_neurons_anomalous)}")
print(f"\nTop influential neurons for CORE:\n{sorted(top_neurons_core)}")

# --- Calculate the Overlap ---
overlapping_neurons = top_neurons_anomalous.intersection(top_neurons_core)

overlap_percentage = (len(overlapping_neurons) / NUM_TOP_NEURONS) * 100

print("\n--- RESULTS ---")
print(f"Number of overlapping neurons: {len(overlapping_neurons)}")
print(f"Overlap percentage: {overlap_percentage:.2f}%")
print(f"\nOverlapping neuron indices:\n{sorted(overlapping_neurons)}")

# --- Interpretation ---
print("\n--- Interpretation ---")
if overlap_percentage > 20:
    print("Result: High overlap. This suggests the model uses a stable, core set of neurons for syntactic processing in both plausible and implausible contexts.")
elif overlap_percentage > 5:
    print("Result: Moderate overlap. This suggests some neurons are dedicated to pure syntax, while others might be involved in combined syntax+semantics processing.")
else:
    print("Result: Low to no overlap. This suggests the model may use largely separate neural pathways for processing plausible vs. implausible syntax.")

--- Overlap Analysis (Top 50 Neurons) ---

Top influential neurons for ANOMALOUS:
[24, 36, 39, 50, 105, 121, 143, 163, 165, 238, 241, 268, 287, 306, 314, 326, 365, 367, 427, 438, 441, 489, 517, 578, 623, 680, 684, 701, 707, 750, 809, 816, 839, 851, 883, 929, 944, 1005, 1009, 1022, 1026, 1043, 1044, 1084, 1085, 1088, 1173, 1174, 1189, 1190]

Top influential neurons for CORE:
[24, 36, 39, 50, 105, 121, 143, 163, 165, 184, 238, 241, 268, 287, 306, 314, 326, 365, 367, 427, 438, 441, 489, 517, 578, 623, 680, 684, 701, 707, 750, 809, 816, 839, 851, 883, 929, 944, 1005, 1009, 1022, 1026, 1043, 1044, 1084, 1088, 1099, 1110, 1189, 1190]

--- RESULTS ---
Number of overlapping neurons: 47
Overlap percentage: 94.00%

Overlapping neuron indices:
[24, 36, 39, 50, 105, 121, 143, 163, 165, 238, 241, 268, 287, 306, 314, 326, 365, 367, 427, 438, 441, 489, 517, 578, 623, 680, 684, 701, 707, 750, 809, 816, 839, 851, 883, 929, 944, 1005, 1009, 1022, 1026, 1043, 1044, 1084, 1088, 1189, 1190]

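--- Interpretation ---
Result: High overlap. This suggests the model uses a stable, core set of neurons for syntactic processing in both plausible and implausible contexts.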
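
The printed lists differ in only three neurons per condition: 1085, 1173 and 1174 appear only in the ANOMALOUS set, and 184, 1099 and 1110 only in the CORE set. With 47 shared neurons out of 53 distinct ones, the Jaccard index is 47/53, roughly 0.89. A minimal sketch for extracting these condition-specific neurons follows; the helper names are hypothetical and the inputs are assumed to be the two top-neuron sets built in the cell above.

```python
def condition_specific(top_a: set, top_b: set) -> tuple[list, list]:
    """Return (indices only in top_a, indices only in top_b), each sorted."""
    return sorted(top_a - top_b), sorted(top_b - top_a)


def jaccard(top_a: set, top_b: set) -> float:
    """Intersection over union; defined as 0.0 when both sets are empty."""
    union = top_a | top_b
    if not union:
        return 0.0
    return len(top_a & top_b) / len(union)
```

For example, `condition_specific(top_neurons_anomalous, top_neurons_core)` would list the neurons worth inspecting as candidates for plausibility-sensitive processing.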

**RESULTS VERIFICATION TASKS**:

* Repeat analysis for the other corpora
* Repeat with a different psycholinguistic task that targets syntactic processing
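
One way to make these repetitions cheap is to wrap the overlap computation in a function and run it over several corpora and several values of k. The sketch below assumes each corpus provides a pair of 1-D effect arrays (anomalous, core); the corpus names, the `effect_pairs` mapping and the chosen k values are hypothetical placeholders, not files or settings from this project.

```python
def top_k_indices(effects, k: int) -> set:
    """Indices of the k largest values; works on lists or 1-D NumPy arrays."""
    order = sorted(range(len(effects)), key=lambda i: effects[i])
    return set(order[-k:])


def top_k_overlap(effects_a, effects_b, k: int) -> float:
    """Fraction of the top-k neurons shared by the two conditions."""
    shared = top_k_indices(effects_a, k) & top_k_indices(effects_b, k)
    return len(shared) / k


def overlap_by_corpus(effect_pairs: dict, ks=(10, 50, 100)) -> dict:
    """Map corpus name -> {k: overlap fraction} for each (anomalous, core) pair."""
    return {
        name: {k: top_k_overlap(a, b, k) for k in ks}
        for name, (a, b) in effect_pairs.items()
    }
```

Reporting the overlap at several k values also shows whether the result depends on the choice of `NUM_TOP_NEURONS = 50`.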