# Day 57: Feature Attribution (Saliency Maps)

Knowing *which* parts of an input drove an output is crucial for debugging bias and safety failures.
This notebook uses a mock **Attention Mechanism** (`SimpleAttention`) to score and visualize the importance of each input token.

In [None]:
import sys
import os

# Add the repository root (two levels up from this notebook) to sys.path so that `src` can be imported
sys.path.append(os.path.abspath('../../'))

from src.interpretability.attention import SimpleAttention

## 1. Init Attention Engine

Create the mock explainer that the rest of the notebook uses.

In [None]:
explainer = SimpleAttention()

## 2. Analyze Attribution

Scenario: the user says "I must kill the process now."
Agent output: "Killing process 123."
We want to check whether 'kill' was the token that drove the output.

In [None]:
input_text = "I must kill the process now."
output_text = "Killing process 123."

scores = explainer.calculate_attribution(input_text, output_text)
explainer.visualize(scores)
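
The internals of `SimpleAttention` are not shown in this notebook, so the following is only a rough sketch of how a mock attribution score *could* be computed for the scenario above. It assumes a crude prefix-overlap heuristic: an input token scores high if it shares its first few characters with any output token. The names `tokenize`, `overlap_attribution`, `render_bars`, and the `prefix_len` parameter are hypothetical and are not part of `src.interpretability.attention`.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def overlap_attribution(input_text, output_text, prefix_len=4):
    """Score input tokens by prefix overlap with the output (hypothetical heuristic).

    A token gets a raw score of 1.0 if its first `prefix_len` characters match
    the prefix of any output token, and 0.1 otherwise. Scores are then
    normalized so they sum to 1.
    """
    out_prefixes = {tok[:prefix_len] for tok in tokenize(output_text)}
    raw = [
        (tok, 1.0 if tok[:prefix_len] in out_prefixes else 0.1)
        for tok in tokenize(input_text)
    ]
    total = sum(score for _, score in raw)
    return [(tok, score / total) for tok, score in raw]


def render_bars(scores, width=20):
    """Render (token, score) pairs as a plain-text bar chart."""
    lines = []
    for tok, score in scores:
        bar = "#" * round(score * width)
        lines.append(f"{tok:<10} {bar} {score:.2f}")
    return "\n".join(lines)


demo_scores = overlap_attribution("I must kill the process now.", "Killing process 123.")
print(render_bars(demo_scores))
```

Under this heuristic, 'kill' and 'process' each receive about 0.42 of the total weight, and the remaining tokens about 0.04 each. A real attention or gradient-based method would derive these scores from the model itself rather than from string overlap.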

## 3. Analyze Negation

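Negations such as 'not' are critical for safety because they invert the meaning of an instruction: dropping 'not' from "Do not delete the files." turns a prohibition into a command.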

In [None]:
input_text_2 = "Do not delete the files."
output_text_2 = "Files preserved."

# Even though 'not' does not appear in the output, the mock engine is expected to flag it as important context.
scores_2 = explainer.calculate_attribution(input_text_2, output_text_2)
explainer.visualize(scores_2)
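
One way to see why a negation matters even when it never appears in the output is leave-one-out (occlusion) attribution: remove each token in turn and measure how much a score changes. The sketch below is illustrative only. `toy_risk_score` is a hypothetical stand-in for a model, with hand-picked word lists, and neither function is part of `SimpleAttention`.

```python
def toy_risk_score(tokens):
    """Hypothetical stand-in for a model.

    Returns 1.0 if the first destructive verb appears without a preceding
    negator, and 0.0 otherwise. The word lists are assumptions for this demo.
    """
    destructive = {"delete", "kill", "remove"}
    negators = {"not", "never"}
    negated = False
    for tok in tokens:
        if tok in negators:
            negated = True
        elif tok in destructive:
            return 0.0 if negated else 1.0
    return 0.0


def leave_one_out(tokens, score_fn):
    """Attribute importance to each token as the absolute change in score when it is removed."""
    base = score_fn(tokens)
    return [
        (tok, abs(base - score_fn(tokens[:i] + tokens[i + 1:])))
        for i, tok in enumerate(tokens)
    ]


tokens = "do not delete the files".split()
for tok, delta in leave_one_out(tokens, toy_risk_score):
    print(f"{tok:<8} {delta:.1f}")
```

With this toy scorer, removing 'not' is the only deletion that changes the score (from 0.0 to 1.0), so 'not' receives all of the attribution. 'delete' receives none here because the negated sentence already scores 0.0 and removing the verb leaves it at 0.0.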