#### **`att_viz` 101**

`att_viz` is a Python package for visualizing self-attention. Here is a quick tutorial to get you started.

##### **1. Running a basic experiment**

In this experiment, we will load `Salesforce/codegen-350M-mono` and run inference on a single prompt.

Because we use no attention aggregation, the HTML visualization will be "chunked", that is, split across multiple HTML files.

Let's take a look at the code:

In [None]:
from att_viz.utils import Experiment
from att_viz.renderer import Renderer, RenderConfig
from att_viz.self_attention_model import SelfAttentionModel
from att_viz.attention_aggregation_method import AttentionAggregationMethod


### MODIFY HERE TO TEST OTHER MODELS OR PROMPTS ###

model_name_or_directory: str = "Salesforce/codegen-350M-mono"
prompt: str = "print('Hello World')"

###################################################

# Initialize the model: this loads the corresponding Hugging Face model and tokenizer
model = SelfAttentionModel(model_name_or_directory=model_name_or_directory)

# Initialize the renderer with the base configuration and no attention aggregation method
renderer = Renderer(
    render_config=RenderConfig(), aggregation_method=AttentionAggregationMethod.NONE
)

experiment = Experiment(model, renderer)

# Run inference on the given prompt and save the HTML visualization files
experiment.basic_experiment(prompt=prompt, aggr_method=AttentionAggregationMethod.NONE)

##### **2. Running a headwise averaging experiment**

##### **2.1. Why aggregate the self-attention values?**

A well-designed aggregation method can help pinpoint patterns in the self-attention values.

`att_viz` implements averaging across attention heads and lets users define their own aggregation functions by extending the `AttentionAggregationMethod` enumeration.
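
To make head-wise averaging concrete, here is a minimal sketch in plain Python. It is an illustration of the idea only, not `att_viz`'s internal implementation; the function name `average_over_heads` and the toy input are assumptions made for this example.

```python
# Illustrative sketch only: this is not how att_viz implements aggregation.
def average_over_heads(attention):
    """Average per-head attention matrices element-wise.

    `attention` is a list of heads; each head is a square list of lists
    where entry [i][j] is the weight token i assigns to token j.
    """
    num_heads = len(attention)
    seq_len = len(attention[0])
    return [
        [sum(head[i][j] for head in attention) / num_heads for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Two toy heads over a 2-token sequence (token 0 only attends to itself).
heads = [
    [[1.0, 0.0], [0.25, 0.75]],
    [[1.0, 0.0], [0.75, 0.25]],
]
averaged = average_over_heads(heads)
print(averaged)
```

Since every row of every head sums to 1, each row of the averaged matrix also sums to 1, so the result can still be read as an attention distribution.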

##### **2.2. Using aggregation methods with `att_viz`**
We can reuse the previous section's code, only switching the aggregation method to `HEADWISE_AVERAGING` in the experiment.

Notice that this time the HTML visualization is not "chunked": the aggregated output is already small enough.
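
A rough way to see why averaging shrinks the output is to count the attention weights involved. The sketch below assumes the visualization size grows with the number of weights; the dimensions are hypothetical values chosen for illustration, and the threshold at which `att_viz` starts chunking is not modelled here.

```python
def attention_value_count(num_layers, num_heads, seq_len, average_heads):
    """Count the attention weights that would need to be rendered.

    Each head in each layer yields one seq_len x seq_len matrix; averaging
    over heads collapses the head dimension to one matrix per layer.
    """
    heads_kept = 1 if average_heads else num_heads
    return num_layers * heads_kept * seq_len * seq_len


# Hypothetical dimensions, used only to illustrate the ratio.
raw_count = attention_value_count(
    num_layers=20, num_heads=16, seq_len=32, average_heads=False
)
averaged_count = attention_value_count(
    num_layers=20, num_heads=16, seq_len=32, average_heads=True
)
print(raw_count, averaged_count, raw_count // averaged_count)
```

Under these assumptions, averaging reduces the number of weights by a factor equal to the number of heads.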

In [None]:
### MODIFY HERE TO TEST OTHER PROMPTS ###

prompt: str = "print('Hello World')"

#########################################

experiment.basic_experiment(
    prompt=prompt,
    aggr_method=AttentionAggregationMethod.HEADWISE_AVERAGING,
)

##### **3. Running an experiment in two steps**

Sometimes, it is preferable to run inference and processing separately. `att_viz` implements this by pairing `save_completions` with `process_saved_completions`.

Here is how you can use this:

In [None]:
from att_viz.utils import save_completions, process_saved_completions
from att_viz.renderer import RenderConfig
from att_viz.attention_aggregation_method import AttentionAggregationMethod

### MODIFY HERE TO TEST OTHER MODELS OR PROMPTS ###

# N.B.: The model completion for prompts[i] will be saved in three files:
# <save_prefixes[i]>_{input_length, attention_matrix, completion}.pickle

model_name_or_directory: str = "Salesforce/codegen-350M-mono"
prompts: list[str] = ["print('Hello World')"]
save_prefixes: list[str] = ["experiment_0"]

###################################################

save_completions(
    model_name_or_directory=model_name_or_directory,
    prompts=prompts,
    save_prefixes=save_prefixes,
)

process_saved_completions(
    render_config=RenderConfig(),
    aggregation_method=AttentionAggregationMethod.HEADWISE_AVERAGING,
    save_prefixes=save_prefixes,
)
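
When the two steps run at different times, it can help to check that the first step's files exist before starting the second. The following is a minimal sketch under two assumptions: the file names follow the scheme in the N.B. comment above, and the files are written relative to the current working directory. `expected_pickles` and `missing_pickles` are hypothetical helpers and are not part of `att_viz`.

```python
from pathlib import Path

# Suffixes taken from the N.B. comment in the cell above.
SUFFIXES = ("input_length", "attention_matrix", "completion")


def expected_pickles(save_prefix, directory="."):
    """Return the pickle paths expected for one save prefix (hypothetical helper)."""
    return [Path(directory) / f"{save_prefix}_{suffix}.pickle" for suffix in SUFFIXES]


def missing_pickles(save_prefixes, directory="."):
    """List the expected files that are not on disk yet (hypothetical helper)."""
    return [
        path
        for prefix in save_prefixes
        for path in expected_pickles(prefix, directory)
        if not path.exists()
    ]


# Assumption: the output directory is the current working directory.
print([p.name for p in expected_pickles("experiment_0")])
print(missing_pickles(["experiment_0"]))
```

An empty list from `missing_pickles` would suggest that `process_saved_completions` has everything it needs for those prefixes.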