# Visualization

We can better understand what features the ABCNN model is learning by looking at its attention distributions.

The attention distributions for the ABCNN-1 blocks show which words and phrases in one sequence the model associates with words and phrases in the other.
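To make the idea of an attention matrix concrete, here is a minimal sketch. It assumes the match score described in the ABCNN paper, `1 / (1 + Euclidean distance)`, applied to every pair of positions in the two sequences. The function name `attention_matrix_sketch` and the toy inputs are hypothetical; this is not the repository's `compute_attention_matrix`, whose signature is not shown in this notebook.

```python
import numpy as np


def attention_matrix_sketch(x1, x2):
    """Return A with A[i, j] = 1 / (1 + ||x1[i] - x2[j]||_2).

    x1: array of shape (len1, dim), one row per position in sequence 1.
    x2: array of shape (len2, dim), one row per position in sequence 2.
    """
    # Pairwise differences via broadcasting: shape (len1, len2, dim).
    diff = x1[:, None, :] - x2[None, :, :]
    # Euclidean distance between every pair of positions: shape (len1, len2).
    dist = np.linalg.norm(diff, axis=-1)
    # Map distances in [0, inf) to match scores in (0, 1].
    return 1.0 / (1.0 + dist)


# Toy inputs (made up for illustration): 3 and 2 positions, 2 features each.
x1 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
x2 = np.array([[0.0, 0.0], [3.0, 4.0]])
A = attention_matrix_sketch(x1, x2)
print(A.shape)
```

Under this assumption, identical positions score 1 and the score decays toward 0 as the representations move apart, so bright cells in a plot of `A` would mark the position pairs the model treats as most similar.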

First, we import the necessary modules for the visualizations.

In [13]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn

from model.attention.utils import manhattan
from model.attention.utils import euclidean
from model.attention.utils import cosine
from model.attention.utils import compute_attention_matrix
from setup import read_config
from setup import setup_datasets_and_model
from utils import load_checkpoint

In [10]:
CONFIG_FILE = "config.json"
CHECKPOINT_FILE = "checkpoints/gpu0/best_checkpoint"

For this notebook, we'll use a pre-trained ABCNN-3 model. We will also need to load the datasets.

In [11]:
# Load in the datasets and a model
config = read_config(CONFIG_FILE)
datasets, model = setup_datasets_and_model(config)

100%|██████████| 283003/283003 [00:41<00:00, 6787.33it/s]
100%|██████████| 41238/41238 [00:06<00:00, 6696.95it/s]
100%|██████████| 80049/80049 [00:11<00:00, 6997.46it/s]
100%|██████████| 86001/86001 [00:00<00:00, 523412.36it/s]


Now that we have the datasets and a model, we load the pre-trained weights from the checkpoint into the model. We can then inspect the attention matrices themselves for a few examples.

In [12]:
# Overwrite model weights with pre-trained weights
state = load_checkpoint(CHECKPOINT_FILE)
model_dict, optim_dict, _, _ = state
model.load_state_dict(model_dict)
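Before plotting anything, one simple way to inspect an attention matrix is to list its most strongly aligned token pairs. The sketch below assumes the matrix is available as a 2-D NumPy array with one row per token of the first sequence and one column per token of the second. The helper `top_aligned_pairs`, the tokens, and the scores are all made up for illustration and do not come from the model or the datasets loaded above.

```python
import numpy as np


def top_aligned_pairs(attention, tokens1, tokens2, k=3):
    """Return the k (token1, token2, score) triples with the highest scores."""
    attention = np.asarray(attention)
    # Sort flat indices by descending score, then keep the first k.
    flat_order = np.argsort(attention, axis=None)[::-1][:k]
    # Convert flat indices back to (row, column) positions.
    rows, cols = np.unravel_index(flat_order, attention.shape)
    return [(tokens1[i], tokens2[j], float(attention[i, j]))
            for i, j in zip(rows, cols)]


# Made-up tokens and scores, for illustration only.
tokens1 = ["how", "old", "are", "you"]
tokens2 = ["what", "is", "your", "age"]
attention = np.array([
    [0.60, 0.20, 0.10, 0.10],
    [0.10, 0.10, 0.20, 0.90],
    [0.20, 0.70, 0.10, 0.10],
    [0.10, 0.10, 0.80, 0.20],
])
pairs = top_aligned_pairs(attention, tokens1, tokens2, k=3)
for t1, t2, score in pairs:
    print(f"{t1:>4} ~ {t2:<5} {score:.2f}")
```

If the attention matrix comes out of the model as a `torch.Tensor`, it would first need converting, for example with `tensor.detach().cpu().numpy()`.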