## Example Evaluation Notebook

**__Please make sure to run this notebook from the project's root directory.__**

This notebook reports:
- Explanation generation times
- Explanation model training times (0 for methods that require no training)
- $Fidelity_{-}$ and $Fidelity_{+}$ scores

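Fidelity scores:
- $Fidelity_{-}$: the least important edges are dropped. Lower is better, because removing unimportant edges should leave the model's prediction nearly unchanged.
- $Fidelity_{+}$: the most important edges are dropped. Higher is better, because removing important edges should change the prediction substantially.

Fidelity scores can be computed either at a percentage sparsity (`sparsity`) or for a fixed number of edges (`topk`).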
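
As a minimal sketch of the edge selection described above (not the code in `baselines.eval_utils`), the snippet below ranks edges by importance and picks the ones to drop. The function name `edges_to_drop`, the `n_drop` parameter, and the reading of sparsity as the fraction of edges dropped are assumptions made for illustration. Ranking by absolute value when `apply_abs` is set is likewise an assumed interpretation of the argument of the same name.

```python
def edges_to_drop(scores, n_drop, score_type="neg", apply_abs=True):
    """Return the sorted indices of the edges to drop (hypothetical helper).

    score_type="neg": drop the n_drop least important edges (Fidelity-).
    score_type="pos": drop the n_drop most important edges (Fidelity+).
    """
    # Assumption: with apply_abs, importance is the magnitude of the score.
    importance = [abs(s) if apply_abs else s for s in scores]
    # Edge indices ordered from least to most important.
    order = sorted(range(len(scores)), key=lambda i: importance[i])
    n_drop = max(0, min(int(n_drop), len(scores)))
    if score_type == "neg":
        chosen = order[:n_drop]
    elif score_type == "pos":
        chosen = order[len(order) - n_drop:]
    else:
        raise ValueError("score_type must be 'neg' or 'pos'")
    return sorted(chosen)


scores = [0.40, -0.05, 0.10, -0.30, 0.02]

# Percentage sparsity (assumed reading): drop a fraction of all edges.
n_drop = int(round(0.4 * len(scores)))
least = edges_to_drop(scores, n_drop, score_type="neg")

# Top-k: drop a fixed number of the highest-ranked edges.
most = edges_to_drop(scores, 2, score_type="pos")
```

The fidelity value itself would then come from comparing the model's prediction on the full graph with its prediction after the selected edges are removed; that step depends on the model and is left out here.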

In [1]:
import torch
from baselines.eval_utils import fidelity_table, run_times_table

In [2]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
dataset_names = ["Cora", "CiteSeer"]

# Dataset-specific batch sizes used in the experiments
batch_sizes = {"Coauthor-CS": 512, "Coauthor-Physics": 128, "Cora": 1024, "CiteSeer": 1024,
               "PubMed": 1024, "Facebook": 1024}
res_root = './results'
num_repeats = 5

In [3]:
# Return (method name, result file path) pairs for one dataset and one repeat
def get_paths(dataset_name, rep_num=1):
    b = batch_sizes[dataset_name]  # the batch size is part of the GNNShap file names
    res_files = [
        ('SA', f'{res_root}/{dataset_name}_SA_run{rep_num}.pkl'),
        ('GNNExplainer', f'{res_root}/{dataset_name}_GNNExplainer_run{rep_num}.pkl'),
        ('PGExplainer', f'{res_root}/{dataset_name}_PGExplainer_run{rep_num}.pkl'),
        ('PGMExplainer', f'{res_root}/{dataset_name}_PGMExplainer_run{rep_num}.pkl'),
        ('GraphSVX', f'{res_root}/{dataset_name}_GraphSVX_SmarterSeparate_3_1000_run{rep_num}.pkl'),
        ('SVXSampler', f'{res_root}/{dataset_name}_GNNShap_SVXSampler_WLSSolver_10000_{b}_3_run{rep_num}.pkl'),
        ('GNNShap 10k', f'{res_root}/{dataset_name}_GNNShap_GNNShapSampler_WLSSolver_10000_{b}_run{rep_num}.pkl')
        ]
    return res_files
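
Because the tables below are built from these pickle files, it can help to confirm they exist first. The following is a hedged sketch rather than part of the project: `missing_result_files` is a hypothetical helper, and it assumes repeats are numbered from 1 to `num_repeats`, as the `rep_num=1` default suggests.

```python
from pathlib import Path


def missing_result_files(get_paths, dataset_names, num_repeats):
    """List (dataset, method, repeat, path) for every expected file that is absent.

    Assumption: repeats are numbered 1..num_repeats.
    """
    missing = []
    for name in dataset_names:
        for rep in range(1, num_repeats + 1):
            for method, path in get_paths(name, rep):
                if not Path(path).is_file():
                    missing.append((name, method, rep, path))
    return missing
```

Calling `missing_result_files(get_paths, dataset_names, num_repeats)` with the definitions above should return an empty list when every result file is in place.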

## Model Times

In [4]:
expl_df, expl_tr_df = run_times_table(get_paths, dataset_names, num_repeats)

100%|██████████| 2/2 [00:00<00:00, 72.09it/s]


### Explanation Generation Times

In [5]:
expl_df

| Method | Cora | CiteSeer |
|---|---|---|
| SA | 0.35±0.01 | 0.33±0.01 |
| GNNExplainer | 96.41±0.13 | 94.49±0.14 |
| PGExplainer | 0.29±0.00 | 0.51±0.00 |
| PGMExplainer | 733.69±0.88 | 1177.79±0.98 |
| GraphSVX | 908.45±0.61 | 259.65±0.98 |
| SVXSampler | 24.11±0.05 | 12.07±0.13 |
| GNNShap 10k | 6.68±0.08 | 3.61±0.11 |
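
To compare methods numerically, the cells can be split back into their two parts. This sketch assumes each cell is a string of the form `mean±std`; `parse_cell` is a hypothetical helper, not a function from `baselines.eval_utils`. The two inputs are the Cora entries for GNNExplainer and GNNShap 10k from the table above.

```python
def parse_cell(cell):
    """Split a 'mean±std' string into a (mean, std) pair of floats."""
    mean, std = cell.split("±")
    return float(mean), float(std)


# Cora generation times taken from the table above
mean_gnnexplainer, _ = parse_cell("96.41±0.13")
mean_gnnshap, _ = parse_cell("6.68±0.08")

# Ratio of mean generation times on Cora
speedup = mean_gnnexplainer / mean_gnnshap
```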


### Explanation Model Training Times

In [6]:
expl_tr_df

| Method | Cora | CiteSeer |
|---|---|---|
| SA | 0.00±0.00 | 0.00±0.00 |
| GNNExplainer | 0.00±0.00 | 0.00±0.00 |
| PGExplainer | 22.50±0.09 | 34.63±0.07 |
| PGMExplainer | 0.00±0.00 | 0.00±0.00 |
| GraphSVX | 0.00±0.00 | 0.00±0.00 |
| SVXSampler | 0.00±0.00 | 0.00±0.00 |
| GNNShap 10k | 0.00±0.00 | 0.00±0.00 |


## Fidelity Scores

### $Fidelity_{-}$ 30% Sparsity Results

In [7]:
fidelity_table(get_paths, dataset_names, sparsity=0.3, score_type='neg',
               topk=0, device=device, num_repeats=num_repeats, apply_abs=True)

100%|██████████| 2/2 [00:22<00:00, 11.25s/it]


| Method | Cora | CiteSeer |
|---|---|---|
| SA | 0.021±0.000 | 0.037±0.000 |
| GNNExplainer | 0.030±0.003 | 0.079±0.003 |
| PGExplainer | 0.062±0.005 | 0.060±0.002 |
| PGMExplainer | 0.025±0.001 | 0.038±0.002 |
| GraphSVX | 0.074±0.001 | 0.053±0.001 |
| SVXSampler | 0.061±0.000 | 0.045±0.000 |
| GNNShap 10k | 0.009±0.000 | 0.020±0.000 |


### $Fidelity_{+}$ Top-10 Results

In [8]:
fidelity_table(get_paths, dataset_names, sparsity=0.0, score_type='pos',
               topk=10, device=device, num_repeats=num_repeats, apply_abs=True)

100%|██████████| 2/2 [00:20<00:00, 10.36s/it]


| Method | Cora | CiteSeer |
|---|---|---|
| SA | 0.108±0.000 | 0.128±0.001 |
| GNNExplainer | 0.032±0.003 | 0.100±0.002 |
| PGExplainer | 0.081±0.005 | 0.112±0.003 |
| PGMExplainer | 0.133±0.013 | 0.134±0.007 |
| GraphSVX | 0.178±0.000 | 0.159±0.000 |
| SVXSampler | 0.199±0.001 | 0.167±0.000 |
| GNNShap 10k | 0.206±0.000 | 0.167±0.000 |
