### Explanation for Graph Classification

In this notebook, we test the consistency of explanations: if the inputs are identical, we expect the explanations to be identical as well.

We ran the explainer twice on the same set of graphs and saved the explanations from each round in separate CSV files. First, we load these files into two dataframes.

In [1]:
# Imports

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import math

In [2]:
df1 = pd.read_csv("syn6_round1.csv",usecols=["Weights of regression","Bias of regression"])
df2 = pd.read_csv("syn6_round2.csv",usecols=["Weights of regression","Bias of regression"])

In [9]:
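def convert_to_numpy(d):
    """Parse an array of weights that was saved in the CSV as a string.

    The "Weights of regression" column in both dataframes holds arrays
    stored as strings. Despite its name, this function returns a plain
    Python list, so the bias can be appended to it afterwards.

    Args:
        d (str): array of weights stored in the CSV as a string

    Returns:
        list of float: the parsed weights
    """
    s = d.split(']')    # keep only the text before the first ']'
    s = s[0].split()    # split that text on whitespace
    s = s[1:]           # drop the first token, expected to be the opening bracket
    a = []

    for c in s:
        if c != '[' and c != ']':
            a.append(float(c))
    return a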
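
Note that `convert_to_numpy` always discards the first whitespace-separated token, so it relies on the opening bracket being a token of its own. The following is a minimal sketch of a more tolerant parser, assuming each cell holds a NumPy-style string such as `[ 0.1 -0.2  0.3]`. The name `parse_weights` is hypothetical and not part of the notebook.

```python
import numpy as np

def parse_weights(cell):
    """Hypothetical helper: parse a NumPy-style array string into a 1-D float array."""
    # Replace brackets with spaces so "[0.1", "[" and "0.3]" are all handled alike
    cleaned = cell.replace('[', ' ').replace(']', ' ')
    # split() with no argument also copes with repeated spaces and line breaks
    return np.array(cleaned.split(), dtype=float)

print(parse_weights("[ 0.1 -0.2  0.3]").tolist())   # [0.1, -0.2, 0.3]
print(parse_weights("[0.1 0.2 0.3]").tolist())      # [0.1, 0.2, 0.3]
```

Unlike the notebook's function, this sketch returns a NumPy array, so a bias would have to be added with `np.append` rather than `list.append`.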

Now that the data is loaded, we want to measure how close the two sets of explanations are. We first introduce some notation:

- a1: numpy array of the regression weights together with the bias, from df1
- a2: numpy array of the regression weights together with the bias, from df2

Next, to quantify this closeness, we define the following metric, similar to the one used for coherence:

![Formula for the consistency metric](consistency_formula.png "Consistency metric")
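
For readers who cannot view the image, the metric can be written out as follows. This is read off the code cells in this notebook rather than the image itself, so treat the notation (in particular the hats for min-max normalized vectors) as an assumption:

$$
\text{closeness}(a_1, a_2) = \frac{2}{1 + \exp\left(\lVert \hat{a}_1 - \hat{a}_2 \rVert_2^2\right)},
\qquad
\hat{a} = \frac{a - \min(a)}{\max(a) - \min(a)}
$$

The value is 1 when the two normalized vectors are identical and decreases towards 0 as they move apart.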


In [4]:
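def normalize(A):
    """Min-max normalize array A to the range [0, 1]

    Args:
        A (numpy array): array to normalize

    Returns:
        numpy array: normalized array
    """
    # Note: a constant array gives scale_factor == 0 and a division by zero
    scale_factor = A.max() - A.min()

    # Broadcasting subtracts the minimum from every element
    A = (A - A.min())/scale_factor
    return A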
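
`normalize` divides by `A.max() - A.min()`, which is zero when every entry of the vector is equal. The sketch below shows one possible guard, under the assumption that mapping a constant vector to all zeros is acceptable here. `normalize_safe` is a hypothetical name, not a function from the notebook.

```python
import numpy as np

def normalize_safe(A):
    """Hypothetical variant of normalize: min-max scale A to [0, 1]."""
    A = np.asarray(A, dtype=float)
    scale_factor = A.max() - A.min()
    if scale_factor == 0:
        # Assumption: a constant vector is mapped to zeros instead of dividing by zero
        return np.zeros_like(A)
    return (A - A.min()) / scale_factor

print(normalize_safe([2.0, 4.0, 6.0]).tolist())   # [0.0, 0.5, 1.0]
print(normalize_safe([3.0, 3.0, 3.0]).tolist())   # [0.0, 0.0, 0.0]
```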

In [5]:
def prep_vector(df_num,i):
    """Prepare vector for further calculations

    Args:
        df_num (str): '1' selects df1, '2' selects df2
        i (int): Index in the dataframe

    Returns:
        numpy array: array of weights along with the bias
    """
    df = {'1': df1, '2': df2}[df_num]      # Look up the dataframe without eval
    a = convert_to_numpy(df['Weights of regression'][i])
    a.append(df["Bias of regression"][i])
    a = np.array(a)

    return a

In [12]:

closeness = []
for i in range(len(df1)):
    # Form numpy arrays a1 and a2 by appending corresponding biases to the list of weights
    a1 = prep_vector('1',i)
    a2 = prep_vector('2',i)

    # print(f"a1: {a1}")
    # print(f"a2: {a2}")
    # Normalize the vectors
    a1 = normalize(a1)
    a2 = normalize(a2)

    # Evaluating the required metric
    a = (np.linalg.norm(a1 - a2, 2)**2)

    closeness.append(2/(1 + math.exp(a)))

closeness = np.array(closeness)
mean = np.mean(closeness)
std = np.std(closeness)

print(f"Closeness: {mean} +/- {std}")


Closeness: 1.0 +/- 0.0
60
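
As a quick sanity check on how to read this result, the sketch below evaluates the same expression used in the loop on two toy pairs of already-normalized vectors. `closeness_score` is a hypothetical helper and the vectors are made-up illustrations, not data from the CSV files.

```python
import math
import numpy as np

def closeness_score(a1, a2):
    """Hypothetical helper: 2 / (1 + exp(||a1 - a2||_2^2)) for already-normalized vectors."""
    diff = np.asarray(a1, dtype=float) - np.asarray(a2, dtype=float)
    sq_dist = float(np.sum(diff ** 2))   # squared Euclidean distance
    return 2.0 / (1.0 + math.exp(sq_dist))

same = closeness_score([0.0, 0.5, 1.0], [0.0, 0.5, 1.0])
different = closeness_score([0.0, 0.5, 1.0], [0.0, 1.0, 0.5])
print(same)        # 1.0
print(different)   # strictly between 0 and 1
```

A score of exactly 1.0 requires a squared distance of zero, so a mean of 1.0 with zero spread is what one would expect if both rounds produced the same normalized vectors for every graph.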
