## Experiment Setting: On the impact of feature orthogonality for Link Representation Learning with Message Passing Neural Networks (MPNNs)
This experiment analyzes the capacity of MPNNs to capture structural features under varying configurations, namely the type of MPNN and the node features employed.


## Trial 1

1. Simplified MPNN: a mapping $f(\tilde{\mathbf{A}}, \mathbf{X}) \to \mathbf{H}$ with $\mathbf{H} = \text{softmax}(\text{Act}((\tilde{\mathbf{A}} \mathbf{X} \mathbf{W}^0) \mathbf{W}^1))$.
    - $\mathbf{h}_i \in \mathbb{R}^{d}$: embedding of vertex $i$; $n$: number of vertices; $d$: number of dimensions.
    - $\mathbf{X} \in \mathbb{R}^{n \times d}$: initial node features.
    - $\mathbf{H}^*$: optimized node embeddings w.r.t. all weight matrices $\mathbf{W}$.
    - $\text{Act}$: activation function, typically nonlinear, e.g. [ReLU](https://en.wikipedia.org/wiki/Rectifier_(neural_networks)).
    

2. Loss Function: $\operatorname*{arg\,min}_{\mathbf{H}^*} \underbrace{\sum_{(i,j) \in E_{\text{pos}}} \Vert 1 - \sigma(h_i, h_j)^\top \Vert_2^2}_{\text{positive samples}} + \underbrace{\sum_{(i,j) \in E_{\text{neg}}} \Vert 0 - \sigma(h_i, h_j)^\top \Vert_2^2}_{\text{negative samples}}$

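The two definitions above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the code used in the experiment: $\tilde{\mathbf{A}}$ is assumed to be the symmetrically normalized adjacency with self-loops, the softmax is assumed to be row-wise, and $\sigma(h_i, h_j)$ is assumed to be the sigmoid of the inner product. The names `normalize_adjacency`, `simplified_mpnn`, and `link_loss`, the toy graph, and the hidden size are hypothetical.

```python
import numpy as np


def normalize_adjacency(A):
    """Assumed form of A-tilde: D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def softmax(Z):
    """Row-wise softmax, shifted for numerical stability."""
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)


def simplified_mpnn(A_tilde, X, W0, W1):
    """H = softmax(Act((A_tilde X W0) W1)) with Act = ReLU."""
    return softmax(np.maximum((A_tilde @ X @ W0) @ W1, 0.0))


def link_loss(H, pos_edges, neg_edges):
    """Squared error between link scores and the labels 1 (positive) / 0 (negative)."""
    def score(edges):
        i, j = np.asarray(edges).T
        # Assumption: sigma(h_i, h_j) = sigmoid(<h_i, h_j>)
        return 1.0 / (1.0 + np.exp(-np.sum(H[i] * H[j], axis=1)))
    return np.sum((1.0 - score(pos_edges)) ** 2) + np.sum((0.0 - score(neg_edges)) ** 2)


# Toy path graph 0-1-2-3 with one-hot (mutually orthogonal) node features
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.eye(4)
rng = np.random.default_rng(0)
W0 = rng.normal(size=(4, 8))   # hypothetical hidden size 8
W1 = rng.normal(size=(8, 4))

H = simplified_mpnn(normalize_adjacency(A), X, W0, W1)
loss = link_loss(H, pos_edges=[(0, 1), (1, 2), (2, 3)],
                 neg_edges=[(0, 2), (0, 3), (1, 3)])
print(H.shape, float(loss))
```

In the experiment the weights would be optimized to minimize this loss; the sketch only evaluates it once for random weights.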


In [None]:
import os

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

root = '/pfs/work7/workspace/scratch/cc7738-kdd25/Universal-MP/trials/'

In [1]:
def merge_csv_files(results_dir, output_file, key):
    """Merge all result files ending in '{key}.csv' and save per-group statistics.

    Returns the concatenated raw results as a DataFrame.
    """
    dataframes = []  # One DataFrame per matching result file
    results_dir = os.path.join(root, results_dir)
    # Iterate over all files in the directory
    for filename in os.listdir(results_dir):
        if filename.endswith(f"{key}.csv"):  # e.g. key='CN_ddi' matches '*CN_ddi.csv'
            file_path = os.path.join(results_dir, filename)
            print(f"Loading file: {file_path}")
            dataframes.append(pd.read_csv(file_path))  # Read the CSV file

    if not dataframes:
        # Fail early instead of returning an undefined variable
        raise FileNotFoundError(f"No files ending with '{key}.csv' found in {results_dir}.")

    # Concatenate all DataFrames
    merged = pd.concat(dataframes, ignore_index=True)
    # Group by 'Model' and 'NodeFeat' and calculate mean and variance of the test loss
    stats = merged.groupby(['Model', 'NodeFeat'])['Test_Loss'].agg(['mean', 'var']).reset_index()

    stats.to_csv(output_file, index=False)
    print(f"Merged {len(dataframes)} files; statistics written to {output_file}")
    return merged


In [None]:
# Plot mean test loss per model and node feature from the raw merged results
def plot_result(df):

    # Group and process the data
    result = df.groupby(['Model', 'NodeFeat'])['Test_Loss'].agg(['mean', 'var']).reset_index()
    result.columns = ['Model', 'NodeFeat', 'Mean_Test_Loss', 'Variance_Test_Loss']

    # Handle non-positive variance
    result['Variance_Test_Loss'] = result['Variance_Test_Loss'].apply(lambda x: x if x > 0 else 1e-9)

    # Unique models and features
    unique_models = result['Model'].unique()
    unique_features = result['NodeFeat'].unique()

    # Define colors
    colors = ['#fbb4ae', '#b3cde3', '#ccebc5', '#decbe4', '#fed9a6']

    # Define plot settings
    width = 0.15  # Width of each bar
    x = np.arange(len(unique_models))  # X positions for the models

    # Create the plot
    fig, ax = plt.subplots(figsize=(12, 6))

    for i, feature in enumerate(unique_features):
        feature_data = result[result['NodeFeat'] == feature]
        values = feature_data['Mean_Test_Loss'].values
        errors = np.sqrt(feature_data['Variance_Test_Loss'].values)  # Standard deviation for error bars

        # Plot bar with error bars
        ax.bar(x + i * width, values, width, yerr=errors, label=feature, color=colors[i % len(colors)], capsize=5)

    # Set labels and title
    ax.set_ylabel('Mean Test Loss')
    ax.set_title('Mean and Variance of Test Loss for Common Neighbor Across Models', loc='center', fontsize=14, fontweight='bold')
    ax.set_xticks(x + width * (len(unique_features) - 1) / 2)
    ax.set_xticklabels(unique_models)
    ax.legend(loc='upper right', title="Feature Type")

    fig.tight_layout()
    return fig  # A Figure provides savefig(), so callers can save it directly

In [None]:
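# Directory (relative to `root`) containing the per-run result files
results_dir = "results/ddi"

# Common Neighbor (CN) results
merged_cn = merge_csv_files(results_dir, "merged_CN_ddi.csv", "CN_ddi")
print(merged_cn)
img = plot_result(merged_cn)
img.savefig('CN_ddi.png')

# PPR results
merged_ppr = merge_csv_files(results_dir, "merged_PPR_ddi.csv", "PPR_ddi")
print(merged_ppr)
img = plot_result(merged_ppr)
# plot_result hard-codes a Common Neighbor title, so override it for the PPR plot
img.gca().set_title('Mean and Variance of Test Loss for PPR Across Models', loc='center', fontsize=14, fontweight='bold')
img.savefig('PPR_ddi.png')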

In [None]:
merged_ppr

In [None]:
merged_cn

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_result(df, title='Mean and Variance of Test Loss for Common Neighbor Across Models'):
    """Bar plot with value labels (redefines the earlier `plot_result`).

    Accepts raw results with a 'Test_Loss' column or pre-aggregated statistics
    with 'Mean_Loss' and 'Variance_Loss' columns.
    """
    if "Mean_Loss" not in df.columns:
        # Aggregate raw runs into per-(Model, NodeFeat) mean and variance
        df = df.groupby(["Model", "NodeFeat"])["Test_Loss"].agg(["mean", "var"]).reset_index()
        df.columns = ["Model", "NodeFeat", "Mean_Loss", "Variance_Loss"]

    # Plot settings
    fig, ax = plt.subplots(figsize=(10, 6))
    methods = df["Model"].unique()
    unique_features = df["NodeFeat"].unique()
    x = np.arange(len(methods))  # x-axis positions for each method
    width = 0.15  # Width of each bar
    colors = ['#fbb4ae', '#b3cde3', '#ccebc5', '#decbe4', '#fed9a6']  # Soft pastel shades

    # Add zero-valued rows for missing (Model, NodeFeat) pairs so every feature has one bar per model
    complete_index = pd.MultiIndex.from_product([methods, unique_features], names=["Model", "NodeFeat"])
    df = df.set_index(["Model", "NodeFeat"]).reindex(complete_index, fill_value=0).reset_index()

    for i, feature in enumerate(unique_features):
        feature_data = df[df["NodeFeat"] == feature]
        values = feature_data["Mean_Loss"].values
        # Standard deviation for error bars; a NaN variance (single run) is drawn as zero
        errors = np.sqrt(np.nan_to_num(feature_data["Variance_Loss"].values))

        # Plot bar with error bars
        ax.bar(x + i * width, values, width, yerr=errors, label=feature, color=colors[i % len(colors)], capsize=5)

        # Adding data labels for mean values
        for j, val in enumerate(values):
            ax.text(x[j] + i * width, val + 0.001, f'{val:.3f}', ha='center', va='bottom')

    # Set labels and title
    ax.set_ylabel('Mean Test Loss')
    ax.set_title(title, loc='center', fontsize=14, fontweight='bold')
    ax.set_xticks(x + width * (len(unique_features) - 1) / 2)
    ax.set_xticklabels(methods)
    ax.legend(loc='upper right', title="Feature Type")

    plt.tight_layout()
    plt.show()



### GCN 
$\mathbf{X}^{\prime} = \mathbf{\hat{D}}^{-1/2} \mathbf{\hat{A}}
\mathbf{\hat{D}}^{-1/2} \mathbf{X} \mathbf{W}$

$\mathbf{\hat{A}} = \mathbf{A} + \mathbf{I}$ denotes the adjacency matrix with inserted self-loops and 
$\hat{D}_{ii} = \sum_{j=0} \hat{A}_{ij}$ its diagonal degree matrix.

$    \mathbf{x}^{\prime}_i = \mathbf{W}^{\top} \sum_{j \in
    \mathcal{N}(i) \cup \{ i \}} \frac{e_{j,i}}{\sqrt{\hat{d}_j
    \hat{d}_i}} \mathbf{x}_j $

$\hat{d}_i = 1 + \sum_{j \in \mathcal{N}(i)} e_{j,i}$, where $e_{j,i}$ denotes the edge weight from source node $j$ to target node $i$ (default: $1.0$).

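As a sanity check on the two GCN formulas above, the following NumPy sketch computes the layer once in matrix form and once node by node and compares the results. It assumes an undirected graph without pre-existing self-loops and unit edge weights; the function names and the toy graph are hypothetical, and `X[j] @ W` is the row-vector equivalent of $\mathbf{W}^{\top} \mathbf{x}_j$.

```python
import numpy as np


def gcn_matrix_form(A, X, W):
    """X' = D_hat^{-1/2} A_hat D_hat^{-1/2} X W."""
    A_hat = A + np.eye(A.shape[0])             # adjacency with inserted self-loops
    d_hat = A_hat.sum(axis=1)                  # degrees including the self-loop
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d_hat))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W


def gcn_node_form(A, X, W):
    """Same layer, one target node i at a time (A[j, i] is the edge weight e_{j,i})."""
    n = A.shape[0]
    d_hat = 1.0 + A.sum(axis=0)                # d_hat_i = 1 + sum_j e_{j,i}
    out = np.zeros((n, W.shape[1]))
    for i in range(n):
        neighbors = list(np.flatnonzero(A[:, i])) + [i]   # N(i) plus i itself
        for j in neighbors:
            e_ji = 1.0 if j == i else A[j, i]  # the self-loop has weight 1
            out[i] += e_ji / np.sqrt(d_hat[j] * d_hat[i]) * (X[j] @ W)
    return out


# Toy undirected graph: a triangle (0, 1, 2) with a pendant node 3 attached to 2
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 2))

out_matrix = gcn_matrix_form(A, X, W)
out_node = gcn_node_form(A, X, W)
print(np.allclose(out_matrix, out_node))
```

For a symmetric adjacency matrix the row and column degree sums coincide, which is why the two forms agree here.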
### SAGE 
$\mathbf{x}^{\prime}_i = \mathbf{W}_1 \mathbf{x}_i + \mathbf{W}_2 \cdot
    \mathrm{mean}_{j \in \mathcal{N}(i)} \mathbf{x}_j $

Optionally, each neighbor feature $\mathbf{x}_j$ is first projected before aggregation:

$    \mathbf{x}_j \leftarrow \sigma ( \mathbf{W}_3 \mathbf{x}_j +
    \mathbf{b}) $

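The mean-aggregation update can be checked by hand on a small star graph. The sketch below is a minimal NumPy version in row-vector convention (so `X @ W1` plays the role of $\mathbf{W}_1 \mathbf{x}_i$); it omits the optional projection step, and `sage_layer` and the toy values are hypothetical.

```python
import numpy as np


def sage_layer(A, X, W1, W2):
    """x'_i = W1 x_i + W2 * mean_{j in N(i)} x_j, in row-vector convention."""
    deg = A.sum(axis=1, keepdims=True)
    # Mean over neighbors; isolated nodes (degree 0) get a zero neighbor term
    neighbor_mean = (A @ X) / np.maximum(deg, 1.0)
    return X @ W1 + neighbor_mean @ W2


# Star graph: node 0 is connected to nodes 1 and 2
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 3.0]])
W1 = np.eye(2)   # identity weights keep the arithmetic easy to follow
W2 = np.eye(2)

out = sage_layer(A, X, W1, W2)
print(out)
```

With identity weights, node 0 ends up at $[1, 0] + \mathrm{mean}([0, 1], [0, 3]) = [1, 2]$, while nodes 1 and 2 each add the feature of their only neighbor, node 0.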
### GIN
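$\mathbf{x}^{\prime}_i = h_{\mathbf{\Theta}} \left( (1 + \epsilon) \cdot
        \mathbf{x}_i + \sum_{j \in \mathcal{N}(i)} \mathrm{ReLU}
        ( \mathbf{x}_j + \mathbf{e}_{j,i} ) \right)$

Here $h_{\mathbf{\Theta}}$ is a neural network (e.g., an MLP), $\epsilon$ is a fixed or learnable scalar, and $\mathbf{e}_{j,i}$ are the features of the edge from $j$ to $i$; this is the edge-aware (GINE) form of the GIN update.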

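A minimal NumPy sketch of the update above, assuming directed edges given as `(j, i)` pairs with one feature vector per edge. The name `gine_layer` and the toy values are hypothetical, and $h_{\mathbf{\Theta}}$ is replaced by the identity so the result can be verified by hand; in practice it would be an MLP.

```python
import numpy as np


def gine_layer(edges, edge_feat, X, eps, h_theta):
    """x'_i = h_theta((1 + eps) * x_i + sum_{j in N(i)} ReLU(x_j + e_{j,i}))."""
    agg = np.zeros_like(X)
    for (j, i), e in zip(edges, edge_feat):   # directed edge j -> i
        agg[i] += np.maximum(X[j] + e, 0.0)
    return h_theta((1.0 + eps) * X + agg)


X = np.array([[1.0, -2.0],
              [0.5, 0.5]])
edges = [(0, 1), (1, 0)]                      # 0 -> 1 and 1 -> 0
edge_feat = np.array([[0.0, 1.0],
                      [-1.0, 0.0]])

# Identity stands in for the MLP h_theta (an assumption made for readability)
out = gine_layer(edges, edge_feat, X, eps=0.0, h_theta=lambda Z: Z)
print(out)
```

Node 1 receives $\mathrm{ReLU}([1, -2] + [0, 1]) = [1, 0]$ and node 0 receives $\mathrm{ReLU}([0.5, 0.5] + [-1, 0]) = [0, 0.5]$, so the output rows are $[1, -1.5]$ and $[1.5, 0.5]$.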
### LINKX
$\mathbf{H}_{\mathbf{A}} = \text{MLP}_{\mathbf{A}}(\mathbf{A})$

$\mathbf{H}_{\mathbf{X}} = \textrm{MLP}_{\mathbf{X}}(\mathbf{X})$

$\mathbf{Y} = \textrm{MLP}_{f} \left( \sigma \left( \mathbf{W}
[\mathbf{H}_{\mathbf{A}}, \mathbf{H}_{\mathbf{X}}] +
\mathbf{H}_{\mathbf{A}} + \mathbf{H}_{\mathbf{X}} \right) \right)$

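The three LINKX equations compose as in the following NumPy sketch. It is a simplified illustration, not a reference implementation: biases are omitted, $\sigma$ is assumed to be ReLU, and `mlp`, `linkx_forward`, the `params` layout, and all sizes are hypothetical.

```python
import numpy as np


def mlp(Z, weights):
    """Plain MLP without biases: ReLU between layers, none after the last one."""
    for k, W in enumerate(weights):
        Z = Z @ W
        if k < len(weights) - 1:
            Z = np.maximum(Z, 0.0)
    return Z


def linkx_forward(A, X, params):
    H_A = mlp(A, params["mlp_A"])             # embed each adjacency row
    H_X = mlp(X, params["mlp_X"])             # embed each node feature vector
    # W [H_A, H_X] + H_A + H_X, with [.,.] as column-wise concatenation
    combined = np.concatenate([H_A, H_X], axis=1) @ params["W"] + H_A + H_X
    return mlp(np.maximum(combined, 0.0), params["mlp_f"])   # sigma assumed to be ReLU


rng = np.random.default_rng(0)
n, d, h, c = 5, 3, 4, 2                       # nodes, features, hidden size, outputs

# Random undirected toy graph without self-loops
A = np.triu((rng.random((n, n)) < 0.4).astype(float), 1)
A = A + A.T
X = rng.normal(size=(n, d))

params = {
    "mlp_A": [rng.normal(size=(n, h))],
    "mlp_X": [rng.normal(size=(d, h))],
    "W": rng.normal(size=(2 * h, h)),
    "mlp_f": [rng.normal(size=(h, h)), rng.normal(size=(h, c))],
}

Y = linkx_forward(A, X, params)
print(Y.shape)
```

Note that the adjacency matrix enters only through $\text{MLP}_{\mathbf{A}}$, so no neighborhood aggregation takes place, in contrast to the three layers above.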
In [None]:
plot_result(merged_ppr)

PPR