Keywords: Graph Federated Learning, Heterogeneity, Dual-aspect Knowledge Sharing, Semantic Knowledge, Structural Knowledge
Abstract: Graph Federated Learning (GFL) enables distributed graph representation learning while protecting the privacy of graph data. However, GFL suffers from heterogeneity arising from diverse node features and structural topologies across multiple clients. To address both types of heterogeneity, we propose a novel graph Federated learning method via Semantic and Structural Alignment (FedSSA), which shares the knowledge of both node features and structural topologies among clients. For node feature heterogeneity, we propose a novel variational model to infer class-wise node distributions, so that we can cluster clients based on inferred distributions and construct cluster-level representative distributions. Then, we minimize the divergence between local and cluster-level distributions to facilitate semantic knowledge sharing. For structure heterogeneity, we employ spectral Graph Neural Networks (GNNs) and propose a novel spectral energy measure to characterize structural information, so that we can cluster clients based on spectral energy and build cluster-level spectral GNNs. Then, we align the spectral characteristics of local spectral GNNs with those of cluster-level spectral GNNs to enable structural knowledge sharing. Experiments on six homophilic and five heterophilic graph datasets under both non-overlapping and overlapping partitions demonstrate that FedSSA consistently outperforms eleven state-of-the-art methods.
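To make the structural side of the abstract more concrete, here is a minimal sketch of one way to summarize a client's graph in the spectral domain. It is an assumption for illustration, not the spectral energy measure defined by FedSSA: it projects node features onto the eigenvectors of the symmetric normalized Laplacian and reports the share of signal energy in equal-width frequency bands. The names `spectral_energy`, `cycle_adjacency`, and `n_bands` are hypothetical.

```python
import numpy as np


def spectral_energy(adj, features, n_bands=4):
    """Share of feature energy per Laplacian frequency band.

    adj: (n, n) symmetric adjacency matrix.
    features: (n, d) node feature matrix.
    NOTE: a generic graph-signal-processing summary, assumed for illustration;
    it is not claimed to match the measure used in FedSSA.
    """
    adj = np.asarray(adj, dtype=float)
    features = np.asarray(features, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    # Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}; eigenvalues lie in [0, 2].
    lap = np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    evals, evecs = np.linalg.eigh(lap)
    # Graph Fourier transform of every feature column, then energy per frequency.
    coeffs = evecs.T @ features
    per_freq = (coeffs ** 2).sum(axis=1)
    # Bin the frequencies into equal-width bands over [0, 2].
    scaled = np.clip(evals, 0.0, 2.0) / 2.0 * n_bands
    band = np.minimum(scaled.astype(int), n_bands - 1)
    energy = np.bincount(band, weights=per_freq, minlength=n_bands)
    total = energy.sum()
    return energy / total if total > 0 else energy


def cycle_adjacency(n):
    """Adjacency matrix of an n-node cycle, used only as a toy graph."""
    adj = np.zeros((n, n))
    for i in range(n):
        adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1.0
    return adj


if __name__ == "__main__":
    adj = cycle_adjacency(6)
    smooth = np.ones((6, 1))                      # constant signal
    rough = np.array([[1.0], [-1.0]] * 3)         # alternating signal
    print(spectral_energy(adj, smooth))
    print(spectral_energy(adj, rough))
```

On the toy cycle, a constant signal concentrates its energy in the lowest band and an alternating signal in the highest, so vectors like these could in principle be compared across clients before grouping them.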
😉 If FedSSA is helpful to you, please star this repo. Thanks! 🤗
Before running or modifying the code, you need to:
- Make sure Anaconda or Miniconda is installed.
- Clone this repo to your machine and create a conda environment:

      # git clone this repository
      git clone https://github.com/blgpb/FedSSA
      cd FedSSA

      # create a new Anaconda env
      conda create -n fedssa python=3.8 -y
      conda activate fedssa

- Install required packages:

      # install python dependencies
      pip install -r requirements.txt

- It is recommended to run experiments on an NVIDIA GeForce RTX 4090 GPU.
Requirements:
- Python 3.8.8
- PyTorch 1.12.0+cu113
- PyTorch Geometric 2.5.1
- NumPy, SciPy, and other dependencies listed in `requirements.txt`
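As a quick sanity check after installation, the pinned versions above can be compared against what is actually installed. This is a minimal sketch using only the standard library; the distribution names `torch` and `torch-geometric` and the helper `check_versions` are assumptions, not part of the repository.

```python
from importlib import metadata

# Versions copied from the Requirements list above.
# The distribution names are assumptions about how the packages were installed.
EXPECTED = {"torch": "1.12.0", "torch-geometric": "2.5.1"}


def check_versions(expected):
    """Return a {package: status} report for the given {package: version} pins."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        # A prefix match tolerates local build tags such as "+cu113".
        if found.startswith(wanted):
            report[name] = "ok"
        else:
            report[name] = f"found {found}, expected {wanted}"
    return report


if __name__ == "__main__":
    for name, status in check_versions(EXPECTED).items():
        print(f"{name}: {status}")
```

The prefix comparison is a deliberate simplification: it accepts `1.12.0+cu113` for a pin of `1.12.0` but does not implement full version ordering.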
Example usage:

    python main.py --dataset Cora --n_clients 20 --mode disjoint

- `--dataset`: Dataset name (Cora, CiteSeer, PubMed, Computers, Photo, ogbn-arxiv, Roman-empire, Amazon-ratings, Minesweeper, Tolokers, Questions)
- `--mode`: Partition mode (`disjoint` for non-overlapping, `overlapping` for overlapping partitions)
- `--n-clients`: Number of clients (default: 10)
Download Pre-processed Datasets
Download the pre-processed datasets from Google Drive (https://drive.google.com/file/d/1PyqvR6yL43Om42fdsbKHj5WCgREvi3St/view?usp=sharing) and unzip the archive. Place the datasets folder in the same directory as README.md.
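To confirm the folder ended up in the right place, a small check like the following can be run from anywhere. It is only a sketch: it assumes the unzipped folder is literally named `datasets`, and the helper `datasets_in_place` is hypothetical rather than part of the repository.

```python
from pathlib import Path


def datasets_in_place(repo_root):
    """True if README.md and a datasets/ folder sit side by side in repo_root.

    Assumes the unzipped folder is literally named "datasets".
    """
    root = Path(repo_root)
    return (root / "README.md").is_file() and (root / "datasets").is_dir()


if __name__ == "__main__":
    # Check the current working directory, e.g. after `cd FedSSA`.
    print(datasets_in_place("."))
```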
FedSSA demonstrates superior performance across diverse graph datasets:
- Homophilic Datasets: Consistent improvements across Cora, CiteSeer, PubMed, Amazon-Computer, Amazon-Photo, and ogbn-arxiv
- Heterophilic Datasets: Average improvement of 5.79% over the second-best method on all heterophilic datasets (Roman-empire, Amazon-ratings, Minesweeper, Tolokers, Questions)
- Robustness: Superior performance under both non-overlapping and overlapping partition settings
- Dual-Aspect Knowledge Sharing: Addresses both node feature heterogeneity and structural topology heterogeneity
- Semantic Knowledge Sharing: Variational model-based inference of class-wise node distributions with cluster-level alignment
- Structural Knowledge Sharing: Spectral GNN with novel spectral energy measure for structural information capture and alignment
- Scalability: Efficient distributed computation across multiple clients
- Versatility: Comprehensive coverage of 11 graph datasets with both homophilic and heterophilic properties
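The semantic alignment idea in the list above can be illustrated with a small numerical sketch. The assumptions here are ours, not the repository's: each class-wise node distribution is modeled as a diagonal Gaussian, the divergence is the KL divergence, and a cluster-level distribution is formed by averaging the means and variances of its member clients. All function names are hypothetical.

```python
import numpy as np


def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL( N(mu_p, diag(var_p)) || N(mu_q, diag(var_q)) ), summed over dimensions."""
    mu_p, var_p, mu_q, var_q = (
        np.asarray(a, dtype=float) for a in (mu_p, var_p, mu_q, var_q)
    )
    terms = np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    return 0.5 * float(np.sum(terms))


def cluster_representative(client_stats):
    """Average the (mean, variance) pairs of the clients in one cluster.

    Plain averaging is an illustrative choice, not the method from the paper.
    """
    mus = np.stack([mu for mu, _ in client_stats])
    variances = np.stack([var for _, var in client_stats])
    return mus.mean(axis=0), variances.mean(axis=0)


def alignment_loss(local_stats, cluster_stats):
    """Sum of class-wise KL terms over the classes both sides have.

    Each argument maps a class label to a (mean, variance) pair.
    """
    shared = local_stats.keys() & cluster_stats.keys()
    return sum(gaussian_kl(*local_stats[c], *cluster_stats[c]) for c in shared)


if __name__ == "__main__":
    client_a = (np.array([0.0, 0.0]), np.array([1.0, 1.0]))
    client_b = (np.array([2.0, 0.0]), np.array([1.0, 1.0]))
    rep = cluster_representative([client_a, client_b])
    # Class 0 of client A is compared against the cluster-level distribution.
    print(alignment_loss({0: client_a}, {0: rep}))
```

In a training loop, a term of this kind would typically be added to the local objective so that each client's class-wise distributions are pulled toward those of its cluster.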
This project is licensed under the GNU General Public License v3.0 (GPL-3.0). See LICENSE.txt for details.
If you have any questions or suggestions, please feel free to contact us.
