Beyond One-Size-Fits-All: Adaptive Subgraph Denoising for Zero-Shot Graph Learning with Large Language Models

🎉 Accepted by ACM SIGKDD 2026 Research Track

📊 Datasets

All required datasets are hosted on Hugging Face. Download them and arrange them in the layout below.

data/
├── train_data/
│   ├── raw_data/
│   │   ├── graphr1/
│   │   │   └── sft_train_data_filter2048_10000_balance.csv    # GraphR1 SFT raw data (for SFT data generation)
│   │   ├── {arxiv,children,computer,history,photo,pubmed,reddit,sports,wn18rr}/
│   │   │   ├── {dataset}_summary.jsonl                        # Node summaries (for RL data generation)
│   │   │   └── {dataset}_graph_bert_encoded.pt                # Graph BERT encodings (for RL data generation)
│   └── rl_data/
│       └── test_1050.parquet                                  # GraphR1 validation data (for RL validation data)
└── test_data/
    └── raw_data/
        └── gofa_test_data/
            ├── gofa_test_data_53114.csv                       # GOFA benchmark (for evaluation)
            └── gofa_supply_data_35603.csv                     # GOFA supplement benchmark (for evaluation)

outputs/synthetic_data/
├── sft_data/
│   └── graph_sft_data.json                                    # SFT training data (generated by sft_data_generator.py)
└── rl_data/
    ├── train_9799_graphr2_subgraph.parquet                    # RL training data (generated by rl_data_generator.py)
    └── test_600_graphr2_subgraph.parquet                      # RL validation data (generated by rl_data_generator.py)

data/train_data/raw_data/graphr1/: Source data for SFT prompt construction and teacher reasoning.
data/train_data/raw_data/{dataset}/: Graph datasets with node summaries and BERT encodings, used for RL subgraph sampling (via preprocess.py) and prompt construction.
data/train_data/rl_data/test_1050.parquet: GraphR1 validation set, converted to SSR prompt format during RL data generation.
data/test_data/: GOFA benchmark datasets for model evaluation.
outputs/synthetic_data/: Generated training/validation data.
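Before running any pipeline, it can help to confirm that the downloaded files sit where the layout above expects them. The following is a minimal sketch and not part of the repository: `expected_inputs` and `missing_inputs` are hypothetical helpers, and the file names are copied from the tree above.

```python
from pathlib import Path

# Dataset names copied from the directory layout above.
GRAPH_DATASETS = [
    "arxiv", "children", "computer", "history", "photo",
    "pubmed", "reddit", "sports", "wn18rr",
]


def expected_inputs(root: Path) -> list[Path]:
    """List the downloaded input files that the layout above describes."""
    raw = root / "data" / "train_data" / "raw_data"
    paths = [raw / "graphr1" / "sft_train_data_filter2048_10000_balance.csv"]
    for name in GRAPH_DATASETS:
        paths.append(raw / name / f"{name}_summary.jsonl")
        paths.append(raw / name / f"{name}_graph_bert_encoded.pt")
    paths.append(root / "data" / "train_data" / "rl_data" / "test_1050.parquet")
    gofa = root / "data" / "test_data" / "raw_data" / "gofa_test_data"
    paths.append(gofa / "gofa_test_data_53114.csv")
    paths.append(gofa / "gofa_supply_data_35603.csv")
    return paths


def missing_inputs(root: Path) -> list[Path]:
    """Return the expected files that are not present under root."""
    return [path for path in expected_inputs(root) if not path.is_file()]
```

Calling `missing_inputs(Path("."))` from the repository root returns an empty list once every download is in place.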

⚙️ Configuration

Some environment-specific configurations (e.g., cluster node hostnames, Docker image paths, volume mounts, model checkpoint paths, and vLLM service addresses) are provided as placeholders in the scripts and Python source files. Please search for and replace these placeholders to match your own environment before running.
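One way to locate the placeholders is to scan the scripts, Python sources, and YAML files for a marker pattern. The sketch below is an illustration only: the repository does not document the placeholder format, so the default pattern (angle-bracketed upper-case names such as `<HEAD_NODE>`) is an assumption, and `find_placeholders` is a hypothetical helper.

```python
import re
from pathlib import Path

# Assumption: placeholders look like <HEAD_NODE>. Adjust the pattern to
# whatever marker the scripts actually use.
PLACEHOLDER = re.compile(r"<[A-Z][A-Z0-9_]*>")
SUFFIXES = {".sh", ".py", ".yaml"}


def find_placeholders(root: Path, pattern: re.Pattern = PLACEHOLDER):
    """Yield (path, line number, stripped line) for each matching line."""
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                yield path, number, line.strip()
```

Iterating over `find_placeholders(Path("."))` from the repository root gives a checklist of lines to review before the first run.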

🚀 SSR-SFT

Prepare SFT data

Generate the SFT dataset with ./data_generation/pipelines/sft_data_generator.py once the prerequisites below are in place.

Data prerequisites: Download the GraphR1 SFT data and place it at data/train_data/raw_data/graphr1/sft_train_data_filter2048_10000_balance.csv.

Model prerequisites: Deploy the teacher model and diversity model via vLLM on your GPU nodes using ./scripts/deploy_api.sh. Update the node list, model path, and served model name in the script, then:

cd scripts
bash deploy_api.sh

Update choose_name and URL_LIST_DS in ./utils/utils.py for the teacher model.
Update choose_name and URL_LIST_DV in ./data_generation/quality_check/graph_diversity.py for the diversity model.

sft_data_generator.py executes three steps automatically:

Construct prompts: Extract graph information from the GraphR1 SFT data and build SSR prompts.
Teacher reasoning + quality filtering: Use the teacher model to generate reasoning traces, compute structural diversity scores, and filter samples by answer correctness and diversity. This step repeats until the proportion of samples with a correct answer reaches 90%.
Construct training data: Convert the filtered data into the final SFT training format.

cd ./data_generation/pipelines
python sft_data_generator.py
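The retry loop in the second step can be pictured as follows. This is a simplified sketch under stated assumptions, not the logic of sft_data_generator.py: it assumes that unsolved samples are re-sent to the teacher each round, it leaves out the diversity score, and `collect_until_ratio`, `make_stub_teacher`, and the `max_rounds` guard are hypothetical.

```python
import random
from typing import Callable


def collect_until_ratio(
    samples: list[dict],
    generate: Callable[[dict], str],
    target_ratio: float = 0.9,
    max_rounds: int = 20,
) -> dict[int, str]:
    """Retry unsolved samples until the solved share reaches target_ratio.

    Returns a mapping from sample index to the accepted answer.
    """
    accepted: dict[int, str] = {}
    for _ in range(max_rounds):
        if len(accepted) >= target_ratio * len(samples):
            break
        for index, sample in enumerate(samples):
            if index in accepted:
                continue
            answer = generate(sample)
            if answer == sample["label"]:  # keep only correct answers
                accepted[index] = answer
    return accepted


def make_stub_teacher(seed: int = 0) -> Callable[[dict], str]:
    """Stand-in for the teacher model: right about half of the time."""
    rng = random.Random(seed)
    return lambda sample: sample["label"] if rng.random() < 0.5 else "wrong"
```

The `max_rounds` cap keeps the loop from running forever when some samples are never answered correctly.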

Note: We also provide pre-generated SFT data for convenience. Download and place it at outputs/synthetic_data/sft_data/graph_sft_data.json.

SFT training

We utilize LlamaFactory for the SFT process. We recommend using the official Docker image.

Configuration:

Update model_name_or_path and output_dir in ./supervised_finetuning/full_sft.yaml.
Update the node list, DATA_DIR, and YAML_FILE in ./supervised_finetuning/run_sft_shell.sh.

Execution: Run the following script to launch distributed SFT training across the specified nodes:

cd ./supervised_finetuning
bash run_sft_shell.sh

🧠 SSR-RL

Prepare RL data

Generate the RL dataset with ./data_generation/pipelines/rl_data_generator.py once the prerequisites below are in place.

Data prerequisites:

Download the graph datasets (arxiv, children, computer, history, photo, pubmed, reddit, sports, wn18rr) and place them under data/train_data/raw_data/{dataset}/. Each dataset directory should contain {dataset}_summary.jsonl and {dataset}_graph_bert_encoded.pt.
Generate subgraph data by running:

cd ./data_generation/pipelines
python preprocess.py

Download test_1050.parquet (GraphR1 validation data) and place it at data/train_data/rl_data/.

Model prerequisites: Deploy the SFT model via vLLM on your GPU nodes using ./scripts/deploy_api.sh. Update the node list, model path, and served model name in the script, then:

cd scripts
bash deploy_api.sh

Update choose_name and URL_LIST_RL in ./data_generation/pipelines/rl_data_generator.py for the SFT model.

rl_data_generator.py executes three steps automatically:

Construct RL training data: Sample subgraphs from the graph datasets, construct SSR prompts, and use the SFT model to assess question difficulty (easy/medium/hard). This step loops until the required number of samples per difficulty level is collected.
Convert training data: Convert the filtered JSONL data into Parquet format for RL training.
Construct validation data: Transform the GraphR1 validation data into the SSR prompt format and convert to Parquet.

cd ./data_generation/pipelines
python rl_data_generator.py
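The first step amounts to filling a quota for each difficulty level. The sketch below illustrates that idea only: how the SFT model's outputs are turned into a difficulty label is not described here, so the pass-rate thresholds are an assumption, and `difficulty` and `fill_quotas` are hypothetical helpers rather than functions from rl_data_generator.py.

```python
from typing import Callable, Iterable


def difficulty(pass_rate: float) -> str:
    """Map a pass rate to a label. The thresholds are illustrative only."""
    if pass_rate >= 0.75:
        return "easy"
    if pass_rate >= 0.25:
        return "medium"
    return "hard"


def fill_quotas(
    candidates: Iterable[dict],
    pass_rate_of: Callable[[dict], float],
    quotas: dict[str, int],
) -> dict[str, list[dict]]:
    """Consume candidates until every difficulty level meets its quota."""
    buckets: dict[str, list[dict]] = {level: [] for level in quotas}
    for candidate in candidates:
        # Stop as soon as all levels are full.
        if all(len(buckets[level]) >= quotas[level] for level in quotas):
            break
        level = difficulty(pass_rate_of(candidate))
        # Drop candidates whose level is already full.
        if level in buckets and len(buckets[level]) < quotas[level]:
            buckets[level].append(candidate)
    return buckets
```

Because `candidates` can be any iterable, a generator that keeps sampling new subgraphs fits naturally, which mirrors the "loop until collected" behavior.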

Note: We also provide pre-generated RL data for convenience. Download and place them at outputs/synthetic_data/rl_data/.

RL training

We use verl v0.6.x to carry out the RL process. To align with our adaptive subgraph denoising paradigm, we modified several core components:

Training process:
- ./reinforcement_learning/verl/trainer/config/ppo_trainer.yaml: Added parameters for starting the second RL stage.
- ./reinforcement_learning/verl/trainer/ppo/ray_trainer.py: Modified the fit and _validate functions to customize the training/validation loop.
Actor rollout:
- ./reinforcement_learning/verl/workers/rollout/vllm_rollout/vllm_rollout_spmd.py: Optimized the generate_sequences function for multi-path reward calculation.
Reward function:
- ./reinforcement_learning/verl/workers/reward_manager/naive.py: Customized the __call__ function for graph-specific reward logic.
- ./reinforcement_learning/verl/utils/reward_score/__init__.py: Imported our custom reward function.
- ./reinforcement_learning/verl/utils/reward_score/subgraph_size.py: Our core implementation of the reward function.

Execution: We recommend the verl Docker image. The RL training is split into two independent stages:

Stage 1: Authenticity-Reinforced RLVR (in_second_stage=false): Trains the model to strictly follow the Sample-Select-Reason pipeline. The reward function $R_1$ uses nested logic to enforce subgraph authenticity ($\text{Status}_{real}$), selection consistency ($\text{Status}_{consist}$), and answer correctness ($\text{Status}_{ans}$).
Stage 2: Denoising-Reinforced RLVR (in_second_stage=true): Built upon Stage 1, this stage adds a structural parsimony reward to encourage selecting purer (smaller) subgraphs. The reward function $R_2$ extends $R_1$ with a size-based bonus for correct answers.
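To make the relationship between the two rewards concrete, here is a toy sketch of a nested reward and its size-based extension. It is not the code in subgraph_size.py: the numeric reward values, the linear form of the size bonus, and the `selected_size` and `max_size` arguments are all assumptions chosen for illustration. Only the nesting order, the restriction of the bonus to correct answers, and the default weight of 0.1 come from the description above.

```python
def reward_stage1(real: bool, consistent: bool, correct: bool) -> float:
    """Nested checks: a later condition only counts if the earlier ones hold."""
    if not real:
        return -1.0  # subgraph is not authentic (illustrative value)
    if not consistent:
        return -0.5  # selection is inconsistent (illustrative value)
    return 1.0 if correct else 0.0


def reward_stage2(
    real: bool,
    consistent: bool,
    correct: bool,
    selected_size: int,
    max_size: int,
    lam: float = 0.1,
) -> float:
    """Stage 1 reward plus a bonus for small subgraphs, on correct answers only."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    base = reward_stage1(real, consistent, correct)
    if not (real and consistent and correct):
        return base
    # Assumed linear bonus: the smaller the selected subgraph, the larger it is.
    return base + lam * (1.0 - selected_size / max_size)
```

In this toy version, `lam` plays the role of trainer.second_stage_reward_lambda: raising it pushes the policy harder toward smaller subgraphs.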

Configuration: Update the model path, data path, and output path in ./reinforcement_learning/run_rl_stage1.sh and ./reinforcement_learning/run_rl_stage2.sh to match your local settings.

Step 1: Start the Ray cluster

Update the following variables in ./scripts/start_ray_cluster.sh:

NODES: List of node hostnames (the first node serves as the Ray head)
IMAGE_PATH: Path to the verl Docker image tar file

Then execute:

cd scripts
bash start_ray_cluster.sh

Step 2: Run Stage 1 training

Update the following parameters in ./reinforcement_learning/run_rl_stage1.sh:

data.train_files: Path to the training Parquet file (e.g., outputs/synthetic_data/rl_data/train_*_graphr2_subgraph.parquet)
data.val_files: Path to the validation Parquet file (e.g., outputs/synthetic_data/rl_data/test_*_graphr2_subgraph.parquet)
actor_rollout_ref.model.path: Path to the SFT model checkpoint
trainer.default_local_dir: Output directory for Stage 1 checkpoints
trainer.n_gpus_per_node: Number of GPUs per node
trainer.nnodes: Number of nodes (must match the Ray cluster size)

SSH into the head node and execute the training script inside the Docker container:

ssh <HEAD_NODE>
docker exec -it verl bash
cd reinforcement_learning
bash run_rl_stage1.sh

Step 3: Run Stage 2 training

After Stage 1 completes, update the following parameters in ./reinforcement_learning/run_rl_stage2.sh:

actor_rollout_ref.model.path: Path to a Stage 1 checkpoint (e.g., .../verl_grpo_graph_rl_stage1/global_step_*/actor/huggingface)
data.train_files and data.val_files: Paths to the training and validation Parquet files
trainer.default_local_dir: Output directory for Stage 2 checkpoints
trainer.n_gpus_per_node and trainer.nnodes: Cluster configuration
trainer.second_stage_reward_lambda: Controls the denoising intensity (default: 0.1)
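Any Stage 1 checkpoint can serve as the starting point. If you want the most recent one, the sketch below picks the highest-numbered global_step_* directory that contains an actor/huggingface export. `latest_checkpoint` is a hypothetical helper, and the directory shape is taken from the example path above.

```python
import re
from pathlib import Path
from typing import Optional

STEP_DIR = re.compile(r"global_step_(\d+)$")


def latest_checkpoint(run_dir: Path) -> Optional[Path]:
    """Return global_step_<N>/actor/huggingface for the largest N, if any."""
    best_step, best_path = -1, None
    for child in run_dir.glob("global_step_*"):
        match = STEP_DIR.search(child.name)
        model_dir = child / "actor" / "huggingface"
        # Compare step numbers as integers so that 100 sorts after 20.
        if match and model_dir.is_dir() and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), model_dir
    return best_path
```

The returned path can be used as the value of actor_rollout_ref.model.path in run_rl_stage2.sh.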

SSH into the head node and execute:

ssh <HEAD_NODE>
docker exec -it verl bash
cd reinforcement_learning
bash run_rl_stage2.sh

📦 Trained Model & Evaluation

Download Model

The final trained model is available on Hugging Face.

Evaluation

Model prerequisites: Deploy the trained model via vLLM on your GPU nodes using ./scripts/deploy_api.sh. Update the node list, model path, and served model name in the script, then:

cd scripts
bash deploy_api.sh

Update choose_name and URL_LIST_EVAL in ./evaluation/evaluate_trained_model.py for the evaluation servers.

Execution: Run the evaluation script:

cd evaluation
bash run_evaluate.sh

📝 Citation

If you find this work helpful, please consider citing:

@article{li2026beyond,
  title={Beyond One-Size-Fits-All: Adaptive Subgraph Denoising for Zero-Shot Graph Learning with Large Language Models},
  author={Li, Fengzhi and Zhang, Liang and Zuo, Yuan and Zhao, Ruiqing and Liu, YanSong and Ma, Yunfei and Meng, Fanyu and Feng, Junlan},
  journal={arXiv preprint arXiv:2603.02938},
  year={2026}
}
