KG-R1 is a reinforcement learning framework for knowledge-graph question answering with a single LLM agent and a lightweight KG retrieval server.
The framework is designed around two goals:
- strong accuracy-efficiency tradeoffs, matching or outperforming multi-module KG-RAG baselines while using a much smaller model
- transferability across knowledge graphs, where the trained policy can be reused without modifying the model and only the KG server adaptation changes
In the experiments accompanying this repository, KG-R1 uses Qwen2.5-3B while several prior works rely on substantially larger models such as Qwen2.5-72B or Qwen3-235B.
This repository focuses on the framework itself: data preparation, KG server setup, training, evaluation, and adapting the system to new knowledge graphs.
Built upon veRL, KG-R1 supports PPO and GRPO training, multiple LLM backbones, and a schema-agnostic server interface for Freebase-style, Wikidata-style, temporal, and custom knowledge graphs.
Answer the given question. You can interact with the knowledge graph through the following actions:
- get_tail_relations(entity): Get relations where entity is the subject
- get_head_relations(entity): Get relations where entity is the object
- get_tail_entities(entity, relation): Get objects for entity-relation pairs
- get_head_entities(entity, relation): Get subjects for relation-entity pairs
Use <search>action_name(arguments)</search> to query the KG. Results appear in <information></information>.
Reason with <think></think> tags. Provide final answer in <answer></answer> tags.
Question: {question}
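The tag protocol in the prompt above can be parsed with a few regular expressions. The following is a minimal sketch, not the repository's parser: the helper names `parse_search` and `parse_answer` are hypothetical, and the argument-splitting rule (split two-argument calls on the last comma, assuming relation names contain no commas) is an assumption.

```python
import re
from typing import List, Optional, Tuple

# Action names and arities are taken from the prompt; everything else is illustrative.
ACTION_ARITY = {
    "get_tail_relations": 1,
    "get_head_relations": 1,
    "get_tail_entities": 2,
    "get_head_entities": 2,
}

SEARCH_RE = re.compile(r"<search>\s*(\w+)\((.*?)\)\s*</search>", re.DOTALL)
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def parse_search(text: str) -> Optional[Tuple[str, List[str]]]:
    """Return (action, args) for the first well-formed <search> call, else None."""
    match = SEARCH_RE.search(text)
    if match is None:
        return None
    name, raw = match.group(1), match.group(2).strip()
    arity = ACTION_ARITY.get(name)
    if arity is None or not raw:
        return None  # unknown action or empty argument list
    if arity == 1:
        return name, [raw]
    # Assumption: the relation is the last argument and contains no comma,
    # so entity names such as "Obama, Barack" survive the split.
    entity, sep, relation = raw.rpartition(",")
    if not sep or not entity.strip() or not relation.strip():
        return None
    return name, [entity.strip(), relation.strip()]


def parse_answer(text: str) -> Optional[str]:
    """Return the stripped content of the first <answer> tag, else None."""
    match = ANSWER_RE.search(text)
    return match.group(1).strip() if match else None
```

Rejecting unknown actions at parse time keeps malformed model output from reaching the KG server.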
Base URL: http://127.0.0.1:8001/retrieve
Core Operations:
- get_tail_relations(entity): Find all relations where entity is the head/subject
- get_head_relations(entity): Find all relations where entity is the tail/object
- get_tail_entities(entity, relation): Get tail entities for head-relation pairs
- get_head_entities(entity, relation): Get head entities for relation-tail pairs
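To make the semantics of the four operations concrete, here is a small in-memory sketch over a list of (head, relation, tail) triples. The class name `TripleIndex` and the toy triples are hypothetical, and this is not the code in the repository's server, only an illustration of what each operation returns.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


class TripleIndex:
    """Illustrative index answering the four KG operations from memory."""

    def __init__(self, triples: Iterable[Triple]) -> None:
        self._tail_rel: Dict[str, Set[str]] = defaultdict(set)  # head -> relations
        self._head_rel: Dict[str, Set[str]] = defaultdict(set)  # tail -> relations
        self._tails: Dict[Tuple[str, str], Set[str]] = defaultdict(set)  # (head, rel) -> tails
        self._heads: Dict[Tuple[str, str], Set[str]] = defaultdict(set)  # (tail, rel) -> heads
        for head, rel, tail in triples:
            self._tail_rel[head].add(rel)
            self._head_rel[tail].add(rel)
            self._tails[(head, rel)].add(tail)
            self._heads[(tail, rel)].add(head)

    def get_tail_relations(self, entity: str) -> List[str]:
        """Relations where `entity` is the head/subject."""
        return sorted(self._tail_rel.get(entity, ()))

    def get_head_relations(self, entity: str) -> List[str]:
        """Relations where `entity` is the tail/object."""
        return sorted(self._head_rel.get(entity, ()))

    def get_tail_entities(self, entity: str, relation: str) -> List[str]:
        """Tail entities for the (head=entity, relation) pair."""
        return sorted(self._tails.get((entity, relation), ()))

    def get_head_entities(self, entity: str, relation: str) -> List[str]:
        """Head entities for the (relation, tail=entity) pair."""
        return sorted(self._heads.get((entity, relation), ()))


# Toy data, for illustration only.
kg = TripleIndex([
    ("Barack Obama", "born_in", "Honolulu"),
    ("Honolulu", "located_in", "Hawaii"),
    ("Michelle Obama", "spouse", "Barack Obama"),
])
```

Unknown entities return an empty list rather than raising, which lets an agent recover from a wrong guess in the next turn.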
KG-R1 enables iterative exploration:
- Initial Question Analysis → Identify key entities
- KG Exploration → Multi-turn relation and entity discovery (up to 5 turns)
- Answer Synthesis → Combine retrieved knowledge for final answer
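The three stages above can be sketched as a bounded loop. This is an assumption-laden illustration rather than the training rollout code: `run_episode`, `generate`, and `query_kg` are hypothetical names, the model is replaced by a scripted stand-in, and the KG is a plain dictionary.

```python
import re
from typing import Callable, Optional, Tuple

SEARCH_RE = re.compile(r"<search>(.*?)</search>", re.DOTALL)
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def run_episode(
    question: str,
    generate: Callable[[str], str],   # stand-in for the LLM: transcript -> next step
    query_kg: Callable[[str], str],   # stand-in for the KG server: query -> result text
    max_turns: int = 5,               # turn budget mentioned above
) -> Tuple[Optional[str], str]:
    """Alternate model steps and KG lookups until an answer or the turn limit."""
    transcript = f"Question: {question}\n"
    for _ in range(max_turns):
        step = generate(transcript)
        transcript += step
        answer = ANSWER_RE.search(step)
        if answer:
            return answer.group(1).strip(), transcript
        search = SEARCH_RE.search(step)
        if search:
            result = query_kg(search.group(1).strip())
            transcript += f"\n<information>{result}</information>\n"
    return None, transcript  # budget exhausted without a final answer


def demo() -> Tuple[Optional[str], str]:
    """Run one episode with a scripted 'model' and a dictionary 'KG'."""
    scripted = iter([
        "<think>Find the birthplace.</think>"
        "<search>get_tail_entities(Barack Obama, born_in)</search>",
        "<think>The KG says Honolulu.</think><answer>Honolulu</answer>",
    ])
    fake_kg = {"get_tail_entities(Barack Obama, born_in)": "Honolulu"}
    return run_episode(
        "Where was Barack Obama born?",
        lambda _transcript: next(scripted),
        lambda query: fake_kg.get(query, "no results"),
    )
```

Passing the model and the KG as callables keeps the loop independent of any particular backbone or graph.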
Evaluation includes a semantic factuality check: a GPT-based judge assesses answers beyond exact string matching.
- Installation
- Quick start
- Inference
- Use your own dataset
- Use your own knowledge graph
- Features
- Acknowledgments
- Citations
conda create -n kgr1 python=3.10
conda activate kgr1
# install torch [or you can skip this step and let vllm install the correct version for you]
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu121
# install vllm
pip3 install vllm==0.6.3 # or you can install 0.5.4, 0.4.2 and 0.3.1
# verl
pip install -e .
# flash attention 2
conda install -c nvidia cuda-toolkit=12.1
pip3 install flash-attn --no-build-isolation
pip install wandb
# Additional dependencies for KG processing
pip install fastapi uvicorn requests aiohttp
pip install networkx # for knowledge graph operations

The KG-R1 system requires a knowledge graph server for retrieval operations.
conda create -n kg_server python=3.10
conda activate kg_server
# Core dependencies for KG server
pip install fastapi uvicorn pydantic requests
pip install transformers datasets huggingface_hub
pip install networkx pandas pyarrow
# For efficient KG processing
pip install numpy scipy

This section describes the end-to-end setup used to reproduce the CWQ and WebQSP experiments.
1. Download the datasets
bash scripts/setup_data_kg.sh
Choose the option that downloads all data in the interactive menu.
2. Download the Freebase RDF dump
wget -O freebase-rdf-latest.gz \
http://commondatastorage.googleapis.com/freebase-public/rdf/freebase-rdf-latest.gz
3. Process the Freebase entities
bash scripts/process_entitites_freebase.sh
4. Convert entity names and ids
python scripts/convert_entities.py
5. Build the search-augmented CWQ and WebQSP files
python scripts/data_process_kg/cwq_search_augmented_initial_entities.py
python scripts/data_process_kg/webqsp_search_augmented_initial_entities.py

After preprocessing, the main evaluation files should be available at:
data_kg/cwq_search_augmented_initial_entities/{train,test}.parquet
data_kg/webqsp_search_augmented_initial_entities/{train,test}.parquet
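Before starting the server or a training run, it can help to confirm that preprocessing produced all four files. The following is a small sketch using only the paths listed above; the function names are hypothetical and the default root `data_kg` assumes the script is run from the repository root.

```python
from pathlib import Path
from typing import List

DATASETS = ("cwq", "webqsp")
SPLITS = ("train", "test")


def expected_files(root: str = "data_kg") -> List[Path]:
    """Paths of the preprocessed parquet files listed above."""
    return [
        Path(root) / f"{name}_search_augmented_initial_entities" / f"{split}.parquet"
        for name in DATASETS
        for split in SPLITS
    ]


def missing_files(root: str = "data_kg") -> List[Path]:
    """Subset of the expected files that does not exist on disk."""
    return [path for path in expected_files(root) if not path.is_file()]


if __name__ == "__main__":
    for path in missing_files():
        print(f"missing: {path}")
```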
Start the KG server before training or evaluation:
./kg_retrieval_launch_cwq.sh
This serves the retrieval API used by KG-R1 on http://127.0.0.1:8001/retrieve.
Train KG-R1 on CWQ with GRPO:
bash train_grpo_kg_qwen_3b_cwq_f1_turn5.sh
The repository also includes related training variants for different turn budgets and datasets.
Once the KG server is running, you can evaluate either a HuggingFace model or a local checkpoint.
CWQ example:
CUDA_VISIBLE_DEVICES=0 bash eval_scripts/kg_r1_eval_main/eval_qwen_3b_turn5_hf.sh \
your-org/KG-R1-model \
cwq
WebQSP example:
CUDA_VISIBLE_DEVICES=0 bash eval_scripts/kg_r1_eval_main/eval_qwen_3b_turn5_hf.sh \
your-org/KG-R1-model \
webqsp \
--experiment_postfix=webqsp-main
CWQ example with a local checkpoint:
bash eval_scripts/kg_r1_eval_main/eval_qwen_3b_turn5_local.sh \
/path/to/checkpoint_root \
cwq
Each evaluation run writes:
- Pass@K summary JSON
- Per-example detailed JSONL predictions
- Evaluation logs
- Exact match / F1 metrics
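For readers unfamiliar with these metrics, the sketch below shows one common way to compute exact match, set-level F1, and a Pass@K flag for answer sets. The normalization rule and the exact definitions are assumptions for illustration; the evaluation scripts in this repository may define them differently.

```python
from typing import Iterable, List, Set


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return " ".join(text.lower().split())


def as_set(answers: Iterable[str]) -> Set[str]:
    return {normalize(a) for a in answers}


def exact_match(pred: Iterable[str], gold: Iterable[str]) -> bool:
    """True when the normalized predicted and gold answer sets are identical."""
    return as_set(pred) == as_set(gold)


def f1_score(pred: Iterable[str], gold: Iterable[str]) -> float:
    """Harmonic mean of precision and recall over normalized answer sets."""
    pred_set, gold_set = as_set(pred), as_set(gold)
    true_positives = len(pred_set & gold_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


def pass_at_k(samples: List[List[str]], gold: Iterable[str]) -> bool:
    """True when at least one of the K sampled predictions is an exact match."""
    gold_list = list(gold)
    return any(exact_match(sample, gold_list) for sample in samples)
```

Set-level F1 gives partial credit when a question has several gold answers and the model finds only some of them.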
Launch the KG retrieval server:
./kg_retrieval_launch_cwq.sh
Then run inference:
conda activate kgr1
python infer_kg_r1.py --checkpoint verl_checkpoints/your_trained_model
You can modify the question parameter to test different knowledge graph questions. The model will interactively explore the KG using the 4 basic operations and provide reasoning traces.
Each knowledge graph question-answer sample should be a dictionary containing:
data = {
    "data_source": "your_kg_dataset",
    "original_query": question,
    "target_text": answer,
    "query_entities": ["entity1", "entity2"],  # Initial entities
    "query_id": unique_id,
    "split": "train/test/dev"
}
Your knowledge graph should provide the following structure:
# Entity-relation-entity triples
kg_data = {
    "entities": {"entity_id": "human_readable_name"},
    "relations": {"relation_id": "human_readable_name"},
    "triples": [
        ["head_entity_id", "relation_id", "tail_entity_id"],
        # ... more triples
    ]
}
You can refer to scripts/data_kg/process_datasets.py for concrete data processing examples for the CWQ and WebQSP datasets.
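A quick consistency check on custom data can catch problems before the server is started. The sketch below validates the two structures described above; the function names are hypothetical, and the checks cover only what the formats above state (required sample keys, three-item triples, and IDs that resolve to the entity and relation tables).

```python
from typing import Any, Dict, List

# Keys taken from the sample format described above.
REQUIRED_SAMPLE_KEYS = (
    "data_source", "original_query", "target_text",
    "query_entities", "query_id", "split",
)


def validate_sample(sample: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one question-answer sample."""
    problems = [f"missing key: {key}" for key in REQUIRED_SAMPLE_KEYS if key not in sample]
    if not isinstance(sample.get("query_entities", []), list):
        problems.append("query_entities must be a list")
    return problems


def validate_kg(kg_data: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in the triples of a KG dictionary."""
    problems: List[str] = []
    entities = kg_data.get("entities", {})
    relations = kg_data.get("relations", {})
    for i, triple in enumerate(kg_data.get("triples", [])):
        if len(triple) != 3:
            problems.append(f"triple {i}: expected 3 items, got {len(triple)}")
            continue
        head, relation, tail = triple
        if head not in entities:
            problems.append(f"triple {i}: unknown head {head!r}")
        if relation not in relations:
            problems.append(f"triple {i}: unknown relation {relation!r}")
        if tail not in entities:
            problems.append(f"triple {i}: unknown tail {tail!r}")
    return problems
```

Both functions return an empty list when nothing is wrong, so they can be used directly in an `assert not validate_kg(kg_data)` style check.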
To use your own knowledge graph, you need to set up the KG server with your data:
- Prepare your KG data in the required format (see above)
- Start the KG server with your data directory:
# Your KG data should be organized as:
# your_kg_data/
# ├── entities.json
# ├── relations.json
# ├── train_simple.json
# └── test_simple.json
python kg_r1/search/server.py --port 8001 --data_dir your_kg_data
- Configure your training script to point to your KG server:
# In your training script, update:
actor_rollout_ref.rollout.search.search_url="http://127.0.0.1:8001/retrieve"
The KG server supports the 4 basic operations:
- get_tail_relations(entity): Find relations where entity is the subject
- get_head_relations(entity): Find relations where entity is the object
- get_tail_entities(entity, relation): Get tail entities for head-relation pairs
- get_head_entities(entity, relation): Get head entities for relation-tail pairs
KG-R1 supports different types of knowledge graphs with a schema-agnostic design. The system works with:
- Freebase-style KGs: Entity-centric with rich relations
- Wikidata KGs: Property-based knowledge representation
- Temporal KGs: Time-aware knowledge graphs
- Domain-specific KGs: Custom knowledge graphs for specific domains
The main design principle is to run the KG server separately from the RL training pipeline and expose it through a clean API.
The LLM agent calls the KG server through the search API at http://127.0.0.1:8001/retrieve.
You can refer to kg_r1/search/server.py for the complete KG server implementation, which includes:
- FastAPI server: RESTful API for KG operations
- Concurrent processing: ThreadPoolExecutor for handling multiple requests
- Action routing: Dispatches requests to appropriate KG operations
- Error handling: Robust error handling for malformed queries
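The routing, concurrency, and error-handling pieces listed above can be illustrated without the web layer. The sketch below is not the code in kg_r1/search/server.py: the request shape (`{"action": ..., "args": [...]}`), the response shape, and the names `make_router` and `route_batch` are all assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

Handler = Callable[..., List[str]]
Router = Callable[[Any], Dict[str, Any]]


def make_router(handlers: Dict[str, Handler]) -> Router:
    """Build a function that dispatches one request to the matching handler."""

    def route(request: Any) -> Dict[str, Any]:
        if not isinstance(request, dict):
            return {"ok": False, "error": "request must be a JSON object"}
        action = request.get("action")
        handler = handlers.get(action)
        if handler is None:
            return {"ok": False, "error": f"unknown action: {action!r}"}
        try:
            return {"ok": True, "results": handler(*request.get("args", []))}
        except TypeError as exc:  # typically a wrong number of arguments
            return {"ok": False, "error": f"bad arguments for {action}: {exc}"}

    return route


def route_batch(route: Router, requests: List[Any], max_workers: int = 4) -> List[Dict[str, Any]]:
    """Handle several requests concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(route, requests))
```

Returning an error object instead of raising means one malformed query in a batch does not fail the other requests.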
KG-R1's key advantage is cross-KG transferability. Models trained on one KG can transfer to different KG schemas without retraining, enabling plug-and-play usage.
If you use KG-R1 in your research, citation information will be provided upon publication.

