LGM: Enhancing Large Language Models with Conceptual Meta-Relations and Iterative Retrieval

This repository is the official implementation of LGM: Enhancing Large Language Models with Conceptual Meta-Relations and Iterative Retrieval.

[Figure: LGM]

Large language models (LLMs) exhibit strong semantic understanding, yet struggle when user instructions involve ambiguous or conceptually misaligned terms. We propose the Language Graph Model (LGM) to enhance conceptual clarity by extracting meta-relations—inheritance, alias, and composition—from natural language. The model further employs a reflection mechanism to validate these meta-relations. Leveraging a Concept Iterative Retrieval Algorithm, these relations and related descriptions are dynamically supplied to the LLM, improving its ability to interpret concepts and generate accurate responses. Unlike conventional Retrieval-Augmented Generation (RAG) approaches that rely on extended context windows, our method enables large language models to process texts of any length without the need for truncation. Experiments on standard benchmarks demonstrate that the LGM consistently outperforms existing RAG baselines.
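For intuition, the three meta-relation types can be pictured as labelled edges in a concept graph. The sketch below is purely illustrative and does not reflect the repository's actual schema; the concept names and relation labels are invented for the example.

# Illustrative only: the three conceptual meta-relations as (head, relation, tail)
# triples. LGM's actual graph schema may differ.
meta_relations = [
    ("sedan", "INHERITANCE", "car"),    # a sedan is a kind of car
    ("automobile", "ALIAS", "car"),     # two names for the same concept
    ("car", "COMPOSITION", "engine"),   # a car has an engine as a part
]

for head, relation, tail in meta_relations:
    print(f"({head}) -[{relation}]-> ({tail})")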

Requirements

Neo4j

  1. We use neo4j-community-3.5.13 as the graph database. Download the Windows version or the macOS/Linux version and follow the official manual for installation.
  2. Then configure the Neo4j URI, username, and password in sources\config.ini (a short connection sketch follows the index statement below).
  3. To use the Neo4j database more efficiently, you can create an index. The corresponding statement is as follows:
CALL db.index.fulltext.createNodeIndex("root_sentence_lemma_index", ["_ROOT_"], ["sentenceLemma"]);
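
The following is a minimal sketch (not part of the repository) that verifies the connection and creates the index from Python with the official neo4j driver. The config section and key names (neo4j, uri, user, password) are assumptions; match them to the keys actually used in sources\config.ini.

import configparser
from neo4j import GraphDatabase

config = configparser.ConfigParser()
config.read("sources/config.ini")

# Assumed section/key names -- adjust to your actual config.ini layout.
uri = config["neo4j"]["uri"]            # e.g. bolt://localhost:7687
user = config["neo4j"]["user"]
password = config["neo4j"]["password"]

driver = GraphDatabase.driver(uri, auth=(user, password))
with driver.session() as session:
    # Create the full-text index recommended above (Neo4j 3.5 syntax).
    session.run('CALL db.index.fulltext.createNodeIndex('
                '"root_sentence_lemma_index", ["_ROOT_"], ["sentenceLemma"])')
driver.close()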

Python

The Python version is 3.10.x. To install the requirements:

pip install -r requirements.txt

Datasets

  1. The HotpotQA dataset can be downloaded from http://curtis.ml.cmu.edu/datasets/hotpot/hotpot_dev_distractor_v1.json
  2. The Musique dataset can be downloaded from https://huggingface.co/datasets/bdsaglam/musique/blob/main/musique_ans_v1.0_dev.jsonl (a loading sketch follows below)
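
As a quick sanity check, the sketch below (not part of the repository) loads one example from each file. The field names follow the published dataset formats; replace path/to/ with the actual download locations.

import json

# HotpotQA: a single JSON array of examples.
with open("path/to/hotpot_dev_distractor_v1.json", encoding="utf-8") as f:
    hotpot = json.load(f)
print(hotpot[0]["question"], "->", hotpot[0]["answer"])

# Musique: JSON Lines, one example per line.
with open("path/to/musique_ans_v1.0_dev.jsonl", encoding="utf-8") as f:
    first = json.loads(f.readline())
print(first["question"], "->", first["answer"])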

Models

If you need to test an online model, prepare the corresponding API_KEY and URL, enter them in the sources\config.ini file, and configure the relevant parameters. For local models, only the corresponding URL is required. The Deepseek model must be configured as the answer-matching model. If you want to test the Llama 3 model, it must be configured as well.
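
As an illustration only (not the repository's code path), the sketch below reads the online-model credentials from sources\config.ini and sends a test request. The section and key names (deepseek, api_key, url, model) are assumptions; the openai client is used here only because the DeepSeek API is OpenAI-compatible.

import configparser
from openai import OpenAI

config = configparser.ConfigParser()
config.read("sources/config.ini")

# Assumed section/key names -- adjust to the actual config.ini layout.
client = OpenAI(
    api_key=config["deepseek"]["api_key"],
    base_url=config["deepseek"]["url"],
)
reply = client.chat.completions.create(
    model=config["deepseek"]["model"],
    messages=[{"role": "user", "content": "ping"}],
)
print(reply.choices[0].message.content)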

Evaluation

Please run the learning stage first, then the answering stage.

To evaluate HotpotQA, run:

python evaluate/eval.py --dataset hotpot --model deepseek --stage learn --path path/to/hotpot_dev_distractor_v1.json
python evaluate/eval.py --dataset hotpot --model deepseek --stage answer --path path/to/hotpot_dev_distractor_v1.json

python evaluate/eval.py --dataset hotpot --model llama --stage learn --path path/to/hotpot_dev_distractor_v1.json
python evaluate/eval.py --dataset hotpot --model llama --stage answer --path path/to/hotpot_dev_distractor_v1.json

To evaluate Musique, run:

python evaluate/eval.py --dataset musique --model deepseek --stage learn --path path/to/musique_data_v1.0/musique_ans_v1.0_dev.jsonl
python evaluate/eval.py --dataset musique --model deepseek --stage answer --path path/to/musique_data_v1.0/musique_ans_v1.0_dev.jsonl

python evaluate/eval.py --dataset musique --model llama --stage learn --path path/to/musique_data_v1.0/musique_ans_v1.0_dev.jsonl
python evaluate/eval.py --dataset musique --model llama --stage answer --path path/to/musique_data_v1.0/musique_ans_v1.0_dev.jsonl

To evaluate Reflection, run:

python tests/test_reflection.py

Results

Our model achieves the following performance on HotpotQA and Musique:

| Model | HotpotQA: Deepseek v3-0324 | HotpotQA: Llama-3.3-70B-Instruct-AWQ | HotpotQA: AVG | Musique: Deepseek v3-0324 | Musique: Llama-3.3-70B-Instruct-AWQ | Musique: AVG |
|---|---|---|---|---|---|---|
| Language Graph Model | 89.46% | 87.06% | 88.26% | 68.13% | 63.07% | 65.60% |
| GraphRAG 1 | 88.55% | 82.59% | 85.57% | 64.98% | 63.16% | 64.07% |
| GraphRAG 2 | 86.90% | 69.21% | 78.06% | 48.98% | 48.61% | 48.79% |
| LightRAG 2 | 87.94% | 76.34% | 82.14% | 65.36% | 50.33% | 57.84% |
| FastRAG 3 | 72.66% | 72.26% | 72.46% | 39.91% | 36.51% | 38.21% |
| Dify | 68.53% | 43.64% | 56.09% | 52.32% | 18.27% | 35.29% |

We analyze the contribution of each component via ablation on HotpotQA (DeepSeek v3-0324). For the context-budget ablation, the maximum input size was varied from 120,000 down to 30,000 characters. The figure below shows that F1 varies only mildly (std 0.009) and Recall remains stable (std 0.0038). The best F1 (89.46%) occurs at 60,000 characters, with a Recall of 99.09%, indicating robustness to the context budget.

[Figure: TEXT_SIZE_F1 (F1 and Recall vs. maximum input size)]

Contributing

The Language Graph Model source code is released under the MIT License.
