HGR-MLF

Heterogeneous Graph Reasoning with Multi-Level Filtering for Document-Level Relation Extraction

The core challenge of Document-level Relation Extraction (DocRE) lies in handling long-range dependencies across sentences. Although heterogeneous graph reasoning has shown great potential in modeling complex entity interactions, this complexity is a double-edged sword. In long documents, heterogeneous graphs introduce a large number of background tokens that are irrelevant to relation identification, resulting in a low signal-to-noise ratio (SNR) in the graph. Specifically, traditional graph construction methods often generate excessive false edges, which cause "over-smoothing" or "error accumulation" of semantic information during multi-layer graph convolution reasoning. Ultimately, this makes it difficult for the model to distinguish genuine logical relations from incidental co-occurrence noise.

To address this challenge, we propose Heterogeneous Graph Reasoning with Multi-Level Filtering for Document-level Relation Extraction (HGR-MLF-DocRE), whose core idea is to block noise over the "full lifecycle" of the model through a multi-level filtering mechanism. This mechanism is not a simple post-processing step; it is deeply integrated with heterogeneous graph reasoning:

Input stage: Attention mechanisms are used to remove text spans that contribute nothing to relation judgment, ensuring the "purity" of initial node representations;

Reasoning stage: Redundant connections in heterogeneous graph reasoning are dynamically pruned through topological structure optimization (e.g., meta-path pruning), preventing inference paths from deviating from targets amid complex meta-path interactions;

Output stage: Consistency correction is applied using classification probability distributions to resolve common one-to-many or many-to-many relation conflicts in DocRE, ensuring the logical coherence of extracted relations.
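
The three stages above can be illustrated with a minimal sketch. It is not the code in models/: the function names, the top-k rule, the weight threshold, and the "mutually exclusive relation pair" rule are all assumptions chosen only to make the idea concrete, and the real model works on learned tensors rather than plain Python values.

```python
from typing import Dict, List, Sequence, Tuple


def filter_spans(spans: Sequence[str], scores: Sequence[float], keep: int) -> List[str]:
    """Input stage (assumed rule): keep the `keep` highest-scoring spans,
    returned in their original document order."""
    top = sorted(range(len(spans)), key=lambda i: scores[i], reverse=True)[:keep]
    return [spans[i] for i in sorted(top)]


def prune_edges(edges: Dict[Tuple[str, str], float], threshold: float) -> Dict[Tuple[str, str], float]:
    """Reasoning stage (assumed rule): drop graph edges whose weight is
    below `threshold`, so weak connections do not propagate noise."""
    return {edge: w for edge, w in edges.items() if w >= threshold}


def resolve_conflicts(probs: Dict[str, float],
                      exclusive: Sequence[Tuple[str, str]],
                      threshold: float = 0.5) -> List[str]:
    """Output stage (assumed rule): keep relations whose probability reaches
    `threshold`; when two kept relations are declared mutually exclusive,
    keep only the more probable one."""
    kept = {rel for rel, p in probs.items() if p >= threshold}
    for a, b in exclusive:
        if a in kept and b in kept:
            kept.discard(a if probs[a] < probs[b] else b)
    return sorted(kept)


# Hypothetical toy inputs, not taken from DocRED.
spans = ["Paris is the capital of France.", "The weather was mild.", "France borders Spain."]
kept_spans = filter_spans(spans, scores=[0.9, 0.1, 0.6], keep=2)
kept_edges = prune_edges({("e1", "s1"): 0.8, ("e1", "s2"): 0.1}, threshold=0.3)
relations = resolve_conflicts({"father": 0.8, "child": 0.6, "spouse": 0.1},
                              exclusive=[("father", "child")])
```

Each function stands for one filtering level; in the described architecture the filtering is integrated with graph reasoning rather than run as three independent passes.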

Experimental results show that this "filtering-guided reasoning" architecture improves the accuracy of information representation, which supports the effectiveness of combining heterogeneous graph reasoning (for enriching information representation) with multi-level filtering mechanisms (for information denoising) in the DocRE task.

Environments

Ubuntu 18.10.1 (4.18.0-25-generic)
Python (3.6.8)
CUDA (10.1.243)

Dependencies

matplotlib (3.3.2)
networkx (2.4)
nltk (3.4.5)
numpy (1.19.2)
torch (1.3.0)
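
To confirm that an environment matches the versions listed above, a small check like the following may help. It is a sketch, not part of the repository: the helper names installed_versions and find_mismatches are hypothetical, and it assumes each package exposes a __version__ attribute.

```python
import importlib

# Versions as listed in this README.
REQUIRED = {
    "matplotlib": "3.3.2",
    "networkx": "2.4",
    "nltk": "3.4.5",
    "numpy": "1.19.2",
    "torch": "1.3.0",
}


def installed_versions(names):
    """Return {name: version or None}; None means the import failed
    or the module has no __version__ attribute."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
            continue
        version = getattr(module, "__version__", None)
        # Strip local build tags such as "+cu101" before comparing.
        found[name] = str(version).split("+")[0] if version is not None else None
    return found


def find_mismatches(required, installed):
    """List (name, wanted, found) for every package whose version differs."""
    return [(name, wanted, installed.get(name))
            for name, wanted in required.items()
            if installed.get(name) != wanted]
```

Calling find_mismatches(REQUIRED, installed_versions(REQUIRED)) returns an empty list when every package matches, and otherwise one tuple per package that is missing or at a different version.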

Data

First, download the pretrained bert-base-uncased model from Hugging Face and put it into ./bert/bert-base-uncased/.
Before running our code, you need to obtain the DocRED dataset from the authors of the dataset.
After downloading DocRED, use gen_data_extend_graph.py to preprocess the data for Glove-HDR-DREM and gen_bert_data_extend_graph.py to preprocess the data for BERT-HDR-DREM. The processed data will be saved into ./prepro_data and ./prepro_data_bert respectively.
The CDR dataset can be obtained from https://biocreative.bioinformatics.udel.edu/tasks/biocreative-v/track-3-cdr/. The GDA dataset can be obtained from https://bitbucket.org/alexwuhkucs/gda-extraction/src/master/.
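
Before preprocessing, it can be useful to inspect the raw DocRED files. The sketch below assumes the publicly documented DocRED JSON layout (each document has "vertexSet", a list of entities that are lists of mentions carrying a "sent_id", and "labels" with head index "h", tail index "t", and relation "r"). The function names are hypothetical and the file path is left to the caller, since this README does not state where the raw files should be placed.

```python
import json


def load_docred(path):
    """Load one DocRED split (a JSON list of documents)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def count_cross_sentence(docs):
    """Return (intra, cross): labelled relations whose head and tail entities
    share at least one sentence, versus those that never co-occur."""
    intra = cross = 0
    for doc in docs:
        # For each entity, the set of sentence ids in which it is mentioned.
        sent_ids = [{mention["sent_id"] for mention in entity}
                    for entity in doc["vertexSet"]]
        # The unlabelled test split has no "labels" key.
        for label in doc.get("labels", []):
            if sent_ids[label["h"]] & sent_ids[label["t"]]:
                intra += 1
            else:
                cross += 1
    return intra, cross
```

The cross count gives a rough sense of how many relations in a split require reasoning over more than one sentence.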

Run code

train.py starts training.
test.py evaluates the model's performance on the dev or test set.
Config.py configures training of the GloVe-based model, and Config_bert.py configures training of the BERT-based model.

Evaluation

For the dev set, you can use test.py to evaluate your trained model. For the test set, first use test.py to produce the test results, which are saved in ./result, and then submit them to the CodaLab competition.
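
For a quick sanity check on the dev set, micro-averaged F1 can be computed as sketched below. This is not a replacement for evaluation.py or the official scorer: it assumes predictions follow the DocRED submission format (dicts with "title", "h_idx", "t_idx", "r"), the function names are hypothetical, and it does not compute Ign F1.

```python
def gold_facts(docs):
    """Collect gold (title, head index, tail index, relation) tuples."""
    return {(doc["title"], label["h"], label["t"], label["r"])
            for doc in docs
            for label in doc.get("labels", [])}


def micro_f1(predictions, docs):
    """Return (precision, recall, f1) of predicted facts against gold facts."""
    gold = gold_facts(docs)
    pred = {(p["title"], p["h_idx"], p["t_idx"], p["r"]) for p in predictions}
    correct = len(pred & gold)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Duplicate predictions are collapsed by the set construction, so each fact is counted once.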

