Faizan-E-Mustafa/GNN_EACL_Workshop

GNN Link Prediction

This repository contains the code for the paper "Annotating PubMed Abstracts with MeSH Headings using Graph Neural Network".

Abstract:

The number of scientific publications in the biomedical domain is continuously increasing with time. An efficient system for indexing these publications is required to make the information accessible according to the user’s information needs. Task 10a of the BioASQ challenge aims to classify PubMed articles according to the MeSH ontology so that new publications can be grouped with similar preexisting publications in the field without the assistance of time-consuming and costly annotations by human annotators. In this work, we use Graph Neural Network (GNN) in the link prediction setting to exploit potential graph-structured information present in the dataset which could otherwise be neglected by transformer-based models. Additionally, we provide error analysis and a plausible reason for the substandard performance achieved by GNN.

Setup

Create a new virtual environment if necessary:

python -m venv .venv

Python version: 3.10

Once the environment is activated, use the following commands to install the required packages:

pip install -e .
pip install -r requirements.txt

The data folder can be downloaded from here.

It should contain the following files:

[Figure: folder structure]

Preprocessing

The following command prepares the required embeddings:

python src/preprocessing.py

The following command prepares the datasets and creates the negative edges before training, so that the same edges can be reused across different runs:

python src/graph_preparation.py
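
Fixing the negative edges ahead of training keeps runs comparable. The sketch below illustrates the general idea only; it is not the code in src/graph_preparation.py. The function name sample_negative_edges, its parameters, and the toy article/label indices are all hypothetical, and uniform sampling with a fixed seed is an assumption about how such edges could be drawn.

```python
import random


def sample_negative_edges(num_articles, num_labels, positive_edges, num_negatives, seed=0):
    """Sample (article, label) pairs that do not appear in positive_edges.

    Hypothetical helper: the sampling scheme (uniform, fixed seed) is an assumption.
    """
    positives = set(positive_edges)
    max_negatives = num_articles * num_labels - len(positives)
    if num_negatives > max_negatives:
        raise ValueError("not enough non-edges to sample from")

    rng = random.Random(seed)  # local generator: the result depends only on the seed
    negatives = set()
    while len(negatives) < num_negatives:
        pair = (rng.randrange(num_articles), rng.randrange(num_labels))
        if pair not in positives:
            negatives.add(pair)
    return sorted(negatives)  # sorted so the stored order is stable as well


# Toy graph: 3 articles, 4 labels, 3 known article-label links.
positive_edges = [(0, 1), (1, 2), (2, 0)]
run_a = sample_negative_edges(3, 4, positive_edges, num_negatives=3, seed=42)
run_b = sample_negative_edges(3, 4, positive_edges, num_negatives=3, seed=42)
```

With the same seed, both calls return the same list, which is what lets the edges be written to disk once and shared by every training run.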

Training

Train the GNN model with BCE or focal loss:

python gnn.py
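
For reference, the two loss options differ as follows for a single predicted link probability. This is a minimal sketch in plain Python, not the training code in gnn.py. The function names are hypothetical, and the defaults gamma=2.0 and alpha=0.25 are the commonly cited values from the focal loss literature, assumed here rather than taken from this repository.

```python
import math


def bce_loss(p, y, eps=1e-7):
    """Binary cross-entropy for one predicted probability p and a label y in {0, 1}."""
    p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss: cross-entropy scaled by alpha_t * (1 - p_t) ** gamma."""
    p = min(max(p, eps), 1 - eps)
    p_t = p if y == 1 else 1 - p              # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1 - alpha  # class-balancing weight
    return -alpha_t * (1 - p_t) ** gamma * math.log(p_t)


# A confidently correct link (p=0.9) versus a badly missed one (p=0.1), both with y=1.
easy_ratio = focal_loss(0.9, 1) / bce_loss(0.9, 1)
hard_ratio = focal_loss(0.1, 1) / bce_loss(0.1, 1)
```

The focal term shrinks the contribution of well-classified examples much more than that of misclassified ones, which is why it is often tried when negative edges heavily outnumber positive ones.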

Train the GNN with dynamic random sampling:

python gnn_drs.py

Train the GNN with mixup:

python gnn_mixup.py
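
As background, mixup trains on convex combinations of pairs of examples and of their labels. The sketch below shows that generic operation on plain Python lists. It does not reproduce gnn_mixup.py: which representations are mixed there (node features, edge embeddings, or something else) is not stated in this README, and the function name mixup and the default alpha=0.2 are assumptions.

```python
import random


def mixup(x_i, y_i, x_j, y_j, alpha=0.2, rng=None):
    """Blend two feature vectors and their labels with a Beta(alpha, alpha) weight."""
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # mixing coefficient in [0, 1]
    x = [lam * a + (1 - lam) * b for a, b in zip(x_i, x_j)]
    y = lam * y_i + (1 - lam) * y_j
    return x, y, lam


# Mix a positive example (label 1.0) with a negative one (label 0.0).
rng = random.Random(0)  # seeded only to make the example repeatable
mixed_x, mixed_y, lam = mixup([1.0, 0.0], 1.0, [0.0, 1.0], 0.0, rng=rng)
```

Because the label is blended with the same weight as the features, the mixed target is a soft value between 0 and 1 rather than a hard class.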

BibTeX

@inproceedings{mustafa-etal-2023-annotating,
    title = "Annotating {P}ub{M}ed Abstracts with {M}e{SH} Headings using Graph Neural Network",
    author = "Mustafa, Faizan E  and
      Boutalbi, Rafika  and
      Iurshina, Anastasiia",
    booktitle = "The Fourth Workshop on Insights from Negative Results in NLP",
    month = may,
    year = "2023",
    address = "Dubrovnik, Croatia",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.insights-1.9",
    doi = "10.18653/v1/2023.insights-1.9",
    pages = "75--81",
}
