Code for experiments in Automatic Classification and Visualization of Text Data on Rare Diseases

Classification of Text Data on Rare Diseases

Code used in experiments for the paper: Automatic Classification and Visualization of Text Data on Rare Diseases.

@Article{jpm14050545,
AUTHOR = {Rei, Luis and Pita Costa, Joao and Zdolšek Draksler, Tanja},
TITLE = {Automatic Classification and Visualization of Text Data on Rare Diseases},
JOURNAL = {Journal of Personalized Medicine},
VOLUME = {14},
YEAR = {2024},
NUMBER = {5},
ARTICLE-NUMBER = {545},
URL = {https://www.mdpi.com/2075-4426/14/5/545},
PubMedID = {38793127},
ISSN = {2075-4426},
DOI = {10.3390/jpm14050545}
}

This code is provided to facilitate replication and for documentation. It includes several hardcoded paths used during the experiments.

Model

A fine-tuned model is available on Hugging Face: rad-small.

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="lrei/rad-small")

# Simple high-level usage
pipe(["The patient suffers from a complex genetic disorder.", "The patient suffers from a common genetic disorder."])
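
The call above returns one prediction per input text. As a minimal sketch, assuming the standard `text-classification` pipeline output (a list of dicts with `label` and `score` keys), the predictions can be paired back with their inputs as shown here. The helper names `classify` and `pair_predictions` are hypothetical and not part of this repository, and the label strings the model actually emits are not documented here.

```python
from typing import Dict, List


def classify(texts: List[str], model_id: str = "lrei/rad-small") -> List[Dict]:
    """Run the text-classification pipeline on a list of texts.

    The import is done lazily so the rest of this module can be used
    without transformers installed. The model is downloaded on first use.
    """
    from transformers import pipeline

    pipe = pipeline("text-classification", model=model_id)
    return pipe(texts)


def pair_predictions(texts: List[str], predictions: List[Dict]) -> List[Dict]:
    """Attach each input text to its predicted label and rounded score.

    Assumes `predictions` has the usual pipeline shape:
    [{"label": <str>, "score": <float>}, ...], one entry per text.
    """
    if len(texts) != len(predictions):
        raise ValueError("texts and predictions must have the same length")
    return [
        {"text": text, "label": pred["label"], "score": round(float(pred["score"]), 3)}
        for text, pred in zip(texts, predictions)
    ]


# Example usage (requires network access and the transformers package):
#   texts = ["The patient suffers from a complex genetic disorder."]
#   for row in pair_predictions(texts, classify(texts)):
#       print(row["label"], row["score"], row["text"])
```

Keeping the formatting step separate from the model call makes it possible to check the post-processing without downloading the model.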

Dataset

The dataset used to train this model is available on Zenodo. It is a subset of abstracts obtained from PubMed and sorted into three classes on the basis of their MeSH terms.

Like the model, the dataset is provided for demonstration and methodology validation purposes. The original PubMed data was randomly under-sampled.
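
The README does not say which under-sampling scheme was applied. As a rough illustration only, the sketch below shows one common variant: every class is randomly reduced to the size of the smallest class. The function name `undersample`, the `label` field, and the fixed seed are assumptions made for this example, not the authors' procedure.

```python
import random
from collections import defaultdict
from typing import Dict, List


def undersample(records: List[Dict], label_key: str = "label", seed: int = 0) -> List[Dict]:
    """Randomly under-sample each class to the size of the smallest class.

    This is one common balancing scheme, assumed here for illustration;
    the scheme used for the published dataset may differ.
    """
    if not records:
        return []

    # Group records by their class label.
    by_label = defaultdict(list)
    for record in records:
        by_label[record[label_key]].append(record)

    target = min(len(group) for group in by_label.values())
    rng = random.Random(seed)  # fixed seed so the sample is reproducible

    sampled = []
    for label in sorted(by_label):  # sorted for a deterministic order
        sampled.extend(rng.sample(by_label[label], target))
    rng.shuffle(sampled)
    return sampled
```

With three classes of sizes 5, 3, and 2, for example, this keeps 2 records per class and returns 6 records in shuffled order.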
