Using Deep Learning for Title-Based Semantic Subject Indexing to Reach Competitive Performance to Full-Text

This repository contains the code for the JCDL paper Using Deep Learning for Title-Based Semantic Subject Indexing to Reach Competitive Performance to Full-Text. It is based on and extends the multi-label classification framework Quadflor.

Installation

Install Python 3.4 or higher, then run the following commands:

# install the necessary system packages
sudo apt-get install libatlas-base-dev gfortran python3.4-dev python3.4-venv build-essential

# install the Python modules in a virtual environment with pip (this may take a while):
python3 -m venv lucid_ml_environment
source lucid_ml_environment/bin/activate
cd Code
pip install -r requirements.txt

Replicating the results

To improve the reproducibility of our study, we uploaded a copy of the title datasets to Kaggle. We also provide the configurations used to produce the results from the paper.

To rerun any of the (title) experiments, do the following:

  1. Download the econbiz.csv and pubmed.csv files and copy them to the folder Resources.
  2. Open the .cfg file of the method you want to run (MLP, BaseMLP, CNN, or LSTM) in the Experiments folder. Copy the command from the third line (to evaluate on a single fold) or from the fifth line (to run a full 10-fold cross-validation).
  3. In the command, adjust the --tf-model-path option, which specifies where the model weights are saved (they can take up gigabytes, so make sure you have enough disk space), and set the --pretrained_embeddings option to the location of the GloVe model in your environment.
  4. cd to the folder Code/lucid_ml and run the command.
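
The manual steps above could also be scripted. The following is only a minimal sketch and is not part of the repository: the helper names missing_datasets, read_command, and set_option are hypothetical, and the sketch assumes that the command in the .cfg file is a single line in which each option is followed by its value, separated by whitespace. Check the result against the actual .cfg file before running it.

```python
import shlex
from pathlib import Path

# Dataset files named in step 1.
DATASETS = ("econbiz.csv", "pubmed.csv")


def missing_datasets(resources_dir):
    """Return the dataset files that are not yet present in resources_dir."""
    folder = Path(resources_dir)
    return [name for name in DATASETS if not (folder / name).is_file()]


def read_command(cfg_path, full_cv=False):
    """Return line 3 (single fold) or line 5 (10-fold cross-validation)."""
    lines = Path(cfg_path).read_text(encoding="utf-8").splitlines()
    line_number = 5 if full_cv else 3
    if len(lines) < line_number:
        raise ValueError("{} has fewer than {} lines".format(cfg_path, line_number))
    return lines[line_number - 1].strip()


def set_option(command, option, value):
    """Replace the value that follows `option` in the command string.

    Assumption: the option and its value are separate, whitespace-delimited
    tokens (for example "--tf-model-path /some/dir").
    """
    tokens = shlex.split(command)
    if option not in tokens:
        raise ValueError("{} not found in command".format(option))
    index = tokens.index(option)
    if index + 1 >= len(tokens):
        raise ValueError("{} has no value in command".format(option))
    tokens[index + 1] = str(value)
    # Re-quote each token so paths containing spaces survive the round trip.
    return " ".join(shlex.quote(token) for token in tokens)
```

A possible use, with placeholder paths: call read_command on the chosen .cfg file, pass the result through set_option once for --tf-model-path and once for --pretrained_embeddings, then run the printed command from Code/lucid_ml.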
