OptSLA: An Optimization-based approach for Sequence Labeling Aggregation

NasimISU/OptSLA


OPTSLA: an Optimization-Based Approach for Sequential Label Aggregation

OPTSLA is an Optimization-based Sequential Label Aggregation method that jointly considers the characteristics of sequential labeling tasks, worker reliabilities, and advanced deep learning techniques to address the challenge of annotation aggregation.
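
To give a feel for what "worker reliabilities" means in label aggregation, here is a minimal sketch of reliability-weighted token-level voting. This is not the OPTSLA algorithm: it is a much simpler baseline idea that ignores the sequential dependencies OPTSLA takes into account. The function name `weighted_vote`, the worker names, and the reliability scores are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def weighted_vote(
    annotations: Dict[str, List[str]],
    reliability: Dict[str, float],
) -> List[str]:
    """Pick, for each token, the label with the highest total worker weight.

    Illustrative baseline only; assumes every worker labeled every token.
    """
    length = len(next(iter(annotations.values())))
    aggregated: List[str] = []
    for i in range(length):
        scores: Dict[str, float] = defaultdict(float)
        for worker, labels in annotations.items():
            # Each worker's vote counts in proportion to their reliability.
            scores[labels[i]] += reliability[worker]
        aggregated.append(max(scores, key=scores.get))
    return aggregated


# Hypothetical annotations from three workers for a three-token sentence.
annotations = {
    "w1": ["B-PER", "I-PER", "O"],
    "w2": ["B-PER", "O", "O"],
    "w3": ["O", "O", "B-LOC"],
}
# Hypothetical reliability scores (higher means more trusted).
reliability = {"w1": 0.9, "w2": 0.5, "w3": 0.3}

result = weighted_vote(annotations, reliability)
```

In this toy example the most reliable worker outweighs the other two on the second token, which plain majority voting would label differently.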

Structure

The repository is divided into two folders, code and dataset. The dataset folder contains the NER dataset in CoNLL format. The code folder contains the Python files.
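
For readers unfamiliar with the CoNLL format, the sketch below parses CoNLL-style lines into sentences. It assumes one token per line with whitespace-separated columns, the token in the first column, the label in the last column, and blank lines between sentences. The exact column layout of the files in the dataset folder may differ (for example, one column per worker), so treat `read_conll` as a hypothetical helper and not as part of this repository.

```python
from typing import Iterable, List, Tuple

Sentence = List[Tuple[str, str]]


def read_conll(lines: Iterable[str]) -> List[Sentence]:
    """Parse CoNLL-style lines into sentences of (token, label) pairs.

    Assumption: token is the first column, label is the last column,
    and blank lines (or -DOCSTART- lines) separate sentences.
    """
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("-DOCSTART-"):
            # Sentence boundary: flush the sentence collected so far.
            if current:
                sentences.append(current)
                current = []
            continue
        columns = line.split()
        current.append((columns[0], columns[-1]))
    if current:
        sentences.append(current)
    return sentences


# Small in-memory sample instead of a real file.
sample = """EU B-ORG
rejects O

Peter B-PER
Blackburn I-PER
""".splitlines()

parsed = read_conll(sample)
```

To read a real file, pass an open file handle, since iterating over it yields lines in the same way.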

Embedding

We use GloVe for embeddings. Please download glove.6B from the link below, unzip it, and place the unzipped files in the OPTSLA folder.

http://nlp.stanford.edu/data/glove.6B.zip
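
As a quick check that the download is usable, the sketch below loads GloVe vectors from their plain-text format, in which each line holds a word followed by its space-separated vector components. `load_glove` is a hypothetical helper, not a function from this repository, and the repository's own loading code may work differently.

```python
import io
from typing import Dict, Iterable, List


def load_glove(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Map each word to its embedding vector.

    Expects GloVe's text format: a word followed by space-separated floats.
    """
    vectors: Dict[str, List[float]] = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip empty or malformed lines
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


# Tiny in-memory stand-in for a GloVe file (real vectors have 50 to 300 dimensions).
demo = io.StringIO("the 0.1 -0.2 0.3\ncat 0.4 0.5 -0.6\n")
embeddings = load_glove(demo)
```

With the real data, open one of the unzipped files, for example `glove.6B.100d.txt`, with `encoding="utf-8"` and pass the file handle to `load_glove`. Which dimensionality the OPTSLA code expects is not stated here, so check the variables in the Python files.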

Python file details

| Python file | Description |
| --- | --- |
| conlleval.py | Evaluates the output |
| data_preprocessing.py | Pre-processes the data |
| evaluation.py | Main file containing the OPTSLA implementation |
| functions.py | Contains a few helper functions |

Usage

Run the steps below in order.

NOTE: Please update the variables in the Python files before proceeding.

The first step is to pre-process the data. To do so, run the following command:

  • python data_preprocessing.py

This creates a new folder named iteration0 in the execution folder, containing the pre-processed files.

Next, run OPTSLA with the following command:

  • python evaluation.py

Once aggregation is done, the model can be evaluated by running the following command:

  • python calculations.py

This creates a file containing the results in the calculations folder.
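
The three steps above can also be chained from a small driver script. The following is a minimal sketch, assuming the scripts are run from the code folder and need no command-line arguments. `build_commands` and `run_pipeline` are hypothetical helpers, not part of this repository. The example performs a dry run that only records the commands, so nothing is executed.

```python
import subprocess
import sys
from typing import Callable, List, Optional, Sequence

# Script names taken from the usage steps above, in execution order.
STEPS = ["data_preprocessing.py", "evaluation.py", "calculations.py"]


def build_commands(python: str = sys.executable) -> List[List[str]]:
    """Return one command per step, using the current Python interpreter."""
    return [[python, script] for script in STEPS]


def run_pipeline(runner: Optional[Callable[[Sequence[str]], object]] = None) -> None:
    """Run each step in order.

    By default each command is executed with subprocess.run and check=True,
    so the pipeline stops at the first failing step. A custom runner can be
    passed in for dry runs.
    """
    if runner is None:
        def runner(cmd: Sequence[str]) -> object:
            return subprocess.run(cmd, check=True)
    for cmd in build_commands():
        runner(cmd)


# Dry run: collect the commands instead of executing them.
planned: List[Sequence[str]] = []
run_pipeline(runner=planned.append)
```

Calling `run_pipeline()` with no arguments executes the scripts for real. Remember to update the variables in the Python files first.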

Contact

If you have any questions, please contact us at:

  • nasim@iastate.edu
  • aditkulk@iastate.edu

Evaluation Note

The provided dataset contains 4515 sentences. The results published in the paper are evaluated on 3466 of these sentences to match the baselines.

References

https://www.aclweb.org/anthology/2020.findings-emnlp.119/

Citing

@inproceedings{sabetpour2020optsla,
  title={OptSLA: an Optimization-Based Approach for Sequential Label Aggregation},
  author={Sabetpour, Nasim and Kulkarni, Adithya and Li, Qi},
  booktitle={Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings},
  pages={1335--1340},
  year={2020}
}
