Robust Text Classifier on Test-Time Budgets

Md Rizwan Parvez, Tolga Bolukbasi, Kai-Wei Chang, Venkatesh Saligrama: EMNLP-IJCNLP 2019

For details, please refer to the paper.

We design a generic framework for learning a robust text classification model that achieves high accuracy under different selection budgets (a.k.a. selection rates) at test time. We take a different approach from existing methods and learn to dynamically filter a large fraction of unimportant words with a low-complexity selector, so that any high-complexity classifier only needs to process the small fraction of text that is relevant to the target task. To this end, we propose a data aggregation method for training the classifier, allowing it to achieve competitive performance on fractured sentences. On four benchmark text classification tasks, we demonstrate that the framework achieves consistent speedups with little degradation in accuracy across various selection budgets.

Figure: Our proposed framework. Given a selection rate, a selector is designed to select relevant words and pass them to the classifier. To make the classifier robust against fractured sentences, we aggregate outputs from different selectors and train the classifier on the aggregated corpus.
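To make the aggregation step concrete, below is a minimal, hypothetical sketch of the idea, assuming simple per-word importance scores. The names (select_words, aggregate_corpus, word_scores) are illustrative assumptions, not the repository's actual API.

```python
# Hypothetical sketch of the selector -> classifier data aggregation step.
# Names (select_words, aggregate_corpus, word_scores) are illustrative only.

def select_words(tokens, word_scores, selection_rate):
    """Keep roughly the top `selection_rate` fraction of tokens by importance,
    preserving the original word order (a 'fractured' sentence)."""
    k = max(1, int(len(tokens) * selection_rate))
    kept = set(sorted(tokens, key=lambda t: word_scores.get(t, 0.0), reverse=True)[:k])
    return [t for t in tokens if t in kept]

def aggregate_corpus(corpus, selection_rates, word_scores):
    """Build the aggregated training corpus: one fractured copy of every
    sentence per selection rate, so the classifier sees both full and
    heavily filtered inputs."""
    aggregated = []
    for tokens, label in corpus:
        for rate in selection_rates:          # e.g. [1.0, 0.5, 0.2]
            aggregated.append((select_words(tokens, word_scores, rate), label))
    return aggregated
```

A classifier trained on the aggregated corpus has seen both full and heavily filtered inputs, which is why it can degrade gracefully at test time when the selector removes most words under a tight budget.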

Please see the run scripts run_experiments.py and lstm_experiments.py, and the source code in skim_LSTM.py.

See example source code for the L1-regularized bag-of-words selector: (i) train the model and (ii) generate selector output text using the trained model.
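As a hedged sketch of such a selector (an illustration using scikit-learn, not the repository's actual code), the L1 penalty drives most bag-of-words weights to zero, and the generated selector output keeps only the words whose weights remain nonzero:

```python
# Illustrative sketch of an L1-regularized bag-of-words selector (scikit-learn);
# the toy data and thresholding choices below are assumptions, not the repo's code.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# (i) Train the selector: the L1 penalty zeroes out weights of unimportant words.
texts = ["a great and moving film", "a dull , lifeless story"]   # toy corpus
labels = [1, 0]
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)
selector = LogisticRegression(penalty="l1", C=1.0, solver="liblinear")
selector.fit(X, labels)

# (ii) Generate selector output text: keep only words with nonzero weight.
vocab = vectorizer.get_feature_names_out()
kept = {w for w, c in zip(vocab, selector.coef_[0]) if abs(c) > 1e-8}

def filter_text(text):
    return " ".join(w for w in text.lower().split() if w in kept)

print(filter_text("a great and moving film"))
```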

@inproceedings{parvez2018building,
 title={Building a Robust Text Classifier on a Test-Time Budget},
 author={Parvez, Md Rizwan and Bolukbasi, Tolga and Chang, Kai-Wei and Saligrama, Venkatesh},
 booktitle={EMNLP-IJCNLP},
 year={2019}
}
Results
