
Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling



Code for text data experiments in:

Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling
Benedikt Boecking, Willie Neiswanger, Eric Xing, and Artur Dubrawski
International Conference on Learning Representations (ICLR), 2021

In brief: Please check out the IWS.ipynb notebook. The notebook will walk you through an example application of IWS from start to finish. You can choose to perform the experiment yourself, or to simulate an oracle which responds to queries about proposed labeling functions.


If you have access to GPUs and want to ensure they can be used, please first check the official PyTorch installation instructions for the correct way to install PyTorch on your system.

Once you have installed PyTorch (or if you don't need a GPU for now), you can install all requirements by running:

pip install -r requirements/requirements_pip.txt
# or
conda install -c conda-forge -c pytorch --file requirements/requirements_conda.txt
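After installing, it can help to confirm whether PyTorch is importable and whether it sees a GPU. The snippet below is a minimal sketch and not part of this repository; the helper name `describe_torch_device` is hypothetical. It relies only on `importlib.util.find_spec` from the standard library and on `torch.cuda.is_available()`.

```python
import importlib.util


def describe_torch_device():
    """Return "cuda" or "cpu" if PyTorch is importable, otherwise None.

    Hypothetical helper for illustration; it is not part of the IWS code.
    """
    if importlib.util.find_spec("torch") is None:
        return None  # PyTorch is not installed in this environment
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


print(describe_torch_device())
```

If this prints `cpu` on a machine with a GPU, the installed PyTorch build most likely lacks CUDA support and should be reinstalled before the other requirements.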


To download all data for the text experiments, run:

cd datasets
wget -O iws_datasets.tar.gz
tar -xzvf iws_datasets.tar.gz
rm iws_datasets.tar.gz

Please see datasets/ for links and references to the original data sources and please cite the original sources where appropriate.
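On systems without `tar` or `rm`, the extract-and-clean-up steps above can be approximated in Python. This is a sketch under assumptions rather than a script shipped with the repository: the function name `extract_and_remove` is hypothetical, and it expects the archive to have been downloaded already.

```python
import os
import tarfile


def extract_and_remove(archive_path, dest_dir="."):
    """Extract a .tar.gz archive into dest_dir, then delete the archive.

    Roughly mirrors `tar -xzvf <archive>` followed by `rm <archive>`.
    Returns the list of member names found in the archive.
    """
    with tarfile.open(archive_path, "r:gz") as archive:
        members = archive.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: refuse unsafe members (e.g. absolute paths).
            archive.extractall(dest_dir, filter="data")
        else:
            archive.extractall(dest_dir)
    os.remove(archive_path)
    return members


# Example call, assuming the archive sits in the datasets directory:
# extract_and_remove("datasets/iws_datasets.tar.gz", "datasets")
```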

Running the IWS Notebook

To run text data experiments, please see the IWS.ipynb notebook.

This notebook will walk you through a full example of interactive weak supervision (IWS), from start to finish. It allows you to choose a text dataset, generate a family of labeling functions (LFs), and then run IWS on this family of LFs, either with an automated oracle or by querying you directly for feedback on LFs. It then trains a downstream classifier via weak supervision methods, using the LFs learned during IWS.
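To make the workflow concrete, here is a toy sketch of the same stages in plain Python. It is an illustration only and does not use this repository's code or API: every name (`make_keyword_lf`, `generate_lf_family`, `simulated_oracle`, `majority_vote`) and the six-document corpus are invented for the example. It also simplifies two stages. It asks the oracle about every candidate LF instead of interactively selecting which LFs to query, and it uses a plain majority vote in place of a weak supervision label model and downstream classifier.

```python
ABSTAIN = 0  # an LF returns 0 when it casts no vote

# Toy sentiment corpus: (text, true label in {+1, -1}); invented for illustration.
DOCS = [
    ("great movie loved it", 1),
    ("great acting and a great plot", 1),
    ("loved the soundtrack", 1),
    ("terrible movie hated it", -1),
    ("boring plot and terrible acting", -1),
    ("hated the boring ending", -1),
]


def make_keyword_lf(word, label):
    """Build an LF that votes `label` if `word` occurs in the text, else abstains."""
    def lf(text):
        return label if word in text.split() else ABSTAIN
    lf.name = f"{word}->{label:+d}"
    return lf


def generate_lf_family(docs):
    """Create one candidate LF per (vocabulary word, class label) pair."""
    vocab = sorted({w for text, _ in docs for w in text.split()})
    return [make_keyword_lf(w, y) for w in vocab for y in (1, -1)]


def simulated_oracle(lf, docs, threshold=0.7, min_coverage=2):
    """Answer "is this LF useful?" from ground truth (assumed criterion).

    An LF counts as useful if it fires on at least `min_coverage` documents
    and its accuracy on those documents is at least `threshold`.
    """
    fired = [(lf(text), y) for text, y in docs if lf(text) != ABSTAIN]
    if len(fired) < min_coverage:
        return False
    accuracy = sum(vote == y for vote, y in fired) / len(fired)
    return accuracy >= threshold


def majority_vote(lfs, text):
    """Combine LF votes into a label in {-1, 0, +1}; 0 means no decision."""
    total = sum(lf(text) for lf in lfs)
    return (total > 0) - (total < 0)


family = generate_lf_family(DOCS)
useful = [lf for lf in family if simulated_oracle(lf, DOCS)]
weak_labels = [majority_vote(useful, text) for text, _ in DOCS]
print([lf.name for lf in useful])
print(weak_labels)
```

Replacing `simulated_oracle` with a function that prompts a person for a yes/no answer gives the interactive variant of the same loop.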


The code in this repository is shared under the MIT license, available in the LICENSE file.


Please cite our paper if you use code from this repo:

  title={Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling},
  author={Boecking, Benedikt and Neiswanger, Willie and Xing, Eric and Dubrawski, Artur},
  booktitle={International Conference on Learning Representations},