littlewine/snorkel-ml

This repo contains the code used to run the experiments of Semi-supervised Ensemble Learning with Weak Supervision for Biomedical Relationship Extraction, presented at the Automated Knowledge Base Construction 2019 conference in Amherst, Massachusetts.

This methodology can be applied as is to any relationship extraction problem to extend a training dataset with an arbitrarily large weakly supervised dataset. If you use it, please cite our paper.

The code is based on snorkel v0.6.2, a framework for information extraction using weak supervision.

Installation

Snorkel runs on Python 2.7 or Python 3 and requires a few Python packages, which can be installed using conda and pip.

Setting Up Conda

Installation is easiest if you download and install conda. You can create a new conda environment with, for example:

conda create -n py2Env python=2.7 anaconda

Then activate that environment:

source activate py2Env

Installing dependencies

First, install Numba, a package for high-performance numeric computing in Python, via conda:

conda install numba

Then install the remaining package requirements:

pip install --requirement python-package-requirement.txt

Finally, enable ipywidgets:

jupyter nbextension enable --py widgetsnbextension --sys-prefix

Note: If you are using conda and experience issues with lxml, try running conda install libxml2.
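After the install steps, a quick import check can confirm that the environment is usable before running anything heavier. The snippet below is only a sketch: the helper name `check_imports` is hypothetical and not part of this repo, and the module list simply mirrors the packages named in the steps above.

```python
import importlib


def check_imports(module_names):
    """Return a dict mapping each module name to True if it imports without error."""
    status = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            status[name] = True
        except Exception:
            # A missing or broken installation both count as a failure here.
            status[name] = False
    return status


if __name__ == "__main__":
    # numba, lxml and ipywidgets are the packages mentioned in the steps above;
    # extend the list with anything else from the requirements file.
    results = check_imports(["numba", "lxml", "ipywidgets"])
    for name, ok in sorted(results.items()):
        print("{:<12} {}".format(name, "ok" if ok else "MISSING"))
```

If lxml is reported as missing under conda, the note above about installing libxml2 is the first thing to try.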

Note: Currently the Viewer is supported on the following versions:

  • jupyter: 4.1
  • jupyter notebook: 4.2

In some tutorials we also use Stanford CoreNLP to pre-process text; you will be prompted to install it when you run run.sh.

Running

After installing, just run:

./run_local.sh

The code used to perform the experiments for semi-supervised learning (using ML models as weak sources of supervision) can be found in /my-code/
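To illustrate the general idea of using ML models as weak sources of supervision, the sketch below turns each model's predicted probabilities into votes that can be stacked into a label matrix. It is not the code from /my-code/: the function names, the thresholds 0.3 and 0.7, and the binary vote convention (1 = positive, -1 = negative, 0 = abstain) are all assumptions made for this example.

```python
def probs_to_votes(probs, low=0.3, high=0.7):
    """Map positive-class probabilities to votes: 1, -1, or 0 (abstain).

    The thresholds are illustrative assumptions; a model abstains whenever
    its probability falls strictly between `low` and `high`.
    """
    votes = []
    for p in probs:
        if p >= high:
            votes.append(1)
        elif p <= low:
            votes.append(-1)
        else:
            votes.append(0)
    return votes


def build_label_matrix(model_probs, low=0.3, high=0.7):
    """Stack the votes of several models into a candidates-by-models matrix.

    `model_probs` holds one list of probabilities per model, all covering
    the same candidates in the same order.
    """
    columns = [probs_to_votes(p, low, high) for p in model_probs]
    return [list(row) for row in zip(*columns)]


if __name__ == "__main__":
    # Hypothetical probabilities from two models over three candidates.
    model_a = [0.9, 0.5, 0.1]
    model_b = [0.8, 0.2, 0.4]
    for row in build_label_matrix([model_a, model_b]):
        print(row)
```

Each column of the resulting matrix plays the same role as the output of a hand-written labeling function, so confident models contribute votes while uncertain predictions are left as abstentions.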