MPEDS Annotation Interface

This is the annotation interface used to create datasets for the Machine-learning Protest Event Data System (MPEDS). Although it was built for the specific task of coding protest events, it can also be used to develop other types of event datasets.

This system is built in Python using the Flask microframework. It can source articles parsed from Lexis-Nexis (using the split-ln.py script), Apache Solr, or XML files formatted in News Industry Text Format, such as the LDC's New York Times Annotated Corpus.
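As a rough illustration of the last source type, the sketch below pulls a headline and body paragraphs out of a NITF-style document with the standard library. It is not the loader this project uses: the `parse_nitf` function and the `SAMPLE_NITF` string are hypothetical, and the element names (`hl1`, `body.content`, `p`) are assumed from the general NITF layout and should be checked against your own files.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample; real NITF files carry much more metadata.
SAMPLE_NITF = """<nitf>
  <head><title>Example headline</title></head>
  <body>
    <body.head>
      <hedline><hl1>Example headline</hl1></hedline>
    </body.head>
    <body.content>
      <block class="full_text">
        <p>First paragraph.</p>
        <p>Second paragraph.</p>
      </block>
    </body.content>
  </body>
</nitf>"""


def parse_nitf(xml_text):
    """Return the headline and non-empty body paragraphs of a NITF-style document."""
    root = ET.fromstring(xml_text)

    # Take the first <hl1> element as the headline, if there is one.
    hl1 = next(root.iter("hl1"), None)
    headline = "".join(hl1.itertext()).strip() if hl1 is not None else ""

    # Collect the text of every <p> found under <body.content>.
    paragraphs = []
    for content in root.iter("body.content"):
        for p in content.iter("p"):
            text = "".join(p.itertext()).strip()
            if text:
                paragraphs.append(text)

    return {"headline": headline, "paragraphs": paragraphs}


article = parse_nitf(SAMPLE_NITF)
print(article["headline"])
print(len(article["paragraphs"]))
```

Returning a plain dictionary keeps the sketch independent of any database schema; how the interface actually stores article metadata is not shown here.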

It also uses Bootstrap for CSS and jQuery for JavaScript. It only works in Firefox (for now).

Setup

To populate the database with example information, first run the setup script.

python setup.py

This will add five users: an admin (admin), two first-pass coders (coder1p_1, coder1p_2), and two second-pass coders (coder2p_1, coder2p_2). They all have the password default. The script also adds a variable hierarchy for second-pass coding, enters metadata for all the articles in the example-articles directory, and queues those articles for the first-pass coders.
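The seeded state described above can be pictured with a small sketch. It does not reproduce setup.py: the role labels, the `first_pass_coders` and `build_queue` helpers, and the article file names are all hypothetical, and it assumes that every first-pass coder is queued every example article, which the setup script may or may not do.

```python
# Usernames and the shared password come from the description above;
# the role labels are hypothetical names chosen for this sketch.
SEED_USERS = [
    ("admin", "admin"),
    ("coder1p_1", "first-pass"),
    ("coder1p_2", "first-pass"),
    ("coder2p_1", "second-pass"),
    ("coder2p_2", "second-pass"),
]
DEFAULT_PASSWORD = "default"


def first_pass_coders(users):
    """Return the usernames whose role is first-pass coding."""
    return [name for name, role in users if role == "first-pass"]


def build_queue(article_ids, coders):
    """Pair every article with every coder (an assumption, see above)."""
    return [(coder, article) for article in article_ids for coder in coders]


# Hypothetical file names standing in for the example-articles directory.
articles = ["article_a.txt", "article_b.txt"]
queue = build_queue(articles, first_pass_coders(SEED_USERS))
for coder, article in queue:
    print(coder, article)
```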

Then run the Flask test server with the following.

python mpeds_coder.py

Publications

  • MPEDS: Automating the Generation of Protest Event Data. 2017. SocArXiv

Development plan

This is a product in early alpha stages. Features we hope to have working eventually:

  • Template system for variables
  • Ability to specify multiple article sources
  • Generalizing an n-pass structure and control flow
  • Integration with multiple databases
  • Cross-browser compatibility

If you're a movement or event data scholar and have a specific project for which you think this would be a good tool, shoot Alex Hanna (alex.hanna@gmail.com) a message.

Acknowledgments

Development of this interface has been supported by a National Science Foundation Graduate Research Fellowship and National Science Foundation grant SES-1423784. Thanks to Emanuel Ubert and Katie Fallon for working with this system since its inception, and to the many undergraduate annotators who have put a lot of time into working with and refining it.