Contents of this directory

This directory contains tweets labeled by crowdsourcing workers. Each tweet is accompanied by a label determined by majority vote among at least three crowdsourcing workers.

There is one sub-directory per crisis, for each of the following disasters:

  • 2012_Sandy_Hurricane
  • 2013_Alberta_Floods
  • 2013_Boston_Bombings
  • 2013_Oklahoma_Tornado
  • 2013_Queensland_Floods
  • 2013_West_Texas_Explosion

On-topic/Off-topic files: *-ontopic_offtopic.csv

Contents: Each file contains approximately 10,000 tweets. Half of these tweets were drawn from the geo-based sample and half from the keyword-based sample; both samples are described in [Olteanu et al. 2014].

Labels: These files contain labels indicating if a tweet is on-topic (related to the crisis at hand), or off-topic (not related to it).

File format: One tweet per line with the following comma-separated fields: tweet id, tweet text, tweet label
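As a rough illustration of the file format above, here is a minimal Python sketch for reading one of these files. Several details are assumptions rather than facts stated in this README: that the first line is a header row, that tweet text containing commas is wrapped in double quotes, and that the label strings look like `on-topic` / `off-topic`. The function name `read_labeled_tweets` and the sample rows are made up for the example.

```python
import csv
import io
from collections import Counter


def read_labeled_tweets(stream, has_header=True):
    """Yield (tweet_id, text, label) tuples from a CSV text stream.

    has_header=True is an assumption; set it to False if the file
    turns out to start directly with data.
    """
    reader = csv.reader(stream)
    if has_header:
        next(reader, None)  # skip the (assumed) header row
    for row in reader:
        if len(row) != 3:
            continue  # skip rows that do not have exactly three fields
        tweet_id, text, label = (field.strip() for field in row)
        # Keep the id as a string so large tweet ids are never altered.
        yield tweet_id, text, label


# Invented sample data in the assumed layout (not taken from the dataset).
sample = io.StringIO(
    "tweet id,tweet,label\n"
    '"1","Power is out, stay safe",on-topic\n'
    '"2","Great lunch today",off-topic\n'
)

rows = list(read_labeled_tweets(sample))
label_counts = Counter(label for _, _, label in rows)
print(label_counts)
```

To read a real file, the in-memory sample could be replaced with something like `open(path, newline="", encoding="utf-8")`, where `path` points to one of the `*-ontopic_offtopic.csv` files; the encoding is also an assumption.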

Questions/inquiries

[Olteanu et al. 2014] Alexandra Olteanu, Carlos Castillo, Fernando Diaz, Sarah Vieweg: "CrisisLex: A Lexicon for Collecting and Filtering Microblogged Communications in Crises". ICWSM 2014.

For inquiries, please contact Alexandra Olteanu, Carlos Castillo, Fernando Diaz, or Sarah Vieweg.

Version history

  • 2014-10-26: v1.0, initial release containing labeled tweets only.