Using Social Media to Enhance Emergency Situation Awareness

Project for the Web Information Retrieval course at La Sapienza, Università di Roma, taught by Prof. Andrea Vitaletti and Prof. Luca Becchetti.

The project aims to detect emergency situations in real time by scanning the flow of tweets, using machine learning algorithms and treating users as sensors.

Link to the Slideshare presentation. The scientific paper is available in the repository.

Authors

Daniele Davoli - LinkedIn profile
Danilo Marzilli - LinkedIn profile
Andrea Lombardo - LinkedIn profile

Dataset

To train and validate our machine learning system, we used a dataset of 5,642 manually annotated tweets in Italian. The tweets relate to 4 different natural disasters that occurred in Italy between 2009 and 2014. Each tweet record reports:

tweet ID;
text;
source;
author’s screen name;
author’s ID;
latitude and longitude (if available);
time;
disaster ID (see below);
class.
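
As a minimal sketch of how records with these fields could be read, the snippet below uses Python's standard `csv` module. The column names (`id`, `screen_name`, `disaster`, and so on), the `Tweet` dataclass and the two sample rows are all hypothetical, invented for illustration; the real header of the .csv file may differ and should be checked before reuse.

```python
import csv
import io
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class Tweet:
    tweet_id: str
    text: str
    source: str
    screen_name: str
    author_id: str
    latitude: Optional[float]   # None when the tweet is not geotagged
    longitude: Optional[float]
    time: str
    disaster_id: str
    label: str


def _to_float(value: Optional[str]) -> Optional[float]:
    """Return a float, or None for an empty or missing coordinate."""
    value = (value or "").strip()
    return float(value) if value else None


def read_tweets(lines: Iterable[str]) -> Iterator[Tweet]:
    """Yield one Tweet per CSV row. Column names are assumed, not verified."""
    for row in csv.DictReader(lines):
        yield Tweet(
            tweet_id=row["id"],
            text=row["text"],
            source=row["source"],
            screen_name=row["screen_name"],
            author_id=row["user_id"],
            latitude=_to_float(row.get("latitude")),
            longitude=_to_float(row.get("longitude")),
            time=row["time"],
            disaster_id=row["disaster"],
            label=row["class"],
        )


# Invented sample rows standing in for the real file.
sample = io.StringIO(
    "id,text,source,screen_name,user_id,latitude,longitude,time,disaster,class\n"
    "1,Crollo in centro,web,utente1,42,44.8,11.6,2012-05-20 04:10:00,D1,damage\n"
    "2,Che paura stanotte,web,utente2,43,,,2012-05-20 04:12:00,D1,no damage\n"
)
tweets = list(read_tweets(sample))
```

To read an actual file, pass an open file handle instead of the in-memory sample, for example one opened with `newline=""` as the `csv` documentation recommends.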

Tweets were manually annotated and divided into 3 classes according to the information they convey:

damage class: tweets related to the disaster and carrying information about damage to infrastructure or to the population;
no damage class: tweets related to the disaster but not carrying relevant information for the assessment of damages;
not relevant class: tweets collected while building the dataset, but not related to any disaster (noise).
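
The three classes could be represented in code roughly as follows. This is only a sketch: the label strings, the `TweetClass` enum and the helper names are hypothetical, and the rule that both disaster-related classes count as "relevant" is an assumption about how the three labels might be reduced to two, not something the annotation scheme states.

```python
from enum import Enum


class TweetClass(Enum):
    # Label strings are assumed; the dataset may encode classes differently.
    DAMAGE = "damage"
    NO_DAMAGE = "no damage"
    NOT_RELEVANT = "not relevant"


def parse_label(raw: str) -> TweetClass:
    """Map a raw label string to a TweetClass, ignoring case and padding."""
    return TweetClass(raw.strip().lower())


def is_relevant(label: TweetClass) -> bool:
    """Assumption: every disaster-related tweet counts as relevant."""
    return label is not TweetClass.NOT_RELEVANT
```

An unknown label string raises `ValueError`, which makes annotation typos visible early instead of silently creating a fourth class.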

The dataset is also available in this repository. Validations with your own datasets are welcome :)

Data pipeline

We process the dataset in this order:

Import the data from the .csv file;
Preprocess the tweets to remove punctuation, stop words and digits, and apply a stemming algorithm;
Transform the tweets into vectors in a vector space whose axes are the vocabulary terms, weighting each component by TF-IDF (Term Frequency, Inverse Document Frequency);
Cluster the tweets (now vectors) into main topics;
Train an SVM classifier to distinguish relevant tweets from not relevant ones.
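
The preprocessing and TF-IDF steps above can be illustrated with a small standard-library sketch. It is not the project's code: the function names, the tiny stop-word list and the example tweets are invented, and the weight `tf * log(N / df)` is only one common TF-IDF variant. Stemming, clustering and the SVM are left out because they would normally rely on third-party libraries.

```python
import math
import string
from collections import Counter
from typing import Dict, List

# Tiny illustrative stop-word list; a real run needs a full Italian list.
STOP_WORDS = {"il", "la", "di", "del", "e", "a", "in", "un", "una", "che", "per"}

# Translation table that deletes ASCII punctuation and digits only.
_DROP = str.maketrans("", "", string.punctuation + string.digits)


def preprocess(text: str) -> List[str]:
    """Lowercase, drop punctuation and digits, then remove stop words."""
    tokens = text.lower().translate(_DROP).split()
    return [t for t in tokens if t not in STOP_WORDS]


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one sparse vector per document, weighted by tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Invented example tweets, not taken from the dataset.
raw_tweets = [
    "Crollo di un palazzo in centro!",
    "Crollo del ponte, 2 feriti",
    "Che bella giornata",
]
docs = [preprocess(t) for t in raw_tweets]
vectors = tf_idf(docs)
```

In this example "crollo" appears in two of the three tweets, so it receives a lower weight than "palazzo", which appears in only one. The resulting sparse vectors are the kind of input the clustering and classification steps would consume.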
