Twitter Disasters

This project summarizes tweets which are geographically linked to a disaster, producing a summary that can be useful to rescue teams.

In essence, this involved implementing the following two papers:

Extracting Situational Information from Microblogs during Disaster Events: a Classification-Summarization Approach
Summarizing Situational Tweets in Crisis Scenario

The tweets used are from Twitter as a Lifeline: Human-annotated Twitter Corpora for NLP of Crisis-related Messages.

Links to blog posts where I describe my approach:

COWTS: https://medium.com/@gabrieltseng/summarizing-tweets-in-a-disaster-e6b355a41732

COWABS: https://medium.com/@gabrieltseng/summarizing-tweets-in-a-disaster-part-ii-67db021d378d

I repeat the exercise using both NLTK and spaCy, to compare the results of using different NLP tools.

Tweets from IDs

This notebook uses Twython to retrieve the tweets from their tweet IDs, since the corpus above only stores tweet IDs and user IDs.
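As a rough illustration of the retrieval step (a sketch, not the notebook's actual code), the IDs can be sent to a lookup function in batches. The names `chunked`, `hydrate`, and `BATCH_SIZE` are hypothetical, and the batch size of 100 is an assumption about the lookup endpoint's per-request limit. The lookup function is passed in as a parameter so the batching logic can be tried without credentials.

```python
from typing import Callable, Dict, Iterable, List

BATCH_SIZE = 100  # assumption: per-request ID limit of the lookup endpoint


def chunked(ids: List[str], size: int = BATCH_SIZE) -> Iterable[List[str]]:
    """Yield consecutive slices of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def hydrate(ids: List[str],
            lookup: Callable[[List[str]], List[dict]]) -> Dict[str, str]:
    """Map tweet ID -> text for every ID the lookup returns.

    IDs the lookup does not return (e.g. deleted tweets) are skipped.
    """
    texts: Dict[str, str] = {}
    for batch in chunked(ids):
        for status in lookup(batch):
            texts[status["id_str"]] = status["text"]
    return texts


if __name__ == "__main__":
    # Stand-in for a real API call; pretends that tweet "2" was deleted.
    def fake_lookup(batch: List[str]) -> List[dict]:
        return [{"id_str": i, "text": f"tweet {i}"} for i in batch if i != "2"]

    print(hydrate(["1", "2", "3"], fake_lookup))
```

With Twython, the `lookup` argument could plausibly be something like `lambda batch: twitter.lookup_status(id=",".join(batch))`, where `twitter` is an authenticated `Twython` client; treat that call as an assumption to check against the Twython documentation.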

The following two notebooks appear in both the NLTK and spaCy folders:

Content Word Based Tweet Summarization (COWTS)

In this notebook, I identify content words in the tweets, and assign them tf-idf scores. I then use this information (and Integer Linear Programming) to generate a summary of the best tweets.
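The idea can be sketched on a toy example. This is a simplified illustration, not the notebook's implementation: content words are approximated with a small stopword filter rather than part-of-speech tags, the tf-idf formula is a smoothed variant chosen for the example, and an exhaustive search over tweet subsets stands in for an Integer Linear Programming solver (it optimizes the same kind of objective, but only scales to a handful of tweets). All function names and the `max_words` budget are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

# Toy stopword list; a stand-in for proper content-word detection.
STOPWORDS = {"a", "the", "in", "of", "to", "is", "are", "and", "at", "for"}


def content_words(tweet):
    """Return the set of non-stopword tokens in a tweet."""
    return {w for w in tweet.lower().split() if w not in STOPWORDS}


def tfidf_scores(tweets):
    """Score each content word: corpus term frequency times a smoothed idf."""
    tf = Counter(w for t in tweets for w in t.lower().split()
                 if w not in STOPWORDS)
    df = Counter(w for t in tweets for w in content_words(t))
    n = len(tweets)
    return {w: tf[w] * math.log(1 + n / df[w]) for w in tf}


def summarize(tweets, max_words):
    """Pick the tweets that cover the highest total content-word score
    without exceeding the word budget (brute force instead of ILP)."""
    scores = tfidf_scores(tweets)
    best, best_value = (), 0.0
    for k in range(1, len(tweets) + 1):
        for subset in combinations(range(len(tweets)), k):
            if sum(len(tweets[i].split()) for i in subset) > max_words:
                continue
            covered = set().union(*(content_words(tweets[i]) for i in subset))
            # Each covered word counts once, so duplicate tweets add nothing.
            value = sum(scores[w] for w in sorted(covered))
            if value > best_value:
                best, best_value = subset, value
    return [tweets[i] for i in best]


if __name__ == "__main__":
    tweets = [
        "bridge collapsed in kathmandu",
        "bridge collapsed in kathmandu",
        "hospital needs blood donors",
        "airport closed",
    ]
    for line in summarize(tweets, max_words=10):
        print(line)
```

Because each content word contributes to the objective only once, the search prefers tweets that add new information over near-duplicates, which is the behaviour an ILP formulation of this kind is meant to capture.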

COntent Words Based ABstractive Summarization (COWABS)

In this notebook, I use the tweet summary produced by COWTS to build a word graph and to generate word paths through it. I then use Integer Linear Programming to pick the best word paths, producing a paragraph-style summary that goes beyond the original tweets.
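A minimal sketch of the word graph and path generation steps follows. It makes simplifying assumptions that may differ from the notebook: tokens are merged by lowercase surface form only, and the path selection step (the Integer Linear Programming part) is not shown. The names `build_word_graph`, `word_paths`, and the `<s>` / `</s>` sentinels are hypothetical.

```python
from collections import defaultdict

START, END = "<s>", "</s>"  # sentinel nodes marking tweet boundaries


def build_word_graph(tweets):
    """Link each word to the words that directly follow it in any tweet."""
    graph = defaultdict(set)
    for tweet in tweets:
        tokens = [START] + tweet.lower().split() + [END]
        for left, right in zip(tokens, tokens[1:]):
            graph[left].add(right)
    return graph


def word_paths(graph, max_len=10):
    """Yield every cycle-free START-to-END path of at most max_len words."""
    stack = [(START, [])]
    while stack:
        node, words = stack.pop()
        for nxt in sorted(graph[node]):
            if nxt == END:
                if words:
                    yield " ".join(words)
            elif nxt not in words and len(words) < max_len:
                stack.append((nxt, words + [nxt]))


if __name__ == "__main__":
    tweets = [
        "bridge collapsed in kathmandu",
        "hospital collapsed in bhaktapur",
    ]
    graph = build_word_graph(tweets)
    for path in sorted(word_paths(graph)):
        print(path)
```

Because the two example tweets share the words "collapsed in", the graph also contains paths such as "bridge collapsed in bhaktapur" that appear in neither tweet. This is why a selection step is needed afterwards: the candidate paths recombine the tweets, and only some of them are worth keeping.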
