Skip to content

Successfully established a machine learning model for detecting whether a given tweet is about a real disaster or not.

Notifications You must be signed in to change notification settings

SayamAlt/Detection-of-Disaster-from-Tweets

Repository files navigation

Detection-of-Disaster-from-Tweets

Twitter has become an important communication channel in times of emergency. The ubiquitousness of smartphones enables people to announce an emergency they’re observing in real-time. Because of this, more agencies are interested in programmatically monitoring Twitter (i.e. disaster relief organizations and news agencies).

But, it’s not always clear whether a person’s words are actually announcing a disaster. Take this example:

Disaster Tweet

The author explicitly uses the word “ABLAZE” but means it metaphorically. This is clear to a human right away, especially with the visual aid. But it’s less clear to a machine.

Objective

Develop a machine learning model that predicts which Tweets are about real disasters and which ones aren’t.

Python's Libraries Used

  • Numpy
  • Pandas
  • Seaborn
  • Matplotlib.pyplot
  • Warnings
  • NLTK
  • re
  • string
  • SymSpellPy
  • Sklearn
  • XGBoost
  • WordCloud

Predictive Accuracy of Machine Learning Models Used

Model Name Accuracy Score
Random Forests Classifier 79.06%
Decision Tree Classifier 75.38%
Multinomial Naive Bayes 80.53%
Support Vector Classifier 79.62%
K Nearest Neighbors 68.89%
Logistic Regression 79.06%
XG Boost Classifier 77.64%

Best Performing Algorithm

The Multinomial Naive Bayes made the most accurate predictions on detection of a real disaster from any tweet given in the dataset with an overall accuracy of almost 81%. This proves the effectiveness of the Multinomial Naive Bayes algorithm in text classification tasks.

Worst Performing Algorithm

K Nearest Neighbors had the worst performance among all the ML algorithms used, having an accuracy of just over 68%.

Acknowledgments

This dataset was created by the company figure-eight and originally shared on their 'Data For Everyone' website here.

About

Successfully established a machine learning model for detecting whether a given tweet is about a real disaster or not.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published