seena00/Headline-Language-Analyzer

 
 

Headline Sarcasm Detection


A set of three different classifiers that identify satire in news headlines.


Summary: Recent debate over fake news makes it important to have a method for assessing the validity of headlines. In the contemporary political atmosphere especially, satire and real news are increasingly difficult to tell apart. Using a large dataset of headlines from Kaggle, our group first augmented the original dataset with numerical features describing each headline. Then, we performed a hypothesis test (A/B testing) to determine whether headline character length differs significantly between sarcastic and non-sarcastic headlines. Finally, we built, trained, and evaluated three classifiers (Naive Bayes, k-Nearest Neighbors, and Multilayer Perceptron) to detect sarcasm in news headlines.
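
As a rough illustration of the first two steps in the summary, the sketch below adds a character-length feature to each headline and runs a two-sided permutation test on the difference in mean length between the two classes. The sample headlines, the `char_length` field name, and the choice of a permutation test are all assumptions made for illustration; the project's notebooks may compute the features and run the test differently.

```python
import random

# Hypothetical sample records, made up for this sketch; the real dataset is far larger.
records = [
    {"headline": "area man unsure why he opened the refrigerator again", "is_sarcastic": 1},
    {"headline": "nation's dogs announce plan to bark at absolutely nothing", "is_sarcastic": 1},
    {"headline": "report: meeting could have been a two-line email", "is_sarcastic": 1},
    {"headline": "city council approves new transit budget", "is_sarcastic": 0},
    {"headline": "storm closes schools across the region", "is_sarcastic": 0},
    {"headline": "local library extends weekend hours", "is_sarcastic": 0},
]


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def add_length_feature(rows):
    """Return copies of the rows with a numerical `char_length` field added."""
    return [{**row, "char_length": len(row["headline"])} for row in rows]


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns the observed difference (mean of A minus mean of B) and the
    fraction of shuffles whose absolute difference is at least as large.
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign labels at random
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_permutations


rows = add_length_feature(records)
sarcastic = [r["char_length"] for r in rows if r["is_sarcastic"] == 1]
serious = [r["char_length"] for r in rows if r["is_sarcastic"] == 0]
observed_diff, p_value = permutation_test(sarcastic, serious, n_permutations=2_000)
print(f"observed difference in mean length: {observed_diff:.2f}")
print(f"estimated p-value: {p_value:.3f}")
```

A permutation test is used here because it makes no distributional assumption about headline lengths; with only six made-up headlines the resulting p-value says nothing about the real data.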

Link to our dataset: https://www.kaggle.com/rmisra/news-headlines-dataset-for-sarcasm-detection
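
A minimal loading sketch, assuming the Kaggle download is a JSON Lines file (one JSON object per line) with `headline` and `is_sarcastic` fields. The file name `Sarcasm_Headlines_Dataset.json`, the field names, and the `load_headlines` helper are assumptions to check against the actual download, and the two sample lines are made up so the snippet runs on its own.

```python
import io
import json


def load_headlines(lines):
    """Parse an iterable of JSON Lines strings into (headline, label) pairs."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        pairs.append((record["headline"], int(record["is_sarcastic"])))
    return pairs


# Stand-in for the downloaded file (hypothetical contents).
sample_file = io.StringIO(
    '{"headline": "local council approves new budget", "is_sarcastic": 0}\n'
    '{"headline": "area man wins argument with toaster", "is_sarcastic": 1}\n'
)
pairs = load_headlines(sample_file)
print(pairs)

# With the real download (the path is an assumption):
# with open("Sarcasm_Headlines_Dataset.json", encoding="utf-8") as f:
#     pairs = load_headlines(f)
```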

Link to our pitch deck: https://docs.google.com/presentation/d/1FRci0cazV-hCFm9mKHR-GUqdMgsujb5T9tqFilcxw54/edit?usp=sharing


A project by Seena Saiedian, Edward Liu, Alex Xu, Will Furtado, Jason Xiong, and Kathlee Wong for the Fall 2019 General Membership program of the Data Science Society at Berkeley.
