Reddit front page prediction from titles and subreddit origin.
Sentiment analysis is a complex topic, but I believe there is a natural application for it: seeing how people react to headlines. Reddit's voting engine decides what is "popular", so we can use it to measure that reaction.
The data would be the headlines, submission times, URLs, and subreddits of posts from both the new queue and the front page.
The headlines would be processed with standard NLP techniques: tokenizing, stemming, and TF-IDF weighting.
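As a rough illustration of that preprocessing, here is a minimal standard-library sketch. The function names, the suffix list in crude_stem (a stand-in for a real stemmer such as Porter), and the sample headlines are all assumptions made for this example, not part of the proposal.

```python
import math
import re
from collections import Counter


def tokenize(headline):
    """Lowercase a headline and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", headline.lower())


def crude_stem(token):
    """Strip a few common suffixes (assumed stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def tfidf(headlines):
    """Return one {term: weight} dict per headline, weight = tf * log(N / df)."""
    docs = [[crude_stem(t) for t in tokenize(h)] for h in headlines]
    n_docs = len(docs)
    # Document frequency: in how many headlines each stemmed term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical headlines, for illustration only.
sample_headlines = [
    "Scientists discover new planet",
    "New planet discovered by scientists",
    "Cat videos trending again",
]
sample_vectors = tfidf(sample_headlines)
```

Terms that appear in fewer headlines get a larger weight, so a word like "cat" (one headline) outweighs "planet" (two headlines) at the same term frequency. A production version would more likely rely on an established NLP library than on this hand-rolled stemmer.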
The model would be a classifier, or an ensemble of classifiers, trained on labels indicating whether or not each post made it to the front page.
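The proposal does not name a specific model, so the following is only one possible sketch: a small multinomial Naive Bayes classifier written with the standard library, using headline tokens plus the subreddit as features. The class name, the smoothing parameter alpha, and the toy posts are assumptions made for illustration.

```python
import math
import re
from collections import Counter


class NaiveBayesHeadlines:
    """Multinomial Naive Bayes over headline tokens and a subreddit feature."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing strength (assumed default)

    def _features(self, headline, subreddit):
        tokens = re.findall(r"[a-z]+", headline.lower())
        return tokens + ["subreddit=" + subreddit.lower()]

    def fit(self, posts, labels):
        """posts: list of (headline, subreddit); labels: 1 = front page, 0 = not."""
        self.class_counts = Counter(labels)
        self.token_counts = {label: Counter() for label in self.class_counts}
        for (headline, subreddit), label in zip(posts, labels):
            self.token_counts[label].update(self._features(headline, subreddit))
        self.vocab = set()
        for counts in self.token_counts.values():
            self.vocab.update(counts)
        return self

    def predict_proba(self, headline, subreddit):
        """Return {label: probability}; features unseen in training are ignored."""
        total = sum(self.class_counts.values())
        log_scores = {}
        for label, n in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n / total)
            for feat in self._features(headline, subreddit):
                if feat in self.vocab:
                    score += math.log((counts[feat] + self.alpha) / denom)
            log_scores[label] = score
        # Normalize in log space to avoid underflow on long headlines.
        peak = max(log_scores.values())
        exp_scores = {l: math.exp(s - peak) for l, s in log_scores.items()}
        z = sum(exp_scores.values())
        return {l: v / z for l, v in exp_scores.items()}


# Toy training data, invented for illustration only.
toy_posts = [
    ("Cute puppy learns to skate", "aww"),
    ("Puppy rescued from storm drain", "aww"),
    ("Quarterly tax filing reminder", "personalfinance"),
    ("Tax form question", "personalfinance"),
]
toy_labels = [1, 1, 0, 0]
toy_model = NaiveBayesHeadlines().fit(toy_posts, toy_labels)
toy_proba = toy_model.predict_proba("Puppy learns a trick", "aww")
```

Here toy_proba[1] is the estimated probability that the typed headline reaches the front page, which is the kind of number a user-facing prediction would report. An ensemble could average such probabilities across several different models.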
The evaluation metric would be accuracy: the fraction of posts for which the model correctly predicts whether or not they make it to the front page.
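A minimal sketch of that metric follows. The comparison against a majority-class baseline is an added assumption rather than part of the proposal; it is included because, if front-page posts turn out to be rare in the collected data, always predicting "not front page" would already score a high accuracy. The function names and labels are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of posts whose predicted label matches the true label."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_baseline_accuracy(y_true):
    """Accuracy obtained by always predicting the most common label."""
    if not y_true:
        raise ValueError("y_true must be non-empty")
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)


# Hypothetical labels: 1 = made the front page, 0 = did not.
true_labels = [0, 0, 0, 1, 0, 1, 0, 0]
predicted_labels = [0, 0, 1, 1, 0, 0, 0, 0]
model_accuracy = accuracy(true_labels, predicted_labels)
baseline_accuracy = majority_baseline_accuracy(true_labels)
```

A model is only useful by this measure if model_accuracy exceeds baseline_accuracy on held-out posts.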
The final product would be a web app that shows a word cloud and lets users type in a headline to get its predicted probability of reaching the front page.