naomiehl/MachineLearning_RetweetPrediction

Covid19_RT_Prediction

Twitter and Retweets

Twitter is a microblogging platform that allows you to send and receive short posts called tweets. Users may interact with each other by commenting on, liking and rebroadcasting a tweet. Rebroadcasting a tweet, which is called retweeting, shares the tweet unchanged with the user's followers. Retweeting can be seen as amplifying the spread of original content, and thus retweet prediction is a crucial task when studying information spreading processes.

Retweets allow one to track the flow of information on Twitter because they indicate situations where a user felt a tweet was important enough to share with their followers. For this reason, to predict information spreading on Twitter, we wish to predict the number of retweets a tweet might get. Applications that take advantage of the number of retweets include tracking the spread of fake news and mass emergency management.

Retweet prediction challenge

For this challenge, your mission is to accurately predict the number of retweets a tweet will get. The provided dataset is a small subset extracted from the COVID19 Twitter dataset that was collected by the DaSciM team during the first wave of lockdowns (March 2020). It contains tweet-related information, such as the text and the number of hashtags, mentions and URLs contained in the tweet, and user-related information, such as the number of followers and the number of tweets the user has published.

Your solution can be based on supervised or unsupervised techniques, or on a combination of both. You should aim for the minimum Mean Absolute Error (MAE).
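For reference, the MAE is the average absolute difference between the predicted and the actual retweet counts. Below is a minimal sketch in plain Python. The function name `mean_absolute_error` and the sample numbers are made up for illustration and are not taken from the challenge data. It also shows a constant median baseline, since the median is the constant prediction that minimizes MAE on the data it is computed from.

```python
from statistics import median


def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted| over all tweets."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical retweet counts, invented for this example only.
actual = [0, 2, 0, 15, 1]
predicted = [1, 2, 0, 10, 3]

# Absolute errors are 1, 0, 0, 5, 2, so the MAE is 8 / 5.
mae = mean_absolute_error(actual, predicted)

# Constant baseline: predict the median retweet count for every tweet.
baseline = median(actual)
baseline_mae = mean_absolute_error(actual, [baseline] * len(actual))

print(f"model MAE: {mae:.2f}, median-baseline MAE: {baseline_mae:.2f}")
```

A model is only worth keeping if its MAE on held-out tweets is lower than that of such a constant baseline.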

About

Project INF554 - Ecole Polytechnique
