Code to reconstruct Twitter cascades given a set of tweets and the follower lists of the participants in the conversation.
The code in the encoding_twitter_cascades notebook performs the following steps:
- read tweets from a mongodb collection.
- extract quoted embedded tweets.
- extract retweeted embedded tweets.
- for each tweet, define its actionType (tweet, retweet, retweet of quote, etc.) and set rootID, parentID, or provisoryParentID accordingly (see encoding_cascade_functions.checkActionType).
- find best parentID to replace provisoryParent
- find parentID for retweets and quotes
- find rootID for replies and those initially with provisoryParent
- at this point, cascades are reconstructed and can be exported in the Socialsim output format.
- compute several metrics/features to generate a cascade summary collection and save to mongo.
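
The parent and root resolution steps above can be illustrated with a minimal sketch. This is not the notebook's implementation: the field names (`id`, `user`, `created_at`), the helper names `infer_parent` and `find_root`, and the heuristic itself (pick the most recent earlier tweet in the cascade whose author the retweeter follows) are assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Set


def infer_parent(tweet: dict, earlier: List[dict],
                 following: Dict[str, Set[str]]) -> Optional[str]:
    """Hypothetical heuristic: return the id of the most recent earlier tweet
    whose author is followed by the author of `tweet`, or None if there is none."""
    followed = following.get(tweet["user"], set())
    candidates = [
        t for t in earlier
        if t["created_at"] < tweet["created_at"] and t["user"] in followed
    ]
    if not candidates:
        return None  # caller may keep the provisional parent in this case
    return max(candidates, key=lambda t: t["created_at"])["id"]


def find_root(tweet_id: str, parent_of: Dict[str, Optional[str]]) -> str:
    """Follow parent links upward until a tweet with no parent is reached.
    The `seen` set guards against accidental cycles in the parent map."""
    seen: Set[str] = set()
    current = tweet_id
    while parent_of.get(current) is not None and current not in seen:
        seen.add(current)
        current = parent_of[current]
    return current


# Toy data (made up): alice tweets, bob retweets, carol retweets later.
tweets = [
    {"id": "t1", "user": "alice", "created_at": 1},
    {"id": "t2", "user": "bob", "created_at": 2},
    {"id": "t3", "user": "carol", "created_at": 3},
]
# following[u] is the set of accounts that user u follows.
following = {"bob": {"alice"}, "carol": {"bob"}}

parent_of: Dict[str, Optional[str]] = {"t1": None}
for i, tw in enumerate(tweets[1:], start=1):
    inferred = infer_parent(tw, tweets[:i], following)
    # Fall back to the original tweet as a provisional parent.
    parent_of[tw["id"]] = inferred if inferred is not None else "t1"

roots = {tid: find_root(tid, parent_of) for tid in parent_of}
```

In this toy case carol follows bob but not alice, so her retweet is attached to bob's retweet rather than directly to the original tweet, while every tweet still resolves to the same root.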