This is the base code for "Dynamics of Polarizing Rhetoric in Congressional Tweets".
- install PyTorch (a CUDA build is recommended if a GPU is available)
- clone this repository and install the requirements
- install MongoDB, then set up a database named wanderingpole and a collection named tweets
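The database step above can be sketched with pymongo, assuming a local MongoDB server on the default port. MongoDB creates databases and collections lazily on first write, so the setup mostly amounts to writing to the right names. The helpers `connect` and `store_tweets`, and the choice to use the tweet `id` as `_id`, are illustrative assumptions and not code from this repository.

```python
def connect(uri="mongodb://localhost:27017"):
    """Return the tweets collection in the wanderingpole database."""
    # Imported lazily so store_tweets can be exercised without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(uri)
    return client["wanderingpole"]["tweets"]


def store_tweets(collection, tweets):
    """Upsert tweets keyed on their id so repeated runs do not create duplicates."""
    stored = 0
    for tweet in tweets:
        doc = dict(tweet)
        # Assumption: every tweet dict carries an "id" field.
        doc["_id"] = doc.pop("id")
        collection.replace_one({"_id": doc["_id"]}, doc, upsert=True)
        stored += 1
    return stored
```

Calling `store_tweets(connect(), tweets)` would write into wanderingpole.tweets; keying on `_id` means collecting the same tweet twice overwrites it instead of duplicating it.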
With the new academic license for Twitter, it is possible to collect the full tweet history of each user. This is done using collect.py in the collectingtweets directory.
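As a rough illustration of what collecting a user's full history involves, the sketch below follows pagination tokens until the results are exhausted. It is not the contents of collect.py. It assumes the Twitter API v2 full-archive search endpoint as documented for the academic track, a bearer token you supply, and an arbitrary `start_time`; the function names are hypothetical.

```python
def collect_user_tweets(fetch_page, user_query):
    """Yield every tweet matching user_query, following pagination tokens."""
    next_token = None
    while True:
        page = fetch_page(user_query, next_token)
        yield from page.get("data", [])
        next_token = page.get("meta", {}).get("next_token")
        if next_token is None:
            break


def fetch_page_from_twitter(query, next_token, bearer_token):
    """Fetch one page of results (requires network access and valid credentials)."""
    import requests  # third-party; imported lazily

    # Assumption: an early start_time is passed so the search is not limited
    # to the endpoint's default recent window.
    params = {"query": query, "max_results": 500, "start_time": "2010-01-01T00:00:00Z"}
    if next_token:
        params["next_token"] = next_token
    response = requests.get(
        "https://api.twitter.com/2/tweets/search/all",
        headers={"Authorization": f"Bearer {bearer_token}"},
        params=params,
        timeout=30,
    )
    response.raise_for_status()
    return response.json()
```

To use it against the live API, bind the token with `functools.partial(fetch_page_from_twitter, bearer_token=...)` and pass a query such as `from:<username>`. Keeping the pagination loop separate from the HTTP call makes it easy to add rate-limit handling or to test the loop offline.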
Training the classifier relies on the terrific HuggingFace versions of RoBERTa/BERTweet. Run it with train_classifier.py.
The latest (pretrained) model we use for classifying polarization is available on HuggingFace.
After training a model (or downloading the pretrained model), classification can be run on the tweets in the MongoDB collection using the files in the classify_tweets directory.
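The classification step can be pictured as a loop that reads unlabeled tweets from the collection in batches, scores them, and writes the labels back. The sketch below is an illustration under assumptions, not the code in classify_tweets: the field names `text`, `polarizing`, and `polarizing_score` are hypothetical, and `model_name` stands for whichever model you trained or downloaded.

```python
def batched(items, size):
    """Yield successive lists of at most `size` items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def classify_collection(collection, classify, batch_size=32):
    """Label every tweet that has no label yet and write the result back."""
    # Assumption: tweet text lives under "text"; results go under "polarizing".
    cursor = collection.find({"polarizing": {"$exists": False}}, {"text": 1})
    updated = 0
    for batch in batched(cursor, batch_size):
        predictions = classify([doc["text"] for doc in batch])
        for doc, pred in zip(batch, predictions):
            collection.update_one(
                {"_id": doc["_id"]},
                {"$set": {"polarizing": pred["label"], "polarizing_score": pred["score"]}},
            )
            updated += 1
    return updated


def load_classifier(model_name):
    """Build a HuggingFace text-classification pipeline for the given model."""
    from transformers import pipeline  # third-party; imported lazily

    return pipeline("text-classification", model=model_name)
```

A HuggingFace text-classification pipeline called on a list of strings returns one dict per input with `label` and `score` keys, which is the shape `classify_collection` expects. Filtering on tweets without a label lets an interrupted run resume where it stopped.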
Various analysis scripts are available in the analyses directory.