Twitter reply-to-retweet ratio scraping code
This is the scraping and front-end code used to acquire and visualize the data discussed in A Quick Look at the Reply-to-Retweet Ratio.
Requirements: Python 3.6+ (f-strings!)
    $ git clone email@example.com:fastforwardlabs/tweetratio.git
    $ cd tweetratio
    $ python3 -m virtualenv venv
    $ source venv/bin/activate
    $ pip install -r requirements.txt
    $ mkdir -p raw minified csv  # for output
To download realDonaldTrump's last 3200 tweets as JSON and add a
reply_count field to each tweet, do
    >>> import tweetratio
    >>> tweetratio.get_user('realDonaldTrump')
This code has to scrape as well as make API calls, so it will take 30-60 minutes, depending on the speed of your internet connection.
The tweets can then be found in
If you want a minified copy of the tweets, which contains only the keys necessary for the visualization, and the same data as a CSV file, do
    >>> import analysis
    >>> analysis.process('realDonaldTrump')
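As a rough picture of what "minified" means here, the sketch below keeps a fixed set of keys per tweet and serializes the same rows as CSV. It is not the code in analysis.py: the key list `KEEP_KEYS` and the helper names `minify` and `to_csv` are assumptions made for illustration, and the real set of keys used by the visualization may differ.

```python
import csv
import io

# Hypothetical key list; the keys analysis.py actually keeps may differ.
KEEP_KEYS = ("id_str", "created_at", "retweet_count", "reply_count")


def minify(tweets, keys=KEEP_KEYS):
    """Return a copy of each tweet dict containing only the given keys."""
    return [{key: tweet.get(key) for key in keys} for tweet in tweets]


def to_csv(rows, keys=KEEP_KEYS):
    """Serialize minified rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=keys, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


full_tweets = [
    {"id_str": "1", "created_at": "x", "retweet_count": 2,
     "reply_count": 4, "text": "dropped by minify"},
]
small_tweets = minify(full_tweets)
csv_text = to_csv(small_tweets)
```

Dropping unused keys keeps the files served to the browser small, which matters when each account contributes thousands of tweets.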
The minified JSON is saved to
minified/realDonaldTrump.json. The CSV is saved
To run the visualization
locally, download and minify the data for
(see above). If you'd like to plot other accounts, download those and change
    $ mv minified/* web/data/
    $ cd web
    $ python3 -m http.server
analysis.py contains simple code to load the tweets as a pandas DataFrame.
    >>> import analysis
    >>> tweets = analysis.load_df()
    >>> analysis.plot_trend(tweets)
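For readers who want the underlying quantity without pandas, here is a minimal sketch of computing the reply-to-retweet ratio from plain tweet dictionaries. It assumes each tweet carries the reply_count field added by the scraper and a `retweet_count` field as found in Twitter API tweet objects. The function name `reply_to_retweet_ratio` and the sample data are made up for illustration and are not part of this repository.

```python
def reply_to_retweet_ratio(tweet):
    """Return reply_count / retweet_count, or None when it is undefined."""
    replies = tweet.get("reply_count")
    retweets = tweet.get("retweet_count")
    # The ratio is undefined if a count is missing or there are no retweets.
    if replies is None or not retweets:
        return None
    return replies / retweets


# Made-up sample data, not real tweets.
sample_tweets = [
    {"id_str": "1", "reply_count": 300, "retweet_count": 100},
    {"id_str": "2", "reply_count": 50, "retweet_count": 200},
    {"id_str": "3", "reply_count": 10, "retweet_count": 0},
]
ratios = {t["id_str"]: reply_to_retweet_ratio(t) for t in sample_tweets}
```

Returning None for tweets with no retweets avoids a division by zero and lets the caller decide whether to drop or flag those tweets.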
batch_download.py demonstrates how to download the
tweets for a list of users (e.g. the U.S. senators as of June 2017).
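The general shape of a batch download can be sketched as a loop that fetches each account in turn and records failures instead of stopping. This is an illustration only, not the contents of batch_download.py: the `batch_download` helper, its `fetch` and `pause` parameters, and the failure handling are all assumptions.

```python
import time


def batch_download(users, fetch, pause=0.0):
    """Call fetch(user) for each user; return the users that failed.

    `fetch` is any callable taking a screen name, and `pause` is an
    optional delay in seconds between accounts.
    """
    failed = []
    for user in users:
        try:
            fetch(user)
        except Exception as exc:
            # Keep going so one bad account does not abort a long run.
            print(f"{user}: download failed ({exc})")
            failed.append(user)
        time.sleep(pause)
    return failed


def fake_fetch(user):
    """Stand-in for a real downloader; fails for one made-up name."""
    if user == "missing_account":
        raise ValueError("no such user")


failed_users = batch_download(["alice", "missing_account", "bob"], fake_fetch)
```

With the repository installed, passing `tweetratio.get_user` as `fetch` should download each account in the list. Since a single account can take 30-60 minutes, collecting the failures makes it possible to retry only those afterwards.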