US COVID tweets with sentiments and geolocations

This repository contains Tweet IDs and sentiment classifications (Positive, Neutral, Negative, Mixed) for all 12,670,890 tweets analyzed in the paper "Online geolocalized emotion across US cities during the COVID crisis: Universality, policy response, and connection with local mobility". All Tweet IDs in the repository have 'geo' or 'user_location' attributes associated with them, which can be extracted after hydrating the IDs. IDs are organized into folders by date and stored along with their sentiments in CSV files with columns "ID" and "Sentiment". A suggested resource for hydrating the tweets is the Hydrator GUI, which provides a user-friendly interface and progress indicators. Details on the sentiment classification and geolocation aggregation procedures used in the manuscript can be found in the published manuscript cited below.
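As a rough illustration of working with the layout described above, the following Python sketch reads every CSV file under a local copy of the repository, tallies the sentiment labels, and writes the IDs to a plain-text file with one ID per line, which is the kind of input hydration tools typically expect. The function names, the assumption that each CSV sits directly inside its date folder, and the output file name are illustrative choices, not part of the dataset; only the "ID" and "Sentiment" column names come from the description above.

```python
import csv
from collections import Counter
from pathlib import Path


def load_sentiments(root):
    """Yield (folder_name, tweet_id, sentiment) for every CSV row under root.

    Assumes each CSV has the columns "ID" and "Sentiment" and that the name of
    the folder containing it is the date (an assumption about the layout).
    """
    for csv_path in sorted(Path(root).rglob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # Keep IDs as strings so the long integers are never rounded.
                yield csv_path.parent.name, row["ID"].strip(), row["Sentiment"].strip()


def sentiment_counts(root):
    """Count tweets per sentiment label across all CSV files under root."""
    return Counter(sentiment for _, _, sentiment in load_sentiments(root))


def write_id_list(root, out_path):
    """Write one Tweet ID per line to out_path and return how many were written."""
    written = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for _, tweet_id, _ in load_sentiments(root):
            out.write(tweet_id + "\n")
            written += 1
    return written
```

For example, `sentiment_counts("path/to/repo")` would return a `Counter` keyed by sentiment label, and `write_id_list("path/to/repo", "ids.txt")` would produce a file that can be passed to a hydration tool; both paths here are placeholders.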

To comply with Twitter’s Terms of Service, we are only publicly releasing the Tweet IDs and inferred sentiments of the collected Tweets, and the data is released for non-commercial research use only. If you use this dataset, you should remain in compliance with Twitter’s Terms of Service, and cite the following manuscript:

Feng, S., Kirkley, A. Integrating online and offline data for crisis management: Online geolocalized emotion, policy response, and local mobility during the COVID crisis. Sci Rep 11, 8514 (2021)


