# Test `tweepy` functionality

Steps:
1. Create an app at https://developer.twitter.com/en/apps.

* What is your primary reason for using Twitter developer tools? --> Hobbyist: Exploring the API.
* How will you use the Twitter API or Twitter data? --> I am interested in tracking changes in opinions and attitudes during the coronavirus pandemic to identify events or people who are more influential in driving changes. The code and analysis results will be freely shared on Github.
* Please describe how you will analyze Twitter data including any analysis of Tweets or Twitter users. --> I plan to perform sentiment analysis on tweets from US users after specific events, e.g. new CDC recommendations, shelter-in-place orders, travel restriction announcements. 
* Please describe how and where Tweets and/or data about Twitter content will be displayed outside of Twitter. --> I will share analysis results on my Github page github.com/lucymli/tablecloth and on my blog at lucymli.com.

2. Wait for approval from Twitter.

Twitter requested additional information:

The core use case, intent, or business purpose for your use of the Twitter APIs.
* Use of Twitter APIs would be purely for research purposes. I am interested in tracking changes in opinions and attitudes during the coronavirus pandemic to identify events or people who are more influential in driving changes. The code and analysis results will be shared on Github.

If you intend to analyze Tweets, Twitter users, or their content, share details about the analyses you plan to conduct, and the methods or techniques.
* I will produce time series of tweets related to COVID-19 for different locations around the world, train deep learning models to infer sentiment (towards pandemic restrictions), and identify correlations between changes in public health orders/incidence/COVID-19 media reports and changes in sentiment.

If your use involves Tweeting, Retweeting, or liking content, share how you’ll interact with Twitter accounts, or their content.
* I will not tweet, retweet, or like content.

If you’ll display Twitter content off of Twitter, explain how, and where, Tweets and Twitter content will be displayed with your product or service, including whether Tweets and Twitter content will be displayed at row level, or aggregated.
* No individual tweets will be displayed, and no user names will be revealed. Only aggregated data will be displayed.

3. Go to the API Keys tab, and copy the Consumer Key and Consumer Secret Keys.
> Name your app: twitter_covid_tablecloth

4. Click 'Create my access token' and copy the Access Token and Access Token Secret.
5. Create a config file named 'tweepy.ini' with the following contents:

[twitter]\
ConsumerKey = [Consumer Key]\
ConsumerSecret = [Consumer Secret Key]\
AccessToken = [Access Token]\
AccessTokenSecret = [Access Token Secret]
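
Before authenticating, it can help to confirm that the config file contains all four values. The following is a minimal sketch, assuming the `[twitter]` section and key names shown above; `load_twitter_credentials` and `REQUIRED_KEYS` are hypothetical names introduced here, and the sample values are placeholders rather than real credentials.

```python
import configparser

# Key names assumed to match the tweepy.ini layout shown above.
REQUIRED_KEYS = ("ConsumerKey", "ConsumerSecret", "AccessToken", "AccessTokenSecret")


def load_twitter_credentials(config_text):
    """Parse INI text and return the four [twitter] credentials as a dict."""
    config = configparser.ConfigParser()
    config.read_string(config_text)
    if not config.has_section("twitter"):
        raise KeyError("missing [twitter] section")
    section = config["twitter"]
    # Treat absent and empty values the same way.
    missing = [key for key in REQUIRED_KEYS if not section.get(key)]
    if missing:
        raise KeyError(f"missing or empty keys: {', '.join(missing)}")
    return {key: section[key] for key in REQUIRED_KEYS}


# Placeholder values for illustration only.
sample = """
[twitter]
ConsumerKey = ck-placeholder
ConsumerSecret = cs-placeholder
AccessToken = at-placeholder
AccessTokenSecret = ats-placeholder
"""
credentials = load_twitter_credentials(sample)
```

To check the real file, pass its contents instead, e.g. `load_twitter_credentials(open('tweepy.ini').read())`.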

## Set up Twitter authentication

In [115]:
import tweepy
import configparser
import zipcodes
import pandas as pd
import censusdata
import us

config = configparser.ConfigParser()
config.read('tweepy.ini')


auth = tweepy.OAuthHandler(config["twitter"]["ConsumerKey"], config["twitter"]["ConsumerSecret"])
auth.set_access_token(config["twitter"]["AccessToken"], config["twitter"]["AccessTokenSecret"])

api = tweepy.API(auth)

## Get tweets from the home timeline

In [30]:
public_tweets = api.home_timeline()
for tweet in public_tweets:
    print(tweet.text)

wait, "go away for a year and a half until it disappears"? so it's not going anywhere for a year and a half? #Debates2020
It’s the only thing Trump can do https://t.co/oPyLfY02yt
I’ve been waiting for @Sarah_Brayne’s book! Can’t wait to read it. https://t.co/I994opdZqH
yes masks, yes [easy, affordable, and broad] access to rapid testing! #Debates2020
RT @CharlesPPierce: Goggles??
RT @ddale8: Trump's first sentence is false. 2.2 million people were not "expected to die."

That was an estimate for *what would happen if…
if there's a cure, why do we need ventilators? #Debates2020
RT @DrTsion: Confirming Amy Coney Barrett will end abortion care as we know it | Opinion https://t.co/ZerUWfSgYQ #SRHR by Dr. Jenn Conti @D…
5 days to go until the leading data science virtual training conference! Don’t miss your chance to learn from the b… https://t.co/cr86IBtmCx
Live poultry feeding and trading network and the transmission of avian influenza A(H5N6) virus in a large city in C… https://t.co/IC7C

## Search keywords

In [43]:
searchQuery = 'COVID OR COVID-19 OR coronavirus'  # Keywords; the OR operator must be uppercase

new_tweets = api.search(q=searchQuery, count=100, 
                        result_type = "recent",
                        lang = "en") 
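
Twitter's standard search treats an uppercase `OR` between terms as the "either term" operator and a double-quoted string as an exact phrase. The query string can be built from a keyword list, as in this minimal sketch; `build_or_query` is a hypothetical helper introduced here, not part of `tweepy`.

```python
def build_or_query(keywords):
    """Join keywords with the uppercase OR operator, quoting multi-word phrases."""
    terms = []
    for keyword in keywords:
        keyword = keyword.strip()
        if not keyword:
            continue  # skip empty entries
        # Multi-word phrases are wrapped in double quotes for exact matching.
        terms.append(f'"{keyword}"' if " " in keyword else keyword)
    return " OR ".join(terms)


query = build_or_query(["COVID", "COVID-19", "coronavirus"])
print(query)
```

The resulting string can be passed as the `q` argument of `api.search`.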

In [40]:
new_tweets[0]._json["text"]

'RT @Healthline: Nearly a quarter of hospitalized coronavirus patients have experienced long-term heart damage, including myocardial injury,…'
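
The text above ends in "…" because the `text` field of a retweet is truncated. If the search is run with `tweet_mode="extended"`, each tweet's JSON carries a `full_text` field, and for retweets the untruncated text sits under `retweeted_status`. The sketch below works on plain dictionaries shaped like `tweet._json`; `full_text` is a hypothetical helper name and the sample dictionary is made up for illustration.

```python
def full_text(status_json):
    """Return the untruncated text of a tweet dict, falling back to 'text'."""
    retweet = status_json.get("retweeted_status")
    if retweet:
        # For retweets, the original tweet holds the complete text.
        author = retweet.get("user", {}).get("screen_name", "")
        body = retweet.get("full_text") or retweet.get("text", "")
        return f"RT @{author}: {body}" if author else body
    return status_json.get("full_text") or status_json.get("text", "")


# Made-up payload mimicking the shape of tweet._json for a retweet.
sample = {
    "text": "RT @example_user: A long tweet that gets cut…",
    "retweeted_status": {
        "full_text": "A long tweet that gets cut off in the retweet preview.",
        "user": {"screen_name": "example_user"},
    },
}
print(full_text(sample))
```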

## Search by geography

In [78]:
county_census_geo = censusdata.censusgeo([('county', '*')])

In [91]:
county_census_names = censusdata.geographies(county_census_geo, 'acs1', 2018).keys()

In [119]:
# Refer to https://www.fgdl.org/metadata/metadata_archive/fgdl_html/cenacs_dec11.htm
# for variables e.g. B01003_001E is total population size
# Note: assign(name=...) assumes geographies() and download() return counties in the same order
county_census = (
    censusdata.download('acs1', 2018, county_census_geo, ['B01003_001E'])
    .rename(columns={"B01003_001E": "total_pop_size"})
    .assign(name=list(county_census_names))
)
# Names look like "Davis County, Utah"; strip the space left after the comma
county_census["county"] = county_census["name"].apply(lambda x: x.split(",")[0].strip())
county_census["state_name"] = county_census["name"].apply(lambda x: x.split(",")[1].strip())
county_census["state"] = county_census["state_name"].apply(lambda x: us.states.lookup(x).abbr)

In [64]:
zips = pd.DataFrame(zipcodes.list_all()).astype({"lat": float, "long": float})

In [67]:
county_zips = zips[zips["county"]!=""].groupby(["county", "state"])[["lat", "long"]].median().reset_index()

In [121]:
county_info = pd.merge(county_zips, county_census, on=["county", "state"])

Sort counties by 2018 total population (`total_pop_size`), largest first

In [124]:
county_info.sort_values(by="total_pop_size", ascending=False)

| | county | state | lat | long | total_pop_size | name | state_name |
|---|---|---|---|---|---|---|---|
| 200 | Davis County | UT | 41.03875 | -111.93980 | 10105518 | Davis County, Utah | Utah |
| 403 | Lake County | OH | 41.66620 | -81.33990 | 5180493 | Lake County, Ohio | Ohio |
| 168 | Collier County | FL | 26.14180 | -81.71680 | 4698619 | Collier County, Florida | Florida |
| 42 | Bastrop County | TX | 30.11685 | -97.33600 | 4410824 | Bastrop County, Texas | Texas |
| 198 | Davidson County | TN | 36.16560 | -86.78195 | 3343364 | Davidson County, Tennessee | Tennessee |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 389 | Klamath County | OR | 42.39300 | -121.70990 | 64227 | Klamath County, Oregon | Oregon |
| 737 | Toa Alta Municipio | PR | 18.35780 | -66.25810 | 63746 | Toa Alta Municipio, Puerto Rico | Puerto Rico |
| 357 | Jefferson County | MO | 38.28080 | -90.46515 | 63711 | Jefferson County, Missouri | Missouri |
| 602 | Providence County | RI | 41.84890 | -71.46180 | 63227 | Providence County, Rhode Island | Rhode Island |


In [125]:
new_tweets = county_info.sort_values(by="total_pop_size", ascending=False).head(n=100).\
apply(lambda x: f'{x["lat"]},{x["long"]},10mi', axis=1).\
apply(lambda x: api.search(q=searchQuery, geocode=x, result_type = "recent", lang = "en"))


Recent tweets posted within 10 miles of the median ZIP code coordinates of the 100 most populous counties in the US

In [129]:
new_tweets.iloc[0][0]._json["text"]

'RT @LaytonFYI: With high levels of COVID-19, face coverings are required &amp; social gatherings are limited to 10 people or less. Keep 6 ft. o…'
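
The `geocode` argument used in the search above is a single string of the form `latitude,longitude,radius`, where the radius carries a `mi` or `km` unit. A minimal sketch of building and validating that string follows; `geocode_string` is a hypothetical helper name, and the coordinates are the Davis County values from the table above.

```python
def geocode_string(lat, long, radius, unit="mi"):
    """Format a 'lat,long,radius' string for the search geocode parameter."""
    if unit not in ("mi", "km"):
        raise ValueError("unit must be 'mi' or 'km'")
    if not -90 <= lat <= 90:
        raise ValueError("latitude must be between -90 and 90")
    if not -180 <= long <= 180:
        raise ValueError("longitude must be between -180 and 180")
    if radius <= 0:
        raise ValueError("radius must be positive")
    return f"{lat},{long},{radius}{unit}"


geocode = geocode_string(41.03875, -111.9398, 10)
print(geocode)
```

Validating up front avoids spending one of the rate-limited search calls on a malformed request.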

## Get user information

In [16]:
user = api.get_user('twitter')