## References

#### Twitter Data Mining: A Guide to Big Data Analytics Using Python, Anthony Sistilli
- https://www.toptal.com/python/twitter-data-mining-using-python

#### Mining Twitter Data with Python (Part 1: Collecting data), Marco Bonzanini
- https://marcobonzanini.com/2015/03/02/mining-twitter-data-with-python-part-1/

#### Tweepy Documentation, v3.6.0
- http://tweepy.readthedocs.io/en/v3.6.0/index.html
- API Reference http://tweepy.readthedocs.io/en/v3.6.0/api.html#api-reference

#### Twitter Developer Docs
- https://developer.twitter.com/en/docs
- Search Tweets (Guides) https://developer.twitter.com/en/docs/tweets/search/guides/standard-operators
- Search Tweets (API Reference) https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets
- Introduction to Tweet JSON https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/intro-to-tweet-json
- Tweet Objects https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object

#### How to use Twitter’s Search REST API most effectively., Bhaskar Karambelkar
- https://www.karambelkar.info/2015/01/how-to-use-twitters-search-rest-api-most-effectively./

#### Stack Overflow
- Questions tagged [tweepy] https://stackoverflow.com/questions/tagged/tweepy
- -filter:retweets https://stackoverflow.com/questions/38872195/tweepy-exclude-retweets

## Load Packages

In [1]:
import numpy as np
import pandas as pd
import tweepy

In [2]:
import json

## Assign Authentication Keys, Tokens and Secrets

In [3]:
import config

consumer_key = config.twitter_anidata_consumer_key
consumer_secret = config.twitter_anidata_consumer_secret
access_token = config.twitter_anidata_access_token
access_token_secret = config.twitter_anidata_access_token_secret
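
The `config` module imported above is not shown in this notebook; it is presumably a local file kept out of version control. As a minimal sketch of one alternative, the same four values could be read from environment variables. The variable names and the `load_twitter_credentials` helper below are hypothetical, not part of the original setup.

```python
import os

# Hypothetical environment variable names; the notebook's own config module is not shown.
REQUIRED_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_twitter_credentials(env=None):
    """Return the four credentials as a dict, failing early if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing credentials: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Failing early with the list of missing names tends to be easier to debug than an authentication error raised later by the API.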

## Create API Object

In [4]:
# Creating the authentication object
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
# Setting access token and secret
auth.set_access_token(access_token, access_token_secret)
# Creating the API object while passing in auth information
api = tweepy.API(auth) 

## About API Arguments

http://tweepy.readthedocs.io/en/v3.6.0/api.html#api-reference

API.search(q[, lang][, locale][, rpp][, page][, since_id][, geocode][, show_user])
Returns tweets that match a specified query.

Parameters:
- q – the search query string
- lang – Restricts tweets to the given language, given by an ISO 639-1 code.
- locale – Specify the language of the query you are sending. This is intended for language-specific clients and the default should work in the majority of cases.
- rpp – The number of tweets to return per page, up to a max of 100.
   - `rpp` has been replaced by `count`
- page – The page number (starting at 1) to return, up to a max of roughly 1500 results (based on rpp * page).
- since_id – Returns only statuses with an ID greater than (that is, more recent than) the specified ID.
- geocode – Returns tweets by users located within a given radius of the given latitude/longitude. The location is preferentially taken from the Geotagging API, but will fall back to their Twitter profile. The parameter value is specified by “latitude,longitude,radius”, where radius units must be specified as either “mi” (miles) or “km” (kilometers). Note that you cannot use the near operator via the API to geocode arbitrary locations; however you can use this geocode parameter to search near geocodes directly.
- show_user – When true, prepends “<user>:” to the beginning of the tweet. This is useful for readers that do not display Atom’s author field. The default is false.

Return type: list of SearchResults objects
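
The `geocode` format described above is easy to get wrong by hand, so here is a small sketch of a helper that assembles it. `build_geocode` is a hypothetical helper, and the coordinates are illustrative values (roughly downtown Atlanta), not taken from this notebook.

```python
def build_geocode(latitude, longitude, radius, unit="mi"):
    """Format a geocode value as 'latitude,longitude,radius<unit>'."""
    if unit not in ("mi", "km"):
        raise ValueError("unit must be 'mi' or 'km'")
    if radius <= 0:
        raise ValueError("radius must be positive")
    return "{},{},{}{}".format(latitude, longitude, radius, unit)


# Illustrative search arguments; only q, lang and count appear in this notebook.
search_kwargs = {
    "q": "#georgia -filter:retweets",
    "lang": "en",
    "count": 100,  # `count` replaces the older `rpp`
    "geocode": build_geocode(33.749, -84.388, 25, "mi"),
}
```

Assuming an authenticated `api` object, these could then be passed as `api.search(**search_kwargs)`.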

## Scrape and Explore

In [5]:
# read our own timeline (i.e. our Twitter homepage)
for status in tweepy.Cursor(api.home_timeline).items(10):
    # Process a single status
    print(status.text)

https://t.co/IAyGjIfRmd
Grilling Tips and Techniques https://t.co/OgertcNftB
How T.I. rose to fame—and his next act: saving the neighborhood that could've killed him https://t.co/rsx9cueH0Q
Hackers meddled in the 2016 U.S. presidential election. Now cyber companies are offering states free help. https://t.co/Nnzb8CNQCD
10 popular techniques used by manipulators (and how to fight them) https://t.co/Dkf6sGdewK
Our Moon May Have Briefly Harbored Life, Say Astrobiologists
https://t.co/7TlexGbCty
20+ Great Patios For Eating &amp; Drinking In Washington, DC https://t.co/4H8OjgQSzW via @washingtondc
#DYK that GDOT is planning, constructing or has completed a number of P3 projects? What's a "P3" you might ask? Che… https://t.co/8QP3PoWH93
5 Pro-Tips For Data Scientists To Write Good Code https://t.co/oOvCPdtgMv
Opinion: The documents show the bureau relied heavily on the Steele dossier https://t.co/3NR4VryS0g


In [6]:
def process_or_store(tweet):
    print(json.dumps(tweet))
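
`process_or_store` only prints each tweet. If the goal is to keep the raw data, one common approach is to append each tweet to a file as one JSON object per line. The sketch below assumes that layout; `store_tweet`, `load_tweets` and the file path are hypothetical names, not part of the original notebook.

```python
import json


def store_tweet(tweet, path):
    """Append one tweet dict (e.g. status._json) to `path` as a single JSON line."""
    with open(path, "a", encoding="utf-8") as handle:
        # json.dumps escapes newlines inside strings, so each tweet stays on one line
        handle.write(json.dumps(tweet) + "\n")


def load_tweets(path):
    """Read back every tweet previously written by store_tweet."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

Keeping the full JSON means fields that are not needed today (entities, user details, timestamps) remain available without querying the API again.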

In [7]:
# process/store the JSON
for status in tweepy.Cursor(api.home_timeline).items(10):
    # Process a single status
    process_or_store(status._json)

{"created_at": "Tue Jul 24 23:06:09 +0000 2018", "id": 1021894333237215233, "id_str": "1021894333237215233", "text": "https://t.co/IAyGjIfRmd", "truncated": false, "entities": {"hashtags": [], "symbols": [], "user_mentions": [], "urls": [{"url": "https://t.co/IAyGjIfRmd", "expanded_url": "http://www.forbes.com/sites/jamos/2018/07/24/will-the-curtain-close-on-small-town-movie-theaters/?utm_source=TWITTER&utm_medium=social&utm_term=Dottie/#646f7474696", "display_url": "forbes.com/sites/jamos/20\u2026", "indices": [0, 23]}]}, "source": "<a href=\"http://www.forbes.com\" rel=\"nofollow\">Malorie</a>", "in_reply_to_status_id": null, "in_reply_to_status_id_str": null, "in_reply_to_user_id": null, "in_reply_to_user_id_str": null, "in_reply_to_screen_name": null, "user": {"id": 91478624, "id_str": "91478624", "name": "Forbes", "screen_name": "Forbes", "location": "New York, NY", "description": "Official Twitter account of https://t.co/LUUqtjU6Xh, homepage for the world's business leaders.", "u

### Assign API Argument Values

In [8]:
# The search term you want to find
query = '#georgia -filter:retweets'
# Language code (follows ISO 639-1 standards)
language = "en"
count = 100

### Search Twitter

In [9]:
results = api.search(q=query, lang=language, count=count, tweet_mode='extended')

for tweet in results:
    # Print the author's name and the full (untruncated) tweet text
    print(tweet.user.name, tweet.full_text)

happygoth Whew, just barely did the thing. #vote #runoff #georgia https://t.co/RTYs0wqATe
Respectfully Submitted #Georgia. Don’t be discouraged. All supporters of blue candidates! Soldier on! It’s your voice that could preserve America’s democracy. https://t.co/n5pKg4qElm
Jay Glatting @Golden_Isles Sea Island Most Expensive In #Georgia 
https://t.co/G7jyJz1ALv #RealEstate
Vahraz :), no words could ever describe this level of stupidity. Don’t forget who voted for him. :)
https://t.co/rE913uO6HY
#JasonSpencer #stupidass #publiceducation #georgia #Georgian
UAB Thank you to all of the bus drivers for caring for kids. They are unsung heroes. #BeKiND https://t.co/0uBI0akjcG #driver-recruitment-retention #Georgia # via @SBFMagazine
James Sinko Latest Marietta Barnes Mill Weather: Time 07:00PM, Temp: 80.4°F, Dew Point: 74.1°F, Feels Like: 85.3°F, Wind: SSE at 3mph #GAwx #Georgia #GaWxCond
Steve Wiltfong New #Georgia five-star DT commit Travon Walker is going to play a lot of MIKE backer this f

In [10]:
df = pd.DataFrame({"text": [x.full_text for x in results],
                   "name": [x.user.name for x in results]},
                  index=[x.id_str for x in results])


In [11]:
df.index.name = "tweet_id"

In [12]:
df

| tweet_id | name | text |
|---|---|---|
| 1021893574034694145 | happygoth | Whew, just barely did the thing. #vote #runoff... |
| 1021893496209383424 | Respectfully Submitted | #Georgia. Don’t be discouraged. All supporters... |
| 1021893271864389632 | Jay Glatting | @Golden_Isles Sea Island Most Expensive In #Ge... |
| 1021892857295261697 | Vahraz | :), no words could ever describe this level of... |
| 1021892845886570496 | UAB | Thank you to all of the bus drivers for caring... |
| 1021892797396344834 | James Sinko | Latest Marietta Barnes Mill Weather: Time 07:0... |
| 1021892748536934400 | Steve Wiltfong | New #Georgia five-star DT commit Travon Walker... |
| 1021892333493788672 | Evening Sports Page | 🚨BREAKING🚨\n\n#Georgia adds another 5⭐️ to the... |
| 1021892209854107649 | Rivulet Liqueur | Can’t wait until #SAVFW2018 cheers! #Savannah ... |
| 1021892028500717570 | Bae | How much y’all pay for #car #insurance in #Geo... |
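
The DataFrame above keeps only the author name and text. As a sketch of how more fields could be carried along, the hypothetical helper below flattens one tweet's JSON into a flat row. The field names follow the Tweet object documentation listed in the references; which fields are worth keeping is an assumption.

```python
def tweet_to_row(tweet_json):
    """Flatten one tweet dict (e.g. status._json) into a flat row of columns."""
    user = tweet_json.get("user", {})
    hashtags = tweet_json.get("entities", {}).get("hashtags", [])
    return {
        "tweet_id": tweet_json.get("id_str"),
        "created_at": tweet_json.get("created_at"),
        "name": user.get("name"),
        "screen_name": user.get("screen_name"),
        # Extended mode stores the text under full_text; otherwise fall back to text
        "text": tweet_json.get("full_text", tweet_json.get("text")),
        "hashtags": " ".join(tag["text"] for tag in hashtags),
    }
```

With the search results from this notebook, something like `pd.DataFrame([tweet_to_row(t._json) for t in results]).set_index("tweet_id")` should produce a wider table.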


In [13]:
len(results)

100
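
A single search call returns at most 100 tweets. To collect more, the Karambelkar article in the references describes walking backwards through results with `max_id`. The sketch below illustrates that idea; `collect_tweets` is a hypothetical helper, and it assumes the search callable accepts `q`, `count` and `max_id` keyword arguments and returns objects with an `id` attribute.

```python
def collect_tweets(search, query, total, per_call=100, **kwargs):
    """Call `search` repeatedly, paging backwards through tweet IDs via max_id."""
    collected = []
    max_id = None
    while len(collected) < total:
        remaining = total - len(collected)
        params = dict(kwargs, q=query, count=min(per_call, remaining))
        if max_id is not None:
            params["max_id"] = max_id
        batch = search(**params)
        if not batch:
            break  # no older tweets are available
        collected.extend(batch)
        # max_id is inclusive, so step just below the oldest ID seen so far
        max_id = min(tweet.id for tweet in batch) - 1
    return collected[:total]
```

With the objects defined earlier, a call might look like `collect_tweets(api.search, query, 300, lang=language, tweet_mode='extended')`. Rate limits still apply, so large totals may need pauses between calls.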

In [15]:
df.to_csv("anidata_twitter.csv")
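
Tweet text often contains commas and line breaks (note the `\n\n` in one of the rows above), so counting lines in the saved file can overstate the number of tweets. A minimal sketch of a sanity check using a real CSV parser follows; `count_csv_rows` is a hypothetical helper.

```python
import csv


def count_csv_rows(path):
    """Return (column names, number of data rows), handling quoted newlines."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
    return reader.fieldnames, len(rows)
```

Calling `count_csv_rows("anidata_twitter.csv")` after the export should report the same row count as `len(results)`.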