The **Twitter Search API** only returns results from the last 7 days. This means that if you are using Tweepy to search for anything - a word, a phrase, or a hashtag - you'll only get recent results. 

This *does not* apply to using the API to access a user's Tweets. Therefore, if your data mining is based on collecting tweets from a set of users, then you can obtain results older than 7 days, as long as you are getting tweets from timelines. 

In this notebook, I'm going to introduce a Python module called ``got3`` that allows you to execute searches for Tweets in any date range. It's a clever application of web scraping (which we discussed briefly in a previous lecture), where you emulate visiting a website, download the HTML, and extract specific results. Strangely enough, the *browser version* of search *does* allow searching for older tweets, so ``got3`` emulates visiting the browser search bar, "executes" a search per your parameters, and scrapes the results. In the Python environment, you get tweet objects that you can manipulate and store as normal.

If you decide to use ``got3``, then you can "experiment" using [Twitter's browser search](https://twitter.com/search-advanced?lang=en). If you can generate a set of criteria on Twitter's browser search that gets the tweets you need, then you can configure ``got3`` to get those tweets as well. 

There are two things we need to do to get ``got3`` working on our systems.

First, there's another library that is a dependency of ``got3``. It's called ``pyquery``. In your command line/terminal window, install ``pyquery`` with the following command.

```
pip install pyquery
```

Then, it's time to "install" ``got3`` itself. Unfortunately, ``got3`` is not available through ``pip``, the Python library manager we have been relying on all this time. Instead, we simply have to *put* the ``got3`` files in the right location. On Canvas, there is a zip file you can download. Unzip it, and put the entire directory (which is named ``got3``) in the directory from which you launch Python. As long as it's there, we're all set. 
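Before moving on, it can help to confirm that both pieces are where Python expects them. The following is a minimal sketch using only the standard library; the helper name ``package_status`` is my own and is not part of ``got3``. It assumes you run it from the same directory from which you launch Python.

```python
import importlib.util
import os


def package_status(name, folder="."):
    """Report whether `name` can be imported and whether a directory
    with that name sits inside `folder` (hypothetical helper)."""
    return {
        "importable": importlib.util.find_spec(name) is not None,
        "directory_present": os.path.isdir(os.path.join(folder, name)),
    }


# pyquery is installed with pip, so only "importable" matters for it.
print("pyquery:", package_status("pyquery"))
# got3 is copied by hand, so both values should be True once it is in place.
print("got3:", package_status("got3"))
```

If ``importable`` comes back ``False`` for ``got3``, the unzipped directory is probably not in the directory you launched Python from.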

In [1]:
import got3

To use ``got3``, we first have to make a ``tweetCriteria`` object that contains our search parameters, and then we have to pass that object to some code that actually retrieves the tweets. 

Here, I'll make a ``tweetCriteria`` object and then explain what's going on. 

In [2]:
tweetCriteria = got3.manager.TweetCriteria()
tweetCriteria.setQuerySearch("europe refugees")
tweetCriteria.setSince("2015-05-01")
tweetCriteria.setUntil("2015-09-30")
tweetCriteria.setMaxTweets(1000)

<got3.manager.TweetCriteria.TweetCriteria at 0x234162207b8>

The code 

```python
tweetCriteria = got3.manager.TweetCriteria()
```

initializes the ``tweetCriteria`` object. Once we have initialized it, we call several of its methods to specify our search criteria. 

Using ``.setQuerySearch``, I specify my search string. 

Using ``.setSince`` and ``.setUntil``, I indicate the window of time I am searching.

I use ``.setMaxTweets`` to indicate the number of tweets I want. 

Now that our ``tweetCriteria`` object is initialized *and* set up, we submit it to another method to actually retrieve the tweets.

In [3]:
tweets = got3.manager.TweetManager.getTweets(tweetCriteria)

The ``tweetCriteria`` object that you made serves as the argument to the method ``got3.manager.TweetManager.getTweets``. This returns an iterable containing all the tweets that match your criteria. 

Since we requested 1000 tweets, there are 1000 in the iterable.

In [5]:
len(tweets)

1000

Let's take a look at what information comes with an individual tweet.

In [12]:
tweet = tweets[2]

The tweet's author ID and username. You can use these to gather more tweets using either ``got3`` or Tweepy.

In [13]:
print(tweet.author_id)
print(tweet.username)

14361155
Refugees


The text of the tweet:

In [15]:
tweet.text

'Rescuers arrived too late to save them all http:// trib.al/wwFa4a3 #Europe #Greece #Syria pic.twitter.com/noIIBfu6Oz'

The number of favorites and retweets.

In [16]:
print(tweet.favorites)
print(tweet.retweets)

26
78


The date and the location (if the tweet has one). The location shown here comes from a different tweet, ``tweets[95]``.

In [33]:
print(tweet.date)
print(tweets[95].geo)

2015-09-30 04:43:19
Manhattan, NY


The hashtags in the Tweet, but as one string. 

In [34]:
hashtags = tweets[24].hashtags
print(hashtags)

#UniteBlue #GunSense #ENDTHENRA


You can split the hashtags using the string method ``split``.

In [35]:
hashtags = hashtags.split()
print(hashtags)

['#UniteBlue', '#GunSense', '#ENDTHENRA']


Same with mentions. 

In [36]:
mentions = tweets[338].mentions
print(mentions)
mentions = mentions.split()
print(mentions)

@UNHCR @YouTube
['@UNHCR', '@YouTube']


In [37]:
mentions

['@UNHCR', '@YouTube']
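Since the tweet objects are just bundles of attributes, storing them is straightforward. Here is a minimal sketch, under some assumptions, that copies the attributes shown above into a dictionary and writes one CSV row per tweet. The names ``FIELDS``, ``tweet_to_row``, and ``write_tweets_csv`` are my own, and the ``SimpleNamespace`` object is a made-up stand-in so the sketch runs without ``got3``; real tweets would come from ``getTweets``.

```python
import csv
import io
from types import SimpleNamespace

# Attribute names taken from the cells above.
FIELDS = ["author_id", "username", "date", "text", "favorites",
          "retweets", "geo", "hashtags", "mentions"]


def tweet_to_row(tweet):
    """Copy the known attributes into a plain dict (missing ones become '')."""
    return {field: getattr(tweet, field, "") for field in FIELDS}


def write_tweets_csv(tweets, handle):
    """Write a header plus one row per tweet to an open text file handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for tweet in tweets:
        writer.writerow(tweet_to_row(tweet))


# Hypothetical stand-in tweet with placeholder values, for illustration only.
example = SimpleNamespace(
    author_id=1, username="example_user", date="2015-09-30 04:43:19",
    text="placeholder text", favorites=0, retweets=0,
    geo="", hashtags="#one #two", mentions="@someone",
)

buffer = io.StringIO()
write_tweets_csv([example], buffer)
print(buffer.getvalue())
```

To write to disk instead, you could open a file with ``open("tweets.csv", "w", newline="", encoding="utf-8")`` and pass that handle in place of ``buffer``.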

# Criteria Objects

Once you've created a criteria object, there are a number of methods you can call to set the criteria for your search.

In [None]:
tweetCriteria = got3.manager.TweetCriteria()

* ``.setUsername`` restricts your search to a specific user.
* ``.setQuerySearch`` indicates what term or phrase the returned Tweets should contain. Hashtags work here as well.
* ``.setSince`` takes a string in the format ``YYYY-MM-DD`` and sets the start date for the search window.
* ``.setUntil`` takes a string in the format ``YYYY-MM-DD`` and sets the end date for the search window.
* ``.setMaxTweets`` indicates how many Tweets to return. If you don't set this criterion, the code will attempt to get *ALL* tweets that meet the other criteria. Note that if your search criteria are vague enough or the topic is popular enough, you might get a huge number of tweets. You may want to compartmentalize your searches: instead of searching for all tweets that contain #iphone in June, July, and August, write Python code that iterates through *each day* of these months and gets all tweets per day.
* ``.setLang`` takes the two-letter code of a language to restrict the language of the returned tweets.
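The day-by-day idea from the ``.setMaxTweets`` bullet can be sketched with the standard ``datetime`` module. The generator ``daily_windows`` below is a hypothetical helper of my own, not part of ``got3``. It produces one ``(since, until)`` pair of ``YYYY-MM-DD`` strings per day. I have not verified whether the ``until`` date is treated as inclusive, so it is worth checking for duplicates at the window boundaries.

```python
from datetime import date, timedelta


def daily_windows(start, end):
    """Yield (since, until) date strings for each day from `start`
    up to, but not including, `end`."""
    day = start
    while day < end:
        next_day = day + timedelta(days=1)
        yield day.isoformat(), next_day.isoformat()
        day = next_day


# One window per day in June, July, and August 2016.
windows = list(daily_windows(date(2016, 6, 1), date(2016, 9, 1)))
print(len(windows))
print(windows[0])
print(windows[-1])

# With got3 available, each window could feed one search, for example:
# all_tweets = []
# for since, until in windows:
#     criteria = got3.manager.TweetCriteria()
#     criteria.setQuerySearch("#iphone")
#     criteria.setSince(since)
#     criteria.setUntil(until)
#     all_tweets.extend(got3.manager.TweetManager.getTweets(criteria))
```

Making a fresh criteria object inside the loop keeps settings from one day from leaking into the next.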

In [52]:
# Get 50 tweets by barackobama that contain the word 'united' and are in English
# and were published between 10 Sep 2015 and 10 Sep 2016

tweetCriteria.setUsername("barackobama")
tweetCriteria.setQuerySearch("united")
tweetCriteria.setSince("2015-09-10")
tweetCriteria.setUntil("2016-09-10")
tweetCriteria.setMaxTweets(50)
tweetCriteria.setLang("en")

<got3.manager.TweetCriteria.TweetCriteria at 0x23417326518>

In [56]:
obama_united_tweets = got3.manager.TweetManager.getTweets(tweetCriteria)

In [57]:
for x in obama_united_tweets[:5]:
    print(x.text)
    print(x.date)
    print("*"*50)

The United States is leading the way in the fight to #ActOnClimate . pic.twitter.com/vkUMjg6XKc
2016-09-06 23:57:30
**************************************************
The United States just hit one million solar installations—enough to power 5.5 million homes. #MillionSolarStrong
2016-05-03 19:00:16
**************************************************
Retweet if you believe it's time for the United States to #LeadOnLeave . pic.twitter.com/8jsAkBBqzx
2016-04-11 20:34:06
**************************************************
Clean energy currently employs 2.5 million people in the United States. Check it out: http:// ofa.bo/gA3q #ActOnClimate
2016-04-06 03:13:43
**************************************************
The United States of America has the strongest, most durable economy in the world—and it's getting better. pic.twitter.com/mCyYWccvf5
2016-04-01 21:12:40
**************************************************


# GOT and Tweet Location

At first glance, it seems that ``got3`` doesn't provide you with the ability to search by location. However, keep in mind that what ``got3`` is doing behind the scenes is *going to the browser search form*, submitting a search, and retrieving the results. There *is* a way to search for a location in the browser search form; it's by providing a hidden parameter directly to the search string. 

For example, you can type the following string into the search box of the online Twitter interface:

```
debate near:"Memphis, Tennessee" within:15km
```

This searches for all the Tweets that contain the word ``debate`` and are within 15km of Memphis, TN. The ``near:`` in the query string indicates that what follows it is a search criterion; the same goes for ``within:``. 

We can "jerry-rig" this on ``got3`` by formatting our argument to ``setQuerySearch`` in the same fashion. 

In [60]:
tweetCriteria = got3.manager.TweetCriteria()
tweetCriteria.setQuerySearch('debate near:"Memphis, TN" within:15km')
tweetCriteria.setMaxTweets(10)
memphis_tweets = got3.manager.TweetManager.getTweets(tweetCriteria)

Note that the location after ``near:`` has to be **in double quotation marks**. That means the Python *string itself* needs to be in single quotation marks. If you do it the other way around, the search won't work:

```python
"debate near:'Memphis, TN' within:15km" # This won't work
```
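If you build many location searches, it is easy to get the quotation marks wrong by hand. Here is a minimal sketch of a helper that assembles the string for you; ``near_query`` is a hypothetical name of my own, and it assumes the radius is given in kilometers, as in the example above.

```python
def near_query(term, place, radius_km):
    """Build a search string of the form: term near:"place" within:Nkm."""
    if '"' in place:
        # A double quote inside the place name would break the quoting.
        raise ValueError("place must not contain double quotation marks")
    return f'{term} near:"{place}" within:{radius_km}km'


query = near_query("debate", "Memphis, TN", 15)
print(query)
```

The result can be passed directly to ``setQuerySearch``.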

How do you find out how exactly you should "name" a location? 

I recommend you go to the [Twitter Advanced Search page](https://twitter.com/search-advanced?lang=en), click on the option "Near this place", and start typing out the location of interest to you. Twitter will auto-suggest places. Find the one you're looking for and use the wording they use. 

## Latitude and Longitude

It's possible to use this ``.setQuerySearch`` tweak to specify a location by latitude and longitude. The in-query parameter is called ``geocode``, and it takes a latitude, a longitude, and a radius, separated by commas. The radius needs to be followed by either ``mi`` or ``km`` to indicate the units. Note that unlike the ``near:`` criterion, the ``geocode`` value does *not* have to be in double quotation marks. 

Here's what people in the capital of the Philippines thought about Trump in April 2016. 

In [70]:
tweetCriteria = got3.manager.TweetCriteria()
tweetCriteria.setQuerySearch('Trump geocode:14.5995,120.9842,20km')
tweetCriteria.setSince("2016-04-01")
tweetCriteria.setUntil("2016-04-30")
tweetCriteria.setMaxTweets(10)
trump_in_philippines = got3.manager.TweetManager.getTweets(tweetCriteria)

In [71]:
for x in trump_in_philippines[:5]:
    print(x.text)
    print('*'*50)

si binay parang donald trump na plastic version :-) HEHEHEHEHE
**************************************************
Donald Trump is confused of 9/11 and 7-Eleven Hahahahaha TF
**************************************************
#republican #standyourground #trump #alllivesmatter @Commander Shooting Range https://www. instagram.com/p/BEDzCAbtid-/
**************************************************
Seriously America if you're going to make Donald Trump President you… https://www. instagram.com/p/BD_LbGoiHWiy g2DYQbbrMOefuoRK2pDQ-ePTHU0/ …
**************************************************
now im worried what if trump will win as president? #iHeartRadioStoleOurAward
**************************************************
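The ``geocode`` string used in the search above can also be assembled by a small helper, which makes it harder to mistype the units or swap the coordinates. This is only a sketch; ``geocode_query`` is a hypothetical name of my own, and the range checks are ordinary latitude/longitude bounds, not something ``got3`` enforces.

```python
def geocode_query(term, latitude, longitude, radius, unit="km"):
    """Build a search string of the form: term geocode:lat,lon,radius+unit."""
    if unit not in ("km", "mi"):
        raise ValueError("unit must be 'km' or 'mi'")
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be between -90 and 90")
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be between -180 and 180")
    return f"{term} geocode:{latitude},{longitude},{radius}{unit}"


# Same coordinates and radius as the search above.
print(geocode_query("Trump", 14.5995, 120.9842, 20))
```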
