# Get old tweets

* There is a python package, "got", which can be cloned from the link below.
* Link: https://github.com/Jefferson-Henrique/GetOldTweets-python
* Clone the repo and locate folder named got(for Python2) or got3(for Python3).
* It is useful for getting tweets more than 6 months.
* It's directly scraping from twitter web.
* As more recent tweets are located on the top of olders ones, it's scraping from the end of month to beginning of the month.
* If you want to scrape tweets over month, you will need to split range of time into smaller pieces and implement following function.

In [1]:
import got3
import pandas as pd

**List of timesplits**

In [2]:
list_of_time_tuples = [("-01-01", "-01-31"),("-02-01", "-02-28"),("-03-01", "-03-31"),("-04-01", "-04-30"),
                       ("-05-01", "-05-31"),("-06-01", "-06-30"),("-07-01", "-07-31"),("-08-01", "-08-31"),
                       ("-09-01", "-09-30"),("-10-01", "-10-31"),("-11-01", "-11-30"),("-12-01", "-12-31")]

**Function scraping tweets**
* Input: key_word for query, year for search, max_tweets, list_of_time_tuples
* Output: dataframe of scraped tweets with columns, ["year", "date", "id", "username", "text"]

In [3]:
def get_monthly_tweets(key_word, year, max_tweets, list_of_time_tuples):
    tweets_df = pd.DataFrame()
    for split in list_of_time_tuples:
        since = str(year) + split[0]
        until = str(year) + split[1]
        tweetCriteria = got3.manager.TweetCriteria().setQuerySearch(key_word)\
                .setSince(since).setUntil(until).setMaxTweets(max_tweets)
        tweets = got3.manager.TweetManager.getTweets(tweetCriteria)
        for tweet in tweets:
            temp = {"year": year, "date": tweet.date,"id": tweet.id,"username": tweet.username,"text": tweet.text}
            tweets_df = tweets_df.append(temp, ignore_index=True)
        print("Collected tweets from month %s." % split[0].split("-")[1])
    return tweets_df

## Example: Texas GDP

In [4]:
texas_2015 = get_monthly_tweets("Texas GDP", year=2015, max_tweets=1000, list_of_time_tuples=list_of_time_tuples)

Collected tweets from month 01.
Collected tweets from month 02.
Collected tweets from month 03.
Collected tweets from month 04.
Collected tweets from month 05.
Collected tweets from month 06.
Collected tweets from month 07.
Collected tweets from month 08.
Collected tweets from month 09.
Collected tweets from month 10.
Collected tweets from month 11.
Collected tweets from month 12.


In [10]:
texas_2015.sample(10)

Unnamed: 0,date,id,text,username,year
11207,2015-12-23 11:30:56,679700729938984960,Did you know that if #Texas was its own countr...,,2015.0
4947,2015-05-28 15:27:53,604006197100863488,CleanTX & @ATI_UT Issue Economic Impact Report...,,2015.0
7417,2015-08-20 03:43:30,634269510833471488,"""Removing undocumented workers would wipe out ...",,2015.0
575,2015-01-24 02:18:26,558886528324493312,$ TXN :US Trading Radar: FOMC Meeting And US G...,,2015.0
11414,2015-12-19 23:17:27,678428977841704961,Canada's GDP = GDP of State of Texas http:// w...,,2015.0
1773,2015-02-09 19:55:13,564950681501855744,@petercnorton the other issue is with gdp of T...,,2015.0
3136,2015-04-15 14:41:58,588411959785041920,North Texas grows by one person every 5 minute...,,2015.0
6537,2015-07-20 17:00:18,623236011292659712,"#Texas has a higher GDP than Spain, Korea and ...",,2015.0
3433,2015-04-17 14:15:34,589130093538119680,"Hey MSNBC, Gov Perry was the commander in chie...",,2015.0
6863,2015-07-28 16:01:31,626120319636365312,@scottlincicome Texas GDP running above 5%. Ch...,,2015.0
