# Downloading Twitter Data

Tweepy is a third-party Python package for downloading tweets. Install it with pip or pip3:

```
pip install tweepy
```

Before running the code below, you need a Twitter account and a Twitter Developer Account. Sign up [here](https://developer.twitter.com/en/apply-for-access.html).

A personal account with standard access is a good starting point. You can always upgrade later.

After you have developer access, you will be taken to a documentation page to help you get started. To get the authentication keys, you need to create an app, and the app needs a web page set up for it. The guidelines walk you through the process.

Once the app is set up you can get keys and put them in the code as outlined below.


In [1]:
consumer_key = 'your info here'
consumer_secret = 'your info here'
access_token = 'your info here'
access_token_secret = 'your info here'
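Pasting keys directly into a notebook makes them easy to leak when the notebook is shared. As an alternative, here is a minimal sketch that reads the four values from environment variables. The variable names (`TWITTER_CONSUMER_KEY` and so on) and the helper `load_credentials` are hypothetical and not part of Tweepy or Twitter's setup; use whatever names suit your environment.

```python
import os

# Hypothetical environment variable names; these are an assumption,
# not something Tweepy or Twitter requires.
ENV_NAMES = {
    "consumer_key": "TWITTER_CONSUMER_KEY",
    "consumer_secret": "TWITTER_CONSUMER_SECRET",
    "access_token": "TWITTER_ACCESS_TOKEN",
    "access_token_secret": "TWITTER_ACCESS_TOKEN_SECRET",
}


def load_credentials(env=os.environ):
    """Return a dict with the four credentials, or raise if any is missing."""
    missing = [name for name in ENV_NAMES.values() if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[name] for key, name in ENV_NAMES.items()}
```

With the variables set in your shell, `creds = load_credentials()` returns a dictionary, and `creds["consumer_key"]` and the other entries can be assigned to the variables used in the following cells.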

In [3]:
# make sure to install tweepy first
import tweepy as tw

auth = tw.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tw.API(auth, wait_on_rate_limit=True)

In [4]:
# try a search
search_words = '#nlp+#ml -filter:retweets'
date_since = "2019-07-01"

In [5]:
# Collect 5 tweets about NLP
# (in Tweepy 4.x, api.search was renamed to api.search_tweets)
tweets = tw.Cursor(api.search,
                   q=search_words,
                   lang="en",
                   since=date_since).items(5)
tweets

<tweepy.cursor.ItemIterator at 0x111085160>
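The output above shows that `tweets` is an iterator rather than a list. Assuming it behaves like a standard Python iterator, it can be consumed only once, so looping over it a second time yields nothing. The sketch below illustrates this with a hypothetical stand-in generator, `fake_items`, instead of a real Tweepy cursor.

```python
def fake_items(n):
    """Hypothetical stand-in for an item iterator: yields n placeholder strings."""
    for i in range(n):
        yield f"tweet {i}"


results = fake_items(3)
first_pass = list(results)   # consumes the iterator
second_pass = list(results)  # nothing left to yield

print(len(first_pass), len(second_pass))  # prints: 3 0
```

If you need to go over the results more than once, store them in a list first.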

## Stored information

Each tweet carries a lot of information besides the text. In a notebook, type `tweet.` and press Tab to see a list of the available attributes.

Examples include `tweet.user.screen_name`, `tweet.user.location`, and `tweet.text`.
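Tab completion only works interactively. As a rough equivalent in plain code, the built-in `dir` can list an object's attribute names. The sketch below uses a hypothetical stand-in object, `fake_tweet`, built with `SimpleNamespace`; a real Tweepy tweet object has many more fields, and the helper `public_attributes` is an illustration, not part of Tweepy.

```python
from types import SimpleNamespace


def public_attributes(obj):
    """Return the sorted attribute names that do not start with an underscore."""
    return sorted(name for name in dir(obj) if not name.startswith("_"))


# Hypothetical stand-in for a tweet, with only the fields used in this notebook.
fake_tweet = SimpleNamespace(
    text="example text",
    user=SimpleNamespace(screen_name="example_user", location="Toronto"),
)

print(public_attributes(fake_tweet))       # prints: ['text', 'user']
print(public_attributes(fake_tweet.user))  # prints: ['location', 'screen_name']
```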

In [6]:
# save data in a list
user_location_text = [[tweet.user.screen_name, tweet.user.location, tweet.text]
                      for tweet in tweets]

In [7]:
# store it in a pandas data frame
import pandas as pd

df = pd.DataFrame(data=user_location_text,
                  columns=['user', 'location', 'tweet'])
df.head()

| | user | location | tweet |
|---|---|---|---|
| 0 | Deep_In_Depth | Digne-les-Bains, France | Moore's Law Is Dying. This Brain-Inspired Anal... |
| 1 | Colin_Hung | Toronto | Underpinning effective #AI #NLP #ML and #SDOH ... |
| 2 | BreanaPatelnyc | 🗽 🇺🇸www.breanapatel.com | Difficulties in explaining machine learning (M... |
| 3 | Deep_In_Depth | Digne-les-Bains, France | AI Can Read A Cardiac MRI In 4 Seconds: Do We ... |
| 4 | CasonCherry | New Hampshire, USA | The ⁦@lexfridman⁩ podcast with ⁦@GaryMarcus⁩ g... |
