## Collecting tweets using the Twitter API

"In computer programming, an **Application Programming Interface (API)** is a set of subroutine definitions, protocols, and tools for building application software." [wikipedia](https://en.wikipedia.org/wiki/Application_programming_interface)

The Twitter API is the tool we use to collect tweets from Twitter. Its two public interfaces are documented here:
- Streaming API: https://dev.twitter.com/streaming/public
- REST API: https://dev.twitter.com/rest/public

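Install [tweepy](http://www.tweepy.org/), a Python library for accessing the Twitter API. Note that the `StreamListener` class used in this notebook exists in tweepy versions before 4.0; from 4.0 onward it was merged into `tweepy.Stream`.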

```
pip install tweepy
```



In [None]:
# this will install tweepy on your machine
!pip install tweepy

Create a Twitter app and find your consumer key, consumer secret, and access token:

1. Go to https://apps.twitter.com/
2. Click `Create New App`
3. Fill in the details
4. Click on `manage keys and access tokens`
5. Click `create my access token`
6. Copy and paste your *Consumer Key (API Key)*, *Consumer Secret (API Secret)*, *Access Token*, and *Access Token Secret* below:

In [None]:
consumer_key = 'xxx'
consumer_secret = 'xxx'
access_token = 'xxx'
access_token_secret = 'xxx'
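
Hardcoding credentials in a notebook makes it easy to leak them when the notebook is shared. As one possible alternative, the sketch below reads the four values from environment variables. The variable names (`TWITTER_CONSUMER_KEY` and so on) and the helper `load_credentials` are assumptions made for illustration, not something Twitter or tweepy prescribes.

```python
import os

# Hypothetical environment variable names; choose whatever suits your setup.
CREDENTIAL_VARS = {
    "consumer_key": "TWITTER_CONSUMER_KEY",
    "consumer_secret": "TWITTER_CONSUMER_SECRET",
    "access_token": "TWITTER_ACCESS_TOKEN",
    "access_token_secret": "TWITTER_ACCESS_TOKEN_SECRET",
}


def load_credentials(environ=os.environ):
    """Return the four credentials as a dict, or raise if any is missing."""
    missing = [var for var in CREDENTIAL_VARS.values() if not environ.get(var)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[var] for name, var in CREDENTIAL_VARS.items()}
```

With the variables set in your shell, `creds = load_credentials()` gives a dictionary whose entries can be assigned to `consumer_key`, `consumer_secret`, `access_token`, and `access_token_secret`.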


### Authenticate with the Twitter API


In [None]:
import tweepy

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

# create the api object that we will use to interact with Twitter
api = tweepy.API(auth)

In [None]:
# example: post a tweet from your account (this publishes a real, public tweet)
tweet = api.update_status('Hello Twitter')

In [None]:
# see all the information contained in a tweet:
print(tweet)
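
A tweet is delivered as a nested JSON document, and with tweepy the parsed dictionary is available as `tweet._json`. The sketch below shows how a few commonly used fields can be pulled out of such a dictionary. The sample payload is hand-written with made-up values and contains only a handful of fields, and `summarize` is a hypothetical helper.

```python
import json

# Hand-written stand-in for a tweet payload; all values are made up.
sample_json = """
{
  "id": 1,
  "created_at": "Mon Jan 01 12:00:00 +0000 2018",
  "text": "Hello Twitter #python",
  "user": {"screen_name": "example_user", "followers_count": 10},
  "entities": {"hashtags": [{"text": "python"}]}
}
"""


def summarize(tweet_dict):
    """Pick a few fields out of a tweet dictionary."""
    hashtags = tweet_dict.get("entities", {}).get("hashtags", [])
    return {
        "id": tweet_dict["id"],
        "text": tweet_dict["text"],
        "author": tweet_dict["user"]["screen_name"],
        "hashtags": [tag["text"] for tag in hashtags],
    }


summary = summarize(json.loads(sample_json))
print(json.dumps(summary, indent=2))
```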

### Step 1: Creating a StreamListener

This simple stream listener prints the text of each status. The `on_data` method of tweepy's `StreamListener` passes data from statuses to the `on_status` method, so we only need to create a class `MyStreamListener` that inherits from `StreamListener` and overrides `on_status`:

In [None]:
# override tweepy.StreamListener so that it prints the tweet text when new data arrives
class MyStreamListener(tweepy.StreamListener):

    def on_status(self, status):
        print(status.text)
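
To make the relationship between `on_data` and `on_status` concrete, here is a simplified stand-in written without tweepy. It is not tweepy's actual code: `SketchListener` and `CollectingListener` are hypothetical classes, and treating any payload with a `text` field as a status is an assumption made to keep the sketch short.

```python
import json


class SketchListener:
    """Simplified stand-in for a stream listener; not tweepy's implementation."""

    def on_data(self, raw_data):
        data = json.loads(raw_data)
        # Assumption: a payload carrying a "text" field is a status.
        if "text" in data:
            return self.on_status(data)
        # Other message types (e.g. deletion notices) are ignored here.
        return True

    def on_status(self, status):
        return True


class CollectingListener(SketchListener):
    """Overrides on_status only, just like MyStreamListener does."""

    def __init__(self):
        self.texts = []

    def on_status(self, status):
        self.texts.append(status["text"])
        return True


listener = CollectingListener()
listener.on_data('{"text": "hello python"}')
listener.on_data('{"delete": {"status": {"id": 1}}}')
print(listener.texts)  # prints ['hello python']
```

The subclass never touches `on_data`; it only decides what to do once a status has been recognized, which is the same division of labor the tweepy listener relies on.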

### Step 2: Creating a Stream

We need an authenticated `api` object to stream; we created one in the authentication section above. Once we have an `api` object and a stream listener, we can create our stream object:

In [None]:
myStreamListener = MyStreamListener()
myStream = tweepy.Stream(auth = api.auth, listener=myStreamListener)

### Step 3: Starting a Stream

A number of Twitter streams are available through tweepy. Most cases will use `filter`, the user stream, or the site stream. For more information on the capabilities and limitations of the different streams, see the [Twitter Streaming API Documentation](https://dev.twitter.com/streaming/overview/request-parameters).

In this example we will use `filter` to stream all tweets containing the word *python*. The `track` parameter is a list of search terms to stream.

In [None]:
myStream.filter(track=['python'])

In [None]:
myStream.disconnect()

In [None]:
# stream English-language tweets that mention either term
myStream.filter(track=['realdonaldtrump', 'trump'], languages=['en'])
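
Twitter's documentation describes `track` matching as follows: commas separate phrases, and a tweet matches if any phrase matches; within a phrase, spaces separate terms that must all be present, in any order and ignoring case. The function below is a rough local emulation of that rule, useful for checking a term list before opening a stream. `matches_track` is a hypothetical helper, and its word splitting is a simplification: the real service also looks at URLs, screen names, and hashtags in ways this sketch does not reproduce.

```python
import re


def matches_track(tweet_text, track):
    """Approximate check of whether a tweet text would match a track list."""
    tokens = set(re.findall(r"\w+", tweet_text.lower()))
    for entry in track:
        # Commas separate alternative phrases (logical OR).
        for phrase in entry.split(","):
            # Spaces separate terms that must all be present (logical AND).
            terms = phrase.lower().split()
            if terms and all(term in tokens for term in terms):
                return True
    return False


print(matches_track("I love Python programming", ["python"]))
print(matches_track("Trump speaks today", ["realdonaldtrump,trump"]))
print(matches_track("nothing relevant here", ["realdonaldtrump", "trump"]))
```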

In [None]:
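# override tweepy.StreamListener to make it save the raw data to a file
class StreamSaver(tweepy.StreamListener):
    def __init__(self, filename, api=None):
        self.filename = filename
        self.num_tweets = 0  # number of messages written so far
        tweepy.StreamListener.__init__(self, api=api)

    def on_data(self, data):
        # append the raw JSON string to the file exactly as received
        with open(self.filename, 'a') as tf:
            tf.write(data)

        self.num_tweets += 1
        print(self.num_tweets)

    def on_error(self, status):
        # print the HTTP status code returned by Twitter
        print(status)
        # 420 means we are being rate limited; returning False disconnects the stream
        if status == 420:
            return False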

In [None]:
saveStream = StreamSaver(filename='trumpTweets.txt')
mySaveStream = tweepy.Stream(auth = api.auth, listener=saveStream)


In [None]:
# stream English-language tweets that mention either term and save them to the file
mySaveStream.filter(track=['realdonaldtrump', 'trump'], languages=['en'])


In [None]:
mySaveStream.disconnect()
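
Once some data has been collected, the saved file can be read back for analysis. The sketch below assumes each message written by `StreamSaver` sits on its own line as a JSON document, which should be checked against your own file. `load_tweets` is a hypothetical helper; it skips blank or malformed lines and keeps only records that carry a `text` field.

```python
import json


def load_tweets(filename):
    """Read a file of line-delimited JSON and return the tweet dictionaries."""
    tweets = []
    with open(filename) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            try:
                record = json.loads(line)
            except ValueError:
                continue  # skip truncated or malformed lines
            # Keep only dictionaries that look like tweets.
            if isinstance(record, dict) and "text" in record:
                tweets.append(record)
    return tweets
```

For example, `tweets = load_tweets('trumpTweets.txt')` followed by `[t['text'] for t in tweets]` would give the list of collected tweet texts.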

def 