# Python and Google’s Natural Language API

## 1. Install the libraries
We will use tweepy to gather the tweet data, nltk to clean the tweets, the Google Natural Language API to run the sentiment analysis, and python-telegram-bot to send the result through Telegram chat.

```
pip3 install tweepy nltk google-cloud-language python-telegram-bot
```


## 2. Get Twitter API Keys
To be able to gather the tweets from Twitter, we need to create a developer account to get the Twitter API Keys first.

## 3. Enable Google Natural Language API
We need to enable the Google Natural Language API first if we want to use the service.

Go to Google Developers Console and create a new project (or select the one you have).

In the project dashboard, click “ENABLE APIS AND SERVICES”, and search for Cloud Natural Language API.

### 3a. Create service account key
If we want to use Google Cloud services like Google Natural Language, we need a service account key. This is like our credential to use Google’s services.

Go to Google Developers Console, open the “Credentials” tab, choose “Create credentials”, and click “Service account key”.

A .json file will be downloaded automatically. Rename it to creds.json.

In the terminal, set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of our creds.json file.

```
export GOOGLE_APPLICATION_CREDENTIALS='[PATH_TO_CREDS.JSON]'
```
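Before calling the API, it can help to confirm from Python that the variable is visible to the interpreter and points to a real file. The helper below is a minimal sketch; `credentials_path` and its `env` parameter are hypothetical names introduced here, not part of the Google client library.

```python
import os


def credentials_path(env=None):
    """Return the path stored in GOOGLE_APPLICATION_CREDENTIALS.

    Raises RuntimeError if the variable is unset or the file is missing.
    `env` defaults to os.environ and exists only to make testing easier.
    """
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise RuntimeError(f"Credentials file not found: {path}")
    return path
```

Calling `credentials_path()` at the top of the program makes a missing or mistyped path fail early, before any request is sent.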


## 4. Write the program
This program will gather all the tweets containing the defined keyword in the last 24 hours with a maximum of 50 tweets. Then it will analyze the tweets’ sentiments one by one. We will send the result (average sentiment score) through Telegram chat.

This is a simple workflow of our program.

> connect to the Twitter API -> search tweets based on the keyword -> clean all of the tweets -> get tweet’s sentiment score -> send the result
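The workflow above can be sketched as a single function that receives each step as a callable. This is only an illustration of the control flow: `run_pipeline` and the stub functions passed to it are hypothetical, and the real search, cleaning, scoring, and sending steps would replace the stubs.

```python
def run_pipeline(keyword, total_tweets, search, clean, score, send):
    """Run the workflow steps in order; each step is passed in as a callable."""
    tweets = search(keyword, total_tweets)       # connect and search
    cleaned = [clean(text) for text in tweets]   # clean every tweet
    scores = [score(text) for text in cleaned]   # one sentiment score per tweet
    # Average the scores, guarding against an empty search result.
    average = sum(scores) / len(scores) if scores else 0.0
    send(f"Average sentiment for {keyword}: {average:.2f}")
    return average


# Stub steps stand in for Twitter, nltk, Google NL, and Telegram.
sent = []
avg = run_pipeline(
    "#python",
    50,
    search=lambda keyword, n: ["Great!", "Meh"],
    clean=str.lower,
    score=lambda text: 1.0 if "great" in text else 0.0,
    send=sent.append,
)
```

Passing the steps as arguments keeps each one replaceable, so the pipeline can be exercised without network access.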

### 4a. Connect to the Twitter API
The first thing that we need to do is gather the tweets’ data, so we have to connect to the Twitter API first.



In [1]:
# Import the tweepy library.
import tweepy

In [2]:
# Define the keys that we generated earlier.

# Twitter API Credentials
CONS_KEY = "TOKEN_HERE"
CONS_SECRET = "TOKEN_HERE"
ACC_TOKEN = "TOKEN_HERE"
ACC_SECRET = "TOKEN_HERE"

In [3]:
# Make a function called authentication to connect to the API, with four parameters which are all of the keys.

def authentication(cons_key, cons_secret, acc_token, acc_secret):
    auth = tweepy.OAuthHandler(cons_key, cons_secret)
    auth.set_access_token(acc_token, acc_secret)
    api = tweepy.API(auth)
    return api
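As an alternative to pasting the keys into the notebook, they could be read from environment variables. The sketch below assumes four variable names (`TWITTER_CONS_KEY` and so on) that are invented for this example; `load_twitter_keys` is likewise a hypothetical helper, not part of tweepy.

```python
import os

# Assumed variable names; use whatever names you exported in your shell.
KEY_NAMES = (
    "TWITTER_CONS_KEY",
    "TWITTER_CONS_SECRET",
    "TWITTER_ACC_TOKEN",
    "TWITTER_ACC_SECRET",
)


def load_twitter_keys(env=None):
    """Return the four Twitter keys as a tuple, in the order of KEY_NAMES."""
    env = os.environ if env is None else env
    missing = [name for name in KEY_NAMES if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in KEY_NAMES)
```

The tuple is in the same order as the parameters of `authentication`, so it can be unpacked directly: `api = authentication(*load_twitter_keys())`.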

### 4b. Search the tweets
We can limit a search by time, by quantity, or by both. A time limit defines the interval to search in, and a quantity limit defines the total number of tweets to gather. Since we want the tweets from the last 24 hours with a maximum of 50 tweets, we will use both criteria.

Since we want to gather the tweets from the last 24 hours, let's take yesterday’s date as our time parameter.



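In [4]:
from datetime import datetime, timedelta

# Take the current time and step back one day to get the start of the search window.
today_datetime = datetime.now()
yesterday_datetime = today_datetime - timedelta(days=1)
# Format both dates as YYYY-MM-DD for the search query.
today_date = today_datetime.strftime('%Y-%m-%d')
yesterday_date = yesterday_datetime.strftime('%Y-%m-%d')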

In [5]:
# Connect to the Twitter API using a function we defined before.

api = authentication(CONS_KEY,CONS_SECRET,ACC_TOKEN,ACC_SECRET)

In [6]:
HASHTAG = ['#ppmadrid', '#csxmadrid', '#28M', '#eleccionesMadrid','#MásMadrid','#MasMadrid','#VOX','#EspañaViva','#psoe','#ElMadridQueQuieres']
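The search query is sent as a single string, so a list of hashtags has to be combined before it is used as `q`. A minimal sketch, assuming we want tweets that match any one of the hashtags; `build_query` is a hypothetical helper name.

```python
def build_query(hashtags):
    """Join hashtags with OR so a tweet matching any of them is returned."""
    # dict.fromkeys drops duplicates while keeping the original order.
    unique = list(dict.fromkeys(hashtags))
    return " OR ".join(unique)


query = build_query(["#psoe", "#VOX", "#psoe"])
```

The resulting string can then be passed as the `q` argument of the search call.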


In [7]:
# Example: a generic helper that returns the raw JSON of every matching tweet.
def rest_tweets(api, query, lang="es", limit=None):
    """
    Return the complete tweets matching the query, going back 7 days at most.
    :param api: an authenticated tweepy.API object
    :param query: should contain all the words and can include logic operators;
    it should also provide the period of time for the search
    ex: rock OR axe
    (visit https://dev.twitter.com/rest/public/search to see how to create a query)
    :param lang: the language of the tweets
    :param limit: the maximum number of tweets to fetch (None means no limit)
    :return: tweets: a list of all tweets obtained after the request
    """
    tweets = []

    cursor = tweepy.Cursor(api.search, q=query, lang=lang)
    # Only pass a limit when one was given; items() without arguments fetches everything.
    items = cursor.items(limit) if limit else cursor.items()
    for tweet in items:
        tweets.append(tweet._json)

    return tweets

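In [8]:
import os
import urllib.request

def save_hashtag(hashtags):
    # Download every image attached to tweets that match any of the hashtags.
    target_dir = os.path.join(os.getcwd(), 'files', 'riko_meme')
    os.makedirs(target_dir, exist_ok=True)
    query = ' OR '.join(hashtags)  # the search expects a single query string
    for status in tweepy.Cursor(api.search, q=query).items(1000):
        # Tweets without media have no extended_entities attribute.
        entities = getattr(status, 'extended_entities', None) or {}
        for media in entities.get('media', []):
            url = media['media_url']
            print(url)
            urllib.request.urlretrieve(url, os.path.join(target_dir, url.split('/')[-1]))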

In [12]:
save_hashtag(HASHTAG)

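In [14]:
from textblob import TextBlob  # assumption: installed separately with pip3 install textblob

NUMBER_OF_TWEETS = 50  # maximum number of tweets to analyze

# Example from a task-queue setting: self is assumed to be a bound task
# object (for instance a Celery task) that provides update_state.
def analyzetweets(self, access_token, access_token_secret, mytweets=False, q=None):
    auth = tweepy.OAuthHandler(CONS_KEY, CONS_SECRET)
    auth.set_access_token(access_token, access_token_secret)
    api = tweepy.API(auth)
    sentimentlist = []
    subjectivitylist = []
    number = NUMBER_OF_TWEETS
    # Analyze either our own timeline or the search results for q.
    tweets = tweepy.Cursor(api.user_timeline).items() if mytweets else tweepy.Cursor(api.search, q=q).items(number)
    for index, tweet in enumerate(tweets):
        # polarity is in [-1, 1] and subjectivity is in [0, 1].
        analysis = TextBlob(tweet.text).sentiment
        sentimentlist.append(analysis.polarity)
        subjectivitylist.append(analysis.subjectivity)
        self.update_state(state="RUNNING", meta={"current": index + 1, "total": number})
    # max(..., 1) avoids a division by zero when no tweets were found.
    sentimentavg = float(sum(sentimentlist) / max(len(sentimentlist), 1))
    subjectivityavg = float(sum(subjectivitylist) / max(len(subjectivitylist), 1))
    return {"current": number, "total": number, "subjectivityavg": subjectivityavg, "sentimentavg": sentimentavg}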

Define our search parameters. q is the keyword query, since is the start date of the search, result_type='recent' returns the most recent tweets rather than the most popular ones, lang='es' restricts the results to Spanish tweets, and items(total_tweets) sets the maximum number of tweets to collect.

In [11]:
total_tweets = 50  # maximum number of tweets to collect

search_result = tweepy.Cursor(api.search,
                              q=' OR '.join(HASHTAG),  # the query must be a single string
                              since=yesterday_date,
                              result_type='recent',
                              lang='es').items(total_tweets)

Wrap this code in a function called search_tweets that takes keyword and total_tweets as parameters.

In [None]:
def search_tweets(keyword, total_tweets):
    # Search from yesterday's date onward.
    yesterday_datetime = datetime.now() - timedelta(days=1)
    yesterday_date = yesterday_datetime.strftime('%Y-%m-%d')
    api = authentication(CONS_KEY, CONS_SECRET, ACC_TOKEN, ACC_SECRET)
    search_result = tweepy.Cursor(api.search,
                                  q=keyword,  # use the keyword argument, not the global HASHTAG list
                                  since=yesterday_date,
                                  result_type='recent',
                                  lang='es').items(total_tweets)
    return search_result
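Because since takes a date rather than a timestamp, the result can include tweets that are somewhat older than 24 hours. A minimal sketch of a stricter filter follows; `within_last_24h` is a hypothetical helper, and it assumes each tweet object exposes a `created_at` datetime (treated as UTC when it carries no timezone).

```python
from datetime import datetime, timedelta, timezone


def _as_utc(dt):
    """Treat naive datetimes as UTC and convert aware ones to UTC."""
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def within_last_24h(tweets, now=None):
    """Keep only the tweets created in the 24 hours before `now`."""
    now = datetime.now(timezone.utc) if now is None else _as_utc(now)
    cutoff = now - timedelta(hours=24)
    return [t for t in tweets if cutoff <= _as_utc(t.created_at) <= now]
```

It could be applied to the search result like this: `recent = within_last_24h(search_tweets('#28M', 50))`.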