<a href="https://colab.research.google.com/github/alihammadbaig/Sentiment_Analysis/blob/master/Sentiment_Analyzer_using_Python_and_Google_NL_API.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**This notebook builds a Telegram bot that runs sentiment analysis on tweets matching a keyword we define.**

# Libraries


*   [Tweepy](http://www.tweepy.org/) to gather tweet data
*   [nltk](https://www.nltk.org/) to clean the tweets
*   [Google Natural Language API](https://cloud.google.com/natural-language/) for sentiment analysis
*   [Python Telegram Bot](https://github.com/python-telegram-bot/python-telegram-bot) to send the results through Telegram chat



In [0]:
from google.colab import drive
drive.mount('/content/gdrive')

In [0]:
!ls "/content/gdrive/My Drive/Colab Notebooks/data/"

In [0]:
!pip install tweepy
!pip install nltk
!pip install google-cloud-language
!pip install python-telegram-bot

In [0]:
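# Note: `!export` runs in a temporary subshell, so the variable does not reach
# the Python kernel; the os.environ assignment in the next cell is what takes effect.
!export GOOGLE_APPLICATION_CREDENTIALS="/content/gdrive/My Drive/Colab Notebooks/data/creds.json"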

In [0]:
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="/content/gdrive/My Drive/Colab Notebooks/data/creds.json"
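
The Google client library looks up `GOOGLE_APPLICATION_CREDENTIALS` when it creates a client, so it can help to confirm that the variable points at a real file before calling the API. The following is a minimal sketch; `credentials_file_is_set` is a hypothetical helper name, not part of any Google library.

```python
import os

def credentials_file_is_set(env=None):
    """Return True if GOOGLE_APPLICATION_CREDENTIALS names an existing file.

    `env` is any mapping of variable names to values; it defaults to
    os.environ. The helper name is hypothetical, chosen for illustration.
    """
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    return bool(path) and os.path.isfile(path)
```

Calling `credentials_file_is_set()` right after setting `os.environ` gives an early signal if the Drive path is mistyped or Drive is not mounted.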

# Plan of Action

**This program gathers up to 50 tweets from the last 24 hours that contain the defined keyword, scores the sentiment of each tweet one by one, and sends the result (the average sentiment score) through Telegram chat.**

This is a simple workflow of our program.

connect to the Twitter API -> search tweets based on the keyword -> clean all of the tweets -> get tweet’s sentiment score -> send the result

## Connect to the Twitter API

In [0]:
import tweepy

In [0]:
ACC_TOKEN = ''
ACC_SECRET = ''
CONS_KEY = ''
CONS_SECRET = ''
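
Rather than pasting keys into the notebook, the four values can be read from environment variables. This is only a sketch under stated assumptions: the environment variable names below and the helper `load_twitter_credentials` are hypothetical and not defined by Tweepy or Twitter.

```python
import os

# Hypothetical environment variable names; use whatever fits your setup.
REQUIRED_VARS = {
    "CONS_KEY": "TWITTER_CONSUMER_KEY",
    "CONS_SECRET": "TWITTER_CONSUMER_SECRET",
    "ACC_TOKEN": "TWITTER_ACCESS_TOKEN",
    "ACC_SECRET": "TWITTER_ACCESS_SECRET",
}

def load_twitter_credentials(env=None):
    """Return a dict keyed by the notebook's constant names.

    Raises KeyError listing every variable that is missing or empty.
    """
    env = os.environ if env is None else env
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {name: env[var] for name, var in REQUIRED_VARS.items()}
```

The returned dictionary can then feed the constants, for example `CONS_KEY = creds["CONS_KEY"]`.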

### API Authentication

In [0]:
def authentication(cons_key, cons_secret, acc_token, acc_secret):
    auth = tweepy.OAuthHandler(cons_key, cons_secret)
    auth.set_access_token(acc_token, acc_secret)
    api = tweepy.API(auth)
    return api

### Search the Tweets

In [0]:
from datetime import datetime, timedelta

def search_tweets(keyword, total_tweets):
    # `since` only has day granularity, so this fetches tweets posted
    # since yesterday's date (roughly the last 24 hours)
    today_datetime = datetime.now()
    yesterday_datetime = today_datetime - timedelta(days=1)
    yesterday_date = yesterday_datetime.strftime('%Y-%m-%d')

    api = authentication(CONS_KEY, CONS_SECRET, ACC_TOKEN, ACC_SECRET)

    # api.search is the Tweepy 3.x name; Tweepy 4.x renamed it to api.search_tweets
    search_result = tweepy.Cursor(api.search,
                                  q=keyword,
                                  since=yesterday_date,
                                  result_type='recent',
                                  lang='en').items(total_tweets)

    return search_result
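
Because the search is bounded by a date rather than a timestamp, it can return tweets that are somewhat older than 24 hours. If an exact window matters, each tweet's timestamp can be checked afterwards. The sketch below assumes `created_at` is a naive UTC `datetime`, as Tweepy 3.x status objects provide; `is_within_last_24_hours` is a hypothetical helper name.

```python
from datetime import datetime, timedelta

def is_within_last_24_hours(created_at, now):
    """Return True if created_at falls in the 24 hours before `now`.

    Both arguments are assumed to be naive datetimes in the same
    timezone (UTC for Tweepy 3.x tweets).
    """
    age = now - created_at
    return timedelta(0) <= age <= timedelta(hours=24)
```

A list comprehension over the search results, passing `tweet.created_at` and the current UTC time, would then keep only tweets inside the exact window.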

### Clean the Tweets

* Let's clean the tweets so the Google NL API performs better
* I'll use NLTK and a bit of regex to clean the tweets

In [0]:
import re
from nltk.tokenize import WordPunctTokenizer

In [0]:
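def clean_tweets(tweet):
    # tweet arrives as UTF-8 bytes; decode, then strip @mentions
    # (handles containing underscores are only partially matched)
    user_removed = re.sub(r'@[A-Za-z0-9]+', '', tweet.decode('utf-8'))
    # strip URLs
    link_removed = re.sub(r'https?://[A-Za-z0-9./]+', '', user_removed)
    # replace every non-letter (digits, punctuation, emoji) with a space
    number_removed = re.sub(r'[^a-zA-Z]', ' ', link_removed)
    lower_case_tweet = number_removed.lower()
    # tokenize and re-join to collapse repeated whitespace
    tok = WordPunctTokenizer()
    words = tok.tokenize(lower_case_tweet)
    clean_tweet = (' '.join(words)).strip()
    return clean_tweet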
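
To see what the cleaning steps do to a single tweet, here is a standard-library-only sketch. It is a variant of `clean_tweets`, not a copy: it takes a `str` instead of bytes, also matches underscores in handles, and uses `str.split` in place of `WordPunctTokenizer` to collapse whitespace. The sample tweet and the name `clean_tweet_text` are made up for illustration.

```python
import re

def clean_tweet_text(text):
    """Strip mentions, links, and non-letters, then lowercase the result."""
    text = re.sub(r'@[A-Za-z0-9_]+', '', text)           # remove @mentions
    text = re.sub(r'https?://[A-Za-z0-9./]+', '', text)  # remove links
    text = re.sub(r'[^a-zA-Z]', ' ', text)               # non-letters become spaces
    return ' '.join(text.lower().split())                # collapse whitespace

sample = "RT @some_user: Loving the new phone!!! https://t.co/abc123 #happy 100%"
print(clean_tweet_text(sample))  # prints: rt loving the new phone happy
```

Note that hashtags lose only the `#` sign and that digits disappear entirely, which matches the behavior of the third substitution in `clean_tweets`.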

### Get tweet’s sentiment

In [0]:
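# These imports match google-cloud-language releases before 2.0;
# version 2.0 removed the enums and types modules
from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types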

Define a function called get_sentiment_score that takes a tweet as its parameter and returns the tweet's sentiment score, a value between -1.0 (negative) and 1.0 (positive).

In [0]:
def get_sentiment_score(tweet):
    client = language.LanguageServiceClient()
    document = types.Document(content=tweet,
                              type=enums.Document.Type.PLAIN_TEXT)
    response = client.analyze_sentiment(document=document)
    sentiment_score = response.document_sentiment.score
    return sentiment_score

### Analyze the tweets

In [0]:
def analyze_tweets(keyword, total_tweets):
    score = 0
    tweet_count = 0
    tweets = search_tweets(keyword, total_tweets)
    for tweet in tweets:
        cleaned_tweet = clean_tweets(tweet.text.encode('utf-8'))
        sentiment_score = get_sentiment_score(cleaned_tweet)
        score += sentiment_score
        tweet_count += 1
        print('Tweet: {}'.format(cleaned_tweet))
        print('Score: {}\n'.format(sentiment_score))
    # average over the tweets actually returned, which may be fewer than total_tweets
    if tweet_count == 0:
        return 0.0
    final_score = round(score / float(tweet_count), 2)
    return final_score
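
Averaging is safest when the divisor is the number of tweets actually scored, since a search can return fewer tweets than requested. The sketch below shows the difference on a short list of scores; `average_score` is a hypothetical helper, and the sample values are the first seven scores from the run shown at the end of this notebook, rounded to one decimal.

```python
def average_score(scores):
    """Return the mean of scores rounded to 2 decimals, or 0.0 if empty."""
    if not scores:
        return 0.0
    return round(sum(scores) / len(scores), 2)

scores = [0.0, 0.0, 0.0, 0.0, 0.6, 0.4, -0.7]
print(average_score(scores))         # prints: 0.04
print(round(sum(scores) / 50.0, 2))  # prints: 0.01 (dividing by the requested count)
```

Dividing by the requested count pulls the average toward zero whenever the search comes up short.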

In [0]:
def send_the_result(bot, update):
    keyword = update.message.text
    final_score = analyze_tweets(keyword, 50)
    if final_score <= -0.25:
        status = 'NEGATIVE | ❌'
    elif final_score <= 0.25:
        status = 'NEUTRAL | 🔶'
    else:
        status = 'POSITIVE | ✅'
#     bot.send_message(chat_id=update.message.chat_id,
#                      text='Average score for '
#                            + str(keyword) 
#                            + ' is ' 
#                            + str(final_score) 
#                            + ' | ' + status)
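
The threshold logic in `send_the_result` can be pulled into a small pure function, which makes the boundaries easy to check without a Telegram bot or any API calls. This is a sketch only; `label_for_score` is a hypothetical name, and it returns plain labels without the emoji.

```python
def label_for_score(final_score):
    """Map an average sentiment score to a label using the same cutoffs
    as send_the_result: <= -0.25 negative, <= 0.25 neutral, else positive."""
    if final_score <= -0.25:
        return 'NEGATIVE'
    elif final_score <= 0.25:
        return 'NEUTRAL'
    return 'POSITIVE'

print(label_for_score(-0.25))  # prints: NEGATIVE
print(label_for_score(0.25))   # prints: NEUTRAL
print(label_for_score(0.26))   # prints: POSITIVE
```

Both boundary values are inclusive on the lower label, so a score of exactly 0.25 still counts as neutral.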

In [30]:
# from telegram.ext import Updater, MessageHandler, Filters

def main():
#     updater = Updater('YOUR_TOKEN')
#     dp = updater.dispatcher
#     dp.add_handler(MessageHandler(Filters.text, send_the_result))
#     updater.start_polling()
#     updater.idle()
    analyze_tweets("sarfraz", 50)

if __name__ == '__main__':
    main()

Tweet: cheif selector inzamam ul haq is also involved in all that he is one of main reason behind issues in pak cricket
Score: 0.0

Tweet: families ko kis ney kia kaha hai what did anyone say to amir or sarfraz family bus malik ko kiya
Score: 0.0

Tweet: rt if anyone can fix pak cricket in current circumstances it s pm khan himself he advised captain sarfraz now he
Score: 0.0

Tweet: our caption are not fit medicaly and is fit no role of sarfraz as captain in team sarfraz has no
Score: 0.0

Tweet: rt nikamii a group is active against sarfraz who want imad as captain that group has imad parchi imam malik etc in it
Score: 0.6000000238418579

Tweet: rt nikamii under champion sarfraz ahmed ct champion sarfraz ahmed quetta gladiators champion sarfraz ahmed haters sti
Score: 0.4000000059604645

Tweet: rt nikamii sarfraz was yawning during the match this clearly shows that he was suffering from restlessness according to media repo
Score: -0.699999988079071

Tweet: rt nikamii sarfraz was in fa