<a href="https://www.kaggle.com/code/hendrasbmitb/weekly-tweet-sentiment?scriptVersionId=128000238" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Weekly Tweet Sentiment



This notebook is based on Jessica Garson's step-by-step tutorial on her blog, [**How to analyze the sentiment of your own Tweets**](https://developer.twitter.com/en/blog/community/2020/how-to-analyze-the-sentiment-of-your-own-tweets). This is the [**complete code**](https://github.com/hendro93/weekly-tweet-sentiment).


## Setting up

Before you can get started you will need to make sure you have the following:

- Python 3 [**installed**](https://wiki.python.org/moin/BeginnersGuide/Download)
- Twitter Developer account: if you don’t have one already, you can [**apply for one**](https://developer.twitter.com/en/portal/dashboard).
- [**A Twitter developer app**](https://developer.twitter.com/en/docs/apps/overview), which can be created in your Twitter developer account. 
- A [**bearer token**](https://developer.twitter.com/en/docs/authentication/oauth-2-0/bearer-tokens) for your app
- An account with Microsoft Azure’s [**Text Analytics Cognitive Service**](https://azure.microsoft.com/en-us/products/cognitive-services/text-analytics/) and an endpoint created. You can check out Microsoft’s [**quick start guide on how to call the Text Analytics API**](https://learn.microsoft.com/en-us/azure/cognitive-services/language-service/overview).

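You will also need to install the [**Requests**](https://requests.readthedocs.io/en/latest/) library, which is used to make HTTP requests to the Twitter endpoint, and the azure-ai-textanalytics package (installed in the cell below), which is used to call the Azure endpoint. The tutorial this notebook is based on additionally mentions pandas for shaping the data.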

In [1]:
!pip install azure-ai-textanalytics --pre

Collecting azure-ai-textanalytics
  Downloading azure_ai_textanalytics-5.3.0b2-py3-none-any.whl (321 kB)
Collecting isodate<1.0.0,>=0.6.1
  Downloading isodate-0.6.1-py2.py3-none-any.whl (41 kB)
Collecting azure-core<2.0.0,>=1.24.0
  Downloading azure_core-1.26.4-py3-none-any.whl (173 kB)
Collecting azure-common~=1.1
  Downloading azure_common-1.1.28-py2.py3-none-any.whl (14 kB)
Installing collected packages: azure-common, isodate, azure-core, azure-ai-textanalytics
Successfully installed azure-ai-textanalytics-5.3.0b2 azure-common-1.1.28 azure-core-1.26.4 isodate-0.6.1

In [2]:
import requests
import json
import ast

from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
azure_endpoint = user_secrets.get_secret("AZURE_LANGUAGE_ENDPOINT")
azure_language_key = user_secrets.get_secret("AZURE_LANGUAGE_KEY")
bearer_token = user_secrets.get_secret("bearer_token")

## Creating the URL

Before you can connect to the Twitter API, you’ll need to set up the URL so that it has the right fields and you get the right data back. First, create a function called create_twitter_url. In this function you’ll declare a variable for your handle; you can replace jessicagarson with your own handle. (In this notebook I changed the query to a keyword, which can be anything, so it doesn't have to be a Twitter handle.) For the recent search endpoint used here, max_results can be anywhere from 10 to 100, but keep in mind that Azure Sentiment Analysis will only process a maximum of 10 records per request.

If you are using a handle that would have more than 100 Tweets in a given week, you may want to build in some logic to handle pagination or use a library such as [**searchtweets-labs**](https://github.com/twitterdev/search-tweets-python). The URL needs to be formatted to contain the maximum number of results and the query that says which Tweets you are looking for. You’ll return the formatted URL in a variable called url, since you will need it to make a GET request later.

Note that the most recent Twitter API URLs are written differently from the Labs URLs used in the original tutorial!

In [3]:
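def create_twitter_url():
    # Search term; despite the variable name, this is a keyword, not a handle.
    handle = "nasi goreng"
    # Number of Tweets to request (Azure handles at most 10 records per request).
    max_results = 10
    mrf = "max_results={}".format(max_results)
    q = "query={}".format(handle)
    # Recent search endpoint of the Twitter API v2.
    url = "https://api.twitter.com/2/tweets/search/recent?{}&{}".format(
        mrf, q
    )
    return url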

The URL you are creating is:

In [4]:
create_twitter_url()

'https://api.twitter.com/2/tweets/search/recent?max_results=10&query=nasi goreng'
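The URL printed above contains a literal space in the query. Requests will generally percent-encode it when the request is sent, but encoding the query explicitly makes the URL safe to log or reuse elsewhere. A minimal sketch using only the standard library; `create_encoded_twitter_url` is a hypothetical variant of the function above and is not part of the original notebook:

```python
from urllib.parse import quote


def create_encoded_twitter_url(keyword="nasi goreng", max_results=10):
    # Hypothetical variant of create_twitter_url (not in the original notebook).
    # quote() percent-encodes characters that are unsafe in a URL, such as spaces.
    mrf = "max_results={}".format(max_results)
    q = "query={}".format(quote(keyword))
    return "https://api.twitter.com/2/tweets/search/recent?{}&{}".format(mrf, q)


print(create_encoded_twitter_url())
```

With the default keyword, the space is emitted as `%20` and the rest of the URL is unchanged.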

You can adjust your query if you wanted to exclude retweets or Tweets that contain media. You can make adjustments to the data that is returned by the Twitter API by adding additional fields and expansions to your query. Using a REST client such as [**Postman**](https://www.postman.com/) or [**Insomnia**](https://insomnia.rest/) can be helpful for seeing what data you get back and making adjustments before you start writing code. There is [**a Postman collection for Labs endpoints**](https://app.getpostman.com/run-collection/c3c275c6ea02c49c3311#?env%5BTwitter%20Developer%20Labs%5D=W3sia2V5IjoiY29uc3VtZXJfa2V5IiwidmFsdWUiOiJZb3VyIGNvbnN1bWVyIGtleSIsImVuYWJsZWQiOnRydWV9LHsia2V5IjoiY29uc3VtZXJfc2VjcmV0IiwidmFsdWUiOiJZb3VyIGNvbnN1bWVyIHNlY3JldCIsImVuYWJsZWQiOnRydWV9LHsia2V5IjoiYWNjZXNzX3Rva2VuIiwidmFsdWUiOiJZb3VyIGFjY2VzcyB0b2tlbiIsImVuYWJsZWQiOnRydWV9LHsia2V5IjoidG9rZW5fc2VjcmV0IiwidmFsdWUiOiJZb3VyIHRva2VuIHNlY3JldCIsImVuYWJsZWQiOnRydWV9LHsia2V5IjoiYmVhcmVyX3Rva2VuIiwidmFsdWUiOm51bGwsImVuYWJsZWQiOnRydWV9XQ==) as well.
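As a sketch of the query adjustments mentioned above, the v2 search operators `-is:retweet` and `-has:media` can be appended to the keyword, and additional Tweet fields can be requested through the `tweet.fields` parameter. The helper name `build_search_url`, its arguments, and the chosen fields are illustrative assumptions, not part of the original notebook:

```python
from urllib.parse import quote, urlencode

SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"


def build_search_url(keyword, max_results=10, exclude_retweets=True,
                     exclude_media=False, tweet_fields=("created_at", "lang")):
    # Hypothetical helper: compose the query from a keyword plus negated operators.
    query = keyword
    if exclude_retweets:
        query += " -is:retweet"
    if exclude_media:
        query += " -has:media"
    params = {
        "max_results": max_results,
        "query": query,
        "tweet.fields": ",".join(tweet_fields),
    }
    # quote_via=quote encodes spaces as %20 rather than "+".
    return SEARCH_URL + "?" + urlencode(params, quote_via=quote)


print(build_search_url("nasi goreng"))
```

Trying the resulting URL in a REST client first is an easy way to check which fields actually come back.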

To connect to the Twitter API, you’ll create a function called twitter_auth_and_connect, where you’ll format the headers to pass in your bearer_token and then use the Requests package to make a GET request to url.

In your main function, you can save the JSON it returns to a variable (named res_json in this notebook). Your main function will then have two variables: one for url and one for the returned data.

In [5]:
def twitter_auth_and_connect(bearer_token, url):
    headers = {"Authorization": "Bearer {}".format(bearer_token)}
    response = requests.request("GET", url, headers=headers)
    return response.json()
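If a search matches more Tweets than fit in a single response, pagination is one option, as noted earlier. The sketch below assumes that a response with further pages carries a `next_token` inside its `meta` object and that this token is sent back as the `next_token` query parameter. `collect_pages` and its `fetch` argument are hypothetical names; `fetch` stands in for a call such as `twitter_auth_and_connect(bearer_token, url)`.

```python
def collect_pages(fetch, base_url, max_pages=5):
    """Follow next_token links and gather the "data" list of every page."""
    tweets = []
    url = base_url
    for _ in range(max_pages):
        page = fetch(url)
        # A page without matches has no "data" key, so fall back to an empty list.
        tweets.extend(page.get("data", []))
        token = page.get("meta", {}).get("next_token")
        if token is None:
            break  # no further pages
        url = "{}&next_token={}".format(base_url, token)
    return tweets
```

In the notebook it could be called as `collect_pages(lambda u: twitter_auth_and_connect(bearer_token, u), create_twitter_url())`. The `max_pages` cap is there to keep the number of requests bounded.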



Instead of using the sample text from the Azure sample code, we will build the documents payload from the Twitter response. The following function does that:

In [6]:
def lang_data_shape(res_json):
    # Wrap the list of Tweets in the {"documents": [...]} shape used in main().
    # Each Tweet is a dict that already carries the "id" and "text" keys.
    return {"documents": res_json["data"]}
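Because the Azure service is described above as processing at most 10 records per request, a result set larger than that would have to be split into batches before it is sent. A minimal sketch, assuming a hypothetical helper named `batch_documents` that is not part of the original notebook:

```python
def batch_documents(documents, batch_size=10):
    """Yield consecutive slices of at most batch_size documents."""
    for start in range(0, len(documents), batch_size):
        yield documents[start:start + batch_size]


# Illustrative data only: 23 placeholder documents.
docs = [{"id": str(i), "text": "tweet {}".format(i)} for i in range(23)]
print([len(batch) for batch in batch_documents(docs)])
```

Each batch could then be passed to the sentiment call on its own and the results concatenated.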

## Azure SDK for Python

Azure provides two examples of sentiment analysis code: **Sentiment Analysis** and **Sentiment Analysis with Opinion Mining**. I use the first one ([**sample_analyze_sentiment.py**](https://github.com/hendro93/azure-sdk-for-python/blob/main/sdk/textanalytics/azure-ai-textanalytics/samples/sample_analyze_sentiment.py)) for the purposes of this notebook.

To connect to Azure, you will need to supply your own values for the following two settings. The Azure sample expects them as environment variables; this notebook reads them from Kaggle secrets in the import cell above:
1. AZURE_LANGUAGE_ENDPOINT - the endpoint to your Language resource.
2. AZURE_LANGUAGE_KEY - your Language subscription key
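
Outside Kaggle, where `kaggle_secrets` is not available, the same two values could be read from environment variables instead. A minimal sketch; `load_azure_settings` is a hypothetical helper, not part of the original notebook or the Azure SDK:

```python
import os


def load_azure_settings(env=os.environ):
    """Return (endpoint, key) from the given mapping, or raise a clear error."""
    try:
        return env["AZURE_LANGUAGE_ENDPOINT"], env["AZURE_LANGUAGE_KEY"]
    except KeyError as missing:
        # Name the missing variable so the problem is easy to spot.
        raise RuntimeError("Missing setting: {}".format(missing.args[0])) from None
```

The returned pair would take the place of `azure_endpoint` and `azure_language_key` in the import cell.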



## Obtaining sentiment scores

In the original tutorial, before you can use Azure’s endpoint for generating sentiment scores, you need to combine the Tweet data with the data that contains the detected languages. pandas can assist in this conversion: you convert the JSON object with detected languages into a data frame and, since you only want the language abbreviations, use a list comprehension to extract the iso6391Name. The iso6391Name is contained in a dictionary, which is inside a list, which in turn is inside the data frame with the language data. You then turn the Tweet data into a data frame as well, attach the language abbreviations to it, and convert the result to JSON. The main function in this notebook skips that step and passes the Tweet documents straight to the Azure SDK client:

In [7]:
def main():
    url = create_twitter_url()
    res_json = twitter_auth_and_connect(bearer_token, url)
    documents = lang_data_shape(res_json)

    text_analytics_client = TextAnalyticsClient(
        endpoint=azure_endpoint, credential=AzureKeyCredential(azure_language_key)
    )
        
    result = text_analytics_client.analyze_sentiment(documents["documents"], show_opinion_mining=False)
    doc_result = [doc for doc in result if not doc.is_error]
    
    for document in doc_result:
        print("\n")
        print(document.sentiment)
        print("=======")
        for sentence in document.sentences:
            print(sentence.text)
            
if __name__ == "__main__":
    main()



neutral
@LonerMighty @Egi_nupe_ You speak melayu and don't know 'goreng goreng'? 
The nasi goreng you eat, don't they cook it before frying?

The progress you posted stated in step that you should cook. 
And step 3 is where the frying comes in... 
And that's typical means of making most 'goreng' in Malaysia


neutral
RT @tastemadeid: Aneka resep nasi goreng. 
Ada yang gak pakai kecap, ada yang penuh bumbu sampai berwarna kemerahan, dan lain-lain.

Video l…


neutral
Mukbang nasi goreng Arab,mie goreng Arab,sosis 3 pcs,kcf,ampela ayam

https://t.co/rNTteUPJgK


neutral
Nasi + tahu gecok, terus di campur, terus di jadiin nasi goreng, gatau dah namanya apaan, kreasi paman tuh https://t.co/olNapHRl07


neutral
RT @tastemadeid: Aneka resep nasi goreng. 
Ada yang gak pakai kecap, ada yang penuh bumbu sampai berwarna kemerahan, dan lain-lain.

Video l…


negative
RT @w_i_d_h_i: @rasjawa Dulu suka chicken pepper rice sejak harganya 29rb, tp setelah harganya tembus 60an rb kok rasanya terlalu

Each Tweet is evaluated for sentiment, and the results are displayed as shown above.

Azure derives an overall label (positive, neutral, or negative) from the sentiment scores it computes for a text, so we no longer need to develop our own formulas to measure sentiment. The service's built-in language detection is another benefit, and the most recent version has [**multilingual support**](https://learn.microsoft.com/en-us/azure/cognitive-services/language-service/language-detection/language-support).
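
Since the notebook's title refers to a weekly view, the per-Tweet labels could also be tallied into a single summary. A minimal sketch, assuming the labels have first been collected into a list (for example with `[doc.sentiment for doc in doc_result]` inside main). `summarize_sentiment` is a hypothetical helper, and the sample labels below are made up for illustration:

```python
from collections import Counter


def summarize_sentiment(labels):
    """Count each label and report the most frequent one (None if empty)."""
    counts = Counter(labels)
    top = counts.most_common(1)[0][0] if counts else None
    return {"counts": dict(counts), "most_common": top}


# Illustrative labels only, not real results.
sample_labels = ["neutral", "neutral", "negative"]
print(summarize_sentiment(sample_labels))
```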
