# Twitter Sentiment Analysis


*Note: This notebook crawls tweets from Twitter for a sentiment analysis of Prabowo-Gibran, the presidential and vice-presidential candidate pair in the 2024 Indonesian election.*

*Prepared by* : **Achmad Dhani & Faris Arief Mawardi**

# I. Introduction

**Background:**

This sentiment analysis aims to gauge public opinion and feelings expressed on social media, specifically Twitter, about the political alliance between Prabowo Subianto and Gibran Rakabuming Raka and their participation in the campaign leading up to the presidential and vice-presidential election scheduled for February 2024.

### 5W1H Key Factors:

**Who:**
- Prabowo Subianto and Gibran Rakabuming Raka, potential presidential and vice-presidential candidates.
- Twitter users expressing opinions and sentiments about these potential candidates.

**What:**
- Sentiment analysis on Twitter data discussing the alliance and active participation of Prabowo and Gibran in the 2024 Indonesian election.
- Gathering tweets, analyzing the sentiment expressed, and understanding public opinion regarding this political alliance.

**When:**
- During the campaign leading up to the 2024 Indonesian election.
- Data collection period: November 19 to December 17, 2023.

**Where:**
- Twitter platform, particularly tweets written in Bahasa Indonesia, discussing #PrabowoGibran2024 or related hashtags.
- Focus might extend to specific regions within Indonesia where opinions may vary.

**Why:**
- To understand public sentiment, feelings, and opinions toward the candidacy of Gibran and Prabowo.
- To provide insights into the potential political alliance's reception among the electorate.

**How:**
- Collecting tweets related to #PrabowoGibran2024 and conducting sentiment analysis.
- Using Natural Language Processing (NLP) techniques to analyze the sentiment of tweets.
- Aggregating data, processing, and interpreting sentiment scores to derive insights.


**Problem Statement:**

This analysis examines the sentiment polarity and intensity of Twitter discussions surrounding the Prabowo-Gibran alliance ahead of the 2024 Indonesian election. The objective is to understand how public sentiment might affect their candidacy and overall political prospects as they campaign for the presidential and vice-presidential election scheduled for February 2024.

# II. Import Libraries and Packages

**Install Selenium**

In [1]:
pip install selenium

Note: you may need to restart the kernel to use updated packages.


**Import Libraries**

In [2]:
import selenium
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from time import sleep
import pandas as pd

# III. Loading Data

In this section, we load the dataset obtained by scraping with Tweet Harvest, a crawler built by:
*Helmi Satria (helmisatria.com)*
*Notebook source:* [GoogleColab](https://colab.research.google.com/drive/1f0dsbESPorxvS4CdFJF-FYW9c63_szg_#scrollTo=4UIL1x21P9rQ)

In [5]:
data = pd.read_csv('Prabowo-Gibran_11122023.csv', delimiter=";")
data

Unnamed: 0,created_at,id_str,full_text,quote_count,reply_count,retweet_count,favorite_count,lang,user_id_str,conversation_id_str,username,tweet_url
0,Mon Dec 11 23:59:00 +0000 2023,1734362175609946293,"Atas Kedatangan Pengungsi Rohingya, Meutya Haf...",0,0,51,11,in,721898354502930433,1734362175609946293,golkarpedia,https://twitter.com/golkarpedia/status/1734362...
1,Mon Dec 11 23:58:21 +0000 2023,1734362014196392392,TKN Prabowo-Gibran Kumpulkan Aktivis 98 dan Ko...,0,9,27,34,in,887743587579944960,1734362014196392392,OposisiCerdas,https://twitter.com/OposisiCerdas/status/17343...
2,Mon Dec 11 23:58:12 +0000 2023,1734361976091066533,@GlowTryz @Heraloebss @prabowo @gibran_tweet @...,0,0,0,0,in,720357434485813248,1733738244855255124,becaseimbibo,https://twitter.com/becaseimbibo/status/173436...
3,Mon Dec 11 23:57:36 +0000 2023,1734361825809187007,@GwynxAsh @ganjarpranowo @mohmahfudmd @aniesba...,0,0,0,0,in,720357434485813248,1734124388755316893,becaseimbibo,https://twitter.com/becaseimbibo/status/173436...
4,Mon Dec 11 23:57:07 +0000 2023,1734361701418709287,@gvfftj @aniesbaswedan @ganjarpranowo @prabowo...,0,0,0,0,in,720357434485813248,1734123706614718758,becaseimbibo,https://twitter.com/becaseimbibo/status/173436...
...,...,...,...,...,...,...,...,...,...,...,...,...
2005,Mon Dec 11 13:02:44 +0000 2023,1734197021270302956,Pendekatan Prabowo-Gibran terhadap pendidikan ...,0,0,0,0,in,1677898591069962240,1734197021270302956,BayuAnggoro0,https://twitter.com/BayuAnggoro0/status/173419...
2006,Mon Dec 11 13:02:42 +0000 2023,1734197014253215938,"Meski Ganjar-Mahfud memiliki program menarik, ...",0,0,0,0,in,1673200201333604352,1734197014253215938,agus_budianto05,https://twitter.com/agus_budianto05/status/173...
2007,Mon Dec 11 13:02:40 +0000 2023,1734197005881324008,Prabowo-Gibran menunjukkan keseriusan untuk me...,0,0,0,0,in,1673523183130136576,1734197005881324008,srii_ratnaa,https://twitter.com/srii_ratnaa/status/1734197...
2008,Mon Dec 11 13:02:39 +0000 2023,1734197002957930571,Keberpihakan Prabowo-Gibran terhadap pendidika...,0,0,0,0,in,1726881121303203840,1734197002957930571,jakaguntuur,https://twitter.com/jakaguntuur/status/1734197...


In [6]:
data = data.sample(710)
data

Unnamed: 0,created_at,id_str,full_text,quote_count,reply_count,retweet_count,favorite_count,lang,user_id_str,conversation_id_str,username,tweet_url
1620,Mon Dec 11 13:22:42 +0000 2023,1734202046289768699,Rencana pembangunan pendidikan Prabowo-Gibran ...,0,0,0,0,in,1678644518168715268,1734202046289768699,AbdulahSup,https://twitter.com/AbdulahSup/status/17342020...
1491,Mon Dec 11 13:30:20 +0000 2023,1734203968555827688,"Hasil Survei Prabowo-Gibran Moncer, AHY Yakin ...",0,0,0,0,in,1669945995923447815,1734203968555827688,MKafka55269,https://twitter.com/MKafka55269/status/1734203...
149,Mon Dec 11 21:28:27 +0000 2023,1734324291225960821,@DarmawatiDe @MurtadhaOne1 @bawaslu_RI @jokowi...,0,0,0,0,in,1688663918099812353,1734230083643298146,RatnaPuspi16942,https://twitter.com/RatnaPuspi16942/status/173...
1084,Mon Dec 11 13:57:14 +0000 2023,1734210735851655323,Kesan Gritte Agatha ketika bertemu Pak Prabowo...,0,0,0,2,in,1610207637513318404,1734210735851655323,gerindrakotasmg,https://twitter.com/gerindrakotasmg/status/173...
818,Mon Dec 11 14:49:12 +0000 2023,1734223813053944032,@tukangaduayam @jokowi @prabowo @gibran_tweet ...,0,1,0,1,in,815290931003826176,1733908926075404647,blantik_pedhet1,https://twitter.com/blantik_pedhet1/status/173...
...,...,...,...,...,...,...,...,...,...,...,...,...
610,Mon Dec 11 15:25:53 +0000 2023,1734233046994538841,@don_muzakir_ @prabowo @gibran_tweet Jangan pe...,0,0,0,0,in,1667604169988141057,1733796514793992445,ToniSetiaw68737,https://twitter.com/ToniSetiaw68737/status/173...
1286,Mon Dec 11 13:40:05 +0000 2023,1734206420441338335,Dukungan kami pada Prabowo-Gibran bukan hanya ...,0,0,0,0,in,1675718595807182848,1734206420441338335,dadangdarma_,https://twitter.com/dadangdarma_/status/173420...
1725,Mon Dec 11 13:17:31 +0000 2023,1734200740401619284,Prabowo-Gibran dianggap memiliki pemahaman yan...,0,0,0,0,in,1678655887202979841,1734200740401619284,Rezamulyadiii,https://twitter.com/Rezamulyadiii/status/17342...
1616,Mon Dec 11 13:22:55 +0000 2023,1734202101893681601,Meski Ganjar-Mahfud menawarkan program nasiona...,0,0,0,0,in,1726904993108946944,1734202101893681601,MohammadHa78532,https://twitter.com/MohammadHa78532/status/173...
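
The sample drawn above differs on every run because no seed is set. The following is a minimal sketch of a repeatable draw, assuming a fixed `random_state`; the value 42 and the toy DataFrame are illustrative and not part of the original notebook.

```python
import pandas as pd

# Toy stand-in for the scraped tweets (hypothetical data).
tweets = pd.DataFrame({"id_str": range(2009), "full_text": ["tweet"] * 2009})

# A fixed random_state (42 is an arbitrary choice) makes the draw repeatable.
sample_a = tweets.sample(n=710, random_state=42)
sample_b = tweets.sample(n=710, random_state=42)

# Both draws select exactly the same rows.
print(sample_a.index.equals(sample_b.index))
```

With a seed in place, the downstream row counts and sentiment totals can be reproduced when the notebook is rerun.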


# IV. Data Exploration

**4.1 Converting the `created_at` Feature to datetime Format**

In [7]:
data.info()

<class 'pandas.core.frame.DataFrame'>
Index: 710 entries, 1620 to 1828
Data columns (total 12 columns):
 #   Column               Non-Null Count  Dtype 
---  ------               --------------  ----- 
 0   created_at           710 non-null    object
 1   id_str               710 non-null    int64 
 2   full_text            710 non-null    object
 3   quote_count          710 non-null    int64 
 4   reply_count          710 non-null    int64 
 5   retweet_count        710 non-null    int64 
 6   favorite_count       710 non-null    int64 
 7   lang                 710 non-null    object
 8   user_id_str          710 non-null    int64 
 9   conversation_id_str  710 non-null    int64 
 10  username             710 non-null    object
 11  tweet_url            710 non-null    object
dtypes: int64(7), object(5)
memory usage: 72.1+ KB


In [8]:
# Converting the 'created_at' column to datetime format
data['created_at'] = pd.to_datetime(data['created_at'])

# Sorting the DataFrame based on the 'created_at' column in descending order
data.sort_values(by='created_at', ascending=False, inplace=True)

# Displaying the DataFrame with the 'created_at' column cleaned and sorted
data


Unnamed: 0,created_at,id_str,full_text,quote_count,reply_count,retweet_count,favorite_count,lang,user_id_str,conversation_id_str,username,tweet_url
5,2023-12-11 23:55:55+00:00,1734361401635082594,@datuakrajoangek hal yg wajar Prabowo Gibran n...,0,0,1,0,in,79985385,1734340002065358901,steven_bawole,https://twitter.com/steven_bawole/status/17343...
7,2023-12-11 23:54:06+00:00,1734360943013019686,@kompascom prabowo gibran ga nyapres pun banso...,0,0,0,0,in,21816575,1734126052509938072,pianlauding,https://twitter.com/pianlauding/status/1734360...
12,2023-12-11 23:47:42+00:00,1734359333243400435,@HaeroenE @prabowo @gibran_tweet @bawaslu_RI H...,0,0,0,1,in,1511952083439996928,1734042369828557008,MukarromahNur78,https://twitter.com/MukarromahNur78/status/173...
13,2023-12-11 23:47:28+00:00,1734359276213461047,Apa saja Program Kerja Prabowo @prabowo Gibran...,0,0,0,1,in,1531971112929615872,1734359276213461047,rkrijuara,https://twitter.com/rkrijuara/status/173435927...
17,2023-12-11 23:41:41+00:00,1734357819061899548,@ShamsiAli2 @prabowo @gibran_tweet emang bisa...,0,2,0,0,in,1603626819898331136,1734357087906402663,denihartono513,https://twitter.com/denihartono513/status/1734...
...,...,...,...,...,...,...,...,...,...,...,...,...
1999,2023-12-11 13:03:10+00:00,1734197129080639815,"Meski program Ganjar-Mahfud menarik, Prabowo-G...",0,0,0,0,in,1677170516623519746,1734197129080639815,AripBud1mn,https://twitter.com/AripBud1mn/status/17341971...
2001,2023-12-11 13:03:07+00:00,1734197119081500830,"Meski Ganjar-Mahfud memiliki program unggulan,...",0,0,0,0,in,1678645194693181440,1734197119081500830,KarismaaDiah,https://twitter.com/KarismaaDiah/status/173419...
2002,2023-12-11 13:03:05+00:00,1734197111347171360,Keberpihakan Prabowo-Gibran terhadap guru ngaj...,1,0,0,0,in,1678679422323544067,1734197111347171360,DirjaIndra3,https://twitter.com/DirjaIndra3/status/1734197...
2006,2023-12-11 13:02:42+00:00,1734197014253215938,"Meski Ganjar-Mahfud memiliki program menarik, ...",0,0,0,0,in,1673200201333604352,1734197014253215938,agus_budianto05,https://twitter.com/agus_budianto05/status/173...
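
The conversion above relies on pandas inferring the timestamp layout. As a hedged alternative, the sketch below passes an explicit format string; `TWITTER_FORMAT` is a name introduced here for illustration, and the pattern is an assumption based on the `created_at` values shown in the raw data.

```python
import pandas as pd

# Two timestamps copied from the raw `created_at` column.
raw = pd.Series([
    "Mon Dec 11 23:59:00 +0000 2023",
    "Mon Dec 11 13:02:39 +0000 2023",
])

# Assumed layout: weekday, month, day, time, UTC offset, year.
TWITTER_FORMAT = "%a %b %d %H:%M:%S %z %Y"

# An explicit format avoids per-element inference and fails loudly on malformed rows.
parsed = pd.to_datetime(raw, format=TWITTER_FORMAT)

print(parsed.sort_values(ascending=False))
```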


**4.2 Extracting Follower Counts and Verified Status from Twitter**

In [9]:
# Import libraries
import os
import time

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException

# Initialize WebDriver
driver = webdriver.Edge()
driver.get("https://twitter.com/i/flow/login")

# Log in; credentials come from environment variables instead of being hard-coded.
# TWITTER_USERNAME and TWITTER_PASSWORD are assumed variable names.
time.sleep(5)
username = driver.find_element(By.XPATH,"//input[@name='text']")
username.send_keys(os.environ["TWITTER_USERNAME"])
next_button = driver.find_element(By.XPATH,"//span[contains(text(),'Next')]")
next_button.click()

time.sleep(5)
password = driver.find_element(By.XPATH,"//input[@name='password']")
password.send_keys(os.environ["TWITTER_PASSWORD"])
log_in = driver.find_element(By.XPATH,"//span[contains(text(),'Log in')]")
log_in.click()

# Wait for the login process to complete before visiting profile pages
time.sleep(5)

# Lists to store verified status and follower counts
VerifiedStatus = []
FollowersCount = []

# Function to scrape profile information
def scrape_profile_info(username):
    driver.get(f"https://twitter.com/{username}")
    time.sleep(5)
    
    # Initialize verified_status and follower_count
    verified_status = "Not Verified"
    follower_count = "Follower Count Not Found"
    
    try:
        # Check if the account is verified
        verified_element = driver.find_element(By.XPATH, "/html/body/div[1]/div/div/div[2]/main/div/div/div/div/div/div[3]/div/div/div/div/div[2]/div[1]/div/div[1]/div/div/span/span[2]/span/span/div/div/svg")
        verified_status = "Verified"
    except NoSuchElementException:
        pass
    
    try:
        # Get the follower count
        follower_element = driver.find_element(By.XPATH, "/html/body/div[1]/div/div/div[2]/main/div/div/div/div/div/div[3]/div/div/div/div/div[5]/div[2]/a/span[1]/span")
        follower_count = follower_element.text
    except NoSuchElementException:
        pass
    
    return verified_status, follower_count

# Iterate through each username in the DataFrame
for username in data['username']:
    verified_status, follower_count = scrape_profile_info(username)
    VerifiedStatus.append(verified_status)
    FollowersCount.append(follower_count)

# Add the verified status and follower count columns to the DataFrame
data['VerifiedStatus'] = VerifiedStatus
data['FollowersCount'] = FollowersCount

# Display the updated DataFrame
data
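
The loop above visits one profile page per tweet, so accounts with several tweets are scraped repeatedly. The sketch below shows one way to scrape each distinct username once and map the result back onto the rows. `scrape_unique_profiles` and `fake_scrape` are hypothetical names; `fake_scrape` stands in for `scrape_profile_info` so the example runs without a browser.

```python
import pandas as pd

def scrape_unique_profiles(usernames, scrape_fn):
    """Call scrape_fn once per distinct username and return a lookup dict."""
    return {name: scrape_fn(name) for name in usernames.unique()}

# Hypothetical stand-in for the Selenium scraper; it records each call.
calls = []
def fake_scrape(name):
    calls.append(name)
    return ("Not Verified", "12")

# Toy data: user "a" posted three of the four tweets.
tweets = pd.DataFrame({"username": ["a", "b", "a", "a"]})
profiles = scrape_unique_profiles(tweets["username"], fake_scrape)

# Map the per-user results back onto every tweet row.
tweets["VerifiedStatus"] = tweets["username"].map(lambda u: profiles[u][0])
tweets["FollowersCount"] = tweets["username"].map(lambda u: profiles[u][1])

print(len(calls), "profile visits for", len(tweets), "tweets")
```

Since every profile visit includes a five-second sleep, cutting duplicate visits shortens the scraping run in proportion to the number of repeat authors.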

**4.3 Predicting the Sentiments using indonesian-roberta-base-sentiment-classifier**

In [17]:
import warnings
warnings.filterwarnings("ignore")

In [18]:
pip install transformers

Note: you may need to restart the kernel to use updated packages.


In [19]:
from transformers import pipeline
import pandas as pd




In [43]:
pretrained_name = "w11wo/indonesian-roberta-base-sentiment-classifier"

nlp = pipeline(
    "sentiment-analysis",
    model=pretrained_name,
    tokenizer=pretrained_name
)

Some weights of the PyTorch model were not used when initializing the TF 2.0 model TFRobertaForSequenceClassification: ['roberta.embeddings.position_ids']
- This IS expected if you are initializing TFRobertaForSequenceClassification from a PyTorch model trained on another task or with another architecture (e.g. initializing a TFBertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFRobertaForSequenceClassification from a PyTorch model that you expect to be exactly identical (e.g. initializing a TFBertForSequenceClassification model from a BertForSequenceClassification model).
All the weights of TFRobertaForSequenceClassification were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFRobertaForSequenceClassification for predictions without further training.


In [44]:
def sentiment_result(text):
    result= nlp(text)
    if result[0]['label'] == 'positive':
        return 'positif'
    elif result[0]['label'] == 'negative':
        return 'negatif'
    else:
        return 'netral'
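
Hugging Face pipelines also accept a list of texts and return one result dict per text, so the labeling can be done in a single call rather than row by row. The sketch below illustrates that pattern under stated assumptions: `label_texts` and `LABEL_MAP` are names introduced here, and `fake_classifier` is a hypothetical stand-in for `nlp` so the example runs without downloading the model.

```python
# Assumed mapping from the model's English labels to the Indonesian labels used here.
LABEL_MAP = {"positive": "positif", "negative": "negatif", "neutral": "netral"}

def label_texts(classifier, texts):
    """Classify all texts in one call and translate the labels to Indonesian."""
    results = classifier(list(texts))
    # Any unexpected label falls back to 'netral', mirroring sentiment_result.
    return [LABEL_MAP.get(r["label"], "netral") for r in results]

# Hypothetical stand-in that mimics the pipeline's output shape.
def fake_classifier(texts):
    return [
        {"label": "negative" if "sedih" in t else "neutral", "score": 0.9}
        for t in texts
    ]

labels = label_texts(
    fake_classifier,
    ["saya sedih sekali hari ini", "hari ini hari Senin"],
)
print(labels)
```

In the notebook, `nlp` would be passed in place of `fake_classifier`, and the returned list could be assigned directly to `data['sentiment_label']`.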

In [151]:
test = nlp('saya sedih sekali hari ini')

In [46]:
test[0]['label']

'negative'

In [47]:
data.info()

<class 'pandas.core.frame.DataFrame'>
Index: 600 entries, 0 to 1036
Data columns (total 14 columns):
 #   Column               Non-Null Count  Dtype              
---  ------               --------------  -----              
 0   created_at           600 non-null    datetime64[ns, UTC]
 1   id_str               600 non-null    int64              
 2   full_text            600 non-null    object             
 3   quote_count          600 non-null    int64              
 4   reply_count          600 non-null    int64              
 5   retweet_count        600 non-null    int64              
 6   favorite_count       600 non-null    int64              
 7   lang                 600 non-null    object             
 8   user_id_str          600 non-null    int64              
 9   conversation_id_str  600 non-null    int64              
 10  username             600 non-null    object             
 11  tweet_url            600 non-null    object             
 12  VerifiedStatus       600 n

In [180]:
data['sentiment_label']= data['full_text'].apply(sentiment_result)

In [168]:
data

Unnamed: 0,created_at,id_str,full_text,quote_count,reply_count,retweet_count,favorite_count,lang,user_id_str,conversation_id_str,username,tweet_url,VerifiedStatus,FollowersCount,sentiment_label
1,2023-12-09 23:59:12+00:00,1733637452169142767,Sedikit demi sedikit kejanggalan mulai terjawa...,0,5,2,3,in,1615964192313606144,1733637452169142767,sejatidindaa,https://twitter.com/sejatidindaa/status/173363...,Not Verified,12,negatif
3,2023-12-09 23:58:52+00:00,1733637369532928216,Prabowo tidak layak jadi Presiden #PrabowoAro...,0,0,0,1,in,1615964192313606144,1733637369532928216,sejatidindaa,https://twitter.com/sejatidindaa/status/173363...,Not Verified,12,negatif
10,2023-12-09 23:52:04+00:00,1733635655207641327,@PartaiSocmed Tapi Prabowo Gibran mendukung ka...,0,0,0,0,in,58460531,1730846106316378408,bennikusyana,https://twitter.com/bennikusyana/status/173363...,Not Verified,Follower Count Not Found,negatif
11,2023-12-09 23:50:39+00:00,1733635299274883567,@JELAS_SEKALI @ganjarpranowo @prabowo @gibran_...,0,1,0,1,in,830031149510520833,1733289954284105772,Matinokhoirudi2,https://twitter.com/Matinokhoirudi2/status/173...,Not Verified,307,negatif
13,2023-12-09 23:49:32+00:00,1733635017702875142,@liputan6dotcom SURVEYOR PASTI AKAN MEMBUAT SE...,0,0,0,1,in,1354654742333427713,1733273804447809829,UmarSoleh17,https://twitter.com/UmarSoleh17/status/1733635...,Not Verified,Follower Count Not Found,negatif
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1057,2023-12-09 11:12:01+00:00,1733444384170070350,"Hasil Survei Prabowo-Gibran Moncer, AHY Yakin ...",0,0,0,0,in,2758817169,1733444384170070350,jamilsada_adl,https://twitter.com/jamilsada_adl/status/17334...,Not Verified,3,netral
1058,2023-12-09 11:11:39+00:00,1733444291836747983,AHY Tegaskan Demokrat Siap Kampanye Bareng Pra...,0,0,0,0,in,2758817169,1733444291836747983,jamilsada_adl,https://twitter.com/jamilsada_adl/status/17334...,Not Verified,3,netral
1059,2023-12-09 11:11:32+00:00,1733444261436424502,SBY Instruksikan Kader Demokrat Menangkan Prab...,0,0,0,0,in,2758817169,1733444261436424502,jamilsada_adl,https://twitter.com/jamilsada_adl/status/17334...,Not Verified,3,netral
1062,2023-12-09 11:09:17+00:00,1733443694781665557,@elde_barge123 @kompascom @prabowo @gibran_twe...,0,1,0,0,in,1675595881834500096,1733384505589932118,g1taman,https://twitter.com/g1taman/status/17334436947...,Not Verified,Follower Count Not Found,negatif


In [181]:
data['sentiment_label'].value_counts()

sentiment_label
netral     340
positif    200
negatif    152
Name: count, dtype: int64
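
Raw counts are easier to compare as shares of the total. The sketch below rebuilds a label column from the counts reported above and converts it to percentages; the reconstructed Series is a stand-in for `data['sentiment_label']`.

```python
import pandas as pd

# Stand-in for data['sentiment_label'], rebuilt from the counts above.
labels = pd.Series(
    ["netral"] * 340 + ["positif"] * 200 + ["negatif"] * 152,
    name="sentiment_label",
)

# normalize=True returns fractions; scale to percentages and round.
shares = labels.value_counts(normalize=True).mul(100).round(1)
print(shares)
```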

In [182]:
# Function to convert a sentiment label into a numeric value
def convert_sentiment_to_score(sentiment_label):
    if sentiment_label == 'positif':
        return 1
    elif sentiment_label == 'netral':
        return 0
    elif sentiment_label == 'negatif':
        return -1
    else:
        return None  # Handle any other sentiment values, if present

# Add a new 'sentiment_score' column with values based on the sentiment label
data['sentiment_score'] = data['sentiment_label'].apply(convert_sentiment_to_score)

# Display the DataFrame with the new 'sentiment_score' column
data

Unnamed: 0,created_at,id_str,full_text,quote_count,reply_count,retweet_count,favorite_count,lang,user_id_str,conversation_id_str,username,tweet_url,VerifiedStatus,FollowersCount,sentiment_label,sentiment_score
0,2023-12-10 23:59:40+00:00,1733999955642368278,"@jokowi kalau prabowo gibran menang, gak kebay...",0,0,0,0,in,1726765877855703040,1733031085590872283,iluhkomang395,https://twitter.com/iluhkomang395/status/17339...,Not Verified,Follower Count Not Found,negatif,-1
1,2023-12-10 23:59:31+00:00,1733999919764242769,@VIVAcoid Prabowo gibran selalu cinta dengan r...,0,0,0,0,in,1709856794087141376,1733656918206652440,terdahulu11,https://twitter.com/terdahulu11/status/1733999...,Not Verified,208,positif,1
2,2023-12-10 23:59:07+00:00,1733999817528168803,1. Anies - Amin 2. Prabowo - Gibran 3. Ganjar ...,0,0,0,0,in,1494197185876881408,1733999817528168803,dhikaaa____,https://twitter.com/dhikaaa____/status/1733999...,Not Verified,Follower Count Not Found,netral,0
3,2023-12-10 23:59:00+00:00,1733999787777872170,"Pimpin Kunker Komisi X DPR, Hetifah Gali Poten...",0,0,45,7,in,721898354502930433,1733999787777872170,golkarpedia,https://twitter.com/golkarpedia/status/1733999...,Not Verified,2310,netral,0
4,2023-12-10 23:58:49+00:00,1733999743876071787,"@jokowi kalau prabowo gibran menang, gak kebay...",0,0,0,0,in,1726485377605881856,1733031085590872283,panbingin41,https://twitter.com/panbingin41/status/1733999...,Not Verified,Follower Count Not Found,negatif,-1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
687,2023-12-10 14:55:00+00:00,1733862885611458735,"Kompak Berkemeja Biru, Prabowo-Gibran Hadiri K...",0,2,0,3,in,69183155,1733862885611458735,detikcom,https://twitter.com/detikcom/status/1733862885...,Not Verified,21.3M,netral,0
688,2023-12-10 14:54:56+00:00,1733862871434686600,"Anak muda, yuk peduli! TKN Fanta Prabowo-Gibra...",0,0,0,0,in,1625353782791135234,1733862871434686600,adyanandita4,https://twitter.com/adyanandita4/status/173386...,Not Verified,61,netral,0
689,2023-12-10 14:54:41+00:00,1733862808737984548,@gibran_gen @gibran_tweet @prabowo Mendulang s...,0,0,0,0,in,1718714243288039426,1733495859982348498,SadieHerri43233,https://twitter.com/SadieHerri43233/status/173...,Not Verified,Follower Count Not Found,positif,1
690,2023-12-10 14:54:18+00:00,1733862710700650867,Survei Polstat: Prabowo-Gibran Kokoh di Puncak...,0,0,0,0,in,1686229274532147200,1733862710700650867,Netizenn08,https://twitter.com/Netizenn08/status/17338627...,Not Verified,12,netral,0
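
The same conversion can be written as a dictionary lookup with `Series.map`, and the resulting scores can be averaged into a single net-sentiment figure. This is a sketch on toy labels; `SCORES` and `net_sentiment` are names introduced here for illustration.

```python
import pandas as pd

# Same mapping as convert_sentiment_to_score, expressed as a dict.
SCORES = {"positif": 1, "netral": 0, "negatif": -1}

# Toy labels standing in for data['sentiment_label'].
labels = pd.Series(["positif", "netral", "negatif", "positif"])

# map() leaves NaN for any label missing from the dict.
scores = labels.map(SCORES)

# The mean lies between -1 (all negative) and 1 (all positive).
net_sentiment = scores.mean()
print(net_sentiment)
```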


In [183]:
# Find and extract hashtags from the full_text column
data['hashtag'] = data['full_text'].str.findall(r'#\w+')

In [184]:
data

Unnamed: 0,created_at,id_str,full_text,quote_count,reply_count,retweet_count,favorite_count,lang,user_id_str,conversation_id_str,username,tweet_url,VerifiedStatus,FollowersCount,sentiment_label,sentiment_score,hashtag
0,2023-12-10 23:59:40+00:00,1733999955642368278,"@jokowi kalau prabowo gibran menang, gak kebay...",0,0,0,0,in,1726765877855703040,1733031085590872283,iluhkomang395,https://twitter.com/iluhkomang395/status/17339...,Not Verified,Follower Count Not Found,negatif,-1,[]
1,2023-12-10 23:59:31+00:00,1733999919764242769,@VIVAcoid Prabowo gibran selalu cinta dengan r...,0,0,0,0,in,1709856794087141376,1733656918206652440,terdahulu11,https://twitter.com/terdahulu11/status/1733999...,Not Verified,208,positif,1,[]
2,2023-12-10 23:59:07+00:00,1733999817528168803,1. Anies - Amin 2. Prabowo - Gibran 3. Ganjar ...,0,0,0,0,in,1494197185876881408,1733999817528168803,dhikaaa____,https://twitter.com/dhikaaa____/status/1733999...,Not Verified,Follower Count Not Found,netral,0,[]
3,2023-12-10 23:59:00+00:00,1733999787777872170,"Pimpin Kunker Komisi X DPR, Hetifah Gali Poten...",0,0,45,7,in,721898354502930433,1733999787777872170,golkarpedia,https://twitter.com/golkarpedia/status/1733999...,Not Verified,2310,netral,0,"[#airlanggahartarto, #kuningkeren, #PrabowoGib..."
4,2023-12-10 23:58:49+00:00,1733999743876071787,"@jokowi kalau prabowo gibran menang, gak kebay...",0,0,0,0,in,1726485377605881856,1733031085590872283,panbingin41,https://twitter.com/panbingin41/status/1733999...,Not Verified,Follower Count Not Found,negatif,-1,[]
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
687,2023-12-10 14:55:00+00:00,1733862885611458735,"Kompak Berkemeja Biru, Prabowo-Gibran Hadiri K...",0,2,0,3,in,69183155,1733862885611458735,detikcom,https://twitter.com/detikcom/status/1733862885...,Not Verified,21.3M,netral,0,[#detikcom]
688,2023-12-10 14:54:56+00:00,1733862871434686600,"Anak muda, yuk peduli! TKN Fanta Prabowo-Gibra...",0,0,0,0,in,1625353782791135234,1733862871434686600,adyanandita4,https://twitter.com/adyanandita4/status/173386...,Not Verified,61,netral,0,[]
689,2023-12-10 14:54:41+00:00,1733862808737984548,@gibran_gen @gibran_tweet @prabowo Mendulang s...,0,0,0,0,in,1718714243288039426,1733495859982348498,SadieHerri43233,https://twitter.com/SadieHerri43233/status/173...,Not Verified,Follower Count Not Found,positif,1,[]
690,2023-12-10 14:54:18+00:00,1733862710700650867,Survei Polstat: Prabowo-Gibran Kokoh di Puncak...,0,0,0,0,in,1686229274532147200,1733862710700650867,Netizenn08,https://twitter.com/Netizenn08/status/17338627...,Not Verified,12,netral,0,"[#PrabowoPresiden, #PrabowoGibran, #pilpres2024]"
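
Once each row holds a list of hashtags, the lists can be flattened to count how often each tag appears. The sketch below uses toy tweets and lowercases the tags so that variants such as `#PrabowoGibran` and `#prabowogibran` are counted together; that normalization is an assumption, not a step from the notebook.

```python
import pandas as pd

# Toy tweets standing in for data['full_text'].
tweets = pd.DataFrame({
    "full_text": [
        "Survei #PrabowoGibran #pilpres2024",
        "tanpa hashtag",
        "#prabowogibran menang",
    ]
})

# Same extraction as in the notebook.
tweets["hashtag"] = tweets["full_text"].str.findall(r"#\w+")

# explode() yields one row per hashtag; empty lists become NaN and are dropped.
counts = (
    tweets["hashtag"]
    .explode()
    .dropna()
    .str.lower()
    .value_counts()
)
print(counts)
```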


In [185]:
data['FollowersCount'].value_counts()

FollowersCount
Follower Count Not Found    400
0                            23
5                            13
3                             9
26                            7
                           ... 
23.5K                         1
640                           1
5,119                         1
1,676                         1
61                            1
Name: count, Length: 181, dtype: int64
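
The scraped follower counts are display strings in mixed formats (`61`, `5,119`, `23.5K`, `21.3M`, or the not-found placeholder), so they cannot be sorted or aggregated numerically as they stand. The sketch below shows one possible converter; `parse_follower_count` is a hypothetical helper, and the suffix rules are assumed from the values shown above.

```python
def parse_follower_count(text):
    """Convert strings such as '5,119', '23.5K', or '21.3M' to an int.

    Returns None when the text is not a recognizable number.
    """
    multipliers = {"K": 1_000, "M": 1_000_000}
    cleaned = text.replace(",", "").strip()
    if not cleaned:
        return None
    # A trailing K or M scales the number; anything else means no scaling.
    factor = multipliers.get(cleaned[-1].upper(), 1)
    if factor != 1:
        cleaned = cleaned[:-1]
    try:
        return int(round(float(cleaned) * factor))
    except ValueError:
        return None

for value in ["61", "5,119", "23.5K", "21.3M", "Follower Count Not Found"]:
    print(value, "->", parse_follower_count(value))
```

Abbreviated values such as `23.5K` are already rounded by Twitter's display, so the converted numbers are approximations for large accounts.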

In [186]:
data.info()

<class 'pandas.core.frame.DataFrame'>
Index: 692 entries, 0 to 691
Data columns (total 17 columns):
 #   Column               Non-Null Count  Dtype              
---  ------               --------------  -----              
 0   created_at           692 non-null    datetime64[ns, UTC]
 1   id_str               692 non-null    int64              
 2   full_text            692 non-null    object             
 3   quote_count          692 non-null    int64              
 4   reply_count          692 non-null    int64              
 5   retweet_count        692 non-null    int64              
 6   favorite_count       692 non-null    int64              
 7   lang                 692 non-null    object             
 8   user_id_str          692 non-null    int64              
 9   conversation_id_str  692 non-null    int64              
 10  username             692 non-null    object             
 11  tweet_url            692 non-null    object             
 12  VerifiedStatus       692 no

**Export Dataset**

In [175]:
# Export dataset into csv
data.to_csv('cleaned_10122023.csv', index=False)

# V. Conclusions 

1. The sentiment prediction ran successfully, and the labeled dataset is ready for use in this case study.
2. Advantages of scraping with Tweet Harvest include:
    - A faster and more efficient scraping process.
    - The ability to gather a larger volume of data in a shorter timeframe.
3. Disadvantages include:
    - No information on whether an account is verified, which could indicate user influence or a private user status.
    - No user follower counts without registering for Twitter Developer access, a process that may take up to two weeks to obtain consumer keys and authorizations.