### Sentiment Analysis of News Headlines on the Indonesian VAT (PPN) Increase to 12%

In [None]:
# %pip install textblob
# %pip install googletrans==3.1.0a0

In [17]:
import pandas as pd
import numpy as np
from googletrans import Translator
from textblob import TextBlob

#### Outline

1. Data labeling
2. Data splitting
3. Exploratory data analysis (EDA)
4. Data preprocessing
5. Fitting (training data)
6. Testing
7. Evaluation

#### 1. Business Understanding

The Indonesian government wants to understand the impact of the VAT increase as reflected in the headlines of Indonesian online news outlets. It has engaged a company (PT AI Bahagia) to classify whether the responses are positive or not.

#### 2. Objective Metrics

As a data analyst and data scientist at the company, you are tasked with building a sentiment analysis of headlines from Indonesian online news outlets.

#### 3. Dataset Preparation

The dataset was created by scraping Indonesian online news outlets.

(See `scraper.ipynb` for the scraping code.)

In [3]:
df = pd.read_csv('headlines_ppn_12percent.csv', sep=';')
df

Unnamed: 0,no,url,title,platform
0,0,https://finance.detik.com/berita-ekonomi-bisni...,Jubir Luhut Berikan Penjelasan soal PPN Naik J...,detik.com
1,1,https://finance.detik.com/detiktv/d-7659190/vi...,Video Luhut Sebut Pajak 12% Diundur,detik.com
2,2,https://finance.detik.com/berita-ekonomi-bisni...,Luhut Sebut PPN Naik Jadi 12% Bakal Diundur!,detik.com
3,3,https://finance.detik.com/berita-ekonomi-bisni...,Pengusaha Dipanggil Kemenkeu Bahas PPN Naik Ja...,detik.com
4,4,https://finance.detik.com/ekonomi-bisnis/d-765...,Airlangga & Sri Mulyani Kompak Ogah Respons Pe...,detik.com
...,...,...,...,...
409,409,https://www.liputan6.com/bisnis/read/5673215/p...,"PPN Naik jadi 12%, Siap-Siap Harga Barang Maki...",liputan6.com
410,410,https://www.liputan6.com/bisnis/read/5659024/p...,Pengusaha Mal Tolak PPN 12%: Gerus Daya Beli M...,liputan6.com
411,411,https://www.liputan6.com/bisnis/read/5654102/p...,"PPN Bakal Naik Tahun Depan, Siap-Siap!",liputan6.com
412,412,https://www.liputan6.com/bisnis/read/5639188/m...,"Minta Rencana PPN 12% di 2025 ditunda, Faisal ...",liputan6.com


#### 4. Data Labeling

- Translate the Indonesian titles into English
- Compute a polarity score for each title with TextBlob and use it to label the data

**Indonesian text -> translate to English -> use a library to get the polarity score -> categorize according to the score**

Translate the title into EN to get polarity scores

https://stackoverflow.com/questions/67698072/translate-dataframe-columns-in-python-using-google-trans-new-library

In [None]:
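translator = Translator()

# Translate each headline from Indonesian ('id') to English ('en');
# str(x) guards against non-string values such as NaN.
df['title_en'] = df['title'].apply(lambda x: translator.translate(str(x), src='id', dest='en').text)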
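
Translating row by row sends one request per headline, so repeated headlines are translated more than once and a single network error aborts the whole `apply`. The following is a minimal sketch of one way to soften that, not the approach used in this notebook. The helper name `translate_unique` and its `translate_fn` parameter are hypothetical; `translate_fn` stands in for any callable that takes an Indonesian string and returns its English translation.

```python
from typing import Callable, Dict, Iterable, List, Optional


def translate_unique(
    titles: Iterable[str],
    translate_fn: Callable[[str], str],
) -> List[Optional[str]]:
    """Translate each distinct title once; None marks a failed translation.

    `translate_fn` is a hypothetical stand-in for the real translator call.
    """
    cache: Dict[str, Optional[str]] = {}
    results: List[Optional[str]] = []
    for title in titles:
        key = str(title)
        if key not in cache:
            try:
                cache[key] = translate_fn(key)
            except Exception:
                # Keep going; rows left as None can be retried later.
                cache[key] = None
        results.append(cache[key])
    return results
```

Under these assumptions it could be used as `df['title_en'] = translate_unique(df['title'], lambda t: translator.translate(t, src='id', dest='en').text)`, after which rows where `title_en` is missing can be inspected or retried.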

In [16]:
df

Unnamed: 0,no,url,title,platform,title_en
0,0,https://finance.detik.com/berita-ekonomi-bisni...,Jubir Luhut Berikan Penjelasan soal PPN Naik J...,detik.com,Luhut's spokesperson provides an explanation r...
1,1,https://finance.detik.com/detiktv/d-7659190/vi...,Video Luhut Sebut Pajak 12% Diundur,detik.com,Luhut's video says the 12% tax has been postponed
2,2,https://finance.detik.com/berita-ekonomi-bisni...,Luhut Sebut PPN Naik Jadi 12% Bakal Diundur!,detik.com,Luhut Says VAT Increase to 12% Will Be Postponed!
3,3,https://finance.detik.com/berita-ekonomi-bisni...,Pengusaha Dipanggil Kemenkeu Bahas PPN Naik Ja...,detik.com,Entrepreneurs Summoned by Ministry of Finance ...
4,4,https://finance.detik.com/ekonomi-bisnis/d-765...,Airlangga & Sri Mulyani Kompak Ogah Respons Pe...,detik.com,Airlangga and Sri Mulyani Compact Refuse to Re...
...,...,...,...,...,...
409,409,https://www.liputan6.com/bisnis/read/5673215/p...,"PPN Naik jadi 12%, Siap-Siap Harga Barang Maki...",liputan6.com,"VAT Increases to 12%, Get Ready for Prices of ..."
410,410,https://www.liputan6.com/bisnis/read/5659024/p...,Pengusaha Mal Tolak PPN 12%: Gerus Daya Beli M...,liputan6.com,Mall Entrepreneurs Reject 12% VAT: Eroding Peo...
411,411,https://www.liputan6.com/bisnis/read/5654102/p...,"PPN Bakal Naik Tahun Depan, Siap-Siap!",liputan6.com,"VAT Will Increase Next Year, Get Ready!"
412,412,https://www.liputan6.com/bisnis/read/5639188/m...,"Minta Rencana PPN 12% di 2025 ditunda, Faisal ...",liputan6.com,Asking for the 12% VAT plan in 2025 to be post...


Polarity Scores Using TextBlob

In [21]:
def text_polarity(text):
    """Return the TextBlob polarity of `text`, a float in [-1.0, 1.0]."""
    blob = TextBlob(text)
    score = blob.sentiment.polarity
    return score

In [26]:
# title_en_str = df['title_en'].astype(str)

df['score'] = df['title_en'].astype(str).apply(text_polarity)
df

Unnamed: 0,no,url,title,platform,title_en,score
0,0,https://finance.detik.com/berita-ekonomi-bisni...,Jubir Luhut Berikan Penjelasan soal PPN Naik J...,detik.com,Luhut's spokesperson provides an explanation r...,0.000000
1,1,https://finance.detik.com/detiktv/d-7659190/vi...,Video Luhut Sebut Pajak 12% Diundur,detik.com,Luhut's video says the 12% tax has been postponed,0.000000
2,2,https://finance.detik.com/berita-ekonomi-bisni...,Luhut Sebut PPN Naik Jadi 12% Bakal Diundur!,detik.com,Luhut Says VAT Increase to 12% Will Be Postponed!,0.000000
3,3,https://finance.detik.com/berita-ekonomi-bisni...,Pengusaha Dipanggil Kemenkeu Bahas PPN Naik Ja...,detik.com,Entrepreneurs Summoned by Ministry of Finance ...,0.000000
4,4,https://finance.detik.com/ekonomi-bisnis/d-765...,Airlangga & Sri Mulyani Kompak Ogah Respons Pe...,detik.com,Airlangga and Sri Mulyani Compact Refuse to Re...,0.000000
...,...,...,...,...,...,...
409,409,https://www.liputan6.com/bisnis/read/5673215/p...,"PPN Naik jadi 12%, Siap-Siap Harga Barang Maki...",liputan6.com,"VAT Increases to 12%, Get Ready for Prices of ...",0.066667
410,410,https://www.liputan6.com/bisnis/read/5659024/p...,Pengusaha Mal Tolak PPN 12%: Gerus Daya Beli M...,liputan6.com,Mall Entrepreneurs Reject 12% VAT: Eroding Peo...,0.000000
411,411,https://www.liputan6.com/bisnis/read/5654102/p...,"PPN Bakal Naik Tahun Depan, Siap-Siap!",liputan6.com,"VAT Will Increase Next Year, Get Ready!",0.125000
412,412,https://www.liputan6.com/bisnis/read/5639188/m...,"Minta Rencana PPN 12% di 2025 ditunda, Faisal ...",liputan6.com,Asking for the 12% VAT plan in 2025 to be post...,0.000000


In [30]:
#neutral
df[df['score'] == 0].count()

no          205
url         205
title       205
platform    205
title_en    205
score       205
dtype: int64

In [31]:
#negative
df[df['score'] < 0].count()

no          69
url         69
title       69
platform    69
title_en    69
score       69
dtype: int64

In [32]:
#positive
df[df['score'] > 0].count()

no          140
url         140
title       140
platform    140
title_en    140
score       140
dtype: int64

Categorize the polarity scores to label the data

In [34]:
df['positive'] = np.where(df['score'] > 0, 1, 0)
df['neutral'] = np.where(df['score'] == 0, 1, 0)
df['negative'] = np.where(df['score'] < 0, 1, 0)

df

Unnamed: 0,no,url,title,platform,title_en,score,positive,neutral,negative
0,0,https://finance.detik.com/berita-ekonomi-bisni...,Jubir Luhut Berikan Penjelasan soal PPN Naik J...,detik.com,Luhut's spokesperson provides an explanation r...,0.000000,0,1,0
1,1,https://finance.detik.com/detiktv/d-7659190/vi...,Video Luhut Sebut Pajak 12% Diundur,detik.com,Luhut's video says the 12% tax has been postponed,0.000000,0,1,0
2,2,https://finance.detik.com/berita-ekonomi-bisni...,Luhut Sebut PPN Naik Jadi 12% Bakal Diundur!,detik.com,Luhut Says VAT Increase to 12% Will Be Postponed!,0.000000,0,1,0
3,3,https://finance.detik.com/berita-ekonomi-bisni...,Pengusaha Dipanggil Kemenkeu Bahas PPN Naik Ja...,detik.com,Entrepreneurs Summoned by Ministry of Finance ...,0.000000,0,1,0
4,4,https://finance.detik.com/ekonomi-bisnis/d-765...,Airlangga & Sri Mulyani Kompak Ogah Respons Pe...,detik.com,Airlangga and Sri Mulyani Compact Refuse to Re...,0.000000,0,1,0
...,...,...,...,...,...,...,...,...,...
409,409,https://www.liputan6.com/bisnis/read/5673215/p...,"PPN Naik jadi 12%, Siap-Siap Harga Barang Maki...",liputan6.com,"VAT Increases to 12%, Get Ready for Prices of ...",0.066667,1,0,0
410,410,https://www.liputan6.com/bisnis/read/5659024/p...,Pengusaha Mal Tolak PPN 12%: Gerus Daya Beli M...,liputan6.com,Mall Entrepreneurs Reject 12% VAT: Eroding Peo...,0.000000,0,1,0
411,411,https://www.liputan6.com/bisnis/read/5654102/p...,"PPN Bakal Naik Tahun Depan, Siap-Siap!",liputan6.com,"VAT Will Increase Next Year, Get Ready!",0.125000,1,0,0
412,412,https://www.liputan6.com/bisnis/read/5639188/m...,"Minta Rencana PPN 12% di 2025 ditunda, Faisal ...",liputan6.com,Asking for the 12% VAT plan in 2025 to be post...,0.000000,0,1,0
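
The three indicator columns above follow a simple sign rule: a score above zero is positive, below zero is negative, and exactly zero is neutral. As a minimal sketch, the same rule can be written as a single function that returns one label per score. The names `polarity_to_label` and `label_counts` and the `threshold` parameter are assumptions added for illustration; with the default `threshold=0.0` the function reproduces the rule used in this notebook, while a small positive threshold would treat weak scores as neutral.

```python
from collections import Counter
from typing import Iterable


def polarity_to_label(score: float, threshold: float = 0.0) -> str:
    """Map a polarity score to 'positive', 'negative', or 'neutral'.

    `threshold` is a hypothetical dead zone around zero; 0.0 keeps the
    notebook's rule (score > 0, score < 0, score == 0).
    """
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


def label_counts(scores: Iterable[float], threshold: float = 0.0) -> Counter:
    """Count how many scores fall into each sentiment label."""
    return Counter(polarity_to_label(s, threshold) for s in scores)
```

For example, `df['label'] = df['score'].apply(polarity_to_label)` would add a single categorical column, and `label_counts(df['score'])` would give the class sizes in one pass.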


Export the dataframe

In [35]:
# to_csv returns None when a file path is given, so there is nothing to assign.
df.to_csv('news_headlines_final.csv')

#### 5. Data Splitting