
Indonesian text clustering of tweets about PPKM (Pemberlakuan Pembatasan Kegiatan Masyarakat, Enforcement of Community Activity Restrictions)


agus2121/Clustering--PPKM-Tweet


Clustering--tweet-PPKM

Dataset

I pulled the text data from the Twitter API using the keyword 'PPKM'.

Steps

  1. Get the data from Twitter with Tweepy.
  2. Preprocess the text: drop duplicate tweets, remove emoticons, punctuation, and stopwords, and run spell checking.
  3. Create some visualizations (word cloud, n-grams) to get insight into the data.
  4. Convert the text data to numeric features with a TF-IDF vectorizer.
  5. Cluster the text data and create a visualization for the elbow method.
  6. Choose the best n_cluster and analyze the data again with KMeans.
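
Steps 2 and 4 to 6 can be sketched roughly as follows. This is a minimal illustration, not the code from this repository: the helper names (`clean_tweet`, `preprocess`, `elbow_inertias`), the tiny `STOPWORDS` set, and the URL/mention stripping are assumptions made for the example, and spell checking is left out. It assumes scikit-learn is installed for the clustering part; in practice the stopword list would more likely come from a library such as the one in [1].

```python
import re
import string

# Hypothetical stand-in stopword list; the project's real list is not shown here.
STOPWORDS = {"yang", "dan", "di", "ke", "dari", "ini", "itu"}

# Assumption: URLs and @mentions are noise and are removed before tokenizing.
URL_OR_MENTION = re.compile(r"https?://\S+|@\w+")
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_tweet(text, stopwords=STOPWORDS):
    """Lowercase, then strip URLs/mentions, emoji, punctuation, and stopwords."""
    text = URL_OR_MENTION.sub(" ", text.lower())
    # Dropping every non-ASCII character is a crude way to remove emoji.
    text = text.encode("ascii", "ignore").decode("ascii")
    text = text.translate(PUNCT_TO_SPACE)
    return " ".join(w for w in text.split() if w not in stopwords)


def preprocess(tweets):
    """Clean every tweet, then drop empty results and duplicates (order kept)."""
    cleaned = (clean_tweet(t) for t in tweets)
    return list(dict.fromkeys(c for c in cleaned if c))


def elbow_inertias(docs, k_values):
    """Fit KMeans on TF-IDF vectors for each k and return {k: inertia}."""
    # Imported here so the cleaning helpers work without scikit-learn installed.
    from sklearn.cluster import KMeans
    from sklearn.feature_extraction.text import TfidfVectorizer

    X = TfidfVectorizer().fit_transform(docs)
    return {
        k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
        for k in k_values
    }
```

Plotting the returned inertia values against `k` gives the elbow curve from step 5; the `k` where the curve flattens is a reasonable candidate for `n_cluster` in step 6.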

Reference

[1] Python Sastrawi. URL : https://github.com/har07/PySastrawi
