Trump Tweets Topic Modelling

trump_wordcloud

Natural Language Processing (NLP) project.

Using Term Frequency-Inverse Document Frequency (TF-IDF) and non-negative matrix factorization (NMF), I performed topic modelling on Donald Trump's tweets from July 19th to December 16th, 2020. The dataset comes from Kaggle.

Here are the steps that I followed:

  • Step 1: Reading the data
  • Step 2: Exploring the data
  • Step 3: Cleaning the dataset
  • Step 4: Tokenization
  • Step 5: Stopwords removal
  • Step 6: Lemmatization
  • Step 7: Topic modelling
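
The preprocessing steps above (3 to 6) can be sketched with the standard library alone. This is an illustrative assumption, not the notebook's code: the stopword list is a tiny hypothetical sample, the example tweet is made up, and `lemmatize` is a crude plural-stripping placeholder where a real pipeline would likely use a proper lemmatizer from an NLP library.

```python
import re

# Tiny illustrative stopword list (assumption); a library list would be far longer.
STOPWORDS = {"the", "a", "an", "is", "are", "to", "of", "and", "in", "for", "rt"}

def clean(text):
    """Step 3: lowercase and strip URLs, mentions, hashtags, and non-letters."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"[@#]\w+", " ", text)       # drop mentions and hashtags
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters and whitespace only
    return text

def tokenize(text):
    """Step 4: split on whitespace."""
    return text.split()

def remove_stopwords(tokens):
    """Step 5: drop very common words that carry little topical meaning."""
    return [t for t in tokens if t not in STOPWORDS]

def lemmatize(token):
    """Step 6 (placeholder): strip a trailing plural 's'. Not a real lemmatizer."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token

def preprocess(tweet):
    tokens = remove_stopwords(tokenize(clean(tweet)))
    return [lemmatize(t) for t in tokens]

# Made-up example tweet, for illustration only.
tweet = "RT @user: The ballots are in! Votes counted for the states https://t.co/xyz #Election2020"
print(preprocess(tweet))  # ['ballot', 'vote', 'counted', 'state']
```

Note that the placeholder leaves "counted" untouched; a dictionary-based lemmatizer would reduce it to "count", which is one reason to prefer a real one before building the TF-IDF matrix.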

Please see the Jupyter notebook: Trump_Tweets_Topic_modelling.ipynb