This repo uses LDA (Latent Dirichlet Allocation) topic models to explore how health issues around the world evolve over time. The analysis is based on tweets from three major health channels (BBC, CNN, CBC) posted between 2013 and 2015. The workflow covers:
- Data Cleaning
- Data pre-processing
- Additional pre-processing tailored to our dataset
- Visualisation of topic-word distribution with pyLDAvis and wordcloud
- Visualisation of topics over time with seaborn
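
As a rough illustration of the cleaning and pre-processing steps listed above, the sketch below shows one way a raw tweet might be normalised before it is passed to an LDA model. It is not the code used in this repo: the function names `clean_tweet` and `tokenize`, the small `STOPWORDS` set, and the `min_len` threshold are all hypothetical and chosen only for this example.

```python
import re

# Hypothetical, deliberately tiny stopword list for illustration only;
# a real pipeline would likely use a fuller list.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "is", "are", "for", "with", "rt"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")   # links
MENTION_RE = re.compile(r"@\w+")                # @user mentions
NON_ALPHA_RE = re.compile(r"[^a-z\s]")          # digits, punctuation, symbols


def clean_tweet(text):
    """Lowercase a tweet and strip URLs, mentions, hashtag signs and non-letters."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")       # keep the hashtag word, drop the '#'
    text = NON_ALPHA_RE.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text, min_len=3):
    """Split a cleaned tweet into tokens, dropping stopwords and very short words."""
    return [
        token
        for token in clean_tweet(text).split()
        if len(token) >= min_len and token not in STOPWORDS
    ]


if __name__ == "__main__":
    sample = "RT @bbchealth: Ebola outbreak spreads in #WestAfrica http://bbc.in/abc123"
    print(tokenize(sample))
```

The resulting token lists are the kind of input a bag-of-words dictionary and LDA model would typically be built from. The dataset-specific step would then extend this with rules suited to the tweets at hand, for example removing channel names that appear in almost every tweet.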