Auto Generated Insights of 2019 HR Tech Conference Twitter

I scraped tweets tagged #HRTechConf and built a Latent Dirichlet Allocation (LDA) model to automatically detect and interpret the topics in those tweets. Here is my pipeline:

Data gathering – Twitter scraping
Data pre-processing
Word cloud generation
LDA model training
Topic visualization
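
The pre-processing and word-counting steps above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in HRTech2019_LDA.py: the `preprocess` function, the tiny stop-word set, and the sample tweets are all assumptions made up for this example (the project itself relies on NLTK for stop words, tokenization, and lemmatization).

```python
import re
from collections import Counter

# Tiny illustrative stop-word set (assumption); the project uses NLTK's stop words.
STOP_WORDS = {"a", "an", "and", "are", "at", "for", "in", "is", "of", "on", "the", "to", "with"}

def preprocess(tweet):
    """Lowercase a tweet, drop URLs and @mentions, and remove stop words."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"@\w+", " ", text)          # remove @mentions
    tokens = re.findall(r"[a-z]+", text)       # alphabetic tokens only; '#' and digits are dropped
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 1]

# Made-up sample tweets, not data from the project.
tweets = [
    "Great keynote on AI in recruiting at #HRTechConf https://example.com",
    "@someone AI and analytics are the future of HR #HRTechConf",
]

docs = [preprocess(t) for t in tweets]

# Word frequencies are the input a word cloud is drawn from.
word_counts = Counter(token for doc in docs for token in doc)
print(word_counts.most_common(3))
```

The resulting token lists are the kind of input a topic model expects, and the frequency table is what a word cloud visualizes.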

Install

This project requires Python 3.6+ and the following Python libraries installed:

TwitterScraper, a Python script to scrape for tweets
NLTK (Natural Language Toolkit), an NLP package for text processing, e.g. stop words, punctuation, tokenization, lemmatization, etc.
Gensim, “generate similar”, a popular NLP package for topic modeling
Latent Dirichlet Allocation (LDA), a generative, probabilistic model for topic clustering/modeling
pyLDAvis, an interactive LDA visualization package, designed to help interpret topics in a topic model that is trained on a corpus of text data
NumPy
Pandas
matplotlib
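
One way to check which of the libraries listed above are already available is a short script like the one below. It is only a sketch: the import names in `REQUIRED` are assumed to match the usual package names, and `missing_packages` is a hypothetical helper, not part of this project.

```python
import importlib.util

# Import names are assumptions based on the usual package names.
REQUIRED = ["twitterscraper", "nltk", "gensim", "pyLDAvis", "numpy", "pandas", "matplotlib"]

def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```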

Code

Code is provided in HRTech2019_LDA.py.

Topic Visualization

Interactive topic visualization

wangpengcn/Auto-Generated-Insights-of-2019-HR-Tech-Conference-Twitter
