LDA Analysis - Topic Modelling (via Apache Spark)

This is the third version of LDA Analysis.
Its main focus is extracting topic terms and determining the number of topics.
Note that LDA is an unsupervised model: it needs a large review dataset for the topic outputs to be sufficient and accurate.

You do not need to change anything in the code; just run it to produce the output.
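
For orientation, here is a minimal sketch of how topic terms can be extracted with Spark ML's LDA. It is not the code in this repository: the function names, the "text" column name, and the parameter values (k, maxIter, seed) are assumptions chosen for illustration, and it uses Spark's built-in English stop-word list.

```python
from typing import Dict, List, Sequence


def topics_to_terms(topic_term_indices: Dict[int, Sequence[int]],
                    vocabulary: Sequence[str]) -> Dict[int, List[str]]:
    """Map each topic's term indices back to the words in the vocabulary."""
    return {topic: [vocabulary[i] for i in indices]
            for topic, indices in topic_term_indices.items()}


def fit_lda_topics(reviews_df, k: int = 5, terms_per_topic: int = 10,
                   text_col: str = "text"):
    """Fit LDA on a DataFrame of reviews; return (topic terms, log perplexity).

    PySpark is imported here so the helper above stays usable without Spark.
    The column name and default values are illustrative assumptions.
    """
    from pyspark.ml.clustering import LDA
    from pyspark.ml.feature import CountVectorizer, RegexTokenizer, StopWordsRemover

    # Tokenize on non-word characters, drop English stop words, count terms.
    tokens = RegexTokenizer(inputCol=text_col, outputCol="tokens",
                            pattern="\\W+").transform(reviews_df)
    filtered = StopWordsRemover(inputCol="tokens",
                                outputCol="filtered").transform(tokens)
    cv_model = CountVectorizer(inputCol="filtered",
                               outputCol="features").fit(filtered)
    features = cv_model.transform(filtered)

    # k is the number of topics. Comparing log perplexity across several
    # values of k is one common (not the only) way to choose it.
    model = LDA(k=k, maxIter=20, seed=42, featuresCol="features").fit(features)

    # describeTopics returns term indices; translate them into actual words.
    rows = model.describeTopics(terms_per_topic).collect()
    indices = {row["topic"]: row["termIndices"] for row in rows}
    return topics_to_terms(indices, cv_model.vocabulary), model.logPerplexity(features)
```

Calling fit_lda_topics(df, k=8) on a DataFrame with a "text" column would return a dictionary of topic number to top terms plus a log-perplexity score, which can be compared across different k values when deciding how many topics to keep.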

Installing Requirements:

Install the packages listed in 'requirements.txt'. If you're using an IDE, it will usually prompt you to install them. Otherwise, simply run:

pip install -r requirements.txt

Expected prerequisites:

You should know how to run Apache Spark from a Python IDE. Make sure Apache Spark and the PySpark package (Python) are working properly before running the program.
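
A quick way to sanity-check the prerequisites is sketched below. It only verifies that the pyspark package can be found and that a java executable is on the PATH (Spark runs on the JVM); it does not prove that Spark itself starts correctly. The function name is hypothetical.

```python
import importlib.util
import shutil


def check_spark_prerequisites() -> dict:
    """Report whether PySpark is importable and a Java runtime is on PATH."""
    return {
        "pyspark_installed": importlib.util.find_spec("pyspark") is not None,
        "java_on_path": shutil.which("java") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_spark_prerequisites().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```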

Before execution:

Make sure the following folders exist in the project:

'review_info' folder (for outputs)
'topics' folder (for list of topics)
'data' folder (for the original raw data of reviews)

Then, run:

python report_output.py
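
The folder check described above can also be automated before launching the script. This is a small sketch, assuming the three folders sit directly under the directory you run from; the helper name is hypothetical and not part of the repository.

```python
from pathlib import Path
from typing import Iterable, List

# Folder names taken from the list in "Before execution".
REQUIRED_FOLDERS = ("review_info", "topics", "data")


def missing_folders(root: Path,
                    required: Iterable[str] = REQUIRED_FOLDERS) -> List[str]:
    """Return the names of required folders that do not exist under root."""
    return [name for name in required if not (root / name).is_dir()]
```

For example, missing_folders(Path.cwd()) returns an empty list when everything is in place, and otherwise lists the folders that still need to be created.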

Warnings:

You may see many warnings and possibly Spark error messages. You can ignore them. Note also that Spark may run more than once.
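
If the output is too noisy, one option is to lower Spark's log level once a session exists. The sketch below assumes you have access to the SparkSession object used by the program; the helper name is hypothetical. Messages printed while the JVM is starting, before the session is created, will still appear.

```python
def quiet_spark_logs(spark, level: str = "ERROR") -> None:
    """Lower the log verbosity of an existing SparkSession.

    `spark` is assumed to be a pyspark.sql.SparkSession; valid levels
    include "WARN", "ERROR", and "OFF".
    """
    spark.sparkContext.setLogLevel(level)
```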

Resources:

https://spark.apache.org/docs/latest/ml-clustering.html
More resources will be added here.

