
Text clustering in Spark with Scala using an LDA model on a TF-IDF matrix


borisfoko/Spark-Text-Clustering


Spark-Text-Clustering

This project demonstrates how to use LDA models in Scala on Spark to cluster texts into topics, using a TF-IDF matrix of the texts as input.
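The flow described above can be sketched as follows. This is a minimal illustration and not the repository's actual code: it assumes the DataFrame-based `spark.ml` API, and the object name `LdaTfIdfSketch`, the helper `clusterTopics`, the column names, and the sample sentences are all hypothetical.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.clustering.LDA
import org.apache.spark.ml.feature.{CountVectorizer, IDF, RegexTokenizer, StopWordsRemover}
import org.apache.spark.ml.linalg.Vector
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical sketch: names and sample data are assumptions, not taken from the repository.
object LdaTfIdfSketch {

  // Expects a DataFrame with a string column "text"; returns it with an added
  // "topicDistribution" column holding one weight per topic.
  def clusterTopics(docs: DataFrame, k: Int): DataFrame = {
    // Split text into lowercase tokens on non-word characters.
    val tokenizer = new RegexTokenizer()
      .setInputCol("text").setOutputCol("tokens").setPattern("\\W+")
    // Drop common English stop words.
    val remover = new StopWordsRemover()
      .setInputCol("tokens").setOutputCol("filtered")
    // Term frequencies per document.
    val tf = new CountVectorizer()
      .setInputCol("filtered").setOutputCol("tf")
    // Rescale term frequencies by inverse document frequency.
    val idf = new IDF()
      .setInputCol("tf").setOutputCol("features")
    // Fit LDA on the TF-IDF vectors.
    val lda = new LDA()
      .setK(k).setMaxIter(20).setSeed(42L).setFeaturesCol("features")

    val pipeline = new Pipeline().setStages(Array(tokenizer, remover, tf, idf, lda))
    pipeline.fit(docs).transform(docs)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("lda-tfidf-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val docs = Seq(
      (0L, "spark runs distributed jobs on a cluster"),
      (1L, "the cluster schedules spark jobs across nodes"),
      (2L, "the recipe needs flour sugar and butter"),
      (3L, "bake the butter and sugar mix with flour")
    ).toDF("id", "text")

    val result = clusterTopics(docs, k = 2)

    // Assign each document to its highest-weighted topic.
    result.select("id", "topicDistribution").collect().foreach { row =>
      val dist = row.getAs[Vector]("topicDistribution").toArray
      println(s"doc ${row.getLong(0)} -> topic ${dist.indexOf(dist.max)}")
    }

    spark.stop()
  }
}
```

Taking the highest-weighted topic per document is one simple way to turn the topic mixture into a hard cluster assignment. On a corpus this small the assignments are not guaranteed to be meaningful.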

Requirements

Java (jdk-13.0.1), Scala (scala-sdk-2.12.10), Spark (Spark-3.0.0), and sbt (sbt-1.3.10)
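A `build.sbt` matching these versions might look like the sketch below. The project name and the choice of Spark modules are assumptions; the repository's actual build file may differ.

```scala
// Hypothetical build.sbt; the project name and module list are assumptions.
name := "spark-text-clustering"

scalaVersion := "2.12.10"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "3.0.0",
  "org.apache.spark" %% "spark-sql"   % "3.0.0",
  "org.apache.spark" %% "spark-mllib" % "3.0.0"
)
```

The sbt version itself is usually pinned separately, in `project/build.properties`, with the line `sbt.version=1.3.10`.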