[SAC'25] HARIN: A Novel Metric for Hierarchical Topic Model Assessment
Our analysis is performed in the following steps:
- Dataset collection:
  - Our own: Korean tweets mentioning COVID-19 vaccines (AstraZeneca, Janssen, Novavax, Moderna, Pfizer)
  - Public:
- Preprocessing of the collected data:
  - Data cleansing
  - Sentence correction
  - Stopword removal
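
The cleansing and stopword-removal steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the regular expressions, the `STOPWORDS` set, and the function names are all assumptions made for the example. Sentence correction is not covered, and stopword removal for Korean text would normally run on the output of a morphological analyzer rather than on whitespace-separated tokens.

```python
import re

# Hypothetical stopword list for illustration only; the list used in
# this project is not given in the README.
STOPWORDS = {"rt", "the", "a", "is"}

URL_RE = re.compile(r"https?://\S+")   # links
MENTION_RE = re.compile(r"@\w+")       # @user mentions
NON_TEXT_RE = re.compile(r"[^\w\s]")   # punctuation, emoji, '#' and other symbols


def clean_tweet(text: str) -> str:
    """Strip URLs, mentions and symbols, then normalize whitespace and case."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_TEXT_RE.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip().lower()


def remove_stopwords(text: str, stopwords=STOPWORDS) -> list[str]:
    """Split on whitespace and drop tokens found in the stopword set."""
    return [tok for tok in text.split() if tok not in stopwords]


def preprocess(text: str) -> list[str]:
    """Cleansing followed by stopword removal."""
    return remove_stopwords(clean_tweet(text))
```

Because `\w` matches Unicode letters in Python 3, Hangul text passes through `clean_tweet` unchanged while hashtag markers and emoji are dropped.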
- HTM (hierarchical topic model) comparison and selection:
  - Models: BERTopic, CluHTM, hLDA, and HyHTM
  - Compute the HARIN (HierArchical haRmony INdex) score for each model
- Comparing HARIN with human judgment:
  - Conduct surveys to derive human scores and survey results (Questionaire.zip)
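
One common way to compare an automatic metric with human judgment is a rank correlation between the two sets of scores. The sketch below computes Spearman's rank correlation with the standard library only. The choice of Spearman correlation and the function names are assumptions for illustration; the README does not state which agreement measure the study uses.

```python
import math


def rank(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(metric_scores, human_scores):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    return pearson(rank(metric_scores), rank(human_scores))
```

To use it, pass one metric score and one mean human score per evaluated model or topic hierarchy, in the same order; a value near 1 indicates that the metric orders the items the same way the annotators do.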