
Topic Coherence Measurements #1489

Closed

schoobani opened this issue Aug 28, 2023 · 2 comments


schoobani commented Aug 28, 2023

Evaluating the quality of discovered topics is challenging. This paper and this one introduce a set of useful measures of topic coherence and diversity, such as c_v coherence:

[Screenshot: c_v coherence formula]

Given topic1={car,driver,wheel,speed}

[Screenshot: c_v coherence computed for topic1]

Do you have any plans to add these kinds of measurements to the library? If not, what would you suggest as the best measure of topic coherence?
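For illustration, here is a rough, self-contained sketch of pairwise NPMI coherence over a topic's top words. Note this is a simplification, not the full c_v pipeline (which adds boolean sliding windows and cosine similarity over NPMI vectors); the corpus below is made up:

```python
import math
from itertools import combinations

def npmi_coherence(topic_words, documents, eps=1e-12):
    """Average pairwise NPMI over the top words of a topic.

    Word (co-)occurrence probabilities are estimated from
    document-level presence counts, a common simplification.
    """
    n_docs = len(documents)
    doc_sets = [set(doc) for doc in documents]

    def p(*words):
        # Fraction of documents containing all the given words.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1, p2, p12 = p(w1), p(w2), p(w1, w2)
        if p12 == 0:
            scores.append(-1.0)  # words never co-occur: minimum NPMI
            continue
        pmi = math.log(p12 / (p1 * p2 + eps))
        scores.append(pmi / (-math.log(p12 + eps)))
    return sum(scores) / len(scores)

# Toy tokenized corpus (hypothetical).
docs = [
    ["car", "driver", "wheel", "speed"],
    ["car", "speed", "road"],
    ["driver", "wheel", "car"],
    ["banana", "apple", "fruit"],
]
topic1 = ["car", "driver", "wheel", "speed"]
print(npmi_coherence(topic1, docs))
```

A coherent topic, whose words tend to co-occur in documents, scores above 0; a random bag of words drifts toward 0 or below.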

@MaartenGr
Owner

#90 is actually a very nice thread about this specific topic that is a worthwhile read.

To save you a long read: there are currently no plans to add these kinds of measurements to the library, because measuring coherence is exceedingly difficult and sometimes even flawed. Coherence carries a degree of subjectivity, which makes it hard to use such an evaluation metric as ground truth. As a result, whatever evaluation metric were implemented in BERTopic, some users would likely optimize for it simply because it is built in. The issue is that the resulting performance cannot be guaranteed, due to the somewhat subjective nature of these metrics.

Instead, each use case requires a different set of evaluation metrics. Although we like to evaluate topic modeling techniques with coherence, it captures only a small piece of what a topic model actually does. What about the quality of the document-topic assignments? The diversity of topics? What if we have labels instead of keywords? There simply is no "best" evaluation metric.
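As one example of how simple some of these complementary metrics are, topic diversity can be defined as the fraction of unique words across the top-k words of all topics. A minimal sketch, with made-up topics:

```python
def topic_diversity(topics, topk=10):
    """Fraction of unique words across the top-k words of all topics.

    1.0 means no word is shared between topics; values near 0 mean
    the topics are highly redundant.
    """
    top_words = [w for topic in topics for w in topic[:topk]]
    return len(set(top_words)) / len(top_words)

# Hypothetical topics (top words only). "car" and "speed" are
# shared between topics 1 and 3, lowering diversity.
topics = [
    ["car", "driver", "wheel", "speed"],
    ["apple", "banana", "fruit", "juice"],
    ["car", "road", "traffic", "speed"],
]
print(topic_diversity(topics, topk=4))  # 10 unique words / 12 total
```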

There is a nice package called OCTIS, which has a number of interesting evaluation metrics implemented. You can use it to evaluate BERTopic on your use case of interest.

@schoobani
Author

Thanks for the reply. gensim also has an implementation of them:
https://radimrehurek.com/gensim/models/coherencemodel.html

Obviously, adding them to BERTopic would be redundant.
