LDA run with Python and LogLikelihood #72

fvalle1 · 2022-05-13T10:10:12Z

Hello,

I am trying to reproduce your topic modeling results. Specifically, I would like to use a Python (sklearn) implementation of LDA to fit your cisTopicObject@binary.count.matrix.

I am running into problems: after loading the matrix and fitting the model, I cannot reproduce the log-likelihood versus number of topics plot. With the Python implementation, the log-likelihood decreases monotonically with the number of topics, the opposite of yours, which increases and then plateaus.

Do you have any guess as to what I am missing?

cbravo93 · 2022-05-13T10:39:27Z

Hi!

Which function are you using exactly? For parameter optimization we use collapsed Gibbs sampling, which I think differs from the default sklearn function (they use VEM, if I remember correctly?). If you read our paper (https://www.nature.com/articles/s41592-019-0367-1#Sec20), in Fig. S1 we compared the effect of several parameter estimation methods and found that only collapsed Gibbs sampling works well with this type of data.
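To make the distinction concrete, here is a minimal sketch of a collapsed Gibbs sampler for LDA in plain Python. It is a toy illustration, not the cisTopic implementation: the function name `collapsed_gibbs_lda`, the hyperparameter defaults (`alpha`, `beta`, `n_iter`) and the toy documents are all assumptions chosen for the example. The returned log-likelihood is the standard Dirichlet-multinomial log p(w | z), which may not be the exact quantity plotted in the paper.

```python
import math
import random


def collapsed_gibbs_lda(docs, n_topics, vocab_size, alpha=0.1, beta=0.01,
                        n_iter=50, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative, hypothetical helper).

    docs: list of documents, each a list of integer word ids in [0, vocab_size).
    Returns (doc_topic_counts, topic_word_counts, log_likelihood).
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Assign every token to a random topic and fill the count tables.
    assignments = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # Unnormalised conditional p(z = t | all other assignments).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]

                # Resample the topic and put the token back.
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # log p(w | z) with the topic-word distributions integrated out.
    ll = n_topics * (math.lgamma(vocab_size * beta)
                     - vocab_size * math.lgamma(beta))
    for k in range(n_topics):
        ll += sum(math.lgamma(c + beta) for c in topic_word[k])
        ll -= math.lgamma(topic_total[k] + vocab_size * beta)
    return doc_topic, topic_word, ll


if __name__ == "__main__":
    # Toy corpus (made up): each document is a list of word ids.
    toy_docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 2, 3]]
    for n_topics in (2, 3, 4):
        _, _, ll = collapsed_gibbs_lda(toy_docs, n_topics, vocab_size=4)
        print(n_topics, ll)
```

By contrast, sklearn's `LatentDirichletAllocation` fits the model with variational Bayes, and its `score` method returns an approximate log-likelihood (a variational bound), so curves from the two approaches are not directly comparable.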

Cheers!

C

cbravo93 closed this as completed May 13, 2022
