LDA run with Python and LogLikelihood #72

Closed
fvalle1 opened this issue May 13, 2022 · 1 comment

Comments

fvalle1 commented May 13, 2022

Hello,

I am trying to reproduce your topic modeling results. Specifically, I would like to fit your cisTopicObject@binary.count.matrix with a Python (sklearn) implementation of LDA.

I am running into problems: after loading the matrix and fitting the model, I cannot reproduce the log-likelihood versus number of topics plot. With the Python implementation, the log-likelihood decreases monotonically as the number of topics grows, the opposite of yours, which increases and then plateaus.

Do you have any idea what I might be missing?
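
For reference, a minimal sketch of the kind of sweep described above, not the exact code that was run. The random matrix is a hypothetical stand-in for the exported binary.count.matrix, and its orientation (regions as rows, cells as columns) is an assumption; sklearn expects documents as rows, so the sketch transposes it. Note that `score` returns an approximate log-likelihood (a variational bound), which is not necessarily on the same scale as the log-likelihood reported by a collapsed Gibbs sampler.

```python
import numpy as np
from scipy import sparse
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical stand-in for cisTopicObject@binary.count.matrix exported from R.
# Assumption: rows are regions and columns are cells.
rng = np.random.default_rng(0)
regions_by_cells = sparse.csr_matrix((rng.random((200, 40)) < 0.1).astype(np.int64))

# sklearn expects documents (cells) as rows and words (regions) as columns.
cells_by_regions = regions_by_cells.T.tocsr()


def sweep_topics(X, topic_numbers, seed=0):
    """Fit one variational LDA model per topic number and collect its score."""
    scores = {}
    for k in topic_numbers:
        model = LatentDirichletAllocation(
            n_components=k,
            learning_method="batch",
            max_iter=20,
            random_state=seed,
        )
        model.fit(X)
        # score() is an approximate log-likelihood (variational bound).
        scores[k] = model.score(X)
    return scores


scores = sweep_topics(cells_by_regions, [2, 5, 10])
for k, s in scores.items():
    print(k, s)
```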

@cbravo93
Member

Hi!

Which function are you using exactly? For parameter optimization we use collapsed Gibbs sampling, which I think is different from the default sklearn function (they use VEM, if I remember correctly?). In our paper (https://www.nature.com/articles/s41592-019-0367-1#Sec20), Fig. S1 compares the effect of several parameter estimation methods, and we found that only collapsed Gibbs sampling works well with this type of data.
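
To make the difference concrete, here is a minimal NumPy sketch of collapsed Gibbs sampling for LDA on binary data. It is a textbook-style illustration, not the cisTopic implementation; the function names, the toy matrix, and the hyperparameter values (`alpha`, `beta`, `n_iter`) are all assumptions chosen for the example. The log-likelihood shown is a plug-in estimate from smoothed counts, which is one of several possible definitions and may not match the quantity in the plot.

```python
import numpy as np


def collapsed_gibbs_lda(docs, n_words, n_topics, alpha=0.1, beta=0.1, n_iter=30, seed=0):
    """docs: list of integer arrays holding the word ids present in each document."""
    rng = np.random.default_rng(seed)
    n_dk = np.zeros((len(docs), n_topics))  # document-topic counts
    n_kw = np.zeros((n_topics, n_words))    # topic-word counts
    n_k = np.zeros(n_topics)                # tokens assigned to each topic
    z = []                                  # topic assignment of every token
    for d, words in enumerate(docs):
        z_d = rng.integers(n_topics, size=len(words))
        z.append(z_d)
        for w, k in zip(words, z_d):
            n_dk[d, k] += 1
            n_kw[k, w] += 1
            n_k[k] += 1
    for _ in range(n_iter):
        for d, words in enumerate(docs):
            for i, w in enumerate(words):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                n_dk[d, k] -= 1
                n_kw[k, w] -= 1
                n_k[k] -= 1
                # Conditional over topics with theta and phi integrated out.
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + n_words * beta)
                k = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = k
                n_dk[d, k] += 1
                n_kw[k, w] += 1
                n_k[k] += 1
    return n_dk, n_kw


def plug_in_log_likelihood(docs, n_dk, n_kw, alpha=0.1, beta=0.1):
    """Token log-likelihood under smoothed point estimates of theta and phi."""
    theta = (n_dk + alpha) / (n_dk + alpha).sum(axis=1, keepdims=True)
    phi = (n_kw + beta) / (n_kw + beta).sum(axis=1, keepdims=True)
    return sum(np.log(theta[d] @ phi[:, words]).sum() for d, words in enumerate(docs))


# Toy binary matrix (cells x regions), a hypothetical stand-in for real data.
rng = np.random.default_rng(1)
X = rng.random((30, 60)) < 0.2
docs = [np.flatnonzero(row) for row in X]

n_dk, n_kw = collapsed_gibbs_lda(docs, n_words=X.shape[1], n_topics=5)
ll = plug_in_log_likelihood(docs, n_dk, n_kw)
print(ll)
```

Running the same two functions over a range of `n_topics` values gives a curve that can be compared against the variational one, keeping in mind that the two quantities are defined differently.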

Cheers!

C
