How to transform the data from mallet to one that can be used by this tool? #10

Closed
rrgirish opened this issue Nov 6, 2015 · 2 comments

rrgirish commented Nov 6, 2015

We were able to get lucene-lda to compile by removing the lucene-3.0 JAR from the lib directory (and leaving the 3.5 JAR).

However, when we tried to run the indexDirectory command on our documents, we saw from the README and the source code that lucene-lda doesn't run MALLET itself.

So we ran MALLET on the data first and obtained its output. However, lucene-lda doesn't recognize the MALLET output file when we run the queryWithLDA command. Does the input need to be in a specific data format?

stepthom (Owner) commented Nov 9, 2015

lucene-lda doesn't read MALLET files directly at the moment. MALLET output files need to be preprocessed (e.g., with a simple script) to create the four input files described in the README, under the bullet "You have already executed LDA..."
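
For anyone landing here later, a rough sketch of the first step of such a script, in Python. It only parses MALLET's document-topic output into a list of document names and a document-by-topic matrix; it does not write the four README files, whose exact layout should be taken from the README itself. Assumptions: the file came from `--output-doc-topics` and uses the dense layout (document index, document name, then one proportion per topic, whitespace-separated); older MALLET versions emit sorted topic/proportion pairs instead and would need different handling. The function name `parse_doc_topics` is made up for illustration.

```python
from typing import Iterable, List, Tuple


def parse_doc_topics(lines: Iterable[str]) -> Tuple[List[str], List[List[float]]]:
    """Parse MALLET doc-topics lines into (document names, topic matrix).

    Assumes the dense layout: <doc index> <doc name> <p_0> <p_1> ... <p_K-1>.
    Document names containing whitespace are not handled.
    """
    names: List[str] = []
    theta: List[List[float]] = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and the optional "#doc name topic proportion ..." header.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        names.append(fields[1])                      # fields[0] is the doc index
        theta.append([float(x) for x in fields[2:]])  # one proportion per topic
    return names, theta


if __name__ == "__main__":
    sample = [
        "#doc name topic proportion ...",
        "0\tfile:/docs/a.txt\t0.25\t0.75",
        "1\tfile:/docs/b.txt\t0.5\t0.5",
    ]
    doc_names, doc_topics = parse_doc_topics(sample)
    for name, row in zip(doc_names, doc_topics):
        print(name, row)
```

From there, writing out the document list and the matrix in whatever layout the README specifies is a few more lines per file.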

rrgirish (Author) commented

Yup.. figured that out. Thanks!
