Inconsistent Handling of blank lines in input corpuses #11

Open
DavidSorge opened this issue Aug 24, 2020 · 0 comments

I ran into a (minor) issue using this tool to work with newspaper archive data.

While constructing my corpus, I removed every word that had no corresponding word vector. For some documents this removed all of the words, which left empty lines in my input corpus.

Training the DMM model on this corpus worked without a problem, which suggests that the training code already has a working mechanism for handling empty lines.

However, when I attempted to run DMMinf using the resulting model, I received a fatal error:

Error: Index: 0, Size: 0
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
        at java.util.ArrayList.rangeCheck(ArrayList.java:657)
        at java.util.ArrayList.get(ArrayList.java:433)
        at models.LFDMM_Inf.sampleSingleInitialIteration(Unknown Source)
        at models.LFDMM_Inf.inference(Unknown Source)
        at LFTM.main(Unknown Source)

The obvious solution to my problem is to fix my corpus-producing code, and make sure I don't feed empty lines into the DMMinf model.
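
For anyone hitting the same error, the workaround can be sketched roughly as follows. This is only a minimal illustration, assuming a corpus with one whitespace-tokenized document per line; `filter_corpus` and `vocab` are hypothetical names for this example, not part of the tool. It drops out-of-vocabulary words, skips any document that ends up empty, and records the original position of each kept document so that inference output can still be matched back to the source data.

```python
def filter_corpus(lines, vocab):
    """Keep only in-vocabulary tokens and drop documents that become empty.

    Returns (kept_docs, kept_indices), where kept_indices[i] is the 0-based
    position of kept_docs[i] in the original list of lines.
    """
    kept_docs, kept_indices = [], []
    for i, line in enumerate(lines):
        # Hypothetical vocabulary check: keep words that have a word vector.
        tokens = [tok for tok in line.split() if tok in vocab]
        if tokens:  # skip documents with no remaining words
            kept_docs.append(" ".join(tokens))
            kept_indices.append(i)
    return kept_docs, kept_indices


# Toy data for illustration only.
vocab = {"court", "strike", "union"}
raw = ["union strike today", "zzz qqq", "", "court ruling"]
docs, idx = filter_corpus(raw, vocab)
```

Writing `docs` out one per line gives a corpus with no blank lines, and `idx` tells you which original documents were dropped.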

I am posting this here in case a future user runs into the same problem, or in case you would like to fix a minor bug in your otherwise excellent tool.
