java.lang.NullPointerException #2
Comments
Thank you for using our tool. Can you please share more details? I don't think it's a Java 8 issue. I would like to see the full execution command line. Did you include the -i "decompressed newsela folder path" argument?
Thank you for your comments and analysis. I found a bug in how the code handles English abbreviations: it occurred when a sentence containing an abbreviation was correctly processed by the sentence splitter and was the last sentence of the text. I just uploaded the fix.
I hope it works!
Best,
Marc
From: Chao6 <notifications@github.com>
Sent: Friday, September 28, 2018 23:37
To: neosyon/SimpTextAlign <SimpTextAlign@noreply.github.com>
Cc: Marc F. S. <neosyon@gmail.com>; Comment <comment@noreply.github.com>
Subject: Re: [neosyon/SimpTextAlign] java.lang.NullPointerException (#2)
Hi, thanks for the reply!
Would you please try the latest version of the ComputeSimilarityBetweenTexts class to compute the similarity of this line, with any similarity measure?
A1_Grand_Prix Nelson Piquet , Jr. won the inaugural race of the series for A1 Team Brazil . The Curitiba , Brazil in January 2006 was canceled . 0
This line follows the same format as Hwang's Standard Wikipedia to Simple Wikipedia alignments dataset. The elements are separated by '\t', like this:
A1_Grand_Prix\tNelson Piquet , Jr. won the inaugural race of the series for A1 Team Brazil .\tThe Curitiba , Brazil in January 2006 was canceled .\t0
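For reference, a line in this tab-separated format can be split into its four elements as sketched below. This is only an illustration: the class name AlignmentLineSketch and the method parseFields are hypothetical and are not part of SimpTextAlign.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of SimpTextAlign.
class AlignmentLineSketch {

    // Splits one dataset line on tab characters.
    // The limit -1 keeps trailing empty fields instead of dropping them.
    static List<String> parseFields(String line) {
        return Arrays.asList(line.split("\t", -1));
    }

    public static void main(String[] args) {
        String line = "A1_Grand_Prix\t"
                + "Nelson Piquet , Jr. won the inaugural race of the series for A1 Team Brazil .\t"
                + "The Curitiba , Brazil in January 2006 was canceled .\t"
                + "0";
        List<String> fields = parseFields(line);
        // Print each element with its position in the line.
        for (int i = 0; i < fields.size(); i++) {
            System.out.println(i + ": " + fields.get(i));
        }
    }
}
```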
The code reports a NullPointerException. The problem stems from the sentence splitter in the getSubtexts function in TextProcessingUtils. I understand that adjusting the pointer avoids wrong sentence splits after abbreviations such as Dr. or Prof., as suggested by this article <https://stackoverflow.com/questions/17159513/split-paragraph-into-sentences-with-titles-and-numbers?answertab=oldest#tab-top> .
But in this case, if the only sentence contains an abbreviation, that sentence is filtered out, and the getSubtexts and getCleanText functions in the TextProcessingUtils class return an empty list. Then, in the getAlignmentsUsingClosestCosSim function in the VectorUtils class, closestIndex = -1, so cleanSubtexts1.get(closestIndex).getText() and sims[closestIndex][i] trigger the NullPointerException.
When calculating sentence-pair similarity, the input is already split into sentences. So, to keep the code running, can I just replace if (! hasAbbreviation(sentence)) with if (true) in the getSubtexts function?
I think this change will not affect the correctness of the similarity score. Is this correct?
Thanks!
BTW, the second-to-last main version (commit 38ab50d), which uses the Stanford NLP DocumentPreprocessor, doesn't have this problem.
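The failure mode described in this message (an empty subtext list leading to closestIndex = -1, which is then used as an index) can also be guarded against on the caller side. The following is a minimal sketch of such a guard, not the project's actual code: the class ClosestMatchSketch, its methods, and the shape of the sims matrix (rows for subtexts of the first text, columns for subtexts of the second) are assumptions made for illustration.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; names and signatures are assumptions, not SimpTextAlign's API.
class ClosestMatchSketch {

    // Returns the row with the highest similarity in the given column,
    // or -1 when the matrix has no rows (for example, when every sentence was filtered out).
    static int closestIndex(double[][] sims, int col) {
        int best = -1;
        double bestSim = Double.NEGATIVE_INFINITY;
        for (int row = 0; row < sims.length; row++) {
            if (sims[row][col] > bestSim) {
                bestSim = sims[row][col];
                best = row;
            }
        }
        return best;
    }

    // Returns the closest subtext, or an empty Optional instead of indexing with -1.
    static Optional<String> closestText(List<String> subtexts1, double[][] sims, int col) {
        if (subtexts1.isEmpty()) {
            return Optional.empty();
        }
        int idx = closestIndex(sims, col);
        return idx < 0 ? Optional.empty() : Optional.of(subtexts1.get(idx));
    }
}
```

With a guard like this, a text whose sentences were all filtered out yields no alignment rather than an exception. Whether skipping such a pair is acceptable depends on how the similarity scores are used afterwards.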
Hi, thanks for your useful tool. However, I get the same error. Calculating IDF... My java version is Can you please guide me on how to resolve this? Thank you in advance.
Hi, thanks for your useful tool!
However, when I run the tool, I hit a bug with the following error log:
My Java version is 8. Would you please take a look at this issue?
Thanks in advance!