DOMSource cannot be processed #160

Closed
bgyori opened this issue Mar 5, 2018 · 7 comments · Fixed by #162

Comments

@bgyori
Contributor

bgyori commented Mar 5, 2018

On certain strings, I am seeing a large number of these errors

Error 
  DOMSource cannot be processed: check that saxon8-dom.jar is on the classpath

when reading via the Python interface, which uses a fat JAR of Eidos and its dependencies. These errors flood the screen but don't seem to have a serious effect: the reading completes and returns results.

I found that one example of a string that triggers this is "NOV". It looks like lowercase "nov" does too. So then I thought maybe it has to do with month names, and indeed "dec", "jan", "feb", etc. all result in the same error. Similarly, "november", "december", etc. result in the same.

So given this info, does anybody know what this could be?

@bgyori
Contributor Author

bgyori commented Mar 5, 2018

The same error appears whenever a number that looks like a date is encountered, e.g. "2006".

@BeckySharp
Contributor

BeckySharp commented Mar 5, 2018

this is not in eidos (I think).. maybe in CoreNLP?

Does this help inform the convo?
https://stackoverflow.com/questions/15438011/domsource-cannot-be-processed-check-that-saxon9-dom-jar-is-on-the-classpath

@kwalcock ?? ideas?

@BeckySharp
Contributor

I mean... if our jar has a problem or we're mis-using CoreNLP or java, then it's our problem...

@kwalcock
Member

kwalcock commented Mar 6, 2018

Notes: processors-main has a dependency on
"com.io7m.xom" % "xom" % "1.2.10"
The page for XOM, https://xom.nu/, says something like:
"XOM is not complete unto itself. It depends on an underlying SAX parser to read documents and feed the data into a tree structure. While theoretically any SAX2 compliant parser should work..."
I wonder whether the transitive dependencies it needs are not being handled automatically.
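
To see which XML implementations the JVM actually picks up from the classpath, a small check along these lines could help. This is only a sketch: the object name `XmlImplCheck` and the helper `describe` are made up for illustration, and it uses nothing beyond the standard JAXP factories in the JDK. Running it with the fat JAR on the classpath should show whether the parser and transformer come from the JDK or from a bundled library.

```scala
import javax.xml.parsers.{DocumentBuilderFactory, SAXParserFactory}
import javax.xml.transform.TransformerFactory

// Hypothetical diagnostic, not part of Eidos or processors.
object XmlImplCheck {
  // Returns the concrete class name plus the location it was loaded from.
  // Classes loaded by the JVM itself usually report no code source.
  def describe(obj: AnyRef): String = {
    val cls = obj.getClass
    val source = Option(cls.getProtectionDomain)
      .flatMap(pd => Option(pd.getCodeSource))
      .flatMap(cs => Option(cs.getLocation))
      .map(_.toString)
      .getOrElse("no code source, probably JDK built-in")
    s"${cls.getName} (from $source)"
  }

  def main(args: Array[String]): Unit = {
    // Each newInstance() call resolves the implementation the same way
    // library code on this classpath would.
    println("SAXParserFactory:       " + describe(SAXParserFactory.newInstance()))
    println("DocumentBuilderFactory: " + describe(DocumentBuilderFactory.newInstance()))
    println("TransformerFactory:     " + describe(TransformerFactory.newInstance()))
  }
}
```

If the transformer line names a class from a third-party JAR rather than the JDK, that JAR would be a reasonable place to start looking for the source of the message.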

@kwalcock
Member

kwalcock commented Mar 6, 2018

Is there a call stack in any of that output?

@bgyori
Contributor Author

bgyori commented Mar 6, 2018

No call stack other than the error I pasted above. PR #162 fixes the issue, though.

@kwalcock
Member

kwalcock commented Mar 6, 2018

The readme file says

Java 1.3 and earlier do not have a built-in XML parser so in these environments
you'll also need to install XOM's supporting libraries.
These include xalan.jar, xercesImpl.jar, and xml-apis.jar,
and are found in the lib directory. The versions shipped with XOM
are quite a bit faster and less buggy than the ones bundled with the JDK,
so you may well want to use them even in Java 1.4 and later. For example,

$ java -classpath xom-samples.jar:xom-1.2.10.jar:lib/xml-apis.jar:lib/xercesImpl.jar:lib/xalan.jar nu.xom.samples.PrettyPrinter filename.xml

I wonder if including

xml-apis.jar
xercesImpl.jar
xalan.jar

will fix it. I will try to force the error from Scala and verify, using "november" as the test string.
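
For reference, pulling those three libraries into an sbt build might look like the fragment below. It is a sketch of the idea being floated here, not a confirmed fix and not necessarily what PR #162 does. The group and artifact names are the usual Maven Central coordinates; the version numbers are assumptions and should be checked against whatever the XOM release actually ships with.

```scala
// build.sbt fragment (illustrative; versions are assumptions)
libraryDependencies ++= Seq(
  "xml-apis" % "xml-apis"   % "1.4.01", // XML API interfaces
  "xerces"   % "xercesImpl" % "2.11.0", // Xerces SAX/DOM parser
  "xalan"    % "xalan"      % "2.7.2"   // Xalan XSLT processor
)
```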
