When loading the French hyphenation file, the parser fails with this stack trace:
Caused by: java.lang.ArrayIndexOutOfBoundsException: 339
at net.davidashen.text.Hyphenator$Scanner.read(Hyphenator.java:448) ~[ddw-0.4.7-SNAPSHOT.jar:?]
at net.davidashen.text.Hyphenator$Scanner.cc2pat(Hyphenator.java:477) ~[ddw-0.4.7-SNAPSHOT.jar:?]
at net.davidashen.text.Hyphenator$Scanner.getSym(Hyphenator.java:386) ~[ddw-0.4.7-SNAPSHOT.jar:?]
at net.davidashen.text.Hyphenator.loadTable(Hyphenator.java:56) ~[ddw-0.4.7-SNAPSHOT.jar:?]
at de.tudarmstadt.ukp.dariah.annotator.HyphenationAnnotator.initHyphenator(HyphenationAnnotator.java:140) ~[ddw-0.4.7-SNAPSHOT.jar:?]
at de.tudarmstadt.ukp.dariah.annotator.HyphenationAnnotator.process(HyphenationAnnotator.java:152) ~[ddw-0.4.7-SNAPSHOT.jar:?]
at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48) ~[ddw-0.4.7-SNAPSHOT.jar:?]
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:385) ~[ddw-0.4.7-SNAPSHOT.jar:?]
... 8 more
The relevant line fails with cc == 339 == 0x0153 (œ)
Certain hyphenation tables use encoding other than ISO-8859-1. To facilitate translation from that particular encoding to UCS, a list of codes and their unicode values can be passed to the hyphenator. See ruhyphal.tex, koicodes.txt for an example of a KOI8-R-encoded hyphenation table and a list of codes. [TeXHyph-J]
HyphenationAnnotator initializes this to a 256-entry 1:1 table, but we ship UTF-8 encoded files, so any character code above 255 (such as œ, 339) indexes past the end of the table.
We should probably just ignore that table by default.
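To illustrate the failure mode, here is a minimal sketch. It assumes the code table is a plain `int` array indexed by character code. The class and method names (`CodeTableSketch`, `identityTable`, `translateUnchecked`, `translate`) are hypothetical and are not taken from the TeXHyph-J source.

```java
// Hypothetical illustration only; not the actual Hyphenator code.
class CodeTableSketch {

    // Builds a 1:1 table of the given size: entry i maps to i.
    static int[] identityTable(int size) {
        int[] table = new int[size];
        for (int i = 0; i < size; i++) {
            table[i] = i;
        }
        return table;
    }

    // Unchecked lookup: throws ArrayIndexOutOfBoundsException
    // when cc is outside the table.
    static int translateUnchecked(int[] table, int cc) {
        return table[cc];
    }

    // Guarded lookup: with no table, or a code outside the table,
    // the character code is passed through unchanged.
    static int translate(int[] table, int cc) {
        if (table == null || cc < 0 || cc >= table.length) {
            return cc;
        }
        return table[cc];
    }

    public static void main(String[] args) {
        int[] table = identityTable(256);
        int cc = 0x0153; // œ, decimal 339

        try {
            translateUnchecked(table, cc);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unchecked lookup failed for " + cc);
        }

        System.out.println("guarded lookup, with table: " + translate(table, cc));
        System.out.println("guarded lookup, no table:   " + translate(null, cc));
    }
}
```

Passing no table at all (the `null` case here) corresponds to ignoring the table by default. Whether the real hyphenator accepts a missing table is an assumption that would need checking against its API.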