java.lang.ArrayIndexOutOfBoundsException #17

Open
Ramesh-X opened this issue Nov 5, 2016 · 8 comments

@Ramesh-X

Ramesh-X commented Nov 5, 2016

I'm using JAMR to parse a given text with the following command.

scripts/PARSE.sh < ../text.in > ../text.out 2> output_file.err

The model that I was trying to use was LDC2014T12. But I get the following error.

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
	at edu.cmu.lti.nlp.amr.AMRParser$$anonfun$main$3.apply(AMRParser.scala:307)
	at edu.cmu.lti.nlp.amr.AMRParser$$anonfun$main$3.apply(AMRParser.scala:192)
	at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
	at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
	at edu.cmu.lti.nlp.amr.AMRParser$.main(AMRParser.scala:192)
	at edu.cmu.lti.nlp.amr.AMRParser.main(AMRParser.scala)

I tried the other provided models, but the same error occurred.
I also tried scripts/EVAL.sh, and it gave the same error.
Any help?

Thanks..

@bheinzerling

The problem is that AMRParser tries to read a tokenization file that doesn't exist. It seems that instead of raising an exception this results in an empty array. This happens in line 169 of AMRParser.scala:

val tokenized = fromFile(options('tokenized).asInstanceOf[String]).getLines/*.map(x => x)*/.toArray

Trying to access an element of this empty array in line 197 causes an exception which gets handled, but during handling there is another attempted access in line 307, which causes the ArrayIndexOutOfBoundsException.

As a simple workaround in case your input text is already whitespace tokenized, you can replace line 169 with this line, run ./compile again, and everything should work:

val tokenized = input

Alternatively, you could try to run the tokenize script manually and set the --tok environment variable in config.sh
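Before patching anything, it can help to confirm the diagnosis by checking that the tokenized file exists, is non-empty, and has as many lines as the input. The sketch below is only an illustration. The function name `check_tokenized` and the file names are hypothetical, and where JAMR writes its tokenized file depends on your setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare an input file with its tokenized counterpart.
# Usage: check_tokenized INPUT TOKENIZED
check_tokenized() {
    local input="$1" tok="$2"
    # -s is true only if the file exists and is larger than zero bytes.
    if [ ! -s "$tok" ]; then
        echo "tokenized file missing or empty: $tok" >&2
        return 1
    fi
    local n_in n_tok
    # tr strips the padding that some wc implementations add.
    n_in=$(wc -l < "$input" | tr -d ' ')
    n_tok=$(wc -l < "$tok" | tr -d ' ')
    if [ "$n_in" -ne "$n_tok" ]; then
        echo "line count mismatch: input=$n_in tokenized=$n_tok" >&2
        return 1
    fi
    echo "ok: $n_tok lines"
}

# Example (file names are placeholders):
# check_tokenized ../text.in ../text.in.tok
```

If the check fails, the tokenizer step is the thing to debug, not the parser.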

@ritwikmishra

ritwikmishra commented Dec 26, 2017

I followed what @bheinzerling suggested and it worked for parsing. But when I run
scripts/ALIGN.sh < output_file2 > aligned_output_file
(where output_file2 is the output of the parsing step), I get the same error:

 ### Tokenizing ###
panic: swash_fetch got swatch of unexpected bit width, slen=1024, needents=64 at /home/ritwik/JAMR/jamr/tools/cdec/corpus/support/quote-norm.pl line 149, <STDIN> line 1.
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
	at edu.cmu.lti.nlp.amr.CorpusTool$$anonfun$main$1.apply(CorpusTool.scala:48)
	at edu.cmu.lti.nlp.amr.CorpusTool$$anonfun$main$1.apply(CorpusTool.scala:43)
	at scala.collection.Iterator$class.foreach(Iterator.scala:727)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
	at edu.cmu.lti.nlp.amr.CorpusTool$.main(CorpusTool.scala:43)
	at edu.cmu.lti.nlp.amr.CorpusTool.main(CorpusTool.scala)

I went to line 40 of CorpusTool.scala and commented it out, just as @bheinzerling suggested for AMRParser.scala. I added the following lines instead:

val input = stdin.getLines.toArray
val tokenized = input

I recompiled. Now ALIGN.sh runs without any exception and prints this:

### Tokenizing ###
panic: swash_fetch got swatch of unexpected bit width, slen=1024, needents=64 at /home/ritwik/JAMR/jamr/tools/cdec/corpus/support/quote-norm.pl line 149, <STDIN> line 1.
 ### Running aligner ###

but it produces no output: the file aligned_output_file is empty.

What can be done? (I am using only the pre-trained models-2016.09.18.tgz.)
Thanks

@ConstantineLignos

@ritwikmishra I was experiencing a similar problem and the solution in #16 solved it for me. Just comment out jamr/tools/cdec/corpus/support/quote-norm.pl line 149 to work around the crash, which appears to be a Perl bug, similar to https://rt.perl.org/Public/Bug/Display.html?id=124109.
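A cautious way to apply that workaround is to print the line first and keep a backup. This is a sketch only. The helper name `comment_out_line` is hypothetical, line 149 is simply the number reported in the panic message (your copy of the file may differ), and it assumes the offending Perl statement fits on a single line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: comment out one line of a script, keeping a .bak backup.
# Usage: comment_out_line FILE LINE_NUMBER
comment_out_line() {
    local file="$1" line="$2"
    # Print the line first so you can confirm it is the one named in the panic.
    sed -n "${line}p" "$file"
    # Prefix the line with '#'. The -i.bak form works with both GNU and BSD sed
    # and leaves the original in FILE.bak.
    sed -i.bak "${line}s/^/#/" "$file"
}

# Example (path as given in the comment above):
# comment_out_line jamr/tools/cdec/corpus/support/quote-norm.pl 149
```

To undo the change, copy the `.bak` file back over the edited one.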

@ritwikmishra

@ConstantineLignos I tried what you suggested and compiled again. Now the output is:

### Tokenizing ###
panic: swash_fetch got swatch of unexpected bit width, slen=1024, needents=64 at /home/ritwik/ATS/jamr/tools/cdec/corpus/support/quote-norm.pl line 149, <STDIN> line 1.
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
   at edu.cmu.lti.nlp.amr.CorpusTool$$anonfun$main$1.apply(CorpusTool.scala:48)
   at edu.cmu.lti.nlp.amr.CorpusTool$$anonfun$main$1.apply(CorpusTool.scala:43)
   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
   at edu.cmu.lti.nlp.amr.CorpusTool$.main(CorpusTool.scala:43)
   at edu.cmu.lti.nlp.amr.CorpusTool.main(CorpusTool.scala)

By the way, I am using the CAMR parser now; it works better for my needs.

@calliwen

calliwen commented Jan 22, 2019

@ritwikmishra I ran into the same ALIGN.sh error you describe above and made the same change to CorpusTool.scala as you did. I got the following message:

### Tokenizing ###
/Users/gaoyong/jamr/tools/cdec/corpus/support/utf8-normalize.sh: Cannot find ICU uconv (http://site.icu-project.org/) ... falling back to iconv. Quality may suffer.
iconv: conversion from utf8 unsupported
iconv: try 'iconv -l' to get the list of supported encodings
 ### Running aligner ###

Were you able to solve your problem?

@ConstantineLignos

@calliwen I don't know how much this helps, but I am now seeing what others are seeing: commenting out the line of Perl I suggested above is not enough to fix it. We have two otherwise identical machines where one works and the other doesn't, and we haven't been able to sort out the difference.

However, in your case, I think this is the most important error:

panic: swash_fetch got swatch of unexpected bit width, slen=1024, needents=64 at /home/ritwik/JAMR/jamr/tools/cdec/corpus/support/quote-norm.pl line 149, <STDIN> line 1.

If you comment out line 149 of that file, does the problem go away?

@calliwen

@ConstantineLignos Thanks for the reply. When I run it on macOS, I get the "ICU uconv" and "utf8 unsupported" errors. But when I run it on Linux, I get the swash_fetch panic you quoted, and your solution fixed it.
Thanks a lot.

@ConstantineLignos

@calliwen Glad it worked! You can probably get a working uconv from Homebrew for macOS. You may have to manually get the executables on your path; see https://apple.stackexchange.com/questions/201590/uconv-on-mac-os-x-anywhere.
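To see which converter the tokenizer's normalization step would find on a given machine, something like the following may help. It is a rough sketch: the helper names `iconv_accepts` and `find_converter` are hypothetical and not part of JAMR or cdec. It only checks whether `uconv` is on the PATH and whether the local `iconv` accepts a given encoding name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the local iconv accepts the encoding name $1.
iconv_accepts() {
    printf 'x' | iconv -f "$1" -t "$1" >/dev/null 2>&1
}

# Hypothetical helper: prints "uconv", "iconv", or "none".
find_converter() {
    if command -v uconv >/dev/null 2>&1; then
        echo "uconv"
    elif command -v iconv >/dev/null 2>&1 && iconv_accepts UTF-8; then
        echo "iconv"
    else
        echo "none"
    fi
}

find_converter
# The error above complains about the spelling "utf8", so test that name too.
if iconv_accepts utf8; then
    echo "iconv accepts 'utf8'"
else
    echo "iconv rejects 'utf8'"
fi
```

If this prints "iconv" but also reports that 'utf8' is rejected, the encoding name spelling is the likely cause of the "conversion from utf8 unsupported" message, and installing uconv sidesteps it.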
