
TIMEOUT: Timeout exceeded error trying tok = CoreNLPTokenizer() #23

Closed
RitwikGopi opened this issue Aug 4, 2017 · 15 comments

@RitwikGopi

When I try

>>> from drqa.tokenizers import CoreNLPTokenizer
>>> tok = CoreNLPTokenizer()
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/pexpect/expect.py", line 99, in expect_loop
    incoming = spawn.read_nonblocking(spawn.maxread, timeout)
  File "/usr/local/lib/python3.5/dist-packages/pexpect/pty_spawn.py", line 462, in read_nonblocking
    raise TIMEOUT('Timeout exceeded.')
pexpect.exceptions.TIMEOUT: Timeout exceeded.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ritwik/rd/DrQA/drqa/tokenizers/corenlp_tokenizer.py", line 33, in __init__
    self._launch()
  File "/home/ritwik/rd/DrQA/drqa/tokenizers/corenlp_tokenizer.py", line 61, in _launch
    self.corenlp.expect_exact('NLP>', searchwindowsize=100)
  File "/usr/local/lib/python3.5/dist-packages/pexpect/spawnbase.py", line 390, in expect_exact
    return exp.expect_loop(timeout)
  File "/usr/local/lib/python3.5/dist-packages/pexpect/expect.py", line 107, in expect_loop
    return self.timeout(e)
  File "/usr/local/lib/python3.5/dist-packages/pexpect/expect.py", line 70, in timeout
    raise TIMEOUT(msg)
pexpect.exceptions.TIMEOUT: Timeout exceeded.
<pexpect.pty_spawn.spawn object at 0x7ff89a70f128>
command: /bin/bash
args: ['/bin/bash']
buffer (last 100 chars): b'@stagwiki: ~/rd/DrQA/data/corenlp\x07\x1b[01;32mritwik@stagwiki\x1b[00m:\x1b[01;34m~/rd/DrQA/data/corenlp\x1b[00m$ '
before (last 100 chars): b'@stagwiki: ~/rd/DrQA/data/corenlp\x07\x1b[01;32mritwik@stagwiki\x1b[00m:\x1b[01;34m~/rd/DrQA/data/corenlp\x1b[00m$ '
after: <class 'pexpect.exceptions.TIMEOUT'>
match: None
match_index: None
exitstatus: None
flag_eof: False
pid: 17048
child_fd: 5
closed: False
timeout: 60
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 100000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_string:
    0: "b'NLP>'"

CLASSPATH is set properly:

corenlp$ echo $CLASSPATH
/home/ritwik/rd/DrQA/data/corenlp/ejml-0.23.jar /home/ritwik/rd/DrQA/data/corenlp/javax.json-api-1.0-sources.jar /home/ritwik/rd/DrQA/data/corenlp/javax.json.jar /home/ritwik/rd/DrQA/data/corenlp/joda-time-2.9-sources.jar /home/ritwik/rd/DrQA/data/corenlp/joda-time.jar /home/ritwik/rd/DrQA/data/corenlp/jollyday-0.4.9-sources.jar /home/ritwik/rd/DrQA/data/corenlp/jollyday.jar /home/ritwik/rd/DrQA/data/corenlp/protobuf.jar /home/ritwik/rd/DrQA/data/corenlp/slf4j-api.jar /home/ritwik/rd/DrQA/data/corenlp/slf4j-simple.jar /home/ritwik/rd/DrQA/data/corenlp/stanford-corenlp-3.8.0.jar /home/ritwik/rd/DrQA/data/corenlp/stanford-corenlp-3.8.0-javadoc.jar /home/ritwik/rd/DrQA/data/corenlp/stanford-corenlp-3.8.0-models.jar /home/ritwik/rd/DrQA/data/corenlp/stanford-corenlp-3.8.0-sources.jar /home/ritwik/rd/DrQA/data/corenlp/xom-1.2.10-src.jar /home/ritwik/rd/DrQA/data/corenlp/xom.jar
@ajfisch
Contributor

ajfisch commented Aug 4, 2017

Hi,

This is not how you specify the CLASSPATH in Java. You need : as the separator between jars. Instead of adding each jar explicitly, just do export CLASSPATH=/home/ritwik/rd/DrQA/data/corenlp/*.
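For anyone doing this from Python directly, here is a minimal sketch of the same fix, assuming the default classpath is read from $CLASSPATH when drqa.tokenizers is imported (as the shipped tokenizer module appears to do) and using the reporter's path as a placeholder:

import os

# Sketch only: export the wildcard classpath before drqa.tokenizers is imported,
# so the spawned java process sees a single "dir/*" entry rather than a
# space-separated list of jars. The path is the reporter's; substitute your own.
os.environ['CLASSPATH'] = '/home/ritwik/rd/DrQA/data/corenlp/*'

from drqa.tokenizers import CoreNLPTokenizer
tok = CoreNLPTokenizer()
print(tok.tokenize('hello world').words())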

@RitwikGopi
Author

That is exactly what I did.

@ajfisch
Contributor

ajfisch commented Aug 4, 2017

Hm. Are you sure?

[afisch/~]$ export CLASSPATH=/home/ritwik/rd/DrQA/data/corenlp/*
[afisch/~]$ echo $CLASSPATH
/home/ritwik/rd/DrQA/data/corenlp/*

@RitwikGopi
Author

ritwik@stagwiki:~$ export CLASSPATH=/home/ritwik/rd/DrQA/data/corenlp/*
ritwik@stagwiki:~$ echo $CLASSPATH
/home/ritwik/rd/DrQA/data/corenlp/ejml-0.23.jar /home/ritwik/rd/DrQA/data/corenlp/javax.json-api-1.0-sources.jar /home/ritwik/rd/DrQA/data/corenlp/javax.json.jar /home/ritwik/rd/DrQA/data/corenlp/joda-time-2.9-sources.jar /home/ritwik/rd/DrQA/data/corenlp/joda-time.jar /home/ritwik/rd/DrQA/data/corenlp/jollyday-0.4.9-sources.jar /home/ritwik/rd/DrQA/data/corenlp/jollyday.jar /home/ritwik/rd/DrQA/data/corenlp/protobuf.jar /home/ritwik/rd/DrQA/data/corenlp/slf4j-api.jar /home/ritwik/rd/DrQA/data/corenlp/slf4j-simple.jar /home/ritwik/rd/DrQA/data/corenlp/stanford-corenlp-3.8.0.jar /home/ritwik/rd/DrQA/data/corenlp/stanford-corenlp-3.8.0-javadoc.jar /home/ritwik/rd/DrQA/data/corenlp/stanford-corenlp-3.8.0-models.jar /home/ritwik/rd/DrQA/data/corenlp/stanford-corenlp-3.8.0-sources.jar /home/ritwik/rd/DrQA/data/corenlp/xom-1.2.10-src.jar /home/ritwik/rd/DrQA/data/corenlp/xom.jar

@ajfisch
Contributor

ajfisch commented Aug 4, 2017

Weird. What shell are you using? bash? Try escaping: export CLASSPATH="/home/ritwik/rd/DrQA/data/corenlp/*"

@RitwikGopi
Author

I am using bash. The above also gives the same result.

@RitwikGopi
Author

I got the CLASSPATH corrected, but the error still occurs. Since those are jar files, do I need to set up Java to run this?

ritwik@stagwiki:~/rd/DrQA$ for d in ~/rd/DrQA/data/corenlp/*;do export CLASSPATH=$CLASSPATH:$d;done
ritwik@stagwiki:~/rd/DrQA$ echo $CLASSPATH
:/home/ritwik/rd/DrQA/data/corenlp/ejml-0.23.jar:/home/ritwik/rd/DrQA/data/corenlp/javax.json-api-1.0-sources.jar:/home/ritwik/rd/DrQA/data/corenlp/javax.json.jar:/home/ritwik/rd/DrQA/data/corenlp/joda-time-2.9-sources.jar:/home/ritwik/rd/DrQA/data/corenlp/joda-time.jar:/home/ritwik/rd/DrQA/data/corenlp/jollyday-0.4.9-sources.jar:/home/ritwik/rd/DrQA/data/corenlp/jollyday.jar:/home/ritwik/rd/DrQA/data/corenlp/protobuf.jar:/home/ritwik/rd/DrQA/data/corenlp/slf4j-api.jar:/home/ritwik/rd/DrQA/data/corenlp/slf4j-simple.jar:/home/ritwik/rd/DrQA/data/corenlp/stanford-corenlp-3.8.0.jar:/home/ritwik/rd/DrQA/data/corenlp/stanford-corenlp-3.8.0-javadoc.jar:/home/ritwik/rd/DrQA/data/corenlp/stanford-corenlp-3.8.0-models.jar:/home/ritwik/rd/DrQA/data/corenlp/stanford-corenlp-3.8.0-sources.jar:/home/ritwik/rd/DrQA/data/corenlp/xom-1.2.10-src.jar:/home/ritwik/rd/DrQA/data/corenlp/xom.jar

@ajfisch
Contributor

ajfisch commented Aug 4, 2017

Yes, you need Java 8.
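A quick, hedged way to check from the same Python environment that a Java runtime is actually visible (note that java -version prints to stderr):

import subprocess

# Sketch: confirm a Java runtime is on PATH for the Python process.
result = subprocess.run(['java', '-version'],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                        universal_newlines=True)
print(result.stderr)  # e.g. an openjdk "1.8.0_..." version string if Java 8 is installed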

@ajfisch
Contributor

ajfisch commented Aug 4, 2017

As a more direct test, try to see if you can get java -cp "/home/ritwik/rd/DrQA/data/corenlp/*" edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit to work.
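If that plain java command works but the tokenizer still times out, a hedged sketch that mirrors what corenlp_tokenizer.py does with pexpect (same 60-second timeout and NLP> prompt as in the dump above; the path is the reporter's) can narrow things down:

import pexpect

# Sketch: spawn a shell, launch CoreNLP, and wait for its interactive prompt,
# just like the tokenizer's _launch step. A TIMEOUT here reproduces the issue.
corenlp = pexpect.spawn('/bin/bash', maxread=100000, timeout=60)
corenlp.sendline('java -cp "/home/ritwik/rd/DrQA/data/corenlp/*" '
                 'edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit')
corenlp.expect_exact('NLP>', searchwindowsize=100)
print('CoreNLP prompt reached')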

@RitwikGopi
Author

I didn't have Java installed; installing it made it work in Python. However, the command above threw an error:

ritwik@stagwiki:~/rd/DrQA$ java -cp /home/ritwik/rd/DrQA/data/corenlp/*
Error: Could not find or load main class .home.ritwik.rd.DrQA.data.corenlp.javax.json-api-1.0-sources.jar

Anyway, it is working now. I think it would be a good idea to add Java to the dependencies list.

@netsafe

netsafe commented Sep 17, 2018

The classpath works in my case, and the Java test command does too, but the problem is:

09/17/2018 01:52:46 PM: [ Running on CPU only. ]
09/17/2018 01:52:46 PM: [ Initializing model... ]
09/17/2018 01:52:46 PM: [ Loading model /usr/work/DrQA/data/reader/single.mdl ]
09/17/2018 01:53:02 PM: [ Initializing tokenizer... ]
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/pexpect/expect.py", line 99, in expect_loop
    incoming = spawn.read_nonblocking(spawn.maxread, timeout)
  File "/usr/local/lib/python3.5/dist-packages/pexpect/pty_spawn.py", line 462, in read_nonblocking
    raise TIMEOUT('Timeout exceeded.')
pexpect.exceptions.TIMEOUT: Timeout exceeded.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "scripts/reader/interactive.py", line 53, in <module>
    normalize=not args.no_normalize)
  File "/usr/work/DrQA/drqa/reader/predictor.py", line 84, in __init__
    self.tokenizer = tokenizer_class(annotators=annotators)
  File "/usr/work/DrQA/drqa/tokenizers/corenlp_tokenizer.py", line 33, in __init__
    self._launch()
  File "/usr/work/DrQA/drqa/tokenizers/corenlp_tokenizer.py", line 61, in _launch
    self.corenlp.expect_exact('NLP>', searchwindowsize=100)
  File "/usr/local/lib/python3.5/dist-packages/pexpect/spawnbase.py", line 390, in expect_exact
    return exp.expect_loop(timeout)
  File "/usr/local/lib/python3.5/dist-packages/pexpect/expect.py", line 107, in expect_loop
    return self.timeout(e)
  File "/usr/local/lib/python3.5/dist-packages/pexpect/expect.py", line 70, in timeout
    raise TIMEOUT(msg)
pexpect.exceptions.TIMEOUT: Timeout exceeded.
<pexpect.pty_spawn.spawn object at 0xb38a4850>
command: /bin/bash
args: ['/bin/bash']
buffer (last 100 chars): b'sifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [10.2 sec].\r\n'
before (last 100 chars): b'sifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [10.2 sec].\r\n'
after: <class 'pexpect.exceptions.TIMEOUT'>
match: None
match_index: None
exitstatus: None
flag_eof: False
pid: 29943
child_fd: 5
closed: False
timeout: 60
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 100000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_string:
    0: "b'NLP>'"

I am logging in through SSH into a Docker container.

@ajfisch
Contributor

ajfisch commented Sep 20, 2018

Oy. Which version of CoreNLP and/or pexpect are you using? The NER module of the versions past 2017-06-09 loads several gazetteers which breaks this implementation. If you can't get it to work, I'd recommend using the spacy tokenizer. (This should be replaced with a more robust way of using CoreNLP efficiently.)
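For reference, a minimal sketch of that fallback, assuming spaCy and its English model are installed and that the SpacyTokenizer shipped under drqa/tokenizers is available in your checkout:

# Sketch: swap in DrQA's spaCy-based tokenizer, which avoids the CoreNLP/pexpect launch entirely.
from drqa.tokenizers import SpacyTokenizer

tok = SpacyTokenizer()
print(tok.tokenize('hello world').words())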

@mazzzystar

@RitwikGopi
I have the same issue as you and followed your steps, up until running

java -cp "/home/ritwik/rd/DrQA/data/corenlp/*" edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit

I find I can run it on the command line:

(dl) testMacBook-Pro:DrQA ke$ java -cp "/Users/test/Documents/QA/DrQA/data/corenlp/*" edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[main] INFO edu.stanford.nlp.pipeline.TokenizerAnnotator - No tokenizer type provided. Defaulting to PTBTokenizer.
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit

Entering interactive shell. Type q RETURN or EOF to quit.
NLP> hello world
Sentence #1 (2 tokens):
hello world
[Text=hello CharacterOffsetBegin=0 CharacterOffsetEnd=5]
[Text=world CharacterOffsetBegin=6 CharacterOffsetEnd=11]
NLP> how are you ?
Sentence #1 (4 tokens):
how are you ?
[Text=how CharacterOffsetBegin=0 CharacterOffsetEnd=3]
[Text=are CharacterOffsetBegin=4 CharacterOffsetEnd=7]
[Text=you CharacterOffsetBegin=8 CharacterOffsetEnd=11]
[Text=? CharacterOffsetBegin=12 CharacterOffsetEnd=13]
NLP> 

But when I run a Python script as @ajfisch mentioned:

from drqa.tokenizers import CoreNLPTokenizer
tok = CoreNLPTokenizer()
tok.tokenize('hello world').words()  # Should complete immediately

Everything is still the same: after a very long wait, it fails with:

Traceback (most recent call last):
  File "/Users/ke/miniconda3/envs/dl/lib/python3.6/site-packages/pexpect/expect.py", line 99, in expect_loop
    incoming = spawn.read_nonblocking(spawn.maxread, timeout)
  File "/Users/ke/miniconda3/envs/dl/lib/python3.6/site-packages/pexpect/pty_spawn.py", line 462, in read_nonblocking
    raise TIMEOUT('Timeout exceeded.')
pexpect.exceptions.TIMEOUT: Timeout exceeded.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/ke/Documents/QA/DrQA/test.py", line 2, in <module>
    tok = CoreNLPTokenizer()
  File "/Users/ke/Documents/QA/DrQA/drqa/tokenizers/corenlp_tokenizer.py", line 33, in __init__
    self._launch()
  File "/Users/ke/Documents/QA/DrQA/drqa/tokenizers/corenlp_tokenizer.py", line 61, in _launch
    self.corenlp.expect_exact('NLP>', searchwindowsize=100)
  File "/Users/ke/miniconda3/envs/dl/lib/python3.6/site-packages/pexpect/spawnbase.py", line 390, in expect_exact
    return exp.expect_loop(timeout)
  File "/Users/ke/miniconda3/envs/dl/lib/python3.6/site-packages/pexpect/expect.py", line 107, in expect_loop
    return self.timeout(e)
  File "/Users/ke/miniconda3/envs/dl/lib/python3.6/site-packages/pexpect/expect.py", line 70, in timeout
    raise TIMEOUT(msg)
pexpect.exceptions.TIMEOUT: Timeout exceeded.
<pexpect.pty_spawn.spawn object at 0x10087dac8>
command: /bin/bash
args: ['/bin/bash']
buffer (last 100 chars): b'\r\nCaused by: java.lang.ClassNotFoundException: edu.stanford.nlp.pipeline.StanfordCoreNLP\r\nbash-3.2$ '
before (last 100 chars): b'\r\nCaused by: java.lang.ClassNotFoundException: edu.stanford.nlp.pipeline.StanfordCoreNLP\r\nbash-3.2$ '
after: <class 'pexpect.exceptions.TIMEOUT'>
match: None
match_index: None
exitstatus: None
flag_eof: False
pid: 6773
child_fd: 7
closed: False
timeout: 60
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 100000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_string:
    0: "b'NLP>'"
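The buffer above ends with java.lang.ClassNotFoundException: edu.stanford.nlp.pipeline.StanfordCoreNLP, which means the java process launched inside the pexpect-spawned bash did not see the CoreNLP jars; a CLASSPATH exported only in your interactive shell may not reach that child shell. A hedged workaround, assuming the constructor accepts a classpath keyword argument (check corenlp_tokenizer.py in your checkout), is to pass the jar directory explicitly:

# Sketch: hand the wildcard classpath straight to the tokenizer instead of relying on $CLASSPATH.
# The path is the one used above on macOS; substitute your own DrQA data/corenlp directory.
from drqa.tokenizers import CoreNLPTokenizer

tok = CoreNLPTokenizer(classpath='/Users/test/Documents/QA/DrQA/data/corenlp/*')
print(tok.tokenize('hello world').words())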

@nikhileshp

If you want to install Java directly from your Jupyter notebook (if you are using Google Colab or another cloud notebook):


import os                       # used below to set the JAVA_HOME environment variable
!sudo apt-get update

def install_java():
  !sudo apt-get install -y openjdk-8-jdk-headless -qq > /dev/null   # install OpenJDK 8 headless
  os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-8-openjdk-amd64"     # point JAVA_HOME at it
  !java -version                                                    # confirm the Java version

install_java()

@hinaayousaf

@ajfisch Can you please guide me on how to replace the CoreNLP tokenizer with spaCy in DrQA? I'm having a lot of trouble with this.
