Preprocessing issue #24

Open
Akshaysharma29 opened this issue Sep 16, 2020 · 7 comments

Comments

@Akshaysharma29

Akshaysharma29 commented Sep 16, 2020

Previous related issue
#21

My command line output:

DB connections: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 166/166 [00:00<00:00, 297.03it/s]
train section:   1%|█▏                                                                                                      | 99/8659 [00:02<03:46, 37.76it/s]100 sample done at c= 100
train section:   1%|█▏                                                                                                      | 99/8659 [00:02<04:07, 34.59it/s]
DB connections: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 166/166 [00:00<00:00, 281.66it/s]
val section:   9%|█████████▉                                                                                                | 97/1034 [00:02<00:28, 32.90it/s]100 sample done at c= 200
val section:   9%|█████████▉                                                                                                | 97/1034 [00:02<00:22, 41.70it/s]
87 words in vocab
Exception ignored in: <function CoreNLP.__del__ at 0x7efde4998560>
Traceback (most recent call last):
  File "/app/ratsql/resources/corenlp.py", line 24, in __del__
  File "/root/.local/lib/python3.7/site-packages/corenlp/client.py", line 83, in stop
  File "/opt/conda/lib/python3.7/subprocess.py", line 1790, in kill
AttributeError: 'NoneType' object has no attribute 'SIGKILL'

I have also tried increasing the Docker memory to 8 GB.
Any suggestions?

@zsLin177

zsLin177 commented Oct 9, 2020

Yeah, I have the same problem 👍
I tried increasing the Docker memory to 32 GB, but got this:
train section: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8659/8659 [1:04:40<00:00, 2.23it/s]
DB connections: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 166/166 [00:00<00:00, 267.34it/s]
val section: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1034/1034 [10:14<00:00, 1.68it/s]
1580 words in vocab
Exception ignored in: <function CoreNLP.__del__ at 0x7fc9f7051290>
Traceback (most recent call last):
  File "/app/ratsql/resources/corenlp.py", line 24, in __del__
  File "/root/.local/lib/python3.7/site-packages/corenlp/client.py", line 83, in stop
  File "/opt/conda/lib/python3.7/subprocess.py", line 1790, in kill
AttributeError: 'NoneType' object has no attribute 'SIGKILL'

So I think it is not a memory problem. Have you solved the problem? Please give some suggestions, thanks.

@Akshaysharma29
Author

Hi @zsLin177
I have not solved this.

@Tabish06

I'm facing the same issue as well.

@hclent

hclent commented Oct 14, 2020

I also have this same issue on an instance with 45G memory, so I do not think it is a memory issue. I am using the Dockerfile provided by the repo, so there should be no dependency problems.

I am pretty convinced that this is an error with the way the corenlp server is supposed to shut down. I was also able to "force" this error to happen when I keyboard interrupt the program early, when the "train" section is being loaded into a registry.

train section:   6%|██████████▋     
[pretrained_embeddings.py] tokenize() method called ... 
[pretrained_embeddings.py] tokenize() method called ... 
[pretrained_embeddings.py] tokenize() method called ... 
^CTraceback (most recent call last):
  File "run.py", line 109, in <module>
    main()
  File "run.py", line 73, in main
    preprocess.main(preprocess_config)
  File "/app/ratsql/commands/preprocess.py", line 56, in main
    preprocessor.preprocess()
  File "/app/ratsql/commands/preprocess.py", line 35, in preprocess
    self.model_preproc.add_item(item, section, validation_info)
  File "/app/ratsql/models/enc_dec.py", line 44, in add_item
    self.enc_preproc.add_item(item, section, enc_info)
  File "/app/ratsql/models/spider/spider_enc.py", line 168, in add_item
    preprocessed = self.preprocess_item(item, validation_info)
  File "/app/ratsql/models/spider/spider_enc.py", line 203, in preprocess_item
    cv_link = compute_cell_value_linking(question, item.schema)
  File "/app/ratsql/models/spider/spider_match_utils.py", line 123, in compute_cell_value_linking
    ret = db_word_match(word, column.orig_name, column.table.orig_name, schema.connection)
  File "/app/ratsql/models/spider/spider_match_utils.py", line 91, in db_word_match
    cursor.execute(p_str)
KeyboardInterrupt
train section:   6%|██████████▋                                                                                                                                                                   | 533/8659 [01:31<23:21,  5.80it/s]
Exception ignored in: <function CoreNLP.__del__ at 0x7f62693dbef0>
Traceback (most recent call last):
  File "/app/ratsql/resources/corenlp.py", line 24, in __del__
  File "/root/.local/lib/python3.7/site-packages/corenlp/client.py", line 83, in stop
  File "/opt/conda/lib/python3.7/subprocess.py", line 1790, in kill
AttributeError: 'NoneType' object has no attribute 'SIGKILL'

Still don't know the fix, and I'm experimenting with some things. If I get the answer I'll post it here, but otherwise this information might be useful to you all :)

Edit: just following up. I did a full run of the preprocessing with line 24 of corenlp.py commented out (replaced with a print statement for debugging). The code reaches its end with no error, and you get the preprocessing files that you need (check your data/ directory!). In conclusion, this corenlp error should not affect the data preprocessing at all. The error just comes from the <class 'corenlp.client.CoreNLPClient'> object terminating incorrectly :) Hope that helps!

@zsLin177

Thank you for your comment, @hclent.
I changed lines 23-25 to:
    def __del__(self):
        # self.client.stop()
        pass
and it works too.
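
Commenting out the stop() call silences the error, but it also means the client is never asked to shut down. A possible alternative, sketched below, is to keep the stop() call, run it from an atexit hook (which fires before Python tears down module globals), and ignore the errors that only appear late in interpreter shutdown. This is a minimal sketch, not the repo's actual code: `FakeClient` and `SafeCoreNLP` are hypothetical names, and the only assumption about the real client is that it exposes a `stop()` method, as the traceback suggests.

```python
import atexit


class FakeClient:
    """Hypothetical stand-in for the real CoreNLP client (illustration only)."""

    def __init__(self, fail_on_stop=False):
        self.fail_on_stop = fail_on_stop
        self.stop_calls = 0

    def stop(self):
        self.stop_calls += 1
        if self.fail_on_stop:
            # Mimics the error reported in this thread.
            raise AttributeError("'NoneType' object has no attribute 'SIGKILL'")


class SafeCoreNLP:
    """Hypothetical wrapper that stops its client at most once."""

    def __init__(self, client):
        self.client = client
        self._stopped = False
        # atexit handlers run while imported modules are still intact,
        # so stop() can run before module globals are cleared.
        atexit.register(self.close)

    def close(self):
        # getattr guards against a partially constructed object.
        if getattr(self, "_stopped", True):
            return
        self._stopped = True
        try:
            self.client.stop()
        except (AttributeError, TypeError):
            # Assumption: these only occur during interpreter teardown,
            # when there is nothing useful left to do.
            pass

    def __del__(self):
        self.close()
```

Calling `close()` explicitly at the end of preprocessing would be the most predictable option; the atexit hook and `__del__` are only fallbacks.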

@manzambi11

Thank you @hclent for your comment
