Batch processing #6
There may be a way using the `ssplit.eolonly` property and joining the sentences with `\n`, though I haven't gotten it to work yet; you could try investigating that. Another way to get a speed improvement is to send asynchronous POST requests, which runs about 6-7x faster for me but is still slightly slow. It might be throttled on the server side as well, so it could be even quicker if you run multiple servers.
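The concurrent-requests idea above can be sketched with the standard library alone. This is a minimal sketch, not the wrapper's own API: `parse_fn` is a placeholder for whatever single-sentence call you already make (e.g. `nlp.dependency_parse`), and the lambda at the bottom is a stand-in so the example runs without a CoreNLP server.

```python
from concurrent.futures import ThreadPoolExecutor

def parse_batch(sentences, parse_fn, max_workers=8):
    """Issue parse requests concurrently instead of one at a time.

    parse_fn is assumed to be any callable taking one sentence,
    e.g. nlp.dependency_parse; swap in your own client call.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the input order in its results
        return list(pool.map(parse_fn, sentences))

# Stand-in parser so the sketch runs without a server:
results = parse_batch(["a b", "c"], lambda s: s.split())
# results == [["a", "b"], ["c"]]
```

Threads work here because each request spends most of its time waiting on the server, so the GIL is not the bottleneck.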
@Subh1m Thanks for your advice. I have tested the following code:

```python
# coding=utf-8
import time
from stanfordcorenlp import StanfordCoreNLP

nlp = StanfordCoreNLP(r'G:/JavaLibraries/stanford-corenlp-full-2016-10-31/')

sentence = 'Guangdong University of Foreign Studies is located in Guangzhou.'

# First call: includes the model-loading time on the Java server
begin = time.time()
nlp.dependency_parse(sentence)
print(time.time() - begin)

# Subsequent calls: parsing only
corpus = [sentence] * 1000
begin = time.time()
for sent in corpus:
    nlp.dependency_parse(sent)
print(time.time() - begin)
```
It takes about 26 seconds to load the model, and about 24 seconds to parse 1000 sentences. The project is just a wrapper that parses the JSON data returned by the Java server backend, so the "async request" method suggested by @scottwthompson may be a good way to speed it up.
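Since the per-call overhead dominates, another option is to amortize it: if the server honors the `ssplit.eolonly` property mentioned earlier, one request can carry many sentences joined by newlines, with exactly one sentence per line coming back. This is a sketch under that assumption; `annotate_fn` is a placeholder for a client call that forwards properties to the server (the real wrapper's API may differ), and the `echo` stand-in lets the example run without a server.

```python
def batch_annotate(sentences, annotate_fn):
    """Send many sentences to the server in a single request.

    ssplit.eolonly=true asks CoreNLP to treat each input line as
    exactly one sentence, so outputs map back to inputs 1:1.
    """
    props = {
        'annotators': 'depparse',
        'ssplit.eolonly': 'true',
        'outputFormat': 'json',
    }
    text = '\n'.join(sentences)
    return annotate_fn(text, properties=props)

# Stand-in annotator so the sketch runs without a server:
echo = lambda text, properties: text.split('\n')
out = batch_annotate(['First sentence.', 'Second sentence.'], echo)
```

The win is one round trip (and one JSON decode) per batch instead of per sentence.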
Thanks for the code @Lynten. Just one question: in the line `nlp.dependency_parse(sentence)`, can we pass any text, or do we need to pass that particular sentence in order to load the model?
@Subh1m The Java server initializes the model the first time you call `nlp.dependency_parse(sentence)`, and after that it runs quickly. Of course, we can use anything in place of the "sentence".
Thanks @Lynten, this helped a lot.
It is a great wrapper.
Can you make it run as a batch process? It is too slow to run this separately for each new sentence.
I need it to dependency-parse several sentences within seconds.
Please look into the issue.