Very long texts #13
Comments
Hi Michele - You're on the right track, but you also have to change the default timeout in the JSON Client library, |
By modifying
I am also changing the timeout variable (in If I try to run this code on the text described before, I will have a Is there something else I should change? Thanks a lot, |
I followed the steps mentioned by Michele and encountered a 'jsonrpc.RPCTransportError: [Errno 104] Connection reset by peer'. My file is approximately 40,000 characters long, and running Stanford CoreNLP on it takes about 300 seconds. When I first encountered the timed-out exception, I followed your advice and set my timeout to 1800. Thank you |
It can be solved by increasing the values of the 'limit' and 'timeout' parameters in jsonrpc.py. |
The problem is still there. This is what I did.
In corenlp.py (server)
I modified lines #256 and #257, by adding the
server = jsonrpc.Server(jsonrpc.JsonRpc20(),
jsonrpc.TransportTcpIp(addr=(options.host, int(options.port)), limit=50000, timeout=2000.0))
I run the server on the default port
In my client.py (client)
I wrote a simple client.py (very similar to your client.py) but I added here the same
import json
from jsonrpc import ServerProxy, JsonRpc20, TransportTcpIp
from pprint import pprint
nlp = ServerProxy(JsonRpc20(), TransportTcpIp(addr=("127.0.0.1", 8080), limit=50000, timeout=2000.0))
doc = "\\n\\n A New York man who was accused of faking his death last summer pleaded guilty to a conspiracy charge Thursday, Nassau County District Attorney Kathleen Rice announced.\\n\\nRaymond Roth, 48, of Massapequa, New York, was first reported missing in the waters off Jones Beach late last July by his 22-year-old son, Jonathan Roth. Several days into an extensive search involving multiple agencies, New York State Park Police said, authorities learned the missing man was in South Carolina, where he had been pulled over for speeding.\\n\\nThe day before Raymond Roth was pulled over, his wife, Evana, showed authorities e-mails she had discovered that appeared to detail a plan between him and his son to fake his death. Raymond Roth wanted his wife and son to collect at least $410,000 in life insurance benefits while he started a new life in Florida, Rice said.\\n\\nState police arrested both men in early August on charges of insurance fraud, conspiracy and filing a false report. Raymond Roth on Thursday agreed to plead guilty to the conspiracy charge in exchange for a sentence of 90 days in jail and five years\\' probation, the district attorney\\'s office said. He also must pay restitution for the cost of the search -- $27,445 to the U.S. Coast Guard and $9,109 to the Nassau County Police Department.\\n\\nEvana Roth told CNN in August she thought her husband devised the plan after he was fired from his job in July. Her attorney, Lenard Leeds, said she had been unaware of the ruse before she uncovered the e-mail correspondence.\\n\\n\"There needs to be a way for me to find out how things are going. Call me Sunday night at 8 PM at the resort,\" Raymond Roth wrote in an e-mail to his son the day before the son reported him missing.\\n\\nThe son\\'s case is still pending, the district attorney said. 
Jonathan Roth\\'s attorney, Joey Jackson, defended his client after his arrest, saying, \"There was abuse here, manipulation here, coercion here\" from the father.\\n\\nRaymond Roth\\'s attorney, Brian Davis, denied in August that Roth had involved his son in the scheme.\\n\\n\"We had issues concerning the facts people had whether (Roth) had an agreement with his son,\" Davis told CNN on Thursday. \"He\\'s admitted it now. He\\'s accepted responsibility.\"\\n\\nDavis added that his client has been under treatment for bipolar disorder in recent weeks.\\n\\nDuring plea negotiations, Raymond Roth asked the district attorney\\'s office not to give his son jail time, Davis said.\\n\\nOn the advice of both their attorneys, father and son have not been in contact since their arrests, Davis said.\\n\\n\"He would like to straighten things out with (Jonathan) when the time comes,\" he said.\\n\\n\\n\\n\\n"
print 'Doc lenght: {}'.format(len(doc))
pprint(json.loads(nlp.parse(doc)))
Result
This is what I get:
Doc lenght: 2682
{u'error': u'timed out after 137.100000 seconds'}
Notice that the
I think there's something in the
PS. Apologies for the late reply (for some reason I didn't receive any notification via e-mail). |
Any solution to this? |
Confirmed that changing default timeout at https://github.com/dasmith/stanford-corenlp-python/blob/master/jsonrpc.py#L746 to something like 200 seconds works for pathologically long sentences. |
In jsonrpc.py, search for the string "5.0" and change every occurrence to a larger number. That solved the problem for me. |
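For anyone wondering why raising the timeout helps at all, here is a minimal, self-contained sketch of the underlying mechanism in Python 3. It does not use jsonrpc.py or CoreNLP: `slow_server`, `request` and `demo` are hypothetical names made up for this illustration, and the `time.sleep` call merely stands in for a long parse. The only point is that a client socket whose timeout is shorter than the server's processing time gives up before the reply arrives.

```python
import socket
import threading
import time


def slow_server(listener, delay):
    """Accept one connection, wait `delay` seconds, then reply."""
    conn, _ = listener.accept()
    try:
        conn.recv(1024)
        time.sleep(delay)          # stands in for a long-running parse
        conn.sendall(b"done")
    except OSError:
        pass                       # the client may have given up already
    finally:
        conn.close()


def request(port, timeout):
    """Send one request and return the reply, or None if the read times out."""
    with socket.create_connection(("127.0.0.1", port), timeout=timeout) as sock:
        sock.sendall(b"parse this")
        try:
            return sock.recv(1024)
        except socket.timeout:
            return None


def demo(delay, timeout):
    """Run one request against a throwaway local server."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))    # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=slow_server, args=(listener, delay))
    worker.start()
    try:
        return request(port, timeout)
    finally:
        worker.join()
        listener.close()
```

Calling `demo(0.5, 0.1)` returns `None` because the client stops waiting after 0.1 seconds, while `demo(0.0, 2.0)` returns the server's reply. If the library's default timeout behaves the same way, a parse that takes minutes would need a timeout of at least that length on every socket involved.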
This is a late follow-up, but this suggested fix does not work for me. I've set the timeouts to about 300s, but it appears that corenlp still times out. Did anybody have a different solution? In particular, if I try to parse a chunk of 1000 chars it goes through fine, but trying to parse 1024 chars breaks. Edit: It appears that the issue comes from CoreNLP itself - the command line interface there uses a buffer which only reads in 1024 chars at a time, and since this library essentially uses the shell, it will break accordingly on longer strings. |
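If the 1024-character buffer described above really is the limiting factor, one possible workaround is to split the text into smaller pieces on the client side and parse each piece separately. The following is only a sketch under that assumption: `chunk_text` is a hypothetical helper, not part of this library, and the 1000-character default is taken from the observation that 1000 characters went through while 1024 did not.

```python
import re


def chunk_text(text, max_chars=1000):
    """Split text into pieces of at most max_chars, preferring sentence ends."""
    # Split on whitespace that follows ., ! or ? and keep the punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is itself too long.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = sentence if not current else current + " " + sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding this sentence would overflow, so start a new chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed to `nlp.parse(chunk)` as in the client code earlier in this thread, and the per-chunk results concatenated. Note that anything CoreNLP computes across sentences, such as coreference, would not see past a chunk boundary.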
@dasmith Could you please look into this? It would be great if you could respond to this, thanks. |
Dustin passed away Feb. 2015
…On Wed, Feb 21, 2018 at 8:32 AM Sriharsh Bhyravajjula < ***@***.***> wrote:
@dasmith <https://github.com/dasmith> Could you please look into this?
There is a corresponding set of stackoverflow questions, too, but they
aren't working for me.
https://stackoverflow.com/questions/32550162/
https://stackoverflow.com/questions/41260313/
It would be great if you could respond to this, thanks.
|
I was checking my notifications after a long time, and I just saw this. I have been thinking of a reply for the past half an hour, I don't know what to say. I did not expect to find something so deafeningly loud when I clicked on that bell icon. I don't even know how I feel overwhelmed by those five words, despite being sensitive, I am just a random stranger on the internet who happened to use Dustin's code. But for the last half an hour, I've been sitting in silence. I've tried to read about him; I found his obituary, online, and I imagine a very smart and kind man, someone who might have chuckled a bit at an NLP question I might have asked at the end of a class. 'Beloved son, brother, uncle and friend. Known for his incredible mind and infectious sense of humor, Dustin leaves a lasting legacy of compassion and tenderness.', the words go. His friends said that 'Dustin’s bright blue eyes could light up any room'. And in one of his pictures Dustin was glad to share on his MIT page, I can see the warm smile which they were talking about, a smile which once perhaps spontaneously broke out in the middle of a gathering and spread around like a warm breeze on a spring morning. The world is a lesser place for your loss, Dustin.
|
I am trying to parse a text which is 1297 characters long but it returns an empty sentence. If I use a different timeout value in the file client.py, let's say 200.0, after that time passes the code raises a jsonrpc.RPCTransportError: timed out exception.
Could you tell me what I am supposed to modify in the code to make client.py work with longer texts?
Thanks,
michele.