Keep getting Net::HTTP::Persistent::Error: too many connection resets (due to end of file reached - EOFError) #116
Comments
I keep getting these errors during my testing... has anyone else seen this, and what needs to be done to resolve it? Here is another example of the error with more info: /Library/Ruby/Gems/1.8/gems/net-http-persistent-1.7/lib/net/http/persistent.rb:426:in `request': too many connection resets (due to end of file reached - EOFError)
After a quick peek at the code, my understanding is that when some (possibly temporary) error occurs, it retries a few more times, and if it still fails it throws this error. So this probably has little to do with too many dangling connections. My impression from scanning the code is that it already manages the connection(s?) somewhat intelligently, including reusing them and shutting down those that are not needed, but I'm not sure. My guess is this is a server-specific issue?
You will get an EOFError when the connection is closed before all the data is retrieved. Since this is happening at the network layer, it's probably an issue with the server closing the connection before sending all the data. If you run with debug mode enabled and paste the output, we may be able to learn more.
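To see where this EOFError comes from without involving mechanize at all, here is a minimal stdlib-only sketch (the toy server and port choice are illustrative assumptions, not anything from this thread): a server that closes the socket before writing any response bytes makes a plain Net::HTTP request fail with the same low-level EOFError that net-http-persistent wraps as "too many connection resets".

```ruby
require 'net/http'
require 'socket'

# Toy server: accept connections and close them immediately, before
# writing any response bytes. This simulates a server dropping the
# connection before the client has read a complete response.
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
Thread.new do
  loop do
    client = server.accept
    client.close
  end
end

# A plain Net::HTTP GET against this server fails with EOFError,
# the same low-level error net-http-persistent reports upward.
error = nil
begin
  Net::HTTP.new('127.0.0.1', port).get('/')
rescue EOFError => e
  error = e
end
puts error.class
```

Note that recent Ruby versions transparently retry idempotent requests once on EOFError, which is why the server above closes connections in a loop rather than just once.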
Sorry to open this back up again but I have the same problem. Here's my code:

```ruby
require 'rubygems'
require 'mechanize'
require 'logger'

agent = Mechanize.new
agent.keep_alive = true
agent.log = Logger.new(STDOUT)

page = agent.get('http://www.lexisnexis.com/lawschool/login.aspx')
login_form = page.form('form1')
login_form.txtLoginID = 'email@example.com'
login_form.TextBox1 = 'password1234'
page = agent.submit(login_form)

# searchPage is where the error occurs
searchPage = agent.get('https://www.lexis.com/research/xlink')
pp searchPage
```

I'm not quite sure what to do, so I've created a gist of my output: https://gist.github.com/2496352
Looking at your output, mechanize thinks it got two HTTP responses for only one request. Can you send me a log with the raw socket output to drbrain@segment7.net? The raw socket debugging will include your password unless you edit it out. Here is an example:
By default stderr is not buffered, unlike stdout.
I got your log, here's the important part, edited for clarity and brevity:
The second-to-last line is the problem (see RFC 2616). I'll look into fixing mechanize to raise the proper error, and after that try to add detection for this type of error.
Thanks for the clarification. Does this mean, ultimately, that the page can't be scraped? Best regards, On Friday, April 27, 2012 at 8:59 PM, Eric Hodel wrote:
I need to fix some bugs, but it will be scrapable after that.
Cool. Thanks again!
…::ResponseReadError in case of bad servers. Issue #116
@metalfingers the following should work for you after @dd65e11:

```ruby
mech = Mechanize.new
mech.ignore_bad_chunking = true
# … your script
```

Note that you can also rescue the error yourself:

```ruby
begin
  page = agent.get('https://www.lexis.com/research/xlink')
rescue Mechanize::ChunkTerminationError => e
  # check e.body_io for completeness
  page = e.force_parse
end
```

Mechanize::ChunkTerminationError is a subclass of Mechanize::ResponseReadError, so you can handle it the same as a content-length error: http://mechanize.rubyforge.org/Mechanize.html#label-Problems+with+content-length
Thanks. I'll give it a go!
I don't think the above exception handling helps with form submissions that hit this same error. When I encounter the EOF exception after a form submit, I get the following backtrace:
And the exception object is of type Net::HTTP::Persistent::Error, so you can't call e.force_parse. Still looking into this.
By disabling keep_alive, we were able to alleviate the EOFErrors we were receiving. We found the fix outlined here: http://rubyforge.org/pipermail/mechanize-users/2010-January/000486.html
Mechanize 2.5.1? I can.
Most of my form submissions go through, it's just a few that don't, so I don't think it's the same bug. Trialling disabling keep_alive now. |
What is the keep-alive timeout for this host? Does reducing the idle_timeout to 0.5 help? |
Disabling keep_alive worked. The header is:
I'll check the idle_timeout as soon as I can. |
Setting
Correction: The form submissions do appear to be going through; I misread my logs. |
Since the "last used" value is 0.254 seconds, setting your idle_timeout to 0.25 may be better. It seems your server has a particularly aggressive idle timeout. |
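To make the arithmetic here concrete, the rule being applied can be sketched as a small helper (the `reusable?` name is hypothetical, for illustration only; it is not part of mechanize or net-http-persistent): a pooled connection is only safe to reuse if its idle time is below `idle_timeout`, which in turn should sit below the server's keep-alive timeout.

```ruby
# Hypothetical illustration of the idle_timeout heuristic: reuse a
# pooled connection only if it has been idle for less than
# idle_timeout seconds; otherwise open a fresh connection instead
# of risking a reset from the server's side.
def reusable?(last_used_at, idle_timeout, now = Time.now)
  (now - last_used_at) < idle_timeout
end

now = Time.now
# A connection last used 0.254 s ago fails an idle_timeout of 0.25:
puts reusable?(now - 0.254, 0.25, now)  # prints false
# A connection idle for only 0.1 s would still be reused:
puts reusable?(now - 0.100, 0.25, now)  # prints true
```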
I am processing multiple pages on a site for a payment processor and I run into errors like:
Net::HTTP::Persistent::Error: too many connection resets (due to end of file reached - EOFError) after 3 requests on 2195783400
I am thinking that it is because the processing is happening faster than the Net connection can be closed or released. Is this the case? If so, how can I force the connection to close (or wait until it closes) before continuing, or is there a better way to handle this?
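One common workaround for this situation, sketched below with a hypothetical `with_retries` helper (this is not a mechanize API, just a generic pattern in plain Ruby): rescue the reset, pause briefly so the stale connection can be torn down, and retry a bounded number of times.

```ruby
# Hypothetical retry helper: re-run the block up to `attempts` times
# when the connection is reset, sleeping briefly between tries so the
# stale connection can be torn down before the next request.
def with_retries(attempts: 3, delay: 0.5)
  tries = 0
  begin
    yield
  rescue EOFError, Errno::ECONNRESET
    tries += 1
    raise if tries >= attempts
    sleep delay
    retry
  end
end

# Simulate a request that fails twice with EOFError, then succeeds.
calls = 0
result = with_retries(delay: 0) do
  calls += 1
  raise EOFError, "end of file reached" if calls < 3
  :ok
end
puts "succeeded after #{calls} calls: #{result}"
```

In a real script the block would wrap the `agent.get` / `agent.submit` call, and the rescued class would be `Net::HTTP::Persistent::Error` (or `Mechanize::ResponseReadError`) rather than the bare stdlib errors used in this self-contained sketch.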