Net::HTTP::Persistent::Error: too many connection resets (due to Connection reset by peer - Errno::ECONNRESET) after 2 requests on 14759220 #123

Closed
bjoseph opened this Issue Jul 3, 2011 · 102 comments

@bjoseph
bjoseph commented Jul 3, 2011

I am trying to do some simple screen scraping on etrade's website but am running into an issue similar to one reported earlier. I read through that thread, but it doesn't look like it was ever resolved.

Here is the error I am getting back:

Net::HTTP::Persistent::Error: too many connection resets (due to Connection reset by peer - Errno::ECONNRESET) after 2 requests on 14759220
    from /Users/benny/.rvm/gems/ruby-1.9.2-p180@taxhaven/gems/net-http-persistent-1.8/lib/net/http/persistent.rb:446:in `rescue in request'
    from /Users/benny/.rvm/gems/ruby-1.9.2-p180@taxhaven/gems/net-http-persistent-1.8/lib/net/http/persistent.rb:422:in `request'
    from /Users/benny/.rvm/gems/ruby-1.9.2-p180@taxhaven/gems/mechanize-2.0.1/lib/mechanize/http/agent.rb:204:in `fetch'
    from /Users/benny/.rvm/gems/ruby-1.9.2-p180@taxhaven/gems/mechanize-2.0.1/lib/mechanize.rb:628:in `post_form'
    from /Users/benny/.rvm/gems/ruby-1.9.2-p180@taxhaven/gems/mechanize-2.0.1/lib/mechanize.rb:520:in `submit'
    from /Users/benny/.rvm/gems/ruby-1.9.2-p180@taxhaven/gems/mechanize-2.0.1/lib/mechanize/form.rb:167:in `submit'
    from (irb):74
    from /Users/benny/.rvm/gems/ruby-1.9.2-p180@taxhaven/gems/railties-3.0.9/lib/rails/commands/console.rb:44:in `start'
    from /Users/benny/.rvm/gems/ruby-1.9.2-p180@taxhaven/gems/railties-3.0.9/lib/rails/commands/console.rb:8:in `start'
    from /Users/benny/.rvm/gems/ruby-1.9.2-p180@taxhaven/gems/railties-3.0.9/lib/rails/commands.rb:23:in `<top (required)>'
    from script/rails:6:in `require'
    from script/rails:6:in `<main>'

Here is my code:

require 'rubygems'
require 'mechanize'

agent = Mechanize.new
login_page = agent.get("https://www.etrade.com")
form = login_page.form_with(:action => '/login.fcc') 
form.USER     = "test"
form.PASSWORD = "test12"
form.submit

Any ideas?

Thanks

@chip
chip commented Jul 8, 2011

I am getting the same error when attempting an http POST, but it only occurs intermittently.

@shaiguitar

Same here.

@lankz
lankz commented Jul 28, 2011

I encountered this too after upgrading from 1.0.0, along with this one in a few other places:

too many connection resets (due to end of file reached - EOFError) after 3 requests on 70046648458580

Easily resolved by sticking with 1.0.0, which seems much more stable compared to 2.x at the moment.

@styx
styx commented Aug 1, 2011

I'll try to find out when the regression came up.

@styx
styx commented Aug 1, 2011

Net::HTTP::Persistent was introduced in 4d074f4.

Some stuff from debug log:

I, [2011-08-01T11:01:20.015280 #7916]  INFO -- : Net::HTTP::Post: /login.fcc
D, [2011-08-01T11:01:20.015352 #7916] DEBUG -- : request-header: accept => */*
D, [2011-08-01T11:01:20.015384 #7916] DEBUG -- : request-header: user-agent => WWW-Mechanize/1.0.0 (http://rubyforge.org/projects/mechanize/)
D, [2011-08-01T11:01:20.015415 #7916] DEBUG -- : request-header: connection => keep-alive
D, [2011-08-01T11:01:20.015449 #7916] DEBUG -- : request-header: keep-alive => 300
D, [2011-08-01T11:01:20.015479 #7916] DEBUG -- : request-header: accept-encoding => gzip,identity
D, [2011-08-01T11:01:20.015509 #7916] DEBUG -- : request-header: accept-language => en-us,en;q=0.5
D, [2011-08-01T11:01:20.015540 #7916] DEBUG -- : request-header: host => us.etrade.com
D, [2011-08-01T11:01:20.015570 #7916] DEBUG -- : request-header: accept-charset => ISO-8859-1,utf-8;q=0.7,*;q=0.7
D, [2011-08-01T11:01:20.015600 #7916] DEBUG -- : request-header: cookie => WRC_ID=93.125.111.29-1312185679759; TB=8785
D, [2011-08-01T11:01:20.015630 #7916] DEBUG -- : request-header: referer => https://us.etrade.com/e/t/home
D, [2011-08-01T11:01:20.015665 #7916] DEBUG -- : request-header: content-type => application/x-www-form-urlencoded
D, [2011-08-01T11:01:20.015695 #7916] DEBUG -- : request-header: content-length => 54
E, [2011-08-01T11:01:20.306272 #7916] ERROR -- : Rescuing EOF error
E, [2011-08-01T11:01:21.374473 #7916] ERROR -- : Rescuing EOF error
E, [2011-08-01T11:01:22.435259 #7916] ERROR -- : Rescuing EOF error

I also checked the headers with the Firefox plugin Tamper Data.
It shows that the size of the first 3 consecutive requests (POST, GET, GET) is -1.

@bhochhi
bhochhi commented Aug 5, 2011

Any solution to this issue? I am getting a similar error, and increasing keep_alive_time or open_time has no effect:
too many connection resets (due to end of file reached - EOFError) after 3 requests on 3084924
Net::HTTP::Persistent::Error

@simonmd
simonmd commented Aug 6, 2011

Same here, seems to fluctuate between EOF and ECONNRESET

@jg
jg commented Aug 9, 2011

Confirmed with Mechanize 2.0.1. Please fix this issue.

@GBH
GBH commented Aug 16, 2011

+1 I'm seeing the same thing

@drbrain drbrain was assigned Aug 22, 2011
@knu
Member
knu commented Aug 24, 2011

Same here. It occurs regardless of https or http.

If you know re-posting is harmless (as in a login form) you can temporarily set:

  agent.agent.http.retry_change_requests = true

This forces a re-post, and it seems to work for me.

@knu
Member
knu commented Aug 24, 2011

Another ugly workaround is to manually reset the connection before posting.

  #...
  agent.agent.http.tap { |http|
    http.reset http.connection_for(login_page.uri + form.action)
  }
  form.submit

@matthewbjones

I'm also seeing this quite frequently on 2.0.1, going to look into downgrading to 1.0.0 as a short-term solution to the problem.

@jinschoi
Contributor

Here is what looks to be going on:

I see this error when I run a request, then after a brief pause, run another request, using SSL. It does not happen when I run the requests back to back. I got a full backtrace from line 460 of persistent.rb, and that showed that the IOError was being raised from buffering.rb:145 in read_nonblock at a call to sysread_nonblock, called from net/protocol.rb:135 in rbuf_fill.

Digging down, it looks like sysread_nonblock is implemented (in 1.9.2) in the core file ossl_ssl.c, and only returns an IOError if one of two errors occurred, SSL_ERROR_ZERO_RETURN or SSL_ERROR_SYSCALL. The documentation here: http://www.openssl.org/docs/ssl/SSL_get_error.html
indicates that either the connection has been closed, or an EOF was read that violates the SSL protocol. I'm guessing it is the first case because of the behavior with delays. So it looks like the problem is that regardless of the Connection: Keep-Alive setting, the SSL connection can go down and you get an EOFError. In the non-SSL situation, there is a similar problem with IO.read_nonblock() also having the possibility of returning EOFError.

The problem is going to be figuring out when that is the case so you can retry safely for non-idempotent queries. knu's workaround will work if you know it is safe to do so. You can't just always retry on EOFError because other levels can throw EOFError for different reasons.
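The "retry only when you know it is safe" idea above can be sketched as a small wrapper. This is an illustration only, not code from mechanize or net-http-persistent: `with_one_retry` and its `safe_to_retry` flag are hypothetical names, and the caller is assumed to know whether repeating the request is harmless.

```ruby
# Hypothetical helper: run a block and retry it once if the connection drops.
# The caller states whether repeating the request is harmless.
def with_one_retry(safe_to_retry:, retry_on: [EOFError, Errno::ECONNRESET])
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue *retry_on
    # Re-raise when a retry could repeat a side effect, or when the
    # single allowed retry has already been used.
    raise unless safe_to_retry && attempts < 2
    retry
  end
end
```

With Mechanize this might wrap a login submit, for example `with_one_retry(safe_to_retry: true) { form.submit }`. As noted above, an EOFError can come from other layers for other reasons, so the flag should only be set where a duplicate request is acceptable.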

@dgmdan
dgmdan commented Sep 9, 2011

I'm getting a similar error using 2.0.1. It happens on a POST request in which the server takes about 2-3 min to respond. The error I get is a little different though: "too many connection resets (due to Resource temporarily unavailable - Timeout::Error)" Fixed it temporarily by reverting back to 1.0.0 and setting high agent.read_timeout and agent.open_timeout settings.

@lxcid
lxcid commented Sep 23, 2011

We are facing a similar problem. I think we will choose the downgrade path and test 2.0.2 when it's released.

@mohamedhafez

Is anybody also getting occasional SocketErrors about getaddrinfo failing because it couldn't find the remote host, in addition to the ECONNRESET errors from this bug? Or is this a different problem that I'm having?

@woto
woto commented Oct 3, 2011

Having same problem :(

@madsheep
madsheep commented Oct 5, 2011

any updates?

@ghost
ghost commented Oct 7, 2011

Hey any updates... same here...

@bhochhi
bhochhi commented Oct 7, 2011

I started using selenium web driver.

@cantonic
cantonic commented Oct 8, 2011

Same here... this issue is 4 months old now. Can we expect any updates?

@cantonic
cantonic commented Oct 8, 2011

I don't know what happened, but it is working for me now...

things i have done:
installed watir-webdriver
installed mechanize 1.0.0
uninstalled mechanize 1.0.0 and installed 2.0.1 again

@drbrain
Member
drbrain commented Oct 8, 2011

The issue is related to:

  1. The server you are connecting to
  2. The types of requests you are making (idempotent GET requests vs non-idempotent POST requests)

Without example scripts to illustrate and reproduce your specific problem it is difficult to find a "fix" for your specific program.

@jinschoi
Contributor
jinschoi commented Oct 8, 2011

Here is one example of a server I've come across that triggers this behavior:

a = Mechanize.new
result = a.get('https://junecloud.com/sync/deliveries/') do |page|
  sleep 10
  page.form_with(:action => "./").submit
end

It works without the sleep, which makes me think it has something to do with the keep alive behavior of the server.

@nahi
nahi commented Oct 12, 2011

I get the same error with @jinschoi's script.

And here's a similar trace from httpclient. Relevant part is 'KeepAliveDisconnected'. HTTPClient tries to re-post under some condition since we might not be able to detect a socket disconnection by peer.

EDIT: I forgot to add this URL: https://gist.github.com/1280318

@drbrain
Member
drbrain commented Oct 25, 2011

I've released net-http-persistent 2.2 which will reset connections that have been idle for 5 seconds. Can some of you try your scripts with master @1fd7c77 or newer?

@nahi
nahi commented Oct 25, 2011

From my investigation at that time, the server for 'https://junecloud.com/sync/deliveries/' seems to have 1 or 2 sec as KeepAliveTimeout.

I mention this because a second request made 1 to 4 seconds after the first might still raise the same error as before.

@drbrain
Member
drbrain commented Oct 25, 2011

I picked 5s as the default because Apache uses it. If there's a better default I can change it, but I need feedback first.

I can have net-http-persistent display the idle time for a socket that needs to be reset. That might help.

Right now users can adjust the timeout through Mechanize#idle_timeout=
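The idle-timeout behaviour described above (reset a kept-alive connection that has sat unused longer than the timeout) can be sketched roughly as follows. This is not net-http-persistent's code: `connection_expired?` and `monotonic_now` are hypothetical names, and treating `nil` as "never expire" and zero or negative values as "always reconnect" is an assumption of this sketch.

```ruby
# Hypothetical check, not the library's code: decide whether a kept-alive
# connection should be thrown away before the next request.
#   idle_timeout nil   -> never expire because of idleness (assumption)
#   idle_timeout <= 0  -> always use a fresh connection (assumption)
def connection_expired?(last_use, now, idle_timeout)
  return false if idle_timeout.nil?
  return true if idle_timeout <= 0

  (now - last_use) > idle_timeout
end

# A monotonic clock avoids surprises when the wall clock is adjusted.
def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end
```

The check only helps when the client's timeout is shorter than the server's keep-alive timeout, which is why the right value depends on the server being contacted.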

@nahi
nahi commented Oct 25, 2011

It's good to be able to configure.

But I thought that net-http-persistent might want to reopen the transport connection and retransmit the aborted sequence of requests when a peer disconnected the connection, according to

8.1.4 Practical Considerations
9.1.2 Idempotent Methods

of RFC2616.

httpclient always does that without user interaction, even if the request is not idempotent. I don't know what browsers do. Mechanize might want to behave more like browsers.

@drbrain
Member
drbrain commented Oct 25, 2011

On Oct 25, 2011, at 9:47 AM, alcalaerick86 wrote:

I used the latest net-http-persistent , and still get same error, after many times trying some of the times login does work, some do not.

Can you post a script that I can use to reproduce?

@drbrain
Member
drbrain commented Oct 25, 2011

@nahi net-http-persistent implements 8.1.4 and 9.1.2 and allows overriding per paragraph 4 via #retry_change_requests which is just now exposed in mechanize.

Browsers usually display a dialog box like "Do you want to resubmit this form?" when a POST needs to be resubmitted.

@drbrain
Member
drbrain commented Oct 25, 2011

Modifying @jinschoi's script like so:

require 'mechanize'

a = Mechanize.new
a.agent.set_http
a.agent.http.debug_output = $stderr

result = a.get('https://junecloud.com/sync/deliveries/') do |page|
  sleep 10
  page.form_with(:action => "./").submit
end

From the debug output from Net::HTTP I see:

$ ruby19 -Ilib t.rb 
opening connection to junecloud.com...
opened
<- "GET /sync/deliveries/ HTTP/1.1[…]"
-> "HTTP/1.1 200 OK\r\n"
[…]
read 5403 bytes
Conn keep-alive
[pause for 10 seconds]
opening connection to junecloud.com… [new connection created here due to idle timeout]
opened
<- "POST /sync/deliveries/ HTTP/1.1[…]"
<- "cmd=login&type=web&email=&password=&newpassword=&confirmpass=&name="
-> "HTTP/1.1 200 OK\r\n"
read 5435 bytes
Conn keep-alive

With the sleep reduced below 5s (the default idle timeout), the script fails with "too many connection resets" unless idle_timeout is also reduced below 1s (0.9s works):

require 'mechanize'

a = Mechanize.new
a.idle_timeout = 0.9

result = a.get('https://junecloud.com/sync/deliveries/') do |page|
  sleep 1
  page.form_with(:action => "./").submit
end

If you've commented on this issue can you report a value of idle_timeout that works for your application?

@alcalaerick86

require 'mechanize'
require 'logger'
agent = Mechanize.new{|a| a.log = Logger.new(STDERR) }
agent.read_timeout = 60
def add_cookie(agent, uri, cookie)
  uri = URI.parse(uri)
  Mechanize::Cookie.parse(uri, cookie) do |cookie|
    agent.cookie_jar.add(uri, cookie)
  end
end
page = agent.get "http://www.sistemasaplicados.com.mx"
form = page.forms.first
form.correo_ingresar = "ing.alcala@ofixcomp.com"
form.password = "ofixcomp"
page = agent.submit form

It worked. I had to run a gem clean to wipe out the old net-http-persistent 1.9. I don't know if it has anything to do with this, but Mechanize does not forward me to the page it is supposed to.

@drbrain
Member
drbrain commented Oct 25, 2011

@alcalaerick86 that's good. It also looks like www.sistemasaplicados.com.mx has a keep-alive timeout of at least 10 seconds (but less than 15).

@nahi
nahi commented Oct 25, 2011

@drbrain Ah, I understood. Sorry for the noise.

Browsers usually display a dialog box like "Do you want to resubmit this form?" when a POST needs to be resubmitted.

Yes, but have you ever seen it appear because the server dropped a connection on KeepAliveTimeout? It would look like a dialog popping up right after pushing the [submit] button...

@jinschoi
Contributor

The idle timeout appears to work, and is a fine workaround, but it is very dependent on the actual timeouts involved. A better solution might be to modify net/http/persistent.rb to reopen a connection when an EOFError is thrown due to OpenSSL. The snippet included by @nahi on Oct 12 seems to suggest that httpclient does exactly this.

@drbrain
Member
drbrain commented Oct 25, 2011

@jinschoi The HTTP spec doesn't allow mechanize to do that by default. Read RFC 2616 section 8.1.4 paragraph 4:

This means that clients, servers, and proxies MUST be able to recover from asynchronous close events. Client software SHOULD reopen the transport connection and retransmit the aborted sequence of requests without user interaction so long as the request sequence is idempotent (see section 9.1.2). Non-idempotent methods or sequences MUST NOT be automatically retried, although user agents MAY offer a human operator the choice of retrying the request(s). Confirmation by user-agent software with semantic understanding of the application MAY substitute for user confirmation. The automatic retry SHOULD NOT be repeated if the second sequence of requests fails.

So for a GET it is OK to retry once, but not for a POST. This is to prevent duplicate records from being changed or modified in an unintended way.

Without modifying net/http I don't think there's a way to detect the socket close without attempting to make a request, and the error doesn't occur until after the request body has been sent, so I can't tell if the request was received or not.

You may set agent.retry_change_requests = true (per "semantic understanding of the application") if you know this won't cause problems for your application (like a login or search form).
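The rule quoted from RFC 2616 can be written down as a small predicate. This is a sketch, not the check net-http-persistent performs: `IDEMPOTENT_METHODS` and `retry_allowed?` are hypothetical names, and the `retry_change_requests:` keyword here merely stands in for the application-level override mentioned above. The method list follows RFC 2616 section 9.1.2.

```ruby
# Methods RFC 2616 section 9.1.2 describes as idempotent.
IDEMPOTENT_METHODS = %w[GET HEAD PUT DELETE OPTIONS TRACE].freeze

# Hypothetical predicate: may a request that failed on a dropped
# connection be retried automatically?
def retry_allowed?(http_method, retry_change_requests: false)
  # The application has declared that repeating change requests is harmless.
  return true if retry_change_requests

  IDEMPOTENT_METHODS.include?(http_method.to_s.upcase)
end
```

Under this rule a GET may be retried once, while a POST is only retried when the application opts in, as with a login or search form.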

@nahi
nahi commented Oct 25, 2011

@drbrain is right. So the resolution should be: a non-idempotent request must be made on a fresh (non keep-alive) connection. I still haven't checked what browsers do, though.

@drbrain
Member
drbrain commented Oct 25, 2011

@nahi I'm wondering if the browsers have a better way of detecting a server-closed socket than net/http does. Would SO_KEEPALIVE help? Some other socket option I don't know about?

I tried running dtruss on Safari, but it didn't reveal any use of the BSD socket API. I haven't pulled the firefox code to dig through, either.

@nahi
nahi commented Oct 26, 2011

@drbrain You can detect a TCP connection close by doing IO multiplexing (rewriting net/http with IO.select), but I don't think that applies here. We cannot detect it without actually sending a packet.
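For reference, the IO.select approach mentioned above might look like the sketch below. It is not part of net/http, `peer_closed?` is a hypothetical name, and it only reports a close that the local kernel has already seen. A server that hangs up a moment later still goes unnoticed, and on an SSL socket pending bytes may be a close alert rather than data, so this does not remove the race discussed in this thread.

```ruby
require 'socket'

# Hypothetical check: has the peer already closed this stream socket?
# Uses a zero-timeout select plus a one-byte peek so no data is consumed.
def peer_closed?(sock)
  readable, = IO.select([sock], nil, nil, 0)
  return false unless readable # nothing pending, no close seen so far

  data = sock.recv_nonblock(1, Socket::MSG_PEEK, exception: false)
  return false if data == :wait_readable

  # End of stream shows up as nil or an empty string depending on the Ruby version.
  data.nil? || data.empty?
rescue Errno::ECONNRESET, Errno::EPIPE
  true
end
```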

@jinschoi
Contributor

Okay. How about some way to always ignore keep alive and always use a new connection? Will setting the timeout to 0 accomplish that?

@drbrain
Member
drbrain commented Oct 26, 2011

@jinschoi I will make @mechanize.keep_alive = false work again tomorrow, but setting the idle timeout to 0 (or -1) will also accomplish that.

@drbrain
Member
drbrain commented Oct 27, 2011

@jinschoi can you try @839c008:

require 'mechanize'
a = Mechanize.new
a.keep_alive = false

result = a.get('https://junecloud.com/sync/deliveries/') do |page|
  sleep 10
  page.form_with(:action => "./").submit
end

@drbrain
Member
drbrain commented Oct 27, 2011

@bjoseph I can't get your etrade example to work at all, I fear that they have decided I am a hacker ☹

@drbrain
Member
drbrain commented Nov 3, 2011

I think it is safe to close this issue now. If you have issues with mechanize trunk and neither the new idle_timeout setting nor disabling keep_alive fix the issue please comment!

@drbrain drbrain closed this Nov 3, 2011
@drbrain
Member
drbrain commented Nov 4, 2011

gem install hoe
git clone git://github.com/tenderlove/mechanize.git
cd mechanize
rake package
gem install pkg/mechanize-2.1.gem

@manuelmeurer

That returns an error: https://gist.github.com/1340608

After executing, there is a mechanize-2.1.gem in /pkg.
Is that usable?

@drbrain
Member
drbrain commented Nov 4, 2011

Ugh, check_extra_deps depends on a rubygems feature that doesn't exist yet.

Yes, you can run gem install pkg/mechanize-2.1.gem and it will work.

I've updated my comment above to have working instructions.

Also, I will release a prerelease version of mechanize 2.1 on Monday.

@manuelmeurer

Alright, I think I'll wait till Monday then. :)

@todddickerson

Would love that pre-release version, having trouble getting the method mentioned the other day working on engineyard

@drbrain
Member
drbrain commented Nov 8, 2011

I need to finish #163 before a pre-release, please be patient.

@todddickerson

No problem, thanks for making Mechanize awesome! I also got this working on Engine Yard in the meantime by unpacking it, using it as a plugin, and manually including all the dependency gems... in case anyone else needs the same!

@jtwalters

I'm also getting a similar error (related?):

/Users/joel/.rvm/gems/ruby-1.9.1-p378/gems/net-http-persistent-2.5.2/lib/net/http/persistent.rb:821:in `rescue in request': too many connection resets (due to closed stream - IOError) after 0 requests on 70275381732612, last used 1330528924.64938 seconds ago (Net::HTTP::Persistent::Error)

I tried various settings for keep_alive and idle_timeout to no avail. Downgrading to Mechanize 1.0.0 resolves the issue for me.

@drbrain
Member
drbrain commented Mar 19, 2012

@jtwalters what did you set your idle_timeout to? Does it work with an idle_timeout of 0? Does it work with an idle_timeout of 1?

Can you provide a script to reproduce or at least the host you are connecting to? If so, please open a new ticket.

Without such information it is impossible to determine where the problem lies.

@jtwalters

I tried idle_timeout 0, 1, 5 I believe but that didn't resolve the issue I was having. I am pretty sure it has something to do with the host I am connecting to being flaky, since I often have latency and issues viewing the website in my own Internet browser. Recently, I haven't had the issues so I can't reproduce the issue anymore.

@theglauber

I'm having this problem today, also, and my code worked before, with the same server. Setting idle_timeout to 1, 0, 0.5, 2, didn't help. Setting keep_alive to false didn't help either.

@KronicDeth

Upgrading from Mechanize 2.1 to 2.4. Previously setting agent.keep_alive = false worked to allow submitting the login form at https://alist.traackr.com/users/login. After upgrade I got the connection reset error. Changed from agent.keep_alive = false to using knu's manual reset fix (https://github.com/tenderlove/mechanize/issues/123#issuecomment-1887211) before calling submit on the login form

@drbrain drbrain reopened this May 1, 2012
@drbrain
Member
drbrain commented May 4, 2012

I would like to fix this bug but I need a script I can run to reproduce it.

If you have a script I can run on my local machine please include it.

If the site requires login credentials and you can provide me with a temporary account please email the script to drbrain@segment7.net

Without a way to reproduce the issue you are seeing I can't fix this bug.

@mamantoha

error after upgrading from 2.4 to 2.5

/home/mama/.rvm/gems/ruby-1.9.3-p194/gems/net-http-persistent-2.6/lib/net/http/persistent.rb:839:in `rescue in request': too many connection resets (due to Timeout::Error - Timeout::Error) after 0 requests on 86714710, last used 1336679019.672163 seconds ago (Net::HTTP::Persistent::Error)
        from /home/mama/.rvm/gems/ruby-1.9.3-p194/gems/net-http-persistent-2.6/lib/net/http/persistent.rb:848:in `request'
        from /home/mama/.rvm/gems/ruby-1.9.3-p194/gems/mechanize-2.5/lib/mechanize/http/agent.rb:258:in `fetch'
        from /home/mama/.rvm/gems/ruby-1.9.3-p194/gems/mechanize-2.5/lib/mechanize/http/agent.rb:944:in `response_redirect'
        from /home/mama/.rvm/gems/ruby-1.9.3-p194/gems/mechanize-2.5/lib/mechanize/http/agent.rb:299:in `fetch'
        from /home/mama/.rvm/gems/ruby-1.9.3-p194/gems/mechanize-2.5/lib/mechanize.rb:1229:in `post_form'
        from /home/mama/.rvm/gems/ruby-1.9.3-p194/gems/mechanize-2.5/lib/mechanize.rb:515:in `submit'
        from /home/mama/.rvm/gems/ruby-1.9.3-p194/gems/mechanize-2.5/lib/mechanize/form.rb:178:in `submit'
        from /media/system/ruby/github/vkontakte/lib/vkontakte/client.rb:59:in `login!'
        from get_online_friends.rb:24:in `<main>'

@mamantoha

Also, the Rubyforge authorization example (https://github.com/tenderlove/mechanize/blob/master/examples/rubyforge.rb) does not work with 2.5.

@drbrain
Member
drbrain commented May 11, 2012

@mamantoha Your error is due to a timeout (Timeout::Error) not a connection reset (Errno::ECONNRESET), you will need to open a separate issue.

In that issue please include a mechanize log of your client's interaction with the server.

In that issue please include the steps I need to run to reproduce your failure on my machine.

In that issue please include instructions on setting up any accounts needed on the remote servers to reproduce your failure on my machine.

Without these steps I cannot investigate your issue.
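Since the wrapped cause decides which issue a report belongs to, it can help to pull it out of the message before filing. The sketch below is a hypothetical helper, not part of net-http-persistent: `RESET_MESSAGE` and `reset_cause` are made-up names, and the pattern is only assumed to match the message shapes quoted in this thread.

```ruby
# Hypothetical pattern for messages such as:
#   too many connection resets (due to end of file reached - EOFError) after 1 requests on ...
RESET_MESSAGE = /\(due to (?<detail>.*) - (?<error_class>[A-Z][\w:]*)\) after (?<requests>\d+) requests/

# Returns the underlying error description, class name and request count,
# or nil when the message does not have the expected shape.
def reset_cause(message)
  match = RESET_MESSAGE.match(message)
  return nil unless match

  {
    detail: match[:detail],
    error_class: match[:error_class],
    requests: match[:requests].to_i
  }
end
```

A result whose `error_class` is `Timeout::Error` points at a timeout, while `Errno::ECONNRESET` or `EOFError` points at a dropped connection.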

PS: I ran the rubyforge example and it is failing due to bad chunking on the server:

-> "HTTP/1.1 200 OK\r\n"
-> "Date: Fri, 11 May 2012 19:42:07 GMT\r\n"
-> "Server: Apache\r\n"
-> "X-Powered-By: PHP/4.4.9\r\n"
-> "Cache-Control: private\r\n"
-> "Content-Encoding: gzip\r\n"
-> "Vary: Accept-Encoding\r\n"
-> "Keep-Alive: timeout=3, max=100\r\n"
-> "Connection: Keep-Alive\r\n"
-> "Transfer-Encoding: chunked\r\n"
-> "Content-Type: text/html\r\n"
-> "\r\n"
-> "26f1\r\n"
reading 9969 bytes…
[…]
read 9969 bytes
reading 2 bytes...
-> "\r\n"
read 2 bytes
Exception `Errno::EAGAIN' at /usr/local/lib/ruby/2.0.0/net/protocol.rb:153 - Resource temporarily unavailable - read would block
Exception `Net::ReadTimeout' at /usr/local/lib/ruby/2.0.0/net/protocol.rb:158 - Net::ReadTimeout
Conn close because of error Net::ReadTimeout

The last-chunk is missing entirely. Unfortunately this is not detectable by #ignore_bad_chunking which depends on a last-chunk without the terminating CRLF.
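To illustrate what a missing last-chunk means, here is a minimal chunked-body reader that reports whether the terminating zero-size chunk was seen. It is a sketch, not Mechanize's or Net::HTTP's implementation: `read_chunked` is a hypothetical name, trailers are ignored, and a StringIO stands in for the socket.

```ruby
require 'stringio'

# Hypothetical reader: decode a chunked body and report whether the
# terminating zero-size last-chunk was actually seen.
def read_chunked(io)
  body = String.new
  loop do
    size_line = io.gets("\r\n")
    size_field = size_line && size_line[/\A\h+/]
    # Stream ended (or turned to garbage) before the last-chunk arrived.
    return [body, false] if size_field.nil?

    size = size_field.hex # chunk sizes are hexadecimal, e.g. "26f1" is 9969
    return [body, true] if size.zero?

    data = io.read(size).to_s
    body << data
    return [body, false] if data.bytesize < size # chunk was cut short

    io.read(2) # discard the CRLF that follows each chunk's data
  end
end
```

On a real socket the read would block until the read timeout rather than return end-of-file, which is consistent with the Net::ReadTimeout in the trace above.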

@mamantoha

@drbrain how do I log what Mechanize does?

@drbrain
Member
drbrain commented May 11, 2012

@mamantoha:

require 'logger'
require 'mechanize'

mech = Mechanize.new
mech.log = Logger.new $stderr

@mamantoha

@drbrain
and

agent = Mechanize.new
agent.agent.http.debug_output = $stderr

@wilkerlucio

having similar problem here:

 Net::HTTP::Persistent::Error:
   too many connection resets (due to HTTP session not yet started - IOError) after 0 requests on 2155378560, last used 1338607416.911736 seconds ago

This error doesn't happen when using net-http-persistent 1.9 (by downgrading Mechanize).

@wilkerlucio

Fixed here by removing Webmock from my gems, I'm actually using VCR, so, replaced Webmock with Fakeweb and it's working now :)

@alup
alup commented Jun 19, 2012

having the same error after upgrading

too many connection resets (due to end of file reached - EOFError) after 1 requests on 34751720, last used 0.014726077 seconds ago

I am facing this error in production while I am not having this in development. The main difference is that I am using a proxy in production.

@alup
alup commented Jun 19, 2012

Fixed by setting agent.keep_alive = false. That was problematic in conjunction with the usage of tiny proxy.

@treeder
treeder commented Jul 26, 2012

Having the same problem, seems to be after some large number of requests. If I reduce the request counts, it doesn't occur. This is during load testing with many threads, runs for a bit, then starts throwing these errors.

@subelsky

I stumbled onto this but fixed it using @wilkerlucio's suggestion to remove webmock. Maybe some weird interaction between Mechanize and webmock. /cc @bblimke

@bblimke
bblimke commented Dec 28, 2012

@subelsky WebMock is currently known not to work with Net::HTTP::Persistent and there is no fix yet.

@subelsky

@bblimke got it, thanks for clarifying!

@subelsky

@myronmarston just a heads-up this issue here is the only thing keeping me on FakeWeb for certain projects, where I'm faking out a Mechanize connection that uses Net::HTTP::Persistent. Webmock can't do it so I fall back to fakeweb.

@mrbrdo
mrbrdo commented Apr 14, 2013

I am using ruby-1.9.3-p374 (not using webmock and it's not a dependency to anything I have) and I keep getting these errors all the time (using mechanize 2.5.1, net-http-persistent 2.8). It's becoming so annoying sometimes I want to scream. I even tried calling mechanize.agent.http.shutdown but it doesn't really mitigate the problem.

I keep getting these errors:

Net::HTTP::Persistent::Error: too many connection resets (due to end of file reached - EOFError) after 1 requests on -614078038, last used 1.896244976 seconds ago
Net::HTTP::Persistent::Error: connection refused: domain:443
Mechanize::ChunkedTerminationError: end of file reached (Mechanize::ChunkedTerminationError)
Errno::ETIMEDOUT: Connection timed out - connect(2) in net-http-persistent-2.8/lib/net/http/persistent/ssl_reuse.rb:29→ initialize
OpenSSL::SSL::SSLError: SSL_connect SYSCALL returned=5 errno=0 state=SSLv2/v3 read server hello A in net-http-persistent-2.8/lib/net/http/persistent/ssl_reuse.rb:70→ connect
Timeout::Error: Timeout::Error in net-http-persistent-2.8/lib/net/http/persistent.rb:570→ connection_for

I mean this is a bit ridiculous, why should I care about all this stuff as an end-user. I really don't care, it's just some low-level stuff that is completely irrelevant to me. I have to build a whole system around mechanize just to cope with this stuff.

@leejarvis
Member

@mrbrdo Can you show me the code you're using to reproduce these problems? I'd like to tackle this issue and get it closed asap

@bsgreenb

Hey guys I intermittently experience this bug. Any updates?

@mrbrdo
mrbrdo commented Jun 22, 2013

Well, I am beginning to think that this error could be due to bad proxies. I am using proxies and I was getting the error. I thought it was related to mechanize, but it could be just bad proxies. I will go through my logs later and let you know.

@mrbrdo
mrbrdo commented Jun 22, 2013

Hm well it's hard to say. I do sometimes get through after a retry or two, but I don't know if that's because of the proxy or because of Mechanize.

@jeroeningen

I use Mechanize 2.7.1 and net-http-persistent 2.8 and I got the following error in my Rspec tests:

     Net::HTTP::Persistent::Error:
       too many connection resets (due to closed stream - IOError) after 0 requests on 26130520, last used 1372252918.7879653 seconds ago

I also use VCR (https://rubygems.org/gems/vcr). Disabling VCR for the specific tests solved it for me. I disabled VCR as follows:

  before(:all) do
    VCR.turn_off!
  end
  after(:all) do
    VCR.turn_on!
  end

@bblimke
bblimke commented Jun 26, 2013

@jeroeningen that's probably due to WebMock. I'm still trying to figure out how to solve net-http-persistent compatibility.

@bsgreenb
bsgreenb commented Jul 3, 2013

I also use Mechanize 2.7.1 and net-http-persistent 2.8, but I don't use VCR, and I intermittently get this error.


@bblimke
bblimke commented Jul 3, 2013

@bsgreenb can you also confirm that WebMock is not loaded?

@bsgreenb
bsgreenb commented Jul 3, 2013

I don't have the webmock gem installed. I do have fakeweb, but I'm not enabling it when this bug occurs.


@bblimke
bblimke commented Jul 5, 2013

For those using WebMock (or VCR with WebMock), version 1.13.0 fixes Net::HTTP::Persistent compatibility.

@aquasync

This has been failing for me with a mixture of "too many connection resets" and plain timeouts pretty consistently for the last few months. Setting idle_timeout, keep_alive and retry_change_requests doesn't help.

client = Mechanize.new
page = client.get 'http://www.us.hsbc.com'

Meanwhile (in the same irb window) this will generally work fine:

html = open('http://www.us.hsbc.com') { |f| f.read }

@bsgreenb

This bug still exists. It's the biggest problem with mechanize. I really wish it could be resolved. Here is another example:

agent = Mechanize.new
agent.get('https://site.com/whatever') 
#=> Net::HTTP::Persistent::Error: too many connection resets (due to Connection reset by peer - SSL_connect - Errno::ECONNRESET) after 0 requests on 70362676419080, last used 1379112036.219295 seconds ago

agent.verify_mode = OpenSSL::SSL::VERIFY_NONE
#=> 0
agent.get('https://site.com/whatever') 
#=> Net::HTTP::Persistent::Error: too many connection resets (due to Connection reset by peer - SSL_connect - Errno::ECONNRESET) after 0 requests on 70362676321280, last used 1379112135.65024 seconds ago

Here is the agent:

#<Mechanize:0x007ffd2ca88f50 @agent=#<Mechanize::HTTP::Agent:0x007ffd2ca88ed8 @allowed_error_codes=[], @conditional_requests=true, @context=#<Mechanize:0x007ffd2ca88f50 ...>, @content_encoding_hooks=[], @cookie_jar=#<Mechanize::CookieJar:0x007ffd2ca88e60 @store=#<HTTP::CookieJar::HashStore:0x007ffd2ca88de8 @mon_owner=nil, @mon_count=0, @mon_mutex=#<Mutex:0x007ffd2ca88d98>, @logger=nil, @gc_threshold=150, @jar={}, @gc_index=0>>, @follow_meta_refresh=false, @follow_meta_refresh_self=false, @gzip_enabled=true, @history=[], @ignore_bad_chunking=false, @keep_alive=true, @max_file_buffer=100000, @open_timeout=nil, @post_connect_hooks=[], @pre_connect_hooks=[], @read_timeout=nil, @redirect_ok=true, @redirection_limit=20, @request_headers={}, @robots=false, @user_agent="Mechanize/2.7.2 Ruby/2.0.0p247 (http://github.com/sparklemotion/mechanize/)", @webrobots=nil, @auth_store=#<Mechanize::HTTP::AuthStore:0x007ffd2ca88b68 @auth_accounts={}, @default_auth=nil>, @authenticate_parser=#<Mechanize::HTTP::WWWAuthenticateParser:0x007ffd2ca88a78 @scanner=nil>, @authenticate_methods={}, @digest_auth=#<Net::HTTP::DigestAuth:0x007ffd2ca889d8 @mon_owner=nil, @mon_count=0, @mon_mutex=#<Mutex:0x007ffd2ca88988>, @nonce_count=-1>, @digest_challenges={}, @pass=nil, @scheme_handlers={"http"=>#<Proc:0x007ffd2ca887a8@/Users/me/.rvm/gems/ruby-2.0.0-p247/gems/mechanize-2.7.2/lib/mechanize/http/agent.rb:172 (lambda)>, "https"=>#<Proc:0x007ffd2ca887a8@/Users/me/.rvm/gems/ruby-2.0.0-p247/gems/mechanize-2.7.2/lib/mechanize/http/agent.rb:172 (lambda)>, "relative"=>#<Proc:0x007ffd2ca887a8@/Users/me/.rvm/gems/ruby-2.0.0-p247/gems/mechanize-2.7.2/lib/mechanize/http/agent.rb:172 (lambda)>, "file"=>#<Proc:0x007ffd2ca887a8@/Users/me/.rvm/gems/ruby-2.0.0-p247/gems/mechanize-2.7.2/lib/mechanize/http/agent.rb:172 (lambda)>}, @http=#<Net::HTTP::Persistent:0x007ffd2ca88578 @name="mechanize", @debug_output=nil, @proxy_uri=nil, @no_proxy=[], @headers={}, @override_headers={}, @http_versions={}, @keep_alive=300, 
@open_timeout=nil, @read_timeout=nil, @idle_timeout=5, @max_requests=nil, @socket_options=[[6, 1, 1]], @generation_key=:net_http_persistent_mechanize_generations, @ssl_generation_key=:net_http_persistent_mechanize_ssl_generations, @request_key=:net_http_persistent_mechanize_requests, @timeout_key=:net_http_persistent_mechanize_timeouts, @certificate=nil, @ca_file=nil, @private_key=nil, @ssl_version=nil, @verify_callback=nil, @verify_mode=1, @cert_store=nil, @generation=1, @ssl_generation=1, @reuse_ssl_sessions=true, @retry_change_requests=false, @ruby_1=false, @retried_on_ruby_2=true>>, @log=nil, @watch_for_set=nil, @history_added=nil, @pluggable_parser=#<Mechanize::PluggableParser:0x007ffd2ca880c8 @parsers={"text/html"=>Mechanize::Page, "application/xhtml+xml"=>Mechanize::Page, "application/vnd.wap.xhtml+xml"=>Mechanize::Page, "image"=>Mechanize::Image, "text/xml"=>Mechanize::XmlFile, "application/xml"=>Mechanize::XmlFile}, @default=Mechanize::File>, @keep_alive_time=0, @proxy_addr=nil, @proxy_port=nil, @proxy_user=nil, @proxy_pass=nil, @html_parser=Nokogiri::HTML, @default_encoding=nil, @force_default_encoding=false>

I'm able to curl the HTTPS URL just fine and get the HTML response, so it's clearly something in the Ruby code that's to blame.

@drbrain
Member
drbrain commented Sep 13, 2013

Disabling the verify_mode does nothing if the remote end hangs up on you.

To diagnose this problem you'll need to show the URL you are connecting to. https://site.com does not exist:

$ ruby -rmechanize -e 'Mechanize.new.get "https://site.com/whatever"'
/usr/local/lib/ruby/2.0.0/net/http.rb:878:in `initialize': Operation timed out - connect(2) (Errno::ETIMEDOUT)

PS: Use code fences:

```
your code here
```
@bsgreenb

https://isapps.acxiom.com/optout/optout.aspx

@drbrain
Member
drbrain commented Sep 13, 2013

@bsgreenb your problem is not the problem described in this issue, but is a server negotiation problem as I can't connect using SSLSocket defaults:

$ ruby -rsocket -ropenssl -e 'io = TCPSocket.open "isapps.acxiom.com", 443; OpenSSL::SSL::SSLSocket.new(io).connect'
-e:1:in `connect': Connection reset by peer - SSL_connect (Errno::ECONNRESET)
    from -e:1:in `<main>'

Setting the ssl_version in the mechanize agent allows you to connect, but shows the certificate is missing (for me):

$ ruby -rmechanize -e 'm = Mechanize.new; m.agent.ssl_version = :TLSv1; m.get "https://isapps.acxiom.com/optout/optout.aspx"'
/usr/local/lib/ruby/2.0.0/net/http.rb:918:in `connect': SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed (OpenSSL::SSL::SSLError)

Here's the certificate chain which should help you track down the right certificate if you are also missing it:

Certificate chain
 0 s:/C=US/ST=Arkansas/L=Conway/O=Acxiom Corporation/CN=isapps.acxiom.com
   i:/C=US/O=Entrust, Inc./OU=www.entrust.net/rpa is incorporated by reference/OU=(c) 2009 Entrust, Inc./CN=Entrust Certification Authority - L1C

This was from openssl s_client -host isapps.acxiom.com -port 443

I strongly recommend you DO NOT SET verify_mode = OpenSSL::SSL::VERIFY_NONE.

@bsgreenb

Thanks.

What are the steps involved in adding a certificate that mechanize can
read? Any docs on this?

@drbrain
Member
drbrain commented Sep 14, 2013

You should be able to retrieve the CA cert from your browser (export it in PEM format), add it to an OpenSSL::X509::Store, and then set Mechanize#cert_store=

@bsgreenb

So when I access the url it's a chain of certificates that goes:
Entrust.net Certification Authority (2048) > Entrust Certification
Authority - L1C > isapps.acxiom.com

Which one should I export?

@drbrain
Member
drbrain commented Sep 14, 2013

Try both Entrust.net certificates, but you may be able to get away with the "(2048)" one alone. You'll need to add each separately to the Store.
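
For readers following along, here is a minimal sketch of adding each exported certificate to the store separately. The helper name `build_cert_store` and the file paths in the usage comment are hypothetical; only `OpenSSL::X509::Store`, `OpenSSL::X509::Certificate`, and `Mechanize#cert_store=` come from the discussion above.

```ruby
require 'openssl'

# Hypothetical helper (not part of Mechanize): builds a certificate store
# from a list of PEM files, adding each certificate on its own.
def build_cert_store(pem_paths)
  store = OpenSSL::X509::Store.new
  pem_paths.each do |path|
    # Each file is assumed to hold exactly one PEM-encoded certificate.
    store.add_cert(OpenSSL::X509::Certificate.new(File.read(path)))
  end
  store
end

# Usage with Mechanize (paths are placeholders for your exported files):
#   agent = Mechanize.new
#   agent.cert_store = build_cert_store(['entrust_2048.pem', 'entrust_l1c.pem'])
```

Keeping the PEM files next to the code makes it easier to reproduce the same trust setup on another machine.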

@aquasync

Hmm, thanks for the reference to ssl_version. Setting that to :TLSv1 fixes access to us.hsbc.com also. It seems kind of odd that open-uri works fine - I guess some difference in the defaults?

@bsgreenb

Eric,

I manually set the cert_store to the exported .pem file and I still get the
error.

Here's what I did:
-Went to Chrome and exported the Entrust.net Certification Authority (2048)
self-signed certificate to a .pem.
-Ran from console:
cert_store = OpenSSL::X509::Store.new
cert_store.add_file '/Users/ben/sources/test.pem'
agent = Mechanize.new
agent.cert_store = cert_store
agent.get 'https://isapps.acxiom.com/optout/optout.aspx'

And I get: Net::HTTP::Persistent::Error: too many connection resets (due to
Connection reset by peer - SSL_connect - Errno::ECONNRESET) after 0
requests on 70362622244720, last used 1379358504.9168751 seconds ago

@bsgreenb

curl --cacert test.pem -v https://isapps.acxiom.com/optout/optout.aspx works though:

$ curl --cacert test.pem -v https://isapps.acxiom.com/optout/optout.aspx

* About to connect() to isapps.acxiom.com port 443 (#0)
*   Trying 198.160.97.162...
* connected
* Connected to isapps.acxiom.com (198.160.97.162) port 443 (#0)
* successfully set certificate verify locations:
*   CAfile: test.pem
    CApath: none
* SSLv3, TLS handshake, Client hello (1):
* SSLv3, TLS handshake, Server hello (2):
* SSLv3, TLS handshake, CERT (11):
* SSLv3, TLS handshake, Server finished (14):
* SSLv3, TLS handshake, Client key exchange (16):
* SSLv3, TLS change cipher, Client hello (1):
* SSLv3, TLS handshake, Finished (20):
* SSLv3, TLS change cipher, Client hello (1):
* SSLv3, TLS handshake, Finished (20):
* SSL connection using RC4-MD5
* Server certificate:
*   subject: C=US; ST=Arkansas; L=Conway; O=Acxiom Corporation; CN=isapps.acxiom.com
*   start date: 2013-04-03 20:08:38 GMT
*   expire date: 2014-04-23 01:43:34 GMT
*   subjectAltName: isapps.acxiom.com matched
*   issuer: C=US; O=Entrust, Inc.; OU=www.entrust.net/rpa is incorporated by reference; OU=(c) 2009 Entrust, Inc.; CN=Entrust Certification Authority - L1C
*   SSL certificate verify ok.
> GET /optout/optout.aspx HTTP/1.1
> User-Agent: curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8r zlib/1.2.5
> Host: isapps.acxiom.com
> Accept: */*
>
< HTTP/1.1 200 OK

@bsgreenb

Think I've finally resolved the issues I was running into with these errors. It comes down to this characteristic:

"Mechanize defaults to validating SSL certificates using the default CA certificates for your platform."

For this reason, Mechanize might be fine requesting SSL sites from one machine and fail from another. That's why I recommend everyone specify a cert file that's in version control and can be used across environments.

Also, to the maintainers of Mechanize,

  1. Do you think you guys could make it actually report the default CA path being used when the user doesn't provide one? This would make debugging cert issues easier across different machines.
  2. You guys should have a wiki page on this that troubleshoots these types of errors. Based on the comments here, there are some standard solutions.
@drbrain
Member
drbrain commented Sep 24, 2013

Mechanize uses the default CA path, see:

https://github.com/drbrain/net-http-persistent/blob/master/lib/net/http/persistent.rb#L1171

If you don't have the necessary root certificates in your default CA path there's not much we can do about that. Clearer errors need to happen in openssl and net/http over in https://bugs.ruby-lang.org

I'll see what I can do for better documentation for the "too many connection resets" problems.
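
In the meantime, a rough sketch for checking which default CA locations Ruby's OpenSSL bindings were built with, assuming the default store is populated from OpenSSL's default paths. The helper name `default_ca_locations` is hypothetical; the constants are from Ruby's `openssl` standard library.

```ruby
require 'openssl'

# Hypothetical helper: reports where this Ruby's OpenSSL looks for CA
# certificates by default, plus any environment-variable overrides.
def default_ca_locations
  {
    cert_file: OpenSSL::X509::DEFAULT_CERT_FILE,
    cert_dir: OpenSSL::X509::DEFAULT_CERT_DIR,
    # These environment variables, when set, override the compiled-in paths.
    cert_file_env: ENV[OpenSSL::X509::DEFAULT_CERT_FILE_ENV],
    cert_dir_env: ENV[OpenSSL::X509::DEFAULT_CERT_DIR_ENV]
  }
end

default_ca_locations.each { |name, value| puts "#{name}: #{value.inspect}" }
```

Running this on two machines and comparing the output can show why the same script verifies a certificate on one and fails on the other.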

@bsgreenb

Also wanted to add that I had to set agent.ssl_version = :TLSv1 for it to
work with the acxiom URL, as you mentioned earlier. Any idea why a site
would only work with that version of SSL? Would love to get debug info on
the exact SSL handshake that's going on so I could submit a bug report.

@scottwb
scottwb commented Nov 9, 2013

Certs, CA paths, ssl_version, keep_alive, idle_timeout, retry_change_requests, etc...none of the suggestions above have had any effect at eliminating this bug from intermittently occurring in our high-volume production scrapers based on Mechanize.

As an experiment, I created a monkey-patch that simply shuts down the underlying persistent connection and tries again with a new one, whenever this error is caught. I've now been using this in production for high-volume scrapers for a few months. It has 100% eliminated this problem with no observable negative side-effects. YMMV.

Here's my post about this workaround, along with the code:

http://scottwb.com/blog/2013/11/09/defeating-the-infamous-mechanize-too-many-connection-resets-bug/
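
For readers who do not want to monkey-patch, the same idea (drop the persistent connection and try again) can be sketched as a standalone wrapper. This is not the code from the post above; `retry_on_connection_reset`, `max_retries`, and `reset` are hypothetical names, and the `agent.agent.http.shutdown` call in the usage comment is an assumption about how to reach the underlying Net::HTTP::Persistent object.

```ruby
# Hypothetical helper; not part of Mechanize or net-http-persistent.
# Runs the block, and if it fails with the "too many connection resets"
# error, calls `reset` and runs the block again, up to `max_retries` times.
def retry_on_connection_reset(max_retries: 3, reset: -> {})
  attempts = 0
  begin
    yield
  rescue StandardError => e
    # Only handle the error discussed in this thread; re-raise anything else.
    raise unless e.message.include?("too many connection resets")
    # Give up once the retry budget is spent.
    raise if attempts >= max_retries
    attempts += 1
    reset.call
    retry
  end
end

# Usage with Mechanize (assumed accessors, see the note above):
#   agent = Mechanize.new
#   page = retry_on_connection_reset(reset: -> { agent.agent.http.shutdown }) do
#     agent.get('https://example.com/')
#   end
```

Only wrap requests that are safe to repeat, since a retried form submission may reach the server twice.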

@drbrain
Member
drbrain commented Nov 12, 2013

@scottwb what is your request mix?

The HTTP spec only allows a retry of GET, HEAD and other idempotent methods. POST, PUT and other methods that modify data cannot be retried by a library like net-http-persistent, but can if you have application-specific knowledge. That's up to you to add to your library, I'm not comfortable adding such a thing to net-http-persistent because it may lead to a double-POST (which is bad) in untrained hands.

@leejarvis
Member

I'm going to close this for now. I think it's hard to come up with a literal fix that satisfies everything in this issue. Happy to discuss any problems in fresh issues.

@leejarvis leejarvis closed this Mar 28, 2014