Net::HTTP::Persistent::Error: too many connection resets (due to Connection reset by peer - Errno::ECONNRESET) after 2 requests on 14759220 #123

Closed
bjoseph opened this issue Jul 3, 2011 · 104 comments

@bjoseph bjoseph commented Jul 3, 2011

I am trying to do some simple screen scraping on etrade's website but am getting a similar issue that was reported by someone earlier. I read through that message thread but it doesn't look like it was ever resolved.

Here is the error I am getting back:

Net::HTTP::Persistent::Error: too many connection resets (due to Connection reset by peer - Errno::ECONNRESET) after 2 requests on 14759220
    from /Users/benny/.rvm/gems/ruby-1.9.2-p180@taxhaven/gems/net-http-persistent-1.8/lib/net/http/persistent.rb:446:in `rescue in request'
    from /Users/benny/.rvm/gems/ruby-1.9.2-p180@taxhaven/gems/net-http-persistent-1.8/lib/net/http/persistent.rb:422:in `request'
    from /Users/benny/.rvm/gems/ruby-1.9.2-p180@taxhaven/gems/mechanize-2.0.1/lib/mechanize/http/agent.rb:204:in `fetch'
    from /Users/benny/.rvm/gems/ruby-1.9.2-p180@taxhaven/gems/mechanize-2.0.1/lib/mechanize.rb:628:in `post_form'
    from /Users/benny/.rvm/gems/ruby-1.9.2-p180@taxhaven/gems/mechanize-2.0.1/lib/mechanize.rb:520:in `submit'
    from /Users/benny/.rvm/gems/ruby-1.9.2-p180@taxhaven/gems/mechanize-2.0.1/lib/mechanize/form.rb:167:in `submit'
    from (irb):74
    from /Users/benny/.rvm/gems/ruby-1.9.2-p180@taxhaven/gems/railties-3.0.9/lib/rails/commands/console.rb:44:in `start'
    from /Users/benny/.rvm/gems/ruby-1.9.2-p180@taxhaven/gems/railties-3.0.9/lib/rails/commands/console.rb:8:in `start'
    from /Users/benny/.rvm/gems/ruby-1.9.2-p180@taxhaven/gems/railties-3.0.9/lib/rails/commands.rb:23:in `<top (required)>'
    from script/rails:6:in `require'
    from script/rails:6:in `<main>'

Here is my code:

require 'rubygems'
require 'mechanize'

agent = Mechanize.new
login_page = agent.get("https://www.etrade.com")
form = login_page.form_with(:action => '/login.fcc') 
form.USER     = "test"
form.PASSWORD = "test12"
form.submit

Any ideas?

Thanks

@chip chip commented Jul 8, 2011

I am getting the same error when attempting an http POST, but it only occurs intermittently.

@shaiguitar shaiguitar commented Jul 14, 2011

Same here.

@lankz lankz commented Jul 28, 2011

I encountered this too after upgrading from 1.0.0, along with this one in a few other places:

too many connection resets (due to end of file reached - EOFError) after 3 requests on 70046648458580

Easily resolved by sticking with 1.0.0, which seems much more stable compared to 2.x at the moment.

@styx styx commented Aug 1, 2011

I'll try to find out when the regression came up.

@styx styx commented Aug 1, 2011

Net::HTTP::Persistent was introduced in 4d074f4.

Some stuff from debug log:

I, [2011-08-01T11:01:20.015280 #7916]  INFO -- : Net::HTTP::Post: /login.fcc
D, [2011-08-01T11:01:20.015352 #7916] DEBUG -- : request-header: accept => */*
D, [2011-08-01T11:01:20.015384 #7916] DEBUG -- : request-header: user-agent => WWW-Mechanize/1.0.0 (http://rubyforge.org/projects/mechanize/)
D, [2011-08-01T11:01:20.015415 #7916] DEBUG -- : request-header: connection => keep-alive
D, [2011-08-01T11:01:20.015449 #7916] DEBUG -- : request-header: keep-alive => 300
D, [2011-08-01T11:01:20.015479 #7916] DEBUG -- : request-header: accept-encoding => gzip,identity
D, [2011-08-01T11:01:20.015509 #7916] DEBUG -- : request-header: accept-language => en-us,en;q=0.5
D, [2011-08-01T11:01:20.015540 #7916] DEBUG -- : request-header: host => us.etrade.com
D, [2011-08-01T11:01:20.015570 #7916] DEBUG -- : request-header: accept-charset => ISO-8859-1,utf-8;q=0.7,*;q=0.7
D, [2011-08-01T11:01:20.015600 #7916] DEBUG -- : request-header: cookie => WRC_ID=93.125.111.29-1312185679759; TB=8785
D, [2011-08-01T11:01:20.015630 #7916] DEBUG -- : request-header: referer => https://us.etrade.com/e/t/home
D, [2011-08-01T11:01:20.015665 #7916] DEBUG -- : request-header: content-type => application/x-www-form-urlencoded
D, [2011-08-01T11:01:20.015695 #7916] DEBUG -- : request-header: content-length => 54
E, [2011-08-01T11:01:20.306272 #7916] ERROR -- : Rescuing EOF error
E, [2011-08-01T11:01:21.374473 #7916] ERROR -- : Rescuing EOF error
E, [2011-08-01T11:01:22.435259 #7916] ERROR -- : Rescuing EOF error

I also checked the headers with the Firefox plugin Tamper Data.
It shows that the size of the first 3 consecutive requests (POST, GET, GET) is -1.

@bhochhi bhochhi commented Aug 5, 2011

Is there any solution to this issue? I am getting a similar error, and increasing keep_alive_time or open_time has no effect:
too many connection resets (due to end of file reached - EOFError) after 3 requests on 3084924
Net::HTTP::Persistent::Error

@simonmd simonmd commented Aug 6, 2011

Same here, seems to fluctuate between EOF and ECONNRESET

@jg jg commented Aug 9, 2011

Confirmed with Mechanize 2.0.1. Please fix this issue.

@GBH GBH commented Aug 16, 2011

+1 I'm seeing the same thing

@ghost ghost assigned drbrain Aug 22, 2011

@knu knu (Member) commented Aug 24, 2011

Same here. It occurs regardless of https or http.

If you know re-posting is harmless (as in a login form) you can temporarily set:

  agent.agent.http.retry_change_requests = true

to force a re-post. This seems to work for me.

@knu knu (Member) commented Aug 24, 2011

Another ugly workaround is to manually reset the connection before posting.

  #...
  agent.agent.http.tap { |http|
    http.reset http.connection_for(login_page.uri + form.action)
  }
  form.submit

@matthewbjones matthewbjones commented Aug 24, 2011

I'm also seeing this quite frequently on 2.0.1, going to look into downgrading to 1.0.0 as a short-term solution to the problem.

@jinschoi jinschoi (Contributor) commented Aug 27, 2011

Here is what looks to be going on:

I see this error when I run a request, then after a brief pause, run another request, using SSL. It does not happen when I run the requests back to back. I got a full backtrace from line 460 of persistent.rb, and that showed that the IOError was being raised from buffering.rb:145 in read_nonblock at a call to sysread_nonblock, called from net/protocol.rb:135 in rbuf_fill.

Digging down, it looks like sysread_nonblock is implemented (in 1.9.2) in the core file ossl_ssl.c, and only returns an IOError if one of two errors occurred, SSL_ERROR_ZERO_RETURN or SSL_ERROR_SYSCALL. The documentation here: http://www.openssl.org/docs/ssl/SSL_get_error.html
indicates that either the connection has been closed, or an EOF was read that violates the SSL protocol. I'm guessing it is the first case because of the behavior with delays. So it looks like the problem is that regardless of the Connection: Keep-Alive setting, the SSL connection can go down and you get an EOFError. In the non-SSL situation, there is a similar problem with IO.read_nonblock() also having the possibility of returning EOFError.

The problem is going to be figuring out when that is the case so you can retry safely for non-idempotent queries. knu's workaround will work if you know it is safe to do so. You can't just always retry on EOFError because other levels can throw EOFError for different reasons.
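
The non-SSL case described above can be reproduced locally with only the standard library. This is a minimal sketch, assuming a loopback TCP server stands in for a web server that drops an idle keep-alive connection; the helper name `observe_peer_close` is made up for illustration and is not part of Mechanize or net-http-persistent.

```ruby
require 'socket'

# Hypothetical helper: opens a loopback connection, lets the "server" side
# hang up while the client is idle, then reports which exception class the
# client's next non-blocking read raises.
def observe_peer_close
  server = TCPServer.new('127.0.0.1', 0) # port 0: let the OS pick a free port
  client = TCPSocket.new('127.0.0.1', server.addr[1])
  peer   = server.accept

  peer.close                       # the server drops the idle connection
  IO.select([client], nil, nil, 2) # wait until the close is visible to the client

  client.read_nonblock(1024)       # the client tries to reuse the connection
  nil
rescue EOFError => e
  e.class
ensure
  client&.close
  server&.close
end

puts observe_peer_close.inspect
```

The client only learns that the connection is gone when it next reads from it, which is why a library cannot tell beforehand whether a request on an idle connection will go through.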

@dgmdan dgmdan commented Sep 9, 2011

I'm getting a similar error using 2.0.1. It happens on a POST request in which the server takes about 2-3 min to respond. The error I get is a little different though: "too many connection resets (due to Resource temporarily unavailable - Timeout::Error)" Fixed it temporarily by reverting back to 1.0.0 and setting high agent.read_timeout and agent.open_timeout settings.

@lxcid lxcid commented Sep 23, 2011

We are facing a similar problem. I think we will take the downgrade path and test 2.0.2 when it's released.

@mohamedhafez mohamedhafez commented Sep 24, 2011

Is anybody also getting occasional SocketErrors about getaddrinfo failing because it couldn't find the remote host, in addition to the ECONNRESET errors from this bug? Or is this a different problem that I'm having?

@woto woto commented Oct 3, 2011

Having same problem :(

@madsheep madsheep commented Oct 5, 2011

any updates?

@ghost ghost commented Oct 7, 2011

Hey any updates... same here...

@bhochhi bhochhi commented Oct 7, 2011

I started using selenium web driver.

@cantonic cantonic commented Oct 8, 2011

Same here... this issue is 4 months old now. Can we expect any updates?

@cantonic cantonic commented Oct 8, 2011

I don't know what happened, but it is working for me now...

things i have done:
installed watir-webdriver
installed mechanize 1.0.0
uninstalled mechanize 1.0.0 and installed 2.0.1 again

@drbrain drbrain (Member) commented Oct 8, 2011

The issue is related to:

  1. The server you are connecting to
  2. The types of requests you are making (idempotent GET requests vs non-idempotent POST requests)

Without example scripts to illustrate and reproduce your specific problem it is difficult to find a "fix" for your specific program.

@jinschoi jinschoi (Contributor) commented Oct 8, 2011

Here is one example of a server I've come across that triggers this behavior:

a = Mechanize.new
result = a.get('https://junecloud.com/sync/deliveries/') do |page|
  sleep 10
  page.form_with(:action => "./").submit
end

It works without the sleep, which makes me think it has something to do with the keep alive behavior of the server.

@nahi nahi commented Oct 12, 2011

I get the same error with @jinschoi's script.

And here's a similar trace from httpclient. Relevant part is 'KeepAliveDisconnected'. HTTPClient tries to re-post under some condition since we might not be able to detect a socket disconnection by peer.

EDIT: I forgot to add this URL: https://gist.github.com/1280318

@drbrain drbrain (Member) commented Oct 25, 2011

I've released net-http-persistent 2.2 which will reset connections that have been idle for 5 seconds. Can some of you try your scripts with master @1fd7c77 or newer?

@nahi nahi commented Oct 25, 2011

From my investigation at that time, the server for 'https://junecloud.com/sync/deliveries/' seems to have 1 or 2 sec as KeepAliveTimeout.

I'm mentioning this because a second request made 1 to 4 seconds after the first might still raise the same error as before.
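
The timing gap can be worked through with a small model. This is only a sketch: the 5-second client idle reset comes from the net-http-persistent 2.2 comment above, the 2-second server keep-alive is an assumed value within the 1 or 2 seconds estimated here, and `stale_reuse?` is a hypothetical helper rather than library code.

```ruby
# Hypothetical model of the timing discussed above; not library code.
# The client reuses a connection while it has been idle for less than
# client_idle_timeout; the server closes it once server_keep_alive elapses.
def stale_reuse?(idle_seconds, client_idle_timeout:, server_keep_alive:)
  client_reuses = idle_seconds < client_idle_timeout
  server_closed = idle_seconds >= server_keep_alive
  client_reuses && server_closed
end

[0.5, 3, 6].each do |idle|
  risky = stale_reuse?(idle, client_idle_timeout: 5, server_keep_alive: 2)
  puts "idle #{idle}s: #{risky ? 'stale connection reused' : 'ok'}"
end
```

Under these assumptions a request after 3 idle seconds still lands on a connection the server has already closed, while one after 6 seconds gets a fresh connection.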

@aquasync aquasync commented Sep 10, 2013

This has been failing for me with a mixture of "too many connection resets" and plain timeouts pretty consistently for the last few months. Attempts to set idle_timeout, keep_alive and retry_change_requests don't help.

client = Mechanize.new
page = client.get 'http://www.us.hsbc.com'

Meanwhile (in the same irb window) this will generally work fine:

html = open('http://www.us.hsbc.com') { |f| f.read }

@bsgreenb bsgreenb commented Sep 13, 2013

This bug still exists. It's the biggest problem with mechanize. I really wish it could be resolved. Here is another example:

agent = Mechanize.new
agent.get('https://site.com/whatever') 
#=> Net::HTTP::Persistent::Error: too many connection resets (due to Connection reset by peer - SSL_connect - Errno::ECONNRESET) after 0 requests on 70362676419080, last used 1379112036.219295 seconds ago

agent.verify_mode = OpenSSL::SSL::VERIFY_NONE
#=> 0
agent.get('https://site.com/whatever') 
#=> Net::HTTP::Persistent::Error: too many connection resets (due to Connection reset by peer - SSL_connect - Errno::ECONNRESET) after 0 requests on 70362676321280, last used 1379112135.65024 seconds ago

Here is the agent:

#<Mechanize:0x007ffd2ca88f50 @agent=#<Mechanize::HTTP::Agent:0x007ffd2ca88ed8 @allowed_error_codes=[], @conditional_requests=true, @context=#<Mechanize:0x007ffd2ca88f50 ...>, @content_encoding_hooks=[], @cookie_jar=#<Mechanize::CookieJar:0x007ffd2ca88e60 @store=#<HTTP::CookieJar::HashStore:0x007ffd2ca88de8 @mon_owner=nil, @mon_count=0, @mon_mutex=#<Mutex:0x007ffd2ca88d98>, @logger=nil, @gc_threshold=150, @jar={}, @gc_index=0>>, @follow_meta_refresh=false, @follow_meta_refresh_self=false, @gzip_enabled=true, @history=[], @ignore_bad_chunking=false, @keep_alive=true, @max_file_buffer=100000, @open_timeout=nil, @post_connect_hooks=[], @pre_connect_hooks=[], @read_timeout=nil, @redirect_ok=true, @redirection_limit=20, @request_headers={}, @robots=false, @user_agent="Mechanize/2.7.2 Ruby/2.0.0p247 (http://github.com/sparklemotion/mechanize/)", @webrobots=nil, @auth_store=#<Mechanize::HTTP::AuthStore:0x007ffd2ca88b68 @auth_accounts={}, @default_auth=nil>, @authenticate_parser=#<Mechanize::HTTP::WWWAuthenticateParser:0x007ffd2ca88a78 @scanner=nil>, @authenticate_methods={}, @digest_auth=#<Net::HTTP::DigestAuth:0x007ffd2ca889d8 @mon_owner=nil, @mon_count=0, @mon_mutex=#<Mutex:0x007ffd2ca88988>, @nonce_count=-1>, @digest_challenges={}, @pass=nil, @scheme_handlers={"http"=>#<Proc:0x007ffd2ca887a8@/Users/me/.rvm/gems/ruby-2.0.0-p247/gems/mechanize-2.7.2/lib/mechanize/http/agent.rb:172 (lambda)>, "https"=>#<Proc:0x007ffd2ca887a8@/Users/me/.rvm/gems/ruby-2.0.0-p247/gems/mechanize-2.7.2/lib/mechanize/http/agent.rb:172 (lambda)>, "relative"=>#<Proc:0x007ffd2ca887a8@/Users/me/.rvm/gems/ruby-2.0.0-p247/gems/mechanize-2.7.2/lib/mechanize/http/agent.rb:172 (lambda)>, "file"=>#<Proc:0x007ffd2ca887a8@/Users/me/.rvm/gems/ruby-2.0.0-p247/gems/mechanize-2.7.2/lib/mechanize/http/agent.rb:172 (lambda)>}, @http=#<Net::HTTP::Persistent:0x007ffd2ca88578 @name="mechanize", @debug_output=nil, @proxy_uri=nil, @no_proxy=[], @headers={}, @override_headers={}, @http_versions={}, @keep_alive=300, 
@open_timeout=nil, @read_timeout=nil, @idle_timeout=5, @max_requests=nil, @socket_options=[[6, 1, 1]], @generation_key=:net_http_persistent_mechanize_generations, @ssl_generation_key=:net_http_persistent_mechanize_ssl_generations, @request_key=:net_http_persistent_mechanize_requests, @timeout_key=:net_http_persistent_mechanize_timeouts, @certificate=nil, @ca_file=nil, @private_key=nil, @ssl_version=nil, @verify_callback=nil, @verify_mode=1, @cert_store=nil, @generation=1, @ssl_generation=1, @reuse_ssl_sessions=true, @retry_change_requests=false, @ruby_1=false, @retried_on_ruby_2=true>>, @log=nil, @watch_for_set=nil, @history_added=nil, @pluggable_parser=#<Mechanize::PluggableParser:0x007ffd2ca880c8 @parsers={"text/html"=>Mechanize::Page, "application/xhtml+xml"=>Mechanize::Page, "application/vnd.wap.xhtml+xml"=>Mechanize::Page, "image"=>Mechanize::Image, "text/xml"=>Mechanize::XmlFile, "application/xml"=>Mechanize::XmlFile}, @default=Mechanize::File>, @keep_alive_time=0, @proxy_addr=nil, @proxy_port=nil, @proxy_user=nil, @proxy_pass=nil, @html_parser=Nokogiri::HTML, @default_encoding=nil, @force_default_encoding=false>

I'm able to curl the https URL just fine and get the HTML response. It's clearly something in the Ruby code that's to blame.

@drbrain drbrain (Member) commented Sep 13, 2013

Disabling the verify_mode does nothing if the remote end hangs up on you.

To diagnose this problem you'll need to show the URL you are connecting to. https://site.com does not exist:

$ ruby -rmechanize -e 'Mechanize.new.get "https://site.com/whatever"'
/usr/local/lib/ruby/2.0.0/net/http.rb:878:in `initialize': Operation timed out - connect(2) (Errno::ETIMEDOUT)

PS: Use code fences:

```
your code here
```

@bsgreenb bsgreenb commented Sep 13, 2013

https://isapps.acxiom.com/optout/optout.aspx

@drbrain drbrain (Member) commented Sep 13, 2013

@bsgreenb your problem is not the problem described in this issue, but is a server negotiation problem as I can't connect using SSLSocket defaults:

$ ruby -rsocket -ropenssl -e 'io = TCPSocket.open "isapps.acxiom.com", 443; OpenSSL::SSL::SSLSocket.new(io).connect'
-e:1:in `connect': Connection reset by peer - SSL_connect (Errno::ECONNRESET)
    from -e:1:in `<main>'

Setting the ssl_version in the mechanize agent allows you to connect, but shows the certificate is missing (for me):

$ ruby -rmechanize -e 'm = Mechanize.new; m.agent.ssl_version = :TLSv1; m.get "https://isapps.acxiom.com/optout/optout.aspx"'
/usr/local/lib/ruby/2.0.0/net/http.rb:918:in `connect': SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed (OpenSSL::SSL::SSLError)

Here's the certificate chain which should help you track down the right certificate if you are also missing it:

Certificate chain
 0 s:/C=US/ST=Arkansas/L=Conway/O=Acxiom Corporation/CN=isapps.acxiom.com
   i:/C=US/O=Entrust, Inc./OU=www.entrust.net/rpa is incorporated by reference/OU=(c) 2009 Entrust, Inc./CN=Entrust Certification Authority - L1C

This was from openssl s_client -host isapps.acxiom.com -port 443

I strongly recommend you DO NOT SET verify_mode = OpenSSL::SSL::VERIFY_NONE.
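
The same two settings can be sketched at the Net::HTTP level. This is an illustration only: `build_pinned_http` is a hypothetical helper, nothing connects until a request is made, and `:TLSv1` is simply the value used in this 2013 thread. Servers today would generally need a newer protocol version.

```ruby
require 'net/http'
require 'openssl'

# Hypothetical helper: configures (but does not open) an HTTPS connection
# with an explicit protocol version while keeping certificate verification on.
def build_pinned_http(host, ssl_version)
  http = Net::HTTP.new(host, 443)
  http.use_ssl     = true
  http.ssl_version = ssl_version                  # e.g. :TLSv1, as in this thread
  http.verify_mode = OpenSSL::SSL::VERIFY_PEER    # do NOT switch to VERIFY_NONE
  http
end

http = build_pinned_http('isapps.acxiom.com', :TLSv1)
puts "ssl_version=#{http.ssl_version.inspect} verify_peer=#{http.verify_mode == OpenSSL::SSL::VERIFY_PEER}"
```

Pinning the version addresses the handshake reset; a remaining "certificate verify failed" error is a separate problem that calls for the right CA certificate, not for disabling verification.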

@bsgreenb bsgreenb commented Sep 13, 2013

Thanks.

What are the steps involved in adding a certificate that mechanize can
read? Any docs on this?

@drbrain drbrain (Member) commented Sep 14, 2013

You should be able to retrieve the CA cert from your browser (export it in PEM format), add it to an OpenSSL::X509::Store, and then set Mechanize#cert_store=.

@bsgreenb bsgreenb commented Sep 14, 2013

So when I access the url it's a chain of certificates that goes:
Entrust.net Certification Authority (2048) > Entrust Certification
Authority - L1C > isapps.acxiom.com

Which one should I export?

@drbrain drbrain (Member) commented Sep 14, 2013

Try both Entrust.net certificates, but you may be able to get away with the "(2048)" one alone. You'll need to add each separately to the Store.
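
Adding each exported certificate separately could look roughly like this. It is a sketch under assumptions: `build_cert_store` and the two file names in the usage comment are hypothetical, and the Mechanize lines are left as comments so the block runs with the standard library alone.

```ruby
require 'openssl'

# Hypothetical helper: builds a store from a list of PEM files, adding each
# file separately as suggested above.
def build_cert_store(pem_paths)
  store = OpenSSL::X509::Store.new
  pem_paths.each { |path| store.add_file(path) }
  store
end

# Usage with Mechanize (file names are placeholders for your exported certs):
#   agent = Mechanize.new
#   agent.cert_store = build_cert_store(%w[entrust_2048.pem entrust_l1c.pem])
```

Keeping the PEM files next to the code makes the set of trusted CAs explicit instead of depending on whatever each machine ships.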

@aquasync aquasync commented Sep 14, 2013

Hmm, thanks for the reference to ssl_version. Setting that to :TLSv1 fixes access to us.hsbc.com also. It seems kind of odd that open-uri works fine - I guess some difference in the defaults?

@bsgreenb bsgreenb commented Sep 16, 2013

Eric,

I manually set the cert_store to the exported .pem file and I still get the
error.

Here's what I did:
-Went to Chrome and exported the Entrust.net Certification Authority (2048)
self-signed certificate to a .pem.
-Ran from console:
cert_store = OpenSSL::X509::Store.new
cert_store.add_file '/Users/ben/sources/test.pem'
agent = Mechanize.new
agent.cert_store = cert_store
agent.get 'https://isapps.acxiom.com/optout/optout.aspx'

And I get: Net::HTTP::Persistent::Error: too many connection resets (due to
Connection reset by peer - SSL_connect - Errno::ECONNRESET) after 0
requests on 70362622244720, last used 1379358504.9168751 seconds ago

@bsgreenb bsgreenb commented Sep 16, 2013

curl --cacert test.pem -v https://isapps.acxiom.com/optout/optout.aspx works though:

$ curl --cacert test.pem -v https://isapps.acxiom.com/optout/optout.aspx
* About to connect() to isapps.acxiom.com port 443 (#0)
*   Trying 198.160.97.162...
* connected
* Connected to isapps.acxiom.com (198.160.97.162) port 443 (#0)
* successfully set certificate verify locations:
*   CAfile: test.pem
  CApath: none
* SSLv3, TLS handshake, Client hello (1):
* SSLv3, TLS handshake, Server hello (2):
* SSLv3, TLS handshake, CERT (11):
* SSLv3, TLS handshake, Server finished (14):
* SSLv3, TLS handshake, Client key exchange (16):
* SSLv3, TLS change cipher, Client hello (1):
* SSLv3, TLS handshake, Finished (20):
* SSLv3, TLS change cipher, Client hello (1):
* SSLv3, TLS handshake, Finished (20):
* SSL connection using RC4-MD5
* Server certificate:
*   subject: C=US; ST=Arkansas; L=Conway; O=Acxiom Corporation; CN=isapps.acxiom.com
*   start date: 2013-04-03 20:08:38 GMT
*   expire date: 2014-04-23 01:43:34 GMT
*   subjectAltName: isapps.acxiom.com matched
*   issuer: C=US; O=Entrust, Inc.; OU=www.entrust.net/rpa is incorporated by reference; OU=(c) 2009 Entrust, Inc.; CN=Entrust Certification Authority - L1C
*   SSL certificate verify ok.
> GET /optout/optout.aspx HTTP/1.1
> User-Agent: curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8r zlib/1.2.5
> Host: isapps.acxiom.com
> Accept: */*
>
< HTTP/1.1 200 OK

@bsgreenb bsgreenb commented Sep 23, 2013

Think I've finally resolved the issues I was running into with these errors. It comes down to this characteristic:

"Mechanize defaults to validating SSL certificates using the default CA certificates for your platform."

Mechanize might succeed at requesting SSL sites from one machine and fail from another for this reason. That's why I recommend everyone specify a cert file that's in version control and can be used across environments.

Also, to the maintainers of Mechanize,

  1. Could you make it actually point to the default CA path being used when the user doesn't provide one? That would make debugging cert issues across different machines easier.
  2. You should have a wiki page that troubleshoots these types of errors. Based on the comments here, there are some standard solutions.

@drbrain drbrain (Member) commented Sep 24, 2013

Mechanize uses the default CA path, see:

https://github.com/drbrain/net-http-persistent/blob/master/lib/net/http/persistent.rb#L1171

If you don't have the necessary root certificates in your default CA path there's not much we can do about that. Clearer errors need to happen in openssl and net/http over in https://bugs.ruby-lang.org

I'll see what I can do for better documentation for the "too many connection resets" problems.
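
For comparing machines, the default CA locations compiled into Ruby's OpenSSL binding can be printed directly. This is a sketch that assumes the default store is populated the way `OpenSSL::X509::Store#set_default_paths` does it, and that the `SSL_CERT_FILE` / `SSL_CERT_DIR` environment variables take precedence when set; `default_ca_locations` is a hypothetical helper name.

```ruby
require 'openssl'

# Hypothetical helper: reports where OpenSSL looks for CA certificates by
# default on this machine, honouring the usual environment overrides.
def default_ca_locations
  {
    cert_file: ENV[OpenSSL::X509::DEFAULT_CERT_FILE_ENV] || OpenSSL::X509::DEFAULT_CERT_FILE,
    cert_dir:  ENV[OpenSSL::X509::DEFAULT_CERT_DIR_ENV]  || OpenSSL::X509::DEFAULT_CERT_DIR
  }
end

default_ca_locations.each { |kind, path| puts "#{kind}: #{path}" }
```

Running this on a machine where requests succeed and on one where they fail is a quick way to see whether the two are even looking at the same CA bundle.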

@bsgreenb bsgreenb commented Sep 24, 2013

Also wanted to add that I had to set agent.ssl_version = :TLSv1 for it to
work with the acxiom url as you mentioned earlier. Any idea why a site
would only work with that version of SSL? Would love to get debug info on
the exact SSL handshake that's going on so I could submit a bug report.

@scottwb scottwb commented Nov 9, 2013

Certs, CA paths, ssl_version, keep_alive, idle_timeout, retry_change_requests, etc...none of the suggestions above have had any effect at eliminating this bug from intermittently occurring in our high-volume production scrapers based on Mechanize.

As an experiment, I created a monkey-patch that simply shuts down the underlying persistent connection and tries again with a new one, whenever this error is caught. I've now been using this in production for high-volume scrapers for a few months. It has 100% eliminated this problem with no observable negative side-effects. YMMV.

Here's my post about this workaround, along with the code:

http://scottwb.com/blog/2013/11/09/defeating-the-infamous-mechanize-too-many-connection-resets-bug/
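
The general shape of a "shut down and retry with a new connection" wrapper can be sketched as follows. This is not the patch from the linked post: `with_fresh_connection` is a hypothetical helper, `connection` is assumed to be anything that responds to `#shutdown` (for example the Net::HTTP::Persistent instance reached via `agent.agent.http` earlier in this thread), and the error class is passed in so the block runs without any gems.

```ruby
# Hypothetical helper, not the actual monkey-patch from the post above.
# connection:  any object responding to #shutdown
# reset_error: exception class to treat as a connection reset,
#              e.g. Net::HTTP::Persistent::Error
def with_fresh_connection(connection, reset_error:, max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    yield connection
  rescue reset_error
    raise if attempts >= max_attempts
    connection.shutdown # drop persistent connections so the retry opens new ones
    retry
  end
end

# Usage sketch (Mechanize names as used elsewhere in this thread):
#   with_fresh_connection(agent.agent.http, reset_error: Net::HTTP::Persistent::Error) do
#     agent.get(url)
#   end
```

Note that this retries whatever the block does, including POSTs, so it is only appropriate where repeating the request is known to be harmless.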

@drbrain drbrain (Member) commented Nov 12, 2013

@scottwb what is your request mix?

The HTTP spec only allows a retry of GET, HEAD and other idempotent methods. POST, PUT and other methods that modify data cannot be retried by a library like net-http-persistent, but can if you have application-specific knowledge. That's up to you to add to your library, I'm not comfortable adding such a thing to net-http-persistent because it may lead to a double-POST (which is bad) in untrained hands.
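
One way to encode that policy in application code is a small check that retries automatically only for the methods named above and requires an explicit opt-in for everything else. This is a deliberately conservative sketch: `AUTO_RETRY_METHODS` and `retry_allowed?` are hypothetical names, and which other idempotent methods to add to the list is left to the application.

```ruby
# Hypothetical policy helper. Only GET and HEAD (named in the comment above)
# are retried automatically; anything else needs an explicit opt-in from the
# caller, who knows whether repeating the request is harmless.
AUTO_RETRY_METHODS = %w[GET HEAD].freeze

def retry_allowed?(http_method, caller_confirms_safe: false)
  AUTO_RETRY_METHODS.include?(http_method.to_s.upcase) || caller_confirms_safe
end

puts retry_allowed?(:get)                               # automatic
puts retry_allowed?(:post)                              # refused by default
puts retry_allowed?(:post, caller_confirms_safe: true)  # e.g. a login form
```

Keeping the opt-in at the call site makes the double-POST risk a visible decision rather than a library default.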

@leejarvis leejarvis (Member) commented Mar 28, 2014

I'm going to close this for now. I think it's hard to come up with a literal fix that satisfies everything in this issue. Happy to discuss any problems in fresh issues.

@Mifrill Mifrill commented Mar 26, 2018

I get this error:

I, [2018-03-25 14:56:07 +0000#7230]  INFO -- : Scheduler: visiting = http://gebrauchtmaschinen-bodensee.de/seiten/maschinenblatt.php?artikel_ID=278, last_request = 1521989767
E, [2018-03-25 14:56:07 +0000#7230] ERROR -- : Scheduler: `Net::HTTP::Persistent::Error: too many connection resets (due to end of file reached - EOFError) after 8 requests on 70058197957180, last used 3.017852706 seconds ago`
/root/.rbenv/versions/2.5.0/lib/ruby/2.5.0/net/protocol.rb:189:in `rbuf_fill'
/root/.rbenv/versions/2.5.0/lib/ruby/2.5.0/net/protocol.rb:157:in `readuntil'
/root/.rbenv/versions/2.5.0/lib/ruby/2.5.0/net/protocol.rb:167:in `readline'
/root/.rbenv/versions/2.5.0/lib/ruby/2.5.0/net/http/response.rb:40:in `read_status_line'
/root/.rbenv/versions/2.5.0/lib/ruby/2.5.0/net/http/response.rb:29:in `read_new'
/root/.rbenv/versions/2.5.0/lib/ruby/gems/2.5.0/gems/aws-sdk-core-3.13.0/lib/seahorse/client/net_http/patches.rb:30:in `block in new_transport_request'
/root/.rbenv/versions/2.5.0/lib/ruby/gems/2.5.0/gems/aws-sdk-core-3.13.0/lib/seahorse/client/net_http/patches.rb:27:in `catch'
/root/.rbenv/versions/2.5.0/lib/ruby/gems/2.5.0/gems/aws-sdk-core-3.13.0/lib/seahorse/client/net_http/patches.rb:27:in `new_transport_request'
/root/.rbenv/versions/2.5.0/lib/ruby/2.5.0/net/http.rb:1464:in `request'
/root/.rbenv/versions/2.5.0/lib/ruby/gems/2.5.0/gems/net-http-persistent-2.9.4/lib/net/http/persistent.rb:999:in `request'
/root/.rbenv/versions/2.5.0/lib/ruby/gems/2.5.0/gems/mechanize-2.7.5/lib/mechanize/http/agent.rb:274:in `fetch'
/root/.rbenv/versions/2.5.0/lib/ruby/gems/2.5.0/gems/mechanize-2.7.5/lib/mechanize.rb:464:in `get'
/root/.rbenv/versions/2.5.0/lib/ruby/gems/2.5.0/gems/capybara-mechanize-1.5.0/lib/capybara/mechanize/browser.rb:115:in `process_remote_request'
/root/.rbenv/versions/2.5.0/lib/ruby/gems/2.5.0/gems/capybara-mechanize-1.5.0/lib/capybara/mechanize/browser.rb:43:in `block (2 levels) in <class:Browser>'
/root/.rbenv/versions/2.5.0/lib/ruby/gems/2.5.0/gems/capybara-2.17.0/lib/capybara/rack_test/browser.rb:69:in `process'
/root/.rbenv/versions/2.5.0/lib/ruby/gems/2.5.0/gems/capybara-2.17.0/lib/capybara/rack_test/browser.rb:41:in `process_and_follow_redirects'
/root/.rbenv/versions/2.5.0/lib/ruby/gems/2.5.0/gems/capybara-2.17.0/lib/capybara/rack_test/browser.rb:22:in `visit'
/root/.rbenv/versions/2.5.0/lib/ruby/gems/2.5.0/gems/capybara-2.17.0/lib/capybara/rack_test/driver.rb:44:in `visit'
/root/.rbenv/versions/2.5.0/lib/ruby/gems/2.5.0/gems/capybara-2.17.0/lib/capybara/session.rb:274:in `visit'

This happens in a long-running process. The number of requests and the URL at which the error occurs may vary.

I, [2018-03-26 04:55:47 +0000#16779]  INFO -- : Scheduler: visiting = http://gebrauchtmaschinen-bodensee.de/seiten/maschinenblatt.php?artikel_ID=280, last_request = 1522040147
E, [2018-03-26 04:55:48 +0000#16779] ERROR -- : Scheduler: `Net::HTTP::Persistent::Error: too many connection resets (due to end of file reached - EOFError) after 1 requests on 69821907747740, last used 2.998685685 seconds ago`
/root/.rbenv/versions/2.5.0/lib/ruby/2.5.0/net/protocol.rb:189:in `rbuf_fill'
/root/.rbenv/versions/2.5.0/lib/ruby/2.5.0/net/protocol.rb:157:in `readuntil'

I tried different settings, including headers, but the error could not be fixed.

headers "Keep-Alive" => "timeout=5, max=150"

    { idle_timeout: 4,
      read_timeout: 40,
      ignore_bad_chunking: true,
      retry_change_requests: true }.each do |parameter, value|
      page.driver.browser.agent.send("#{parameter}=", value)
    end

Any ideas? Thank you.

snehankekre pushed a commit to snehankekre/the_hills1 that referenced this issue Jul 24, 2018