Unclosed connections with SPARQL::CLIENT #86
Comments
The connection is opened in The gem uses Net::HTTP::Persistent, which does support a We could also add a I would expect that the connection would be closed when the client instance is deallocated.
I would expect this kind of garbage collection as well. The docs say as much
...which is why I am confused that the CLOSE_WAITs never go away but keep piling up. It really appears that these connections are not getting garbage collected. My Rails app runs for about 12 hours, but then reliably crashes and returns "Too many open files" errors for every subsequent attempt to open a new connection. As mentioned above running Is there anything I can do, at present, to force close the connection without modifying the sparql-client library directly? Here's the code where I'm instantiating the SPARQL::Client
I would love to force close the connection with something like sparql.close(). Or is there a way to force close all open Net::HTTP::Persistent connections? Basically, force garbage collection.
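For what the request above is asking for, here is a minimal sketch of an explicit-close pattern. It assumes a client object that responds to `close` (the thread's commit notes mention a `Client#close` being added in a later release; on versions without it the guard simply skips the call). `with_client` and `FakeClient` are hypothetical names used only so the sketch runs without a SPARQL endpoint.

```ruby
# Hypothetical helper: builds a client, yields it, and always closes it,
# even if the block raises. `factory` is any callable returning an object
# that responds to #query (and, ideally, #close).
def with_client(factory)
  client = factory.call
  yield client
ensure
  # Guarded so the helper also works with clients that lack #close.
  client.close if client.respond_to?(:close)
end

# Stand-in client for illustration only; not part of sparql-client.
class FakeClient
  attr_reader :closed

  def initialize
    @closed = false
  end

  def query(text)
    ["result for #{text}"]
  end

  def close
    @closed = true
  end
end

fake = FakeClient.new
rows = with_client(-> { fake }) { |client| client.query("ASK {}") }
p rows
p fake.closed
```

With the real gem, the factory would presumably be something like `-> { SPARQL::Client.new(endpoint) }`; whether `close` is available depends on the installed version.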
@jeffreycwitt can you try the "finish-connection" branch to see if it fixes your problem? It's in PR #87.
@gkellogg I'm now using the finish-connection branch and the problem still seems to be persisting. I've added a Rails profiler, and you can see here that, via the sparql-client / net-http-persistent libraries, 4 HTTP calls are being made per page load to my SPARQL endpoint
Here you can see the output of
After a little more time, and no further requests, all "ESTABLISHED" requests move to "CLOSE_WAIT" but the
On the production server, this will continue until it reaches the open file limit, which on Heroku is 10,000 open files. Then I have to restart the server. If you have any other ideas, I would really love to hear them. Thanks again for all the help thus far.
P.S. Here are the links to my implementation of the sparql/client code. First in my Ruby gem (lbp.rb) in the Lbp::Query class https://github.com/lombardpress/lbp.rb/blob/master/lib/lbp/query.rb#L33 The Lbp::Query class is then extended in the "LombardPress-Web App", but the same https://github.com/lombardpress/lombardpress-web/blob/master/app/models/misc_query.rb
Can you provide me with a minimal example that reproduces the problem? I believe the issue lies in
Ok, if you can clone this branch of the lbp.rb gem https://github.com/lombardpress/lbp.rb/tree/feature/close-wait-test Then run Then run This will run 100 requests, with a one-second delay between them. Once that is started, find the port of the running process and run
You will start to see a list of connections like so but keep running the Eventually connections will start changing to P.S. This has already helped me confirm that the problem is not in the Rails app, since I'm reproducing the problem here just in the tests of the lbp.rb gem.
I just ran this against dbpedia's SPARQL endpoint as well to make sure my SPARQL endpoint wasn't doing something weird. I get the same result: established connections turning to CLOSE_WAIT that don't seem to ever fully close. I pushed the modifications for this as well. If you want to run this test, see the comments in the query_load_spec for the ENV variables that need to be set.
Hi @gkellogg,
I’ve been at W3C TPAC all week, so haven’t had a chance to follow up. When I can reproduce, perhaps we’ll be able to see if Net::HTTP::Persistent is indeed implicated. We’d need a reproducible use case.
Ok, no worries whatsoever. I think I have created a reproducible use case. Whenever you have time, see my post above: #86 (comment). Let me know if you don't think this is sufficient or can't reproduce it yourself. I was able to reproduce the problem with both my SPARQL endpoint and the dbpedia SPARQL endpoint. Best,
A commit referencing this issue includes the following connection-related changes:
* Add `Client#close` to shutdown any HTTP connection and object finalizer to do the same. Fixes ruby-rdf#86.
* Change finalizer for closing HTTP connections to a class method. See https://www.mikeperham.com/2010/02/24/the-trouble-with-ruby-finalizers/
* Run client finalizer in a Thread (if possible) for emergent Ruby 3.1 issue with net-http-persistent.
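The commit notes mention closing HTTP connections from an object finalizer defined via a class method. A rough sketch of that general Ruby technique follows; it is not the gem's actual code. `SketchConnection` and `SketchClient` are hypothetical stand-ins. The point is that the finalizer proc is built in a class method and captures only the connection, never `self`, since a proc that references the object can prevent it from ever being collected.

```ruby
# Hypothetical stand-in for a persistent HTTP connection.
class SketchConnection
  attr_reader :open

  def initialize
    @open = true
  end

  def shutdown
    @open = false
  end
end

# Hypothetical client that releases its connection explicitly via #close
# or, as a fallback, when the garbage collector finalizes it.
class SketchClient
  def initialize
    @http = SketchConnection.new
    # Build the proc in a class method so it closes over `http` only.
    ObjectSpace.define_finalizer(self, self.class.finalize(@http))
  end

  # Returns the finalizer proc; it must not reference the client itself.
  def self.finalize(http)
    proc { http.shutdown }
  end

  # Deterministic cleanup; preferable to waiting for the finalizer.
  def close
    @http.shutdown
  end
end

conn = SketchConnection.new
SketchClient.finalize(conn).call
p conn.open
```

Finalizers run at an unspecified time (or at process exit), so an explicit `close` remains the more reliable way to avoid accumulating sockets.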
Hi there,
I've been struggling for a while now with a problem in my Rails app: unclosed connections lead to the app exceeding its limits and returning errors like:
Error during failsafe response: Too many open files @ rb_sysopen - /app/app/views/errors/internal_server_error.html.erb
After making sure every open-uri request is closed, and having no luck solving the problem, I started investigating more deeply.
Using lsof, I've noticed that each page request produces the following:
TCP 192.168.0.22:52703->144.126.4.43:http (CLOSE_WAIT)
ruby 8093 jcwitt 50u IPv4 0x47acd29a34c3cb9b 0t0 TCP 192.168.0.22:52800->144.126.4.43:http (CLOSE_WAIT)
ruby 8093 jcwitt 51u IPv4 0x47acd29a34bd9b9b 0t0 TCP 192.168.0.22:52803->144.126.4.43:http (CLOSE_WAIT)
ruby 8093 jcwitt 52u IPv4 0x47acd29a2bec3f7b 0t0 TCP 192.168.0.22:52804->144.126.4.43:http (CLOSE_WAIT)
ruby 8093 jcwitt 53u IPv4 0x47acd29a33acf23b 0t0 TCP 192.168.0.22:52704->144.126.4.43:http (CLOSE_WAIT)
ruby 8093 jcwitt 54u IPv4 0x47acd29a2df4723b 0t0 TCP 192.168.0.22:52705->144.126.4.43:http (CLOSE_WAIT)
and as the app gets used, the number of (CLOSE_WAIT) files keeps climbing, by about 5 per page request.
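To track how fast these pile up, the lsof output can be tallied by TCP state. The following is a small sketch; `tcp_state_counts` is a hypothetical helper name, and it assumes each relevant line ends with the state in parentheses, as in the output above.

```ruby
# Count TCP connection states such as "(CLOSE_WAIT)" in lsof output text.
# Lines that do not end with a parenthesized state are ignored.
def tcp_state_counts(lsof_output)
  counts = Hash.new(0)
  lsof_output.each_line do |line|
    state = line[/\((\w+)\)\s*\z/, 1]
    counts[state] += 1 if state
  end
  counts
end

sample = <<~LSOF
  ruby 8093 jcwitt 50u IPv4 0x47acd29a34c3cb9b 0t0 TCP 192.168.0.22:52800->144.126.4.43:http (CLOSE_WAIT)
  ruby 8093 jcwitt 51u IPv4 0x47acd29a34bd9b9b 0t0 TCP 192.168.0.22:52803->144.126.4.43:http (CLOSE_WAIT)
LSOF

p tcp_state_counts(sample)
```

Against a live process, the input could come from something like `lsof -a -i -p <pid>`; calling the helper periodically should show whether the CLOSE_WAIT count ever drops.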
Using the IP, I learned that these requests are being made with the following snippet of code
sparql = SPARQL::Client.new(sparqlendpoint)
result = sparql.query(query)
return result
BTW I'm using sparql-client (~> 2.0).
So I'm wondering if I'm doing something wrong with the way I'm using SPARQL::Client or perhaps I've found a bug and SPARQL::Client should be closing connections and is not.
https://github.com/ruby-rdf/sparql-client/blob/develop/lib/sparql/client.rb#L726-L731
Among other places, an HTTP request is being opened at line 729.
Is it getting closed anywhere?
I appreciate any help you can provide.
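The snippet in the post builds a new client on every call. One possible mitigation, offered as an untested assumption rather than a confirmed fix, is to reuse a single client per endpoint so that fewer connections are opened in the first place. `ClientRegistry` below is a hypothetical helper; the block passed to it stands in for `SPARQL::Client.new(endpoint)`.

```ruby
# Hypothetical registry that memoizes one client per endpoint URL.
class ClientRegistry
  # The block receives an endpoint and returns a new client for it.
  def initialize(&factory)
    @factory = factory
    @clients = {}
    @lock = Mutex.new
  end

  # Return the cached client for this endpoint, creating it on first use.
  def fetch(endpoint)
    @lock.synchronize { @clients[endpoint] ||= @factory.call(endpoint) }
  end

  # Close every cached client that supports #close, then forget them.
  def close_all
    @lock.synchronize do
      @clients.each_value { |client| client.close if client.respond_to?(:close) }
      @clients.clear
    end
  end
end

created = 0
registry = ClientRegistry.new { |endpoint| created += 1; "client for #{endpoint}" }
first = registry.fetch("http://example.org/sparql")
second = registry.fetch("http://example.org/sparql")
p first.equal?(second)
p created
```

Whether a shared client is safe under a multi-threaded server depends on the client library, so this would need checking before use in production.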