SOLR-13953: Update eviction behavior of cache in Solr Prometheus exporter to allow for larger clusters #1022

@ajablonski ajablonski commented Nov 20, 2019

Description

Email thread starts here: http://mail-archives.apache.org/mod_mbox/lucene-dev/201911.mbox/%3cCAOz296DSV-tt7rWBirBZ+P4=vT5g29FZrR_2zHrHF084Xq+gyw@mail.gmail.com%3e . Jira issue here: https://issues.apache.org/jira/browse/SOLR-13953

For large clusters (more than 100 nodes), the exporter fails with "connection pool shut down" for some nodes. The cause is that only the last 100 HttpSolrClients added to the cache remain open: earlier entries are evicted and closed, even though those clients are still returned for use in metricsForAllHosts and pingAllCollections.
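
The failure mode can be reproduced in miniature without Solr. The sketch below is not the exporter's code: `FakeClient`, `BoundedClientCache`, and `clientsForAllHosts` are hypothetical stand-ins, and the size limit is 2 rather than 100. It only illustrates how a size-bounded cache that closes entries on eviction can hand callers a client that is already closed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FixedSizeEvictionDemo {

    /** Hypothetical stand-in for HttpSolrClient: only tracks whether it was closed. */
    static final class FakeClient {
        final String host;
        boolean closed = false;

        FakeClient(String host) { this.host = host; }

        void close() { closed = true; }

        String request() {
            if (closed) {
                throw new IllegalStateException("Connection pool shut down");
            }
            return "metrics from " + host;
        }
    }

    /** Size-bounded cache that closes the eldest client once the limit is exceeded. */
    static final class BoundedClientCache extends LinkedHashMap<String, FakeClient> {
        private final int maxSize;

        BoundedClientCache(int maxSize) {
            super(16, 0.75f, false); // insertion order, so the eldest entry is the first one added
            this.maxSize = maxSize;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<String, FakeClient> eldest) {
            if (size() > maxSize) {
                eldest.getValue().close(); // eviction closes a client a caller may still hold
                return true;
            }
            return false;
        }
    }

    /** Collects one client per host through the cache, as a scrape over all hosts would. */
    static List<FakeClient> clientsForAllHosts(BoundedClientCache cache, List<String> hosts) {
        List<FakeClient> clients = new ArrayList<>();
        for (String host : hosts) {
            FakeClient client = cache.get(host);
            if (client == null) {
                client = new FakeClient(host);
                cache.put(host, client);
            }
            clients.add(client);
        }
        return clients;
    }

    public static void main(String[] args) {
        BoundedClientCache cache = new BoundedClientCache(2);
        List<FakeClient> clients = clientsForAllHosts(cache, List.of("node1", "node2", "node3"));
        for (FakeClient client : clients) {
            try {
                System.out.println(client.request());
            } catch (IllegalStateException e) {
                System.out.println(client.host + " failed: " + e.getMessage());
            }
        }
    }
}
```

With three hosts and a limit of two, the request to the first host fails while the other two succeed, which mirrors the exporter failing only for certain nodes.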

Solution

Change the cache configuration to evict clients after twice the scrape interval, instead of using a fixed-size cache.
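
A minimal sketch of time-based eviction follows, using only the JDK so that it stands alone. It is not the code in this pull request: `ExpiringClientCache`, `FakeClient`, and the injected millisecond clock are hypothetical, and the exporter's actual cache library and client type are not reproduced here. The sketch also assumes that the timeout is measured from the last access, which the description above does not specify.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

public class ExpiringClientCacheDemo {

    /** Hypothetical stand-in for HttpSolrClient: only tracks whether it was closed. */
    static final class FakeClient {
        final String host;
        boolean closed = false;

        FakeClient(String host) { this.host = host; }

        void close() { closed = true; }
    }

    /** Cache with no size limit; entries are closed after 2 x scrape interval without access. */
    static final class ExpiringClientCache {
        private static final class Entry {
            final FakeClient client;
            long lastAccessMillis;

            Entry(FakeClient client, long now) {
                this.client = client;
                this.lastAccessMillis = now;
            }
        }

        private final Map<String, Entry> entries = new LinkedHashMap<>();
        private final long ttlMillis;
        private final LongSupplier clock; // injected so the example does not need to sleep

        ExpiringClientCache(long scrapeIntervalMillis, LongSupplier clock) {
            this.ttlMillis = 2 * scrapeIntervalMillis;
            this.clock = clock;
        }

        synchronized FakeClient get(String host) {
            long now = clock.getAsLong();
            evictExpired(now);
            Entry entry = entries.get(host);
            if (entry == null) {
                entry = new Entry(new FakeClient(host), now);
                entries.put(host, entry);
            }
            entry.lastAccessMillis = now; // each access restarts the timeout (assumption)
            return entry.client;
        }

        synchronized int size() { return entries.size(); }

        private void evictExpired(long now) {
            Iterator<Entry> it = entries.values().iterator();
            while (it.hasNext()) {
                Entry entry = it.next();
                if (now - entry.lastAccessMillis > ttlMillis) {
                    entry.client.close(); // only clients idle past the timeout are closed
                    it.remove();
                }
            }
        }
    }

    public static void main(String[] args) {
        long[] now = {0L};
        ExpiringClientCache cache = new ExpiringClientCache(60_000L, () -> now[0]);
        for (int i = 0; i < 150; i++) {
            cache.get("node" + i); // more than 100 hosts, none evicted by size
        }
        System.out.println("open clients after first scrape: " + cache.size());

        now[0] = 60_000L;
        cache.get("node0"); // node0 keeps being scraped
        now[0] = 130_000L;
        cache.get("node0"); // every other entry has now been idle longer than 120 s
        System.out.println("open clients after idle period: " + cache.size());
    }
}
```

In this sketch a client is closed only after it has gone unused for two scrape intervals, so any client touched during the current scrape stays open regardless of cluster size.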

Would love opinions on:

  • any other approaches
  • whether we should use a different caching strategy
  • whether we should use some other timeout besides twice the scrape interval.

Tests

I can add a regression test if it seems valuable, but with the previous configuration it would need more than 100 nodes to reproduce the failure.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I am authorized to contribute this code to the ASF and have removed any code I do not have a license to distribute.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
  • I have developed this patch against the master branch.
  • I have run ant precommit and the appropriate test suite.
  • I have added tests for my changes.
  • I have added documentation for the Ref Guide (for Solr changes only).

…ow for larger clusters

Co-authored-by: Serj Krasnov <sv.krasnov@gmail.com>
@ajablonski ajablonski force-pushed the solr-prometheus-change-cache-config branch from ec6e93d to 63ba4df Compare November 20, 2019 20:12
@ajablonski ajablonski changed the title from "CUpdate eviction behavior of cache in Solr Prometheus exporter to allow for larger clusters" to "Update eviction behavior of cache in Solr Prometheus exporter to allow for larger clusters" Nov 20, 2019
@ajablonski ajablonski changed the title from "Update eviction behavior of cache in Solr Prometheus exporter to allow for larger clusters" to "SOLR-13953: Update eviction behavior of cache in Solr Prometheus exporter to allow for larger clusters" Nov 21, 2019
@ErickErickson

Forgot to close this when I fixed the JIRA.
