SOLR-13953: Update eviction behavior of cache in Solr Prometheus exporter to allow for larger clusters #1022

ajablonski · 2019-11-20T20:03:35Z

Description

Email thread starts here: http://mail-archives.apache.org/mod_mbox/lucene-dev/201911.mbox/%3cCAOz296DSV-tt7rWBirBZ+P4=vT5g29FZrR_2zHrHF084Xq+gyw@mail.gmail.com%3e . Jira issue here: https://issues.apache.org/jira/browse/SOLR-13953

For large clusters (more than 100 nodes), the exporter fails with "connection pool shut down" for certain nodes. The cause is that only the last 100 HttpSolrClients added to the cache remain open: earlier additions are evicted and closed, even though those HttpSolrClients are still returned for use in metricsForAllHosts and pingAllCollections.
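To illustrate the failure mode, here is a minimal, self-contained sketch. It does not use the exporter's real classes: `FakeClient` is a hypothetical stand-in for HttpSolrClient, and a size-limited `LinkedHashMap` is an assumed stand-in for the exporter's cache. Only the limit of 100 comes from the description above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SizeBoundedCacheSketch {
    // Hypothetical stand-in for HttpSolrClient; it only records whether close() was called.
    static final class FakeClient {
        final String host;
        boolean closed = false;
        FakeClient(String host) { this.host = host; }
        void close() { closed = true; }
    }

    // Assumed model of a fixed-size cache: once it holds more than maxSize
    // entries, the eldest entry is closed and dropped.
    static Map<String, FakeClient> newCache(final int maxSize) {
        return new LinkedHashMap<String, FakeClient>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, FakeClient> eldest) {
                if (size() > maxSize) {
                    eldest.getValue().close();
                    return true;
                }
                return false;
            }
        };
    }

    // Creates one client per node through the cache, keeps a reference to each
    // (as a scrape over all hosts would), and counts how many are already closed.
    static int closedClientsHandedOut(int nodeCount, int maxSize) {
        Map<String, FakeClient> cache = newCache(maxSize);
        List<FakeClient> handedOut = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            String host = "node-" + i;
            FakeClient client = new FakeClient(host);
            cache.put(host, client);
            handedOut.add(client);
        }
        int closed = 0;
        for (FakeClient client : handedOut) {
            if (client.closed) {
                closed++;
            }
        }
        return closed;
    }

    public static void main(String[] args) {
        System.out.println(closedClientsHandedOut(150, 100)); // prints 50
        System.out.println(closedClientsHandedOut(100, 100)); // prints 0
    }
}
```

In this model, a 150-node cluster with a limit of 100 ends up with 50 closed clients still in the list handed to callers, which is consistent with the error showing up only for some nodes.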

Solution

Change the cache configuration to evict clients from the cache after twice the scrape interval, instead of using a fixed size cache.
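As a rough sketch of the idea (not the code in this PR), time-based eviction could look like the following. `TimeBasedCacheSketch` and `FakeClient` are hypothetical names, the clock is passed in explicitly to keep the example deterministic, and measuring the timeout from the last access (rather than from creation) is an assumption made here for illustration.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

final class TimeBasedCacheSketch {
    // Hypothetical stand-in for HttpSolrClient; it only records whether close() was called.
    static final class FakeClient {
        final String host;
        boolean closed = false;
        FakeClient(String host) { this.host = host; }
        void close() { closed = true; }
    }

    private static final class Entry {
        final FakeClient client;
        long lastAccessMillis;
        Entry(FakeClient client, long nowMillis) {
            this.client = client;
            this.lastAccessMillis = nowMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long ttlMillis;

    // The timeout is twice the scrape interval, as proposed above.
    TimeBasedCacheSketch(long scrapeIntervalSeconds) {
        this.ttlMillis = 2L * scrapeIntervalSeconds * 1000L;
    }

    // Returns the cached client for the host, creating it if needed.
    // There is no size limit; entries leave the cache only when they expire.
    FakeClient get(String host, long nowMillis) {
        evictExpired(nowMillis);
        Entry entry = entries.get(host);
        if (entry == null) {
            entry = new Entry(new FakeClient(host), nowMillis);
            entries.put(host, entry);
        }
        entry.lastAccessMillis = nowMillis;
        return entry.client;
    }

    // Closes and removes every client not used within the timeout.
    private void evictExpired(long nowMillis) {
        Iterator<Entry> it = entries.values().iterator();
        while (it.hasNext()) {
            Entry entry = it.next();
            if (nowMillis - entry.lastAccessMillis > ttlMillis) {
                entry.client.close();
                it.remove();
            }
        }
    }

    int size() { return entries.size(); }

    public static void main(String[] args) {
        TimeBasedCacheSketch cache = new TimeBasedCacheSketch(60);
        for (int i = 0; i < 150; i++) {
            cache.get("node-" + i, 0L);
        }
        System.out.println("open clients after first scrape: " + cache.size());
    }
}
```

In this model, a client that is touched on every scrape never expires regardless of cluster size, while clients for hosts that stop being scraped are closed once two scrape intervals have passed.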

Would love opinions on:

any other approaches
whether we should use a different caching strategy
whether we should use some other timeout besides twice the scrape interval.

Tests

I can add a regression test if it seems valuable, but with the previous configuration it would need a cluster of more than 100 nodes to reproduce the failure.

Checklist

Please review the following and check all that apply:

I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
I have created a Jira issue and added the issue ID to my pull request title.
I am authorized to contribute this code to the ASF and have removed any code I do not have a license to distribute.
I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
I have developed this patch against the master branch.
I have run ant precommit and the appropriate test suite.
I have added tests for my changes.
I have added documentation for the Ref Guide (for Solr changes only).


ErickErickson · 2020-02-15T01:21:57Z

Forgot to close this when I fixed the JIRA.

CUpdate eviction behavior of cache in Solr Prometheus exporter to all…

63ba4df

…ow for larger clusters Co-authored-by: Serj Krasnov <sv.krasnov@gmail.com>

ajablonski force-pushed the solr-prometheus-change-cache-config branch from ec6e93d to 63ba4df November 20, 2019 20:12

ajablonski changed the title ~~CUpdate eviction behavior of cache in Solr Prometheus exporter to allow for larger clusters~~ Update eviction behavior of cache in Solr Prometheus exporter to allow for larger clusters Nov 20, 2019

dsmiley requested a review from kojisekig November 21, 2019 14:44

ajablonski changed the title ~~Update eviction behavior of cache in Solr Prometheus exporter to allow for larger clusters~~ SOLR-13953: Update eviction behavior of cache in Solr Prometheus exporter to allow for larger clusters Nov 21, 2019

ErickErickson closed this Feb 15, 2020
