SOLR-17084: Don't return full list of zombie in Exception #2097

Merged (4 commits) on Nov 30, 2023

Conversation

gbellaton
Contributor

https://issues.apache.org/jira/browse/SOLR-17084

Description

Exceptions returned by LBSolrClient can become very large. Building and returning them can consume a large amount of resources.

Solution

Exceptions should not contain the full list of zombie replicas, only their count.
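To illustrate the size difference this change targets, here is a minimal sketch. It is not the LBSolrClient code from this PR: the class name, both helper methods, the message wording, and the host/core URLs are hypothetical and exist only to compare a message that embeds every zombie URL with one that embeds only the count.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the actual LBSolrClient implementation.
class ZombieMessageSketch {

    // Assumed "before" shape: the message embeds every zombie URL,
    // so its length grows linearly with the number of zombies.
    static String listMessage(List<String> zombieUrls) {
        return "No live SolrServers available to handle this request: " + zombieUrls;
    }

    // Assumed "after" shape: the message embeds only the count,
    // so its length stays essentially constant.
    static String countMessage(List<String> zombieUrls) {
        return "No live SolrServers available to handle this request (zombie count: "
            + zombieUrls.size() + ")";
    }

    public static void main(String[] args) {
        List<String> zombies = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            // Made-up URLs standing in for tracked zombie cores.
            zombies.add("http://host" + i + ":8983/solr/core" + i);
        }
        System.out.println("list-based message length:  " + listMessage(zombies).length());
        System.out.println("count-based message length: " + countMessage(zombies).length());
    }
}
```

With many tracked zombies, the list-based string is rebuilt and copied on every failed request, which is the cost the description above refers to.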

Tests

No tests added since the change is really simple.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide

Contributor

@dsmiley dsmiley left a comment

If I imagine seeing this error printed somewhere, I'd be confused as to what this number is/means. I might think: is this the number of live Solr servers? (I would be wrong.)

Maybe print "No live SolrServers/cores available to handle this request. (Tracking 999 not live)"
WDYT?

@gbellaton
Contributor Author

I agree. I changed the PR accordingly.
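For reference, the wording suggested in the review could be produced by something like the sketch below. The class and method names are hypothetical, and this is not necessarily how the merged patch builds the string; it only shows the proposed text with the count filled in.

```java
// Hypothetical helper; names are illustrative and not taken from the Solr code base.
class NotLiveMessage {

    // Formats the reviewer's proposed wording for a given number of
    // servers/cores currently tracked as not live (zombies).
    static String format(int notLiveCount) {
        return "No live SolrServers/cores available to handle this request. (Tracking "
            + notLiveCount + " not live)";
    }

    public static void main(String[] args) {
        System.out.println(format(999));
    }
}
```

Labelling the number as "not live" addresses the concern that a bare count could be misread as the number of live servers.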

@dsmiley
Contributor

dsmiley commented Nov 28, 2023

Cool; can you please add a CHANGES.txt entry to the 9.5 section... I suppose as a Bug but I could also see this as an Improvement or Optimization if you prefer. It's sort of a "scale bug" but one might say there is no such thing; it's an optimization. Shrug. I guess I prefer Optimization but I'll let you pick.

Proposed text: "LBSolrClient (used by CloudSolrClient) can potentially return a VERY long error message if it can't route a request when there are many cores tracked as not live AKA zombies. Just return the count of such cores instead of printing them."

@gbellaton
Contributor Author

I updated CHANGES.txt. My PR is against main. I suppose you also want to merge it into 9.5, right?

@dsmiley
Contributor

dsmiley commented Nov 30, 2023

I looked into the one test failure and I filed an issue to track: https://issues.apache.org/jira/browse/SOLR-17092

@dsmiley dsmiley merged commit 0b95a6e into apache:main Nov 30, 2023
2 of 3 checks passed
dsmiley pushed a commit that referenced this pull request Dec 1, 2023
…s" (#2097)

 LBSolrClient (used by CloudSolrClient) now returns the count of cores tracked as not live AKA zombies
  instead of the full list of cores. This list is potentially VERY long, which was causing high CPU and memory usage.