SOLR-15081: Metrics for core: isLeader, status #2198

Merged
merged 2 commits into from Jan 19, 2021

Contributor

@dsmiley dsmiley commented Jan 12, 2021

https://issues.apache.org/jira/browse/SOLR-15081
Copying the description:

The core level metrics hold some interesting information, but I don't see information pertaining to the SolrCloud status of the core. In particular, I'd like to see the leader status here, and also the replica state. The use-case I have in mind is enabling the Prometheus Exporter to get the doc count (and maybe other basics) of only the leader replicas, thereby counting unique documents instead of a fully replicated figure. This is an approximation to doing a match-all-docs query on all collections, but is a more sound approach when one has orders of magnitude more collections than nodes.
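To make the use-case concrete, here is a minimal sketch of the aggregation an exporter could do once each core reports its leader status. It is not Solr or Prometheus Exporter code: the `CoreSnapshot` type, its field names, and the sample values are all hypothetical stand-ins for whatever the core metrics actually expose.

```java
import java.util.Arrays;
import java.util.List;

final class LeaderDocCount {
    /** Hypothetical per-core snapshot; field names are illustrative, not Solr metric keys. */
    static final class CoreSnapshot {
        final String collection;
        final String shard;
        final boolean leader;
        final long numDocs;

        CoreSnapshot(String collection, String shard, boolean leader, long numDocs) {
            this.collection = collection;
            this.shard = shard;
            this.leader = leader;
            this.numDocs = numDocs;
        }
    }

    /** Sums numDocs over leader cores only, so each shard is counted once. */
    static long uniqueDocs(List<CoreSnapshot> cores) {
        long total = 0;
        for (CoreSnapshot core : cores) {
            if (core.leader) {
                total += core.numDocs;
            }
        }
        return total;
    }

    /** Sums numDocs over every core, so each shard is counted once per replica. */
    static long replicatedDocs(List<CoreSnapshot> cores) {
        long total = 0;
        for (CoreSnapshot core : cores) {
            total += core.numDocs;
        }
        return total;
    }

    public static void main(String[] args) {
        // Two shards with two replicas each (made-up numbers).
        List<CoreSnapshot> cores = Arrays.asList(
            new CoreSnapshot("books", "shard1", true, 100),
            new CoreSnapshot("books", "shard1", false, 100),
            new CoreSnapshot("books", "shard2", true, 50),
            new CoreSnapshot("books", "shard2", false, 50));
        System.out.println(uniqueDocs(cores));     // prints 150
        System.out.println(replicatedDocs(cores)); // prints 300
    }
}
```

As the description says, the leader-only sum is still an approximation of a match-all-docs query, since it trusts whatever each leader last reported.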

// TODO
parentContext.gauge(cd::getCollectionName, true, "collection", Category.CORE.toString());
parentContext.gauge(() -> Objects.requireNonNullElse(cd.getShardId(), parentContext.nullString()), true, "shard", Category.CORE.toString());
//TODO should this instead be in a core status, or a metric? When do we use which?
Contributor Author

I wrote these notes and this code months ago, shelved them, and nearly forgot about them; today I remembered and submitted them. The TODO here was a note-to-self and can be removed, though I'd welcome anyone's thoughts on it. There appears, to me, to be an overlap in scope between metrics and "status" type requests. Years ago I thought metrics were just numbers, but lately I've seen that they can hold all sorts of strings and essentially compete with "status".

Contributor

Yeah, metrics today overlap a lot with "status" requests... something to clean up in 9x.

Initially the metrics API wasn't able to properly report complex values (especially when reported via JMX), but this was fixed around 7.0 or so. Support for non-numeric values had to be added specifically to report things like paths and non-numeric state, and for complex properties such as system properties, caches, etc. Now it can report basically anything you want.
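For readers less familiar with the idea of non-numeric metrics, a rough sketch of a gauge registry that holds booleans and strings next to numbers follows. This is a toy, not Solr's metrics implementation: the `MixedGauges` class and the metric names are invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

final class MixedGauges {
    // Each gauge is a supplier evaluated at read time, so values stay current.
    private final Map<String, Supplier<Object>> gauges = new LinkedHashMap<>();

    void gauge(String name, Supplier<Object> supplier) {
        gauges.put(name, supplier);
    }

    /** Reads every gauge once and returns the values in registration order. */
    Map<String, Object> snapshot() {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Supplier<Object>> entry : gauges.entrySet()) {
            out.put(entry.getKey(), entry.getValue().get());
        }
        return out;
    }

    public static void main(String[] args) {
        MixedGauges metrics = new MixedGauges();
        // Hypothetical metric names; values are a boolean, a string, and a number.
        metrics.gauge("CORE.isLeader", () -> true);
        metrics.gauge("CORE.shard", () -> "shard1");
        metrics.gauge("INDEX.numDocs", () -> 42L);
        System.out.println(metrics.snapshot());
    }
}
```

Once gauges can return arbitrary objects like this, the line between a "metric" and a "status" field is mostly a matter of which API serves it.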

Contributor

I don't think it's inherently wrong to have overlap between status and metrics, as long as the overlapping information is coming from the same source.

Note that getLastPublished returns an Enum type.  TextWriter.writeVal should probably support Enums, which would simplify this code.
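A rough sketch of what Enum support in a value writer might look like is below. It is not the actual `TextWriter.writeVal`: the `EnumAwareWriter` class, the `State` enum, and the choice to emit `name()` are assumptions made for illustration only.

```java
final class EnumAwareWriter {
    /** Stand-in enum; not the type returned by getLastPublished. */
    enum State { ACTIVE, DOWN, RECOVERING }

    /** Renders a value as text, treating any Enum as a quoted string of its name. */
    static String writeVal(Object val) {
        if (val == null) {
            return "null";
        }
        if (val instanceof Enum) {
            // Assumption: name() is the desired form; a real writer might prefer toString().
            return quote(((Enum<?>) val).name());
        }
        if (val instanceof CharSequence) {
            return quote(val.toString());
        }
        if (val instanceof Number || val instanceof Boolean) {
            return val.toString();
        }
        throw new IllegalArgumentException("Unsupported type: " + val.getClass().getName());
    }

    // Escapes backslashes and double quotes, then wraps the text in double quotes.
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(writeVal(State.ACTIVE)); // prints "ACTIVE"
        System.out.println(writeVal(7));            // prints 7
    }
}
```

With a branch like this in the writer, a gauge could return the enum directly instead of converting it to a string at the registration site.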
@dsmiley dsmiley merged commit a233ed2 into apache:master Jan 19, 2021
@dsmiley dsmiley deleted the metricsIsLeader branch January 19, 2021 21:43
dsmiley added a commit that referenced this pull request Jan 19, 2021
Note that getLastPublished returns an Enum type.  TextWriter.writeVal should probably support Enums, which would simplify this code.

(cherry picked from commit a233ed2)