Pull metrics out of Clickhouse, expose 'em through Nexus' API #1131

@smklein

Here's the end-user flow we'd like:

  • Through the console (or perhaps the CLI?) a user can view metrics for some category of information. For example: "show me the metrics for HTTP endpoint latency", "show me metrics for disk/network usage", etc.
    • Point to consider: "operator" usage vs "end-user" usage. Each may see different metrics, so we will want a different set of ACLs at a bare minimum, even if the Nexus implementation is mechanically similar.
    • Open questions: How many endpoints? What query parameters are exposed? What would be useful for the console?
  • This should trigger a request to the external Nexus API, which in turn should be able to make requests to Clickhouse.
    • Presumably, Nexus will act as an ACL validator + proxy to Clickhouse. Hopefully not too much post-processing of data is necessary.
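To make the "ACL validator + proxy" idea concrete, here is a minimal sketch in Rust. Everything in it is an assumption for illustration, not existing Nexus or oximeter code: the `Role`, `MetricCategory`, `MetricsQuery`, and `MetricsError` types, the `can_view` policy (which metrics an end user may see is still an open question), and the table and column names in the generated SQL. It only shows the shape of the flow: check the caller's role first, validate the query parameters, and only then build a query for Clickhouse.

```rust
/// Hypothetical caller roles; the issue only distinguishes "operator" from "end-user".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Operator,
    EndUser,
}

/// Hypothetical metric categories, taken from the examples in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MetricCategory {
    HttpLatency,
    DiskUsage,
    NetworkUsage,
}

/// Reasons the proxy layer might refuse a request before touching Clickhouse.
#[derive(Debug, PartialEq, Eq)]
pub enum MetricsError {
    Forbidden,
    InvalidRange,
}

/// Hypothetical query parameters an endpoint might expose.
pub struct MetricsQuery {
    pub category: MetricCategory,
    pub start_secs: u64, // inclusive start, seconds since the Unix epoch
    pub end_secs: u64,   // exclusive end, seconds since the Unix epoch
    pub limit: u32,      // maximum number of rows to return
}

/// Assumed policy: operators see everything; end users see only
/// resource-level metrics, not service internals such as HTTP latency.
pub fn can_view(role: Role, category: MetricCategory) -> bool {
    match (role, category) {
        (Role::Operator, _) => true,
        (Role::EndUser, MetricCategory::HttpLatency) => false,
        (Role::EndUser, _) => true,
    }
}

/// Placeholder table names; the real oximeter schema will differ.
fn table_for(category: MetricCategory) -> &'static str {
    match category {
        MetricCategory::HttpLatency => "http_latency",
        MetricCategory::DiskUsage => "disk_usage",
        MetricCategory::NetworkUsage => "network_usage",
    }
}

/// Check the ACL, validate the time range, then build the SQL to forward.
/// Only typed values (enum-selected table name, integers) reach the query
/// string, so no caller-supplied text is interpolated.
pub fn build_sql(role: Role, query: &MetricsQuery) -> Result<String, MetricsError> {
    if !can_view(role, query.category) {
        return Err(MetricsError::Forbidden);
    }
    if query.start_secs >= query.end_secs {
        return Err(MetricsError::InvalidRange);
    }
    Ok(format!(
        "SELECT timestamp_secs, value FROM {} \
         WHERE timestamp_secs >= {} AND timestamp_secs < {} \
         ORDER BY timestamp_secs LIMIT {}",
        table_for(query.category),
        query.start_secs,
        query.end_secs,
        query.limit
    ))
}
```

With this shape, `build_sql(Role::EndUser, &query)` for an `HttpLatency` query returns `Err(MetricsError::Forbidden)` without ever contacting the database, while the same query from `Role::Operator` yields a SQL string. Keeping the ACL check separate from query construction would let operator and end-user endpoints share one mechanism while differing only in policy.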

What already exists:

  • There's machinery around oximeter to collect metrics from services and store them in Clickhouse. Although we should definitely add more metrics here (see crucible#341, "Upstairs disk stats -> Oximeter", as an example), this half of the problem space is considered out of scope for this issue.
  • Since we already have HTTP endpoint latency wired up and dumped into Clickhouse, this may be an easy "first target". For utility, however, user-visible metrics (instance stats, disk/networking metrics, etc.) will be high-value targets.
