Description
#1348 provides an initial implementation of metrics, but there are a couple of areas we'd like to improve in future iterations. This issue documents those improvements.
Although the current design is resource-centric (to query for metrics on a disk, an endpoint filtering by org/project/disk_name is used), it may make sense to migrate to a metric-centric approach where filters can be applied. Prior art.
Route
Concretely, where the current route is:
/organizations/{organization_name}/projects/{project_name}/disks/{disk_name}/metrics/{metric_name}
We should consider a route like the following:
/organizations/{organization_name}/projects/{project_name}/metrics/{metric_name}
Where filters like instance_id and/or disk_id may be supplied as query parameters.
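As a rough sketch of how a client might build the proposed metric-centric URL: the helper name `metric_url` and the example metric name `read_bytes` are placeholders for illustration, not part of the current API, and the exact query parameter names are still open.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def metric_url(
    organization_name: str,
    project_name: str,
    metric_name: str,
    instance_id: Optional[str] = None,
    disk_id: Optional[str] = None,
) -> str:
    """Build the proposed metric-centric route (hypothetical helper)."""
    # Percent-encode each path segment so names cannot break the path.
    path = "/organizations/{}/projects/{}/metrics/{}".format(
        quote(organization_name, safe=""),
        quote(project_name, safe=""),
        quote(metric_name, safe=""),
    )
    # Only include the filters the caller actually supplied.
    candidates = (("instance_id", instance_id), ("disk_id", disk_id))
    filters = {key: value for key, value in candidates if value is not None}
    if not filters:
        return path
    return path + "?" + urlencode(filters)


# Example: all samples of one metric for a single disk.
print(metric_url("acme", "web", "read_bytes", disk_id="d1"))
```

With no filters supplied, the same route would return the metric for every resource in the project, which is what makes the approach metric-centric rather than resource-centric.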
An important use case is an "instance-centric flow", where a user can query for information about their particular instance. This becomes feasible once callers can filter directly on instance_id. It is not yet feasible without oxidecomputer/crucible#375, but is a worthwhile goal.
Org/Project Scoping
Additionally, we should consider whether to add an endpoint for viewing metrics "globally", i.e., outside the context of an organization / project. This view may be useful for operators who wish to analyze performance across a sled / rack / AZ, as opposed to a user aiming for a more instance-centric flow.
Lifetimes
It's worth considering how we'd like to enable users to query for metrics of objects that have been deleted. Use cases like a "short-lived instance" are still valid, and their measurements remain stored in ClickHouse.
If we enable "query-by-name", this is more complicated, as names may be re-used after deletion of resources. However, if we provide "query-by-ID", this seems like less of an issue.
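To illustrate the name-reuse problem, here is a toy in-memory model. The `Sample` type, the IDs, and the values are all made up for this sketch; it does not reflect how measurements are actually stored or queried.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Sample:
    disk_id: str
    disk_name: str
    value: int


# Hypothetical history: disk "id-1" was deleted, then a new disk
# "id-2" was created reusing the name "scratch".
SAMPLES = [
    Sample("id-1", "scratch", 10),
    Sample("id-1", "scratch", 12),
    Sample("id-2", "scratch", 99),
]


def by_name(samples: List[Sample], name: str) -> List[Sample]:
    # Query-by-name cannot tell the two disks apart.
    return [s for s in samples if s.disk_name == name]


def by_id(samples: List[Sample], disk_id: str) -> List[Sample]:
    # Query-by-ID stays unambiguous even after the name is reused.
    return [s for s in samples if s.disk_id == disk_id]


print(len(by_name(SAMPLES, "scratch")))             # mixes both disks
print([s.value for s in by_id(SAMPLES, "id-1")])    # only the deleted disk
```

Query-by-name here returns samples from both the deleted and the new disk, while query-by-ID cleanly selects the deleted disk's history.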