
Convert bee_broker_* gauges to async Int64ObservableGauge #389


Context

Follow-up to #388. That PR silences fly-autoscaler's "empty Prometheus result" error by wrapping the PromQL in or on() vector(0). The underlying cause is that the broker gauges go sparse in Fly's managed Prometheus during idle periods; the PR only papers over that at the autoscaler layer.

Root cause

The OTel Go SDK's synchronous Int64Gauge / Float64Gauge use LastValue aggregation, which only emits a sample at collect time if Record() was called at least once during the collect interval. During long idle periods the broker probe stops landing Record() calls inside every collect window (whether because of the anySuccess gate at internal/broker/probe.go:150, an empty ActiveJobIDs path that doesn't fall through as cleanly as expected, or some other early return), so the Prometheus series goes stale.
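
For reference, the synchronous pattern that goes stale looks roughly like this (a sketch; the instrument name and attribute key are illustrative, not copied from observability.go):

```go
package observability

import (
	"context"

	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

// recordStreamLength sketches the current synchronous wiring. LastValue
// aggregation only produces a sample for a collect interval in which
// Record() actually ran: no Record() call, no sample, sparse series.
func recordStreamLength(ctx context.Context, gauge metric.Int64Gauge, streamType string, length int64) {
	// Only reached while the probe is active and anySuccess passes;
	// during long idle windows this never runs, so the series goes stale.
	gauge.Record(ctx, length,
		metric.WithAttributes(attribute.String("stream_type", streamType)))
}
```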

Grafana Cloud confirms the pattern across all three queried metrics: visible activity while work is running, 20+ hours of empty/sparse data during idle, and a fresh instant value once the worker restarts.

Affected instruments (all in internal/observability/observability.go:959-1006)

  • brokerStreamLengthGauge (Int64Gauge)
  • brokerScheduledDepthGauge (Int64Gauge)
  • brokerConsumerPendingGauge (Int64Gauge)
  • brokerUnclampedScaleTargetGauge (Float64Gauge)
  • brokerOutboxBacklogGauge (Int64Gauge)
  • brokerOutboxAgeGauge (Float64Gauge)

Proposed fix

  1. Replace the six synchronous gauges with Int64ObservableGauge / Float64ObservableGauge.
  2. Maintain the latest values in atomic state (e.g. one atomic.Int64 per (metric, stream_type) key).
  3. Register a single batch callback with meter.RegisterCallback(...) that reads the atomic state and emits to the observer on every collect (see the sketch after this list).
  4. Update RecordBrokerStreamStats, RecordBrokerOutbox, etc. to write to the atomic state instead of calling Gauge.Record().
  5. Once the metrics are continuous, the or on() vector(0) wrapper in fly.autoscaler-*.toml can be reverted, restoring the Redis-outage series-gap behaviour described in internal/broker/probe.go:130.
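
A minimal sketch of the observable-gauge shape, showing two of the six instruments. The brokerGaugeState type, its method names, and the "default" stream_type value are assumptions for illustration, and bee_broker_stream_length is a guess at the exported name:

```go
package observability

import (
	"context"
	"sync/atomic"

	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

// brokerGaugeState holds the latest probe readings. The probe writes,
// the collect-time callback reads; atomics keep that race-free.
type brokerGaugeState struct {
	streamLength   atomic.Int64
	scheduledDepth atomic.Int64
}

// RecordStreamStats replaces the old Gauge.Record() calls: it just
// stores the latest values for the callback to pick up.
func (s *brokerGaugeState) RecordStreamStats(length, depth int64) {
	s.streamLength.Store(length)
	s.scheduledDepth.Store(depth)
}

func registerBrokerGauges(meter metric.Meter, state *brokerGaugeState) error {
	streamLen, err := meter.Int64ObservableGauge("bee_broker_stream_length")
	if err != nil {
		return err
	}
	schedDepth, err := meter.Int64ObservableGauge("bee_broker_scheduled_zset_depth")
	if err != nil {
		return err
	}

	// One batch callback covers all instruments. It runs on every
	// collect, so each scrape gets a fresh sample even when the probe
	// has not recorded anything since the last interval.
	_, err = meter.RegisterCallback(
		func(_ context.Context, o metric.Observer) error {
			attrs := metric.WithAttributes(attribute.String("stream_type", "default"))
			o.ObserveInt64(streamLen, state.streamLength.Load(), attrs)
			o.ObserveInt64(schedDepth, state.scheduledDepth.Load(), attrs)
			return nil
		},
		streamLen, schedDepth,
	)
	return err
}
```

Because the callback fires on every collect, the exported series stays continuous through idle periods, which is exactly the property the synchronous gauges lack.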

Verification

  • flyctl ssh console -a hover-worker, then curl -s localhost:9464/metrics | grep bee_broker_scheduled_zset_depth: even during an idle window this should always return a fresh value.
  • Grafana Cloud time-series for the six metrics should show continuous lines (flat at 0 during idle, no gaps).

Estimated effort

~40–60 lines plus tests. The trickier piece is preserving the anySuccess-gated behaviour: probe success/failure should still influence whether the autoscaler sees a gap during a Redis outage with active jobs. Likely solved by exposing a lastSuccessAt time.Time and short-circuiting the callback if the last successful probe is too stale.
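
One possible shape for that gate, assuming the probe loop can call a hypothetical MarkProbeSuccess() and that a couple of minutes is an acceptable staleness threshold:

```go
package observability

import (
	"context"
	"sync/atomic"
	"time"

	"go.opentelemetry.io/otel/metric"
)

// lastSuccessUnixNano is written by the probe after each anySuccess
// pass and read by the collect callback. (Hypothetical name.)
var lastSuccessUnixNano atomic.Int64

// MarkProbeSuccess would be called from the probe loop on success.
func MarkProbeSuccess() {
	lastSuccessUnixNano.Store(time.Now().UnixNano())
}

// maxProbeStaleness is an assumed threshold; it should comfortably
// exceed the probe interval so healthy operation never trips it.
const maxProbeStaleness = 2 * time.Minute

// staleGuardedCallback wraps the batch callback: if the last successful
// probe is too old, observe nothing, so Prometheus sees a series gap
// and the Redis-outage behaviour from probe.go:130 is preserved.
func staleGuardedCallback(inner metric.Callback) metric.Callback {
	return func(ctx context.Context, o metric.Observer) error {
		last := time.Unix(0, lastSuccessUnixNano.Load())
		if time.Since(last) > maxProbeStaleness {
			return nil // skip observing; the series gaps
		}
		return inner(ctx, o)
	}
}
```

Wrapping the inner callback keeps the gap semantics in one place: during a Redis outage with active jobs the probe stops succeeding, the gate trips, and the autoscaler sees the same missing-series signal as before.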
