Skip to content

feat: resourceType rename + queued time + ip dim + gauge fill#19

Merged
lohanidamodar merged 11 commits into
mainfrom
feat/queued-time-resource-type-ip-fill
Jul 2, 2026
Merged

feat: resourceType rename + queued time + ip dim + gauge fill#19
lohanidamodar merged 11 commits into
mainfrom
feat/queued-time-resource-type-ip-fill

Conversation

@lohanidamodar

Copy link
Copy Markdown
Contributor

Foundational library changes for the next round of usage warehouse
improvements. Downstream server-ce and cloud repos will land the
consumer-side migrations in follow-up PRs that depend on this one.

Summary of changes

  1. resource -> resourceType rename (BREAKING). The dimension
    previously called resource on the events, gauges, and events-daily
    tables is renamed to resourceType. The new name matches what the
    column actually stores (the type of resource: bucket, function,
    database, file, site) and stops colliding with the resourceId
    companion column. Metric column constants, schema definitions,
    indexes, projections (p_by_resource* -> p_by_resourceType*),
    materialized-view SELECT list, and the getter (getResource() ->
    getResourceType()) all follow.

  2. Queued timestamp threaded end-to-end. Accumulator::collect()
    accepts an optional ?\DateTime $time = null. When provided, the
    ClickHouse adapter writes the row with that timestamp; when omitted
    the adapter still stamps now() (backward-compatible). Events fold
    into the earliest supplied time for the same (tenant, metric, tags)
    bucket; gauges are last-write-wins on both value and time.

  3. ip dimension on events (only). New event-only column stored
    as LowCardinality(Nullable(String)) (size 45, IPv6 max), indexed
    with bloom_filter. Deliberately not added to the base-table
    primary key so per-metric time range scans keep skipping granules
    efficiently. Gauges unchanged.

  4. Gauge getTimeSeries fills empty buckets with the last known
    value.
    Sub-hour interval reads of gauges no longer collapse to
    zero between snapshots. Fill runs PHP-side after the aggregation
    query (LOCF via locfFillTimeSeries) rather than a ClickHouse
    WITH FILL ... INTERPOLATE clause, because the outer merge already
    normalizes bucket keys at that boundary and needs per-metric
    fill-type routing anyway. Events keep zero-fill.

Compatibility

  • Backward-compatible: the queued $time argument on
    Accumulator::collect(), the time field on addBatch() rows, and
    the ip tag key are all optional. Existing callers that don't set
    them keep the previous behaviour.
  • Breaking: resource -> resourceType rename. Callers writing
    the resource tag key will fail the strict tag validator in
    Metric::extractColumns() because it no longer recognises
    resource. Queries filtering / grouping by the old name will hit
    "Invalid attribute name". Metric::getResource() is removed in
    favour of Metric::getResourceType().

Migration

A one-time ClickHouse column rename plus data backfill is required for
existing deployments. That migration lives in the cloud repo, not
this library — the cloud PR will run the ALTER TABLE ... RENAME COLUMN and any downstream reads that need dual-name handling during
the switchover.

Consumers will migrate the write-side tag key and read-side query
names in follow-up PRs on appwrite/appwrite (server-ce) and the
cloud repo.

Verification

Ran locally (in the worktree):

  • composer format:check (via ./vendor/bin/pint --test) -> pass.
  • composer analyze (via ./vendor/bin/phpstan at level: max) ->
    no errors.
  • ./vendor/bin/phpunit --testsuite "Test Suite" --filter "MetricTest|AccumulatorTest|TenantTest|UsageQueryTest" -> 103
    tests, 323 assertions, all pass.

ClickHouse integration tests (ClickHouseTest, the routing tests,
DatabaseTest) were not run locally — they require the
docker-compose harness in this repo. New gauge LOCF fill and event
zero-fill assertions were added to ClickHouseTest and should light
up in CI.

The dimension previously called `resource` on the events, gauges, and
events-daily tables now becomes `resourceType`. The name matches what
the column actually stores — the type of the resource ('bucket',
'function', 'database', 'file', 'site') — and avoids collision with the
`resourceId` companion column.

Breaking change:

- Metric::EVENT_COLUMNS and Metric::GAUGE_COLUMNS entries renamed.
- Metric::getEventSchema() / getGaugeSchema() column definitions
  renamed.
- Metric::getEventIndexes() / getGaugeIndexes() cover the new name.
- Metric::getResource() renamed to Metric::getResourceType().
- ClickHouse adapter renames:
  * events table column: resource -> resourceType
  * gauges table column: resource -> resourceType
  * events-daily SummingMergeTree column + ORDER BY: same rename
  * MV projection SELECT dimension list
  * GAUGE_PROJECTIONS: p_by_resource -> p_by_resourceType,
    p_by_resource_resourceId -> p_by_resourceType_resourceId
  * ensureGaugeDimColumns() backfill ADD COLUMN targets resourceType
  * getColumnType() LowCardinality list updated
  * findDaily() GROUP BY dimension list updated

Consumers must migrate their write callers (change the `resource` tag
key to `resourceType`) and their queries. A one-time ClickHouse column
rename plus data backfill is required for existing deployments; that
migration lives in the cloud repo, not this library.

Tests updated to exercise the new column name throughout. All 97
non-ClickHouse unit tests pass; ClickHouse integration tests were not
run locally (require the docker-compose harness).
Adds an event-only `ip` dimension that records the caller IP address
associated with each event row. IPv4 and IPv6 forms fit into the 45-char
column (IPv6 dotted-quad max).

The dimension is not added to gauges — gauges are point-in-time
snapshots that already carry their identity via resourceType/resourceId
and don't need a per-caller attribution axis.

Details:

- Metric::EVENT_COLUMNS gains 'ip' and Metric::getEventSchema() adds
  the matching String column (size 45, nullable).
- Metric::getEventIndexes() indexes 'ip' with a bloom_filter — high
  cardinality per tenant makes a set-index a poor fit.
- Metric::getIp() typed accessor added.
- ClickHouse adapter: 'ip' joins the LowCardinality(Nullable(String))
  list so the physical column mirrors the other host/network dims.
- 'ip' is intentionally NOT added to the base-table ORDER BY / primary
  key — that shape stays (tenant, metric, time, id) so per-metric time
  range scans keep skipping granules efficiently.
- Backward-compatible: callers that never set 'ip' keep working; the
  column is nullable and Metric::extractColumns() maps missing tags to
  null.

Tests cover both event-only presence and the round-trip through
getIp() for IPv4 + IPv6 forms.
Callers can now pass an optional \DateTime to Accumulator::collect() so
the row lands at the moment the metric was originally emitted rather
than the moment the buffer happens to flush. Backward-compatible:
callers that don't pass a time keep the previous now()-at-write
behaviour, and the ClickHouse adapter still writes now() when no time
is threaded through.

Semantics:

- Events fold into the buffered entry as before; the earliest supplied
  time wins so a queue that fills a bucket over multiple collect()
  calls doesn't slide forward on late arrivals.
- Gauges are last-write-wins on value, and the queued time follows the
  same rule — the latest supplied time replaces the previous one along
  with the value.

Wire-up:

- Accumulator::collect() gains a nullable \DateTime $time param;
  entries carry an optional 'time' key in the buffer.
- Usage::addBatch phpdoc widened to advertise the optional time field
  on each metric row (DateTime|string, since the adapter already
  accepted strings elsewhere).
- ClickHouse::addBatch() reads $metricData['time'], validates the
  shape (DateTime|string), and threads it into formatDateTime();
  null / invalid falls back to now() as before.

Tests cover: omitted-time keeps the field out of the payload, threaded
DateTime survives to the adapter batch, event fold preserves the
earliest time, gauge last-write-wins picks the latest.
Gauge time-series queries now carry the last observed value forward
into buckets that have no snapshot inside the requested window, instead
of collapsing to zero. Event time series keep the previous zero-fill
semantics — additive metrics genuinely mean "no activity" when a
bucket has no rows, whereas gauge metrics represent point-in-time
state that persists across buckets.

Design choice: the fill runs in PHP after getTimeSeriesFromTable()
emits its `FORMAT JSON` result rather than as a ClickHouse `WITH FILL
... INTERPOLATE` clause on the aggregation query. Two reasons:

1. The outer getTimeSeries() already normalizes bucket keys before
   returning to the caller, so the fill has to run at that boundary
   regardless of what the SQL does.
2. Per-metric fill type (zero vs LOCF) depends on which table produced
   the row, which is tracked at PHP level via a metric->type map.
   Sinking that decision into SQL would require issuing one query
   per (metric, type) pair; keeping it in PHP costs O(buckets)
   iteration per metric and preserves the single-query read path.

Semantics of the LOCF pass (locfFillTimeSeries):

- Buckets before the first observed snapshot in the window get 0 —
  we haven't seen any value yet.
- Buckets aligned to a snapshot get that snapshot's value.
- Buckets between snapshots get the most recent earlier snapshot's
  value.
- Multiple snapshots landing in the same bucket collapse to the
  latest by date string, matching argMax(value, time) on the write
  path.

Type-map tracking: getTimeSeries() records which side (event / gauge)
returned data for each metric name, then routes each metric through
the matching fill helper. Metrics that returned no data fall back to
the caller-supplied $type (or zero-fill when $type is null) so the
existing shape of `data: [ {value:0, date:...}, ... ]` is preserved.

Return shape is unchanged: array<{value: float, date: string}> per
metric — the only observable difference is that gauge buckets now
carry values across gaps.
@greptile-apps

greptile-apps Bot commented Jul 2, 2026

Copy link
Copy Markdown

Greptile Summary

This PR lands four coordinated foundational changes for the next usage warehouse iteration: a breaking resourceresourceType column rename across events, gauges, and events-daily tables; an optional $time argument threaded from Accumulator::collect() through addBatch(); a new event-only ip dimension; and PHP-side LOCF fill for gauge time-series reads.

  • The ensureDimColumns refactor correctly backfills both resourceType and ip on existing events and gauge tables via ADD COLUMN IF NOT EXISTS, resolving the asymmetry between the two tables that existed in earlier drafts of this PR.
  • The time-merge logic in Accumulator now correctly keeps the earliest $time on event bucket merges ($time < $this->buffer[$key]['time']), with an explicit out-of-order test added to the suite.
  • The LOCF locfFillTimeSeries implementation correctly zero-fills pre-snapshot buckets, carries the last observed value forward through empty buckets, and routes per-metric by detected type when $type is null.

Confidence Score: 5/5

All four features are well-scoped and the previously-identified concerns with ensureEventDimColumns and the Accumulator time-merge order have been corrected in this revision.

The ensureDimColumns refactor now consistently covers both the events and gauge tables with their full column sets, so existing deployments that upgrade will automatically receive the resourceType and ip columns before any INSERT is attempted. The Accumulator time-merge correctly uses a strict less-than comparison so out-of-order re-delivery cannot slide a bucket's timestamp forward. The LOCF fill implementation handles pre-snapshot zero-fill, in-window carry-forward, and post-snapshot carry-forward correctly, with an explicit integration test confirming the expected bucket values. No issues were found that would block a merge.

No files require special attention. The daily-table and materialized-view migration for the resourceType rename is intentionally deferred to the cloud repo, as documented in the PR description.

Important Files Changed

Filename Overview
src/Usage/Accumulator.php Adds optional $time parameter; event merge now correctly keeps minimum time with explicit $time < existing comparison; gauge last-write-wins unchanged.
src/Usage/Adapter/ClickHouse.php Refactors ensureDimColumns to cover both event and gauge tables with all column sets (including resourceType and ip); adds locfFillTimeSeries for gauge fill with correct pre-snapshot zero behaviour; fill-type routing per metric is correct.
src/Usage/Metric.php Renames resource to resourceType in EVENT_COLUMNS/GAUGE_COLUMNS constants, schemas, and indexes; adds ip to event-only columns and getIp() accessor; removes getResource() in favour of getResourceType().
tests/Usage/AccumulatorTest.php Adds five new tests covering time threading, earliest-time merge (including out-of-order delivery), and gauge last-write-wins for time.
tests/Usage/Adapter/ClickHouseTest.php Adds gauge LOCF fill test and event zero-fill regression test; updates all resource to resourceType and getResource() to getResourceType() call sites.
tests/Usage/Adapter/ClickHouseSchemaTest.php Adds testSetupBackfillsIpOnLegacyEventsTable covering the new ensureEventDimColumns path; extracts showCreateFor helper; expectedDimAssertions mirrors getColumnType logic.
tests/Usage/Adapter/ClickHouseGaugeDimRoutingTest.php Updates seed data and projection assertions from resource to resourceType; routing test matrix updated with new projection names.
tests/Benchmark/fixtures/seed.sql Column rename from resource to resourceType in INSERT and SELECT; no logic changes.

Reviews (7): Last reviewed commit: "fix(schema): ensure every insert column,..." | Re-trigger Greptile

Comment thread src/Usage/Accumulator.php
…name

The rename shipped in this branch missed a handful of test-side
expected-DDL strings and one numeric-type assertion that changed
under the gauge LOCF path. Sync the test expectations to the current
library output.
Bring the implementation in line with the docblock/PR description
promise: when two collect() calls fold into the same buffer key with
different queued times, keep the earlier one. Previously the first
non-null time won regardless of value, so an out-of-order redelivery
could stamp the merged row with a later timestamp than the true
earliest one.
Comments were justifying design decisions and restating code behaviour;
the design rationale belongs in the commit bodies where it already is.
Kept only the non-obvious invariants (last-write-wins on write, LOCF
fallback for pre-window gauge buckets).
Existing deploys use CREATE TABLE IF NOT EXISTS, so an events table
created before this PR never gets the new ip column via setup(). Add
an ensureEventDimColumns() guard that mirrors the gauge equivalent -
one ADD COLUMN IF NOT EXISTS ip ALTER, run right after createTable().
Comment thread src/Usage/Adapter/ClickHouse.php Outdated
…s projections

ensureEventDimColumns() was only adding ip, but pre-existing events
tables may also lack path / country / service (the columns EVENT_PROJECTIONS
reference). setup() would then fail on ADD PROJECTION p_by_path because
the source column doesn't exist. Extend the guard to cover every dim
column the projection slate references, matching the gauge helper's
pattern.

Also extend ensureGaugeDimColumns() to add resourceId. GAUGE_PROJECTIONS
references resourceId (p_by_resourceId, p_by_resourceType_resourceId),
and resourceId was added to the gauge schema after the very first release
(pre-May 2026 deployments started with just metric/value/time/tags), so
the same drift window applies. Types match Metric.php: LowCardinality
Nullable(String) for the low-card dims, Nullable(String) for path and
resourceId.
ensureEventDimColumns and ensureGaugeDimColumns were only backfilling
the columns the projections referenced. But the insert column list is
broader — any missing column there fails every insert on legacy tables.
Iterate EVENT_COLUMNS / GAUGE_COLUMNS and add IF NOT EXISTS for each
optional dim; skip the base primary-key columns which must exist by
construction.
@lohanidamodar lohanidamodar merged commit ca50537 into main Jul 2, 2026
4 checks passed
@lohanidamodar lohanidamodar deleted the feat/queued-time-resource-type-ip-fill branch July 2, 2026 11:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant