
fix(validate): skip gcp-gce revalidation (composite vs component SKU mismatch) #65

Merged
sofq merged 1 commit into main from fix/validate-skip-gcp-gce
May 5, 2026

@sofq sofq commented May 5, 2026

Summary

  • Adds a per-shard `SKIP_REVALIDATION` dict in `pipeline/validate/driver.py`. Listed shards short-circuit with `exit="skipped"` and return 0.
  • Initial entry: gcp-gce. Ingest synthesizes machine totals from per-vCPU and per-GiB component prices (`pipeline/ingest/gcp_gce.py:140`); the validator reads the raw component `unitPrice` for the matching SKU id, so the two are not comparable. Result: 20/20 false-positive drift records on every run (Catalog drift in gcp-gce #62).
  • The other open drift issues (Catalog drift in azure-postgres #56-61) need live-API verification before any action is taken, so they are not addressed here.
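
A minimal sketch of the skip path described above, for illustration only. The name `SKIP_REVALIDATION`, the `exit="skipped"` value, and the zero return code come from this PR; the `Report` dataclass, the `run_validation` function, the `revalidate` callable, the reason string stored as the dict value, and the non-skipped exit values are assumptions and may not match `pipeline/validate/driver.py`.

```python
from dataclasses import dataclass
from typing import Callable

# Assumed shape: shard name -> human-readable reason for skipping.
# The real dict in driver.py may store something else as its value.
SKIP_REVALIDATION: dict[str, str] = {
    "gcp-gce": "ingest stores composite machine totals; "
               "upstream exposes per-vCPU and per-GiB component prices",
}


@dataclass
class Report:
    """Hypothetical report record; only exit="skipped" is taken from the PR."""
    shard: str
    exit: str
    reason: str = ""
    drift_count: int = 0


def run_validation(shard: str, revalidate: Callable[[str], int]) -> tuple[int, Report]:
    """Return (exit_code, report) for one shard.

    `revalidate` is a hypothetical callable that returns the number of
    drift records found for the shard.
    """
    if shard in SKIP_REVALIDATION:
        # Short-circuit before any upstream comparison happens.
        return 0, Report(shard, "skipped", reason=SKIP_REVALIDATION[shard])

    drift = revalidate(shard)
    if drift:
        return 1, Report(shard, "drift", drift_count=drift)
    return 0, Report(shard, "ok")


if __name__ == "__main__":
    # A listed shard returns 0 even if revalidation would have reported drift.
    print(run_validation("gcp-gce", lambda _shard: 20))
    print(run_validation("azure-postgres", lambda _shard: 0))
```

Keeping a reason next to each shard name is one way to make the skip self-documenting in the report; a plain set of shard names would work equally well for the short-circuit itself.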

Test plan

  • `uv run pytest pipeline/tests/test_validate_driver.py` — new `test_driver_skips_listed_shard` passes; existing tests unchanged.
  • `uv run pytest pipeline/tests/test_validate_gcp.py pipeline/tests/test_validate_sampler.py` — green.
  • Next data-validate run for `gcp-gce` returns success → workflow auto-closes Catalog drift in gcp-gce #62.

Follow-up

A proper fix is a validation hint sidecar: ingest writes an `expected_upstream_amount` per SKU, and the validator compares the live upstream price against that value instead of against the synthesized machine total. A tracking issue will be filed once this lands.
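
One possible shape for that sidecar, sketched under stated assumptions. Only the field name `expected_upstream_amount` comes from this PR; the function names, the JSON layout, the SKU ids, the prices, and the choice to treat a missing upstream SKU as drift are all hypothetical.

```python
import json
import math


def machine_total(vcpus: int, mem_gib: float, vcpu_price: float, gib_price: float) -> float:
    """Composite price that ingest synthesizes from the component prices."""
    return vcpus * vcpu_price + mem_gib * gib_price


def build_hints(component_prices: dict[str, float]) -> dict[str, dict[str, float]]:
    """Ingest side: record the raw upstream amount for each SKU id."""
    return {
        sku_id: {"expected_upstream_amount": amount}
        for sku_id, amount in component_prices.items()
    }


def find_drift(hints: dict, upstream: dict[str, float], rel_tol: float = 1e-9) -> list[str]:
    """Validator side: compare live upstream prices against the hints only."""
    drifted = []
    for sku_id, hint in hints.items():
        live = upstream.get(sku_id)
        # Assumption: a SKU that vanished upstream also counts as drift.
        if live is None or not math.isclose(
            live, hint["expected_upstream_amount"], rel_tol=rel_tol
        ):
            drifted.append(sku_id)
    return drifted


if __name__ == "__main__":
    # Illustrative numbers, not real GCP prices.
    prices = {"sku-vcpu": 0.03, "sku-ram": 0.004}

    # The catalog stores the composite total, which matches neither component.
    print("machine total:", machine_total(4, 16, prices["sku-vcpu"], prices["sku-ram"]))

    # The sidecar round-trips through JSON, as it would on disk.
    hints = json.loads(json.dumps(build_hints(prices)))
    print("drift, unchanged upstream:", find_drift(hints, prices))
    print("drift, vCPU price changed:", find_drift(hints, {**prices, "sku-vcpu": 0.033}))
```

With hints in place the validator no longer needs to redo the ingest math: it compares like with like (component against component) and can still catch genuine upstream price changes.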

Add SKIP_REVALIDATION dict in pipeline/validate/driver.py and short-circuit
those shards with exit="skipped" in the report. Listed shards return 0 so
the data-validate workflow closes any open drift issue.

Initial entry: gcp-gce. Ingest (pipeline/ingest/gcp_gce.py) synthesizes
per-machine totals from per-vCPU + per-GiB component prices, while the
validator reads the raw component unitPrice. The two are not comparable
without re-doing the ingest math, producing 20/20 false-positive drift
records on every run.

Closes #62.

@sofq sofq merged commit 1d92a4f into main May 5, 2026
20 checks passed
@sofq sofq deleted the fix/validate-skip-gcp-gce branch May 5, 2026 08:14
Development

Successfully merging this pull request may close these issues.

Catalog drift in gcp-gce
