
feat(graph): derive report lineage from metadata #150

Open
jonathanhaaswriter wants to merge 4 commits into feat/graph-store-read-foundation from feat/report-lineage-metadata-store

Conversation

@jonathanhaaswriter
Collaborator

Summary

  • add a metadata-only report lineage helper in the reports package
  • derive platform report run lineage from graph metadata instead of snapshot materialization
  • cover the snapshot-unavailable path with report lineage regression tests

Testing

  • go test ./internal/api ./internal/graph/reports -count=1
  • python3 ./scripts/devex.py run --mode changed --base-ref writer/feat/graph-store-read-foundation

Stacked on #147.

@jonathanhaaswriter force-pushed the feat/graph-store-read-foundation branch from b271a42 to f80edf0 on March 25, 2026 15:28
@jonathanhaaswriter force-pushed the feat/report-lineage-metadata-store branch from 9b5d757 to 050a755 on March 25, 2026 15:33
@jonathanhaaswriter
Collaborator Author

One thing still looks off in the current diff:

  • internal/api/server_handlers_platform.go:currentPlatformReportLineage now builds lineage from currentOrStoredGraphMetadata(). On count-only stores, graphMetadataFromCounts() synthesizes BuiltAt from time.Now(), so an unchanged store-backed graph will produce a different graph_snapshot_id on every request.

Could we avoid deriving lineage from metadata unless the backend is returning a real persisted build/snapshot timestamp?
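
To make the ask concrete, here is a minimal sketch of the guard I have in mind. It is not the code in this PR: `GraphMetadata` below is a trimmed-down stand-in, the `BuiltAtSynthesized` flag and the `SnapshotIDFromMetadata` helper are hypothetical names, and the hashing scheme is only an illustration of a deterministic ID, not how graph_snapshot_id is computed today.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"time"
)

// GraphMetadata is a hypothetical, simplified stand-in for the real graph
// metadata struct; only the fields needed for this sketch are included.
type GraphMetadata struct {
	BuiltAt            time.Time
	BuiltAtSynthesized bool // hypothetical flag: true when BuiltAt was filled in from time.Now()
	NodeCount          int
	EdgeCount          int
}

// SnapshotIDFromMetadata (hypothetical helper) returns a deterministic ID only
// when BuiltAt is a real persisted timestamp. ok is false when the timestamp
// is missing or synthesized, so callers can skip metadata-derived lineage
// instead of emitting an ID that changes on every request.
func SnapshotIDFromMetadata(meta GraphMetadata) (id string, ok bool) {
	if meta.BuiltAtSynthesized || meta.BuiltAt.IsZero() {
		return "", false
	}
	// Identical metadata always produces the same key, hence the same ID.
	key := fmt.Sprintf("%s|%d|%d",
		meta.BuiltAt.UTC().Format(time.RFC3339Nano), meta.NodeCount, meta.EdgeCount)
	sum := sha256.Sum256([]byte(key))
	return hex.EncodeToString(sum[:8]), true
}
```

With a guard like this, a count-only store would report no lineage rather than a fresh graph_snapshot_id per request, while stores that persist a build timestamp keep a stable ID for an unchanged graph.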

@jonathanhaaswriter
Collaborator Author

I think we still have two issues here:

  • tenant-scoped Spanner snapshots recalculate node/edge counts but leave Metadata.Providers / Metadata.Accounts untouched, so tenant-scoped snapshot metadata can still expose foreign tenant identifiers
  • node IDs are trimmed on read, but edge Source / Target IDs are not, so padded endpoint IDs can make valid edges disappear from traversals/subgraphs

Can we recompute/clear tenant-scoped metadata from the filtered node set, and normalize edge endpoint IDs the same way as node IDs?
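
Roughly what I mean, as a sketch rather than a patch: the `Node`, `Edge`, and `Metadata` types below are hypothetical simplified shapes, `NormalizeEdge` and `TenantScopedMetadata` are made-up helper names, and I am assuming the "trim" applied to node IDs is a plain whitespace trim (`strings.TrimSpace`).

```go
package main

import (
	"sort"
	"strings"
)

// Node, Edge, and Metadata are hypothetical, simplified shapes used only for
// this sketch; the real graph types carry more fields.
type Node struct {
	ID       string
	TenantID string
	Provider string
	Account  string
}

type Edge struct {
	Source string
	Target string
}

type Metadata struct {
	NodeCount int
	EdgeCount int
	Providers []string
	Accounts  []string
}

// NormalizeEdge trims endpoint IDs so they match trimmed node IDs.
// Assumption: node IDs are normalized with strings.TrimSpace on read.
func NormalizeEdge(e Edge) Edge {
	e.Source = strings.TrimSpace(e.Source)
	e.Target = strings.TrimSpace(e.Target)
	return e
}

// TenantScopedMetadata rebuilds counts, providers, and accounts from the
// nodes that belong to tenantID, so no foreign-tenant identifiers carry over
// from the unfiltered snapshot metadata.
func TenantScopedMetadata(nodes []Node, edges []Edge, tenantID string) Metadata {
	visible := make(map[string]struct{})
	providers := make(map[string]struct{})
	accounts := make(map[string]struct{})
	for _, n := range nodes {
		if n.TenantID != tenantID {
			continue
		}
		visible[strings.TrimSpace(n.ID)] = struct{}{}
		if n.Provider != "" {
			providers[n.Provider] = struct{}{}
		}
		if n.Account != "" {
			accounts[n.Account] = struct{}{}
		}
	}
	meta := Metadata{
		NodeCount: len(visible),
		Providers: sortedKeys(providers),
		Accounts:  sortedKeys(accounts),
	}
	// Count only edges whose normalized endpoints are both visible.
	for _, e := range edges {
		e = NormalizeEdge(e)
		_, srcOK := visible[e.Source]
		_, dstOK := visible[e.Target]
		if srcOK && dstOK {
			meta.EdgeCount++
		}
	}
	return meta
}

// sortedKeys returns the set's keys in sorted order for stable output.
func sortedKeys(set map[string]struct{}) []string {
	keys := make([]string, 0, len(set))
	for k := range set {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}
```

The point of rebuilding from the filtered node set, rather than patching the existing metadata, is that anything not derived from visible nodes is dropped by construction. Normalizing edges before the visibility check also keeps padded endpoint IDs from silently removing valid edges.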
