feat(demo): 14-day seed with pipelines, nodes, metrics, alerts, anomalies #188

Merged
TerrifiedBug merged 1 commit into main from feat/demo-seed on Apr 27, 2026
Conversation

@TerrifiedBug (Owner)

Summary

Demo deployments were rendering empty dashboard / analytics / fleet graphs because no time-series rows existed. The only seed in the repo (`e2e/helpers/seed.ts`) creates a single bare pipeline for end-to-end tests and skips `PipelineMetric` / `NodeMetric` / `AnomalyEvent` / `CostRecommendation` entirely. This adds a proper demo seed.

What gets created

  • 1 demo user (`demo@demo.local` / `demo`, super-admin) + 1 team
  • 3 environments (Production, Staging, Development) with cost-per-GB
  • 8 pipelines with realistic source → transform → sink nodes, edges, v1 `PipelineVersion` (k8s-logs-to-s3, auth-events-to-elastic, metrics-aggregator, syslog-to-loki, app-logs-to-clickhouse, audit-trail-to-splunk, dev-firehose, trace-spans-to-tempo)
  • 12 vector nodes including one DEGRADED and one UNREACHABLE so the fleet page renders mixed health
  • 14 days of PipelineMetric at 5-min granularity (~32k rows) with sinusoidal daily curve, weekly weekend dip, and per-pipeline baselines for traffic / reduction / error / latency
  • 14 days of NodeMetric at 15-min granularity (~16k rows) with CPU, memory, disk, network, load averages
  • 6 alert rules + ~9 alert events (firing / resolved / acknowledged)
  • 14 anomaly events across severities and statuses
  • 5 cost recommendations driven by the seeded reduction / error rates
  • Initial NodeStatusEvent transitions for each node
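
The metric rows above follow a deterministic curve rather than random noise. A minimal sketch of how such a traffic curve could be generated; the function name and the exact amplitudes (a 40% daily swing, a 60% weekend factor) are illustrative assumptions, not the actual seed code:

```typescript
// Hypothetical sketch: shape a per-interval traffic value with a sinusoidal
// daily curve (peak at noon UTC, trough at midnight) plus a weekend dip.
// `trafficAt` and both amplitudes are assumed names/values, not from the PR.
function trafficAt(ts: Date, baselineBytes: number): number {
  const hour = ts.getUTCHours() + ts.getUTCMinutes() / 60;
  // Daily curve: sin() peaks at hour 12, bottoms out at hour 0.
  const daily = 1 + 0.4 * Math.sin(((hour - 6) / 24) * 2 * Math.PI);
  // Weekend dip: Saturday (6) and Sunday (0) run at 60% of weekday volume.
  const day = ts.getUTCDay();
  const weekly = day === 0 || day === 6 ? 0.6 : 1.0;
  return baselineBytes * daily * weekly;
}
```

Evaluated once per 5-minute timestamp across 14 days and 8 pipelines, a generator of this shape produces roughly the ~32k PipelineMetric rows described above.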

Safety

  • Refuses to run unless `NEXT_PUBLIC_VF_DEMO_MODE=true` — this script is destructive and would wipe a real production database otherwise
  • Idempotent: deletes the existing demo user and its team(s), which cascades via Prisma `onDelete` to all environments, pipelines, nodes, metrics, alerts, anomalies, and cost recommendations, then recreates everything
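
The guard in the first bullet can be a single check run before any Prisma call. A sketch under that assumption; the helper name `assertDemoMode` is illustrative, not from the source:

```typescript
// Illustrative guard (name assumed): refuse to run the destructive seed
// unless the deployment explicitly opts in via NEXT_PUBLIC_VF_DEMO_MODE.
function assertDemoMode(env: Record<string, string | undefined>): void {
  if (env.NEXT_PUBLIC_VF_DEMO_MODE !== "true") {
    throw new Error(
      "Refusing to run: NEXT_PUBLIC_VF_DEMO_MODE must be 'true'. " +
        "This seed deletes the demo user and all data cascading from it.",
    );
  }
}
```

Calling `assertDemoMode(process.env)` as the first statement of the script ensures a mistaken run against a real database fails before any delete is issued.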

How to run

```bash
NEXT_PUBLIC_VF_DEMO_MODE=true pnpm seed:demo
```

For the hosted demo: run after migrations on first deploy, then on a nightly cron to match the "data resets nightly at 03:00 UTC" copy in the demo banner. Cron wiring is intentionally out of scope for this PR — that's deployment plumbing, not application code.

Test plan

  • `pnpm tsc --noEmit` clean
  • `pnpm vitest run` -- 2532 tests pass (seed module not imported anywhere)
  • Run `pnpm seed:demo` locally against a fresh Postgres + applied migrations
  • Verify dashboard / analytics / fleet pages render populated graphs
  • Verify re-running the seed wipes prior demo data and recreates without errors

…ies, costs

Adds prisma/seed.demo.ts and a pnpm seed:demo script for the public
demo. The previous demo deployment showed empty dashboard / analytics
graphs because no time-series rows existed; the only seed in the repo
was e2e/helpers/seed.ts which intentionally creates a single bare
pipeline for end-to-end tests and skips PipelineMetric / NodeMetric /
AnomalyEvent / CostRecommendation entirely.

What gets created:
- 1 demo user (demo@demo.local / demo, super-admin) and 1 team
- 3 environments (Production, Staging, Development) with cost-per-GB
- 8 pipelines spread across the environments, each with realistic
  source -> transform -> sink nodes, edges, and a v1 PipelineVersion
- 12 vector nodes including one DEGRADED and one UNREACHABLE so the
  fleet page renders mixed health
- 14 days of PipelineMetric rows at 5-minute granularity (~32k rows)
  with a sinusoidal daily curve, weekly weekend dip, and per-pipeline
  baseline traffic / reduction / error / latency profiles
- 14 days of NodeMetric rows at 15-minute granularity (~16k rows)
  with CPU, memory, disk, network, and load averages
- 6 alert rules + ~9 alert events (firing, resolved, acknowledged) so
  the alerts panel is non-empty
- 14 anomaly events across all severities and statuses
- 5 cost recommendations driven by the seeded reduction / error rates
- Initial NodeStatusEvent transitions for each node

Safety:
- Refuses to run unless NEXT_PUBLIC_VF_DEMO_MODE=true so it cannot wipe
  a real production database by accident
- Idempotent: deletes the existing demo user and its team(s) (which
  cascades via Prisma onDelete to all environments, pipelines, nodes,
  metrics, alerts, anomalies, and cost recommendations) before
  recreating

Run: NEXT_PUBLIC_VF_DEMO_MODE=true pnpm seed:demo
github-actions bot added labels: dependencies, feature (Apr 27, 2026)
greptile-apps Bot commented Apr 27, 2026

Greptile Summary

Adds a prisma/seed.demo.ts script that populates 14 days of realistic time-series data (pipeline metrics, node metrics, alerts, anomalies, cost recommendations) for demo deployments, guarded by NEXT_PUBLIC_VF_DEMO_MODE=true and designed to be idempotent.

  • P1 — generateCostRecommendations: type and title are derived solely from reductionRatio (< 0.1 → LOW_REDUCTION, else HIGH_ERROR_RATE), never from errorRate. Pipelines like metrics-aggregator (errorRate: 0.0001) and k8s-logs-to-s3 (errorRate: 0.002) are mislabeled HIGH_ERROR_RATE with title "… error rate above 1%", making the recommendation panel display contradictory data to demo users.

Confidence Score: 3/5

Safe to merge for non-demo deployments (script is gated); demo deployments will show incorrect cost recommendation types until the classification logic is fixed.

One P1 logic bug in generateCostRecommendations causes mislabeled recommendation cards in the demo UI. Capped at 4 for P1 presence; pulled to 3 because the bug directly corrupts visible demo content that the PR is specifically designed to produce.

prisma/seed.demo.ts — generateCostRecommendations type/title discriminator logic

Important Files Changed

| Filename | Overview |
| --- | --- |
| prisma/seed.demo.ts | 749-line demo seed; one P1 logic bug in generateCostRecommendations where reductionRatio is used as the discriminator for HIGH_ERROR_RATE type/title instead of errorRate, producing misleading demo recommendation cards. |
| package.json | Adds seed:demo script entry; straightforward, no issues. |

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[pnpm seed:demo] --> B{NEXT_PUBLIC_VF_DEMO_MODE=true?}
    B -- No --> Z[Exit with error]
    B -- Yes --> C[wipeDemoData\ndelete AuditLog + Team + User]
    C --> D[buildCore\nuser / team / envs / pipelines / nodes]
    D --> E[generatePipelineMetrics\n~32k rows, 5-min intervals]
    D --> F[generateNodeMetrics\n~16k rows, 15-min intervals]
    D --> G[generateAlerts\n6 rules + ~9 events]
    D --> H[generateAnomalies\n14 events]
    D --> I[generateCostRecommendations\n5 records ⚠️ type bug]
    D --> J[generateNodeStatusEvents\ninitial transitions]
    E & F & G & H & I & J --> K[Done]
```
Prompt To Fix All With AI
This is a comment left during a code review.
Path: prisma/seed.demo.ts
Line: 696-700

Comment:
**Cost recommendation type classification uses wrong discriminator**

The `type` field is derived from `reductionRatio`, but the `HIGH_ERROR_RATE` branch never looks at `errorRate`. As a result, pipelines with high reduction but near-zero error rates — e.g. `metrics-aggregator` (`reductionRatio: 0.65`, `errorRate: 0.0001`) and `k8s-logs-to-s3` (`reductionRatio: 0.12`, `errorRate: 0.002`) — are tagged `HIGH_ERROR_RATE` with the title "… error rate above 1%" even though their actual error rates are 0.01% and 0.2%. The demo recommendation cards will display incorrect, contradictory information.

```suggestion
        type: p.template.errorRate >= 0.01 ? "HIGH_ERROR_RATE" : "LOW_REDUCTION",
        status: "PENDING",
        title: p.template.errorRate >= 0.01 ? `${p.name} error rate above 1%` : `${p.name} drops less than 10% of bytes`,
        description: p.template.errorRate >= 0.01
```

How can I resolve this? If you propose a fix, please make it concise.


Comment thread prisma/seed.demo.ts
Comment on lines +696 to +700
console.log(" generating node status events...");
for (const n of ctx.nodes) {
await prisma.nodeStatusEvent.create({
data: {
nodeId: n.id,

P1: Cost recommendation type classification uses wrong discriminator

The `type` field is derived from `reductionRatio`, but the `HIGH_ERROR_RATE` branch never looks at `errorRate`. As a result, pipelines with high reduction but near-zero error rates — e.g. `metrics-aggregator` (`reductionRatio: 0.65`, `errorRate: 0.0001`) and `k8s-logs-to-s3` (`reductionRatio: 0.12`, `errorRate: 0.002`) — are tagged `HIGH_ERROR_RATE` with the title "… error rate above 1%" even though their actual error rates are 0.01% and 0.2%. The demo recommendation cards will display incorrect, contradictory information.

Suggested change

```suggestion
        type: p.template.errorRate >= 0.01 ? "HIGH_ERROR_RATE" : "LOW_REDUCTION",
        status: "PENDING",
        title: p.template.errorRate >= 0.01 ? `${p.name} error rate above 1%` : `${p.name} drops less than 10% of bytes`,
        description: p.template.errorRate >= 0.01
```
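
Pulled out of the suggestion for clarity, the corrected discriminator is a single predicate on `errorRate`. The 0.01 threshold and the two type labels come from the suggestion above; the type alias and function name are illustrative:

```typescript
// Sketch of the suggested fix: classify on errorRate, not reductionRatio,
// so only pipelines actually erroring above 1% get the HIGH_ERROR_RATE card.
// `Recommendation` and `classify` are illustrative names.
type Recommendation = "HIGH_ERROR_RATE" | "LOW_REDUCTION";

function classify(errorRate: number): Recommendation {
  return errorRate >= 0.01 ? "HIGH_ERROR_RATE" : "LOW_REDUCTION";
}
```

Under this discriminator, the two pipelines called out in the review (error rates 0.01% and 0.2%) are no longer tagged HIGH_ERROR_RATE.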

@TerrifiedBug TerrifiedBug merged commit cb5647a into main Apr 27, 2026
13 checks passed
@TerrifiedBug TerrifiedBug deleted the feat/demo-seed branch April 27, 2026 08:22
TerrifiedBug added a commit that referenced this pull request Apr 27, 2026
…emo-ops) (#189)

Demo seed data is an ops concern, not an app concern. The vectorflow-demo-ops
repo already owns the demo lifecycle: scripts/seed.sql + scripts/nightly-reset.sh
+ compose.yaml + a cron entry that does compose-down/up + migrate + psql-seed
+ restart-web on a 03:00 UTC schedule.

Adding prisma/seed.demo.ts to the app repo in #188 duplicated work that
already lives where it belongs and put a destructive script (rm + rebuild
all data scoped to the demo user) into the application repository where it
has no callers — the demo deployment uses seed.sql via psql, not the
seed:demo npm script.

This reverts #188's seed.demo.ts and the seed:demo package.json script.
A follow-up against vectorflow-demo-ops will port the richer dataset
(3 environments, 12 nodes, 8 pipelines, 5-min PipelineMetric granularity,
14 anomaly events, 5 cost recommendations) into seed.sql so the demo
deployment actually benefits from that work.
