
fix: optimize sp-performance-query.helper.ts#320

Merged
SgtPooki merged 1 commit into main from fix/long-running-migrations
Feb 27, 2026

@SgtPooki
Collaborator

Summary

This PR rewrites the SQL generated by generateSpPerformanceQuery() in apps/backend/src/database/helpers/sp-performance-query.helper.ts to reduce migration-time cost when rebuilding SP performance materialized views.

This targets startup instability where pending migrations (1761500000000, 1761500000001) repeatedly execute expensive view creation.

Problem

pg_stat_statements shows CREATE MATERIALIZED VIEW sp_performance_all_time AS ... as a major hotspot (roughly 100 s per call, with high temp block writes). Because each migration retry re-runs this statement, it contributes to restart loops.

What Changed

  • Replaced the previous COUNT(DISTINCT ...) + multi-join query shape with staged aggregation:
    • deals_filtered -> deal_metrics
    • retrievals_filtered -> retrieval_metrics
    • final join on storage_providers
  • Preserved existing output columns and semantics used by entities/migrations.
  • Kept the same helper entrypoint (generateSpPerformanceQuery), so migration call sites remain unchanged.
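
The staged shape described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the query merged in this PR: only the CTE names (deals_filtered, deal_metrics, retrievals_filtered, retrieval_metrics) and the storage_providers table come from the description. The deals and retrievals table names, every column name, and the sketchStagedSpPerformanceQuery function with its dateFilter parameter are hypothetical.

```typescript
// Hypothetical sketch of a staged-aggregation query builder.
// All column names (sp_address, created_at, status, deal_id, address) are assumed.
// dateFilter is treated as a trusted SQL fragment, never as user input.
function sketchStagedSpPerformanceQuery(dateFilter?: string): string {
  // Filtering inside the *_filtered CTEs discards rows before any aggregation.
  const dealWhere = dateFilter ? `WHERE d.created_at >= ${dateFilter}` : "";
  const retrievalWhere = dateFilter ? `WHERE r.created_at >= ${dateFilter}` : "";

  return `
WITH deals_filtered AS (
  SELECT d.id, d.sp_address, d.status
  FROM deals d
  ${dealWhere}
),
deal_metrics AS (
  -- One row per provider; a plain COUNT(*) suffices because nothing has been joined yet.
  SELECT sp_address,
         COUNT(*) AS total_deals,
         COUNT(*) FILTER (WHERE status = 'success') AS successful_deals
  FROM deals_filtered
  GROUP BY sp_address
),
retrievals_filtered AS (
  SELECT r.id, r.sp_address, r.status
  FROM retrievals r
  ${retrievalWhere}
),
retrieval_metrics AS (
  SELECT sp_address,
         COUNT(*) AS total_retrievals,
         COUNT(*) FILTER (WHERE status = 'success') AS successful_retrievals
  FROM retrievals_filtered
  GROUP BY sp_address
)
-- Final join: at most one metrics row per provider on each side, so no row fan-out.
SELECT sp.address AS sp_address,
       COALESCE(dm.total_deals, 0) AS total_deals,
       COALESCE(dm.successful_deals, 0) AS successful_deals,
       COALESCE(rm.total_retrievals, 0) AS total_retrievals,
       COALESCE(rm.successful_retrievals, 0) AS successful_retrievals
FROM storage_providers sp
LEFT JOIN deal_metrics dm ON dm.sp_address = sp.address
LEFT JOIN retrieval_metrics rm ON rm.sp_address = sp.address
`.trim();
}

// Example: an all-time view passes no filter; a windowed view passes a SQL expression.
const allTimeSql = sketchStagedSpPerformanceQuery();
const lastWeekSql = sketchStagedSpPerformanceQuery("NOW() - INTERVAL '7 days'");
```

Because each metrics CTE collapses to one row per provider before the final join, the joins against storage_providers stay one-to-one and no deduplicating aggregate is needed.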

Expected Impact

  • Faster materialized view creation during migrations.
  • Lower temp I/O during migration execution.
  • Reduced startup crash-loop risk caused by long-running migration SQL.

Risk

  • No schema changes.
  • Query logic changed in a critical metrics path; staging verification is required.

Validation

  • pnpm typecheck && pnpm lint && pnpm check && pnpm format && pnpm build && pnpm test
  • pnpm -C apps/backend test -- migrations.e2e-spec.ts

Post-Deploy Verification

  1. Confirm migrations 1761500000000 and 1761500000001 are applied.
  2. Re-check pg_stat_statements for reduced runtime/temp I/O on CREATE MATERIALIZED VIEW sp_performance_all_time AS ....
  3. Confirm backend startup is stable (no repeated migration attempts).
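
Step 1 can be sketched as a small check, assuming the list of applied migration timestamps has already been read from wherever the migration tool records them. This PR does not say where that is, so the function name and its inputs are hypothetical; only the two timestamps come from the description.

```typescript
// The two migrations named in this PR.
const REQUIRED_MIGRATIONS: string[] = ["1761500000000", "1761500000001"];

// Hypothetical helper: returns the required timestamps that are not yet applied.
// `applied` is assumed to be loaded elsewhere (for example from the migrations table).
function findPendingMigrations(
  applied: string[],
  required: string[] = REQUIRED_MIGRATIONS,
): string[] {
  return required.filter((timestamp) => applied.indexOf(timestamp) === -1);
}

// Example: only the first migration has been applied so far.
const stillPending = findPendingMigrations(["1761500000000"]);
console.log(stillPending.length === 0 ? "all applied" : `pending: ${stillPending.join(", ")}`);
```

An empty result means both migrations are recorded as applied; a non-empty result after startup suggests the retry loop may still be occurring.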

Copilot AI review requested due to automatic review settings February 27, 2026 17:59
@FilOzzy FilOzzy added this to FOC Feb 27, 2026
@github-project-automation github-project-automation Bot moved this to 📌 Triage in FOC Feb 27, 2026
@SgtPooki SgtPooki merged commit 45d05d4 into main Feb 27, 2026
9 checks passed
@SgtPooki SgtPooki deleted the fix/long-running-migrations branch February 27, 2026 18:01
@github-project-automation github-project-automation Bot moved this from 📌 Triage to 🎉 Done in FOC Feb 27, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR optimizes the SQL query generation for storage provider performance materialized views by replacing an inefficient multi-table LEFT JOIN approach with staged Common Table Expressions (CTEs). The optimization targets migrations 1761500000000 and 1761500000001, which were causing startup instability due to expensive view creation (roughly 100 s per call).

Changes:

  • Refactored the query from a Cartesian-product join deduplicated with COUNT(DISTINCT) to staged aggregation using CTEs
  • Preserved all existing output columns and semantic behavior
  • Eliminated the performance bottleneck caused by joining retrievals twice (r and r_ipfs)
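
The difference between the two query shapes can be illustrated with a toy in-memory model. This is a sketch on made-up data (one provider, two deals, three retrievals), not the production schema; the types and function names are hypothetical.

```typescript
// Toy data, entirely hypothetical.
type Deal = { id: number; sp: string };
type Retrieval = { id: number; sp: string };

const deals: Deal[] = [{ id: 1, sp: "sp1" }, { id: 2, sp: "sp1" }];
const retrievals: Retrieval[] = [
  { id: 10, sp: "sp1" },
  { id: 11, sp: "sp1" },
  { id: 12, sp: "sp1" },
];

// Counts unique values, standing in for a distinct count in SQL.
function countDistinct(values: number[]): number {
  return values.filter((value, index) => values.indexOf(value) === index).length;
}

// Old shape: join both tables on the provider first, then deduplicate while counting.
function joinThenCountDistinct(sp: string) {
  const joined: { dealId: number; retrievalId: number }[] = [];
  for (const d of deals) {
    for (const r of retrievals) {
      if (d.sp === sp && r.sp === sp) joined.push({ dealId: d.id, retrievalId: r.id });
    }
  }
  return {
    intermediateRows: joined.length, // grows as deals x retrievals per provider
    deals: countDistinct(joined.map((row) => row.dealId)),
    retrievals: countDistinct(joined.map((row) => row.retrievalId)),
  };
}

// New shape: aggregate each side on its own, then combine one row per side.
function aggregateThenJoin(sp: string) {
  return {
    intermediateRows: 1,
    deals: deals.filter((d) => d.sp === sp).length,
    retrievals: retrievals.filter((r) => r.sp === sp).length,
  };
}

console.log(joinThenCountDistinct("sp1")); // 6 intermediate rows for 2 deals and 3 retrievals
console.log(aggregateThenJoin("sp1"));     // same counts from a single combined row
```

Both shapes report the same counts, but the joined form materializes a row for every deal and retrieval pair before deduplicating. Joining retrievals a second time, as the old query did with r and r_ipfs, would multiply the intermediate row count again.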

@SgtPooki
Collaborator Author

-- 2) Compare OLD baseline vs NEW snapshots captured in the last 10 minutes
WITH old_stats AS (
  SELECT
    count(*) AS samples,
    avg(mean_exec_time) AS mean_ms,
    avg(temp_blks_written) AS temp_blks_written_avg,
    avg(shared_blks_read) AS shared_blks_read_avg
  FROM public.perf_baseline_sp_perf
  WHERE query ILIKE '%CREATE MATERIALIZED VIEW sp_performance_all_time AS%'
    AND query ILIKE '%COUNT(DISTINCT d.id)%'    -- old query signature
),
new_stats_10m AS (
  SELECT
    count(*) AS samples,
    avg(mean_exec_time) AS mean_ms,
    avg(temp_blks_written) AS temp_blks_written_avg,
    avg(shared_blks_read) AS shared_blks_read_avg
  FROM public.perf_baseline_sp_perf
  WHERE captured_at >= now() - interval '10 minutes'
    AND query ILIKE '%CREATE MATERIALIZED VIEW sp_performance_all_time AS%'
    AND query ILIKE '%WITH deals_filtered AS%'  -- new query signature
)
SELECT
  round(o.mean_ms::numeric, 2) AS old_mean_ms,
  round(n.mean_ms::numeric, 2) AS new_mean_ms_last_10m,
  round((o.mean_ms / NULLIF(n.mean_ms, 0))::numeric, 2) AS speedup_x,
  round((100 * (1 - (n.mean_ms / NULLIF(o.mean_ms, 0))))::numeric, 2) AS exec_time_reduction_pct,
  round(o.temp_blks_written_avg::numeric, 2) AS old_temp_blks_written_avg,
  round(n.temp_blks_written_avg::numeric, 2) AS new_temp_blks_written_avg_last_10m,
  round(o.shared_blks_read_avg::numeric, 2) AS old_shared_blks_read_avg,
  round(n.shared_blks_read_avg::numeric, 2) AS new_shared_blks_read_avg_last_10m,
  o.samples AS old_samples,
  n.samples AS new_samples_last_10m
FROM old_stats o
CROSS JOIN new_stats_10m n;

old_mean_ms                          96195.96
new_mean_ms_last_10m                 219.08
speedup_x                            439.09
exec_time_reduction_pct              99.77
old_temp_blks_written_avg            746224.58
new_temp_blks_written_avg_last_10m   1258.00
old_shared_blks_read_avg             172.35
new_shared_blks_read_avg_last_10m    6.00
old_samples                          40
new_samples_last_10m                 2
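
The speedup_x and exec_time_reduction_pct figures reported above follow directly from the two mean execution times. A minimal sketch of the same arithmetic, with function names that are assumptions and a nullIfZero helper standing in for SQL's NULLIF(x, 0):

```typescript
// Mirrors NULLIF(x, 0): a zero divisor becomes null instead of producing Infinity.
function nullIfZero(x: number): number | null {
  return x === 0 ? null : x;
}

// old mean / new mean, as in the speedup_x column.
function speedupX(oldMeanMs: number, newMeanMs: number): number | null {
  const divisor = nullIfZero(newMeanMs);
  return divisor === null ? null : oldMeanMs / divisor;
}

// 100 * (1 - new / old), as in the exec_time_reduction_pct column.
function execTimeReductionPct(oldMeanMs: number, newMeanMs: number): number | null {
  const divisor = nullIfZero(oldMeanMs);
  return divisor === null ? null : 100 * (1 - newMeanMs / divisor);
}

// Mean execution times taken from the result row above.
const reportedOldMeanMs = 96195.96;
const reportedNewMeanMs = 219.08;

const speedup = speedupX(reportedOldMeanMs, reportedNewMeanMs);
const reduction = execTimeReductionPct(reportedOldMeanMs, reportedNewMeanMs);
console.log(speedup === null ? "n/a" : speedup.toFixed(2));     // 439.09
console.log(reduction === null ? "n/a" : reduction.toFixed(2)); // 99.77
```

The new figures rest on 2 samples against 40 for the old query, so they are best read as indicative until more snapshots are captured.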
