feat: add performance regression gate to CI (CI-03) by Chris0Jeky · Pull Request #918 · Chris0Jeky/Taskdeck

Chris0Jeky · 2026-04-22T19:43:48Z

Summary

Add k6 API performance thresholds enforced in CI (p95 < 2s, error rate < 1%) with near-threshold warnings
Add frontend bundle size checks (entry chunk < 150 KB, single chunk < 250 KB, total JS < 1200 KB)
Wire performance regression gate into ci-extended (label: performance) and nightly workflows
Both checks emit GitHub Actions ::warning / ::error annotations and produce JSON artifacts for historical trend tracking

Closes #872

Changes

| File | Purpose |
| --- | --- |
| `scripts/ci/check-bundle-size.mjs` | Frontend bundle size threshold script |
| `scripts/ci/check-k6-thresholds.mjs` | k6 JSON summary parser with GHA annotations |
| `tests/load/k6/board-heavy-load.js` | Tighten error rate to 1%, add p95<2s hard gate |
| `.github/workflows/reusable-performance-regression-gate.yml` | Reusable workflow combining both checks |
| `.github/workflows/ci-extended.yml` | Add performance-regression-gate job |
| `.github/workflows/ci-nightly.yml` | Add performance-regression-gate job |
| `.github/workflows/ci-required.yml` | Update topology comment |
| `docs/PERFORMANCE_BUDGETS.md` | Document CI enforcement section |

Acceptance Criteria Coverage

k6 thresholds enforced in ci-extended lane (fail on p95 > 2s, error rate > 1%)
Frontend build size threshold enforced (warn if main bundle exceeds limit)
Performance budget violations logged as CI step warnings
Historical trend tracking (JSON artifacts with timestamps for cross-run comparison)

Test plan

Verify node scripts/ci/check-bundle-size.mjs --dist frontend/taskdeck-web/dist passes locally after vite build
Verify workflow YAML is valid (actionlint in CI)
Trigger ci-extended with performance label to validate end-to-end
Verify k6 threshold tightening does not cause false positives on baseline load


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5d5825e72f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-22T19:46:21Z

+          node scripts/ci/check-bundle-size.mjs \
+            --dist frontend/taskdeck-web/dist \
+            --output-json frontend/taskdeck-web/test-results/perf/bundle-size-report.json


Fail bundle-size gate on budget breaches

This workflow step invokes check-bundle-size.mjs without --fail-on-error, so bundle budget violations only emit ::error annotations but do not fail the job. Because the script defaults to non-failing mode unless that flag is set, PRs and nightly runs can exceed the configured entry/single/total JS limits while the performance gate still passes.
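The flag's effect can be sketched as follows. This is a minimal illustration, assuming the script collects findings into a `violations` array with a `level` field (as the diff excerpts elsewhere in this review suggest); `exitCodeFor` and the sample finding are hypothetical, not the script's actual code.

```javascript
// Hypothetical helper: decide the process exit code from collected findings.
// Without the flag, errors are only annotated and the step still passes.
function exitCodeFor(violations, failOnError) {
  const hasError = violations.some((v) => v.level === "error");
  return failOnError && hasError ? 1 : 0;
}

// Made-up finding: an entry chunk of 180 KB against the 150 KB limit.
const findings = [{ level: "error", metric: "entry_chunk_kb", value: 180, limit: 150 }];

console.log(exitCodeFor(findings, false)); // 0: annotation only, job stays green
console.log(exitCodeFor(findings, true)); // 1: the step fails
```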


chatgpt-codex-connector · 2026-04-22T19:46:21Z

    checks: ["rate>0.99"],
-    http_req_duration: ["p(95)<1200", "p(99)<2500"],
+    // CI gate: p95 must stay below 2000ms (issue #872); aspirational target p95<1200ms
+    http_req_duration: ["p(95)<2000", "p(95)<1200", "p(99)<2500"],


Remove hard p95<1200 threshold from CI gate

Adding both p(95)<2000 and p(95)<1200 makes k6 enforce the stricter 1200ms threshold as a hard failure, since k6 treats each threshold expression as pass/fail and a false expression fails the test. That means runs with p95 between 1200ms and 2000ms will still fail, which contradicts the stated intent that 2000ms is the gate and 1200ms is aspirational.
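To make the interaction concrete, here is a simplified stand-in for threshold evaluation. It is not k6's implementation and only understands expressions of the form `p(95)<N`; `evaluateThresholds` is a hypothetical name, and the 1500 ms sample is made up.

```javascript
// Simplified stand-in for k6's threshold evaluation (not k6 itself):
// every expression is checked independently and any false one fails the run.
function evaluateThresholds(p95Ms, expressions) {
  const results = expressions.map((expr) => {
    const match = /^p\(95\)<(\d+)$/.exec(expr);
    if (!match) throw new Error(`unsupported expression: ${expr}`);
    return { expr, ok: p95Ms < Number(match[1]) };
  });
  return { results, passed: results.every((r) => r.ok) };
}

// A p95 of 1500 ms sits between the aspirational target and the hard gate.
const original = evaluateThresholds(1500, ["p(95)<2000", "p(95)<1200"]);
const gateOnly = evaluateThresholds(1500, ["p(95)<2000"]);

console.log(original.passed); // false: the 1200 ms expression fails the run
console.log(gateOnly.passed); // true: only the 2000 ms gate is enforced
```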


- Remove aspirational p95<1200ms from k6 thresholds (all k6 thresholds are enforced, so an aspirational one would cause false CI failures). Moved to check-k6-thresholds.mjs as a warning instead.
- Add --fail-on-error to bundle size check in CI workflow so error thresholds actually fail the step.
- Use path.dirname instead of string slicing for cross-platform path handling in both CI scripts.

Refs #872

Chris0Jeky · 2026-04-22T19:46:59Z

Adversarial Self-Review

Issues Found and Fixed (`b3ee9fe`)

k6 aspirational threshold would cause false failures -- All k6 thresholds are hard-enforced (exit non-zero on breach). The original diff had p(95)<1200 alongside p(95)<2000 in the k6 thresholds array. If p95 lands between 1200-2000ms, k6 would fail the CI even though it's within the issue's hard gate. Fix: Removed p(95)<1200 from k6 thresholds; the aspirational target is now emitted as a ::warning by check-k6-thresholds.mjs instead.
Bundle size errors did not fail CI -- The check-bundle-size.mjs script only exits non-zero with --fail-on-error, which was not passed in the workflow. This meant error thresholds would only emit annotations without actually failing the step. Fix: Added --fail-on-error to the workflow invocation.
String-based path splitting -- Both scripts used outputJson.substring(0, outputJson.lastIndexOf("/")) which would fail on Windows paths. Fix: Switched to path.dirname().
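The path-handling fix can be illustrated with a short sketch. The output path below is hypothetical; `path.win32` and `path.posix` are used so the result is the same on any host, whereas plain `path.dirname` follows the platform the script runs on.

```javascript
import path from "node:path";

// Hypothetical output path, written with Windows separators.
const outputJson = "test-results\\perf\\report.json";

// String slicing on "/" finds no separator, so lastIndexOf returns -1
// and substring(0, -1) yields an empty string.
const sliced = outputJson.substring(0, outputJson.lastIndexOf("/"));

// path.win32.dirname applies Windows rules regardless of the host OS.
const viaDirname = path.win32.dirname(outputJson);

console.log(JSON.stringify(sliced)); // ""
console.log(viaDirname); // test-results\perf
console.log(path.posix.dirname("test-results/perf/report.json")); // test-results/perf
```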

Remaining Considerations

Duplicate k6 run: The load-concurrency-harness and performance-regression-gate jobs both run k6 when both are triggered. This is intentional -- the load harness tests concurrency semantics (Playwright + k6 together) while the performance gate focuses on threshold enforcement with richer analysis. They use different database files and artifact names. If CI time becomes a concern, the performance gate's k6 step could be conditioned on whether load-concurrency-harness already ran.
Nightly redundancy: The nightly workflow now runs both load-concurrency-harness and performance-regression-gate. The latter provides threshold analysis that the former lacks, so both add value. The ~15 min additional CI time on nightly is acceptable.
Bundle size thresholds are generous: Current entry chunk is 88 KB vs 150 KB limit; largest chunk is 152 KB vs 250 KB limit. These provide growth headroom. The warning thresholds at 80% of limits will catch gradual creep before hitting hard gates.
Flakiness risk: k6 results can vary with CI runner load. The 2000ms p95 hard gate has ~67% headroom over typical values (~1200ms), which should absorb CI noise. The existing load harness has been running without flakiness issues at even stricter thresholds.
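The bundle-size headroom mentioned above works out as follows. This is a small sketch using only the limits and sizes quoted in this PR; the `headroom` helper and the object shapes are hypothetical, and total JS is omitted because its current size is not quoted.

```javascript
// Limits from the PR summary; warnings fire at 80% of each limit.
const budgets = { entry_chunk_kb: 150, single_chunk_kb: 250, total_js_kb: 1200 };
const WARN_PERCENT = 80;

// Sizes quoted in the comment above.
const current = { entry_chunk_kb: 88, single_chunk_kb: 152 };

function headroom(metric) {
  const limit = budgets[metric];
  const warnAt = (limit * WARN_PERCENT) / 100; // integer math avoids float noise
  const value = current[metric];
  return { metric, value, warnAt, limit, kbUntilWarning: warnAt - value };
}

console.log(headroom("entry_chunk_kb")); // warnAt 120, 32 KB before a warning
console.log(headroom("single_chunk_kb")); // warnAt 200, 48 KB before a warning
```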

gemini-code-assist

Code Review

This pull request introduces performance regression gates in the CI pipeline by adding documentation and scripts to monitor frontend bundle sizes and k6 API performance thresholds. Key feedback includes correcting the reported limit in bundle size warnings, fixing fragile command-line argument parsing in the k6 threshold script, and addressing a conflict in k6 threshold definitions where multiple p95 targets would cause premature test failures. Additionally, it was suggested to dynamically derive near-threshold warnings from the k6 summary data rather than using hardcoded constants.

gemini-code-assist · 2026-04-22T19:47:07Z

+  } else if (entryKB > WARN_ENTRY_KB) {
+    const msg = `Entry chunk ${entryChunk.name} is ${formatKB(entryChunk.sizeBytes)} KB, approaching limit of ${MAX_ENTRY_KB} KB (warn at ${WARN_ENTRY_KB} KB)`;
+    console.log(`::warning::${msg}`);
+    violations.push({ level: "warning", metric: "entry_chunk_kb", value: entryKB, limit: MAX_ENTRY_KB, message: msg });


In the warning case, the limit field in the violation object is set to MAX_ENTRY_KB (the error threshold), but the warning was triggered by exceeding WARN_ENTRY_KB. This makes the JSON report misleading, as the value will appear to be below the limit despite being flagged as a violation. The limit field should reflect the threshold that was actually breached for that specific finding.

Suggested change

- violations.push({ level: "warning", metric: "entry_chunk_kb", value: entryKB, limit: MAX_ENTRY_KB, message: msg });
+ violations.push({ level: "warning", metric: "entry_chunk_kb", value: entryKB, limit: WARN_ENTRY_KB, message: msg });
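One way to keep `limit` consistent across all three metrics is to route them through a single helper. The sketch below is an assumption about how that could look; `classify` and its parameters are hypothetical and not part of the script.

```javascript
// Hypothetical helper: classify one metric and record the threshold that
// actually triggered the finding, so value > limit always holds in the report.
function classify(metric, value, warnAt, max) {
  if (value > max) return { level: "error", metric, value, limit: max };
  if (value > warnAt) return { level: "warning", metric, value, limit: warnAt };
  return null; // within budget
}

console.log(classify("entry_chunk_kb", 130, 120, 150));
// { level: 'warning', metric: 'entry_chunk_kb', value: 130, limit: 120 }
```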

gemini-code-assist · 2026-04-22T19:47:07Z

+import { readFileSync, writeFileSync, mkdirSync } from "node:fs";
+import { dirname } from "node:path";
+
+const args = process.argv.slice(2);


The logic for finding summaryPath is fragile. Using args.find((a) => !a.startsWith("--")) will incorrectly identify the value of a flag (e.g., the path provided to --output-json) as the summary file path if that flag appears before the positional argument. This will cause the script to fail or read the wrong file.

gemini-code-assist · 2026-04-22T19:47:07Z

+const args = process.argv.slice(2);
+const summaryPath = args.find((a) => !a.startsWith("--"));
+const failOnBreach = args.includes("--fail-on-breach");
+const outputJsonIdx = args.indexOf("--output-json");


There is no check to ensure that an argument actually follows the --output-json flag. If the flag is provided as the last argument, args[outputJsonIdx + 1] will be undefined, which will cause mkdirSync or writeFileSync to throw an error later in the script.

Suggested change

- const outputJsonIdx = args.indexOf("--output-json");
+ const outputJson = (outputJsonIdx !== -1 && args[outputJsonIdx + 1]) ? args[outputJsonIdx + 1] : null;
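Both argument-parsing concerns (a flag value mistaken for the positional summary path, and a missing `--output-json` value) can be handled in one pass over `argv`. The following is a minimal sketch rather than the script's actual parser; `parseCliArgs` and `VALUE_FLAGS` are hypothetical names.

```javascript
// Flags that consume the following argument (assumed set for this sketch).
const VALUE_FLAGS = new Set(["--output-json"]);

function parseCliArgs(argv) {
  const parsed = { positionals: [], flags: new Set(), values: {} };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (VALUE_FLAGS.has(arg)) {
      const value = argv[i + 1];
      // Reject a missing value or another flag in the value position.
      if (value === undefined || value.startsWith("--")) {
        throw new Error(`${arg} requires a value`);
      }
      parsed.values[arg] = value;
      i++; // skip the value so it is never mistaken for a positional
    } else if (arg.startsWith("--")) {
      parsed.flags.add(arg);
    } else {
      parsed.positionals.push(arg);
    }
  }
  return parsed;
}

const parsed = parseCliArgs(["--output-json", "report.json", "k6-summary.json", "--fail-on-breach"]);
console.log(parsed.positionals[0]); // k6-summary.json
console.log(parsed.values["--output-json"]); // report.json
```

Failing fast on a missing value gives a clearer message than letting `mkdirSync` throw later, though silently falling back to `null` (as in the suggestion above) is also a reasonable choice.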

gemini-code-assist · 2026-04-22T19:47:08Z

+// Check for near-threshold conditions and aspirational targets
+const p95Limit = 2000; // ms -- hard gate (issue #872)
+const p95Aspirational = 1200; // ms -- aspirational target (warning only)


Hardcoding these limits (2000ms, 1%) in the script creates a maintenance risk as they are likely to drift from the actual thresholds defined in the k6 test scripts. Since the k6 JSON summary already contains the threshold definitions and their results (which you are already parsing in lines 45-59), it would be more robust to derive the 'near-threshold' warnings dynamically from the limits found in the JSON data rather than using hardcoded constants.
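If the threshold expression strings are available to the script (an assumption; the exact shape of the summary JSON is not shown here), deriving limits from them might look like the sketch below. `parseThreshold` and `isNearUpperLimit` are hypothetical helpers that handle only simple `aggregate operator number` expressions and return `null` for anything else.

```javascript
// Parse simple k6-style threshold expressions such as "p(95)<2000" or
// "rate>0.99". Anything else returns null so callers can fall back.
function parseThreshold(expr) {
  const match = /^\s*([a-z]+(?:\([\d.]+\))?)\s*(<=|>=|<|>)\s*([\d.]+)\s*$/i.exec(expr);
  if (!match) return null;
  return { aggregate: match[1], operator: match[2], limit: Number(match[3]) };
}

// Warn when an upper-bound metric has used more than `ratio` of its limit.
function isNearUpperLimit(value, parsed, ratio = 0.8) {
  if (!parsed || !parsed.operator.startsWith("<")) return false;
  return value < parsed.limit && value > parsed.limit * ratio;
}

const p95 = parseThreshold("p(95)<2000");
console.log(p95); // { aggregate: 'p(95)', operator: '<', limit: 2000 }
console.log(isNearUpperLimit(1700, p95)); // true
console.log(parseThreshold("p(95)<2000 && abortOnFail")); // null
```

The trade-off is that string parsing couples the script to k6's expression syntax, while hardcoded constants couple it to the values in the test file.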

gemini-code-assist · 2026-04-22T19:47:08Z

    checks: ["rate>0.99"],
-    http_req_duration: ["p(95)<1200", "p(99)<2500"],
+    // CI gate: p95 must stay below 2000ms (issue #872)
+    // Aspirational target p95<1200ms is enforced as a warning via check-k6-thresholds.mjs


The inclusion of "p(95)<1200" as a threshold here conflicts with the intended CI gate of 2000ms defined in the documentation and PR description. In k6, if any threshold in the list is breached, the entire test run is marked as failed (non-zero exit code). This effectively makes 1200ms the hard gate, rendering the 2000ms threshold redundant and causing CI to fail on performance levels that were intended to only trigger warnings or be 'aspirational'.

Suggested change

// Aspirational target p95<1200ms is enforced as a warning via check-k6-thresholds.mjs

http_req_duration: ["p(95)<2000", "p(99)<2500"],

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b3ee9fe13a


chatgpt-codex-connector · 2026-04-22T19:51:58Z

+
+function isEntryChunk(name) {
+  // Vite names the main entry chunk as index-<hash>.js
+  return /^index-[A-Za-z0-9]+\.js$/.test(name);


Accept full Vite hash alphabet for entry chunk matching

The entry-chunk detector only matches index- names whose hash is strictly alphanumeric, but this repo’s Vite build already emits chunk hashes containing _ and - (for example OpsConsoleView-CgIU3_Wh.js and useWorkspaceOnboardingActions-sTMco-zd.js). If the index-* hash includes either character, entryChunk becomes null and the 150 KB entry budget is silently skipped, allowing regressions between the entry limit and the 250 KB single-chunk limit to pass.
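The difference is easy to check by running both patterns against sample names. The widened pattern here is one possible fix, and the `index-*` file names are hypothetical; only the hash alphabet matters.

```javascript
const narrow = /^index-[A-Za-z0-9]+\.js$/; // pattern from the diff
const widened = /^index-[A-Za-z0-9_-]+\.js$/; // also accepts "_" and "-"

// Hypothetical file names using the hash shapes quoted above.
const names = [
  "index-CgIU3Wh9.js",
  "index-CgIU3_Wh.js",
  "index-sTMco-zd.js",
  "OpsConsoleView-CgIU3_Wh.js",
];

for (const name of names) {
  console.log(name, narrow.test(name), widened.test(name));
}
// index-CgIU3Wh9.js true true
// index-CgIU3_Wh.js false true
// index-sTMco-zd.js false true
// OpsConsoleView-CgIU3_Wh.js false false
```

Note that the widened pattern would also match a name such as `index-legacy-abc.js`, so it assumes the build emits only one `index-*` chunk.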


chatgpt-codex-connector · 2026-04-22T19:51:58Z

+const assetsDir = join(distDir, "assets");
+const jsFiles = collectJsFiles(assetsDir);


Include root-level JS files in total bundle-size budget

The script computes total_js_kb by scanning only dist/assets, but this frontend build also generates JavaScript at the dist root (notably sw.js and workbox-*.js from the PWA pipeline). Those files are shipped to users yet excluded from the total budget and JSON report, so CI can report “total JS size” as healthy even when root-level JS regresses significantly.


Chris0Jeky · 2026-04-22T21:36:41Z

Adversarial Review of PR #918 (Performance Regression Gate)

I have reviewed all 9 bot comments, all changed/new files at the latest commit (b3ee9fe), and performed my own adversarial analysis. Here are consolidated findings.

Bot Comments: Triage

Already fixed by self-review commit (b3ee9fe):

Codex: Missing --fail-on-error -- Fixed. The workflow now passes --fail-on-error to check-bundle-size.mjs.
Codex/Gemini: Dual p95 threshold -- Fixed. board-heavy-load.js now has only p(95)<2000 (the aspirational 1200ms target is handled as a warning via check-k6-thresholds.mjs).

Still open -- legitimate issues that need fixes:

1. Gemini: limit field in warning violations uses wrong threshold (check-bundle-size.mjs line 122) -- In the warning branch, violations push limit: MAX_ENTRY_KB but the threshold that was actually breached is WARN_ENTRY_KB. This makes the JSON report misleading (value appears below the reported limit). The same pattern repeats for single-chunk and total-JS warnings at lines 133 and 143. Verdict: legitimate, should fix all three.
2. Gemini: Fragile positional arg parsing (check-k6-thresholds.mjs line 18) -- args.find((a) => !a.startsWith("--")) would match the value of --output-json if it appears before the positional argument (e.g., --output-json report.json k6-summary.json). However, the CI invocation always puts the positional arg first, so this is low risk in practice. Verdict: legitimate but low priority. Worth a defensive fix.
3. Gemini: Missing guard on --output-json value (check-k6-thresholds.mjs line 21) -- If --output-json is the last argument, args[outputJsonIdx + 1] is undefined, causing mkdirSync(undefined) to throw. Verdict: legitimate, should fix.
4. Codex: Entry chunk regex too narrow (check-bundle-size.mjs line 61) -- The regex /^index-[A-Za-z0-9]+\.js$/ does not match Vite hashes containing _ or -. Vite's default hash uses Base64url encoding which includes _ and -. If the entry chunk hash contains these characters, entryChunk is null and the 150 KB entry budget check is silently skipped. Verdict: legitimate and important. The regex should be /^index-[A-Za-z0-9_-]+\.js$/.

Bot comments I disagree with / low priority:

5. Gemini: Hardcoded thresholds in check-k6-thresholds.mjs (lines 87-89) -- Suggests deriving limits dynamically from k6 JSON summary. I disagree: the k6 summary only tells you whether thresholds passed/failed, not their numeric limits. You cannot extract "2000" from p(95)<2000 reliably without parsing threshold expression strings. The hardcoded constants match the k6 thresholds and the documented budgets. Verdict: not actionable; a code comment noting the coupling would suffice.
6. Codex: Include root-level JS files (sw.js, workbox-*.js) in total -- These are service worker files generated by VitePWA, not part of the app bundle that gates initial page load performance. Mixing them in would inflate the total and provide misleading signals. Verdict: reasonable concern but intentionally out of scope. A code comment clarifying this would help.

My Own Findings (beyond bot comments)

7. ci-extended.yml: Missing needs: [backend-solution] -- The performance-regression-gate job in ci-extended has no needs dependency, unlike every other heavy job (load-concurrency-harness, visual-regression, container-integration). In ci-nightly it correctly has needs: [backend-solution]. Without this, the perf gate will run even if the basic build fails, wasting CI minutes on a doomed run.

8. check-k6-thresholds.mjs near-threshold/aspirational overlap -- If p95 is, for example, 1700ms (which is both > 1200ms aspirational AND > 1600ms = 80% of 2000ms), the script emits the aspirational warning but not the near-threshold warning. This is because the conditions form an else if chain and the aspirational check comes first. This is acceptable behavior (you'd want the more specific warning), but worth noting that the 80% near-threshold window (1600-2000ms) is entirely eclipsed by the aspirational window (1200-2000ms).

9. No checks metric near-threshold analysis -- The k6 thresholds include checks: ["rate>0.99"], but check-k6-thresholds.mjs only does near-threshold analysis for http_req_duration and http_req_failed. If the check pass rate drops to 99.1% (dangerously close to the 99% gate), no warning is emitted. Low priority since the threshold parsing loop at lines 45-59 already catches actual breaches.

10. Workflow does dotnet restore then later dotnet build --no-restore -- This is correct, but note the workflow restores the full solution (backend/Taskdeck.sln) then only builds the Api project. This is fine because --no-restore on the API project will use the already-restored packages.
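Finding 8 can be illustrated with a small sketch. The branch order below is an assumption based on the description in that finding, and `classifyP95` is a hypothetical function, not the script's actual code.

```javascript
const P95_LIMIT = 2000; // ms, hard gate
const P95_ASPIRATIONAL = 1200; // ms, warning only
const NEAR_PERCENT = 80; // near-threshold window starts at 80% of the gate

// Assumed branch order: aspirational check before near-threshold check.
function classifyP95(p95) {
  if (p95 >= P95_LIMIT) return "breach";
  else if (p95 > P95_ASPIRATIONAL) return "aspirational-warning";
  else if (p95 > (P95_LIMIT * NEAR_PERCENT) / 100) return "near-threshold-warning";
  return "ok";
}

console.log(classifyP95(1100)); // ok
console.log(classifyP95(1700)); // aspirational-warning

// Because 1600 > 1200, every value above 1600 is caught by the earlier
// branch, so the near-threshold branch can never be reached.
const reached = [];
for (let ms = 0; ms <= 3000; ms++) {
  if (classifyP95(ms) === "near-threshold-warning") reached.push(ms);
}
console.log(reached.length); // 0
```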

Summary of required fixes (ordered by severity)

| # | Severity | File | Issue |
| --- | --- | --- | --- |
| 4 | High | `check-bundle-size.mjs:61` | Entry chunk regex misses `_` and `-` in hash, silently skipping entry budget |
| 7 | Medium | `ci-extended.yml` | Missing `needs: [backend-solution]` on performance-regression-gate job |
| 1 | Low | `check-bundle-size.mjs:122,133,143` | Warning violations report `limit: MAX_*_KB` instead of `limit: WARN_*_KB` |
| 3 | Low | `check-k6-thresholds.mjs:21` | No guard against undefined `--output-json` value |
| 2 | Low | `check-k6-thresholds.mjs:18` | Fragile positional arg parsing |

Items 4 and 7 should be fixed before merge. Items 1, 2, 3 are worth fixing while we're here.

- Widen entry chunk regex to include _ and - (Base64url hash chars)
- Fix warning violations to report WARN_*_KB limit instead of MAX_*_KB
- Guard --output-json against missing value in k6 threshold script
- Fix fragile positional arg parsing to skip flag values correctly
- Add missing needs: [backend-solution] to ci-extended perf gate job

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 91a5d5e52a


chatgpt-codex-connector · 2026-04-22T21:43:41Z

+    needs:
+      - backend-solution


Remove unmet needs from performance-gate trigger path

In ci-extended, this job is intended to run for PRs labeled performance, but it also declares needs: backend-solution. Because backend-solution itself only runs for testing-labeled PRs, a PR with only the performance label will skip backend-solution and therefore skip this new gate entirely. That silently disables the performance regression checks in the exact trigger path documented for this workflow.


Chris0Jeky added 6 commits April 22, 2026 20:42

feat: add frontend bundle size check script for CI

8c1839f

Checks vite build output against configurable size thresholds (entry chunk, single chunk, total JS) and emits GitHub Actions annotations for warnings and errors. Produces a JSON report for trend tracking. Refs #872

feat: add k6 threshold analysis script for CI

340e841

Parses k6 --summary-export JSON and emits GitHub Actions annotations. Reports near-threshold warnings (within 20% of limit) and produces a JSON report for historical comparison. Refs #872

fix: tighten k6 thresholds to match CI-03 gate requirements

777158a

Lower error rate limit from 2% to 1% and add p95<2000ms hard gate alongside existing aspirational p95<1200ms target. Refs #872

feat: wire performance regression gate into ci-extended and nightly

798b9ba

Add performance-regression-gate job to ci-extended (triggered by 'performance' label or manual dispatch) and ci-nightly. Update topology comment in ci-required.yml. Refs #872

docs: document CI performance regression gate in PERFORMANCE_BUDGETS.md

5d5825e

Add section describing k6 API thresholds, frontend bundle size thresholds, override environment variables, and artifact outputs. Refs #872

github-project-automation Bot added this to Taskdeck Execution Apr 22, 2026

github-project-automation Bot moved this to Pending in Taskdeck Execution Apr 22, 2026

chatgpt-codex-connector Bot reviewed Apr 22, 2026

gemini-code-assist Bot reviewed Apr 22, 2026

chatgpt-codex-connector Bot reviewed Apr 22, 2026

Chris0Jeky mentioned this pull request Apr 22, 2026

docs: post-merge sweep for 10 PRs (2026-04-22) #925

Merged

3 tasks

Chris0Jeky merged commit c8925d4 into main Apr 22, 2026
14 checks passed

github-project-automation Bot moved this from Pending to Done in Taskdeck Execution Apr 22, 2026

Chris0Jeky deleted the feat/ci-03-performance-regression-gate branch April 23, 2026 22:54
